Lab 1 :- MSBI Fundamentals, Data flow, Control Flow, ETL, Dataware house. (SSIS)

MSBI Step by Step Training Lab 1 :- MSBI Fundamentals, Data flow, Control Flow, ETL, Data Warehouse. (SSIS)
50 minutes
Share the link to this page
Copied
  Completed
In this lab we will learn MSBI Fundamental, Data Flow, Control Flow, ETL and Data warehouse.

  Download

Transcript

Hello everyone and welcome to learn msbi in four days with a project end to end, again, I repeat, welcome to the series of learn msbi in four days with a project end to end. Now, each day of this training comprises of eight hours of teaching because a person works maximum for eight hours right. So, every day of this training will comprise eight hours of teaching. So that means you can also say that the series can also be renamed as or it can also be named as, learn msbi in 32 hours with a project. The first question which will come to your mind is, is it really possible to learn msbi in four days in 32 hours, that means I sit on Monday, and then we have Monday, Tuesday, Wednesday, Thursday and Thursday. I'm done with MSP and Friday I can go and do a party.

The answer is yes. Absolutely yes, because even though msbi is a giant thing, still msbi is simple, it is sweet. And I can bet you, I can give you 100% guarantee, nobody gives that in the world I can give you 100% guarantee that if you watch even this one hour video, you should be able to work on MSP a project. Now, before we move ahead, let us try to understand this word bi let us try to understand the definition of this word business intelligence. Business Intelligence or bi is nothing but it's a process of converting data into information. So bi again I repeat, bi is nothing but converting your data into information.

Now if you look at the English vocabulary of both of these things as data information, you will understand that they look very similar but they are very much different. When we say data, data is normally a technical format, right? For example, your data can lie in a CSV format. Your data can lie in SQL Server. Your data can lie in some binary format, your data can lie in an Excel sheet, I don't know what right. So, data is more of a technical format.

And if you ask a simple end user to go and understand this technical format, it is very difficult. So, you have to take this data, you have to take all this technical format and present it to the user in a very user friendly way. So that he can understand that data and make meaning out of the data. That thing you know, which is where we say that we have to make meaning out of the data is nothing but information. So bi is nothing but it's a process of transforming from your technical data to a meaningful information so that the end user can look at that information and make proper decisions. He can do forecasting He can run his business in a more proper manner.

Now, this journey from data to information involves a very important step called as analysis. Now, this analysis involves a lot of complexity. Because if you look at the data side data can be in different formats, you can have data in CSV, you can have data in XML, you can have data in SQL Server and to go and run the analysis algorithm individually on each one of these data sources is very difficult tedious and it is also illogical actually. So, the best approach would be to take this data and dump it into some central database. This central database is termed as data warehouse and later on what we can do is on this central database, we will go and run the analysis algorithm. Right.

So, first part is you Data then this data is dumped into your central database called as data warehouse and on this data warehouse, the analysis is run and from that analysis information is generated. Now, again the journey from data to data warehouse is not so easy, it involves a lot of processing it involves a lot of massaging of data. If you see the data part the data is available in different different data sources. So you have XML, you have CSV, you have binary formats, etc. And if you want to go and bring all of this data, different different data types into the central database, the first thing we have to do is you have to do extraction of data you have to extract data from these data sources and dump them into this common platform right or in the Common Data Warehouse. So the first process involves extraction.

The second process involves transformation. Transformation means it Look at the data sources you know they have different different data types for example, must be you know one of the data sources terms a male as M and a female as F. And in some other data source you know, they're seeing male as one and female as zero right. So, when we bring this different different data types into the common data warehouse, right, we need to go and ensure that the data has been transformed into the common format. So the second step involves transformation. And the last step involves loading this transformed data into data warehouse. So in short, the journey from data to Data Warehouse involves extraction, transformation and loading.

This complete process in msbi world or I'll say rather in bi world is termed as ETL. Now, once the data has been analyzed from the data warehouse, we would like to store this analysis into some kind of a data store. So this data store structure is termed as a cube and from this cube we display the information or we display this data to the end user as information. So some kind of reporting software will read this data from this queue, and show it to the end user. So information. Now, this is the complete bi cycle, right, from data to information, the first thing what we need to do is we need to do ETL.

After that we need to do analysis and create a cube out of it. I'll talk about this cube structure later on in more details, when we look into SSIS. And after that, we would like to go and display this analysis to the end user in a simplified graphical manner or must be in a tabular manner or whatever it is right. Now, Microsoft ms bi solution, right? Remember, this bi word is very general business intelligence, whatever I'm discussing now is BI. So you have both Microsoft as well as non Microsoft solution for it.

But when you use the word msbi that means Microsoft SQL Server bi that means we are specifically talking about Microsoft technologies here. Now msbi provides end to end support to the full bi cycle. So, this bi cycle which I discussed, right from data to information msba provides full support for it. Now, when you go and create msba project I will talk about the necessary software later on. But for now, you can see on the screen you know, if you want to go and create a msba project, right, you are given different different templates here, you can see that I've clicked on this ms, this business intelligence link. And as soon as I click on this business intelligence links, you know, there are lots of templates you can see right which is displayed on the screen and each one of these templates so, you know some section of your bi life cycle.

For example, the ETL section that is a journey from data to the data warehouse is done by integration services. Project, the analysis part, you know, where we run the analysis and we create a cube is done by this analysis services template, the reporting part, you know, where we take that, that that massage data or the analyzed data and show it to the end user is done by the report server project. So, all of these three templates, you know, so integration services, report server and Analysis Services actually help you to get that full bi automation. So this integration services is also termed as SSIS. The Analysis Services system does SSIS that SS stands for SQL Server, so SQL Server Analysis Services, SQL Server integration services, and the last one is report reporting services that is SSRS. So SSIS to do the ETL SSIS to do the analysis, and SSRS to display that analyze data to the entity And before we move ahead, you know, here is a nice cartoon, you know, which is created by my friend, Mr. Speaker.

And you know, if you look at this cartoon, you will understand, you know, the importance of each one of these templates and where you know, they are helpful in the big cycle. Right. So, now that we know you know that what is the role of SSIS? What is the role of SSIS? What is the role of SSRS you know, we can now go and start executing our project. So, before we exclude a project, let's try to understand what project we are going to complete.

Now, before we start a project we need to have all the necessary tools with us right. So that you know, we don't have any problems while executing the project. So the first thing is, we are going to learn msbi using SQL Server 2014, which is the latest edition while I'm making this video. So go ahead and download the SQL Server 2014 Enterprise Edition. You can see that I have not said you to download some odd no Ordinary edition like express or something I've told you to download enterprise a full blown edition because this tutorial is targeted to make you a true msbi professional. So I don't want to lose any kind of installations or any kinds of things.

So just go ahead and download this this ultimate kind of thing that is Enterprise Edition. And remember that the Enterprise Edition evaluation period is 180 days, which is more than enough for you to learn msbi Okay, so first thing is you need to go and download SQL Server Enterprise Edition install it. The next thing we need is the SQL Server Data Tools. So go ahead and download the SQL Server Data Tools for 2013. So when you search on Google SQL Server Data Tools 2013 you will come to this link. So go ahead and install this SQL Server Data Tools.

2013 Now, you must be Wondering that why have we installed two different tools? Right? So the first one that the first thing that is SQL Server 2014 Enterprise Edition, what you have downloaded is nothing but the actual SQL Server our DBMS. So when you install the SQL Server Enterprise Edition, you would get something like this, you know where you can go and create a database where you can go and create tables etc. So the Enterprise Edition is the SQL Server Enterprise Edition is nothing but the main SQL Server for storing data, creating tables, writing stored procedures, etc. And the next thing you know, what you have downloaded the data tools is to create the msba project.

So the whole coding of extraction, transformation loading, analysis reporting, you're going to go and do by using this 2013 data tool. So when you install 2013 data tools, right, what you get is you get such kind of a menu. If you're not able to locate this, what you can do is you can go to SQL Server 2000 14 and inside that you go to SQL Server 2014. Inside that you can find your SQL Server Data Tools for Visual Studio 2013. Now remember, in case you're going to use different you know, if then for the new versions coming, it will just be SQL Server Data Tools for like 2015 or 2016 right. So, only the versions will change here, but the other things would remain same right.

Now once so, so you have the database and you have the two year and if you want to go and create a project, what you will do is we will do a file new project and you will click on this business intelligence menu and then you can go ahead and start creating SSIS project or ICS project or SSRS project, depending on situation, what you're working on. So, let us first discuss about the project. What we will execute in this msbi learning The project is a very, very simple customer information system, this customer information system, you know, gets data from CSV files. So, these CSV files have to be imported, that means it has to be extracted, it has to be transformed, it has to be imported into the system. And then from from the from the imported data, we need to go and give some reports to the end user.

So this customer data which comes into the system is a simple CSV file. So, basically it has a customer name. Remember CSV file means it's a comma separated file. So, it has customer name, it has customer amount. It has customer code and the sales date, you know when the customer actually brought the product. So, let me go and put some data into this CSV file here.

And this CSV file, you can Get from the correspond vd.com from the site, right? So you don't have to really type this out. So I'm going to go and enter some dummy data here. So, and let's assume that this customer code is unique okay. So, the customer code which is coming into the system has to be unique okay. So 1001 shave and 100 point 23 let us say and the seal state is 112 thousand 10 right in the same v 1002 let us say Roku 200 300 is amount to slash 2012.

So, you can see that I have entered like eight records over here in the CSV and since 2015 right So, you can see that I have entered approximately eight records over here. So, let me go and save this file. Now, the extension of the file is not important so, it can be txt or it can be any other external But the format of the file has to be CSV format means the internal data representation of the file has to be comma separated. So at this moment, I will just give the name here as customer dot txt. So, the extension does not matter, but the format of the file that means over here, it should be comma separated. So basically, this custom information system, you know, gets this customer data inside this CSV file.

And the CSV file has to be imported first thing in the data warehouse, and then over over that data warehouse, we need to go and generate the report. So now the first thing is we need to go and create a data warehouse where you'd like to go and upload this CSV file, right. So I'm going to go to my SQL Server 2014. So go ahead and open SQL Server 2014. Right click on this SQL Server Management Studio and open SQL Server. So let us go to SQL Server and let us create a very simple database here called as customer data.

Red House. And that has given name to this database as customer data warehouse. Okay. So now the goal is that we need to go and upload this CSV file into our customer data warehouse. So if you go to the database here, we have that customer data warehouse, you can see we have the customer data warehouse somewhere. So there it is.

So now our goal is to load the CSV file into this customer data warehouse. And if you remember the first stage in our bi lifecycle, which loads the data into the data warehouse is by using ETL. And in order to do ETL, we need to go and create a SSIS project. So the first thing is in this customer data warehouse, let us go and create a simple table. So I'm going to go and create a table here. This table will have four fields.

One is a customer code and customer name and where care customer amount. Now, when we talk about financial values like amount or any kind of accounts, financial values, we would like to keep, keep that data type as money and sales date when the customer did the sales, just put it as sales date. Right? didn't save this. I'll save this as TBL customer, right? And let me go and make this as a primary key.

So I'm going to go and make this customer code as a primary key controls. Okay, I can see I'm getting this error saying that saving changes is not permitted. This is one of those errors which you will keep getting in SQL Server. What this means what This arrow says is that if you want to go and see these changes, right, then the table, you know, has to be dropped and recreated, which is a dangerous thing, right? So he's actually sending out a warning to you. So what you can do here is remember this step, you will always get this error again and again, Tools, Options.

And you need to go to the designer here, and you need to go and see prevent saving changes that require stable re creation, right. And so Ctrl S, but remember that it's a very dangerous thing. So don't try to do this in production. But what it means is that you know, it will actually go and drop your table and recreate it, create the same with using this new changes what you have done. So I'm done with everything nice. So right.

So now our main goal is to load this data, right load this data from the CSV file into this SQL Server table here, right? So that's our main goal. So in order to do this, remember, we said We have to use SSIS to do the ETL we have to use SSIS to do the analysis, and we have to use SSRS to do the reporting right. So at this moment, our goal is to extract this data from this file and push it to this table right. So, let us go and open SQL Server Data Tools for Visual Studio 2013 you can get it get this thing from here. Or if you wish, you can get this thing from here as well.

Right. So let me go and click on this so that it is opening. So let us click on new project. So we can see that the Visual Studio 2013 SQL Server Data Tools has opened I've clicked on new project. And let us go to this business intelligence menu. Let us click on integration services.

Remember why integration services because we want to do ETL you can see this word here. ETL. Okay, now In order to create SSIS project there are two ways one is you can use this from scratch template, right and the other one is you can use the wizard. So let us not use the wizard let us try to go and learn the hard way. Right so I'm going to go and browse over here and create this project in a separate folder all together. So I'm going to create a folder here saying customer customer information system.

Okay. All right. Select this folder. And let me give a name you're saying? Wow SSIS customer. Okay.

This Okay. Now once the new project is created, you would see something like this on your screen. You would see it five tabs at the top. And you'll see this SSIS Toolbox. In case you're not able to see the SSIS Toolbox, what my suggestion is to go and click on this SSIS menu and click on the SSIS Toolbox toolbox menu. Okay, so you will see this five tabs and the toolbox.

For now, for now, don't think about these three tabs, you can see there are three tabs here parameters, event handlers, and Package Explorer. Don't think about these three tabs for now, okay, concentrate only on the toolbox control flow and data flow. So, in other words, at this moment, we just concentrate on control flow data flow and SSIS Toolbox. The other three tabs, you know, I will talk later, you know, as we move ahead again, from these two tabs, that is control flow and data flow. Let us concentrate more at this moment on data flow tab. As time goes by, I will definitely explain you all the tabs, you know, but I don't want to pressurize On the first time itself with all these concepts, right, let us try to complete that project complete that project means load that file that CSV file into table.

Right? So let us try to learn less concepts here less theory. And as, as time comes in, I will explain you all the tabs, don't worry, it's a promise, okay. Now, but in case you know, you still want to have a definition around you, this control flow is nothing but it invokes the data flow. So that's not the full definition. But in case you know, some some of the people you know, they would like to still have some kind of a very mild definition, then you can think about that this control flow invokes the data flow, okay.

So let us go ahead and drag and drop this data flow task component from this SSIS Toolbox. And let us give it a name you're saying load CSV. Right. Now, if I double click on this, you can see that it has fallen back to the data flow task. Okay, so you can see The Ctrl show tab. If I double click on it, it goes to data flow.

So in other words, you can see that control flow has data flow code. In other words, control flow invokes the data flow. And second, one more important point to note here is, you can see that the SSIS Toolbox changes, you know, when it goes from control flow to data flow, you can see here, the SSIS Toolbox looks like this, you know, when you're in control flow, but the time you go to data flow, the SSIS Toolbox looks different. Okay. Now, this data flow tab is the tab in a very go and write the ETL code. In other words, your extraction transformation and loading code is all written in this data flow tab.

Control flow invokes the data flow. So that's a very simple definition for now. But later on, I will go into more in depth definition in as the video Mosaic, but for now, remember, control flow invokes the data flow. So when you when you go and double click on the controller So it goes to the data flow tab. And you can see now, when you see this data flow tab when you are on the data flow tab, and if you see the toolbox, you can see that it has that essence of ETL. You can see here, there are components for source.

So in other words, you'd like to first go and extract the data from the source. So that is done by all these source component. The next thing is you'd like to go and do some transformation right on the data, you can see for that we have the transforms. And finally, you'd like to go and load that into the the destination. So you can see your ETL thing is seen in this data flow data flow tab here. So, remember the source is for extraction, this transform is for transformation and destination is for loading.

So all your ETL is done in this data flow tab. So let us go step by step. So the first thing is we need to do the extraction. In other words, we need to go and define the source from where the data will come in. Right Remember that our source at this moment is nothing but it's a simple flat CSV file, right. So you can see that when I clicked on this other sources, you can see there are lots of different types of sources here.

But for now, what we need is we need a flat file source. So I'm going to go and drag and drop this flat file source over here. Now every component you drag and drop on the data flow task or control flow task. The first thing you need to do is you need to go and configure the component. Now, if any component is not configured, then you can see the red sign here like this. So you can see now because I have just dragged and drop this component, you know there is a red sign here indicating that this component is not configured.

All right. In order to configure the component, you need to right click on the component and then say Edit. So you need to right click on the component and say Edit. Now, every source component in order to connect to the actual data, you know needs something called as the Connection Manager. So for example, you can see now here, we have this flat file source and this flat file source has to get connected to this customer dot CSV, right. So in order to connect this component to this actual CSV source right to the flat file source, we need something called as the Connection Manager.

So let us go and create a connection manager manager here. So, I'm going to go and click on new here and let us give a nice name to this Connection Manager here. So, let us give a nice name like CSV connection, right, and some small description here so that you know whenever some other developers read this, you know they would understand what what is the use of this Connection Manager. And the next thing is we need to go and give the file name so at this moment, our file is located. So this file is located. Customer dot txt is located in this on the C drive.

So I will go and browse into the C drive customer, customer text. Right? There it is. Now, the flat file Connection Manager has lots of options here you can see code page, locale etc, right? Don't get distracted with all the options for now. So let me start off with some minimal options so that we can start moving this project ahead and later on.

We'll discuss more details about it. So the first thing you can see here, this small checkbox here saying that the false data Oh, is it the column names so it means that you know the first row here. Does it involve the column names? Yes, it is. So you can see that it is checked. Second, it is a delimited file.

Yes, it's a delimited file. And you can go here and you can also see a preview you know saying that okay, how does the preview looks like and in case you You know, your delimiter is not a common column, sorry, a comma, I'm sorry, if it is a tab or if it's a semi colon, you can always go and choose one of these options. But at this moment, you know, our column delimiter is a comma. So I'm going to use this and what is your root row delimiter. So at this moment it is entered. So I'm gonna go and see use this and leave these things at this moment, do a preview and just say, okay, and just see, okay?

Now the time you do okay, you can see that the red sign has gone off, you know, this indicates that you're, you're competent has been configured. And as a best practice, you know, always go and put some description you can see now at this moment, it is just flat file source. And tomorrow, if somebody comes and reads this program, he won't be able to understand it right. So let us go and give some description you're saying okay, read from CSV. Right. Great.

Now, if you watch the source component very closely, there are two arrows coming out from the Source component right one is you know the blue arrow and the other one is the red arrow. Now the blue arrow is the arrow you know where the actual data will come out and the red arrow will will have errors inside it. So, in case you know while reading from the CSV file, you know if there are any kind of errors, you know those will be sent into this red arrow here and in this blue arrow, the actual data will come out. So, now, what we'll do is we have already done the extraction The next thing is transformation right. So, for now, we will not do any kind of transformation let us just do the loading part. So, in other words you know the data just extracted and loaded and then I will gradually add the transformation later on okay.

So, let us go ahead here and add a destination here. So, currently you can see that I am putting a do dotnet destination Why? Because my end database is SQL Server, right? And what I need to do is I need to take this blue arrow and drag and drop to this ad or dotnet destination because at the End of the day, the data is coming in the blue arrow, right, so I need to go and drag and drop the blue arrow to the Add destination input. So the output of this source is now becoming the input to the destination. And again, you can see that this destination component is not configured, so it is having a red color.

Again, I need to right click, I need to say Edit. And here again, we need to specify the Connection Manager. But this time, the the connection, the source at the other end, you know is SQL Server. So let's go to our SQL Server here. So we need to get the server name to get the server name, you need to click on file. And you need to click on Connect.

And whatever server name you get here, you're going to go and copy this. Again, I repeat, you need to click on File Connect. And you can copy the server name from here. And I'll say new source so we need to go and specify the data connection. So again, I'm going to go and click on new here. And I will say this server name Ctrl V. And I will see the database, you know where I need to go and load this is our customer data warehouse, right?

So, customer data warehouse, there it is. My authentication at this moment is Windows. OK, I'll set Test Connection. And I will say, OK, and I will say, OK. And now, a database has lots of tables, right? So this CSV file has to be uploaded into which table so at this moment, I have only one table TBL customer. So that's where we're going to go.

And we're going to go and upload this data. And the next thing is going to go and specify the mappings. So basically, if you remember, our customer CSV file was this right? So we need to go and specify the mapping saying that basically, a which field of the CSV file will map to which field of the table right so you can see So you can see here. So I've said that okay, customer core maps with customer core customer name maps with customer name, amount maps with amount and sales date maps with sales date. Now remember that at this moment, the field names of the table and the CSV column names are same, that's why the mapping has happened automatically.

But in case the mapping was not automatic, you know, so, you would see something like this. So, if you if you know, you, you will see that the mappings are not created. So you need to go and drag and drop the source and map it to the destination like this, right. And then I will say, OK, and this is done. So you can see now, this is again, changed back here. So the red sign has gone off again, let's go and put some proper naming convention here, seeing that this is a customer data warehouse, right.

And it does go and save this. So let me go and run this application. Now, in order to run this application. You can get On this start button here, or you can click on F fi. So let me click on the start button. And and you can see that there is an error here.

And this arrow says that I'm not able to DC utilize this package. Okay? It's good that arrows are coming and let me help you out, you know what this error is all about. So this error is because of, you know, the little bit and the 64 bit runtime issue, right? So what you can do is right click on this project, okay, you can see now, we have a solution here and we have a project. So right click on the SSIS project, remember, this is a solution.

And this is a project okay? So in one solution, you can have multiple projects. So right click on this and go to properties. And over here, go and change this. Run 64 bit run time true to false, because it's possible that you know what must have happened is when I did the installation, I must have installed the 32 SSIS runtime. Okay, so finally I've done that.

And now let us go and run this. And I'm going to go and run this, let us see what happens. So there you can see my SSIS packages running. And while my SSIS package is running, you can see that if everything works fine, it actually shows you a green sign. But if there are errors, it actually shows you this red sign here. So you can see here, I have some errors here, you can see that one is I'm seeing a green sign here, which indicates that the reading from CSV is fine.

But the time it went to insert into SQL server that means into the database there was a problem. So you can see that there is a red sign over here. Now this red sign here does not indicate what exactly is the error it does tell you that yes, there is a error here, but it does not indicate that what exactly is the error. So when you get an error like this right in order to go and see exactly what the error is, you can see that there is a progress tab. OK, so click on this progress tab and go and see the first red sign in this progress tab. So, this progress tab actually tells you that what exactly is happening you know, when the package is executing.

So basically, it is validating basically this starting basically the pipeline is getting executed, etc. But you can see this red sign here, you know, indicates that where exactly the error happened, and it also provides a detailed description of it. So what I'm going to do is I'm going to go and copy this thing. So I'm going to go and copy this message and let us open this message in a notepad. Also, let me copy the next error message after this as well so that I can get a better insight of it. So we can see that that our error messages and what I can see is that you know, it says that it is trying to convert a string into a money data type, and that's where it is failing.

And if you remember right, our SQL Server Never had one of our fields as many data types okay. So, let us go and open our CSV and let us see that if any one of these fields over here are not of proper numeric. So, you can see that okay this is numeric This is numeric. So, you can see this is the customer amount which is which is there. So, I can see that almost all the values are good Oh, but I can see that you know, there are some empty enters down below Can you see this you can see that down below after this last record 1008 you can see that there are some empty empty enters right? And because of that this you know must be he's trying to take nothing value and that nothing value is not able to convert into into into money right.

So, let us go and delete this empty values here. Right now. To save this file now let me go and read on this package again to read on this package first thing you have to stop the package you will stop this execution you can see that I have moved my mouse on this stop debugging right so I'm going to go and stop the debugging and again now let me go and start the package. So that the packet starts running remember if it is green it is successful if it is red, there is a problem now you can see that all the records have you know successfully been been extracted, transformed and loaded. And you can see here it also shows you the number of rows you know that has moved from the source to the destination, okay, so you can see this value here, eight rows, okay, so let us quickly go back to our SQL server to just check if everything was okay, so I'm going to go to my SQL Server.

Let us go and open the table here. So you can see I right click on this TBL customer I will say select top thousand rows and let us see that if All our eight rows of the CSV has been uploaded into the table have not. So that is and you can see that all the rows have been uploaded successfully 1001102 till 1008. And all the rules have come in to the SQL Server. Now, in order to fix issues like the empty rows in which I've just done from the CSV file, one of the things you'd like to have in your hand is what kind of rows are getting circulated from the source to the destination or what rows are moving from one component to the other component. So this path, you know, this part which connects you know, one component to the other component, you would like to know that what row is getting transferred from this component to the other component.

So for that, you know, you can use something called as data viewers so you can right click On this arrow here on the path which connects one component to the other component, and you can click on this menu here called as enable Data Viewer. Once you go and click on enable Data Viewer, you can see that he puts a small icon offer lens here in which indicates that Data Viewer is now activated for this pop. So now if I go and run this So what it does is, you know, it, you know, before the data gets, you know, Trump gets into that customer data warehouse, you can see that you know, the program has halted over a year and he has popped up a small window of small grid, you know, indicating that what kind of rows are getting transferred from this component to this component.

So, in case you have empty rows, you know, you could have easily notice down below, right, so, this feature, you know, you can always use, you know, to debug the SSIS program. And you can see that again, there is an error sign here. Can you guess what is this error sign this error sign. So if you go down below, you can see that it says it's a primary key violation error. Right? So if you go and put this, it can see that it's a primary key violation error, duplicate key.

Why? Because if you remember, in our TBL customer, the customer code is unique. It has a primary key, right? So that's why you know, when I tried to run the package again, it actually showed me this error here because you know, duplicates are not allowed, allowed for a primary key primary key. So you can see that how easy it is to know what the error is in SSIS. Again, a repeat, go to the progress tab, scroll below and go to the first read sign, copy this text and read it carefully.

So now we are done with extraction and loading. But we You'd like to see some examples are on transformation as well. So, let us do like this. If you see at this moment, you know the customer name is going all in the same case as it is. In other words, you know, if the customer name is in a small case, in the text file, it is going as a small case in the database, we would like to go and make all these customer name in uppercase, so, we would like to do some transformation on the data. So, before the data goes from the extraction to the loading part, we would like to go and capitalize the customer name.

Okay, so let me go and delete this data first or else I will get a primary key violation. So I'm going to go and delete this data from here. And let's go back here let us go and delete this link from extraction to the loading. Now you can see here one is we have the source that is extraction One is we have the loading that is the destination. Now we have the transforms in this other transforms as well as we have the common transforms in the common folder. Okay.

So, what we'll do is let us go and put a derived column A derived column is nothing but it is an extra column which will get created from the existing column data. So, what I would like to do here is I would like to take that customer name and create one more column called as a per customer name right. So, let me go and first put this output to the derived column and here I will say this is capital case right. So, your I will go into Edit So, your I would like to take the existing customer name and I would like to go and add a new column you can see you can either replace the existing column or you can add a new column here called us you customer name is upper customer name okay. And you can see there are lots of functions in which we can use there are mathematical functions that are string functions right.

So, let us go and use the string function so, the best way you can do is watch this very, very closely. Let us say I want to do uppercase, you can see there is there are lots of string functions here. But to do uppercase, you can see there's a uppercase function here. So let us go and drag and drop this uppercase over here. And let us give customer name to this. Again, repeat.

Again, let me show this demo. I'm dragging and dropping this uppercase here. And they're giving the customer name as an input to the upper function. Right. And let me do okay. So now I have added a derived column or I'll say there is an extra column which has been added called as your customer name.

So that data now I will pass it to the customer data warehouse. And over here, I will go and change my mappings. Remember, our mappings currently are pointing towards this customer name. So I don't want the data to be now taken from the customer name. We would like to take the data from the you customer name, right and LLC, okay. And it is See if this so let us again go and run the program.

So I'm going to go and click on start here. So that it goes eight rows transferred nice. Let us go back let us do a select let us see ah that it is transformation. So, we have seen a very very decent cycle of extraction, transformation and loading. Now, before we quickly wrap up this first one hour of training, I would like to talk about the structure of SSIS project. So, in order to see the structure of SSIS project, you have to click on View and you have to click on Solution Explorer, okay.

Once you click on that, you know you get something like this. Now, at the base level at the bottom level whenever you are doing this drag and drop you are You know, writing things over here you are working on a file called as a dot DTS x file. So you can see the system as a package. So the package is you know, where you go and write your data flow where you go and write your control flow. So this package dot DTS x is is what has your actual extraction, transformation loading, your control flow, your data flow, everything goes inside this package. So whenever you are dragging and dropping on this designer like this, you are actually modifying this DTS x file.

Now this DTS x file at the bottom below is nothing but it's XML file. If you right click on this and if you just say, view code, you will see that you know, there is nothing but XML. At the end of the day, you don't go and modify this XML because it is very difficult to understand. You always go via the designer, but you can see at the base level it is nothing but it's XML file, right? So okay, Now close the package. So, at the base level, what we have is we have the DTS x file that is our package.

So this is termed as our package. Now this package belongs to a project. So you can see this SSIS customer, it is actually a project. So this DTS x or the package belongs to a project. And the project belongs to a solution. So again, I repeat, the way the structure of SSIS project is, at the base, we have the XML file, which is our DTS x file.

DTS stands for data transformation services. In the in the previous version of SQL Server, you know, when you wanted to move data from a source to a destination, there was a concept called as DTS DDS data transformation services, people who are very old to SQL Server, you know, they would know what I'm saying. Now, what they've done is they have they have taken the same DTS and build this complete SSIS over it. That's why they named it as DTS x x minutes that is something extra to it, okay. So, you have package package belongs to a project and project belongs to a solution. So, you can have multiple packages inside a project and you can have multiple projects inside the solution.

So, in this also I can go and add new projects if I want or I can also go and add new packages if I want right. So, remember the structure of SSIS project package, project and solution. So, we have completed almost one hour of training and you can see that we have covered so many topics in this one hour. We first started with a definition of bi after that we saw a full bi lifecycle. After that we saw that how msbi fits into this full bi lifecycle, right so we talked about SSIS, SSIS and SSRS. So we said that SSIS is for ETL SSIS is for analysis and SSRS is for reporting After that, we started with a small project and we created a CSV file.

We extracted that CSV file we loaded into SQL Server. We also saw that how we can go and see the errors by looking at the execution Results tab. We also saw, you know, what is the difference between a control flow tab and a data flow tab. We also saw you know how the structure of the SSIS project is. So basically, we have a DTS x file and the DTS x file is nothing but an XML it belongs to a project and project belongs to a solution. So you can see that in this one hour of training, we completed so many topics, we almost did a full SSIS project.

So as I've said, it's a full days of training that is 32 hours. So we have just completed one hour for training till now, and we have more 31 hours to go. And if you have the zeal to learn, then I have the zeal to teach you believe so it If you are willing to see the remaining 31 hours, yes, I am here to teach you for the remaining 31 hours. So keep moving, don't give up, lock yourself up for four days in the room and come out as a true VA professional. And remember that this is just a start. Okay, so my goal is to take this project to a very, very professional level very, very kind of a life project kind of thing.

So I'm going to cover a lot of in depth topics, so don't miss out the remaining 31 hours. Keep going, keep rocking and let us become a true msbi professional end to end

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.