Lab 3:- For Loop, Variables, Parameters and Debugging. (SSIS)

MSBI Step by Step Training Lab 3:- For Loop, Variables, Parameters and Debugging. (SSIS)
33 minutes
Share the link to this page
Copied
  Completed
This video is Lab 3 of Learn MSBI in 4 days with Project series where we will learn Loop, Variables, Parameters and Debugging.

  Download

Transcript

So, let us move ahead with lap three. And in that three I would like to go and add some more requirements to my customer project. If you see at this moment our customer SSIS project is loading only one file at a time means only one customer file can be loaded in a single run right and why there is only one single customer file because if you remember our CSV connection, right is pointing towards this customer text file. Okay. Now let us say we get a requirement from the customer and the customer says that I'm going to send you both files. So my customer record will not come in one customer text file, but they will come in multiple customer text files.

So what did what what they are saying is that basically, I'm going to go and send you files like this where basically You have customer one dot txt where you have customer two dot txt and so on, okay. That means, our program now should have the capability of reading each one of these files one by one. Now, if you analyze properly this activity this task, there are two aspects to it. One is we need to go and write ETL code for these individual files and that we have already written. The second thing is, we need to go and run that ETL in a for loop. So, in other words we need to go and do a for loop and the for loop will go and read every file from this folder, this bulk Files folder and read one by one and initiate that extraction transformation and loading process.

So, in simple words What we need to do is the ETL we have already written. So, this ETL What do you see over here right this data flow task I will say rather remember in the first class I said that ETL is nothing but your data flow task. So, the data flow task code we have already written right. So, you can see that this data flow task is for a single file okay. So, let me just go and put a texture saying this is for a for a single file. Now, what we need to do is we need to go and do a for loop for this single ETL and achieve our achieve our goal.

Now, if you remember, we said that in lab one in data flow task, you will do ETL activities. In control flow tasks, you will do non ETL activities So, at this moment, you know loading the CSV file and putting into data data flow task is an ETL activity and that we have already done in the data flow task, but this for loop is a non ETL activity. So, that code we need to write in the control flow. So, in other words in control flow, we have activities like sending mean for loops, rebuild indexes. So, if you go to the toolbox of control flow very quickly, so, if I go to SSIS and if I go to the SSIS Toolbox, you will see that the toolbox elements show you all non ETL thing So, you can see here in the common if you see bulk insert, file file copying it's a non ETL activity FTP non ETL activity web service Non EDL activity right and in the same way a for loop is also a non ETL activity right.

So, let us go ahead and drag and drop a for loop a for loop on this control flow task here. So, I'm going to go and drag and drop a for loop here. Now, you can see the first thing what you will notice here is there are two types of for loop one is there is a for loop container and then that is a for loop each containers. So, you can see actually there are two types of for loop. So, let us first try to understand you know what is the difference between both of these for loops, and then we will see that what is the best to choose from these both for this condition a simple for container or a four loop container loops in a constant value. So, for example If you want to loop until the value is 10 or must be less than 10 or until the value is greater than 10.

So, in case you want to loop until a fixed count, then you will use the for loop container. The for each loop container actually loops through a collection. Now this collection can be files in a folder. This collection can be rules in a record set. So basically a for each loop container loops through a collection of entity like record set, folders, etc. So at this moment, if you see we want to go and loop through the files of a folder right?

So definitely the for each loop container is more appropriate. So what I'm gonna do is I'm going to go and drag and drop this load CSV inside this for each loop container. Now there are two things here. Once we have the control flow, where the for loop is happening, and the other we have the data flow, where the extraction transformation and loading process is taking place, right. So we have this for loop container where the, the loop is happening, right. And we have the data flow task where the ETL is happening.

Now, the responsibility of the for loop container is to go and just read the file names of that folders. So, the responsibility of this for each container is to go to this bulk Files folder and start reading customer one dot txt customer two dot txt with a full path name, right and pass on this file name to the ETL. Right. And inside this ETL if you remember we had this read from CSV file right. This read from CSV file is using this CSV connection So, that filename with the fully qualified path name has to be passed over here to this file Connection Manager again I repeat, we have the for loop, which is happening in the control flow. So, from there, the for loop will read the full file path name and pass it to the data flow and inside the data flow, he has to go and set the file name over here.

So, in other words, you know, this is more runtime, isn't it? So, now, the first question comes is that how can control flow pass the full file name with the path to the data flow task and that is by using something called as variables. So let us go ahead and create a variable so let us right click on this control flow wide area and you can see that there is a variable menu here. So click on these variables menu. And this kind of box actually pops up. So, over here you can see that there are two three buttons at the top.

So, you can see this left side button here let us say add variable and we will say your full file path right and this variable we will create it as a string right. So, variables are nothing but you know, they are they are temporary storage ages, you know where we can go and store some value. So, what we can do now is we can go and set this full file path in this for each loop container right and this variable then will be passed to the data flow task and the data flow task file Connection Manager will pick up the full file path name from this variable. So, let us go ahead and start configuring the for each loop container. So, our five path integration Or else your folder path, you know where all the files are lying is this folder right? So let me go and copy this full file path here, right.

So let us now go and start configuring the for each loop container. So let us right click on this for each loop container, let us click on Edit. And over here we will say, in the collections, we will say that okay I want to go and loop through a folder through a file, right so you can see there's a for each file enumerator and you can see that you can also loop through a do dotnet collections, a new collections, node list etc. Right? But at this moment, our interest is that we want to actually go and loop through the through the files, and the folder is actually this. And if you see at this moment, you know all our files or text files, the internal content is in the format of CSV.

Externally you know there are simple text files right? So I'm going to go and say okay text files and I will say yes, use the fully qualified name. And in case you need this folder has subfolders and if you want to also travels that you can use this as well, but at this moment, I will not check that because we do not have any folders as such. Now the next thing which is very important is variable mappings. So here we have to specify that where in which variable do we want to store the full file path name right? So, I'm going to go and say okay, store this in this full file path variable which I've just created right.

And I will just say okay. So now what will happen is, at this moment, the for each loop container will run and every time he will go and get a file, he will set the full file path name with the folder name into this variable full file path. Right. Now the next thing is, we need to go into this load CSV file and specify the variable here on runtime. Now in order to set that variable to our CSV connection, right, so you can see down below is our CSV connection. So in order to go and set that variable file name to the CSV connection file name, so right click on the CSV connection and click on Properties.

So, in the properties, you will find there is something called as the connection string. This connection string at this moment is static. In other words, it is pointing towards a hard coded name. So in order to go and make this dynamic, we need to go and set that variable to this connection string. And in order to do that, we have something called as expressions in SSIS. So by using this Express We can go and set that variable value to this connection string.

So let's go and click on this. So in this expression, now I can see, I want to go and set the connection string to what? So I want to go and set the connection string to this variable value. Right. So expressions all over their societies are used, you know, when you want to set the variable value to a property of a component. Okay?

So I'm sorry, okay, and I'm done. Great. So let us do control us. Good. So we are almost done. So if you see at this moment in our text files, we have eight records here.

And we have eight records here. So my expectation is inside our database, you know, I would expect you know 16 records to get inserted Okay, so I'm going to go and do a select here. So, at this moment we do not have any record needs to start right. And also it won't be 16 Records actually Can you guess why? Because you know we have some negative values here as well right negative values and zero values. So, these values will actually go to the to the error handling routine right.

So, I would be expecting that we will get 14 values. So these two values you can see minus 100 and zero Will you know will not be going to the main table right. So, these validations will also fire. So, let us go and run the SSIS package. So, let us see what happens. So, you can see first and then the second file, okay, so both of the files have been loaded.

Two rows in each of the batch have gone To the error So, in other words, we had eight rows here and eight rows in the second file that means we should get at least 12 rows in the main table right. So, if I do a select here, you can see that we have all the records here 10011002. So, all those records which are there in the file have been loaded. So, you can see 1001 is loaded 1002 is loaded 1006 and 1007 are not not loaded because they have negative values as well as you know, they have zero values right in the same way. If I look at the second text file, you know that also has been read. And I can see all the values have been inserted here, except for these two values because these two values are having minus hundred and zero, right.

So that was a very, very simple example of how to use the for loop and insert It, insert the text files into the database. Nice. So now nice the application is running, right? It is looping. It is filling the variable with the file name. And then it is passing to the ETL.

And ETL is doing the extraction transformation and loading. A lot of time in when you have these kinds of for loops, you'd like to go and do debugging. You'd like to debug, you'd like to pause, you'd like to go and watch the variable values, right. So what if you want to go and debug this for each loop and see what is happening inside the variable? which path is picked up? How do you do it?

So again, debugging is very easy in SSIS. Now in SSIS, debugging happens at component levels. Means For example, here you can see now I have two components. One is the for each loop container, and I have the load CSV, right? So if I want to go and debug this code, component, then I have to, then I have to go and say right click on this and say, edit breakpoints. And then over here I can go and specify saying that Okay, so when do you want to go and debug this component, when it is a, you know, just when the pre execute is happening, that means just before the execution, post execution after the execution, 1 million errors.

So in other words, when do you want to go and break on this component. So I'm going to go and put a breakpoint here saying on pre execute. And the time I put a breakpoint, you can see that there is a brown circle your circle here, shown, and this brown circle is absolutely on the for each loop container. So this brown circle says that there is a big point here, and the application will wait here, and you can do the debugging. In the same way I would like to go and put a breakpoint on the load CSV component. So again, right click, and I will say Edit breakpoints.

And again, I can see over here on pre execute. So at this moment I'm using on pre execute, but you can use the other events you know as per your need right. So now, let me go and run this program in a debug mode in order to go and run the program in a debug mode. You can either click on this start over here or either you can say debug and start debugging or you can go and press the f5 button. So, let me go and start debugging here. So that my application is running.

Now in a debug mode you can see you can see at this moment, you know there is a small ello icon here of arrow, which indicates that currently it is running this for each loop container. In other words, it is now on the pre execute of the for each loop container. Right. Now if you want to go and move ahead, that means if you want to go and move to the next step that is The load CSV file component, then you need to go and click on debug. And you need to say continue. So either you can say debug or continue debug and continue or either you can say f fi, or either you can click on this continue button here.

So let me go ahead and click on the continue button. So you can see now, this ello arrow icon is now on the second component right now, it will let us say I'm on this debug mode the program is currently halted, you can see that this thing moving over here saying that, you know, the data flow task will run, okay. Now at this moment, I want to go and watch you know, what is the value in my variable. So if you remember, we had this variable here, the full file, full file path. So I would look I would like to know that this for each loop container has said, What in that full file path variable, right? So if I want to go and debug that, what I can do is I can right click over here.

And I can say add watch from here. Or what I can do is I can click on debug here, and I can go and call a quick watch. Okay, so let me do a quick watch here. And the name of the variable is the full file path. So you can just type it over here if you wish. So you can just say a full file path and press Enter.

Now the time you press enter, you can see that in the variable he's saying that the value currently is C colon vault files, customer one dot txt, that means at this moment, it is reading customer one dot txt. Fine, right. Now, you would also like to go and see this variable in a while you are running the program, right? So what you can do is you can click on Add watch over here, so that you know while the program is running, you can see this values right here at the side of the program where it is running, okay? Now let me again press Continue. Right.

Now the next part of the for loop will start right. And you can see now the part of the file is see bulk files, customer two dot txt. So you can see now, you know, we are now moving step by step, dig up debugging step by step. And now if I go and run this, after this, my program finishes execution, right is finishes execution. So you can see if you want to go and debug your program, again, I will repeat the steps, you will right click on the component, right, so you will right click on this component, and you will say Edit breakpoints. And then you will put the breakpoint on whichever event you want.

And afterwards you know when you want to go and watch the variables you will say debug and then you will say quick watch so you can see now the quick watch is not seen when you are not running in a debug mode but the time you start running in a debug mode. So if you start running in a debug mode, you should again see the quick watch window so that my program is running. And remember, whenever you are in a debug mode, always keep watching this ello arrow over here, which indicates that we're currently the debugging has stopped. Right? So again, debug quick watch. So you can see this quick watch window or this quick watch menu can only be seen when you are in the debugging mode.

And again, you know, in case you want to go and remove all the breakpoints, so what you can do is either you can go here, edit, break breakpoint, and you can go and uncheck over here. Or either what you can do is if you want to go and remove all the breakpoints, you can say debug here, and you can say delete all breakpoints. So if I go and delete all breakpoints, it will actually go and remove all the breakpoints that are it is right. So this is nice. We have Come to a very great logical conclusion at this moment we are loading batch files we are doing validations we are ensuring that whatever are the errors, those are captured we are doing data conversion and so many things. So a very, I not a sustainable project but yes, it can work in production now, right so it's time to go and take this project and deploy it into production.

But now when we go and take this project and deploy it into production, our folders will change our database name will change our connection strings will change right. So we would need you know, parameters from outside. So you know, which we can provide and set those values internally. For example, if you see at this moment, this for each loop container is looping through files in a folder. So at this moment, if you see the folder path is C something right. So, if you see here at this moment, my folder path is see bulk files, but when I go in production must be it is no more Siebel files It is something else right?

So I would like to go and provide, you know, these kind of values through my parameters, right. So that's where we have this very nice tab here called as parameter. So, let us go and add a parameter here. And I will say, bulk file path. Okay. So bulk file path.

And definitely it's a string so it's a string, because file paths are normally strings. Also, let us clearly define, what is the difference between a variable and a parameter a variable is something more internal for example, at this moment, if you see we have a variable here, you know in which we are setting the current file path name, right. So a variable is more internal to SSIS application. So, variables you know we use you know when you want to pass data from control flow to data flow while parameters are more external they are passed from outside. So, as I said that basically if I want to go and deploy this application in a production server, then I need to get some parameters from outside. So, at that time I need parameters.

So, please note the variables you know which we discussed previously, if you remember we have a variable here called as full file full file path. This variable is used to pass data from control flow to data flow, right. But this parameter at this moment what I'm creating here is going to be set from outside in a while to deployment or while you run this program standalone. Remember, at this moment, at this moment, you are actually developing this SSIS package. So, you have this Visual Studio you can See all of these things right? But when you go to production, it won't be like this in production you send just one package setup right.

So, at that time in production, you are not going to have such kind of SQL Server Data to lower there and Visual Studio and these things so or that you need a compiled downsized version. So, when you have a compiled downsized version, there should be a facility by which you can pass parameters from outside and set things right. So remember, parameters are more external, while variables are more internal to the SSIS application to pass data. So what I'll do at this moment is I will set this let me go to my C drive. And in the C drive, what I've done is I've created a new folder here called a C bulk file new. So what I'd like to do is at this moment, internally, the program is pointing towards c bulk files.

I would like to point him towards this new bulk file new, right? So what we'll do is by default, edit, address, copy this, I will say by default, he will point towards this new value, okay? Now remember that at this moment I'm setting a default value or default value means in case if the administrator forgets to provide this value from outside, then this default value will be taken, right? So I'll say Ctrl S, but now the next thing is in the control flow, we should go and set this bulk fight path value to that folder name right. So over here, if you see at this moment, the folder is here right. So somehow, we have to go and set this parameter name let me go and copy the parameter name that is bulk file path.

So somehow, we have to Send this bulk file path value to that folder path, right? So, edit bulk file path. Yeah, we have to set it here. And if you remember, just five minutes before I said that we have a very, very awesome thing in SSIS called as expressions. Expressions helps you to set parameters and variable names values to the property. So these are the properties you know folder.

So what we'll do is we'll click on expression here. And we'll say that to the folder property or answer to the directory property, it's a directory at this moment, right? Yeah, directory, okay. So I'll say okay, take this directory. And I would like to go and set a parameter. Now I can remember here you can see over here very quickly.

This is a variable and this is is a package of a parameter package a package for the parameter I'm sorry. So this bulk file path is a parameter which is represented by this dollar sign, while this is a variable, which is represented, you know by, by just simply without $1 sign, right? So I'm going to go and take this value and set it okay. And I'll say okay, and I will say, Okay, so what will happen now is, you know, the program will actually take this parameter value, and not the old one, right? So let me go to my C drive here, and must be just for safety sake, I will go and delete that old folder what I have. So I'm gonna go here and delete this bulk files at this moment.

And I expect him to take all the data from the bulk file new. And also, I would like to cross check. Is the data getting from bulk file news or new or not? Right? So how do I cross Check I can go and enable Yes. debugging breakpoints right remember so I'm going to go and on this breakpoints, and I'm going to go and watch the value in in my watch window right.

So there it is running. And there it is, you can see now, the value is see bulk file new, which indicates that he is not taking the value from the component, but he's taking the value from the parameter, right. Great. And this program will run definitely, but you can see the value is now see bulk file. But now one thing which will come to your mind is you will see that Okay, so the parameter is also hard coded, right. The parameter is hard coded inside Visual Studio Absolutely.

See this value. What I've said here is a default value. Remember, it is not a value, which is runtime, right? So when you take this package and you deliver It into production, then you can set this value from outside. So what we'll do is the next 30 minutes, we will spend time in understanding how to go and deploy SSIS package, right. And then I will give you a small homework.

So basically the next lecture of 30 minutes will not take this application and do deployment, I will take a separate application, a very small application, which will be a simple file copy application. And that small application in that we will learn file copy as well, right. And as well as we will learn different ways of doing deployment. And then what you can do is then you can take this program and deploy it yourself. So that will be a homework for you. Right.

So in the next 30 minutes, we will learn how to do deployment in SSIS. And once you learn deployment, I would love to see that if you can go and deploy this program into when I say production server means into Your SQL Server right right. So, this was a very nice 30 minutes of the video and we have done a lot of things in this first thing yes we have done a for loop. Second things we learned about variables. Third thing we learned about parameters and fourth thing we learned about debugging right. And you can see that in small small times, you know, in this small small durations of training, we are trying to achieve large big things practically.

So, that completes our two hours of training and we have still 30 hours still pending right. So, let us keep moving. So the next lab is on deployment. And after that, we will take up the next things and in case you are liking what we are doing here and in case you want to give us some feedback saying that sir, add this extra sir put this more requirement, then we'll suggest you that Please go ahead on facebook.com slash questpond. And please put your feedback saying that can we take this project like this? Can we take this project like this, so that accordingly you know, as I'm recording, I can take up those requirements and make this msba lecture more interesting.

So thank you so much. Keep moving ahead more 30 hour still to go. If you're not giving up. I'm definitely not giving up. Okay. Thank you.

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.