Lab 2:- Conditional split, Data conversion and Error handling. (SSIS)

MSBI Step by Step Training Lab 2:- Conditional split, Data conversion and Error handling. (SSIS)
33 minutes
Share the link to this page
Copied
  Completed
This video is Lab 2 of Learn MSBI in 4 days with Project series where we will learn Conditional split, Data conversion and Error handling.

  Download

Transcript

So let us move ahead to lap two. And in lap two, let us start with a very small requirement for our customer project. So at this moment what is happening is, in our customer project, we are reading the CSV file, and we are dumping that CSV file into our data warehouse, right. But now, I want to put a small validation here. I want to say that if the customer amount is less than zero, these records should not be inserted into the database. I want these records to go into some error log file or bad records file or something, right?

I don't want these records to be logged into my main table. So if you remember, you know, our main table was this right? So I don't want you to know that the data should be logged into this table. So first thing let me go and clean this. So I'm going to go and just delete all the records at this moment. Right, and I will go here and let me add some negative records.

So we can see here for Rahul and Rohit, I'm gonna go and add some negative records, or I'm going to go and add something zero, right? So I want to go and apply a validation saying that when this file gets loaded, I want to only put values which are greater than zero. So anything zero and less than zero, right are not allowed, they should go into an error log txt file or into a bad record txt file, right. So let's go back to our SSIS package. So let's go back to our SSIS package. So in order to achieve that, you know, we have a very, very nice component here in our DTS in our data flow task, I will say in our data flow task, we have a nice thing here called as the conditional split, if conditions.

So we can go and check that okay. If the customer amount is equal to zero, or it is below zero, then don't allow to insert into the database. So now let us go go ahead and let me go and delete this path from here. And here let me go and put a conditional split. So let me put this component of conditional split here. And I will say okay, output of the capital case, component, I'm giving two conditional split.

And then I will say right click and edit. So once you click on Edit, it actually opens up this conditional split transformation editor here. At the first glance, you know, this editor will look very confusing, but don't worry, let us go slow, slow, follow me what I'm doing. The first thing is load I look at this tab here of the columns. So expand that. So if we look at this tab, if you look at this, this tab here it actually opens up, you know, whichever of the fields which are coming from your CSV.

So you can see here, the customer code is coming from CSV customer name, customer amount sales date. And you can see that there is one more field here called as the you customer name. This new customer name field is actually coming from this capital case component, right? And don't worry too much about this variable and parameters at this moment leave it we'll talk about this later on. The next thing what we find is in the right hand side, you will find there are lots of functions here. Okay, you'll find the lots of functions like the string function, and then we have some date and time function, etc.

Okay? And operators etc. So what we want to do now is I want to say that okay, if the customer name right, so I'm going to say okay, now there are two outputs, which will come here you can see at the down below, you can go and put multiple conditions here. So I'll say Okay, first is error one, okay, error component, or I'll say error condition. So I'll say if the customer amount is less than look at this less than zero, right? Or so again go and search here for the or condition.

So you can see there is a logical OR so I'll say all look at look at the way I'm dragging and dropping things. Okay? Again a bracket start. So again, I'll say customer amount. And in case you are good in writing things, you know, you can just just type out things here, but at this moment, you know, I'm assuming that a lot of people are not programmers, right? So I'm going to go and say equal to says try to follow the functions at the top is equal to zero, right?

So you can see now what I'm trying to do is I'm saying here, if the customer amount is less than zero or if the customer amount is equal to zero, then it is a error right? But now look at look at things here. You can see that it is in a red color. So this red color indicates an error. It says that it will be There is some error in the syntax here right. Now, first thing that there is no error in the syntax as such, but the error is in the data type of customer amount you can do comparison of numeric values like greater than zero or equal to zero when this customer Amount field is numeric, if it is not numeric if it is a string you know, then this comparison is not possible.

So, you would like to go and check you know that okay. So, what is the data type of this customer amount right. So, let us go and close this. Now, in order to know the data type, we need to understand one concept here called as the Connection Manager. So, you can see over here something called as connection managers. So, what are these What are these and how are they created actually.

Now, whenever extraction transformation and loading happens especially when extraction and loading happens, they need connection managers if you see extraction extraction loads from some source. So the connection between the source and that extraction component is done via Connection Manager. In the same way when you're loading back to a destination, that loading logic bite needs a Connection Manager. So you can see at this moment, you know, we have two connection managers here. One is the CSV connection, which actually tells you you know, which is the path from where the CSV will be loaded. And then we have this SQL Server Connection Manager which actually tells you the SQL Server RDBMS details right.

Now, when these connection managers actually connect to your source or to your destination, they define the data type automatically. You can see here in the CSV can We have an advanced tab again, I'll tell you how I went here, you need to go on this Connection Manager of CSV connection, double click on it, and this pops up. So if you go to the advanced here, you can see that by default, the data type of all the fields he has taken as DT underscore str. So in the same way, if you see, the customer amount here is also taken as DT underscore str. So DT stands for data transformation, and str means in a string. Now, you must be wondering that Why is it taking it a string?

Why didn't he take this as a numeric data type? Remember, when data is loaded in CSV files, they are simple text. If you look at CSV file, I can go and put anything what I wish here, right? So CSV file is a free text file. So that's what by default, your SSIS package things you know that The kind of column which is coming from CSV is always a string right. So, in the same way even though the data of the customer amount is string or sorry a numeric he has still taken it a string right.

So, that means when this data is read from here he is reading it a string and because he is reading it a string, that's why when you go to the conditional split here, and when you try to go and put a numeric comparison so when you go and try to do something like this, you know, greater than zero. He's not able to do it, he says that not possible, right. So if you try to do okay here, you should land up into some error and again, if you see this error very, very closely, it says that the data types DT underscore W, str or str, are incompatible for these kinds of operators for greater than sign and less than sign duty. SDR is a very inappropriate data type. So in other words, now we need to go and convert this you know, we need to go and do some kind of a data conversion, right so that we can do the numeric comparison.

And that's where we have one more component here called as data conversion. So let us go ahead and delete this thing from here. And let me go and put a data conversion component. So I'll say and always put proper text names because at one moment of time, you will not know what this component is doing right? So like, dual one slight select, like don't do a double click just do one click Select. And then you can go and start typing here.

So I'll say this is customer amount, conversion, right? So I know that yes, is going to go and convert this customer amount into a numeric Right. So again, I'll do an edit. So I'll say here I'm interested to go and change this customer amount or I want to convert this customer amount into. So basically I want to go and convert this customer amount into an A money. Now remember that in our SQL Server at this moment very quickly, the data type of our TBL customer, the customer amount, so let's go and quickly see the data type.

So, if you see the data type, it is a money, right? So over here, the appropriate mapping data type, remember SSIS has its own set of data types. So for example, over here, you know if you have their cat or invalid care for that we have here something called as DT underscore str. If you have your something called as money then we have here something called as DT underscore currency. So the is a mapping between SSIS data type and SQL Server Data Type, I will talk about more more of these mappings, you know, how they how they how they work, you know, in, in the other part of the video, but at this moment, because this currency, this customer amount is a money. So what we need to do is we need to go and create a winner to go and apply our DT underscore currency so that it is currency DT underscore c y.

Okay. Now, he gives you two options, he says that, do you want to go and create a new copy of a customer? Or do you want to go and you know, apply on the same one so I'll say yes, I want to go and create a new copy of customer here. So I'll say this is customer amount, currency. Okay, so this basically, you know, I've just given him a column name so that I can differentiate that the first customer amount is Actually a string and this customer amount, see why we actually currency right. And then I will go and give the input of this to the conditional split.

And now let's go and do a right click Edit. So now, I shouldn't be getting those red error. So you can see first thing, we have this customer amount, which is a string data type. And you can see we have the customer amount c y, which we just added is the currency datatype. So I'll see. One output is the error.

So when do we have an error, we have an error when this customer amount is less than zero. Or now, in case you know the signs, you can just type it out in case you don't know the signs. You can always go and start using this helps around so you can say that operator and search for or so must be you can start using this UI here. So I'll say or Customer amount. See, if I look at the way I'm dragging and dropping into the sequel to zero. So if it is less than zero, or if it is equal to zero, then it is an error.

Or as everything is, everything is fine, you can see now, I don't see that red sign, which indicates that this condition is properly working, right. And if you remember in the previous situation, you know, this was showing us a red sign, right? So I'll just do okay. And that's it. And now I will go and drag and drop the output to this customer data warehouse. The first thing you know, he says is that so what output Do you want to drag and drop to this customer data warehouse?

Do you want to drag and drop that error? Or do you want to go and drag and drop the conditional default output? Now remember, if you see over here, I just put one condition called as error What it means is that everything other than this condition is a proper thing, right? So that's why if you see now when I go and drag and drop this output here, so when I go and drag and drop this output here, so he says that Okay, so what do you want to drag and drop here the error or other than the error condition? So I'll say yes, the default output, which is nothing but my proper output, I want to go and drag and drop here. And there it is.

And now what I would like to do is I would like to go and create create an error destination txt file, right? So I would like to see that Okay, give this error output. Okay, so I'm going to go and drag and drop to the flat file destination. So we can see if the Edit output is now going to the flat file destination, right? So I'll go and say Edit. And I will say new, create a new connection file, file Connection Manager.

And I'll give this name as editor, Connection Manager. So let me just go on, let me go and browse. here and let me create a text file here. So I'm going to go and create a text file here saying this is our text file, right? And our text file so whatever our errors you know, which actually will actually come to this file here and yes, I want all the columns and everything so that is fine. And I'll just say okay let me check the mappings Yes, my mappings are fine, okay, right and Ctrl S right.

So, if you see it now, the first thing is, he will read from the CSV file, he will do the capitalization. After that, you know, this component here converts from a string to a numeric, then I can do the conditional comparison. So if it is greater than zero, all the values come to the customer data warehouse in case it is equal to zero or less than that, you know, those all come to this error file. Here, right? So, again, always put proper text in a destination. Right?

And now remember, in our text file, if you remember, at this moment, you know, these two records should not go into the database. So these two records should not be going to this part, it should be going to the error part, right? So let me go and run this. So, I'm going to go and run this year. And there we have an error, right? So again, I'm not going to remove the errors, you know, in case I get an error, I'm going to go and record that error.

So that in case you get that error, you should be able to remote right, so this error is again about that 64 bit and 32 bit, so right click, go to the properties, go to your debugging tab and say runtime 64 bit is false, right. So that is because of the 32 bit and 64 bit issue. Now let me go back and run again so that it is running. So, two records are going to go so very quickly. So, two records we are expecting it to go into the error file right there you can see two records into the error file and six records should go into the database right. So, if I go back to my database here, so, let us go back to our database, let us do a select here select star So, there it is 123456 and 1061006 and 1007 have not loaded right and those are actually loaded into that txt file.

So, if you go to the error file now very quickly, so, you can go to my C drive so, in this if I Go to the error file you can see that those errors are now sent to the error file right. So, we can see by using the conditional split you know, how we are able to put validations you know and how we are able to ensure that proper data goes into the proper source. So, great. So, we are able to read from the CSV file, we are able to do capitalization we are able to do the data conversion we are able to check you know if the value is zero and it is less than zero that means, if it is negative, we should not put that value into the database. So, everything is good, but errors happen bad things happen right. For example, now in this case assume that in the customer amount you start getting non numeric values for example, here somebody says hundred for some reason, you know, the text file which comes you know it has some non numeric values In a column which is numeric, for example, at this moment, the customer amount is a currency datatype.

Right, and we have a non numeric value here. So what will happen in this case? In this case, definitely the application will crash. Because in our database, if you remember, so in our database, if you remember, the data type of currency amount is very quickly, just to show you once again, the data type of, of customer amount, I'm sorry, is money, right? So definitely not over here, things will crash. So if I run this program so now, when the application runs, you can see that there is an error here, right?

And if you watch very closely, the error is happening, you know, when he's trying to convert the customer amount, right? So when it's true To convert that customer amount into a currency data type, the error is happening. And that's obvious, you know, because at this moment, the data type is currency. And he's trying to convert this hundred word into a new into a currency into a numeric, right. And if you go on the progress bar just to see that what the error is, and if you remember how to see that error, see for the first red sign here, and you can see here, data conversion failed while converting customer amount to customer amount currency. Right.

And he's also also stating you know, that basically on which column the data has on which column the data conversion failed, right. So how do we fix this? So in case if we have such kind of error, we again would like to handle these errors, right. So to handle errors, so let me go and stop this debugging here. So let me go and stop debugging. Now, if you remember In the first video, the first one out of the video, we said that every component has two outputs.

So one output in a, which is with this sign with this blue color arrow, and the other output with this red color arrow. Now the output with the blue color arrow is where the data comes in. So when there are no errors, everything is fine, everything is good, then the data comes via this blue arrow, right. But when there are errors, the error data is sent into this red arrow here. And now at this moment, the error is happening while we do the conversion of the customer amount. So definitely, we need to do the error handling over here.

So we need to use this red arrow here to do the error handling, right. So what I'm gonna do is I'm gonna go back here, and I will, I will go and pick up add on file, so I'm going to go and create an error file. So let us go to the toolbox. So let me go to the SSIS Toolbox here. This is toolbox. And let's take a URL destination file.

So I'm going to go and take a flat file destination. Now, before we move ahead very quickly before I, I go and connect these errors with the error file destination, I would like to point out a very important thing here. By default when an error happens SSIS application stops right there. So if you remember when we ran the application, right, just a minute ago, when the error happened during the customer amount conversion, so you can see now is reading from the CSV file afterwards the capitalization happens once the error happens, there was no moment further application stopped right there. Right. So in other Words by default the behavior is that the application fails and stops right there.

Now, how can we see this behavior? Or I'll say rather, how can we change this behavior because at this moment, what we want to do is we don't want the application to stop what we want is we want the applications should send you know, whatever are the errors, you know, from this read sign, probably to this flat file destination, right. So to see this behavior, right click on this say Edit. And so the data conversion transformation editor opens. And just down below, you can see that there is a nice button here saying configure error output. So let us click on this.

Now in this configured error output, you can see by default, it says that when the error happens, this component will fail. But now there are more three options here I'll say more two options either one is yes fail component. So in case when the error happens by default it will fail the component The second one is ignore failure. So, sometimes you feel that okay even if there is a failure, just ignore and keep continuing with the next rows and the third one is redirect row. So, this redirect row is what we want. So, in other words, when the error happens, we want that error row to be redirected to that red arrow output right.

So, I can go and select this redirect rule over here. You can also see that there are two things actually one is there is an error and then there is there are truncations. So sometimes you know truncations means what you know for example, if there is four character and you are trying to push 10 characters, then the six characters will be removed or truncated, right. Now, a lot of times, you know some of the application treat or some of the situation in application treat truncation as an error, or some other application don't treat it as a error. So you can you know, they have given a separate box here saying that, so what do you want to do while doing truncation Do you want to ignore it or Do you want to feel the component right. So, one is there are errors like data conversion, the other one is you know, because of the size mismatch, you know, there is a truncation.

So, again for truncation you know, they have given more given all the three options of what you want to do and you can see down below you know, there is a small box here saying up apply and again those three options, so, this box is given why because, let us say that you have lots of columns here. So, what you can do is you can select all So, you can just go and say okay, select all this and say apply to all of these sales cells, redirect row right. So, this down drop down, what you see here is that, you know, when you when you have multiple rows and lots of columns and you want to go and apply to all of them some common common thing right, you can do that. So, at this moment, yes, even if there is an error, I want to redirect the row, even if there is a truncation I want to read Direct I'll say Apply also okay.

And I will say okay, you're right. Now once I say Apply you can see that there is a small warning here it says that rules you know which are sent to the error output will be lost. In other words, he says that you said that all the errors what happens you know, we'll go to that red arrow here, but you are not defined, you know, what you want to do with the error output. So the rules will be lost. So he's suggesting you to ensure that this red arrow is connected to something so what I'm gonna do is I'm going to go and take this red arrow and connect to this final destination here, right? So I will say yes, redirect the row.

And I will say Apply Yes, I'll say okay, right. So any kind of data conversion errors will now be redirected to this flat file destination. So here I will say conversion, arrows light. So in this, let me know go here and click Edit. So I'll say new. And I will say delimited.

And I'll say this is conversion error file Connection Manager, right. So in just a browser and I will say one is the error and the other one is this is this file will contain errors due to conversion, right? error conversion. Okay. And it also of conversion, and I will say open. I would like to take all the columns what I have here, so, yes, everything and I'll say OK. And I will say yes, take all the files, take all the columns once okay.

Right. So So now, this is how the flow of our application is at this moment. So we read from the CSV file, the capitalization is done afterwards, you know, the customer amount is a data converted and in case the data conversion fails, that data conversion will be returned to this error file over here. And after that there is a conditional split. And in the conditional split again, if the values are equal to zero or less than zero, then they again go back to the error file. Now remember that this error and this error have a big difference.

I'm sure that you have understood the difference between both of these errors, this is more of a technical error, you know error which happens while data conversion file not found and those kind of things, this is more validation error or business errors or as a business validations must be we should not use that word error. It's more of business validation. So this red sign should not be confused with this if condition here at this moment, right? So, let us go and see your files. So, this the following thing should happen now, the first row should go over here to this data conversion error all the other rows should go to the customer data warehouse leaving these two rows these two rows should go to this error here. So, over here I expect two rows here I expect one row and all the remaining rows will go to this default thing here right.

So, let me go and run this let us see what happens so that our application is starting. so files are getting read and there it is. So you can see first thing, let me zoom here and let us see what do we see here? Right. So one row to the error that is right tools, which are, you know, having those business validation issues and the remaining five rows have gone to the actual table right. So, if I go now to my folder just to check, so, let us go to a folder here.

So, in the conversion text file, you can see that that ship 100 has come here in the error, we have those two things right to two records, right and in our SQL Server very quickly, we should see the remaining five records. So, we can see now how things are happening, right. So there it is. This will give you a good picture, right? So you can see, these are business validation errors. These are technical errors, and this is our actual data.

So this brings us to the end of lab two and this lab two we covered three employees. on things. So, the first thing is we covered the if condition the conditional split, second one we covered data conversion. And the third one we covered you know, how to handle technical errors like data conversion file not found and so on right. Now, if you look at we have, we have completed approximately one and a half hours of the training and in this one and a half hours, can you believe you are able to do the complete extraction transformation and loading process, you are able to handle errors, you are able to apply business validations you are able to do you know data conversion. So, we can see that within this one and a half hours, you know, we have completed so many things.

So, basically, you know when you start learning from the project perspective, from actual project perspective, the learning actually becomes much faster, right. So, I hope that you like this lab too. And after lab two, you know, I'm going to have small question or question session. So please go and read that questions and try to answer it yourself, you know, do not take any kind of help. And in case you are able to answer them, that means that you are doing very good. And this is my appeal, or else it is my request, I'll say rather cannot appeal a request that, you know, if you really like you know what we are doing here at questpond, please go to facebook.com slash questpond.

And see that yes, you have completed lab one, you have completed lab two. With that we come to know that how many people are actually watching this video series. And the more you watch that video series, the more we record about it, right. So if you want to really keep encouraging us for a video series, we would like to see participation saying that yes, I completed lab two and I'm waiting for lab three or if there are any issues around the lab, please do also write about it. In case there are some technical question please go and write about the same as well. So please go to facebook.com slash correspond and please give your inputs.

Thank you so much.

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.