Decision Tree Classifier-I

Machine Learning Using Python: Other Classification Algorithms
13 minutes

Transcript

Hello everyone, welcome to the course on machine learning with Python. In this video, we shall learn about the decision tree classifier. So, what is a decision tree classifier? A decision tree splits the data set using the structure of a tree and makes a decision at every node, and hence it is known as a decision tree. A decision tree is a tree-like structure comprising a root node, internal nodes, leaf nodes, and branches. What does each component represent? The root node and internal nodes represent tests on an attribute or feature, branches represent the outcomes of those tests, and leaf nodes represent the target class label.
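To make this structure concrete, here is a minimal Python sketch of how such a tree could be represented in code; the class and field names are illustrative choices, not anything from the lecture.

    # Minimal illustrative representation of a decision tree (names are hypothetical).
    class TreeNode:
        def __init__(self, attribute=None, children=None, class_label=None):
            self.attribute = attribute      # attribute/feature tested at a root or internal node
            self.children = children or {}  # branch: test outcome -> child TreeNode
            self.class_label = class_label  # target class label, set only at a leaf node

        def is_leaf(self):
            return self.class_label is not None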

So, as an example of a decision tree classifier, let us first create a model. Let's say this is our data set; it is basically a credit card data set. Here we have three features, Refund, Marital Status, and Taxable Income, and whether the person is likely to cheat or not is basically our label. Refund is a categorical attribute, and Marital Status is also a categorical attribute.

However, Taxable Income is a continuous attribute, and Cheat is our class label, which makes this a binary classification: we have to classify each record either to the class Yes or to the class No. So, we can fit a decision tree like the following for the given training data. If Refund equals Yes, then in all the cases Cheat is No; so if Refund is Yes, then Cheat is No. However, if Refund is No, we cannot tell anything yet, so we have to look at another attribute, which is Marital Status. If Marital Status is Married, then again there is no cheating.

That means, if Refund is No and Marital Status is Married, then there is no cheating, and if Marital Status is Single or Divorced, then we still cannot comment; we have to look at another attribute, which is Taxable Income. Now, if Taxable Income is less than 80K there is no cheat, and if it is greater than 80K, then Cheat is Yes. Okay, so how do we apply the model on test data? Let's say this is our test data: Refund is No, Marital Status is Married, and Taxable Income is 80K. Is the person likely to cheat? That is the question, and we have to find a yes or no answer for it. This is the decision tree we have fitted with our training data set.

We start from the root, which is nothing but the Refund attribute. We can see that the Refund attribute is No, so we move down that branch of the decision tree and check Marital Status, which is Married. We have now reached a leaf node, and as soon as we reach a leaf node we assign Cheat to No, because the class attribute of this leaf node is No. If we had instead arrived at a leaf node whose class attribute is Yes, then Cheat would be Yes; however, for this test data we have arrived at No, so the person here is most likely not to cheat.
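As a rough illustration of the same fit-then-predict workflow (not the lecturer's code), a tree can be fitted with scikit-learn; the tiny training table below is an assumed reconstruction of the Refund / Marital Status / Taxable Income example, and the categorical attributes are one-hot encoded because scikit-learn trees expect numeric inputs.

    # Illustrative sketch: fit a decision tree on a toy "cheat" data set and
    # predict for Refund = No, Marital Status = Married, Taxable Income = 80K.
    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    train = pd.DataFrame({
        "Refund":        ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No"],
        "MaritalStatus": ["Single", "Married", "Single", "Married", "Divorced",
                          "Married", "Divorced", "Single", "Married", "Single"],
        "TaxableIncome": [125, 100, 70, 120, 95, 60, 220, 85, 75, 90],  # in thousands
        "Cheat":         ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"],
    })

    X = pd.get_dummies(train[["Refund", "MaritalStatus", "TaxableIncome"]])
    y = train["Cheat"]
    clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)

    test = pd.DataFrame({"Refund": ["No"], "MaritalStatus": ["Married"], "TaxableIncome": [80]})
    X_test = pd.get_dummies(test).reindex(columns=X.columns, fill_value=0)
    print(clf.predict(X_test))  # the lecture's walk-through gives 'No' for this record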

Now, decision tree building is also called induction of the decision tree. There are many algorithms to construct a decision tree from the labelled training data set: Hunt's algorithm; CART, which stands for Classification And Regression Trees;

ID3 (Iterative Dichotomiser 3); C4.5; SLIQ, which stands for Supervised Learning In Quest; SPRINT, which stands for Scalable PaRallelizable INduction of decision Trees; and there are many more. The greedy strategy is to split the records based on an attribute test that optimizes a certain criterion; we will discuss these criteria later, and a small sketch follows below. Now, what are the issues? We need to determine how to split the records, which means specifying the attribute test conditions and determining the best split, and we need to determine when to stop splitting.
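To show the greedy strategy in code, here is a self-contained illustrative sketch of recursive tree induction on categorical attributes; it uses Gini gain as the splitting criterion (discussed later in this lecture), and all names are hypothetical rather than taken from any of the algorithms above.

    # Illustrative sketch of greedy, recursive decision tree induction.
    from collections import Counter

    def gini(labels):
        n = len(labels)
        return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

    def split_gain(rows, labels, attr):
        # parent impurity minus the weighted impurity of the children
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[attr], []).append(y)
        weighted = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        return gini(labels) - weighted

    def build_tree(rows, labels, attrs):
        if len(set(labels)) == 1 or not attrs:
            return Counter(labels).most_common(1)[0][0]  # leaf: majority class
        best = max(attrs, key=lambda a: split_gain(rows, labels, a))  # greedy choice
        node = {"attr": best, "children": {}}
        for value in {row[best] for row in rows}:
            idx = [i for i, row in enumerate(rows) if row[best] == value]
            node["children"][value] = build_tree([rows[i] for i in idx],
                                                 [labels[i] for i in idx],
                                                 attrs - {best})
        return node

For example, calling build_tree(rows, labels, {"Refund", "MaritalStatus"}) with rows given as dictionaries returns a nested dictionary whose leaves are class labels.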

Now, how to specify the test conditions depends on the attribute type: an attribute could be either categorical or continuous. It also depends on the number of ways to split: it could be a two-way (binary) split or a multi-way split. First, splitting based on categorical attributes. Let's say the attribute is Marital Status.

It could be either Single, Married, or Divorced, so this is a multi-way split on the categorical attribute Marital Status. Similarly, smoking habit would be either smoker or non-smoker, which is a simple two-way split. Now, splitting based on a continuous attribute: let's say income could be less than 25,000, between 25,000 and 50,000, or more than 50,000, which is a multi-way split, whereas income less than 40,000 versus greater than 40,000 is basically a two-way split.
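Both kinds of split on a continuous attribute can be sketched in a few lines of pandas; the income values below are made up for illustration.

    # Illustrative sketch: multi-way and two-way splits of a continuous attribute.
    import pandas as pd

    income = pd.Series([12_000, 28_000, 47_000, 61_000, 35_000, 90_000])

    # multi-way split: less than 25K, 25K to 50K, more than 50K
    multi_way = pd.cut(income, bins=[0, 25_000, 50_000, float("inf")],
                       labels=["<25K", "25K-50K", ">50K"])

    # two-way (binary) split: less than 40K versus at least 40K
    two_way = income < 40_000

    print(multi_way.tolist())
    print(two_way.tolist())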

So, how do we determine the best split? Nodes with a homogeneous class distribution are preferred, so we need a measure of node impurity. What is the intuition? Consider a binary classification problem, and suppose the following scenarios are observed at two different nodes. For the first node, the number of samples of class one is 10 and the number of samples of class two is 10.

How good is this node? For the other node, the number of samples in class one is 18 and the number of samples in class two is 2. Note that each of the nodes contains 20 samples. However, the left node is a non-homogeneous node because both classes are present equally; it has a high degree of impurity and needs a further split. The node on the right-hand side is more homogeneous and has a low degree of impurity, because that node is most likely to be a class one node. Now, how do we determine the split? Let's consider binary classification with classes C1 and C2, and let M denote the measure of node impurity; we shall discuss the measures of node impurity shortly. Before splitting, the number of examples in class one is n01.

And the number of examples in class two is n02, and based on these the impurity measure is M0. Let there be two attributes, A and B. If we choose A for splitting, then let's say A splits into Yes or No: if Yes, the record goes to node N1, and if No, it goes to node N2. At node N1 the number of samples in class one is n11 and the number of samples in class two is n12, while at node N2 the number of samples in class one is n21 and the number of samples in class two is n22. So, we can obtain the impurity measure of each of these nodes, and combining the two we obtain the impurity measure of this split, which is M12.

Similarly, we can obtain the impurity measure of the split with attribute B, which is M34. Then what is the gain? The gain for A is M0 minus M12, and for B it is M0 minus M34. Now, if M0 minus M12 is greater than M0 minus M34, then we shall split on A, and if M0 minus M34 is greater than M0 minus M12, then we shall split on B. As for the measures of node impurity, there are various options: the Gini index, which is used in CART, SLIQ, SPRINT, etc.; information gain, which is used in ID3 and C4.5; and misclassification error. In this lecture we shall discuss the Gini index for the computation of impurity. The Gini index for a given node t is calculated as Gini(t) = 1 - Σ_j [p(j|t)]^2, where the sum is over all classes j and p(j|t) is the relative frequency of class j at node t. The maximum value of the Gini index is 1 - 1/nc, where nc is the number of classes.

It occurs when the samples at a node are equally distributed among the classes, which implies a high degree of impurity. The minimum value of the Gini index is zero, which occurs when all the samples at a node belong to only one class, implying homogeneity. As examples of computing the Gini index, let us consider the following. In the first case, the number of examples in class one is 0 and the number of examples in class two is 6.

The probability of class one is 0 and the probability of class two is 1, so the Gini is 1 - (0 + 1), which is nothing but 0. Let us consider another example where the number of samples in class one is 1 and the number of samples in class two is 5; similarly, we can obtain a Gini of 0.278. In the last example, the number of samples in class one is 3 and the number of samples in class two is also 3; here the Gini is computed to be 0.5. So, as you can see, if the Gini of a particular node is zero, it is a pure or homogeneous node, and if the Gini is closer to 0.5 (or to 1 - 1/nc for an nc-class classification), then it is an impure node.
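These three values can be reproduced directly from the formula above; the helper function name is just an illustrative choice.

    # Reproduce the three Gini index examples: class counts (0, 6), (1, 5), and (3, 3).
    def gini_index(counts):
        n = sum(counts)
        return 1.0 - sum((c / n) ** 2 for c in counts)

    print(round(gini_index([0, 6]), 3))  # 0.0   -> pure / homogeneous node
    print(round(gini_index([1, 5]), 3))  # 0.278
    print(round(gini_index([3, 3]), 3))  # 0.5   -> maximally impure for two classes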

Okay, so splitting based on Gini: when a parent node p is split into k partitions or children, the Gini of the split is calculated as follows. Gini_split = Σ_{i=1..k} (n_i / n) Gini(i), where n_i is the number of samples at child i and n is the number of samples at node p, that is, the parent node. Now, let's say we are splitting the node based on attribute A; before splitting, there are in total six samples in class one and six samples in class two. That is the distribution in the parent node, and the Gini index of the parent node will be 0.5. After the split, at node N1 class one has 5 samples and class two has 2, so the Gini of node N1 will be 0.408.

At node N2, class one has 1 sample and class two has 4, so the Gini of node N2 will be 0.32. Okay, so the Gini of the split on attribute A would be 7/12 multiplied by 0.408 plus 5/12 multiplied by 0.32, which is 0.3713. Similarly, we can calculate the Gini of the split based on attribute B, and we can see that the Gini of the split with attribute B is 0.4858. So the gain with respect to splitting by attribute A is 0.1287 and with respect to attribute B is 0.0142. Hence, we shall split the parent node based on attribute A, as it provides more gain.
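This worked example can be checked with a short sketch; any small difference in the last digit compared with the lecture's numbers comes from rounding the child Gini values before combining them.

    # Verify the Gini-of-split and gain for attribute A from the example above.
    def gini_index(counts):
        n = sum(counts)
        return 1.0 - sum((c / n) ** 2 for c in counts)

    parent = [6, 6]               # class counts before the split
    children = [[5, 2], [1, 4]]   # class counts at nodes N1 and N2 after splitting on A

    n_parent = sum(parent)
    gini_split_A = sum(sum(child) / n_parent * gini_index(child) for child in children)
    gain_A = gini_index(parent) - gini_split_A

    print(round(gini_split_A, 4))  # about 0.3714
    print(round(gain_A, 4))        # about 0.1286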

When do we stop splitting? Growing a decision tree can be stopped by the following criteria: when there are no records or samples left to split further, when the leaf nodes are homogeneous or nearly homogeneous, or when the tree height reaches some predefined height. Decision trees have many advantages: they are inexpensive to construct, extremely fast at classifying unknown records, highly interpretable, reasonably accurate, and able to work on both categorical and quantitative attributes. See you in the next lecture. Thank you.
