Python implementation of logistic regression for binary and multiclass classification problems

Machine Learning Using Python Logistic Regression Analysis
8 minutes

Transcript

Hello everyone, welcome to the course on machine learning with Python. In this video, we shall learn how to use the built-in logistic regression implementation from the scikit-learn library. Here we shall again use the Iris dataset. In this cell, we have imported the necessary libraries, and now we shall import the data. This was already described in the last video.

So I will not elaborate on that again. The shape of the dataset is 150 by 4: there are 150 data points in the dataset, and each feature vector is of dimension four. These are the feature vectors, and these are the class labels.
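A minimal sketch of this step, assuming the lecture uses the standard `load_iris` loader from scikit-learn (the exact loading code is not shown on screen):

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the Iris dataset: 150 samples, 4 features, 3 classes
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)        # (150, 4) -> 150 data points, feature dimension 4
print(np.unique(y))   # [0 1 2]  -> the three class labels
```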

Now we can divide the data into training and test sets. From `sklearn.model_selection` we import `train_test_split`. Then `X_train, X_test, y_train, y_test` is nothing but the output of `train_test_split`, where we have set `test_size` to 0.3, which means 30% of the data will be used for testing. What is the shape of the result? Let us run this cell. There are 105 training samples in total and 45 test samples.
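The split described above can be sketched as follows; `random_state` is an assumed value for reproducibility and is not mentioned in the lecture:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 30% of the 150 samples for testing: 105 train, 45 test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print(X_train.shape[0], X_test.shape[0])  # 105 45
```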

Now, we will do something called feature scaling. From `sklearn.preprocessing` we import the class called `StandardScaler`. Here `sc` is nothing but an instance, or object, of the `StandardScaler` class. Then `X_train` becomes `sc.fit_transform(X_train)` and `X_test` becomes `sc.fit_transform(X_test)` (note that standard practice is to fit the scaler on the training data only and apply just `transform` to the test data, so that the test set is scaled with the training statistics). Let's go ahead and run this particular cell. Now we can verify that both `X_train` and `X_test` have been standardized. How can we do that?

If we do `np.mean(X_train, axis=0)`, we can see that the mean values of all the columns of `X_train` are close to zero. We can do the same thing with the standard deviation: all the columns of `X_train` now have a standard deviation of one, and similarly for `X_test`. That means both `X_train` and `X_test` have been standardized. Okay, now let's go ahead and define our logistic regression model.
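A sketch of the scaling step and the verification above. Unlike the lecture, this version applies only `transform` to the test set, which is the usual convention; the `random_state` value is an assumption:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit on training data, then transform it
X_test = sc.transform(X_test)        # reuse the training statistics on the test set

# Verify: per-column mean ~ 0 and standard deviation ~ 1 on the training set
print(np.round(np.mean(X_train, axis=0), 6))
print(np.round(np.std(X_train, axis=0), 6))
```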

So from `sklearn.linear_model`, which is a sub-package of the scikit-learn library, we are importing `LogisticRegression` (note the CamelCase: capital L and capital R). Then `lr_model` is set to `LogisticRegression` with the solver `lbfgs`; the solver is the optimization routine used to minimize the loss function. We also pass the `multi_class` argument, since this is a multiclass classification problem.

So, we have to define which kind of decomposition technique we want to use, and we are using the OvR technique. What is OvR, or one versus rest? In one versus rest we practically train three classifiers separately, one for each class: setosa versus non-setosa is one classifier, versicolor versus non-versicolor is another classifier, and virginica versus non-virginica is the third. When a test sample comes, we pass it through all of these classifiers, and whichever produces the highest probability, we take that class. So if, say, the versicolor versus non-versicolor classifier produces the highest probability for versicolor, then we can claim that this particular test sample falls into the versicolor category.

So, that is how it is done. This is called the OvR, or one versus rest, decomposition technique. It is also sometimes referred to as one versus all, or the OvA technique. Okay, now we shall fit our training dataset with our logistic regression model, so we are doing `lr_model.fit(X_train, y_train)`. Let's go ahead and run this particular cell.
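A sketch of the model definition and fitting step. The lecture passes `multi_class='ovr'` directly to `LogisticRegression`; since that argument is deprecated in recent scikit-learn versions, this sketch uses the equivalent `OneVsRestClassifier` wrapper to get the same decomposition (`random_state` is again an assumption):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# lbfgs solver with one-vs-rest decomposition, as in the lecture
lr_model = OneVsRestClassifier(LogisticRegression(solver='lbfgs'))
lr_model.fit(X_train, y_train)
y_pred = lr_model.predict(X_test)
print(y_pred.shape)  # (45,)
```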

So as you can see, there are many parameters printed here. These are all defaults, for example the maximum number of iterations (100) and the tolerance (0.0001). They are all built into the `LogisticRegression` class as predefined attributes. Now, we shall go ahead and test the fitted model. The predicted output `y_pred` equals `lr_model.predict(X_test)`. So this is our predicted output, and this is our `y_test`. As you can see there are some mismatches, but for evaluation we'll be using a confusion matrix.

In the last lecture we defined our own function to compute the confusion matrix; here we will instead use the built-in function, `confusion_matrix`, which is defined under the `metrics` sub-package of the scikit-learn library. So `cm` equals `confusion_matrix(y_test, y_pred)`; it takes two arguments, the original test labels and the predicted test labels. We go ahead and run this particular cell and print our confusion matrix. As we can see, there are a few off-diagonal elements, which means the accuracy is not a hundred percent, but let's see what it will be. The total number of correctly identified test samples is nothing but the sum of the diagonal elements, and for that we can use the `np.trace` function.

`np.trace` produces the sum of the diagonal elements of any square matrix, and our confusion matrix is a square matrix. So we can find the sum of the diagonal elements using `np.trace`: the number of correctly identified test samples equals `np.trace` of the confusion matrix. The total number of test samples, as we have already seen, is 45; if you sum all the elements of this confusion matrix you will also get 45, the total number of test samples.

So, the total number of test samples here is `np.sum` of this entire confusion matrix. Let's go ahead and run this particular cell, which prints the total number of correctly identified test samples and the total number of test samples. The correctly identified test samples number 43 and the total is 45. So what is the accuracy? Accuracy is nothing but the number of correctly identified test samples divided by the total number of test samples. This is the validation accuracy we are focusing on, and for this particular model it is 95.55%, which is quite high given that the model is very simple.
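The evaluation step above can be sketched end to end as follows. The exact counts (43 of 45 in the lecture) depend on the random split, so they are not asserted here; `random_state` and the `OneVsRestClassifier` wrapper are assumptions, as before:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

model = OneVsRestClassifier(LogisticRegression(solver='lbfgs'))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)  # rows: true labels, columns: predictions
correct = np.trace(cm)   # sum of diagonal = correctly classified test samples
total = np.sum(cm)       # sum of all entries = number of test samples (45)
accuracy = correct / total
print(cm)
print(correct, total, accuracy)
```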

This concludes our lecture on logistic regression. In the next video, we shall begin a new topic, the k-nearest neighbor classifier. See you in the next lecture. Thank you.
