Implementation of a kNN Classifier Using Python

Machine Learning Using Python — Other Classification Algorithms (9 minutes)

Transcript

Hello everyone, welcome to the course Machine Learning with Python. In this video, we shall learn how to implement the k-nearest neighbor (kNN) classifier in Python. We start by importing the necessary libraries: we import NumPy, and from sklearn.datasets we import load_iris, because we will be performing classification on the Iris dataset. So let's go ahead and press Shift+Enter to run this cell.

Okay, now we load the dataset into the variables X and y. The next step is splitting the entire dataset into training and test sets. From sklearn.model_selection we import train_test_split; this function lets us divide the dataset into a training set and a test set. Running this cell, we see that our training set contains 120 data points and our test set contains 30 data points. Next we will standardize the dataset.
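The loading and splitting steps described above can be sketched as follows (the transcript's notebook is not shown, so variable names like X_train and the random_state are my own assumptions; test_size=30 reproduces the 120/30 split mentioned in the video):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the Iris dataset: 150 samples, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Hold out 30 points for testing, leaving 120 for training,
# matching the counts mentioned in the video
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=30, random_state=42)

print(X_train.shape[0], X_test.shape[0])  # 120 30
```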

That means we will be normalizing the dataset. From sklearn.preprocessing we import StandardScaler. StandardScaler is a class, so sc is an instance (object) of this class, and we fit X_train and X_test with it separately. Now the columns of X_train have unit variance and zero mean, and similarly the columns of X_test have unit variance and zero mean. Next we shall look at an important concept called pairwise distance and how it works.
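A minimal sketch of the scaling step. Note one deliberate difference from the transcript: the video describes fitting the scaler to the train and test sets separately, whereas the usual convention shown here fits on the training set only and applies that same transformation to the test set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=30, random_state=42)

# Fit the scaler on the training data, then apply the same
# transformation to the test data
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Each training column now has zero mean and unit variance
print(X_train.mean(axis=0).round(6))
print(X_train.std(axis=0).round(6))
```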

From sklearn.metrics we import a function called pairwise_distances. Say we have one set of points, (0, 1) and (1, -1), and a second set of points, (0, 0) and (2, 0). If we want to compute the pairwise distances between set one and set two, we can simply call pairwise_distances, pass in both sets, and look at the result. Let's interpret it. The first point in set one is (0, 1), and the first point in set two is (0, 0); so the first element of the resulting array is the distance between the first point of set one and the first point of set two. Similarly, the element in the second row of the first column is the distance between the second point of set one, that is (1, -1), and the first point of set two.
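The pairwise-distance example from the transcript, using the same two sets of points:

```python
import numpy as np
from sklearn.metrics import pairwise_distances

A = np.array([[0, 1], [1, -1]])  # first set of points
B = np.array([[0, 0], [2, 0]])   # second set of points

# D[i, j] is the Euclidean distance from A[i] to B[j]
D = pairwise_distances(A, B)
print(D)
# D[0, 0] = 1        (from (0,1) to (0,0))
# D[0, 1] = sqrt(5)  (from (0,1) to (2,0))
# D[1, 0] = sqrt(2)  (from (1,-1) to (0,0))
# D[1, 1] = sqrt(2)  (from (1,-1) to (2,0))
```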

So in total we will have 2 × 2 = 4 distances. pairwise_distances computes all the distances from one set of points to another set of points. I hope this is clear. Now we shall implement our k-nearest neighbor algorithm. We have defined our kNN algorithm in this cell; look at the arguments it takes: X_train, y_train, the number of classes n_class, X_test (the test data points), and the number of nearest neighbors, denoted by k.

By default the value of k is three. The number of test points is simply the number of points in X_test. We compute the pairwise distance matrix, denoted by D, between the test points and the training points. Our prediction variable y_pred is initialized as an empty array. Then, for each of the test points, we find the neighbors using the function np.argsort, which is an argument sort over that row of the pairwise distance matrix.

Look at the indexing: I am selecting row i and the columns from 1 to k+1. Why 1 to k+1? Because the distance from a point x to itself would be zero, and that would be the minimum distance, so we have to exclude it; that is why I am not starting from 0 but from 1, and taking up to k+1. Now we label the neighbors: the labels of the neighbors are simply taken from y_train at the neighbor indices. Next we implement majority voting. We introduce another variable, count, initialized as a zero vector; then for each j in labels_neighbor, count[j] += 1, and y_pred[i] is np.argmax(count).

So the label of a test sample is simply whichever class occurs most often among its neighbors; that is the majority vote. The function then returns the vector of predicted values of y. Let's go ahead and run this cell. Now we shall predict using our k-nearest neighbor classifier. Here I call our kNN classifier with X_train (which, as we have already seen, contains 120 data points), y_train, the number of classes (which is 3), X_test (which contains 30 data points), and k = 3. We could also set k to 5, 1, or 7. Okay, let's go ahead and run this cell.
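The classifier described above can be reconstructed roughly as follows. Function and variable names are my guesses at the notebook's. One difference: the transcript slices columns 1 to k+1 to skip a point's zero distance to itself, but since the test and training sets here are disjoint, this sketch takes the first k columns instead.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import pairwise_distances

def knn_classifier(X_train, y_train, n_class, X_test, k=3):
    """Predict a class for each test point by majority vote of its
    k nearest training points (Euclidean distance)."""
    n_test = X_test.shape[0]
    # D[i, j] = distance from test point i to training point j
    D = pairwise_distances(X_test, X_train)
    y_pred = np.zeros(n_test, dtype=int)
    for i in range(n_test):
        # Indices of the k nearest training points for test point i
        neighbors = np.argsort(D[i, :])[:k]
        labels_neighbor = y_train[neighbors]
        # Majority vote over the neighbors' labels
        count = np.zeros(n_class)
        for j in labels_neighbor:
            count[j] += 1
        y_pred[i] = np.argmax(count)
    return y_pred

# Usage on the standardized Iris split described earlier
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=30, random_state=42)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

y_pred = knn_classifier(X_train, y_train, 3, X_test, k=3)
print((y_pred == y_test).mean())
```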

Now we shall define the confusion matrix function. I hope you have already seen this in the logistic regression video, so I'm not going to explain it again. Let's go ahead and run this cell. Now we shall evaluate the model: we call the confusion matrix function, passing our original test labels, the predicted class labels, and the number of classes, which is 3.
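The confusion matrix function itself was defined in the earlier logistic regression video, which is not shown here, so the following is an assumed reconstruction in the spirit described, together with the accuracy calculation used below:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_class):
    # cm[i, j] counts samples with true class i predicted as class j
    cm = np.zeros((n_class, n_class), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Small illustrative example (hypothetical labels, not the Iris run)
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 1, 2, 2, 2])
cm = confusion_matrix(y_true, y_pred, 3)
print(cm)

# Accuracy = diagonal sum (correct predictions) / sum of all entries
acc = np.trace(cm) / cm.sum()
print(acc)  # 5 correct out of 6
```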

Okay, so let's go ahead and run this cell, and this is our confusion matrix. To calculate the accuracy, we take the sum of the diagonal elements of the confusion matrix divided by the sum of all the elements of the confusion matrix (that is, the total number of test points). The total correctly identified is the sum of the diagonal, and there are 30 data points in total, as you can see. Running the cell, 29 are correctly identified out of 30, so the accuracy of the kNN classifier with k = 3 is 96.67%, which is quite good. Now let's do kNN classification using sklearn. From sklearn.neighbors we import KNeighborsClassifier. This is a class, so we create an instance, or object, of this class and specify n_neighbors = 3. We can always play around with the number of neighbors and see how changing it affects the performance of the classifier.

Okay, so let's go ahead and run this cell. Note that after creating this object, I have fitted it with the training dataset. As you can see, there are many parameters listed, so I will explain two of them: the metric is 'minkowski' with p = 2, which means we are actually using Euclidean distance to measure the distance between neighbors. Now we shall test the model. The KNeighborsClassifier object has a method called predict.

I pass X_test as the argument of the predict method, and the output is all the predicted values of y, that is, all the labels. Okay, let's go ahead and run this cell. In the same way as before, we construct a confusion matrix and comment on the accuracy. Here, the number correctly identified is 30 out of a total of 30, so the kNN classifier in this case has achieved 100% accuracy on the test dataset, which is quite remarkable. Thank you for watching this video.
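The sklearn workflow from the last few steps, end to end (the split and random_state are assumptions; the video's exact split is not shown, so the accuracy on this split may differ from the 100% reported):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=30, random_state=42)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# metric='minkowski' with p=2 (the defaults) is Euclidean distance
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
accuracy = (y_pred == y_test).mean()
print(accuracy)
```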

In the next video, we shall introduce another classifier, the support vector machine classifier. So see you in the next lecture.
