Implementation of hierarchical clustering in Python

Machine Learning Using Python Unsupervised Learning: Clustering
5 minutes

Transcript

Hello everyone, and welcome to the course Machine Learning with Python. In this video, we shall discuss how to implement hierarchical clustering in Python. We start by importing the necessary libraries. Here we have imported matplotlib.pyplot and NumPy, and from sklearn.datasets we have imported make_blobs. We will see how this particular function helps us create a custom data set.

So let's go ahead and run this cell. Now we create the data points for clustering. We have created 15 data points, falling approximately into three clusters. We have taken such a small number of data points in order to visualize the dendrogram of the hierarchical clustering. We have set the number of features to two, which means the dimensionality of the data set is two; that is for ease of visualization. Let's go ahead and run this cell. These are the data points we have generated randomly. Now we plot them. These are the original data points: this is x and this is y.
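The data-generation step described above can be sketched as follows. This is a minimal reconstruction, not the notebook from the video: the variable names and `random_state=42` are assumptions added for reproducibility, while the 15 samples, two features, and three centers come from the lecture.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# 15 points, 2 features, 3 blob centers, as described in the lecture.
# random_state=42 is an assumption; the lecture does not state a seed.
X, y = make_blobs(n_samples=15, n_features=2, centers=3, random_state=42)

# Scatter plot of the raw, unlabeled points.
fig, ax = plt.subplots()
ax.scatter(X[:, 0], X[:, 1])
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Original data points")
# In a notebook the figure renders on its own; in a script, call plt.show().
```

`make_blobs` also returns the true blob index of each point (`y` here), which the clustering steps do not use.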

As you can see, the points are approximately normally distributed. You cannot really see the normal distribution here because the number of data points is too small, but you can clearly see the three different clusters. Now we will see how hierarchical clustering helps us cluster these data points. Let's begin with plotting the dendrogram. To plot the dendrogram, we have imported two functions from scipy.cluster.hierarchy: linkage and dendrogram.

First we create the linkages, for which we use the linkage function. There are various methods to create the linkages; a few of them are listed below. One is single, for single linkage, that is, min. Then complete, for complete linkage, that is, max. Then average, for group-average linkage, and ward, which is Ward's variance minimization method and is widely used. By default the function uses single. Even so, I have specifically stated that my linkage method is single: I have passed the data set and specified method equals single inside the linkage function, and it returns a link object.
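A minimal sketch of the linkage step, assuming the same kind of 15-point data set as above (the seed and variable names are illustrative, not taken from the video):

```python
from scipy.cluster.hierarchy import linkage
from sklearn.datasets import make_blobs

# Illustrative data: 15 points in 3 blobs; random_state is an assumption.
X, _ = make_blobs(n_samples=15, n_features=2, centers=3, random_state=42)

# method="single" is also SciPy's default; it is spelled out as in the lecture.
link = linkage(X, method="single")

# One row per merge, so n - 1 rows for n points. Each row holds:
# [index of cluster i, index of cluster j, merge distance, size of new cluster]
print(link.shape)  # (14, 4)

# Compare the height of the final merge across the methods named above.
for method in ("single", "complete", "average", "ward"):
    Z = linkage(X, method=method)
    print(method, Z[-1, 2])
```

The object returned by `linkage` is a plain NumPy array, which is what the dendrogram function consumes.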

That I have stored here in a variable. Now we create and plot the dendrogram. The dendrogram is formed from this link object. As you can see, this is our dendrogram. Now, how do we obtain the clusters? We can break the dendrogram at any point. If we break it at this level, we clearly have three different clusters. If we break it at this level, we have two clusters.
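The dendrogram plot, and the idea of breaking it at a chosen level, might look like this in code. The lecture reads the clusters off the plot by eye; using SciPy's `fcluster` to perform the cut programmatically is an addition here, not something shown in the video, and the data seed is again an assumption.

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.datasets import make_blobs

# Illustrative data: 15 points in 3 blobs; random_state is an assumption.
X, _ = make_blobs(n_samples=15, n_features=2, centers=3, random_state=42)
link = linkage(X, method="single")

# Draw the merge tree; leaves are the original point indices.
fig, ax = plt.subplots()
dn = dendrogram(link, ax=ax)
ax.set_xlabel("data point index")
ax.set_ylabel("merge distance")

# Cut the tree so that at most 3, then at most 2, flat clusters remain.
labels_3 = fcluster(link, t=3, criterion="maxclust")
labels_2 = fcluster(link, t=2, criterion="maxclust")
print(labels_3)
print(labels_2)
```

Note that `fcluster` numbers its clusters from 1, unlike scikit-learn, which starts at 0.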

Now, using sklearn we can also do agglomerative clustering. From sklearn.cluster we import AgglomerativeClustering. This is a class, so ac is nothing but an object, or instance, of the AgglomerativeClustering class. I have specified the number of clusters as three; you can play around with this, change the number of clusters, and see how it behaves. I have specified affinity equal to Euclidean; that is how the proximity between the points is measured, so here it is the Euclidean distance. Here too I have set linkage equal to single. Now let's go ahead and run this cell. Now we can plot the clusters.
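A hedged sketch of the scikit-learn step follows. The lecture passes `affinity="euclidean"`; recent scikit-learn releases renamed that parameter to `metric`, so this sketch leaves it out and relies on the Euclidean default, which should run on both older and newer versions. The data seed and variable names are assumptions.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Illustrative data: 15 points in 3 blobs; random_state is an assumption.
X, _ = make_blobs(n_samples=15, n_features=2, centers=3, random_state=42)

# Euclidean distance is the default, so it is not passed explicitly here.
ac = AgglomerativeClustering(n_clusters=3, linkage="single")

# fit_predict fits the model and returns one cluster label (0, 1 or 2) per point.
labels = ac.fit_predict(X)
print(labels)
```

Changing `n_clusters` to 2 or 4 and rerunning is a quick way to see how the grouping shifts.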

So, what is the label array here? It is nothing but the result of calling fit_predict on the entire data set X. That means the cluster labels of all the points in X are stored in it. Now, how many cluster labels will there be? There will be three: 0, 1, and 2. So what have I done here? First I have plotted all the data points whose cluster label equals zero, colored them cyan, and set the marker to square. I have labeled this cluster one. Similarly, in the next call, where the label equals one, I have set the label to cluster two, the color to green, and the marker to round, that is, a dot.

In the third case, where the label equals two, I have plotted the points in blue. So let's go ahead and run this cell and see how the clustering has been done. As you can see, these are the three clusters found. Let's look at our original data points again. This is one cluster, this is another cluster, and this is another cluster. So there are three clusters, and these three clusters are identified remarkably well by the clustering algorithm.
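The per-cluster plotting described above could be written roughly as below. The cyan square and green round markers follow the lecture; the marker for the third cluster is not stated, so the triangle used here is an assumption, as are the seed, the variable names, and the "Cluster 3" label.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Illustrative data: 15 points in 3 blobs; random_state is an assumption.
X, _ = make_blobs(n_samples=15, n_features=2, centers=3, random_state=42)
labels = AgglomerativeClustering(n_clusters=3, linkage="single").fit_predict(X)

# (color, marker) per cluster label; the third marker is an assumption.
styles = {0: ("cyan", "s"), 1: ("green", "o"), 2: ("blue", "^")}

fig, ax = plt.subplots()
for k, (color, marker) in styles.items():
    mask = labels == k  # boolean mask selecting the points in cluster k
    ax.scatter(X[mask, 0], X[mask, 1], c=color, marker=marker,
               label=f"Cluster {k + 1}")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
# In a notebook the figure renders on its own; in a script, call plt.show().
```

Looping over a small style table keeps the three scatter calls identical apart from color, marker, and legend label.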

So, this is the end of this module. In the next video, we shall introduce another model, the artificial neural network. See you in the next lecture. Thank you.
