Concept of Box Plot

Machine Learning Using Python Statistics and Exploratory Data Analysis
4 minutes

Transcript

Hello everyone, welcome to the course on machine learning with Python. This video is a supplement to the last video, where we discussed the distribution of categorical and quantitative variables. In this short video, we will learn how to draw a box plot. So, what is a box-and-whisker plot? It is a way to visualize the distribution of a quantitative variable using the five-number summary. We already know what the five-number summary is: the minimum, quartile one (Q1), the median, quartile three (Q3), and the maximum. So, this is a box plot of some quantitative variable. Note that Q1, the median, and Q3 are obtained directly from the five-number summary. However, W1 and W2, known as the lower whisker and the upper whisker respectively, are not so trivial. We have to compute the values of W1 and W2, and that is what we will discuss in this video.

There are several methods of drawing the whiskers. We will use the 1.5 times IQR criterion, which is also known as Tukey's method for plotting the whiskers. So what does Tukey's method tell us? Let max and min be the maximum and the minimum values of the data set, respectively. If min is greater than or equal to Q1 minus 1.5 times IQR, then W1 is at the minimum value; otherwise, W1 is at the smallest value that is just greater than Q1 minus 1.5 times IQR. Similarly, if max is less than or equal to Q3 plus 1.5 times IQR, then W2 is at the maximum; otherwise, W2 is at the largest value that is just less than Q3 plus 1.5 times IQR. Okay, so let's take an example. The data set here is the ages of the actresses who won the Academy Award from the year 1970 to 2013.
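As an aside before the worked example, the whisker rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `tukey_whiskers`, the list `sample`, and the quartiles passed in are all hypothetical and are not the lecture's data. The quartiles are supplied by the caller because different tools compute them with slightly different conventions.

```python
def tukey_whiskers(values, q1, q3):
    """Return (W1, W2), the lower and upper whisker positions,
    under the 1.5 * IQR criterion (Tukey's method)."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Each whisker sits on the most extreme observation that is still
    # inside the fences. If the minimum (or maximum) is already inside,
    # this is simply the minimum (or maximum) itself.
    w1 = min(v for v in values if v >= lower_fence)
    w2 = max(v for v in values if v <= upper_fence)
    return w1, w2


# Hypothetical data, made up purely to exercise the function.
sample = [1, 10, 11, 12, 13, 14, 15, 16, 30]
whiskers = tukey_whiskers(sample, q1=10.5, q3=15.5)
print(whiskers)  # (10, 16): the values 1 and 30 lie outside the fences 3 and 23
```

Taking the minimum and maximum of the values inside the fences covers both branches of the rule at once, so no explicit if/else is needed.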

Okay, so on examining this data set, we find the following five-number summary: the minimum is 21, quartile one is 30.5, the median is 34.5, quartile three is 42, and the maximum is 80. Now, what is the IQR? It is the interquartile range, that is, quartile three minus quartile one, which is 11.5. According to the 1.5 times IQR criterion, quartile one minus 1.5 times IQR is 30.5 minus 17.25, which equals 13.25. And what is the minimum value of our observations? It is 21, and we know that 21 is greater than 13.25. Hence, whisker one should be placed at the minimum value, as per the algorithm, or flowchart, described in the last slide. So W1, the lower whisker, will be placed at age 21. Next, quartile three plus 1.5 times IQR is 42 plus 1.5 times 11.5, which is 59.25.
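The arithmetic above can be checked with a short Python sketch. It uses only the five-number summary quoted in the lecture; the variable names are chosen here for illustration.

```python
# Five-number summary quoted in the lecture.
minimum, q1, median, q3, maximum = 21, 30.5, 34.5, 42.0, 80

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # Q1 - 1.5 * IQR
upper_fence = q3 + 1.5 * iqr    # Q3 + 1.5 * IQR

print(iqr, lower_fence, upper_fence)  # 11.5 13.25 59.25

# The minimum lies above the lower fence, so W1 is the minimum itself.
w1 = minimum if minimum >= lower_fence else None  # None: would need the raw data
print(w1)  # 21
```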

And what is the maximum value of the observations? It is 80, and the maximum value is greater than 59.25. Hence, whisker two will be at the value which is just less than 59.25. So what is that value? If you look through the entire data set, you will find that 49 is the value which is just less than 59.25 in this data set. So whisker two will be placed at 49, and this will be our box plot. Note that there are several other values, like 61, 62, 80, 74, etc., which fall beyond the range of 21 to 49; they are all greater than 49. These observations are known as outliers. In many cases, outliers are very significant. Take fraud detection, for example: there we are actually trying to find the outlier transactions, which are fraudulent or which carry the signature of a fraudulent transaction.
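The outlier check for this example can be sketched as follows. The fences come from the quartiles quoted in the lecture (Q1 = 30.5, Q3 = 42); the helper name `is_outlier` is hypothetical, and the ages tested are only the ones mentioned in the lecture, not the full data set.

```python
# Fences from the lecture's quartiles.
Q1, Q3 = 30.5, 42.0
IQR = Q3 - Q1
LOWER_FENCE = Q1 - 1.5 * IQR   # 13.25
UPPER_FENCE = Q3 + 1.5 * IQR   # 59.25


def is_outlier(age):
    """True if the age falls outside the 1.5 * IQR fences."""
    return age < LOWER_FENCE or age > UPPER_FENCE


# Ages mentioned in the lecture: the two whisker positions and four outliers.
mentioned_ages = [21, 49, 61, 62, 74, 80]
flags = {age: is_outlier(age) for age in mentioned_ages}
print(flags)
# {21: False, 49: False, 61: True, 62: True, 74: True, 80: True}
```

For this data set, checking against the fences (13.25 and 59.25) and checking against the whisker positions (21 and 49) flag the same observations, because no age lies between a whisker and its fence.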

So, there are several use cases where outlier detection becomes very important in machine learning. Of the several outlier detection techniques, we have learned one: Tukey's method of outlier detection, which is nothing but checking whether a value satisfies the 1.5 times IQR criterion. In this case, if a value lies between 21 and 49, we say that it lies in the normal range of the data; otherwise, we say that it is an outlier. There are several other methods of detecting outliers; this is just one of them. So, thank you. In the next video, we will explore the relationship between categorical and quantitative variables. See you there.

Thank you
