Python implementation of linear regression with bi-variate data

4 minutes

Transcript

Hello everyone, welcome to the course on machine learning with Python. In this video, we will see how to use Python to solve a linear regression problem with a single predictor variable, that is, a bivariate regression analysis in Python. First we will import the data set. You are already familiar with this data set from the exploratory data analysis (EDA) in Python in the last module, so here too we'll be using the same data.

First, we'll read this data into a data frame called constant data and print the first few rows. As you can see, I have printed the first five rows of the data frame. Now we want to predict the weight based on the person's height. So here the predictor variable x is height and the target variable y is weight. I'll take y and x into separate NumPy arrays.
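The step just described can be sketched roughly as follows. The transcript does not give the file name or the column names, so the small data frame `df` and the column labels `Height` and `Weight` below are hypothetical stand-ins with made-up values, not the course data.

```python
import pandas as pd

# Hypothetical stand-in for the course data set: the real file and its
# column names are not given in the transcript, and these values are made up.
df = pd.DataFrame({
    "Height": [120.0, 135.0, 150.0, 165.0, 180.0],
    "Weight": [25.0, 33.0, 41.0, 50.0, 58.0],
})

# Print the first five rows of the data frame.
print(df.head())

# Pull the predictor (x) and the target (y) out as separate NumPy arrays.
x = df["Height"].to_numpy()
y = df["Weight"].to_numpy()

print(type(x), type(y))
```

With a real file, the data frame would typically come from something like `pd.read_csv(...)` instead of being typed in by hand.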

As you can see, y is nothing but a NumPy array, and similarly, if I type x, x is also a NumPy array. Now we shall find the mean and the standard deviation of x and y, so for that we'll import NumPy. The mean value of y is y_mean, which is np.average(y) or np.mean(y), while std_dev_y stores the standard deviation of y, which I have obtained using np.std(y). Similarly, we can obtain the mean, or average value, of x and the standard deviation of x. Now I am going to run this cell and then print all the values I have obtained. The mean of x is 138.26, the mean of y is 35.61, the standard deviation of x is 27.58, and the standard deviation of y is 14.7.
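As a minimal sketch of this step, the snippet below computes the same four summary statistics with NumPy. The arrays are made-up illustrative values, so the printed numbers will not match the ones reported in the video.

```python
import numpy as np

# Made-up illustrative values, not the course data.
x = np.array([120.0, 135.0, 150.0, 165.0, 180.0])  # height (predictor)
y = np.array([25.0, 33.0, 41.0, 50.0, 58.0])       # weight (target)

# np.mean and np.average agree when no weights are passed.
x_mean = np.mean(x)
y_mean = np.average(y)

# np.std uses the population formula (ddof=0) by default.
std_dev_x = np.std(x)
std_dev_y = np.std(y)

print("x mean:", x_mean)
print("y mean:", y_mean)
print("std dev of x:", std_dev_x)
print("std dev of y:", std_dev_y)
```

Whether the population or the sample standard deviation is used does not change the slope estimate later on, as long as the same choice is made for both x and y, because only their ratio enters the formula.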

Now, we shall find the correlation coefficient between x and y. We have already seen how to obtain this in the exploratory data analysis class; in the same way, we'll obtain the correlation coefficient between x and y using the np.corrcoef function. The correlation coefficient, rounded to three decimal places, is 0.941. Now, we shall find the estimated model parameters using the equations we have seen in the last video. Our estimate of theta one is equal to the correlation coefficient between x and y multiplied by the standard deviation of y and divided by the standard deviation of x. So we'll go ahead and run this cell and print the estimated value of theta one, and that is 0.5017. Similarly, we can obtain the estimated value of the parameter theta zero, which is nothing but the mean of y minus the estimated value of theta one multiplied by the mean of x.
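The two estimates described here, theta one = r * (std of y) / (std of x) and theta zero = (mean of y) - theta one * (mean of x), can be sketched as below. The arrays are made-up illustrative values, and the names `theta_0` and `theta_1` are chosen for this sketch. The `np.polyfit` call is an added cross-check, not something the video uses.

```python
import numpy as np

# Made-up illustrative values, not the course data.
x = np.array([120.0, 135.0, 150.0, 165.0, 180.0])  # height (predictor)
y = np.array([25.0, 33.0, 41.0, 50.0, 58.0])       # weight (target)

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the correlation coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]

# Slope: correlation times the ratio of the standard deviations.
theta_1 = r * np.std(y) / np.std(x)

# Intercept: the fitted line passes through the point of means.
theta_0 = np.mean(y) - theta_1 * np.mean(x)

print("r       =", round(r, 3))
print("theta_1 =", theta_1)
print("theta_0 =", theta_0)

# Cross-check against NumPy's least-squares line fit
# (for degree 1, polyfit returns slope first, then intercept).
slope, intercept = np.polyfit(x, y, 1)
print("polyfit :", slope, intercept)
```

Plugging the rounded values reported in the video into the same formulas gives 0.941 * 14.7 / 27.58 ≈ 0.5016 and 35.61 - 0.5017 * 138.26 ≈ -33.755, which agrees with the reported estimates up to rounding.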

If we run this cell, we can see that theta zero is -33.756. Now we can go ahead and plot the regression line along with the data, so we'll import matplotlib.pyplot for the plotting. To plot the regression line, we need some x data, which is nothing but linearly spaced values between the minimum value of x and the maximum value of x, and we are using a hundred data points. The y data is nothing but theta zero plus theta one multiplied by the x data. Let's go ahead and compute x data and y data.
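A minimal sketch of how the points for the regression line might be generated is shown below. The `x` array is made up for illustration, while `theta_0` and `theta_1` are set to the values reported in the video; the names `x_data` and `y_data` follow the spoken description.

```python
import numpy as np

# Made-up illustrative heights, not the course data.
x = np.array([120.0, 135.0, 150.0, 165.0, 180.0])

# Parameter estimates as reported in the video.
theta_0 = -33.756
theta_1 = 0.5017

# 100 evenly spaced points from the smallest to the largest x value.
x_data = np.linspace(x.min(), x.max(), 100)

# Predicted y for each of those points: the regression line itself.
y_data = theta_0 + theta_1 * x_data

print(x_data[:3])
print(y_data[:3])
```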

Now in this cell, we'll plot the scatter plot of x and y, that is, weight versus height. Along with that, we'll plot x data and y data to show the regression line, and I have turned the grid on, so you can see a grid inside the plot. As you can see, the blue dots are the scatter plot of weight versus height, and the red line shows the regression line. This is the best-fit straight line for the entire data set. So that is all for this video: we have seen how to use the power of Python to solve the bivariate regression analysis problem.
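A plot along these lines could be produced roughly as follows. The data values are made up, the line is fitted with `np.polyfit` for brevity rather than with the formulas from the video, and the `Agg` backend is selected only so the sketch runs without a display. In a notebook, that backend line can be dropped.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Made-up illustrative values, not the course data.
x = np.array([120.0, 135.0, 150.0, 165.0, 180.0])  # height
y = np.array([25.0, 33.0, 41.0, 50.0, 58.0])       # weight

# Least-squares line (slope first, then intercept for degree 1).
theta_1, theta_0 = np.polyfit(x, y, 1)

# Points along the fitted line, spanning the observed x range.
x_data = np.linspace(x.min(), x.max(), 100)
y_data = theta_0 + theta_1 * x_data

fig, ax = plt.subplots()
ax.scatter(x, y, color="blue", label="data")                   # blue dots
ax.plot(x_data, y_data, color="red", label="regression line")  # red line
ax.set_xlabel("Height")
ax.set_ylabel("Weight")
ax.grid(True)  # show the grid inside the plot
ax.legend()
```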

In the next video, we shall go into more detail on regression analysis with multiple linear regression. Thank you, and see you in the next lecture.
