Linear Regression Practical Session - Part - 3

SAS Analytics Linear Regression-Case Study & Practical session
8 minutes

Transcript

In this video we will divide the data into two parts, training and validation. A part of the data will be allocated to training and a part will be allocated to validation, and we will build our model on the training data. We will also do stepwise selection on our data. So let's do it. First, in the DATA statement, we create two data sets: training and validation. As we haven't specified any library name in the DATA step, it is implied that both data sets will be created inside WORK. Then SET mylib1, that is my library name, dot the data set name, which is linear_reg_retail. SET is the keyword where we specify the name of our input data set.

So our original data set, linear_reg_retail, which is located inside the library mylib1, is the input data set. The observations of this original data set are copied into the two data sets that we are creating, training and validation, which will be created in WORK. Now we have to specify the proportion in which the data set will be divided. For that I am using ranuni(0). ranuni(0) is a function that generates random numbers, and here it is used to divide the data into two parts.

Because we are generating random numbers, the division will not be exact; it will be approximate. We have specified ranuni(0) less than 0.7, that is, about 70% of the observations go to the data set training and the remaining 30% go to the data set validation. My original data set has 200 observations in total, and 70% of 200 is 140. So 140 observations should go to training and 60 observations should go to validation. But because the division is done with random numbers generated by ranuni, it will not be exactly 70% and 30%; there will be an approximation in the division.
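The DATA step described so far can be sketched roughly as follows. This is a reconstruction from the spoken description, not the instructor's actual file: the library name mylib1 and the data set name linear_reg_retail are assumed spellings of what is heard in the recording.

```sas
/* Sketch only: mylib1 and linear_reg_retail are assumed names
   reconstructed from the narration. */
data training validation;                     /* no library given, so both go to WORK */
    set mylib1.linear_reg_retail;             /* input data set */
    if ranuni(0) < 0.7 then output training;  /* roughly 70% of the observations */
    else output validation;                   /* the remaining observations */
run;
```

With a seed of 0, ranuni takes its seed from the system clock, which is why the split changes on every run. Replacing 0 with a fixed positive seed, for example ranuni(12345), should give the same split each time.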

And because we have generated random numbers, every time we run this code we will get a different set of results. Before I run the code, let me explain it once more. Here we have created the data sets training and validation from the original data set linear_reg_retail stored in mylib1. ranuni(0) generates random numbers; about 70% of the data goes to training and the remaining 30% goes to validation. Because we have generated random numbers, the division will not be exactly 70:30; it will be an approximation. So let's run this code first.

As we did not specify any library name, the two data sets are created in WORK. So let's open the WORK library. See, this is the training data set. In the training data there are 143 observations; as I told you, exactly 140 observations will not be allocated to training. Validation has the remaining observations, that is, 57. So now the division of the data is done in roughly a 70:30 ratio; the ratio may vary according to your own choice. Now we will do stepwise selection and build a model based on our training data.

So we will be using the procedure PROC REG with data equal to training. MODEL is the keyword we use here to create the classical linear regression model. Customer satisfaction is my dependent variable. From product quality to price flexibility, they are all my independent variables. We are going to do stepwise selection using the keyword for adjusted R-square. Then we end with the statements RUN and then QUIT.
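A minimal sketch of the regression step just described might look like this. The variable names are assumed spellings of the names spoken in the video, and the double-dash list assumes the predictors sit in consecutive columns of the training data set, from product_quality through price_flexibility.

```sas
/* Sketch only: variable names are assumed, not taken from the course files. */
proc reg data=training;
    /* dependent variable = predictors; "--" selects every variable stored
       between the two named ones, in data set order */
    model customer_satisfaction = product_quality -- price_flexibility
          / selection=adjrsq;   /* rank candidate models by adjusted R-square */
run;
quit;
```

Note that in PROC REG, SELECTION=ADJRSQ compares subsets of the predictors and lists them in decreasing order of adjusted R-square. The speaker refers to this as stepwise selection; PROC REG also has a separate SELECTION=STEPWISE method that adds and removes variables using significance levels.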

We have built the classical linear regression model on the training data, with customer satisfaction as our dependent variable and the independent variables from product quality to price flexibility. We are doing stepwise selection, which is done to select the set of significant variables. The set of significant variables that we get here is what we have to use in future when we run the next regression procedures to predict the dependent variable. So here I am doing stepwise selection, and we are using the technique of adjusted R-square. So let's run this code. See, here the stepwise selection is done. In every step one or another of the variables is either added or removed, and the variables are added or removed based on the values of adjusted R-square and R-square. The steps are sorted in descending order of adjusted R-square.

So the step where my adjusted R-square value is maximum, whatever independent variables are there in that step will be my significant variables, which I will use in future to predict my dependent variable. See, here the maximum adjusted R-square value is 0.8014, that is, 80.14%. The independent variables that are significant are product quality, e-commerce, advertising, product line, salesforce image, competitive pricing, packaging, order billing and price flexibility. So out of all the variables from product quality to price flexibility, only this set of nine independent variables is taken as significant for our model. This result will change every time we run the code because we have generated random numbers.
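For reference, the adjusted R-square used to rank the candidate models is conventionally defined as below. The symbols are introduced here for illustration and do not appear in the video: n is the number of observations in the training data, p is the number of independent variables in the model (not counting the intercept), and R^2 is the ordinary R-square.

```latex
\[
  R^2_{\text{adj}} \;=\; 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
\]
```

Unlike R-square, which never decreases when a variable is added, this quantity goes down when an added variable does not improve the fit enough to justify the extra parameter, which is why it can be used to compare models with different numbers of variables.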

So now let me copy this set of variables and keep it, because I need it for future purposes. I am keeping this set of variables here and converting it into a comment, because if we write the variable names as they are, the SAS editor window will not accept them. So these are the set of independent variables. In this video we will do this much only. So let's end the video here.
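Keeping the selected variables as a comment could look something like the following. The underscore spellings are assumptions; the actual variable names in the course data set may differ.

```sas
/* Significant variables from this run (assumed spellings):
   product_quality ecommerce advertising product_line salesforce_image
   competitive_pricing packaging order_billing price_flexibility */
```

SAS ignores everything between the opening and closing comment markers, so the list can sit in the editor next to the code without being executed.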

Thank you. Goodbye. See you next week.
