Practical Applications of Linear Regression in SAS - 1

7 minutes

Transcript

Welcome to the clinical data management program using SAS. In this video we'll be discussing the practical applications of linear regression using SAS. First of all, we'll import our data set using the procedure called PROC IMPORT. We write DATAFILE= and, within double quotes, give the path of the data set. This is where our data sets are present, so we give the proper name: my data set is the death rate CSV file. You can get the data path from the file's properties. Then OUT= death rate: this is my data set name once it is imported into the SAS environment, and it will be created inside the WORK library. Then REPLACE. Let's run this code.
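The import step described above might look like the following minimal sketch. The file path, the spelling `death_rate`, and the `DBMS=CSV` and `GETNAMES=YES` settings are assumptions for illustration; the video only names `DATAFILE=`, `OUT=`, and `REPLACE`.

```sas
/* Sketch only: the path and the data set name are placeholders. */
proc import datafile="/path/to/death_rate.csv"  /* hypothetical location of the CSV file */
    out=work.death_rate                         /* data set created in the WORK library */
    dbms=csv                                    /* assumed: tells SAS the file is comma-separated */
    replace;                                    /* overwrite WORK.DEATH_RATE if it already exists */
    getnames=yes;                               /* assumed: first row holds the variable names */
run;
```

`REPLACE` matters when the step is rerun: without it, PROC IMPORT does not overwrite an existing data set of the same name.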

So this is my data set. Death rate is my dependent variable, and the rest are my independent variables: lung cancer, heart patients, liver failure, multiple sclerosis, tuberculosis, cancer, brain tumor, [unclear] patients, HIV, [unclear], pneumonia, malaria, Alzheimer's disease, and kidney failure. These are my different causes of death; they are the independent variables that explain the dependent variable, death rate. Over here I have the variable names: death rate, followed by the same causes of death from lung cancer through kidney disease. These are all our variables, where death rate is the dependent variable and the rest are independent variables. There are approximately [number unclear] observations in the data set.
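One way to confirm the variable names and the number of observations after the import is sketched below. This step is not shown in the video, and the data set name `work.death_rate` is an assumed spelling.

```sas
/* Sketch only: inspects the imported data set (name assumed). */
proc contents data=work.death_rate;      /* lists variables, types, and the observation count */
run;

proc print data=work.death_rate(obs=5);  /* shows the first five rows as a quick sanity check */
run;
```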

Now let's start with the computation. We'll be using the procedure called PROC REG. DATA= death rate, because that is the name of my data set. Since it is present in WORK, the library name is WORK, and a data set in the WORK library can be referred to without a library name; it is implied that it is present there. Then MODEL: my dependent variable is death rate, and my independent variables start from lung cancer. From here we can copy the variable names, starting from lung cancer.

These are the independent variables. Here I'll be using the keyword VIF in this procedure to check the multicollinearity of our data, that is, whether our independent variables are influencing one another or not. VIF stands for variance inflation factor; by looking at the value of the variance inflation factor we decide the amount of multicollinearity for each of the independent variables. So let's run this code. This is my output. These are the regression coefficients. And we see the VIF values, which at maximum should not cross the benchmark.
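A minimal sketch of the regression with the VIF option follows. The variable names are assumed spellings, and the name-range list `lung_cancer -- kidney_disease` assumes the predictors sit in adjacent columns in that order. For predictor j, the variance inflation factor is 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the other predictors.

```sas
/* Sketch only: variable names and column order are assumptions. */
proc reg data=work.death_rate;
    /* dependent variable on the left, predictors on the right;   */
    /* VIF adds a Variance Inflation column to the estimates table */
    model death_rate = lung_cancer -- kidney_disease / vif;
run;
quit;  /* PROC REG is interactive, so QUIT ends the procedure */
```

If the predictors are not stored contiguously, listing each variable name explicitly in the MODEL statement avoids depending on column order.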

So you'll see that the kidney disease variable has a VIF that is more than our benchmark, which is 10. So we'll be removing this variable and again checking the multicollinearity of our data. It is the last variable, so we will leave out kidney disease. Now let's run the code and check the multicollinearity. Now the VIFs all come within 10, which is good. Next, let's test for autocorrelation; we'll be using the same procedure, PROC REG.
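The refit without the high-VIF variable could be sketched as below. It assumes, as before, hypothetical variable names and that kidney disease is the last column, so ending the range at `alzheimers_disease` leaves it out.

```sas
/* Sketch only: drops the last predictor by ending the range one column earlier. */
proc reg data=work.death_rate;
    model death_rate = lung_cancer -- alzheimers_disease / vif;  /* kidney_disease excluded */
run;
quit;
```

Removing one variable changes the VIF of every remaining predictor, which is why the check is rerun rather than read off the first output.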

DATA= death rate. The dependent variable will be death rate. We'll be doing the Durbin-Watson test: MODEL death rate = lung cancer to Alzheimer's disease, with the Durbin-Watson test option, and then RUN. So let's run this code. This is the result of the Durbin-Watson test. From the test statistics we get the value of DW, that is, the Durbin-Watson statistic. The Durbin-Watson statistic lies between 1.5 and 2.5 when there is no autocorrelation. Here the Durbin-Watson statistic is 1.9.
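The Durbin-Watson check can be sketched with the `DW` option of the MODEL statement; the variable names are again assumed spellings. The statistic ranges from 0 to 4, and values near 2 indicate no first-order autocorrelation, so the 1.5 to 2.5 band quoted in the video is best read as a rule of thumb.

```sas
/* Sketch only: same model as before, now requesting the Durbin-Watson statistic. */
proc reg data=work.death_rate;
    model death_rate = lung_cancer -- alzheimers_disease / dw;  /* DW prints the Durbin-Watson D */
run;
quit;
```

The statistic depends on the order of the rows, so it is most meaningful when the observations are sorted in time order.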

So this lies within that range. That means there is no autocorrelation in our data. Another test we'll be doing is for autocorrelation and heteroscedasticity. We'll be using the same procedure: PROC REG, DATA= death rate, MODEL death rate = lung cancer to Alzheimer's disease, with the keyword SPEC, which stands for specification. Let's run this code. In the specification test, the p-value is 0.6845. The null hypothesis of the specification test is that there is no heteroscedasticity and autocorrelation, and the alternative hypothesis is that there is heteroscedasticity or autocorrelation in our model. Since here the p-value is greater than 0.05, which is the level of significance, we do not reject our null hypothesis, that is, there is no autocorrelation or heteroscedasticity in our model. Autocorrelation means the error terms are correlated with respect to time, and heteroscedasticity means the variance of the error terms is not constant.
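The specification test can be sketched with the `SPEC` option; the variable names are assumed spellings. SAS documents `SPEC` as a test that the first and second moments of the model are correctly specified (White's test), so it speaks mainly to heteroscedasticity, while the Durbin-Watson statistic remains the direct check for autocorrelation.

```sas
/* Sketch only: requests the test of first and second moment specification. */
proc reg data=work.death_rate;
    model death_rate = lung_cancer -- alzheimers_disease / spec;
run;
quit;
```

A p-value above the chosen significance level, such as the 0.6845 reported in the video against 0.05, gives no evidence against constant error variance.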

So, this is how you check autocorrelation. Now, let's divide the data set into two parts, training and validation, with 70% of our data in training and 30% in validation. Here we have to randomize. From this step onwards, our answers will be different, because we'll have a different number of observations every time we run this step, that is, every time we divide our data set. So we are creating two data sets: DATA training validation, SET death rate, IF RANUNI(0), which means we are creating random numbers, is less than 0.7, THEN OUTPUT to training. So about 70% of the observations, chosen randomly through the generated random numbers, will go to the training data, and the remaining 30% will go to the validation data, that is, the testing data.
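The random split described above might be written as the following sketch. The data set names `training` and `validation` follow the video; `work.death_rate` is an assumed spelling of the source data set.

```sas
/* Sketch only: approximate 70/30 random split into two data sets. */
data training validation;
    set work.death_rate;
    /* RANUNI(0) draws a uniform random number in (0, 1), seeded from the clock */
    if ranuni(0) < 0.7 then output training;  /* about 70% of rows */
    else output validation;                   /* the remaining rows */
run;
```

Because each row is assigned independently, the split is only approximately 70/30, and a seed of 0 gives different data sets on every run. Replacing 0 with a fixed positive seed would make the split reproducible.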

So let's run this. In the training data set we have 69 observations, and the remaining [number unclear] observations are in the validation data set. We'll stop here for this video; the next part of our linear regression practical we'll be doing in our next video. Thank you, goodbye. See you in the next video.
