Welcome to the clinical data management program using SAS. In this video, we will be discussing how to perform correlations using the CORR procedure. So, what is correlation? Basically, correlation measures the degree of association between two numeric variables, that is, whether the two numeric variables are related to each other or not, and what the magnitude of that relation is. Correlation is measured using the correlation coefficient, which is denoted by r. The value of r ranges from minus one to plus one. As a rule of thumb, if the absolute value of r is equal to 0.5, the variables are moderately correlated; if it is less than 0.5, there is low correlation; and if it is greater than 0.5, there is high correlation.
The sign of the coefficient denotes whether the correlation is positive or negative. For example, if I say two variables are correlated and the correlation coefficient is minus 0.7, that means they are negatively correlated. 0.7 means the correlation is obviously high, but they are negatively, or inversely, correlated: if one variable increases, the other decreases. And suppose the correlation coefficient is plus 0.7. Then the correlation is still high, because the absolute value is greater than 0.5, but it is positive because the sign is positive. That means if one variable is increasing, the other variable is also increasing, that is, they are directly related to each other.
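For reference, the video does not show a formula for r. Assuming it is the Pearson correlation coefficient, which is what PROC CORR reports by default, the standard textbook definition for n paired observations (x_i, y_i) with means x-bar and y-bar is:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1
```

The numerator is positive when the two variables tend to move together and negative when they tend to move in opposite directions, which is where the sign of r comes from.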
So, when we say negatively correlated, we mean the variables are inversely related to each other, and when we say positively correlated, we mean they are directly related to each other. Now, let's understand the concept of correlation using SAS. So, let's move to SAS. To understand correlation, we'll be using the PROC CORR procedure, and we'll be using the disease dataset which we had used before. So first we have to run our LIBNAME statement, the one that assigns the CDM library.
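A minimal sketch of this setup step. The folder path is a placeholder, since the video does not show it, so substitute the location where the disease dataset is actually stored:

```sas
/* Assign the libref CDM. The path is a placeholder, not taken from the video. */
libname cdm "/path/to/cdm/data";

/* List the variables, their types, and any labels in the disease dataset. */
proc contents data=cdm.disease;
run;
```

Running PROC CONTENTS first is an optional check that the library was assigned and that the dataset is visible under the name CDM.DISEASE.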
This is our disease dataset, which consists of around, say, 1000 observations, and it has variables like disease, military service, available vehicles, daily internet use, average commute, ancestry, children, marital status, education, employment status, zip code, date of birth, gender and ID. Now we are going to find the correlation between two numeric variables, say, between average commute and daily internet use. So for that, how do we proceed? We'll be using the procedure called PROC CORR, so we write proc corr data=cdm.disease. Then we write the VAR statement. My first variable is average commute, so let me copy the variable name from here. If you copy the variable name from here, at least you can avoid spelling mistakes.
And you can also check whether the variables have any labels, because you cannot refer to a variable by its label. Next is daily internet use. So these are the two variables between which I want to check the correlation, and I'm specifying them in the VAR statement. Before I run this code, let me first explain the code to you all. We're using the correlation procedure on the CDM.disease dataset, and I want to find the correlation between average commute and daily internet use. So let's run this code. See, the correlation is around minus 0.0104, or you can also say minus 0.01.
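The code described above can be sketched as follows. The variable names average_commute and daily_internet_use are assumptions, because the exact spellings in the dataset are not visible in the transcript, and the libref CDM is assumed to be assigned already:

```sas
/* Pearson correlation between two numeric variables in CDM.DISEASE.     */
/* Variable names are hypothetical; copy the real ones from the dataset. */
proc corr data=cdm.disease;
    var average_commute daily_internet_use;
run;
```

With two variables in the VAR statement, the default output includes a simple statistics table and a 2-by-2 matrix of Pearson correlation coefficients, with a p-value printed beneath each coefficient.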
So basically, the absolute value of the correlation is very much less than 0.5. That means the correlation is very low, and the minus sign means the variables are negatively correlated, but only very weakly. You should also understand that since the correlation is close to zero, we can say the variables are almost uncorrelated. If the correlation value is exactly zero, we say the variables are uncorrelated. Since minus 0.01 is close to zero, the variables average commute and daily internet use are not much correlated with each other. So in this video, we'll be learning till here. In my coming video, I'll be discussing the different concepts of logistic regression. Thank you. Goodbye, see you in the next video.