Summarizing Clinical Data

6 minutes

Transcript

Welcome to our clinical data management program using SAS. In this video, we will be discussing summarizing clinical data. So what do we mean by summarizing clinical data? We have to group our data with respect to a variable, and then we summarize the data. For example, in this case our objective is to group our data with respect to the variable gender. Gender is of two types, males and females, and we are going to display the total average commute for males and the total average commute for females. To do this, we are going to use the disease data set.

For grouping, we first sort the data set with respect to the variable gender, because a necessary condition for grouping and summarizing data by a variable is that the data is first sorted with respect to that variable. So first, we'll be sorting the disease data set with respect to the variable gender. We are not going to sort the original data set. Rather, we will create a duplicate data set and sort the duplicate, keeping the original data set intact, and we will use the duplicate data set for summarizing our data. To get the disease data set, we have to run the LIBNAME statement.
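As a minimal sketch of that step: the libref CDM comes from the video, but the folder path below is a hypothetical placeholder and must be replaced with wherever the disease data set is actually stored. PROC CONTENTS is added here only as a convenient way to confirm the variable names before sorting.

```sas
/* Hypothetical path: replace with the folder that holds the disease data set. */
libname cdm "/path/to/clinical/data";

/* List the variables in CDM.disease to confirm their names and types. */
proc contents data=cdm.disease;
run;
```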

I showed you all before how to run the LIBNAME statement and how to access your data sets in the SAS environment. So this is the disease data set that we're going to work with. It consists of these columns: ID, gender, date of birth, record, employment status, education, marital status, children, ancestry, average commute, daily interviews, available vehicles, military service and disease. So first, let me sort the data with respect to gender. For that we're going to use the PROC SORT procedure. So proc sort data= and, since my library name is CDM, CDM.disease.

I'm writing out= disease7 because I am creating a duplicate data set named disease7. That is, my original data set CDM.disease is copied into the duplicate data set disease7, and we sort the duplicate, keeping the original data set intact, so we won't be modifying the original data set CDM.disease. Then I'm using the BY statement, that is, by gender. The BY statement is used to specify the variable with respect to which we sort the data set. And then run. PROC SORT does not generate any output in the results viewer or any report, so to check whether the data is sorted, we open the output data set and look at it.
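The sort step as described would look roughly like this. It assumes the libref CDM has already been assigned with a LIBNAME statement and that the grouping variable is literally named gender.

```sas
/* Sort a copy of CDM.disease; the original data set is not modified.   */
/* out=disease7 has no libref, so the copy is written to WORK.disease7. */
proc sort data=cdm.disease out=disease7;
    by gender;   /* ascending order is the default */
run;
```

Without the out= option, PROC SORT would replace CDM.disease with its sorted version, which is what the duplicate data set avoids.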

So let's execute this code. For out= disease7 we haven't specified any library name, so by default it is created and saved in the Work library. See, the disease7 data set is sorted with respect to gender. Sorting is done by default in ascending order, so here gender is sorted in ascending order, that is, in alphabetical order from A to Z. Now we are going to do the grouping and summarizing part, for which we are going to use another type of DATA step; it is also part of data manipulation. We are going to create our output data set named disease8. You can create the output data set in any library you want; I am creating it in the Work library. So, data disease8, then set disease7, the sorted data set. In the DATA statement I've specified my output data set, and in the SET statement I've specified my input data set; the input data set is copied into the output data set. Then by gender. Then we use the statement if first.gender: we specify the grouping variable in the BY statement, and we also have to use the syntax first. followed by the grouping variable, then the accumulating variable, that is, commute1 = 0, because the initial value of my accumulating variable will be zero.

And basically we are doing a cumulative total of the average commute. For that we have to give the sum statement, that is, commute1 + average commute. We also have to give if last.gender, where gender is the grouping variable. And then run. So let's run this code. See here, the total average commute for females is this and the total average commute for males is this.
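Put together, the DATA step described above might be written as follows. The transcript does not give the exact name of the average commute variable, so avg_commute is an assumed name; disease7, disease8, gender and commute1 follow the narration.

```sas
/* Assumes WORK.disease7 exists and is sorted by gender. */
data disease8;
    set disease7;
    by gender;                           /* creates first.gender and last.gender */
    if first.gender then commute1 = 0;   /* reset the total at the start of each group */
    commute1 + avg_commute;              /* sum statement; avg_commute is an assumed variable name */
    if last.gender;                      /* keep only the last row of each group */
run;

proc print data=disease8;
    var gender commute1;
run;
```

The sum statement retains commute1 from one row to the next and treats missing values of the added variable as zero, so no separate RETAIN statement is needed.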

So we have basically sorted the data first, then grouped the data with respect to the variable gender. We sorted the duplicate data set with respect to gender in ascending order and kept the original data set intact; that's why we used the out= option. We used by gender for the grouping, and then if first.gender and if last.gender. If it is the first row of a group, the initial value of the accumulating variable is zero, and to generate the cumulative total we gave the sum statement commute1 + average commute, then if last.gender, and run. So this is the value of average commute, and this is the value of commute1: one value for females and one for males, each being the cumulative total of average commute for that group.
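The whole sequence in this recap can be tried end to end on a small made-up data set that needs no library. Everything here (toy, toy_sorted, toy_totals, avg_commute, total_commute and the four data rows) is invented for illustration and is not taken from the disease data set.

```sas
/* Made-up data: two males and two females. */
data toy;
    input gender $ avg_commute;
    datalines;
M 10
F 20
M 30
F 5
;
run;

/* Sort a copy by the grouping variable. */
proc sort data=toy out=toy_sorted;
    by gender;
run;

/* Accumulate within each gender and keep one row per group. */
data toy_totals;
    set toy_sorted;
    by gender;
    if first.gender then total_commute = 0;
    total_commute + avg_commute;
    if last.gender;
    keep gender total_commute;
run;

proc print data=toy_totals noobs;
run;
```

With these four rows the printed result has two lines: F with a total of 25 and M with a total of 40.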

The first value, 29,352.70, is the cumulative total of average commute for females, and 31,401.08 is the cumulative total of average commute for males. That is, the value of about 29,000 is for the females and the value of about 31,000 is for the males. So this is the concept of grouping, and this is the way you summarize clinical data. In my next video, I'll be teaching you how to convert numeric data types to character types and how to convert character types to numeric data types. For now, let me end this video over here.

Thank you. Goodbye. See you all for the next video.
