Merging and Concatenation of Datasets

7 minutes

Transcript

Welcome to Clinical Data Management Programs Using SAS. In this video we will be discussing merging and concatenation of data sets. For merging, we first need to know that the data sets we are merging should have at least one common variable, and that we have to sort the original data sets with respect to that common variable. Then we merge those data sets by the common variable. So we are first going to use the procedure called PROC SORT to sort the data that we want to merge. Let me show you the data first, and then I'm going to start with the procedure.

I'm using a data set called D5, which is located inside the CDM library; I'm running the LIBNAME statement to get the library defined. So this is my data set, consisting of ID, gender, date of birth, zip code, employment status, education, marital status, children, average commute, available vehicles and disease. I am clubbing, or merging, this data set D5 with the data set D6. D6 also has ID, date of birth, zip code, education, marital status, children, ancestry, date of interview, available vehicles, military service and disease. So I'm going to use one of the common variables of these data sets. ID is one of the variables common to the two data sets, so I'm using the variable ID.
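To make the idea of a common variable concrete, here is a minimal sketch that lists the variable names two data sets share. The data sets `work.d5_toy` and `work.d6_toy` are hypothetical toy stand-ins with only a few of the variables named in the video, not the real CDM.D5 and CDM.D6, and the values are made up for illustration.

```sas
/* Hypothetical toy stand-in for D5 (a few variables only). */
data work.d5_toy;
    input id gender $ zipcode $;
    datalines;
3 F 10001
1 M 10002
2 F 10003
;
run;

/* Hypothetical toy stand-in for D6 (a few variables only). */
data work.d6_toy;
    input id education $ children;
    datalines;
2 College 1
3 School 0
1 College 2
;
run;

/* List the variable names that appear in both data sets.
   DICTIONARY.COLUMNS stores library and member names in upper case. */
proc sql;
    select a.name
    from dictionary.columns as a
         inner join dictionary.columns as b
         on upcase(a.name) = upcase(b.name)
    where a.libname = 'WORK' and a.memname = 'D5_TOY'
      and b.libname = 'WORK' and b.memname = 'D6_TOY';
quit;
```

For these toy data sets the only shared name is `id`, which is the kind of variable the merge needs.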

Let me close the view tables first. So first, I want to sort the data with respect to ID. I'm using PROC SORT on the CDM.D5 data set. I'm not going to change my original data set, so I'm going to create a duplicate data set, which is a copy of CDM.D5, and sort the duplicate. I'm creating the duplicate using the OUT= option, that is OUT=D15, and this data set is created inside WORK. Then I specify my BY variable, which is the common variable ID, and we run this code. So this is the D15 data set, which is sorted with respect to ID. The next data set is D6, because we are merging D5 with D6. Again I'm going to sort that with respect to ID. I am sorting both data sets in ascending order, which is the default.
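The sorting step described above can be sketched as follows. This is an illustration under assumptions, not the exact code from the video: `work.d5_toy` is a hypothetical toy data set standing in for CDM.D5, and `work.d15_toy` plays the role of the sorted copy D15.

```sas
/* Hypothetical toy stand-in for CDM.D5, deliberately unsorted. */
data work.d5_toy;
    input id gender $;
    datalines;
3 F
1 M
2 F
;
run;

/* OUT= writes the sorted rows to a new data set, so the input
   data set keeps its original order. Ascending is the default. */
proc sort data=work.d5_toy out=work.d15_toy;
    by id;
run;

/* Print the sorted copy to confirm the order of ID. */
proc print data=work.d15_toy noobs;
run;
```

Without OUT=, PROC SORT replaces the input data set with the sorted version, which is why the copy is used here.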

Let's open the D16 data set. This is also sorted with respect to ID. ID is the common variable for both data sets, D15 as well as D16. I have created a duplicate data set for CDM.D5, that is D15, and sorted D15 with respect to ID in ascending order. Similarly, I have created a duplicate data set for CDM.D6, that is D16, sorted with respect to ID in ascending order. So this is D15, sorted with respect to ID, and this is D16, sorted with respect to ID.

Now I'm going to merge these two data sets with respect to the common variable ID. For merging, we are again going to use the DATA step. My resulting data set name will be D17; that is my output data set, and I am creating it inside WORK. Then I'm merging D15 and D16, which are my input data sets, by ID, and then I run that. So let's run this code. See, D17 is merged. This is my merged data set.
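The sort-then-merge sequence just described can be sketched end to end like this. The `_toy` data sets are hypothetical stand-ins for D5, D6, D15, D16 and D17, with made-up values and only two variables each.

```sas
/* Hypothetical toy inputs sharing the variable ID. */
data work.d5_toy;
    input id gender $;
    datalines;
3 F
1 M
2 F
;
run;

data work.d6_toy;
    input id education $;
    datalines;
2 College
3 School
1 College
;
run;

/* Sort copies of both inputs by the common variable. */
proc sort data=work.d5_toy out=work.d15_toy;
    by id;
run;

proc sort data=work.d6_toy out=work.d16_toy;
    by id;
run;

/* Match-merge: rows with the same ID are combined side by side,
   so the result has the columns of both inputs. */
data work.d17_toy;
    merge work.d15_toy work.d16_toy;
    by id;
run;

proc print data=work.d17_toy noobs;
run;
```

The BY statement in the merge has to match the sort order of the inputs. If the copies had been sorted with `by descending id;`, the merge would need `by descending id;` as well, otherwise SAS stops with an error that the BY variables are not properly sorted.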

This resulting data set is the outcome of merging D15 and D16. So this is the concept of merging. You have to remember that for merging, you first have to sort the original data sets with respect to a common variable, in either ascending or descending order; that is your choice. Then you merge those data sets with respect to the common variable.

Now let's learn the concept of concatenation. For concatenation we do not need any sorting; we can concatenate directly using the SET statement. For that also we'll be using the DATA step, naming the resulting data set in the DATA statement.

I'm creating a data set inside WORK, and I am concatenating with SET CDM.D7 and CDM.D8. Let me show you the D7 and D8 data sets. They are located inside the CDM library. So this is D7. When we concatenate these two data sets, basically the rows get added: the resulting data set will have the number of rows of D7 plus the number of rows of D8. So we concatenate using the SET statement, then we run it, and the resulting data set is created inside WORK.
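As a rough sketch of concatenation with the SET statement, the example below stacks two small data sets and counts the rows of the result. `work.d7_toy`, `work.d8_toy` and `work.stacked_toy` are hypothetical names with made-up values, standing in for CDM.D7, CDM.D8 and the output data set from the video.

```sas
/* Hypothetical toy stand-ins for CDM.D7 (3 rows) and CDM.D8 (2 rows). */
data work.d7_toy;
    input id disease $;
    datalines;
1 Asthma
2 None
3 Flu
;
run;

data work.d8_toy;
    input id disease $;
    datalines;
4 None
5 Asthma
;
run;

/* Concatenation: all rows of the first data set are read,
   followed by all rows of the second. No sorting is required. */
data work.stacked_toy;
    set work.d7_toy work.d8_toy;
run;

/* The row count of the result is the sum of the input row counts. */
proc sql;
    select count(*) as n_rows from work.stacked_toy;
quit;
```

Here the result has 3 + 2 = 5 rows, mirroring how the row counts of D7 and D8 add up in the video.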

So we'll check the WORK library. See, the result consists of 1,009 observations; it is the result of the concatenation of D7 and D8, that is, the observations of D7 plus the observations of D8. So I hope you understood the concepts of merging and concatenation. In merging, we first sort the data with respect to the common variable by which we are merging, and then we merge in the DATA step using the MERGE statement with respect to that common variable. Merging happens with respect to columns, but concatenation happens with respect to rows: the rows get added. In my coming videos, I'll be discussing enhancing reports with titles, footnotes and labels, creating frequency reports, creating summary statistics reports, and creating tabulated reports.

Thank you. Goodbye. See you all for the next video.
