Indexing and Slicing Tables with Pandas

Python 3: Automating Your Job Tasks Superhero Level: Automate Data Analysis Tasks with Python 3
21 minutes
Share the link to this page
Copied
  Completed
You need to have access to the item to view this lesson.
One-time Fee
$99.99
List Price:  $139.99
You save:  $40
€92.01
List Price:  €128.82
You save:  €36.80
£78.38
List Price:  £109.74
You save:  £31.35
CA$136.52
List Price:  CA$191.13
You save:  CA$54.61
A$150.48
List Price:  A$210.68
You save:  A$60.19
S$134.87
List Price:  S$188.83
You save:  S$53.95
HK$780.66
List Price:  HK$1,092.95
You save:  HK$312.29
CHF 91.34
List Price:  CHF 127.88
You save:  CHF 36.54
NOK kr1,052.31
List Price:  NOK kr1,473.28
You save:  NOK kr420.96
DKK kr686.58
List Price:  DKK kr961.24
You save:  DKK kr274.66
NZ$162.83
List Price:  NZ$227.97
You save:  NZ$65.13
د.إ367.25
List Price:  د.إ514.17
You save:  د.إ146.91
৳11,719.82
List Price:  ৳16,408.23
You save:  ৳4,688.40
₹8,312.02
List Price:  ₹11,637.16
You save:  ₹3,325.14
RM469.75
List Price:  RM657.67
You save:  RM187.92
₦147,331.26
List Price:  ₦206,269.66
You save:  ₦58,938.40
₨27,796.36
List Price:  ₨38,916.02
You save:  ₨11,119.65
฿3,662.34
List Price:  ฿5,127.43
You save:  ฿1,465.08
₺3,217.98
List Price:  ₺4,505.30
You save:  ₺1,287.32
B$517.06
List Price:  B$723.91
You save:  B$206.84
R1,837.02
List Price:  R2,571.91
You save:  R734.88
Лв180.27
List Price:  Лв252.38
You save:  Лв72.11
₩136,066.92
List Price:  ₩190,499.14
You save:  ₩54,432.21
₪368.12
List Price:  ₪515.38
You save:  ₪147.26
₱5,814.46
List Price:  ₱8,140.48
You save:  ₱2,326.02
¥15,687.99
List Price:  ¥21,963.82
You save:  ¥6,275.82
MX$1,668.59
List Price:  MX$2,336.09
You save:  MX$667.50
QR364.34
List Price:  QR510.10
You save:  QR145.75
P1,357.56
List Price:  P1,900.64
You save:  P543.07
KSh13,298.67
List Price:  KSh18,618.67
You save:  KSh5,320
E£4,713.99
List Price:  E£6,599.78
You save:  E£1,885.78
ብር5,741.12
List Price:  ብር8,037.80
You save:  ብር2,296.67
Kz84,932.10
List Price:  Kz118,908.34
You save:  Kz33,976.24
CLP$90,081.08
List Price:  CLP$126,117.11
You save:  CLP$36,036.03
CN¥724.33
List Price:  CN¥1,014.10
You save:  CN¥289.76
RD$5,884.09
List Price:  RD$8,237.97
You save:  RD$2,353.87
DA13,459.27
List Price:  DA18,843.53
You save:  DA5,384.25
FJ$222.79
List Price:  FJ$311.92
You save:  FJ$89.12
Q776.15
List Price:  Q1,086.64
You save:  Q310.49
GY$20,904.26
List Price:  GY$29,266.81
You save:  GY$8,362.54
ISK kr13,793.62
List Price:  ISK kr19,311.62
You save:  ISK kr5,518
DH996.63
List Price:  DH1,395.33
You save:  DH398.69
L1,771.78
List Price:  L2,480.56
You save:  L708.78
ден5,664.24
List Price:  ден7,930.17
You save:  ден2,265.92
MOP$804.05
List Price:  MOP$1,125.70
You save:  MOP$321.65
N$1,836.30
List Price:  N$2,570.90
You save:  N$734.59
C$3,677.56
List Price:  C$5,148.73
You save:  C$1,471.17
रु13,287.88
List Price:  रु18,603.56
You save:  रु5,315.68
S/373.53
List Price:  S/522.95
You save:  S/149.42
K388.30
List Price:  K543.64
You save:  K155.33
SAR375.02
List Price:  SAR525.05
You save:  SAR150.02
ZK2,666.83
List Price:  ZK3,733.66
You save:  ZK1,066.83
L457.94
List Price:  L641.14
You save:  L183.19
Kč2,276.37
List Price:  Kč3,187.01
You save:  Kč910.64
Ft35,391.42
List Price:  Ft49,549.41
You save:  Ft14,157.98
SEK kr1,062.40
List Price:  SEK kr1,487.41
You save:  SEK kr425
ARS$89,066.09
List Price:  ARS$124,696.09
You save:  ARS$35,630
Bs690.41
List Price:  Bs966.60
You save:  Bs276.19
COP$385,967.67
List Price:  COP$540,370.17
You save:  COP$154,402.50
₡51,227.77
List Price:  ₡71,720.93
You save:  ₡20,493.15
L2,469.21
List Price:  L3,457
You save:  L987.78
₲751,546.38
List Price:  ₲1,052,195.01
You save:  ₲300,648.62
$U3,849.38
List Price:  $U5,389.29
You save:  $U1,539.90
zł391.29
List Price:  zł547.82
You save:  zł156.53
Already have an account? Log In

Transcript

Hi, and welcome back. In this lecture, we are going to learn about indexing and slicing data frames slash tables with pandas. First of all, I have loaded this JSON file the same JSON file you have seen in the previous lectures. And let's set the terminology straight regarding these data frames. So the columns have column labels, like for instance, address, department, and so on. And the rows have index labels.

So we have, in this case, the defaults, 012, and so on. First of all, let's start with label based slicing. And let's keep the default index labels that we currently have for this table. So zero all the way through nine. For instance, let's see how to return the table entry the table row at index label five, so that would be this row right here. To do that, we are going to use a method called lock.

So the Jason dot lock, and in between square brackets, we enter the label the index label for which we want to have the row returned. In this case, as I said, five. So let's hit Shift Enter. And indeed We have the entire row of this table returned with all the columns address department, first name, ID, last name, phone, salary and skills. What if we want to return the table entries at multiple indexes? For example, let's say at index labels through four and seven.

Well, for this, we would use the Jason dot lock, and in between square brackets, we enter a list so another pair of square brackets, and here we should input the index labels for which we want the rows to be returned. So as I said, two, four and seven, Shift Enter And indeed, we have these three rows returned. Now let's also see a range. So we want to return the table entries at index labels in the range three to five, meaning labels, three, four, and five. That would be these three rows right here. So for this, we are going to use the JSON dot lock, and in between square brackets, we enter the range three, colon, five.

Okay, so this is the range of index labels for which we want the rows to be displayed, Shift Enter. And indeed, the result is the one we were looking for rows labeled with the index labels, three, four, and five. Next, let's assume that we want only a specified range of columns to be returned for a specified range of index labels. So returning to our initial table, let's say that for rows 01, and two having this Three labels, we want only the phone, salary and skills columns to be returned. So we would need two ranges, one for the column labels and one for the index labels. Let's do this the JSON dot lock, and in between square brackets, we enter a comma.

Before the comma, we will enter the range of index labels. So the rows we want to have returned 02, because I said we want 01 and two, these labels, and after the comma, we enter the range of column labels. So we have phone, colon, up to the skills column. This will include phone salary and skills. Okay, let's see this shift, Enter. And indeed for the rows labeled with 01, and two, we have these three columns returned phone, salary and skills.

Now looking back At our initial table, let's assume that we want the address column to be regarded as the table index, instead of the default index labels zero through nine. As we already discussed this in the previous lecture, we should use the JSON dot set underscore index. And let's set the address column as being the new index of this table, comma, and also let's add in place equals true. And also let's print out the Jason so Shift Enter. And now as you can see, the address column is the index of this table instead of the default index labels. Okay.

Now let's assume that we want to have the full rows returned, which are associated with the specified range of values of the address column. Let me show you what I mean by that. For example, we want the full rows being printed out that are awesome. sheeted with this address up to fifth address Miami, so that would be these rows right here. Let's see how to do that, of course, we will need to use another range as we did previously. So the JSON dot lock, and in between square brackets, we enter the first index label, which is first address Miami.

So let me type this in first address, comma, Miami, colon. And now we enter the second index label, which would be as I said, fifth address, comma Miami. Okay, let's run this. And indeed We have these rows returned as we expected. What if we wants to consider the same five index labels as we did in this example, but we want only the phone column to be returned? Well, in that case, let me copy this and paste it and this new cell After this range, we enter a comma.

And then we specify the name of the column that we want to see for these five index labels. So Shift Enter. And indeed, this time, we only get the phone numbers for each of these addresses. What if we want to return multiple columns associated with a specified range of values of the address. So in this case, we only wanted the phone column to be displayed. But now let's say we want the last name and skills columns to be displayed for these five addresses.

Well, in that case, I'm going to copy this because the range is the same comma. And now we are going to enter a list as the second argument, where each element of this list is a name of the column that we want to have returned. So as I said, we want last name, comma and skills. Okay, shift, Enter. And indeed, this time, we have only these two columns. returned for the five addresses the five index labels in our table.

Going further, let's assume that we want to have a range of columns associated with a specified range of values for the address returned to the screen. So in this case, again, we would have two ranges. So let me paste this. This is the first range, the range of index labels. And now after the comma, we enter the range of columns. So let's say this time, last name, colon, all the way up to the skills column, Shift Enter.

And again, this time we have this range of columns last name, phone, salary and skills return for the five addresses for the five index labels in our table. But what if we want to extract a single column for a single address? So mainly, we want a single cell of the table to be returned. For this. Let's look at our table Let's assume that we want to extract the phone number for the first address in the table. So that would be this one right here ending in 890.

And let's see how to do that. Well simply the JSON dot lock, and in between square brackets, first we enter the index label the address for which we want the phone number to be returned. So that would be first address, comma, Miami, comma, and then the column that we're interested in, in our case, phone, shift, Enter. And indeed, we have the phone number returned, the one ending in eight, nine and zero. What if we want to return the phone value for all the entries in the table? Well, in that case, we will use a trick the JSON dot lock again, open and close square brackets.

The first argument will be just the colon which means the entire table and the second argument will be ugly. In the name of the column, phone, shift, enter, and again, we have all the phone numbers for all the addresses in the table. Of course, we can also return the full original table or data frame by using the JSON dot lock. And in between square brackets, we enter just the colon, Shift Enter, and indeed, the entire table is returned. Also, the same can be accomplished if we modify this, and we use colon, comma, and another colon, which is basically the same thing. We want all the columns for all the rows in the table to be displayed.

And that means the entire table, Shift Enter, and the result is the same. Finally, regarding label based indexing, and slicing, let's see how to return all the unique departments in the table. So the unique values on a column so we would have the JSON dot lock in between square brackets first because we are looking to connect See there, all the departments all the entries in the table, we are using a simple colon, comma, and then we are using the name of the column to be considered department. I hit Enter, in order to return the unique elements, the unique departments in the table, I'm going to enclose this as an argument for the set function. So set up the JSON dot lock, shift, Enter. And indeed, we have a set containing all the departments in the company it logistics, marketing and sales.

Okay, so we talked about label based slicing and indexing. Now let's talk about position based slicing, which is done with the eye lock method instead of the lock method that we've seen thus far in this video. The key thing to keep in mind here is that with the eye lock method, so I lock the slicing is done the old Python way, meaning the second index, the one after the colon will not be included in the slice. So this is one of the differences between label based and position based slicing. So let's see this in a couple of examples. I'm going to load the original file once again, and let's see it the JSON.

Okay, so this is the original table, the original data frame. Now, let's assume that we want to return the entry or the row at index number four, where as I said, indexing starts at zero. So that would be the fifth element, the fifth row in the table, this one right here. Let's test this. So the JSON dot I lock, and in between square brackets, we enter the index for Shift Enter. And indeed, we get the fifth element in the table, the fifth row, what if we want to have the entries or rows at indexes four, six and eight returned, where again, remember that the indexing starts at zero.

We will Have the JSON dot I lock. And in between square brackets, we enter a list with the indexes for six, sorry, six, and eight, Shift Enter. And indeed, we have the associated rows printed out to the screen based on these indexes. Now let's return the columns at indexes two, three, and four for the rows at index two. So since we are interested in the columns at indexes, two, three and four, this means that index five is the upper limit, and it isn't included in the slice. However, the slides should be to colon five, right?

So let's try this the JSON dot I lock of two. This is the index representing the row in the table, comma, and now the range of column indexes to colon five. So let's run this. So if we look at the original table, we have the row at index two, which is this one. And we are interested in the columns with the indexes of two up to but not including index five. So that would mean two, three, and four, which would be first name, ID, and last name.

So in the case of the columns, we have the first column with index zero, index one, index two is the first name, index three is the ID, index four is the last name. So we have first name, ID and last name returned for the row at index two. What if we want to print out the columns at indexes two, three, and four, but this time for all the entries, all the rows in the table, again, index five would be the upper limit, and it will not be included in the slice. So let me type this in the JSON dot I lock and the first argument is going to be a simple colon because as I said, we want all the entries all the rows in the table. And for these rows, we only want columns two, three and four to be displayed, that would mean range to up to, but not including five, shift, Enter.

And indeed, we have these three columns printed out for all the rows in the table. Again, as we did with the lock method, we can return the full original table slash data frame using the JSON dot I lock. And in between the square brackets, we just enter the colon, Shift Enter. And the same result would be achieved. If we enter two colons in between square brackets separated by a comma, Shift Enter, and the result is identical. Okay, what else can we do with indexing and slicing.

For example, we can select random samples from the data frame. So let's say that we want to return some random samples one at a time. So the JSON dot sample, we use the sample map Third, Shift Enter. Again, we use the same DJ son dot sample, we get another entry from the table, a random entry is always selected from the table using the sample method. Also, we can return random samples, but this time four at a time by specifying an argument inside the parentheses of the sample method. So that would be the JSON dot sample, and in between its parentheses and equals four, Shift Enter.

And this time, we have four random rows from the table returned. If we run this once again, we will have another set of four random rows from the table. And as you can see, this is a fresh new set of four entries from the table. Okay, what if we want to return a fraction of the items? For example, let's say given the fact that we have 10 Records in our table, we want to return only half of these records randomly. For this, we would use the sample Method yet again.

So the JSON dot sample, but this time, we will specify another argument called frac equals 0.5, Shift Enter, and we have five records, five rows returned randomly for the table. If we use this, once again, we will have a fresh new set of five rows from the table, and so on each time you use this method, of course, if you change the fraction right here, let's say 0.2. That would be two random records, given the fact that we have 10 Records in total, so Shift Enter, and we have two random entries from the table returned. Okay, I know this video is pretty long. But the last thing I want to discuss in this lecture is Boolean indexing and filtering data. So we have three operators that can be used for filtering data and Boolean indexing, and they are the vertical bar meaning or the ampersands symbol for Ent.

And finally this symbol for not. So first of all, let me load the original data from the file, Shift Enter. And also let me set the index of the table to the ID column. So that would be the JSON dot set index ID, the name of the column in place equals true, shift, Enter. And now let's see the Jason yet again. And indeed, this time the index of the table is the ID column.

Okay, now let's say that we want to return all the rows all the employees with a salary less than 50,000. For this, first of all, we will need to consider the salary column and all the entries in the table but only the salaries for these entries using the LOC method as we did earlier in this video, and set up a condition for the salary to be less than 50,000 for this DJ son, open and close square brackets, and now we enter the country. dition, which is the JSON dot lock, open and close square brackets, we are considering all the entries in the table, comma, but only the salary column, and this should be less than 50,000. Let's run this line of code. And it seems that we have only one employee who earns less than 50,000. And that would be this one right here on row number eight, which is actually true.

Looking at our table, okay, now what if we want to return all the entries, all the employees with a salary less than 50,000 or greater than 56,000, then we would need to add another condition right? So let me copy this and paste it in this new cell. So this condition would be the first one, and I'm going to include it inside parentheses. Now as I said, for the or logical operator, we need to use The vertical bar. And again, in between parentheses, I'm going to specify the second condition, which as I said, should be greater than 56,000. So I'm going to paste this and copy it inside the second pair of parentheses.

I'm going to change greater than 56,000. Let's run this. And this time we have five employees, the same employee from the previous example, who earns less than 50,004 other employees who earn more than $56,000. Okay, what if we now want to return all the rows, all the employees with a salary greater than 50,000 and from the sales department? Well, in that case, we will need again two conditions. Let me copy and paste this and I'm going to modify it right inside the cell.

So as I said, we want the AMP With a salary greater than 50,000, so I'm going to change this math operator right here, and also who work in the sales department. So that would mean changing this column right here because for the second condition right here, we are interested in the department column. So department equals and now I'm going to enter the name of the department. As I said, sales, Shift Enter. Okay, it seems that I forgot to also modify the logical operator between the two conditions. So let me replace the vertical bar with the ampersand for representing the end logical operator and now after hitting shift, enter again, this time we have two employees from the sales department who earn more than $50,000.

Okay, now the last thing I'm going to show you is how to return all the employees who are not in the IT department. And for this we are going to use the not logical operator. So that would be the Jason open and close square brackets not. And now inside parentheses, we should enter the condition, the JSON dot lock of colon department equals equals it, Shift Enter. And indeed, we have all the employees who do not work in the IT department returned as a result of this line of code. Okay, I'm sorry, If this video has been too long.

Feel free to watch it once again, or as many times as you need, take notes. And of course, write these lines of code yourself in order to get used to pandas way of analyzing data and its specific methods. I'll see you in the next lecture.

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.