Introduction to Pandas - Basic Operations

Python 3: Automating Your Job Tasks Superhero Level: Automate Data Analysis Tasks with Python 3
9 minutes

Transcript

The first thing to do is, of course, to install the pandas module. To do that, simply open the Windows command line and type pip install pandas. I have already installed it on my computer, but you should go ahead and do it yourself so we are on the same page throughout the rest of this section. Make sure you are connected to the internet before doing this. Now, assuming you paused the video and installed pandas, I'm going to open a Jupyter notebook and we are going to see some basic operations with pandas.

Actually, I'm going to use the notebook from the previous lecture. By the way, if you want to delete a cell, just click on it, hit Escape, and then press D twice. Again: click, Escape, D, D, and now we are left with this empty cell right here. As always, the first thing we should do is import the necessary module, so I'm going to do that right now: import pandas. I'm going to use Shift+Enter to execute this line of code and move on to the next cell.
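As a quick check that the installation and import described above worked, a minimal sketch is to import the module and print its version. The exact version number will differ from machine to machine, so no particular output is assumed here.

```python
import pandas

# If this line runs without an ImportError, pandas is installed.
# The printed version depends on what pip installed on your machine.
print(pandas.__version__)
```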

Now, in order to work with data using pandas, we will always need to create a so-called DataFrame, which is a Python object that holds the data we are going to analyze. To be more specific, a DataFrame is a two-dimensional data structure with labeled axes or, in simpler terms, imagine a DataFrame as a table with rows and columns. You're going to see this in practice shortly. Now let's create our first DataFrame. Let's call it, for example, d: d = pandas.DataFrame(), with an opening and a closing parenthesis.

In between the parentheses, we can specify: a) a data source, which can be, for instance, a list of lists where each inner list is a row in the table; b) again a data source, but this time one that comes from a text file, a CSV file, an Excel file, an SQL database table, and so on, which we will see in the next video (pandas loads such sources through dedicated reader functions such as read_csv, which return a DataFrame, rather than through a path passed to the DataFrame constructor); and c) various additional optional arguments that modify or enhance the functionality of the DataFrame. Again, you will see some examples very soon. For now, let's consider a very basic example of a DataFrame and pass a list of lists in between the parentheses, with no additional arguments. Let's imagine we want to build and analyze a table of people showing their first name, age, and occupation.

This means we should input a list where each element, each inner list, represents a person. So I'm going to paste in this line. Notice that we have a list of lists right here, and these are the inner lists, separated by commas. I'm going to enter a new line right here to also check the type of d, so I'm going to print type(d). Let's hit Shift+Enter to execute these two lines of code.
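The line pasted in the video is not reproduced in the transcript, so the following is only a sketch of what those two lines might look like. The four names appear later in the lecture. The ages, the occupations, and the way they are paired up in each row are assumptions, chosen only so that the minimum age is 21 and the alphabetically first occupation is engineer, as the lecture reports.

```python
import pandas

# Hypothetical rows: names come from the lecture, the rest is illustrative.
people = [
    ["Andy", 21, "engineer"],
    ["Jane", 34, "teacher"],
    ["Robert", 45, "lawyer"],
    ["Maria", 29, "nurse"],
]

# Each inner list becomes one row of the table.
d = pandas.DataFrame(people)

# Confirms that d is a pandas DataFrame object.
print(type(d))
```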

As you can see right here, we just created our first DataFrame. Okay, great. You can also use the dir function on this new object to see all the available methods that pandas provides to manipulate and analyze the data inside a DataFrame. We will see some of them in action during this section of the course. For now, let's return to our DataFrame and print it out to the screen to see the true power of Jupyter.
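The dir call mentioned above returns a long list, so one possible way to make it readable is to drop the names that start with an underscore. This filtering step is not from the lecture, and the sample rows are the same hypothetical ones used for illustration, not the instructor's data.

```python
import pandas

# Hypothetical sample data, not the lecture's exact rows.
d = pandas.DataFrame([["Andy", 21, "engineer"], ["Jane", 34, "teacher"]])

# dir() lists every attribute and method available on the object;
# filtering out underscore-prefixed names leaves the public ones.
public_names = [name for name in dir(d) if not name.startswith("_")]

print(len(public_names))
print(public_names[:10])
```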

Also notice the way pandas just built a nice, clean table using the data we provided. So, Shift+Enter. Now, looking at this table right here, let's get our terminology right. You can see that pandas added these labels, 0, 1, 2, and so on, to the rows and the columns of the table. Remember that 0, 1, and 2 are the column labels, these ones right here, and 0, 1, 2, and 3 are the index labels.
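The default labels can also be inspected directly through the columns and index attributes of the DataFrame. This is a small sketch using hypothetical rows (only the names are taken from the lecture).

```python
import pandas

# Hypothetical sample data: four people, three fields each.
d = pandas.DataFrame([
    ["Andy", 21, "engineer"],
    ["Jane", 34, "teacher"],
    ["Robert", 45, "lawyer"],
    ["Maria", 29, "nurse"],
])

column_labels = list(d.columns)  # default column labels
index_labels = list(d.index)     # default row (index) labels

print(column_labels)  # [0, 1, 2]
print(index_labels)   # [0, 1, 2, 3]
```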

That is what they are called: index labels. Now you may say, okay, but that looks kind of strange and ugly, and I would like to customize these labels in my DataFrame. Okay, I totally see your point. To do that, we need to add an argument called columns, to which we pass a list where each element is a column name in our table, the name we want to replace the default label with. So let me show you.

I'm going to move to this new cell right here and paste in the same DataFrame. Only this time we have an additional argument, columns, set to a list containing name, age, and occupation, which are going to be the new names of our columns. Before executing this line of code, please keep in mind that the number of elements in the columns list must match the number of elements in each of the lists containing the data. In other words, when renaming the column labels, make sure that the number of elements in this list right here, the columns list, matches the number of columns in your table. Now let's press Shift+Enter and see d once again.
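A sketch of the columns argument, again with hypothetical rows, might look like the following. The second part illustrates the length rule stated above: passing two column names for three-element rows makes pandas raise a ValueError.

```python
import pandas

# Hypothetical sample data, not the lecture's exact rows.
people = [
    ["Andy", 21, "engineer"],
    ["Jane", 34, "teacher"],
    ["Robert", 45, "lawyer"],
    ["Maria", 29, "nurse"],
]

# One column name per element of each inner list.
d = pandas.DataFrame(people, columns=["name", "age", "occupation"])
print(d)

# Too few column names for three-element rows: pandas refuses to build the table.
mismatch_rejected = False
try:
    pandas.DataFrame(people, columns=["name", "age"])
except ValueError as error:
    mismatch_rejected = True
    print("Rejected:", error)
```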

Okay, great job. This time, instead of 0, 1, and 2, we have name, age, and occupation. What about the rows in the table? What if we want to change the index labels as well? In that case, we simply add a new argument in between the parentheses of the DataFrame. This argument is called index, and it also expects a list. Again, the number of elements in the index list must match the number of records in the table. So let's see this in action.

So I'm going to paste this in. Notice that, in addition to the DataFrame definition above, we have this index argument, set to ID1, ID2, ID3, and ID4: a list of elements, a list of future index labels. Okay? Now let me hit Enter, and I'm also going to type in d to see the table once again. Shift+Enter. Indeed, this time we have ID1, ID2, ID3, and ID4 instead of 0, 1, 2, and 3. Okay, it looks much better now, doesn't it?
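The index argument can be sketched the same way. The exact spelling of the labels in the video is not visible in the transcript, so "ID1" through "ID4" is an assumption, and the rows are hypothetical.

```python
import pandas

# Hypothetical sample data, not the lecture's exact rows.
people = [
    ["Andy", 21, "engineer"],
    ["Jane", 34, "teacher"],
    ["Robert", 45, "lawyer"],
    ["Maria", 29, "nurse"],
]

# One index label per row; the label spelling is assumed.
d = pandas.DataFrame(
    people,
    columns=["name", "age", "occupation"],
    index=["ID1", "ID2", "ID3", "ID4"],
)
print(d)
```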

At this point, we can use these new column labels as attributes of the object and call them whenever we want to extract a subset of information from the table. Let me show you what I mean by that. For instance, let's type d.name, Shift+Enter, and it returns the name column from our DataFrame. So we have Andy, Jane, Robert, and Maria. Let's also see d.age, Shift+Enter, and of course this returns the age column. And finally d.occupation, Shift+Enter: you guessed it, it returns the associated column.
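Here is a short sketch of attribute-style column access with hypothetical rows. Attribute access only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute or method. The bracket form d["age"] is not shown in the lecture, but it selects the same column and works for any column name.

```python
import pandas

# Hypothetical sample data; only the names are taken from the lecture.
d = pandas.DataFrame(
    [
        ["Andy", 21, "engineer"],
        ["Jane", 34, "teacher"],
        ["Robert", 45, "lawyer"],
        ["Maria", 29, "nurse"],
    ],
    columns=["name", "age", "occupation"],
    index=["ID1", "ID2", "ID3", "ID4"],  # label spelling is assumed
)

names = d.name        # attribute-style access to the "name" column
ages = d.age          # same idea for the "age" column
ages_again = d["age"]  # bracket access returns the same column

print(names)
print(ages)
```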

Now, each of these data structures is called a Series object in pandas. So if we do type(d.age), Shift+Enter, indeed we have a Series object, so please keep that in mind. Also, as I said earlier, we have lots of methods available for working with DataFrame and Series objects. For instance, a basic example would be to find the minimum and maximum values in a column. To do that, we need to use the correct Series object and two specific methods called min and max. Let's see this.
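The type check described above can be sketched as follows, using hypothetical rows. A single column is a Series, while the whole table remains a DataFrame.

```python
import pandas

# Hypothetical sample data, not the lecture's exact rows.
d = pandas.DataFrame(
    [["Andy", 21, "engineer"], ["Jane", 34, "teacher"]],
    columns=["name", "age", "occupation"],
)

# A single column pulled out of a DataFrame is a Series.
print(type(d.age))
is_series = isinstance(d.age, pandas.Series)
print(is_series)  # True
```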

So we have d.age.min(), with an opening and a closing parenthesis, Shift+Enter, and we get 21, which is the lowest value in the age column. Similarly, we can use, for example, d.name.max(), Shift+Enter, and we get Robert as the result, since it is the largest value in the name column, alphabetically speaking. What if we apply the min method to the entire DataFrame? Well, in that case we will get the minimum value for each of the columns in the table. Let's see this: d.min(), with an opening and a closing parenthesis, Shift+Enter. Okay, so indeed, we have Andy for the name, 21 for the age, and engineer for the occupation.
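The three calls above can be reproduced with a sketch like this one. The rows are hypothetical: the ages and occupations were picked so that the results match the values reported in the lecture (21, Robert, Andy, engineer), but they are not the instructor's actual data.

```python
import pandas

# Hypothetical sample data chosen to match the results reported in the lecture.
d = pandas.DataFrame(
    [
        ["Andy", 21, "engineer"],
        ["Jane", 34, "teacher"],
        ["Robert", 45, "lawyer"],
        ["Maria", 29, "nurse"],
    ],
    columns=["name", "age", "occupation"],
)

youngest = d.age.min()           # smallest number in the age column
last_name = d.name.max()         # strings compare alphabetically
column_minimums = d.min()        # one minimum per column, returned as a Series

print(youngest)   # 21
print(last_name)  # Robert
print(column_minimums)
```

Note that the per-column minimums are computed independently, so the values in the result do not necessarily come from the same row.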

Okay, so this was a basic introduction to pandas. I hope you enjoyed it. Next, it's time to move away from this basic example, where we used a list of lists, to loading, reading, and analyzing data from various types of files, which is a real-life use case for a lot of office jobs that require data analysis. You can find a link attached to this lecture pointing to the official pandas documentation, where you can find all the available methods and operations that this amazing module provides. I'll see you in the next lecture.
