Creating a Bar Plot Based on Excel Data

Python 3: Automating Your Job Tasks Superhero Level: Data Visualization with Bokeh and Python 3
17 minutes
Share the link to this page
Copied
  Completed
You need to have access to the item to view this lesson.
One-time Fee
$99.99
List Price:  $139.99
You save:  $40
€91.97
List Price:  €128.76
You save:  €36.79
£77.70
List Price:  £108.78
You save:  £31.08
CA$138.25
List Price:  CA$193.55
You save:  CA$55.30
A$152.72
List Price:  A$213.82
You save:  A$61.09
S$134.20
List Price:  S$187.89
You save:  S$53.68
HK$780.59
List Price:  HK$1,092.86
You save:  HK$312.27
CHF 88.44
List Price:  CHF 123.82
You save:  CHF 35.38
NOK kr1,100.24
List Price:  NOK kr1,540.39
You save:  NOK kr440.14
DKK kr687.33
List Price:  DKK kr962.29
You save:  DKK kr274.96
NZ$169.82
List Price:  NZ$237.75
You save:  NZ$67.93
د.إ367.26
List Price:  د.إ514.18
You save:  د.إ146.92
৳11,743.81
List Price:  ৳16,441.81
You save:  ৳4,697.99
₹8,371.41
List Price:  ₹11,720.31
You save:  ₹3,348.90
RM465.70
List Price:  RM652
You save:  RM186.30
₦165,740.53
List Price:  ₦232,043.38
You save:  ₦66,302.84
₨27,788.10
List Price:  ₨38,904.46
You save:  ₨11,116.35
฿3,584.57
List Price:  ฿5,018.55
You save:  ฿1,433.97
₺3,297.92
List Price:  ₺4,617.23
You save:  ₺1,319.30
B$564.64
List Price:  B$790.52
You save:  B$225.88
R1,827.40
List Price:  R2,558.44
You save:  R731.03
Лв180.11
List Price:  Лв252.16
You save:  Лв72.05
₩138,473.15
List Price:  ₩193,867.95
You save:  ₩55,394.80
₪365.84
List Price:  ₪512.19
You save:  ₪146.35
₱5,845.41
List Price:  ₱8,183.81
You save:  ₱2,338.40
¥15,371.96
List Price:  ¥21,521.36
You save:  ¥6,149.40
MX$1,849.41
List Price:  MX$2,589.25
You save:  MX$739.84
QR364.19
List Price:  QR509.88
You save:  QR145.69
P1,352.81
List Price:  P1,894
You save:  P541.18
KSh12,979.15
List Price:  KSh18,171.33
You save:  KSh5,192.18
E£4,833.51
List Price:  E£6,767.11
You save:  E£1,933.60
ብር5,772.08
List Price:  ብር8,081.15
You save:  ብር2,309.06
Kz87,254.87
List Price:  Kz122,160.31
You save:  Kz34,905.44
CLP$94,724.52
List Price:  CLP$132,618.12
You save:  CLP$37,893.60
CN¥724.96
List Price:  CN¥1,014.98
You save:  CN¥290.01
RD$5,909.73
List Price:  RD$8,273.86
You save:  RD$2,364.12
DA13,421.76
List Price:  DA18,791
You save:  DA5,369.24
FJ$225.87
List Price:  FJ$316.23
You save:  FJ$90.35
Q774.58
List Price:  Q1,084.45
You save:  Q309.86
GY$20,907.04
List Price:  GY$29,270.70
You save:  GY$8,363.65
ISK kr13,804.61
List Price:  ISK kr19,327.01
You save:  ISK kr5,522.40
DH983.39
List Price:  DH1,376.78
You save:  DH393.39
L1,772.18
List Price:  L2,481.12
You save:  L708.94
ден5,666.48
List Price:  ден7,933.30
You save:  ден2,266.82
MOP$803.68
List Price:  MOP$1,125.19
You save:  MOP$321.50
N$1,822.62
List Price:  N$2,551.75
You save:  N$729.12
C$3,674.96
List Price:  C$5,145.09
You save:  C$1,470.13
रु13,374.15
List Price:  रु18,724.35
You save:  रु5,350.19
S/375.27
List Price:  S/525.40
You save:  S/150.12
K391.68
List Price:  K548.37
You save:  K156.68
SAR375.13
List Price:  SAR525.20
You save:  SAR150.06
ZK2,608.33
List Price:  ZK3,651.77
You save:  ZK1,043.43
L458.02
List Price:  L641.25
You save:  L183.22
Kč2,333.92
List Price:  Kč3,267.59
You save:  Kč933.66
Ft36,019.89
List Price:  Ft50,429.29
You save:  Ft14,409.40
SEK kr1,081.31
List Price:  SEK kr1,513.87
You save:  SEK kr432.56
ARS$92,801.80
List Price:  ARS$129,926.23
You save:  ARS$37,124.43
Bs690.67
List Price:  Bs966.96
You save:  Bs276.29
COP$403,758.19
List Price:  COP$565,277.62
You save:  COP$161,519.43
₡52,788.48
List Price:  ₡73,905.99
You save:  ₡21,117.50
L2,471.95
List Price:  L3,460.83
You save:  L988.88
₲756,054.93
List Price:  ₲1,058,507.15
You save:  ₲302,452.21
$U4,029.23
List Price:  $U5,641.08
You save:  $U1,611.85
zł393.63
List Price:  zł551.10
You save:  zł157.46
Already have an account? Log In

Transcript

Hi, and welcome back. In this lecture, you're going to learn how to create a bar plot like this one you're seeing on the screen right now showing the top 10 programming languages by popularity based on the DOB index for March 2019, available on trb.com. Of course, by the time you're watching this video, the index may have a whole different structure than the one you're seeing right now, or the ratings for these programming languages may change as well. However, that is not important for this lecture. So let's focus on our goal using this data to generate an eyecatching bar plot. And to make things even more interesting, let's read the data from an Excel spreadsheet that I have right here.

I have inserted some of the data on the website into these two columns, one for the name of the programming language and the other one for the rating value. have also added the header for the two columns to emulate a real life Excel table with the first column being called language and the second columns name being ratings. By the way, I have attached this sample Excel spreadsheet to this lecture. So you can go ahead and download it right now. And make sure to save it inside the same plot folder so we can reference it easier. Okay.

Next, it's time to create a new notebook in the same folder. So new Python three, and I'm going to call this notebook simply bar, Enter. And now it's time to see the code for this data visualization application. And I'm going to paste in some of the code I already wrote. And as with other apps inside this course, I will go ahead and explain it line by line. Unlike in the previous video where I've analyzed each line of code as I wrote it, starting with this video, I'm going to take the explaining only approach to save time, and a Avoid making you watch me typing each character.

So here's the code for this app. And by the way, you can find the code right after this video. And so you can download the code for each application. In this section right after the video ends for that particular application. You can choose to pause the video and type in each line of code as I explain it, or just go to the notebook following this video, copy the code and then paste it in your own Jupiter notebook. Okay, first of all, as always, we need to import the necessary tools.

And I mean classes, methods or attributes from within the bulky library, as well as the pandas module, as you can see right here, that will help us load the spreadsheet data as a data frame. You're going to use all of these tools right here in the code below, and you'll be able to see why they are important for our application and how to properly put them to work. The first thing we should take care of is importing all the data In the Excel spreadsheet to our current application as a panda's data frame, using, of course, the read Excel method, as you can see, right here, in between the parentheses of this method, all you need to add is the path to the Excel file enclosed by double quotes. Actually, let's try this in the Python interpreter at the same time to see the data frame. So I'm going to copy this line of code.

And I'm going to open up the Python interpreter. And now let's hit enter. Oh, sorry, I forgot to import the pandas module in the interpreter. So let me do this import pandas. And now let's run this line of code once again. Now let's see our data frame so top 10.

So notice that all the programming languages in the Excel spreadsheet, so let me minimize this. As I said, all the programming languages in the Excel spreadsheet have been loaded along with their corresponding rates. We also have the column names we defined in the Excel table, as well as the default indexing system for the rows inside the data frame. Now going back to the Jupiter notebook, the next thing I did is I defined the end result that will contain our bar plot, using Of course, the output file function, also assigning a name to the resulting HTML file, and I called it bar dot html. You can name this output file however you like according to your needs. Next, we are referencing the two columns in the data frame using the language and writing variables by passing the name of the column in between square brackets as we did with other data frames in the previous section of the course.

Let's do the same in the Python interpreter just for the sake of a short recap. So I'm going to copy and paste each of these lines real quick and also writing Enter language and writing. Okay, great. Now we have both columns loaded and we can reference them further into our code. Now back to the Jupiter notebook once again. Next we are creating a column data source object, which is bouquets own way of mapping column names with lists of data.

As you can read in the comments that I've added, quoting the official documentation of bulky. The column data source takes a parameter called data, which is actually a dictionary with the column names as keys and lists of data as values. Let's see this dictionary in the Python interpreter using the two variables referencing the language and the ratings column respectively. So I'm going to copy this portion of the code and I'm going to paste it in the interpreter. As you can notice, we have a dictionary with two elements separated by comma. So here's the comma.

This is the first element. And this is the second element of the dictionary right here. Each element is of course, a key value pair, where the key is the column name. So for instance, we have language. And the value is the data on each column separated from the key using a colon, of course. So this would be the data right here.

Now to see this even better, let's assign this dictionary to a variable and then use the keys and values methods on our dictionary. So I'm going to really find this a equals, and now let's see a dot keys. Actually, let me minimize this a bit. Okay, so you can see this better, okay, and a dot values. So indeed, we have the confirmation that this is a dictionary with keys and values. Now back to our code and looking at the comments once again.

Keep in mind that once the column data source has been created, it can be passed into the source parameter of blotting methods. And that's exactly what we are going to do down below. So right here where it says source equals source in between the parentheses of the V bar method. Till then, you can read more about column data source by using this link right here that I have included in the comments. Next, as soon as we create the column data source object on this line, we are ready to create our new plot using the figure function we've seen in the previous lecture. I'm going to create a new object called P and the figure function and I'll pass some arguments some keyword arguments to customize our plot.

Now you may recognize the title arguments. For example, from the previous video, this one right here, which is used to provide a suggestive title for our new plot. I named my Plot top 10 programming languages. But what about the other arguments here? Let's discuss each of them one by one. Again quoting bouquets documentation on ranges link in the comments right here.

By default bouquet will attempt to automatically set the data bounds of plots to fit snugly around the data. Sometimes you may need to set a plot arrange explicitly. This can be accomplished by setting the x range or y range properties. In our case, we have x range, right here equals language means that we want all the languages all the entries in the table to be considered for plotting the x axis. Otherwise, we may use a slice here to only consider some of the languages in the table. For instance, if I run this code as it is, with x range equals language, then we get all the languages plotted on the x axis.

So let's try this Ctrl Enter. And indeed, we have 10 languages in the table plotted to this chart. On the other hand, if you want only, let's say Python, c++ and dotnet, to be considered for the plot, then you need to enter the correct slice. What would that slice be? Well, since Python is the third language in the table, so the third one right here, then we need to start our slides at index two, right? That's because Java, which is the first language is positioned at index zero.

Next, since I said that we want dotnet as the last language to be considered for the plot, we need to also add a stop index inside our slice. Since dotnet, is located at index four being the fifth language in the table, we need to add index five as the stop index for the slice, right? That's because, as you may remember, from list slicing, for example, the slice goes up to but does not include the second index. So going back to our code, we need to specify five x range equals language of two colon five. And now let's run our code once again, Ctrl plus Enter. And indeed, we now have only a certain range of data included in the plot.

Pretty cool right? Now let's undo this change and take a brief look at the other arguments as well. So I'm going to go back to the code, and I'm going to leave only x range equals language. Okay, you can also save the notebook by hitting the Save button right here. Next, the plot height argument simply defines the height of the plot according to the user's needs. Whilst the toolbar location argument is set to none, meaning we choose to hide the toolbar for this particular plot.

However, if you would like to have the toolbar shown, as we've seen in the previous lecture, then you need to pass one of four options for this argument in between double quotes, above or below. Low left and right. Let's test this by setting the toolbar location to right and then let's hit Control Enter once again. So right, Control Enter. Okay, there it is right here, you can position this interactive toolbar wherever you like, or even hide it using the non value. So let me undo this setting.

Okay. Next once we have our plot settings in place, we need to also take care of drawing the vertical bars themselves. To illustrate the rating of each programming language in the table. For this, we will need to use the V bar method and apply it on our figure object, also passing some additional arguments to customize the way the bars will be drawn. First of all, we need to specify the values for the X and top arguments. So these first two arguments right here, where x refers to the x coordinates of the centers of the vertical bars as you can know This in the final plot where we see this tiny ticker and the language name right at the center of each bar.

And then we have the top argument, which refers to the y coordinates of the top edge of each bar. So that would be the top edge for each of these bars. Pretty straightforward, I think. Next, we're setting the width of each bar using the width argument. So with equals 0.7. And then using the source argument, we're passing the column data source object that we've discussed earlier.

Furthermore, we are using the legend argument right here, legend equals languages to provide a name for the legend of this plot, if we need to add such an option, and then we are also passing the line color right here equals white. This argument specifies a color for the border surrounding each vertical bar. As it is right now the border color is set to As you can also notice in the plot, but we can change it to black for instance, to highlight each vertical bar even more. So let me replace white with black. And now I'm going to hit Ctrl. Enter once again.

Okay, great, just as we expected. Now let me undo this change. The last argument I used in between the parentheses of the V bar method is fill color right here, which as its name implies, takes care of adding colors to each of the vertical bars in our plot. There are several mandatory parameters that we need to pass to this function. The first one is called field name, field underscore name, and is of type string and we have language for this argument. Next, we have the ballot argument which adds various colors to our vertical bars, and thus painting each bar in a different color.

There are many palettes available when using bulky and it chose one of them called spectral 10. According to bouquets documentation in the context of bulky a palette is a simple plain Python list of hex color strings. If you would like to see all the available palettes in bulky, just use the Python interpreter to import them and then use the DIR and help functions to find out more about all the available options when deciding to use a palette. So let me show you from bulky import palettes. Enter. Sorry, I have a misspelling.

So from instead of form, okay from okay import bullets, and now we can use their bullets. And we have squeezed text 64 lines. Let's see this. And these are all the bullets in okay. And also we can use help of palettes and we have a lot more lines. I'm not going to expand this, but these are the two functions that you can use to find out more about the bullets in bulky.

Now let's get back to our code. And finally, we have vectors equals language. This argument right here, which is of type sequence, and in our case is referencing the languages in our table, you can read more about the vector C map function in the link I've added to the comments. The last section of this code highlights a few more styling settings regarding our bar plot. First, we are setting the color of the grid lines corresponding to the x axis to non on this line, which basically means invisible. Of course, we can choose to modify this setting to black for instance, and also set red as being the color of the y axis grid lines.

So let me show you how to do this real quick. Let's say that instead of non we choose to have black gridlines for the x axis. And also let's say, P dot y grid for the y axis, dot grid, line color equals, let's say red. And now let's run our code once again. And indeed, notice that we have vertical black grid lines and horizontal red grid lines for the y axis. Okay, now let's get back to the code.

And let me undo these changes. Next, you can configure the start and end points of our Y axis range. For instance, looking at the plot, once again, you can see that the top most value is 16 right here, however, we can change this if we need to do so let's say 225. And I'm going to do that right now. So 25 instead of 16 Ctrl, plus Enter, and indeed this time, we have a longer y axis, of course. Finally, using the orientation and location attributes for the legend of our plot, we can change the way and position that the legend is shown on the plot.

Of course, we can play around with these settings as well, and maybe set the location to bottom left instead of top right. For instance, it really depends on your specific needs when designing the plot. Last but not least, we are using the show function to display the plot to the screen. Let's hit Ctrl. Enter once again to check out the final result. So first of all, let me close all the tabs to the right so Ctrl plus enter.

This is the final result and I will see you in the next lecture.

Sign Up

Share

Share with friends, get 20% off
Invite your friends to LearnDesk learning marketplace. For each purchase they make, you get 20% off (upto $10) on your next purchase.