Searching the Tree of HTML Tags: find() and find_all()

Python 3: Automating Your Job Tasks Superhero Level: Automate Web Scraping with Python 3
6 minutes

Welcome back. Although there are several methods that can help you search for HTML tags on a web page, in this lecture we are going to discuss the two most frequently used ones: find() and find_all(). First, the find() method returns only the first occurrence of a given tag in your HTML code. Let's use the same Beautiful Soup object from the previous lecture, called result, to test the outcome of this method.
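As a minimal sketch of what "first occurrence" means, the snippet below builds a Beautiful Soup object from a small, made-up HTML string instead of the page used in the lecture. The HTML content and the choice of the built-in "html.parser" parser are assumptions for illustration; only the name result comes from the lecture.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the lecture's web page: two div tags in a row.
html = """
<html><body>
  <div class="container"><p>first</p></div>
  <div class="footer"><p>second</p></div>
</body></html>
"""

# "result" mirrors the object name used in the lecture.
# The "html.parser" backend is an assumption; the lecture may use another parser.
result = BeautifulSoup(html, "html.parser")

# find() stops at the first matching tag and returns it with its content.
first_div = result.find("div")
print(first_div)

# When nothing matches, find() returns None rather than raising an error.
missing = result.find("table")
print(missing)
```

Because find() can return None, it is usually worth checking the result before chaining further calls onto it.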

Let's say that you want to get the first div tag on the same web page. Let's look at the page source once again to see what result we should expect. Here is the first div tag on this page. This means that when we apply the find() method on our object, we should get this entire tag along with its content, so the entire div right here. Let's test this in the Python interpreter.

So we will have result.find("div"). Okay, so this is the result right here. To see this even better, let's use the print function and the prettify method once again. So we have result.find("div").prettify(), and let's wrap that in the print function as well. Okay, so we have this div section, starting at div class container, and ending with a closing nav tag and two closing div tags.
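The combination of find() and prettify() described above can be sketched as follows. The HTML string is a hypothetical miniature of the structure mentioned in the lecture (a container div that ends with a closing nav and two closing div tags); the inner class name and link are invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: div.container > div > nav, so the block ends
# with </nav></div></div>, similar to what the lecture describes.
html = (
    '<div class="container">'
    '<div class="navbar"><nav><a href="#">Home</a></nav></div>'
    '</div>'
)

result = BeautifulSoup(html, "html.parser")

# prettify() returns a string with one tag per line, indented by nesting depth.
pretty = result.find("div").prettify()
print(pretty)
```

Note that prettify() returns a plain string, so it is meant for reading the structure, not for further searching.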

So let's check this on our web page as well. As I said, div class container, and we have nav, div and div at the end. Okay, so the result is indeed correct. Next, let's see how to use the find_all() method to find all the occurrences of a certain tag on a web page. Let's find all the h1 tags on our web page. So let me get back to the Python interpreter. I'm going to use result.find_all("h1"), because we are looking for h1 tags, the h1 headings. Press Enter, and we get a list where each element is an h1 tag that Beautiful Soup has found on the target website.

Moreover, if we apply the len function here, so len of this, we see that we have a total of six h1 tags on the page. Let's confirm this. I'm going to search for h1 followed by a closing angle bracket in the page source to see if the result was indeed correct. So h1 and a closing bracket. As you can see, we have 12 results, which means six h1 headings, because each heading has an opening and a closing h1 tag. Okay, great job.
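The same check can be sketched on a small example. The HTML below is made up and contains three h1 headings rather than the six on the lecture's page; the cross-check counts the text "h1>" in the source, which appears once in each opening tag and once in each closing tag.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment with three h1 headings.
html = "<h1>One</h1><p>text</p><h1>Two</h1><h1>Three</h1>"

result = BeautifulSoup(html, "html.parser")

# find_all() returns every matching tag, in document order.
headings = result.find_all("h1")
print(len(headings))
print([h.get_text() for h in headings])

# Cross-check as in the lecture: "h1>" occurs in both <h1> and </h1>,
# so the raw count should be twice the number of headings.
raw_count = html.count("h1>")
print(raw_count)
```

This raw-text check only works for tags written without attributes, since an opening tag such as h1 with a class attribute would not contain "h1>".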

Next, what if we want to find all the occurrences of a certain tag on the page, but only the tags that have a specific value for a certain attribute? For instance, going back to our page right here (let me close this), let's find the div tag for each of the three products currently listed on this page. To do this, let's go to the web page itself. Just right-click on any product and hit Inspect. You will be taken to the element corresponding to the part of the product where you clicked. However, to identify the div tag corresponding to the entire product element, you have to go up to the outermost div tag and check whether hovering your mouse cursor over that div tag highlights the entire product box on the left side. So if we collapse this, we can see that these three div tags correspond to the products listed on the page.

This means that we have to look for div tags whose class is this: col-sm-4 col-lg-4 col-md-4. We can do that very easily using the find_all() method. Let me write this first. So again, we are applying the find_all() method on our Beautiful Soup object, and in between the parentheses we have two arguments. The first argument is the tag that we are looking for, in between double quotes.

In our case, this is the div tag. The second argument is a dictionary, as you can see right here, with a single element, where the key is the HTML attribute itself, class, and the value is the attribute value that we want to match, also in between double quotes. Let's assign this to a variable called products. Enter. If we check the type of this variable, so type(products), notice that this is a ResultSet object. Now, since we have only three products listed on this page, right here, we have three div tags that meet the condition.

I'm talking about these three div tags. So the length of this result set should be three, right? Let's check this as well. So len(products). Great. This way we have extracted the three products listed on this page. Pretty cool, right?
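The attribute-filtered search described above can be sketched like this. The HTML is a hypothetical reduction of the product grid: three divs carry the class value from the lecture, and a fourth div with a different class is added to show that it is left out. The product names are invented.

```python
from bs4 import BeautifulSoup

# Hypothetical product grid; only the class value is taken from the lecture.
html = """
<div class="row">
  <div class="col-sm-4 col-lg-4 col-md-4"><p>Product A</p></div>
  <div class="col-sm-4 col-lg-4 col-md-4"><p>Product B</p></div>
  <div class="col-sm-4 col-lg-4 col-md-4"><p>Product C</p></div>
  <div class="col-md-9"><p>Not a product</p></div>
</div>
"""

result = BeautifulSoup(html, "html.parser")

# First argument: the tag name. Second argument: a dictionary mapping
# an attribute name to the value it must have.
products = result.find_all("div", {"class": "col-sm-4 col-lg-4 col-md-4"})

print(type(products).__name__)  # the container type returned by find_all()
print(len(products))

# A ResultSet behaves like a list, so it can be looped over directly.
names = [p.get_text(strip=True) for p in products]
print(names)
```

When the full class string is passed like this, the order of the class names has to match the page source exactly. Searching for a single class name, for example "col-sm-4", would also match these tags and is less sensitive to reordering.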

Now, before moving on to the next lecture, I want to add a quick note. The Beautiful Soup module is quite comprehensive and enables you to perform complex operations when designing web scraping applications in Python. There are lots of other concepts, methods and attributes to discuss, test and use inside your Python scripts, and in this course we are only scratching the surface of web scraping. This is definitely a topic that needs its own course, and maybe even then the capabilities of Beautiful Soup won't be exhausted. That's why I preferred to provide you with the most important concepts and skills during these first few videos in this section.

And now, since I want this course to be as practical and hands-on as possible, and I want you to get used to building useful applications instead of watching countless hours of theory or abstract concepts, we are going to move on to building two versions of a basic web scraping application throughout the next two lectures. So, having said that, I'll see you soon.
