How to use BeautifulSoup for web scraping

Question

I'm trying to collect all the titles of a forum from a certain site. I can't really figure out which HTML elements to target as I'm not very familiar with the site structure.

This is what I came up with after reading the documentation:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'http://example.net/bbs/board.php?bo_table=ent'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")

# I don't think this is correct, but I'm not sure how else to do this...
containers = page_soup.findAll("td",{"class":"td_subject"})


for container in containers:
    subject = container.a.font.font.contents
# similarly not sure this is correct
print("subject: ", subject)

I'm not really sure where I should be trying to improve this.

charlie_brown · Answer 1 · Apr 4, 2018

Best answer

Your program is fine up to the for loop. To get the subjects, access container.a.contents[0] instead of container.a.font.font.contents, and move the print call inside the for loop so that it runs once per container:

for container in containers:
    subject = container.a.contents[0]
    print("subject: ", subject)
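If some cells have no link, or the link text is wrapped in another tag such as font, contents[0] can raise an error or return a tag rather than a string. Below is a minimal sketch of a more defensive version. It assumes the page uses td cells with class td_subject, as in the question; the SAMPLE_HTML markup and the extract_subjects helper are hypothetical stand-ins, not the real site's HTML, so adjust the selector after inspecting the actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the forum page; the real site's HTML may differ.
SAMPLE_HTML = """
<table>
  <tr><td class="td_subject"><a href="/bbs/1">  First topic </a></td></tr>
  <tr><td class="td_subject"><a href="/bbs/2"><font color="red">Second topic</font></a></td></tr>
  <tr><td class="td_subject">No link in this row</td></tr>
</table>
"""


def extract_subjects(html):
    """Return the link text of every <td class="td_subject"> cell (hypothetical helper)."""
    page_soup = BeautifulSoup(html, "html.parser")
    subjects = []
    for cell in page_soup.find_all("td", class_="td_subject"):
        link = cell.a  # None when the cell contains no <a> tag
        if link is None:
            continue
        # get_text() also collects text from nested tags such as <font>
        subjects.append(link.get_text(strip=True))
    return subjects


for subject in extract_subjects(SAMPLE_HTML):
    print("subject:", subject)
```

To run it against the live page, pass page_html from the question to extract_subjects in place of SAMPLE_HTML.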

selected Oct 12, 2018 by Omkar

findingbugs · Answer 2 · Oct 12, 2018

You can go through the link below, which explains web scraping in brief:

https://www.edureka.co/blog/web-scraping-with-python/

