I'm trying to collect all the thread titles from a forum on a certain site. I can't figure out which HTML elements to target, as I'm not very familiar with the site's structure.
This is what I came up with after reading the documentation:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'http://thailove.net/bbs/board.php?bo_table=ent'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
#I don't think this is correct, but I'm not sure how else to do this...
containers = page_soup.findAll("td",{"class":"td_subject"})
for container in containers:
    subject = container.a.font.font.contents
    #similarly not sure this is correct
    print("subject: ", subject)
I'm not really sure which part I should be trying to improve.
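One possible direction, as a minimal sketch rather than a confirmed fix: instead of walking a fixed chain like a.font.font, take the text of whatever link sits inside each td_subject cell. The nested font tags may have been added by the browser (for example by a page-translation feature), so they might not exist in the HTML that urlopen receives. The SAMPLE_HTML string and the extract_titles helper below are hypothetical and only stand in for the real page, whose markup I have not verified.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; the real page may nest titles differently.
SAMPLE_HTML = """
<table>
  <tr><td class="td_subject"><a href="#1"><font><font>First thread</font></font></a></td></tr>
  <tr><td class="td_subject"><a href="#2">Second thread</a></td></tr>
  <tr><td class="td_subject">No link here</td></tr>
</table>
"""

def extract_titles(html):
    """Return the link text of every td.td_subject cell that contains a link."""
    page = BeautifulSoup(html, "html.parser")
    titles = []
    for cell in page.find_all("td", class_="td_subject"):
        link = cell.find("a")
        if link is None:
            # Skip cells without a link instead of raising AttributeError.
            continue
        # get_text() collects text from all nested tags, with or without <font>.
        titles.append(link.get_text(strip=True))
    return titles

for title in extract_titles(SAMPLE_HTML):
    print("subject:", title)
```

With the real page, the same function could be called on page_html from the code above. If it returns an empty list, printing part of page_html would show whether the td_subject class is actually present in what the server sent.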