Hi, @Shubham,
Web scraping is a technique for extracting data from a website.
The BeautifulSoup module is designed for web scraping and can parse both HTML and XML. You can refer to the code below:
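# Python 2 code using BeautifulSoup 3
from BeautifulSoup import BeautifulSoup
import urllib2
import re
# Download the raw HTML of the page
html_page = urllib2.urlopen("https://abc.com")
# Parse the HTML into a BeautifulSoup object
soup = BeautifulSoup(html_page)
# Print the href of every <a> tag whose href starts with "http://"
for link in soup.findAll('a', attrs={'href': re.compile("^http://")}):
    print link.get('href')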
The script first downloads the raw HTML of the page with this line:
html_page = urllib2.urlopen("https://abc.com")
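A BeautifulSoup object is then created from that HTML and used to find all links whose href starts with "http://" (the regular expression does not match "https://" or relative links):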
soup = BeautifulSoup(html_page)
for link in soup.findAll('a', attrs={'href': re.compile("^http://")}):
    print link.get('href')
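The code above is written for Python 2, where urllib2 and the old BeautifulSoup import exist. If you are on Python 3, the same idea could be sketched roughly as below. This is only a minimal sketch under a few assumptions: the bs4 package is installed (pip install beautifulsoup4), the helper names fetch_html and extract_links are made up for illustration, the sample HTML is invented so the demo runs without a network connection, and the pattern is deliberately widened to accept "https://" as well as "http://".

```python
import re
from urllib.request import urlopen

# Assumption: Beautiful Soup 4 is installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download the raw HTML of a page and return it as bytes (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read()


def extract_links(html):
    """Return the href of every <a> tag whose href starts with http:// or https://."""
    soup = BeautifulSoup(html, "html.parser")
    # Widened from the original "^http://" so that https links are included too.
    pattern = re.compile(r"^https?://")
    return [a["href"] for a in soup.find_all("a", href=pattern)]


# Offline demonstration on a small, made-up HTML snippet.
sample_html = """
<html><body>
  <a href="http://example.com/a">A</a>
  <a href="https://example.com/b">B</a>
  <a href="/relative/path">C</a>
  <a>no href</a>
</body></html>
"""
links = extract_links(sample_html)
for link in links:
    print(link)
```

To try it on a live page, you could pass the result of fetch_html("https://abc.com") to extract_links instead of the sample string. Relative links and <a> tags without an href are skipped because they do not match the pattern.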
I hope this will be helpful for you.