Installing BeautifulSoup
1. Linux:
sudo apt-get install python3-bs4
2. Mac:
pip install beautifulsoup4
3. Windows:
pip install beautifulsoup4
html = urlopen("http://www.baidu.com")
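This line can fail in two ways:
1. The page does not exist on the server.
2. The server itself cannot be reached.
The first case raises HTTPError.
The second case raises URLError.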
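The two failure cases above can be told apart by catching them separately. The following is a minimal sketch, not a definitive implementation: the function name `fetch` and its `opener` parameter are hypothetical, introduced here only so the function can be exercised without a network connection. Because HTTPError is a subclass of URLError, the HTTPError clause has to come first.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch(url, opener=urlopen):
    """Return the response body as bytes, or None if the request fails.

    `opener` is a hypothetical hook (defaults to urlopen) so that the
    function can be tried out with a stand-in instead of a real request.
    """
    try:
        with opener(url) as response:
            return response.read()
    except HTTPError as e:
        # The server was reached but answered with an error status (e.g. 404).
        print("HTTP error:", e.code)
    except URLError as e:
        # The server could not be reached at all (bad host name, no network).
        print("Could not reach the server:", e.reason)
    return None
```

Calling `fetch("http://www.baidu.com")` performs a real request; in both failure cases the function prints a short message and returns None, so the caller only needs a single `is None` check.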
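If the tag you ask for does not exist, BeautifulSoup returns None; accessing a child tag on that None object then raises AttributeError.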
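A wrapper function that returns the page title:
from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

def getTitle(url):
    # Return the page's first <h1> tag inside <body>, or None on failure.
    try:
        html = urlopen(url)
    except (HTTPError, URLError) as e:
        # The page is missing or the server could not be reached.
        return None
    try:
        # Name the parser explicitly so bs4 does not have to guess one.
        bsObj = BeautifulSoup(html.read(), "html.parser")
        title = bsObj.body.h1
    except AttributeError as e:
        # bsObj.body was None, so accessing .h1 on it failed.
        return None
    return title

title = getTitle("https://www.douban.com")
if title is None:
    print("Title could not be found")
else:
    print(title)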