A Simple Web Scraper Experiment with BeautifulSoup

OliverkingLi

Article link: https://blog.csdn.net/OliverkingLi/article/details/72888034

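from urllib.request import urlopen
from urllib.error import HTTPError, URLError
from bs4 import BeautifulSoup

def getTitle(url):
    """Return the first <h1> tag in the page body, or None on failure."""
    try:
        html = urlopen(url)
    except (HTTPError, URLError):
        # HTTPError: the server returned an error status (e.g. 404).
        # URLError: the server could not be reached at all.
        return None
    try:
        # The 'lxml' parser requires the lxml package to be installed.
        bsObj = BeautifulSoup(html.read(), 'lxml')
        title = bsObj.body.h1
    except AttributeError:
        # Raised when bsObj.body is None, i.e. the page has no <body>.
        return None
    return title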

title = getTitle("http://www.pythonscraping.com/pages/page1.html")
if title is None:
    print("The title could not be found.")
else:
    print(title)

Output:

<h1>An Interesting Title</h1>
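
The two failure cases that the None check guards against can be reproduced without any network access. The following is a minimal sketch, assuming BeautifulSoup is installed; the helper name `extract_h1` and the three sample strings are hypothetical and not part of the original script. It uses Python's built-in `html.parser` instead of `lxml`, because `html.parser` does not insert a missing `<body>`, which makes the `AttributeError` path easy to trigger.

```python
from bs4 import BeautifulSoup


def extract_h1(markup):
    """Return the first <h1> inside <body>, or None if it cannot be found."""
    # html.parser ships with Python, so no lxml install is needed here.
    soup = BeautifulSoup(markup, "html.parser")
    try:
        # soup.body is None when there is no <body> tag;
        # accessing .h1 on None raises AttributeError.
        return soup.body.h1
    except AttributeError:
        return None


# Hypothetical sample documents, not fetched from any site.
with_title = "<html><body><h1>An Interesting Title</h1></body></html>"
without_h1 = "<html><body><p>No heading here</p></body></html>"
without_body = "<p>Fragment with no body tag</p>"

print(extract_h1(with_title))    # <h1>An Interesting Title</h1>
print(extract_h1(without_h1))    # None
print(extract_h1(without_body))  # None
```

A missing `<h1>` simply yields None, while a missing `<body>` raises `AttributeError`; handling both the same way lets the caller rely on a single `is None` check.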