Code:
# -*- encoding: utf-8 -*-
"""
@project = Pa_chong
@file = test2
@auther = ztt
@create_time = '2019/4/13 9:17'
"""
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/warandpeace.html")
bsObj = BeautifulSoup(html)
nameList = bsObj.find_all("span", {"class": "green"})
for name in nameList:
    print(name.get_text())
Running the script produces a warning.
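Cause:
- BeautifulSoup was called without naming a parser, so it picks one on its own and warns about the choice. The fix used here names html5lib explicitly, which requires the html5lib library to be installed.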
Solution:
- Install: pip install html5lib
- Change bsObj = BeautifulSoup(html) to bsObj = BeautifulSoup(html, "html5lib")
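The fix can be tried without a network connection. The following is only a minimal sketch: SAMPLE_HTML and green_names are hypothetical names made up for this illustration, the markup is a stand-in for the real page, and the default parser is Python's built-in "html.parser" so that nothing extra has to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup standing in for the downloaded page (an assumption).
SAMPLE_HTML = (
    '<p><span class="green">Anna Pavlovna</span> spoke to '
    '<span class="green">Prince Vasili</span>.</p>'
)


def green_names(markup, parser="html.parser"):
    """Return the text of every <span class="green"> in the markup.

    The parser is always passed explicitly, which is what avoids the
    "no parser was explicitly specified" warning.
    """
    soup = BeautifulSoup(markup, parser)
    return [span.get_text() for span in soup.find_all("span", {"class": "green"})]


names = green_names(SAMPLE_HTML)
for name in names:
    print(name)
```

Once html5lib is installed, green_names(SAMPLE_HTML, "html5lib") should give the same names for this markup. Any explicitly named parser avoids the warning, so html5lib is one choice among several rather than the only one.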