Background: learning web scraping with Python
Environment: Python 3.6 + BeautifulSoup 4.7.1
Code that triggers the warning:
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())
print(bsObj.h1)
Warning message:
UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 5 of the file D:/xxx/Python学习/python-scraping-exe/hello.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.
bsObj = BeautifulSoup(html.read())
Modified code (the fourth line was changed to name the parser explicitly):
from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read(), "html.parser")
print(bsObj.h1)
Runs correctly now. Yep!
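
To double-check the fix without hitting the network, here is a minimal sketch that parses a small inline HTML string and records any warnings raised along the way. SAMPLE_HTML and parse_h1 are hypothetical names made up for this illustration, not part of the original script, and the keyword form features="html.parser" is the one the warning text itself suggests.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page, so the sketch runs offline.
SAMPLE_HTML = "<html><body><h1>An Interesting Title</h1></body></html>"


def parse_h1(markup, parser="html.parser"):
    """Parse markup with an explicitly named parser and return the first <h1> tag."""
    # Naming the parser is what silences the "No parser was explicitly
    # specified" warning; features= is the keyword the warning recommends.
    soup = BeautifulSoup(markup, features=parser)
    return soup.h1


# Record every warning raised while parsing, instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    h1 = parse_h1(SAMPLE_HTML)

# Keep only the parser-related warning this note is about.
parser_warnings = [
    w for w in caught if "No parser was explicitly specified" in str(w.message)
]

print(h1)
print("parser warnings:", len(parser_warnings))
```

Passing "html.parser" as the second positional argument, as in the modified code above, should behave the same as the features= keyword used here. "html.parser" ships with Python, so nothing extra needs to be installed.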