A few days ago I was writing a crawler with requests and BeautifulSoup, and an error came up while parsing a page with BeautifulSoup. It bothered me for quite a while, and pasting the error message into a search engine turned up no solution; I finally stumbled onto the fix by accident. I'm writing this down so that if I hit the same error again, I won't forget how I solved it.
soup = BeautifulSoup(html.text, 'lxml')  # html is the requests response object
print(soup.text)
params = json.loads(soup.text)  # convert the extracted string into a dict
Only part of the code is shown here; I'm not publishing all of it.
I used the json module to convert the string into a dict so I could pull out the data I wanted.
Here is the error that was displayed:
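The str-to-dict step looks roughly like this (the field names below are invented for illustration; the real page returns different keys):

```python
import json

# A JSON string like the one the page body contains (field names made up here)
raw = '{"name": "example", "items": [1, 2, 3]}'

params = json.loads(raw)  # str -> dict
print(params["items"])    # pull out the data we want: [1, 2, 3]
```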
Traceback (most recent call last):
File "D:/PythonProject/try.py", line 20, in <module>
params = json.loads(soup.text)
File "D:\Anaconda3-5.2install\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "D:\Anaconda3-5.2install\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "D:\Anaconda3-5.2install\lib\json\decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 3520 (char 3519)
The offending line is: soup = BeautifulSoup(html.text, 'lxml')
The fix: change it to soup = BeautifulSoup(html.text, 'html.parser'), i.e. replace 'lxml' with 'html.parser' and the error goes away. The two parsers handle imperfect markup differently, so lxml apparently truncated or mangled the embedded JSON string somewhere, while html.parser left it intact.
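Why the error message looks the way it does: if the text handed to json.loads is cut off in the middle of a string value (which is what a parser mangling the page can cause), the json module raises exactly this "Unterminated string" error. A minimal reproduction using nothing but the json module (the data here is made up):

```python
import json

good = '{"msg": "hello"}'
bad = good[:12]   # cut the text off in the middle of a string value

json.loads(good)  # parses fine
try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print(e)      # Unterminated string starting at: ...
```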