My code:
import requests
import simplejson

page = requests.get("http://www.sogou.com/kmap?query=%E9%99%88%E5%A5%95%E8%BF%85&from=relation&id=")
pageJson = simplejson.loads(page.text)
It raised the following error:
Traceback (most recent call last):
  File "D:/pythonCode/crawl/DownloadSogouTupu.py", line 7, in <module>
    pageJson = simplejson.loads(page.text)
  File "C:\Users\denglinjie\AppData\Local\Programs\Python\Python35-32\lib\site-packages\simplejson\__init__.py", line 516, in loads
    return _default_decoder.decode(s)
  File "C:\Users\denglinjie\AppData\Local\Programs\Python\Python35-32\lib\site-packages\simplejson\decoder.py", line 377, in decode
    raise JSONDecodeError("Extra data", s, end, len(s))
simplejson.scanner.JSONDecodeError: Extra data: line 1 column 22089 - line 1 column 22090 (char 22088 - 22089)
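Cause:
The data returned by the URL has one extra character, a ^M, at the end of the line, after the JSON document itself, so simplejson fails to parse it. Dropping the last character fixes it: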
pageJson = simplejson.loads(page.text[0:-1])
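
Slicing off the last character works only while the response ends with exactly one stray character. A somewhat more tolerant variant, sketched below, parses the first JSON value and hands back whatever follows it, using the decoder's raw_decode method. This is a sketch under assumptions rather than the fix used above: it relies on the standard-library json module (simplejson's JSONDecoder offers the same raw_decode method), the helper name parse_leading_json is hypothetical, and the sample string with its trailing ";" is made up for illustration and is not the real Sogou response.

```python
import json


def parse_leading_json(text):
    """Parse the first JSON value in text; return (value, leftover_text).

    Hypothetical helper for illustration only.
    """
    decoder = json.JSONDecoder()
    # raw_decode does not skip leading whitespace, so strip it first.
    stripped = text.lstrip()
    # raw_decode returns the parsed value and the index where it stopped.
    value, end = decoder.raw_decode(stripped)
    return value, stripped[end:]


# Made-up payload: a valid JSON object followed by one stray character.
sample = '{"name": "kmap", "ids": [1, 2]};'

# A strict parse rejects the stray character with "Extra data".
try:
    json.loads(sample)
    strict_error = None
except json.JSONDecodeError as exc:
    strict_error = exc.msg

# The tolerant parse returns the object plus the leftover text.
value, trailing = parse_leading_json(sample)
print(strict_error)
print(value, repr(trailing))
```

Inspecting the leftover text (for example with repr) also shows exactly which character the server appended, which is safer than assuming it is always a single one.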