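I only recently started learning Python and wanted to try scraping book information from Dangdang (当当网). I ran into a few pitfalls along the way, so I'm sharing them here.
After writing the code, I ran it and got an error: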
C:\Users\freeid\venv\untitled31\Scripts\python.exe C:/Users/freeid/PycharmProjects/untitled3/当当/dangdang.py
Traceback (most recent call last):
  File "C:/Users/freeid/PycharmProjects/untitled3/当当/dangdang.py", line 52, in <module>
    sss.index_request()
  File "C:/Users/freeid/PycharmProjects/untitled3/当当/dangdang.py", line 26, in index_request
    self.detail_request(title, bigsrc,picsrc)
  File "C:/Users/freeid/PycharmProjects/untitled3/当当/dangdang.py", line 40, in detail_request
    writer.writerow([title,author,content,price])
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 142: illegal multibyte sequence
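It turns out that content contains the character \xa0 (a non-breaking space), which the gbk codec cannot encode when the row is written to the file. Replacing it fixes the problem:
content = content.replace('\xa0', '')  # strip the non-breaking space that gbk cannot encode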
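Another way around this is probably to give the output file an explicit encoding, so that the writer never falls back to gbk. The sketch below assumes the CSV file was opened without an `encoding` argument, which the traceback suggests but my original code is not shown here to confirm. The names `clean_text`, `write_rows`, and `books.csv` are made up for illustration and are not from my script.

```python
import csv
import os
import tempfile


def clean_text(text):
    """Swap non-breaking spaces (\\xa0) for ordinary spaces.

    Hypothetical helper: unlike replace('\\xa0', ''), this keeps the
    words separated instead of gluing them together.
    """
    return text.replace('\xa0', ' ')


def write_rows(path, rows, encoding='utf-8-sig'):
    """Write rows to a CSV file with an explicit encoding.

    Assumption: 'utf-8-sig' is used so the file can hold any Unicode
    character and still open cleanly in Excel on Windows.
    newline='' is what the csv module documentation recommends.
    """
    with open(path, 'w', newline='', encoding=encoding) as f:
        writer = csv.writer(f)
        writer.writerows(rows)


if __name__ == '__main__':
    # Sample row with made-up values, only to exercise the helpers.
    row = ['title', 'author', 'some\xa0content', 'price']

    # Reproduce the original failure: gbk has no mapping for \xa0.
    try:
        row[2].encode('gbk')
    except UnicodeEncodeError as exc:
        print('gbk failed:', exc)

    path = os.path.join(tempfile.mkdtemp(), 'books.csv')
    # Either fix works on its own; both are shown here together.
    write_rows(path, [[clean_text(cell) for cell in row]])
    print('wrote', path)
```

With an explicit UTF-8 encoding the replace step is no longer needed to avoid the crash, though cleaning up \xa0 can still make the saved text tidier.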