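from w3lib.html import remove_tags  # assumed source of remove_tags; the import is not shown in the original

def parse_url(self, response):
    # Print the page body with all HTML tags stripped
    print(remove_tags(response.selector.xpath('//body').extract()[0]))

When I used this function to parse the crawled response, it raised an exception with the following error: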
UnicodeEncodeError: 'gbk' codec can't encode character '\xa0' in position 1816: illegal multibyte sequence
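A search on Baidu showed the cause: the console's output encoding is gbk, which cannot represent '\xa0' (a non-breaking space), so print fails when it encodes the text. Re-wrapping standard output with an encoding that covers the character, such as gb18030, fixes it: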
import io
import sys

# Change the default encoding of standard output
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')
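
To see why the fix works, here is a minimal sketch that reproduces the problem without a real console. It assumes an in-memory io.BytesIO can stand in for sys.stdout.buffer; the helper name write_with_encoding and the sample string are made up for illustration and are not part of the original code.

```python
import io


def write_with_encoding(text, encoding, errors="strict"):
    """Write text through a TextIOWrapper and return the bytes produced."""
    buffer = io.BytesIO()  # hypothetical stand-in for sys.stdout.buffer
    wrapper = io.TextIOWrapper(buffer, encoding=encoding, errors=errors)
    wrapper.write(text)
    wrapper.flush()
    data = buffer.getvalue()
    wrapper.detach()  # release the buffer so it is not closed with the wrapper
    return data


sample = "price:\xa0100"  # contains a non-breaking space, as in the traceback

# gbk has no mapping for '\xa0', so a strict write fails
try:
    write_with_encoding(sample, "gbk")
    gbk_ok = True
except UnicodeEncodeError:
    gbk_ok = False

# gb18030 can encode every Unicode character, so the same write succeeds
gb18030_bytes = write_with_encoding(sample, "gb18030")

# Alternative: keep gbk but substitute '?' for characters it cannot encode
replaced = write_with_encoding(sample, "gbk", errors="replace")
```

The errors="replace" variant trades fidelity for safety: output never raises, but unencodable characters are lost. On Python 3.7 and later, sys.stdout.reconfigure(encoding=..., errors=...) is another way to change the stream's settings without re-wrapping it.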