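1. Inspecting the page source shows that its character set is gb2312.
2. Install chardet: pip install chardet
3. Import the chardet package in the spider file: import chardet
4. In parse, decode the response body with the detected encoding and save it as UTF-8: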
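    def parse(self, response):
        filename = "shuidaoemiaobing.html"
        # chardet.detect may return None for 'encoding'; fall back to the charset found in step 1
        encoding = chardet.detect(response.body)['encoding'] or 'gb2312'
        # Python file I/O: write the page out re-encoded as UTF-8
        with open(filename, "wb") as f:
            f.write(response.body.decode(encoding, errors='replace').encode('utf-8'))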
5. Run the spider: scrapy crawl agc03_2
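
If installing chardet is not an option, the same transcoding can be sketched with the standard library only, by reading the charset the page declares (what step 1 does by hand) instead of detecting it. This is a different technique from the chardet approach above, not a replacement endorsed by the original note. The names declared_charset and to_utf8 are hypothetical helpers, and treating gb2312/gbk as gb18030 (a superset of both) is an assumption made to tolerate characters outside strict gb2312.

```python
import re

# Matches charset=... in a <meta> tag, with or without quotes.
META_CHARSET = re.compile(rb'charset=["\']?([\w-]+)', re.IGNORECASE)


def declared_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset declared near the top of the HTML, or a default."""
    match = META_CHARSET.search(raw[:2048])
    return match.group(1).decode("ascii").lower() if match else default


def to_utf8(raw: bytes) -> bytes:
    """Decode raw HTML bytes with the declared charset and re-encode as UTF-8."""
    encoding = declared_charset(raw)
    # Assumption: decode gb2312/gbk pages as gb18030, which is a superset of both.
    if encoding in ("gb2312", "gbk"):
        encoding = "gb18030"
    try:
        text = raw.decode(encoding, errors="replace")
    except LookupError:
        # Unknown codec name in the page; fall back to UTF-8.
        text = raw.decode("utf-8", errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    page = ('<html><head><meta http-equiv="Content-Type" '
            'content="text/html; charset=gb2312"></head>'
            '<body>水稻恶苗病</body></html>').encode("gb2312")
    print(declared_charset(page))
    print(to_utf8(page).decode("utf-8"))
```

Inside the spider, f.write(to_utf8(response.body)) would take the place of the chardet-based line in parse.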