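While hand-writing a crawler for https://tech.china.com/article/20180529/20180529144614.html, I found that the downloaded page came out as garbled text. After some searching I fixed it: the encoding that requests detects for the response is not GBK but ISO-8859-1, which is the default requests falls back to for text responses whose Content-Type header names no charset. The fix is to set the response encoding explicitly before reading the text. (Answers I found online set it to utf-8; that value has nothing to do with this page, it is simply the method that turned up in my search.)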
import requests
url = 'http://search.51job.com/jobsearch/search_result.php?fromJs=1&jobarea=090200%2C00&funtype=0000&industrytype=00&keyword=python&keywordtype=2&lang=c&stype=2&postchannel=0000&fromType=1&confirmdate=9'
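r = requests.get(url)
# requests guessed ISO-8859-1 from the headers, but the page is really GBK,
# so override the encoding before accessing r.text
r.encoding = 'GBK'
print(r.text)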
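
To see why the text is garbled and why overriding the encoding repairs it, here is a minimal offline sketch using only the standard library. The helper name `fix_mojibake` and the sample string are assumptions made for illustration; they are not part of requests or of the original crawler.

```python
def fix_mojibake(garbled: str, real_encoding: str = "gbk") -> str:
    """Recover text that was decoded as ISO-8859-1 but was really `real_encoding`.

    ISO-8859-1 maps every byte value to exactly one character, so encoding
    the garbled string back to ISO-8859-1 yields the original bytes unchanged.
    """
    return garbled.encode("iso-8859-1").decode(real_encoding)


# Hypothetical sample standing in for a GBK-encoded page body.
original = "招聘 python"
raw = original.encode("gbk")

# This is what happens when the bytes are decoded with the wrong encoding.
garbled = raw.decode("iso-8859-1")

# Re-encoding and decoding with the real encoding restores the text.
repaired = fix_mojibake(garbled)
print(repaired == original)
```

With requests itself, setting `r.encoding` before reading `r.text` has the same effect without the round trip. If the real encoding is not known in advance, `r.apparent_encoding` gives a guess based on the response body, which may be worth trying first.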