Running the code below produces the error named in the title:
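from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

url = 'http://blog.csdn.net/dc_726/article/details/45399457'
# Pretend to be a browser
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; '
                         'WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}
req = Request(url, headers=headers)
html = urlopen(req)
# Name the parser explicitly so bs4 does not have to guess one
bsHtml = BeautifulSoup(html, 'html.parser')
text = bsHtml.find('div', id="article_content")
print(text)  # the line the traceback points to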
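1. The text was already handled as UTF-8 while the page was being read, so reading is not the problem. The error comes from the printing step (print()), and the traceback in fact points straight at it: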
File "D:/Project/python/Text/text.py", line 34, in <module>
print(text)
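2. print() uses the system default encoding, which is gbk here. When the program runs, text is first encoded to gbk and the console then decodes those bytes for display, so any character that gbk cannot represent makes the encoding step fail. The following change is therefore worth considering (changing the system settings should presumably work as well):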
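One possible shape of such a change, as a minimal sketch and not necessarily the fix used in this post: replace whatever the console codec cannot represent before printing. The helper names console_safe and make_stdout_tolerant are hypothetical, and the non-breaking space '\xa0' is only an assumed example of a character that is missing from gbk.

```python
import sys


def console_safe(text, encoding="gbk"):
    """Return text with characters the codec cannot encode replaced by '?'.

    Hypothetical helper: it round-trips the string through the target
    encoding using the standard 'replace' error handler.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


def make_stdout_tolerant():
    """Alternative: make print() itself replace unencodable characters.

    Uses TextIOWrapper.reconfigure (Python 3.7+); assumes sys.stdout is a
    regular text stream, which may not hold inside some IDEs.
    """
    sys.stdout.reconfigure(errors="replace")


# The strict encode is what fails inside print() on a gbk console.
try:
    "\xa0".encode("gbk")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True
```

With this in place, print(console_safe(str(text))) would avoid the exception at the cost of showing '?' for the characters that were dropped. Passing encoding=sys.stdout.encoding instead of the hard-coded "gbk" would adapt the helper to whatever console is in use.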