Exception while crawling with Python: encoding error when using requests in a crawler

Latest recommended article published on 2023-05-07 08:40:22

weixin_39723899

Tags: exception while crawling with Python

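# -*- coding: utf-8 -*-
import requests


def load_url(url, file_name):
    """Fetch url and return the decoded page text, or None on failure."""
    try:
        # The header name is 'User-Agent' (the original had 'Agent-User').
        my_headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-US) AppleWebKit/533.3 (KHTML, like Gecko) Chrome/5.0.354.0 Safari/533.3'}
        resp = requests.get(url, headers=my_headers)
        resp.raise_for_status()
        # Guess the text encoding from the response body.
        resp.encoding = resp.apparent_encoding
        print('爬取%s内容完成' % file_name)  # "finished fetching %s"
        return resp.text
    except requests.RequestException:
        print('爬取失败!')  # "fetch failed!"
        return None


def save_data(data, file_name):
    print('开始保存文件%s' % file_name)  # "start saving file %s"
    # No encoding is passed here, so the platform default is used; this is
    # the call that raises the UnicodeEncodeError in the error output.
    with open(file_name, 'w') as f:
        f.write(data)
    print('文件 %s保存完成！' % file_name)  # "file %s saved"


def spider(kw, begin, end):
    for page in range(begin, end + 1):
        # Offset for this page, 50 entries per page (the original used
        # begin here, so every iteration requested the same page).
        pn = (page - 1) * 50
        # The original rebound kw to a dict inside the loop, which would
        # break the string concatenation on the second iteration.
        full_url = 'http://tieba.baidu.com/f?' + 'kw=' + kw + '&ie=utf-8&pn=' + str(pn)
        print(full_url)
        file_name = '网页' + str(page) + '.html'  # "page<N>.html"
        html = load_url(full_url, file_name)
        if html is not None:
            save_data(html, file_name)


if __name__ == '__main__':
    # url = 'http://tieba.baidu.com/f?'
    kw = input('请输入爬取的贴吧名称：')  # name of the tieba (forum) to crawl
    begin = int(input('请输入爬取开始的页号：'))  # first page number
    end = int(input('爬取结束的页号:'))  # last page number
    spider(kw, begin, end)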

Error output:

F:\Python\python.exe F:/Python/练习夹/spider/tiebaCase.py
请输入爬取的贴吧名称：战狼2
请输入爬取开始的页号：1
爬取结束的页号:2
http://tieba.baidu.com/f?kw=战狼2&ie=utf-8&pn=0
爬取网页1.html内容完成
开始保存文件网页1.html
Traceback (most recent call last):
  File "F:/Python/练习夹/spider/tiebaCase.py", line 37, in <module>
    spider(kw,begin,end)
  File "F:/Python/练习夹/spider/tiebaCase.py", line 30, in spider
    save_data(html,file_name)
  File "F:/Python/练习夹/spider/tiebaCase.py", line 19, in save_data
    f.write(data)
UnicodeEncodeError: 'gbk' codec can't encode character '\xe7' in position 265: illegal multibyte sequence
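
The last line of the traceback says that `f.write` tried to encode the page text as GBK and that the character '\xe7' (ç) has no GBK representation. On a Windows system whose locale encoding is GBK, `open(file_name, 'w')` without an `encoding` argument uses that codec. The sketch below reproduces the failure on a one-character sample instead of the real page. Its last step shows one plausible way such a character can end up in a Chinese page: UTF-8 bytes decoded with a single-byte codec. That is a guess about the cause, not something the traceback confirms. The helper `can_encode` and the sample values are made up for illustration.

```python
def can_encode(text, codec):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


sample = "\xe7"  # the character named in the traceback

gbk_ok = can_encode(sample, "gbk")      # GBK has no code for this character
utf8_ok = can_encode(sample, "utf-8")   # UTF-8 can encode any Unicode text

# Assumption for illustration: a Chinese character (U+767E) whose UTF-8
# bytes are wrongly decoded as Latin-1 starts with that same '\xe7'.
mojibake = "\u767e".encode("utf-8").decode("latin-1")

print(gbk_ok, utf8_ok, ascii(mojibake[0]))
```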

How can I fix this error?
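
A likely fix is to pass an explicit `encoding='utf-8'` to `open` in `save_data`, so the write no longer depends on the system locale. The sketch below is a variant of `save_data` under that assumption. `save_bytes` is a hypothetical alternative that writes the undecoded response body (for example `requests`' `Response.content`) and skips text encoding entirely. The demo string and temporary path are made up.

```python
import os
import tempfile


def save_data(data, file_name):
    """Write text to file_name as UTF-8, regardless of the system locale."""
    with open(file_name, "w", encoding="utf-8") as f:
        f.write(data)


def save_bytes(raw, file_name):
    """Alternative: write the raw, undecoded bytes of the response body."""
    with open(file_name, "wb") as f:
        f.write(raw)


# Demo with a made-up string containing the character from the traceback
# plus two Chinese characters (U+767E, U+5EA6).
demo_text = "<title>\xe7 \u767e\u5ea6</title>"
demo_path = os.path.join(tempfile.mkdtemp(), "page1.html")

save_data(demo_text, demo_path)
with open(demo_path, encoding="utf-8") as f:
    round_trip = f.read()

save_bytes(demo_text.encode("utf-8"), demo_path)
with open(demo_path, "rb") as f:
    raw_round_trip = f.read()
```

If the saved page still shows garbled characters, the encoding guessed by `apparent_encoding` may be wrong for this site. Setting the response's `encoding` attribute to `'utf-8'` in `load_url` could be worth trying, on the assumption that the page really is UTF-8.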
