Decoding GBK in Python: fixing the Python 3 requests error where GBK-encoded Chinese in a response header makes the request fail

Symptom:

The response headers contain GBK-encoded Chinese, so requests cannot decode and read the headers.

The HTTP packet looks like this:

e7e4461cf40c400f2c16ae7b54d19f95.png

Python 3.4.3 (default, Aug 25 2017, 16:49:50)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> res = requests.get('http://down.chinaz.com/download.asp?id=35&dp=1&fid=22&f=yes',headers={'Referer':'http://down.chinaz.com/soft/12162.htm'},allow_redirects=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.4/site-packages/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.4/site-packages/requests/sessions.py", line 510, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.4/site-packages/requests/sessions.py", line 655, in send
    r._next = next(self.resolve_redirects(r, request, yield_requests=True, **kwargs))
  File "/usr/local/lib/python3.4/site-packages/requests/sessions.py", line 125, in resolve_redirects
    url = self.get_redirect_target(resp)
  File "/usr/local/lib/python3.4/site-packages/requests/sessions.py", line 116, in get_redirect_target
    return to_native_string(location, 'utf8')
  File "/usr/local/lib/python3.4/site-packages/requests/_internal_utils.py", line 25, in to_native_string
    out = string.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 28: invalid continuation byte
>>>

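This makes the request fail outright. Searching Google turned up nothing relevant, because most people run into encoding problems in the response of a request that succeeded, whereas this error is raised while the request is still being made.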

In my tests, Python 2.7 does not have this problem.

The ipython traceback shows that the error comes from this section:

/usr/local/lib/python3.4/site-packages/requests/sessions.py in get_redirect_target(self, resp)
    114             if is_py3:
    115                 location = location.encode('latin1')
--> 116             return to_native_string(location, 'utf8')
    117             #return location
    118
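The two steps on lines 115 and 116 can be reproduced without any network access. The following is a minimal sketch, assuming the Location header carried GBK bytes; the URL, the file name, and the function name `roundtrip` are made up for illustration and are not taken from the real response. On Python 3 the standard http layer exposes header values as latin-1 text, which is why requests re-encodes with latin1 before decoding.

```python
def roundtrip(raw_bytes):
    """Mimic what happens to a Location header on Python 3.

    Returns (utf8_ok, gbk_text). `roundtrip` is a hypothetical helper,
    not part of requests.
    """
    # The http layer hands the header to requests as latin-1 text.
    header_text = raw_bytes.decode("latin1")

    # requests reverses that step (line 115 in the traceback) ...
    location = header_text.encode("latin1")
    assert location == raw_bytes  # the latin-1 round trip is lossless

    # ... and then assumes UTF-8 (line 116).
    try:
        location.decode("utf8")
        utf8_ok = True
    except UnicodeDecodeError:
        utf8_ok = False

    # Decoding the same bytes as GBK recovers the original text.
    return utf8_ok, location.decode("gbk")


# Hypothetical redirect target containing GBK-encoded Chinese.
sample = "http://example.com/软件.zip".encode("gbk")
utf8_ok, text = roundtrip(sample)
print(utf8_ok, text)
```

The bytes themselves survive the latin-1 round trip intact; only the final choice of codec decides whether decoding succeeds.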

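Comparing the underlying requests code on Python 2.7 and Python 3.4 shows the difference.

The code in requests on Python 3.4 that reads the response Location header:

It decodes everything as UTF-8 by default.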

a2aa8bf240e82d263c0f0bbcd3cc977d.png

The Python 2.7 code:

97e527cb41f3497bbed2a6daf915087e.png

Now look at the get_redirect_target function:

207f215c3ab23e4be62ee629478f6aff.png

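This largely confirms the cause: on Python 3.4, requests decodes the Location header as UTF-8 by default, so if the Location value contains GBK-encoded Chinese, the error shown at the start of this post is raised.

A temporary workaround is to change utf-8 to GBK. The following two places also need to be changed; they are used when requesting the Location URL: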

0bebf9eb8008bc886d1b7d3bfce871f4.png
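Hard-coding GBK would in turn break servers that send UTF-8 in Location. A gentler variant of the same idea is to try UTF-8 first and fall back to GBK only when that fails. The sketch below is an assumption-laden illustration, not the change shown in the screenshots: `decode_location` and its `encodings` parameter are hypothetical names, and the function only handles the decoding step, using the standard library alone.

```python
def decode_location(header_text, encodings=("utf-8", "gbk")):
    """Decode a Location value that Python 3 exposed as latin-1 text.

    `decode_location` and `encodings` are illustrative names, not part
    of requests. Encodings are tried in order; the first that succeeds wins.
    """
    # Recover the bytes the server actually sent.
    raw = header_text.encode("latin1")
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing matched: return the latin-1 text unchanged instead of raising.
    return header_text


# Hypothetical header values as the http layer would present them.
gbk_header = "http://example.com/软件.zip".encode("gbk").decode("latin1")
utf8_header = "http://example.com/软件.zip".encode("utf-8").decode("latin1")
print(decode_location(gbk_header))
print(decode_location(utf8_header))
```

The order matters: UTF-8 is strict and rarely accepts GBK bytes by accident, so it goes first. One place such a helper could be called is an override of get_redirect_target in a Session subclass, which would avoid editing the installed package; that wiring is not shown here and has not been tested against the server in this post.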
