Python requests GET request and UTF-8: why does requests.get() return incorrectly decoded text instead of UTF-8?...


吞鲸者1号


When a server sends the header 'Content-Type: text/html' with no charset, the text returned by requests.get() is decoded incorrectly.

However, when the header names the charset explicitly, as in 'Content-Type: text/html; charset=utf-8', the text is decoded correctly.

Also, urllib.urlopen() returns properly encoded data for the same page.

Has anyone noticed this before? Why does requests.get() behave like this?

Solution:

When you make a request, Requests makes an educated guess about the encoding of the response based on the HTTP headers. That guessed encoding is used when you access r.text. You can find out which encoding Requests is using, and change it, through the r.encoding property.

>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'

Check which encoding Requests used for your page, and if it is not the right one, set r.encoding to the one you need before reading r.text.
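To see why a wrong guess produces garbage, here is a minimal sketch that uses only the standard library and no network access. The sample string and the function names decode_as and repair are made up for illustration. It decodes the same UTF-8 bytes twice, once with the wrong encoding and once with the right one, which is roughly what changing r.encoding does before r.text is read.

```python
def decode_as(raw: bytes, encoding: str) -> str:
    """Decode a response body with the given encoding."""
    return raw.decode(encoding)


def repair(garbled: str, wrong: str = "iso-8859-1", right: str = "utf-8") -> str:
    """Undo a wrong decode by re-encoding with the wrong codec and
    decoding again with the right one."""
    return garbled.encode(wrong).decode(right)


# Hypothetical body: UTF-8 bytes as a server might send them.
raw = "héllo, 世界".encode("utf-8")

garbled = decode_as(raw, "iso-8859-1")  # wrong guess: mojibake
correct = decode_as(raw, "utf-8")       # right encoding: readable text

print(repr(garbled))
print(correct)
print(repair(garbled) == correct)
```

The repair step works here because ISO-8859-1 maps every byte to exactly one character, so no information is lost by the wrong decode. With Requests itself, setting r.encoding before reading r.text avoids the wrong decode in the first place.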

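As for the difference between requests and urllib.urlopen(): they handle the encoding differently. When a text/* response declares no charset, Requests falls back to ISO-8859-1 for r.text, whereas urllib.urlopen() hands back the raw bytes without decoding them, so UTF-8 content passes through unchanged.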
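The header-based guess can be imitated with the standard library alone. This is a simplified sketch, not Requests' actual code: the function name guess_encoding is hypothetical, and the ISO-8859-1 fallback for text/* types without a charset is written in here as an assumption that matches the behaviour described in the question.

```python
from email.message import Message
from typing import Optional


def guess_encoding(content_type: str) -> Optional[str]:
    """Guess a text encoding from a Content-Type header value.

    Returns the declared charset if present, ISO-8859-1 for text/*
    types without one, and None for everything else.
    """
    msg = Message()
    msg["Content-Type"] = content_type

    charset = msg.get_content_charset()  # lowercased charset, or None
    if charset:
        return charset
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"  # assumed fallback for text/* with no charset
    return None


print(guess_encoding("text/html"))
print(guess_encoding("text/html; charset=utf-8"))
print(guess_encoding("image/png"))
```

Under this rule the first header from the question yields ISO-8859-1 and the second yields utf-8, which would explain why only the second one decodes correctly.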

