UnicodeDecodeError: 'gbk' codec can't decode byte 0xd0 in position 15219: illegal multibyte sequence

YANNIand

Column: 21-Day Web Scraping in Practice. Tags: python

Article link: https://blog.csdn.net/YANNIand/article/details/104884310

When writing a crawler you run into all kinds of encoding errors. This one complains that a byte sequence is illegal. The error is:

UnicodeDecodeError: 'gbk' codec can't decode byte 0xd0 in position 15219: illegal multibyte sequence

The relevant part of the original crawler code:

from lxml import etree
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:73.0) Gecko/20100101 Firefox/73.0',
    'Referer': 'https://www.ygdy8.net/html/gndy/china/index.html'
}

url = "https://www.ygdy8.net/html/gndy/china/index.html"
response = requests.get(url, headers=headers)
# Strict GBK decoding: this is the line that raises UnicodeDecodeError
text = response.content.decode('gbk')
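Before changing the decode call, it can help to look at the bytes the codec rejected. The sketch below is not from the original post: `locate_decode_error` is a hypothetical helper, and `sample` is made-up data (valid GBK text with one stray byte appended) standing in for `response.content`. It relies only on the `start`, `end` attributes that Python sets on `UnicodeDecodeError`.

```python
def locate_decode_error(raw, encoding="gbk", context=4):
    """Hypothetical helper: return (start, end, nearby bytes), or None if raw decodes cleanly."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start / exc.end delimit the bytes the codec could not decode
        lo = max(exc.start - context, 0)
        return exc.start, exc.end, raw[lo:exc.end + context]
    return None


# Made-up sample: four GBK-encoded characters followed by a lone lead byte 0xd0
good = "电影天堂".encode("gbk")
sample = good + b"\xd0"

print(locate_decode_error(sample))
print(locate_decode_error(good))
```

With the real page, passing `response.content` instead of `sample` should report the position named in the traceback, assuming the page bytes are unchanged.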

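I checked the spelling and letter case of the codec name, and neither was the problem.
Changing gbk to the more common utf-8 did not work either.
I then found a fix online: pass errors='ignore', which makes decode skip bytes that are not valid GBK instead of raising an exception.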

text = response.content.decode(encoding='gbk', errors='ignore')
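One thing to keep in mind is that `errors='ignore'` drops the undecodable bytes silently. As a small illustration (the sample bytes are made up, not taken from the page in the post), the sketch below compares the three standard error handlers `strict`, `ignore`, and `replace` on the same input.

```python
# Made-up sample: valid GBK text plus a lone lead byte 0xd0 with no trail byte
expected = "电影天堂"
broken = expected.encode("gbk") + b"\xd0"

# Default (strict) handling raises, as in the traceback above
try:
    broken.decode("gbk")
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    print("strict failed:", exc.reason)

# 'ignore' drops the bad byte without leaving any trace
ignored = broken.decode("gbk", errors="ignore")

# 'replace' substitutes U+FFFD, so the damaged spot stays visible
replaced = broken.decode("gbk", errors="replace")

print(ignored == expected)
print(replaced.count("\ufffd"))
```

If it matters where data was lost, `errors='replace'` may be the safer choice, since the U+FFFD markers can be searched for afterwards.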

After that, the crawler ran through and produced its output.
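An alternative worth trying before discarding bytes is to attempt several encodings in strict mode and only fall back to a lossy decode at the end. The sketch below is an assumption-laden illustration, not the post's method: `decode_with_fallback` and its candidate list are hypothetical, and nothing here confirms which encoding the site actually uses. gb18030 is included because it is a superset of GBK, so some bytes that GBK rejects decode cleanly with it.

```python
def decode_with_fallback(raw, candidates=("utf-8", "gbk", "gb18030")):
    """Hypothetical helper: return (text, encoding_used) for the first candidate that decodes cleanly."""
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # No candidate decoded cleanly: use the last one and mark bad bytes with U+FFFD
    last = candidates[-1]
    return raw.decode(last, errors="replace"), last


# Made-up samples standing in for response.content
gbk_bytes = "电影天堂".encode("gbk")
text, used = decode_with_fallback(gbk_bytes)
print(used)

damaged = gbk_bytes + b"\xd0"
text2, used2 = decode_with_fallback(damaged)
print(used2, text2.count("\ufffd"))
```

The order of candidates matters: utf-8 comes first here because GBK-encoded Chinese text rarely happens to be valid UTF-8, whereas the reverse misreading is more likely.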
