Fixing UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte

annyangya

Category: python. Tags: python, garbled text, csv

Article link: https://blog.csdn.net/ayangann915/article/details/117523755

A problem that had bothered me for a long time finally got solved today!!!

An asynchronous CSV import was failing with UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte

The original code was:

    import requests

    resp = requests.get(private_url).content  # raw response body
    with open(file_dir_name, "w") as fd:
        # raises UnicodeDecodeError when the body is not valid UTF-8
        fd.write(resp.decode(encoding="utf-8"))

I swapped the codec for gbk, Latin, and all sorts of others, but whenever the file contained Chinese, it still raised an error.
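As a rough illustration of why the codec matters, here is a minimal sketch that reproduces the same error message and then tries a short list of candidate codecs. It assumes the CSV was saved as GBK, which the post does not confirm; `decode_with_fallback` and the sample data are hypothetical names made up for this example.

```python
def decode_with_fallback(data, encodings=("utf-8", "gbk")):
    """Try each candidate codec in order; return (text, codec_used).

    latin-1 is deliberately left out of the candidates: it maps every
    byte to some character, so it never fails and would hide a wrong guess.
    """
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


# Assumed payload: a GBK-encoded CSV header whose first byte is 0xc8.
sample = "人,年龄\n".encode("gbk")

try:
    sample.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = str(exc)  # same kind of message as in the traceback above

text, used = decode_with_fallback(sample)
```

If the real file turns out to use a different encoding, the candidate tuple would need to change; the order matters because the first codec that succeeds wins.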

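Back then I was naive: I never bothered to translate the message, and simply assumed the codec was wrong. Only today did I finally take the word "bytes" seriously, and I felt I had found a way in.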

So I printed the type of resp, and sure enough it was <class 'bytes'>, that is, raw binary bytes.

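Then I read the code again. resp comes from the content attribute of the requests response, so I clicked through to the docstring of content in the source: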

    def content(self):
        """Content of the response, in bytes."""

So the response content is of type bytes.
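Since the body is already bytes, one alternative worth considering (not what the post ends up doing) is to skip decoding altogether and write the bytes in binary mode, so the file keeps whatever encoding the server sent. A minimal sketch, where `save_bytes` is a hypothetical helper and the payload is a stand-in for `requests.get(private_url).content`:

```python
import os
import tempfile


def save_bytes(data, path):
    """Write the payload unchanged; binary mode performs no decoding."""
    with open(path, "wb") as fd:
        fd.write(data)


# Stand-in for the downloaded body; GBK is an assumption for illustration.
payload = "人,年龄\n".encode("gbk")

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "export.csv")
    save_bytes(payload, target)
    with open(target, "rb") as fd:
        saved = fd.read()  # byte-for-byte identical to the payload
```

Whatever reads the file later still has to know its encoding, but nothing is lost or altered on the way to disk.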

The changed code is:

    resp = requests.get(private_url).content
    with open(file_dir_name, "w") as fd:
        # unicode_escape does not reject bytes such as 0xc8
        fd.write(resp.decode(encoding="unicode_escape"))

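unicode_escape reverses an escaping step: if the text was stored as escape sequences such as \u4eba, decoding with it gives back the original text. It reads every other byte as Latin-1, so it does not raise on a byte like 0xc8. The flip side is that raw non-ASCII bytes (GBK-encoded Chinese, for example) come out as Latin-1 characters rather than the original Chinese, so it is worth opening the saved file to check that the content is what you expect.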
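To make the behaviour of unicode_escape concrete, here is a small sketch comparing two kinds of input: text stored as backslash escapes, and raw GBK bytes. The GBK case is an assumption for illustration; the post does not say which encoding the real file used.

```python
original = "人,年龄"

# Case 1: the text was stored as ASCII escape sequences (\uXXXX).
escaped = original.encode("unicode_escape")
from_escaped = escaped.decode("unicode_escape")  # round-trips to the original

# Case 2: the bytes are raw GBK (assumed), not escape sequences.
raw_gbk = "人".encode("gbk")
via_escape = raw_gbk.decode("unicode_escape")  # no error, but Latin-1 characters
via_gbk = raw_gbk.decode("gbk")                # the actual character
```

In the second case the call succeeds without an exception, which can look like a fix, yet the result differs from the text the bytes were meant to represent.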

In the end, that got rid of the error for me.

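PS: I remember a product manager once asking me why the file couldn't contain Chinese, and I argued with him at length. Looking back, I just wasn't good enough: when I hit a problem and couldn't find a fix, I moved on. This time I nearly fell back on banning Chinese from the file again, but this product required Chinese, so I dug a little deeper... There are no unsolvable problems, only a lazy heart that is afraid of trouble.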
