String encoding issues in Python 3 and Python 2


酷跑阿迪

Category: Python  Tags: python, string encoding, encoding comparison

Article link: https://blog.csdn.net/W_Hfly/article/details/66972172

Python 3 and Python 2 handle string encoding differently. The sections below compare the two.

1. Checking the Python version

import sys
__author__ = "author"
print(sys.version_info)  # named tuple (major, minor, micro, releaselevel, serial)
print(sys.version)       # full version string

Python 3.6.0: [screenshot: version output under Python 3.6.0]
Python 2.7.11: [screenshot: version output under Python 2.7.11]
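
Code that has to run on both interpreters often branches on the version information shown above. The following is a minimal sketch of that idea; the names PY3 and text_type are assumptions introduced here for illustration and do not come from the original article.

```python
import sys

# sys.version_info is a named tuple; index 0 (also .major) is the major version.
PY3 = sys.version_info[0] >= 3

# Pick the Unicode text type of the running interpreter.
if PY3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (this name exists only on Python 2)

print(PY3, text_type)
```

Comparing sys.version_info[0] works on both versions, which is why it is used here instead of parsing the sys.version string.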

2. Checking Python's default encoding

print(sys.getdefaultencoding()) #python3
print sys.getdefaultencoding() #python2

The output is utf-8 under Python 3 and ascii under Python 2.
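
To see what the default means in practice, here is a small Python 3 sketch. The sample text and variable names are assumptions chosen for illustration.

```python
import sys

default_encoding = sys.getdefaultencoding()  # "utf-8" on Python 3
text = "中国"

# With no argument, str.encode() uses UTF-8 on Python 3.
implicit = text.encode()
explicit = text.encode("utf-8")

print(default_encoding)
print(implicit == explicit)
```

On Python 2 the ascii default is what makes implicit conversions of non-ASCII text fail, so naming the codec explicitly is the safer habit on both versions.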

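3. String encoding differences between Python 3 and Python 2
Python 3 has two string types: bytes and str. Python 2 also has two: unicode and str, where bytes is simply an alias for str.
A bytes object holds raw binary data, for example b"hello". Each element is an 8-bit value. The 7-bit limit in Python 2 comes from its default ascii codec, which is applied whenever str and unicode are mixed implicitly.
Python 3 is especially strict about encoding and decoding: bytes and str are different types, and comparing them gives False. In Python 2, a str and a unicode object that hold the same ASCII text compare as True, because the str is decoded implicitly. For example: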
Python 3: [screenshot of the comparison result] Python 2: [screenshot of the comparison result]
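
Since the screenshots are not reproduced here, the following Python 3 sketch shows the same comparison. The variable names are assumptions chosen for illustration.

```python
text = "abc"   # str: a sequence of Unicode code points
data = b"abc"  # bytes: a sequence of 8-bit values

# On Python 3, str and bytes never compare equal.
same_without_decoding = (text == data)

# Decoding first puts both sides in the same type.
same_after_decoding = (text == data.decode("ascii"))

print(type(text).__name__, type(data).__name__)
print(same_without_decoding, same_after_decoding)
```
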
The results show the difference. To check which type a value has and convert it to the type you need, you can write helper functions based on isinstance:

# Python 3
def Get_Str(str_or_bytes):
    # Decode bytes to str; return str input unchanged.
    if isinstance(str_or_bytes, bytes):
        value = str_or_bytes.decode("utf-8")
    else:
        value = str_or_bytes
    return value

def Get_Bytes(str_or_bytes):
    # Encode str to bytes; return bytes input unchanged.
    if isinstance(str_or_bytes, str):
        value = str_or_bytes.encode("utf-8")
    else:
        value = str_or_bytes
    return value

# Test
str1 = "abc"
str2 = b"abc"
print(Get_Bytes(str1))  # b'abc'
print(Get_Str(str2))    # abc

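# -*- coding: utf-8 -*-
# Python 2
def Get_Unicode(str_or_bytes):
    # In Python 2, bytes is an alias for str; decode it to unicode.
    if isinstance(str_or_bytes, bytes):
        value = str_or_bytes.decode("utf-8")
    else:
        value = str_or_bytes
    return value

def Get_Str(str_or_bytes):
    # Encode unicode text to a byte string (str); return str input unchanged.
    # Checking for unicode avoids the implicit ascii decode that
    # str.encode() performs on a Python 2 byte string.
    if isinstance(str_or_bytes, unicode):
        value = str_or_bytes.encode("utf-8")
    else:
        value = str_or_bytes
    return value

# Test
str1 = u"abc"
str2 = b"abc"
print Get_Unicode(str2)
print Get_Str(str1)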
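
The two versions of these helpers can also be folded into one pair that runs on either interpreter. This is only a sketch under assumptions: the names text_type, binary_type, to_text and to_bytes are hypothetical and are not taken from the article or from any library.

```python
import sys

# Map "text" and "binary" to the right built-in types for this interpreter.
if sys.version_info[0] >= 3:
    text_type, binary_type = str, bytes
else:
    text_type, binary_type = unicode, str  # noqa: F821 (Python 2 only)


def to_text(value, encoding="utf-8"):
    """Return value as Unicode text, decoding byte strings if needed."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Return value as a byte string, encoding Unicode text if needed."""
    if isinstance(value, text_type):
        return value.encode(encoding)
    return value


print(to_text(b"abc"))
print(to_bytes(u"abc"))
```

Passing the codec as a parameter keeps the helpers usable for data that is not UTF-8, for example to_text(raw, "gbk").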

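4. In Python 3, %s formatting cannot mix bytes and str, while Python 2 allows it. (Before Python 3.5, bytes did not support % formatting at all.)
Python 3: bytes formatting fails [screenshot]

Python 2: bytes formatting succeeds [screenshot]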
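
The screenshots are not reproduced here, so the following sketch shows the Python 3 behaviour directly. It assumes Python 3.5 or later, where bytes gained % formatting; the variable names are illustrative.

```python
# A str template with a bytes argument does not fail, but it inserts
# the repr of the bytes object instead of its text.
text_result = "%s" % b"abc"

# A bytes template with a bytes argument works on Python 3.5+.
bytes_result = b"%s" % b"abc"

# A bytes template with a str argument raises TypeError.
try:
    b"%s" % "abc"
    mixed_error = None
except TypeError as exc:
    mixed_error = type(exc).__name__

# Decoding first gives the output most people expect.
decoded_result = "%s" % b"abc".decode("utf-8")

print(text_result, bytes_result, mixed_error, decoded_result)
```
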

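5. Writing to a file also differs. Python 3 does not accept bytes in a file opened in text mode; the file must be opened in binary mode. Python 2 accepts bytes in both modes.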

# Python 3
"""
with open("test.text", "w+") as f:
    f.write(b"Welcome to China")  # TypeError: text mode expects str
"""
with open("test.text", "wb+") as f:
    f.write(b"Welcome to China")  # works

# Python 2
with open("test.text", 'a+') as f:
    f.write(b"Welcome to China\n")  # works

with open("test.text", "ab+") as f:
    f.write(b"Welcome to China")  # works
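
A self-contained Python 3 sketch of the same rule follows. It writes to a temporary directory so that it leaves no file behind in the working directory; the path and variable names are assumptions chosen for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.text")

# Text mode accepts str only; name the encoding explicitly.
# newline="\n" keeps line endings identical on every platform.
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("Welcome to China\n")

# Binary mode accepts bytes only.
with open(path, "ab") as f:
    f.write(b"Welcome to China\n")

# Writing bytes to a text-mode file raises TypeError.
try:
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        f.write(b"Welcome to China\n")
    text_mode_error = None
except TypeError as exc:
    text_mode_error = type(exc).__name__

with open(path, "rb") as f:
    content = f.read()

print(text_mode_error)
print(content)
```
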

6. encode and decode should be used as a matching pair, with the same codec on both sides, as the following code shows:

s = "Welcome to China"
print(s.encode("gbk"))
print(s.encode("utf-8"))
print(s.encode("utf-8").decode("utf-8"))
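
With ASCII-only text the codec mismatch stays hidden, because gbk and utf-8 produce the same bytes for it. The Python 3 sketch below uses non-ASCII sample text, an assumption chosen for illustration, to show what happens when the pair does not match.

```python
text = "中国"

gbk_bytes = text.encode("gbk")
utf8_bytes = text.encode("utf-8")

# Matching pair: decoding with the codec used for encoding restores the text.
round_trip = gbk_bytes.decode("gbk")

# Mismatched pair: GBK bytes are not valid UTF-8 here.
try:
    gbk_bytes.decode("utf-8")
    mismatch_error = None
except UnicodeDecodeError as exc:
    mismatch_error = type(exc).__name__

print(gbk_bytes, utf8_bytes)
print(round_trip, mismatch_error)
```
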

For anyone who enjoys working with Python, getting encoding and decoding right matters especially in Python 3.
