Python Series (5): Differences and Connections Between bytes and str

斯曦巍峨

Article link: https://blog.csdn.net/qq_42103091/article/details/124573589

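Differences Between bytes and str

In Python 3, two types represent sequences of character data: bytes and str. A bytes object holds unsigned 8-bit values (usually displayed as ASCII characters), while a str object holds Unicode code points. A code point is the number assigned to a character in a coded character set.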

a = b'hello world'
print(isinstance(a, bytes))
print(list(a))
print(a)
"""
True
[104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]
b'hello world'
"""

a = 'hello world'
print(isinstance(a, str))
print(list(a))
print(a)
"""
True
['h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
hello world
"""

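The built-in isinstance() function checks an object's type; here it tells str and bytes apart.

Python 3 draws a strict line between text (str) and binary data (bytes); the two cannot be mixed.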
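
The same distinction shows up when indexing and measuring length. The following is a minimal sketch (the variable names are illustrative and not part of the original examples): indexing a bytes object yields an int, indexing a str yields a one-character str, and a string with non-ASCII characters has fewer code points than UTF-8 bytes.

```python
text = 'h\u00e9llo'            # str: 'héllo', a sequence of Unicode code points
data = text.encode('utf-8')    # bytes: the UTF-8 encoding of the same text

# Indexing a str gives a one-character str; indexing bytes gives an int.
first_char = text[0]
first_byte = data[0]
print(type(first_char).__name__, type(first_byte).__name__)  # str int

# ord() returns the code point of a single character.
print(ord('\u00e9'))           # 233

# 'é' takes two bytes in UTF-8, so the byte length is larger.
print(len(text), len(data))    # 5 6
```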

x = b'python'
y = b'java'
z = 'c++'
w = 'c'

print(x + y)
# b'pythonjava'
print(z + w)
# c++c
print(x + z)
# TypeError: can't concat str to bytes

print('python' == b'python')
# False

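In the example above, comparing a str with a bytes object using == does not raise an error, but it evaluates to False even when both hold the same characters.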
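
One common way to avoid mixing the two types is to normalize values to a single type before combining or comparing them. The sketch below assumes two hypothetical helpers, to_str and to_bytes, which are not part of the original article or of the standard library.

```python
def to_str(value, encoding='utf-8'):
    """Return value as str, decoding it first if it is bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding it first if it is str (hypothetical helper)."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


x = b'python'
z = 'c++'

print(to_str(x) + z)                   # pythonc++
print(to_bytes(z) + x)                 # b'c++python'
print(to_str(b'python') == 'python')   # True
```

Which type to normalize to depends on the program: text processing usually works on str, while network or file I/O at the boundary works on bytes.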

Converting Between bytes and str

str and bytes can be converted into each other.
To convert str to bytes, call the encode() method.
To convert bytes to str, call the decode() method.

x = b'python'
y = x.decode(encoding='utf-8')
z = y.encode(encoding='utf-8')
print(y)
print(z)
"""
python
b'python'
"""

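Both encode() and decode() take an encoding parameter that specifies which encoding to use; it defaults to 'utf-8'.

Notes on Reading and Writing Files

To write bytes to a file, open it in binary mode (mode='wb'). To read it back, either use mode='rb' or open it in text mode with an explicit encoding; in the latter case the data comes back as str, not bytes.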
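
To see why the encoding argument matters, the sketch below encodes the same text with two codecs and then shows what happens when a codec cannot represent the text. The sample string and variable names are illustrative assumptions.

```python
s = '你好'

utf8_bytes = s.encode('utf-8')   # 3 bytes per character for this text
gbk_bytes = s.encode('gbk')      # 2 bytes per character for this text
print(len(utf8_bytes), len(gbk_bytes))   # 6 4

# Decoding with the same codec that produced the bytes restores the text.
print(utf8_bytes.decode('utf-8') == s)   # True
print(gbk_bytes.decode('gbk') == s)      # True

# ASCII cannot represent these characters, so strict encoding fails.
try:
    s.encode('ascii')
except UnicodeEncodeError as exc:
    print('encode failed:', exc.reason)

# The errors parameter selects a fallback instead of raising.
print(s.encode('ascii', errors='replace'))   # b'??'
```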

x = b'python'

# Wrong: text mode only accepts str
with open('data.bin', mode='w') as fp:
    fp.write(x)
# TypeError: write() argument must be str, not bytes

# Correct: binary mode accepts bytes
with open('data.bin', mode='wb') as fp:
    fp.write(x)

# Reading the binary file, option 1: binary mode returns bytes
with open('data.bin', mode='rb') as fp:
    content = fp.read()
    print(content)
# b'python'

# Reading the binary file, option 2: text mode with an encoding returns str
with open('data.bin', mode='r', encoding='utf-8') as fp:
    content = fp.read()
    print(content, type(content))
# python <class 'str'>
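
As a self-contained variation on the two read options, the sketch below writes to a temporary directory (the path is an illustrative assumption) and also shows that binary mode cannot be combined with an encoding argument.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.bin')  # illustrative path

with open(path, mode='wb') as fp:
    fp.write(b'python')

# Binary mode returns bytes; text mode decodes and returns str.
with open(path, mode='rb') as fp:
    as_bytes = fp.read()
with open(path, mode='r', encoding='utf-8') as fp:
    as_text = fp.read()

print(type(as_bytes).__name__, type(as_text).__name__)   # bytes str
print(as_bytes.decode('utf-8') == as_text)               # True

# Binary mode and an encoding argument cannot be combined.
try:
    open(path, mode='rb', encoding='utf-8')
except ValueError as exc:
    print('ValueError:', exc)
```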

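When reading and writing Unicode text, the main thing to watch is the encoding. It is best to pass the encoding argument explicitly, because the default can depend on the platform.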

x = '世界你好'

with open('data.txt', mode='w', encoding='utf-8') as fp:
    fp.write(x)

with open('data.txt', mode='r', encoding='utf-8') as fp:
    content = fp.read()
    print(content)
# 世界你好

# Wrong: the encoding does not match the one the file was written with
with open('data.txt', mode='r', encoding='gbk') as fp:
    content = fp.read()
    print(content)
# 涓栫晫浣犲ソ
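
A mismatched encoding does not always yield garbled text; depending on the codec and the bytes involved, decoding may raise an error instead. The sketch below reads the raw bytes in binary mode and decodes them explicitly, which makes the failure easy to catch. The temporary path and variable names are illustrative assumptions.

```python
import os
import tempfile

text = '世界你好'

# Write the text as UTF-8 into a temporary file (illustrative path).
path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with open(path, mode='w', encoding='utf-8') as fp:
    fp.write(text)

# Read the raw bytes, then decode them explicitly.
with open(path, mode='rb') as fp:
    raw = fp.read()

print(len(raw))                      # 12
print(raw.decode('utf-8') == text)   # True

# Decoding with a codec that cannot interpret the bytes raises an error.
failed = False
try:
    raw.decode('ascii')
except UnicodeDecodeError:
    failed = True
print(failed)                        # True
```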
