Note: I will elaborate on my answer for Python 3, since the end of life of Python 2 is very close.
In Python 3
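bytes consists of sequences of 8-bit unsigned values, while str consists of sequences of Unicode code points that represent textual characters from human languages.
>>> # bytes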
>>> b = b'h\x65llo'
>>> type(b)
<class 'bytes'>
>>> list(b)
[104, 101, 108, 108, 111]
>>> print(b)
b'hello'
>>>
>>> # str
>>> s = 'nai\u0308ve'
>>> type(s)
<class 'str'>
>>> list(s)
['n', 'a', 'i', '̈', 'v', 'e']
>>> print(s)
naïve
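Even though bytes and str seem to work the same way, their instances are not compatible with each other, i.e., bytes and str instances cannot be used together with operators like > and +. In addition, keep in mind that comparing bytes and str instances for equality, i.e. using ==, will always evaluate to False, even when they contain exactly the same characters.
>>> # concatenation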
>>> b'hi' + b'bye' # this is possible
b'hibye'
>>> 'hi' + 'bye' # this is also possible
'hibye'
>>> b'hi' + 'bye' # this will fail
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat str to bytes
>>> 'hi' + b'bye' # this will also fail
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "bytes") to str
>>>
>>> # comparison
>>> b'red' > b'blue' # this is possible
True
>>> 'red' > 'blue' # this is also possible
True
>>> b'red' > 'blue' # you can't compare bytes with str
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '>' not supported between instances of 'bytes' and 'str'
>>> 'red' > b'blue' # you can't compare str with bytes
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '>' not supported between instances of 'str' and 'bytes'
>>> b'blue' == 'red' # equality between str and bytes always evaluates to False
False
>>> b'blue' == 'blue' # equality between str and bytes always evaluates to False
False
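Another issue when dealing with bytes and str shows up when working with files returned by the open built-in function. On the one hand, if you want to read or write binary data to or from a file, always open the file in a binary mode such as 'rb' or 'wb'. On the other hand, if you want to read or write Unicode data to or from a file, be aware of the default encoding of your computer, and pass the encoding parameter where necessary to avoid surprises.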
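As a rough illustration of the two modes, the sketch below writes and reads one file in binary mode and another in text mode with an explicit encoding. The file names, the temporary directory, and the choice of UTF-8 are assumptions made for this example only.

```python
import os
import tempfile

# Hypothetical scratch directory and file names, used only for illustration.
workdir = tempfile.mkdtemp()
bin_path = os.path.join(workdir, 'example.bin')
text_path = os.path.join(workdir, 'example.txt')

# Binary mode: write() expects bytes and read() returns bytes.
with open(bin_path, 'wb') as f:
    f.write(b'\xf1\xf2\xf3')
with open(bin_path, 'rb') as f:
    raw = f.read()

# Text mode: write() expects str and read() returns str.
# Passing encoding explicitly avoids depending on the platform default.
with open(text_path, 'w', encoding='utf-8') as f:
    f.write('na\u00efve')
with open(text_path, 'r', encoding='utf-8') as f:
    text = f.read()

# Reading the text file in binary mode exposes the bytes stored on disk.
with open(text_path, 'rb') as f:
    encoded = f.read()

print(type(raw), type(text), encoded)
```

Omitting encoding in the text-mode calls would make the result depend on the machine's default encoding, which is the surprise the paragraph above warns about.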
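In Python 2
str consists of sequences of 8-bit values, while unicode consists of sequences of Unicode characters. One thing to keep in mind is that str and unicode can be used together with operators if the str contains only 7-bit ASCII characters.
It may be useful to write helper functions that convert between str and unicode in Python 2, and between bytes and str in Python 3.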
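For Python 3, such helpers might look like the minimal sketch below. The names to_str and to_bytes and the UTF-8 default are assumptions chosen for illustration, not part of any standard library API.

```python
def to_str(bytes_or_str, encoding='utf-8'):
    """Return a str, decoding the input first if it is bytes."""
    if isinstance(bytes_or_str, bytes):
        return bytes_or_str.decode(encoding)
    return bytes_or_str  # already a str


def to_bytes(bytes_or_str, encoding='utf-8'):
    """Return bytes, encoding the input first if it is a str."""
    if isinstance(bytes_or_str, str):
        return bytes_or_str.encode(encoding)
    return bytes_or_str  # already bytes


print(to_str(b'hello'))
print(to_bytes('hello'))
```

Calling these at the boundaries of a program (when reading input or writing output) lets the core of the code work with a single type, so mixed bytes and str operations like the failing ones above do not occur.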