Converting Python bytes to and from list, int, and str

fhqlongteng

Last modified: 2022-07-26 17:40:17

First published: 2022-01-08 20:53:51

Article link: https://blog.csdn.net/fhqlongteng/article/details/122385315

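1. The bytes type

bytes (a byte string, sometimes called a byte stream) is the data type normally used for low-level I/O: reads from a serial port, an Ethernet socket, or a file opened in binary mode all return bytes.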

# Print a byte string as hex
b = [i for i in range(0x80)]
c_bytes = bytes(b)
print("bytes str:", c_bytes.hex())

Output:
bytes str: 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f606162636465666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f
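
As a small supplement to the example above, the following sketch shows that a bytes object behaves as an immutable sequence of integers in the range 0 to 255. The variable names are illustrative and do not come from the original example.

```python
# A bytes object is an immutable sequence of integers in range(256).
data = bytes([0x01, 0x7F, 0xFF])

first = data[0]      # indexing yields an int
tail = data[1:]      # slicing yields another bytes object
length = len(data)

# Values outside 0..255 are rejected when constructing bytes.
try:
    bytes([256])
    out_of_range_ok = True
except ValueError:
    out_of_range_ok = False

print(first, tail, length, out_of_range_ok)
```

Indexing returns an int while slicing returns bytes, which is worth remembering when parsing frames byte by byte.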

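2. Converting between bytes and int

Use the int methods to_bytes and from_bytes.

int.to_bytes(length, byteorder, *, signed=False)

The integer is represented using length bytes. If it does not fit in that many bytes, OverflowError is raised.

The byteorder argument sets the byte order of the result. With "big", the most significant byte comes first in the byte array; with "little", it comes last. To request the host system's native byte order, pass sys.byteorder.

The signed argument determines whether two's complement is used to represent the integer. If signed is False and a negative integer is given, OverflowError is raised. The default is False.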

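a = 0xf234
print("int convert to bytes (big-endian)", a.to_bytes(2, byteorder="big", signed=False))
# Three bytes are used here because 0xf234 does not fit in two signed bytes
print("int convert to bytes (little-endian)", a.to_bytes(3, byteorder="little", signed=True))

Output:
int convert to bytes (big-endian) b'\xf24'      Note: this is b"\xf2\x34"; the byte \x34 is displayed as the ASCII character 4
int convert to bytes (little-endian) b'4\xf2\x00'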
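
The OverflowError and two's complement rules described above can be checked with a short sketch. It reuses the value 0xF234 from the example; the other names are illustrative.

```python
value = 0xF234  # 62004

# Two bytes are enough for the unsigned value...
unsigned_be = value.to_bytes(2, byteorder="big", signed=False)

# ...but not for the signed one, because 62004 exceeds the 16-bit signed maximum 32767.
try:
    value.to_bytes(2, byteorder="big", signed=True)
    signed_fits = True
except OverflowError:
    signed_fits = False

# Negative numbers require signed=True and are stored in two's complement.
minus_two = (-2).to_bytes(2, byteorder="little", signed=True)

print(unsigned_be.hex(), signed_fits, minus_two.hex())
```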

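int.from_bytes(bytes, byteorder, *, signed=False)

bytes is the byte string to convert.
byteorder is either 'big' or 'little'. For the two bytes of 0xf234 above, 'big' reads them most significant byte first (f2 34), while 'little' reads them in the reverse order (34 f2).
signed is True or False and determines whether the bytes are interpreted as a two's complement signed integer.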

b = b"\x12\x34\x45"
print("bytes convert to int (unsigned)", int.from_bytes(b, byteorder="big", signed=False))

c = b"\xff"
print("bytes convert to int (signed):", int.from_bytes(c, byteorder="little", signed=True))

Output:
bytes convert to int (unsigned) 1193029
bytes convert to int (signed): -1
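
To make the effect of byteorder and signed more concrete, here is a minimal sketch that interprets the same two bytes in three ways and then converts back. The byte values are chosen only for illustration.

```python
raw = b"\xff\xfe"

as_unsigned = int.from_bytes(raw, byteorder="big", signed=False)    # read as 0xFFFE
as_signed = int.from_bytes(raw, byteorder="big", signed=True)       # two's complement
as_little = int.from_bytes(raw, byteorder="little", signed=False)   # read as 0xFEFF

# Round trip: converting back with the same settings restores the bytes.
restored = as_signed.to_bytes(len(raw), byteorder="big", signed=True)

print(as_unsigned, as_signed, as_little, restored == raw)
```

The same bytes yield different integers depending on the settings, so both sides of a protocol need to agree on byte order and signedness.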

# Unpack with the struct module
import struct

ss = b"\xff"
print("int:", struct.unpack('<b', ss)[0])   # 'b' is a signed 8-bit integer
Output:
int: -1
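
struct is most useful when several fields share one buffer. The sketch below packs and unpacks a made-up 7-byte frame; the field layout (device id, temperature, counter) is a hypothetical example, not something from the original post.

```python
import struct

# Hypothetical frame layout, little-endian with no padding ("<"):
#   B = unsigned 8-bit device id
#   h = signed 16-bit temperature
#   I = unsigned 32-bit counter
FRAME_FORMAT = "<BhI"

frame = struct.pack(FRAME_FORMAT, 0x01, -40, 100000)
dev_id, temperature, counter = struct.unpack(FRAME_FORMAT, frame)
size = struct.calcsize(FRAME_FORMAT)

print(frame.hex(), dev_id, temperature, counter, size)
```

Compared with calling int.from_bytes on each slice, a single format string keeps the field widths and byte order in one place.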

3. Converting between bytes and list

# bytes to list
a = b"\x12\x34\x45\x56"
print("bytes convert to list:", list(a))

# list to bytes
b = [1, 2, 3, 4, 5, 6, 7, 8]
print("list convert to bytes:", bytes(b))

Output:
bytes convert to list: [18, 52, 69, 86]
list convert to bytes: b'\x01\x02\x03\x04\x05\x06\x07\x08'
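
One common reason to go through a list is that bytes itself cannot be modified in place. The following sketch, with illustrative names, edits a list copy and converts it back.

```python
payload = b"\x12\x34\x45\x56"

# bytes is immutable, so edit a list copy and convert back.
values = list(payload)
values[0] = 0xAB
patched = bytes(values)

# Formatting each element gives a readable hex view of the list.
pretty = " ".join(f"{v:02x}" for v in values)

print(patched, pretty)
```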

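4. Converting between bytes and str

# bytes to str
c = b"\x01\x02\x03\x041234"
print("bytes convert str:", c.decode("utf-8"))

# str to bytes
c = "1234"
print("str convert bytes:", c.encode("utf-8"))

Output:
bytes convert str: 1234
str convert bytes: b'1234'

The control characters \x01 to \x04 are decoded too, but they are not printable, so only 1234 is visible.

Note that when decoding bytes as UTF-8, a byte above 0x7F cannot be decoded on its own. ASCII defines only the 128 values 0 to 0x7F, and in UTF-8 larger byte values are valid only as part of a well-formed multi-byte sequence.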

# bytes to str: the final byte 0x80 is not valid UTF-8 on its own
b = [i for i in range(0x81)]
c = bytes(b)
print("bytes convert str:", c.decode("utf-8"))
Output:
Traceback (most recent call last):
  File "E:\python_test\test.py", line 40, in <module>
    print("bytes convert str:",c.decode("utf-8"))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 128: invalid start byte

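# bytes to str, tolerating undecodable bytes
b = [i for i in range(0xff)]
c = bytes(b)
# errors="replace" substitutes U+FFFD (the replacement character �) for each undecodable byte
print("bytes convert str:", c.decode("utf-8", errors="replace"))

In the output, bytes 0x00 to 0x7F decode normally and every byte from 0x80 onward appears as �.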
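
For comparison, this short sketch decodes the same invalid input with three different error handlers. The input bytes are made up for illustration.

```python
raw = b"ab\x80cd"  # 0x80 is not valid UTF-8 on its own

ignored = raw.decode("utf-8", errors="ignore")             # drops the bad byte
replaced = raw.decode("utf-8", errors="replace")           # inserts U+FFFD
escaped = raw.decode("utf-8", errors="backslashreplace")   # keeps it as the text \x80

print(ignored, replaced, escaped)
```

backslashreplace is often the most convenient choice for logging, because the original byte value stays visible.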

5. The Python encode() method

encode() is a method of the string type (str) that converts a str to bytes. This process is called encoding.

The syntax of encode() is:

str.encode([encoding="utf-8"][,errors="strict"])

Parameters shown in [] are optional and may be omitted.

Table 1 describes each parameter.

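Table 1. encode() parameters
Parameter	Meaning
str	The string to convert.
encoding = "utf-8"	The character encoding to use; UTF-8 is the default. For example, to use a Simplified Chinese encoding, pass gb2312. When this is the only argument, the "encoding=" keyword can be omitted and the encoding name passed directly, for example str.encode("UTF-8").
errors = "strict"	The error handling scheme. Possible values include: strict: raise an exception on a character that cannot be encoded. ignore: skip such characters. replace: replace them with "?". xmlcharrefreplace: replace them with XML character references. The default is strict.

Note that encode() does not modify the original string (str objects are immutable). It returns a new bytes object, so assign the result to a variable if you want to keep it.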
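
The error handlers in Table 1 can be compared with a small sketch that encodes a string containing non-ASCII characters to ASCII. The sample string is an assumption chosen for illustration.

```python
text = "C语言"

# errors="strict" is the default, so this raises UnicodeEncodeError.
try:
    text.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

ignored = text.encode("ascii", errors="ignore")
replaced = text.encode("ascii", errors="replace")
xml_refs = text.encode("ascii", errors="xmlcharrefreplace")

print(strict_ok, ignored, replaced, xml_refs)
print(text)  # the original string is unchanged
```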

[Example 1] Convert the str "C语言中文网" to bytes.

>>> s = "C语言中文网"
>>> s.encode()
b'C\xe8\xaf\xad\xe8\xa8\x80\xe4\xb8\xad\xe6\x96\x87\xe7\xbd\x91'

This uses the default UTF-8 encoding. Another encoding can be specified explicitly, for example:

>>> s = "C语言中文网"
>>> s.encode('GBK')
b'C\xd3\xef\xd1\xd4\xd6\xd0\xce\xc4\xcd\xf8'

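6. The Python decode() method

decode() is the inverse of encode(): it converts binary data of type bytes to str. This process is called decoding.

The syntax of decode() is:

bytes.decode([encoding="utf-8"][,errors="strict"])

Table 2 describes each parameter.

Table 2. decode() parameters
Parameter	Meaning
bytes	The binary data to convert.
encoding="utf-8"	The character encoding to use for decoding; UTF-8 is the default. When this is the only argument, the "encoding=" keyword can be omitted and the encoding name passed directly. The bytes must be decoded with the same encoding that was used to encode them.
errors = "strict"	The error handling scheme. Possible values include: strict: raise an exception on bytes that cannot be decoded. ignore: skip such bytes. replace: replace them with U+FFFD (�). Unlike encode(), xmlcharrefreplace is not supported when decoding. The default is strict.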

[Example 2]

>>> s = "C语言中文网"
>>> data = s.encode()
>>> data.decode()
'C语言中文网'

Note that if the string was encoded with something other than the default UTF-8, it must be decoded with that same encoding. Otherwise decoding may raise an exception, for example:

>>> s = "C语言中文网"
>>> data = s.encode("GBK")
>>> data.decode() # UTF-8 is used by default, which raises the exception below
Traceback (most recent call last):
File "<pyshell#10>", line 1, in <module>
data.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd3 in position 1: invalid continuation byte
>>> data.decode("GBK")
'C语言中文网'
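
When the encoding of incoming bytes is not known in advance, one possible approach is to try a short list of candidates in order. The helper below, decode_with_fallback, is a hypothetical name and only a sketch; guessing an encoding this way is a heuristic, since a wrong encoding can sometimes decode without raising.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "gbk")) -> str:
    """Try each candidate encoding in order (hypothetical helper)."""
    for enc in encodings:
        try:
            return data.decode(enc)
        except UnicodeDecodeError:
            continue
    # Nothing matched: decode with the first candidate, marking bad bytes as U+FFFD.
    return data.decode(encodings[0], errors="replace")


from_gbk = decode_with_fallback("C语言中文网".encode("gbk"))
from_utf8 = decode_with_fallback("C语言中文网".encode("utf-8"))
print(from_gbk, from_utf8)
```

UTF-8 is tried first because it is strict about byte sequences, so GBK data is very likely to be rejected rather than silently misread.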

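7. Converting between bytes and hex strings

This section converts between the raw data in a bytes object and its hexadecimal string form. For example, b"\x01\x02\x03" becomes the hex string "010203". The conversion uses the bytes.hex() method and the bytes.fromhex() class method.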

>>> a=bytes([1,2,3,4,5,6])
>>> print("1:",a)
1: b'\x01\x02\x03\x04\x05\x06'
>>> b=a.hex()
>>> print("2:",b)
2: 010203040506
>>> c=bytes.fromhex(b)
>>> print("3:",c)
3: b'\x01\x02\x03\x04\x05\x06'
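
A few further details of hex() and fromhex() are sketched below, assuming Python 3.8 or later for the separator argument of hex(). The sample values are illustrative.

```python
data = bytes([1, 2, 3, 255])

compact = data.hex()       # no separator
spaced = data.hex(" ")     # Python 3.8+: one separator between bytes

# fromhex() skips ASCII whitespace and accepts upper- or lowercase digits.
restored = bytes.fromhex("01 02 03 FF")

# Non-hex characters raise ValueError.
try:
    bytes.fromhex("0g")
    bad_input_ok = True
except ValueError:
    bad_input_ok = False

print(compact, spaced, restored == data, bad_input_ok)
```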