Encoding: decode()/encode(), quote()/unquote()

Latest recommended article published on 2021-11-30 08:10:23

草尖上的舞动

Original link: https://www.jianshu.com/p/3ebc3718c5a4

Encoding types:

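ascii: uses one byte per character and covers only 128 characters (code points 0-127): upper- and lower-case English letters, digits, punctuation, and control characters
unicode: a character set that assigns a code point to every character and so covers the world's languages; how many bytes a character takes depends on the encoding (UTF-16 uses 2 or 4 bytes, UTF-32 always uses 4)
utf-8: a variable-length encoding of Unicode that uses 1-4 bytes per character, for example 1 byte for an English letter and usually 3 bytes for a Chinese character, which saves space for mostly-ASCII text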
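
The byte counts in the list above can be checked directly. The following is a minimal sketch: the sample characters are illustrative, and the little-endian codecs utf-16-le and utf-32-le are an assumption made here so that no byte-order mark is added to the count.

```python
# Sample characters chosen for illustration only:
# an English letter, a Chinese character, and an emoji outside the BMP.
samples = ["A", "中", "😀"]

sizes = {}
for ch in samples:
    # Encode the same character three ways and record how many bytes each takes.
    sizes[ch] = {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16-le": len(ch.encode("utf-16-le")),
        "utf-32-le": len(ch.encode("utf-32-le")),
    }
    print(repr(ch), sizes[ch])
```

UTF-8 is the only one of the three that stores an English letter in a single byte, which is why it is the more compact choice for mostly-ASCII text.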

String types:

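str: a text string, that is, a sequence of Unicode characters
bytes: a sequence of bytes. Only ASCII characters are allowed in a bytes literal (regardless of the declared source-code encoding). Any binary value above 127 must be entered into the bytes literal using the corresponding escape sequence.
len() counts characters when given a str; when given a bytes object it counts bytes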
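
A short sketch of the difference between the two types follows. The sample string and the variable names are illustrative assumptions, not taken from the original notes.

```python
text = "中文abc"                 # illustrative sample: 2 Chinese characters + 3 letters
data = text.encode("utf-8")      # the same text as a bytes object

char_count = len(text)           # len() on str counts characters
byte_count = len(data)           # len() on bytes counts bytes

# A byte value above 127 must be written as an escape in a bytes literal.
literal = b"\xe4\xb8\xad"        # the three UTF-8 bytes of "中"
decoded = literal.decode("utf-8")

print(char_count, byte_count, decoded)
```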

decode() and encode(): converting between string encodings

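decode(): decoding, converts bytes in some encoding (UTF-8, GBK, and so on) into a str (Unicode text)
encode(): encoding, converts a str (Unicode text) into bytes in the specified encoding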

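Python represents strings internally as Unicode, so converting between encodings normally goes through str as the intermediate form: first decode the bytes from the source encoding into a str, then encode that str into the target encoding.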
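
The decode-then-encode route described above can be sketched as follows. The helper name transcode and the choice of GBK as the source encoding are hypothetical, picked only for illustration.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from `source` into str, then encode that str as `target`."""
    text = data.decode(source)   # bytes in the source encoding -> str
    return text.encode(target)   # str -> bytes in the target encoding


# Assumed starting point: the text "中文" stored as GBK bytes.
gbk_bytes = "中文".encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")

print(gbk_bytes)
print(utf8_bytes)
```

The two byte strings differ, but both decode to the same str once the matching encoding name is given.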

eg1: encoding a string into a bytes object

_str = ''' 
中文'''
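_bytes = _str.encode()  # with no argument, the encoding defaults to UTF-8
print(_bytes)
Result: b' \n\xe4\xb8\xad\xe6\x96\x87' --- the line break is encoded as \n, and each of the two Chinese characters becomes three UTF-8 bytes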

eg2: decoding a bytes object into a string

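_bytes = b'\t\nabc'
_str = _bytes.decode()  # with no argument, the encoding defaults to UTF-8
print(_str)

Result: a tab, a line break, and then abc. The \t and \n escapes in the bytes literal decode to the tab and newline characters, which print as whitespace.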
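
Decoding only works when the encoding name matches the bytes. The sketch below shows what happens otherwise; using the ascii codec on UTF-8 bytes is an assumption made here to trigger the failure, and the errors argument is the standard parameter of bytes.decode().

```python
data = "中文".encode("utf-8")    # six bytes, all above 127

# Decoding with the wrong codec raises UnicodeDecodeError.
try:
    data.decode("ascii")
    failed = False
except UnicodeDecodeError:
    failed = True

# errors="replace" substitutes U+FFFD for every byte that cannot be decoded;
# errors="ignore" drops those bytes instead.
replaced = data.decode("ascii", errors="replace")
ignored = data.decode("ascii", errors="ignore")

print(failed, repr(replaced), repr(ignored))
```

Both error modes lose information, so they suit logging or display rather than round-tripping data.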

quote() and unquote(): encode and decode URLs.

Import: from urllib import parse
Purpose: percent-encode a single string into the %xx%xx form, and decode it back

percent-encoded str = parse.quote(str or bytes)
decoded str = parse.unquote(percent-encoded str)

from urllib.parse import quote
from urllib.parse import unquote

# quote() percent-encodes the UTF-8 bytes of the non-ASCII text
str1 = 'https://www.amazon.com/s?ref=nb&k=%s' % quote("你好")
print(str1)  # https://www.amazon.com/s?ref=nb&k=%E4%BD%A0%E5%A5%BD

# unquote() turns the %xx sequences back into the original text
str2 = 'https://www.amazon.com/s?ref=nb&k=%s' % unquote("%E4%BD%A0%E5%A5%BD")
print(str2)  # https://www.amazon.com/s?ref=nb&k=你好
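
A few details of quote() are easy to miss: it returns a str rather than bytes, it leaves "/" unescaped by default (controlled by the safe parameter), and quote_plus() is the variant that turns spaces into "+" for query strings. A minimal sketch, with sample inputs chosen only for illustration:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

path = "a b/中"                              # illustrative input

default_quoted = quote(path)                 # "/" is kept because safe defaults to "/"
strict_quoted = quote(path, safe="")         # safe="" also escapes "/"
plus_quoted = quote_plus("你好 world")        # spaces become "+"
from_bytes = quote(b"\xe4\xbd\xa0")          # bytes input is accepted too

print(default_quoted)                        # a%20b/%E4%B8%AD
print(strict_quoted)                         # a%20b%2F%E4%B8%AD
print(plus_quoted)                           # %E4%BD%A0%E5%A5%BD+world
print(from_bytes)                            # %E4%BD%A0

# Round trip back to the original text.
restored = unquote(strict_quoted)
restored_plus = unquote_plus(plus_quoted)
```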