Python day2 part1:字符串编码

最新推荐文章于 2022-02-28 19:44:18 发布

原创最新推荐文章于 2022-02-28 19:44:18 发布 · 215 阅读

0 ·

CC 4.0 BY-SA版权

Python 专栏收录该内容

26 篇文章

订阅专栏

本文介绍了计算机中常用的字符串编码方式，包括ASCII、Unicode及UTF-8，并详细讲解了Python中字符串的编码转换方法，同时提供了字符串格式化的两种实用方式。

字符串编码方式主要有两种:ASCII码、Unicode码和UTF-8码

Unicode和UTF-8码支持汉字

UTF-8码是可变长的,UTF-8中英语字符通常是1个字节,而汉字通常为3个字节,比Unicode更加节约

在计算机中

内存、记事本里面使用Unicode编码,在文件或者传输途中使用UTF-8编码

在网页中

服务器里面是Unicode,而网页上提供给浏览
器的是UTF-8

1.字符串相关函数

a. 字符转化

ord变汉字或英语字符为整数表示

chr将整数表示换回字符

print(ord('A'))
print(ord('字'))
print(chr(65))
print(chr(25991))

结果:

b. 字符串转化

encode()可以将字符串变为指定形式的编码,decode()反之

字符串与encode()和decode()之间使用.连接

()中间可以使用大写也可以使用小写,ascii与ASCII、utf-8与UTF-8都行

print('hello world'.encode('ascii'))
print('北京航空航天大学'.encode('utf-8'))
print('北京航空航天大学'.encode('UTF-8'))
print(b'hello world'.decode())
print(b'\xe5\x8c\x97\xe4\xba\xac\xe8\x88\xaa\xe7\xa9\xba\xe8\x88\xaa\xe5\xa4\xa9\xe5\xa4\xa7\xe5\xad\xa6'.decode())

b'hello world'
b'\xe5\x8c\x97\xe4\xba\xac\xe8\x88\xaa\xe7\xa9\xba\xe8\x88\xaa\xe5\xa4\xa9\xe5\xa4\xa7\xe5\xad\xa6'
b'\xe5\x8c\x97\xe4\xba\xac\xe8\x88\xaa\xe7\xa9\xba\xe8\x88\xaa\xe5\xa4\xa9\xe5\xa4\xa7\xe5\xad\xa6'
hello world
北京航空航天大学

2.格式化

a.旧式方法使用%连接

print('hello %s your final score is %lf'%('li ming',3.5))

C:\pythontest>python 1.py
hello li ming your final score is 3.500000

b.使用format()

format()将会依次替换{0} {1}所代表的值

print('hello {0} your final score is {1:.1f}'.format('li ming',3.5))

格式控制较为特殊,比较麻烦,但也比较强大

'{} {} {}'.format('a', 'b', 'c')