String formatting
Key points:
- format
- bytes, bytearray
- be able to modify a bytearray
- slicing: you must know this!
- enumerate: know it thoroughly
1. printf-style string formatting
In [1]: s = 'i love %s'
In [2]: s % 'python'
Out[2]: 'i love python'
In [3]: s
Out[3]: 'i love %s'
In [4]: s % str(['meizil', 'fanbingbing'])
Out[4]: "i love ['meizil', 'fanbingbing']"
In [5]: 'i love %s, %s' % 'chenyihan', 'fanbingbing'
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-5-f814fabc6bf5> in <module>()
----> 1 'i love %s, %s' % 'chenyihan', 'fanbingbing'
TypeError: not enough arguments for format string
In [6]: 'i love %s, %s' % ('chenyihan', 'fanbingbing')
Out[6]: 'i love chenyihan, fanbingbing'
In [7]: 'i love %s, %s' % ('chenyihan', '99')
Out[7]: 'i love chenyihan, 99'
In [8]: 'i love %s, %d' % ('chenyihan', '99')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-5c3fb30ff6f9> in <module>()
----> 1 'i love %s, %d' % ('chenyihan', '99')
TypeError: %d format: a number is required, not str
In [9]: 'i love %s, %d' % ('chenyihan', 99)
Out[9]: 'i love chenyihan, 99'
In [10]: 'i love %s' % {'k' : 'v'}
Out[10]: "i love {'k': 'v'}"
- %s formats a value as a string by implicitly calling str() on it; %r does the same with repr()
- %d formats a number; if the value is not a number, a TypeError is raised
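The difference between the conversions can be sketched as follows. The variable names are illustrative only; the snippet also shows the one-element tuple needed when the value being formatted is itself a tuple.

```python
# %s calls str() on the value, %r calls repr(), %d requires a number.
name = 'python'
s_form = 'i love %s' % name      # str() leaves the text unchanged
r_form = 'i love %r' % name      # repr() adds quotes around a str
d_form = 'count: %d' % 99

# A tuple on the right-hand side is unpacked into the placeholders,
# so a tuple that should be formatted as ONE value must be wrapped again.
pair = ('a', 'b')
wrapped = 'value: %s' % (pair,)

# %d with a str raises TypeError, as in the session above.
try:
    'count: %d' % '99'
    d_error = None
except TypeError as exc:
    d_error = type(exc).__name__

print(s_form)
print(r_form)
print(d_form)
print(wrapped)
print(d_error)
```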
2. The format method
In [11]: s = 'i love {}'
In [12]: s.format('chenyihan')
Out[12]: 'i love chenyihan'
The format method uses curly braces as placeholders.
In [13]: s = 'i love {}, and {}'
In [14]: s.format('chenyihan', 'linzhiling')
Out[14]: 'i love chenyihan, and linzhiling'
In [15]: s = 'i love {1}, and {0}' # positional arguments
In [16]: s.format('chenyihan', 'linzhiling')
Out[16]: 'i love linzhiling, and chenyihan'
In [17]: 'i love {0} {0}',.format('陈意涵')
File "<ipython-input-17-d462f6dbd363>", line 1
'i love {0} {0}',.format('陈意涵')
^
SyntaxError: invalid syntax
In [18]: 'i love {0} {0}'.format('陈意涵')
Out[18]: 'i love 陈意涵 陈意涵'
In [19]: 'i love {0} {1}'.format('陈意涵')
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-19-95f7c46e3cd4> in <module>()
----> 1 'i love {0} {1}'.format('陈意涵')
IndexError: tuple index out of range
In [20]: 'i love {girl} {wife}'.format(girl='陈意涵', wife='静静') # keyword arguments
Out[20]: 'i love 陈意涵 静静'
In [21]: 'i love {girl} {wife}'.format(girl='静静', wife='陈意涵')
Out[21]: 'i love 静静 陈意涵'
In [22]: 'i love {girl} {girl}'.format(girl='陈意涵')
Out[22]: 'i love 陈意涵 陈意涵'
In [23]: 'i love {girl} {women'}.format(girl='陈意涵')
File "<ipython-input-23-bf939a5a80a6>", line 1
'i love {girl} {women'}.format(girl='陈意涵')
^
SyntaxError: invalid syntax
Positional arguments must come before keyword arguments.
In [24]: 'i love {girl} {0}'.format('linzhiling', girl='陈意涵')
Out[24]: 'i love 陈意涵 linzhiling'
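- {} uses the positional arguments in order
- {i} (a number) treats the positional arguments as a tuple args and uses args[i]; when i is not a valid index of args, IndexError is raised
- {k} (a keyword) treats the keyword arguments as a dict kwargs and uses kwargs[k]; when k is not a key of kwargs, KeyError is raised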
In [25]: '{1}, {2}'.format(0, 1, 2)
Out[25]: '1, 2'
In [26]: args = [0, 1, 2]
In [27]: args[1]
Out[27]: 1
In [28]: args[2]
Out[28]: 2
In [29]: '{1}, {3}'.format(0, 1, 2)
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-29-b9658b5525cd> in <module>()
----> 1 '{1}, {3}'.format(0, 1, 2)
IndexError: tuple index out of range
In [30]: '{1}, {3}'.format(0, 1, 2, 3)
Out[30]: '1, 3'
In [36]: '{name}, {age}'.format(name='刘柳江', age=18)
Out[36]: '刘柳江, 18'
In [38]: kwargs = {'name' : '刘柳江', 'age' : 18}
In [39]: kwargs['name']
Out[39]: '刘柳江'
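The session above shows the IndexError case but not the KeyError case. A minimal sketch covering both is given here; try_format is a hypothetical helper written only for this illustration, not part of Python.

```python
args = ('chenyihan', 'linzhiling')
kwargs = {'name': 'liu', 'age': 18}

by_order = '{} and {}'.format(*args)        # automatic numbering
by_index = '{1} and {0}'.format(*args)      # explicit index into args
by_key = '{name}, {age}'.format(**kwargs)   # lookup in kwargs


def try_format(template, *a, **kw):
    """Return the formatted string, or the name of the lookup error raised."""
    try:
        return template.format(*a, **kw)
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


missing_index = try_format('{0} {1}', 'only one')    # args[1] does not exist
missing_key = try_format('{girl} {women}', girl='x')  # kwargs['women'] does not exist

print(by_order, by_index, by_key, sep='\n')
print(missing_index, missing_key)
```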
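Strings and bytes
Definitions:
- a string (str) is a sequence of text characters
- bytes is a sequence of bytes
Differences:
- text has to be encoded (utf8, gbk, GB18030) before it can be stored or sent as bytes
- bytes carry no encoding of their own
- a text encoding defines how characters are represented as bytes
- single-byte encodings, e.g. ASCII
- multi-byte encodings, e.g. utf8
What is encoding? It is the process of converting information from one form or format into another; here, from characters to bytes.
In Python 3, str.encode() and bytes.decode() use utf8 by default. utf8 is a variable-width encoding that uses 1 to 4 bytes per character; common Chinese characters such as the ones below take 3 bytes each.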
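To make the variable width of utf8 concrete, the following sketch counts the encoded bytes of a few characters. The sample characters are chosen here only for illustration.

```python
# One character from each utf8 width class (written as escapes to be unambiguous).
samples = ['a', '\u00e9', '\u9a6c', '\U0001F600']   # a, e-acute, 马, an emoji

# Number of bytes each character occupies once encoded as utf8.
widths = {ch: len(ch.encode('utf-8')) for ch in samples}

text = '马哥教育'
char_count = len(text)                    # counts characters
byte_count = len(text.encode('utf-8'))    # counts bytes

for ch, width in widths.items():
    print(repr(ch), width)
print(char_count, byte_count)
```

The length of a str counts characters, while the length of the encoded bytes counts bytes, so the two numbers differ for any non-ASCII text.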
In [1]: s = '马哥教育'
In [2]: type(s)
Out[2]: str
In [3]: s.encode()
Out[3]: b'\xe9\xa9\xac\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'
In [4]: 0xe9
Out[4]: 233
In [5]: bin(233)
Out[5]: '0b11101001'
In [6]: bin(0xa9)
Out[6]: '0b10101001'
In [7]: bin(0xac)
Out[7]: '0b10101100'
In [8]: b'\xe9\xa9\xac'.decode()
Out[8]: '马'
In [9]: b'\xe9\xa9\xac\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'.decode()
Out[9]: '马哥教育'
In [10]: help(str.encode)
Help on method_descriptor:
encode(...)
S.encode(encoding='utf-8', errors='strict') -> bytes
Encode S using the codec registered for encoding. Default encoding
is 'utf-8'. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
'xmlcharrefreplace' as well as any other name registered with
codecs.register_error that can handle UnicodeEncodeErrors.
In [11]: '马'.encode()
Out[11]: b'\xe9\xa9\xac'
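The encoding and errors parameters described in the help text can be tried out as below. This is a small sketch; gbk and ascii are used only as examples of other codecs.

```python
text = '马'

utf8_bytes = text.encode('utf-8')   # 3 bytes, as shown above
gbk_bytes = text.encode('gbk')      # the same character in a different encoding

# Decoding must use the encoding that produced the bytes.
round_trip = gbk_bytes.decode('gbk')

# With the default errors='strict', a codec that cannot handle the data raises.
try:
    utf8_bytes.decode('ascii')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = type(exc).__name__

# Other error handlers substitute or drop what cannot be encoded.
replaced = text.encode('ascii', errors='replace')
ignored = text.encode('ascii', errors='ignore')

print(utf8_bytes, gbk_bytes, round_trip)
print(decode_error, replaced, ignored)
```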
1. Operations on bytes
In [12]: b = b'abc'
In [13]: type(b)
Out[13]: bytes
In [14]: b'abc'.find(b'b')
Out[14]: 1
In [15]: '马哥教育'.encode()
Out[15]: b'\xe9\xa9\xac\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'
In [16]: '马哥教育'.encode().find(b'\xa5')
Out[16]: 5
In [17]: b.hex()
Out[17]: '616263'
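A few more bytes operations, as a sketch. Note that indexing a bytes object gives an int, slicing gives bytes, and find works on byte offsets rather than character offsets.

```python
b = b'abc'

first = b[0]                          # indexing returns an int
head = b[0:1]                         # slicing returns bytes
from_hex = bytes.fromhex('616263')    # inverse of b.hex()
from_ints = bytes([97, 98, 99])       # build bytes from integers in range(0, 256)

text = '马哥教育'
char_offset = text.find('哥')                            # offset in characters
byte_offset = text.encode().find('哥'.encode())          # offset in bytes

print(first, head, from_hex, from_ints)
print(char_offset, byte_offset)
```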
2. bytearray
bytearray is the mutable counterpart of bytes.
str and bytes are both immutable.
In [18]: s = 'abc'
In [19]: s[0]
Out[19]: 'a'
In [20]: s[0] = 'A'
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-20-cc37915722b8> in <module>()
----> 1 s[0] = 'A'
TypeError: 'str' object does not support item assignment
In [21]: b = b'abc'
In [22]: b[0]
Out[22]: 97
In [23]: b[0] = b'A'
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-23-9132e616117d> in <module>()
----> 1 b[0] = b'A'
TypeError: 'bytes' object does not support item assignment
In [24]: b
Out[24]: b'abc'
In [25]: b = bytearray(b)
In [26]: b
Out[26]: bytearray(b'abc')
In [27]: b[0]
Out[27]: 97
In [28]: b[0] = b'A'
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-28-9132e616117d> in <module>()
----> 1 b[0] = b'A'
TypeError: an integer is required
In [29]: b[0] = int(b'A')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-29-d2cb8449404d> in <module>()
----> 1 b[0] = int(b'A')
ValueError: invalid literal for int() with base 10: b'A'
In [30]: b[0] = int(b'A'.hex(), 16)
In [31]: b
Out[31]: bytearray(b'Abc')
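Going through int(b'A'.hex(), 16) works, but there are shorter ways to get the integer value of a byte. The sketch below uses ord() and bytes indexing, plus slice assignment; it is one possible approach rather than the only one.

```python
original = b'abc'
buf = bytearray(original)   # mutable copy; the bytes object is untouched

buf[0] = ord('A')           # ord() gives the integer 65
buf[1] = b'B'[0]            # indexing bytes also gives an integer
buf[2:3] = b'CD'            # slice assignment takes bytes and may change the length

print(buf)
print(original)
```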
bytearray methods
In [32]: b
Out[32]: bytearray(b'Abc')
In [33]: b.append(b'b')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-34bc8d98d616> in <module>()
----> 1 b.append(b'b')
TypeError: an integer is required
In [34]: b.append(int(b'd'.hex(), 16)) # convert b'd' to an integer via its hex string
In [35]: b
Out[35]: bytearray(b'Abcd')
In [36]: b.append(97)
In [37]: b
Out[37]: bytearray(b'Abcda')
In [38]: b.append(10000000000000000000000)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-38-885ab3f1d9d4> in <module>()
----> 1 b.append(10000000000000000000000)
ValueError: byte must be in range(0, 256)
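Beyond append, bytearray supports the usual list-like mutating methods. A short sketch, assuming single bytes are passed as integers and multi-byte data as bytes:

```python
buf = bytearray(b'Abc')

buf.append(ord('d'))        # one byte, given as an int
buf.extend(b'ef')           # several bytes at once
buf.insert(0, ord('>'))     # insert one byte at an index
last = buf.pop()            # remove and return the last byte (an int)
del buf[0]                  # delete by index

# Values outside range(0, 256) are rejected.
try:
    buf.append(256)
    range_error = None
except ValueError as exc:
    range_error = type(exc).__name__

print(buf, last, range_error)
```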
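Linear structures and slicing
Which types are linear structures?
- str
- list
- tuple
- bytes
- bytearray
Common features:
- iterable
- len() returns the length
- elements can be accessed by index
- can be sliced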
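The four features listed above can be checked against all five types in one loop. This is only a quick sketch; the sample values are arbitrary.

```python
samples = ['abc', ['a', 'b', 'c'], ('a', 'b', 'c'), b'abc', bytearray(b'abc')]

report = []
for seq in samples:
    items = [x for x in seq]     # iterable
    length = len(seq)            # len() works
    first = seq[0]               # index access
    head = seq[:2]               # slicing, which returns the same type
    report.append((type(seq).__name__, length, first, type(head) is type(seq)))

for row in report:
    print(row)
```

For bytes and bytearray the indexed element is an int, while for str it is a one-character str.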
enumerate
In [45]: items = ['a', 'b', 'c']
In [46]: for i, item in enumerate(items):
    ...:     if item == 'a' and i == 0:
    ...:         print(item)
    ...:
a
# A simplified model of enumerate: it builds a list, whereas the built-in is lazy
def _enumerator(iterator):
    ret = []
    i = 0
    for v in iterator:
        ret.append((i, v))
        i += 1
    return ret
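The built-in enumerate does not build a list up front; it yields pairs one at a time and accepts a start value. A closer approximation can be written as a generator. lazy_enumerate is a hypothetical name used only for this sketch.

```python
def lazy_enumerate(iterable, start=0):
    """Yield (index, value) pairs one at a time, like the built-in enumerate."""
    i = start
    for v in iterable:
        yield i, v
        i += 1


pairs = list(lazy_enumerate(['a', 'b', 'c']))
from_one = list(lazy_enumerate('ab', start=1))

print(pairs)
print(from_one)
```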
Slicing
In [47]: lst = list(range(10))
In [48]: lst
Out[48]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
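In [49]: lst[0:2] # half-open interval: start is included, stop is excluded
Out[49]: [0, 1]
In [50]: lst[0:-1] # start defaults to 0; a stop of -1 means len(lst) - 1, so the last element is excluded
Out[50]: [0, 1, 2, 3, 4, 5, 6, 7, 8]
In [51]: lst[:100] # an out-of-range stop is not an error; stop is treated as len(lst)
Out[51]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [52]: lst[-100:] # an out-of-range start is not an error; start is treated as 0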
Out[52]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
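Rules:
- A negative index is equivalent to len(lst) + index
- start can be omitted when it is 0, and stop can be omitted when it is len(lst) (not -1, which would drop the last element)
- With a positive step, the result is an empty list when start >= stop (after negative indices are converted)
- An out-of-range start or stop does not raise an error; it is clamped, so start becomes 0 and stop becomes len(lst)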
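The way Python normalizes slice bounds can be inspected with slice.indices, which returns the (start, stop, step) actually used for a sequence of a given length. A small sketch:

```python
lst = list(range(10))
n = len(lst)

# slice.indices(n) shows the bounds after defaults, negatives and clamping.
clamped_stop = slice(None, 100).indices(n)     # stop clamped to len(lst)
clamped_start = slice(-100, None).indices(n)   # start clamped to 0
negative_stop = slice(0, -1).indices(n)        # -1 becomes len(lst) - 1

last_three = lst[-3:]     # same as lst[n - 3:]
empty = lst[5:2]          # start >= stop with a positive step
full_copy = lst[:]        # both bounds omitted

print(clamped_stop, clamped_start, negative_stop)
print(last_three, empty, full_copy == lst)
```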
lst[start:stop:step]
In [53]: lst
Out[53]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [54]: lst[1:9:2] # take every step-th element, beginning at start
Out[54]: [1, 3, 5, 7]
In [55]: lst[-1:-9:-2]
Out[55]: [9, 7, 5, 3]
When step is negative, start must be greater than stop (otherwise the result is empty), and elements are taken from right to left.
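A few negative-step slices, as a sketch of the rule above. With a negative step and both bounds omitted, the whole sequence is walked from the last element to the first.

```python
lst = list(range(10))

reversed_copy = lst[::-1]      # omitted bounds with step -1 reverse the list
every_other = lst[-1:-9:-2]    # from the last element, moving left by 2
wrong_order = lst[1:9:-1]      # start < stop with a negative step gives nothing
right_order = lst[8:0:-2]      # start > stop; stop (index 0) is excluded

print(reversed_copy)
print(every_other, wrong_order, right_order)
```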
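Homework
- Find all primes below 1,000,000 using a faster method
- Compute the value at row n, column k of Pascal's triangle
- Transpose a matrix, using list or tuple
All homework should be solved using only what has been covered so far.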