Python Built-in Data Structures, Part 3

Latest recommended article published on 2024-10-05 08:57:27

weixin_34080903

Tags: python, data structures and algorithms

Original link: https://my.oschina.net/u/3552459/blog/1558800

String Formatting

Key points:

format
bytes, bytearray
Know how to modify a bytearray
Slicing is a must-know
Be fluent with enumerate

1. printf-style string formatting

In [1]: s = 'i love %s'

In [2]: s % 'python'
Out[2]: 'i love python'

In [3]: s
Out[3]: 'i love %s'

In [4]: s % str(['meizil', 'fanbingbing'])
Out[4]: "i love ['meizil', 'fanbingbing']"

In [5]: 'i love %s, %s' % 'chenyihan', 'fanbingbing'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-f814fabc6bf5> in <module>()
----> 1 'i love %s, %s' % 'chenyihan', 'fanbingbing'

TypeError: not enough arguments for format string

In [6]: 'i love %s, %s' % ('chenyihan', 'fanbingbing')
Out[6]: 'i love chenyihan, fanbingbing'

In [7]: 'i love %s, %s' % ('chenyihan', '99')
Out[7]: 'i love chenyihan, 99'

In [8]: 'i love %s, %d' % ('chenyihan', '99')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-5c3fb30ff6f9> in <module>()
----> 1 'i love %s, %d' % ('chenyihan', '99')

TypeError: %d format: a number is required, not str

In [9]: 'i love %s, %d' % ('chenyihan', 99)
Out[9]: 'i love chenyihan, 99'

In [10]: 'i love %s' % {'k' : 'v'}
Out[10]: "i love {'k': 'v'}"

%s and %r format a value as a string: %s implicitly calls str() on the value, %r calls repr()
%d formats a number; if the value is not a number, a TypeError is raised
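The difference between the %s, %r and %d directives is easiest to see side by side. The following is a minimal sketch; the variable names are illustrative only.

```python
# %s calls str() on the value, %r calls repr(); the difference shows with strings.
name = 'python'
with_s = 'i love %s' % name      # uses str(name)
with_r = 'i love %r' % name      # uses repr(name), so the quotes are kept
print(with_s)   # i love python
print(with_r)   # i love 'python'

# %d needs a number; passing a str raises TypeError.
try:
    'age: %d' % '99'
    raised = False
except TypeError:
    raised = True
print(raised)   # True

# A tuple on the right-hand side is unpacked into the placeholders, so wrap it
# in a one-element tuple when the tuple itself should be formatted.
pair = ('a', 'b')
as_whole = 'pair: %s' % (pair,)
print(as_whole)  # pair: ('a', 'b')
```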

2. format

In [11]: s = 'i love {}'

In [12]: s.format('chenyihan')
Out[12]: 'i love chenyihan'

The format method uses curly braces as placeholders.

In [13]: s = 'i love {}, and {}'

In [14]: s.format('chenyihan', 'linzhiling')
Out[14]: 'i love chenyihan, and linzhiling'

In [15]: s = 'i love {1}, and {0}'  # placeholders refer to positional arguments by index

In [16]: s.format('chenyihan', 'linzhiling')
Out[16]: 'i love linzhiling, and chenyihan'

In [17]: 'i love {0} {0}',.format('陈意涵')
  File "<ipython-input-17-d462f6dbd363>", line 1
    'i love {0} {0}',.format('陈意涵')
                     ^
SyntaxError: invalid syntax


In [18]: 'i love {0} {0}'.format('陈意涵')
Out[18]: 'i love 陈意涵 陈意涵'

In [19]: 'i love {0} {1}'.format('陈意涵')
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-19-95f7c46e3cd4> in <module>()
----> 1 'i love {0} {1}'.format('陈意涵')

IndexError: tuple index out of range

In [20]: 'i love {girl} {wife}'.format(girl='陈意涵', wife='静静')   # keyword arguments
Out[20]: 'i love 陈意涵 静静'

In [21]: 'i love {girl} {wife}'.format(girl='静静', wife='陈意涵')
Out[21]: 'i love 静静 陈意涵'

In [22]: 'i love {girl} {girl}'.format(girl='陈意涵')
Out[22]: 'i love 陈意涵 陈意涵'

In [23]: 'i love {girl} {women'}.format(girl='陈意涵')
  File "<ipython-input-23-bf939a5a80a6>", line 1
    'i love {girl} {women'}.format(girl='陈意涵')
                          ^
SyntaxError: invalid syntax


In the call to format, positional arguments must come first and keyword arguments after them.

In [24]: 'i love {girl} {0}'.format('linzhiling', girl='陈意涵')
Out[24]: 'i love 陈意涵 linzhiling'

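{} consumes the positional arguments in order
{i}, where i is a number, treats the positional arguments as a sequence args and uses args[i]; if i is not a valid index of args, IndexError is raised
{k}, where k is a keyword, treats the keyword arguments as a dict kwargs and uses kwargs[k]; if k is not a key of kwargs, KeyError is raised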

In [25]: '{1}, {2}'.format(0, 1, 2)
Out[25]: '1, 2'

In [26]: args = [0, 1, 2]

In [27]: args[1]
Out[27]: 1

In [28]: args[2]
Out[28]: 2

In [29]: '{1}, {3}'.format(0, 1, 2)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-29-b9658b5525cd> in <module>()
----> 1 '{1}, {3}'.format(0, 1, 2)

IndexError: tuple index out of range

In [30]: '{1}, {3}'.format(0, 1, 2, 3)
Out[30]: '1, 3'

In [36]: '{name}, {age}'.format(name='刘柳江', age=18)
Out[36]: '刘柳江, 18'

In [38]: kwargs = {'name' : '刘柳江', 'age' : 18}

In [39]: kwargs['name']
Out[39]: '刘柳江'
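The args and kwargs analogy can be tried directly by unpacking a list and a dict into format. This is a small sketch; try_format is a hypothetical helper written for this illustration, not part of the standard library.

```python
args = [0, 1, 2]
kwargs = {'name': 'liu', 'age': 18}

# Unpacking passes the list as positional and the dict as keyword arguments.
by_index = '{1}, {2}'.format(*args)
by_key = '{name}, {age}'.format(**kwargs)
print(by_index)  # 1, 2
print(by_key)    # liu, 18


def try_format(template, *a, **kw):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return template.format(*a, **kw)
    except (IndexError, KeyError) as exc:
        return type(exc).__name__


missing_index = try_format('{0} {1}', 'x')              # index 1 does not exist
missing_key = try_format('{girl} {women}', girl='x')    # key 'women' does not exist
print(missing_index)  # IndexError
print(missing_key)    # KeyError
```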

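Strings and bytes

Definitions:

str is a sequence of text (characters)
bytes is a sequence of bytes

Differences:

Text is stored or transmitted using an encoding (utf8, gbk, GB18030)
Bytes have no encoding of their own
A text encoding defines how characters are represented as bytes
Single-byte encoding: ASCII
Multi-byte encoding: utf8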

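What is encoding? In general, it is the process of converting information from one form or format into another. Here it means converting characters into bytes (encode); the reverse, converting bytes back into characters, is decoding (decode).

In Python 3, str.encode() and bytes.decode() use utf8 by default. utf8 is a variable-width encoding that uses 1 to 4 bytes per character; the Chinese characters in the examples below take 3 bytes each, which is why each one appears as three \xNN hexadecimal escapes.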
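To see how many bytes a character occupies in utf8, the encoded length can be checked for a few sample characters. This is a minimal sketch; the sample characters and variable names are chosen only for illustration.

```python
# UTF-8 byte length for characters from different ranges.
widths = {ch: len(ch.encode('utf-8')) for ch in ['a', 'é', '马', '😀']}
for ch, width in widths.items():
    print(repr(ch), width, ch.encode('utf-8'))

# The same text produces different bytes under different encodings.
text = '马哥教育'
utf8_bytes = text.encode('utf-8')
gbk_bytes = text.encode('gbk')
print(len(utf8_bytes), len(gbk_bytes))  # 12 8

# Decoding with the matching codec restores the original text.
round_trip_ok = utf8_bytes.decode('utf-8') == gbk_bytes.decode('gbk') == text
print(round_trip_ok)  # True
```

Decoding with a codec other than the one used for encoding either raises UnicodeDecodeError or produces garbled text, so the encoding has to be known whenever bytes are turned back into str.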

In [1]: s = '马哥教育'

In [2]: type(s)
Out[2]: str

In [3]: s.encode()
Out[3]: b'\xe9\xa9\xac\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'

In [4]: 0xe9
Out[4]: 233

In [5]: bin(233)
Out[5]: '0b11101001'

In [6]: bin(0xa9)
Out[6]: '0b10101001'

In [7]: bin(0xac)
Out[7]: '0b10101100'

In [8]: b'\xe9\xa9\xac'.decode()
Out[8]: '马'

In [9]: b'\xe9\xa9\xac\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'.decode()
Out[9]: '马哥教育'

In [10]: help(str.encode)
Help on method_descriptor:

encode(...)
    S.encode(encoding='utf-8', errors='strict') -> bytes

    Encode S using the codec registered for encoding. Default encoding
    is 'utf-8'. errors may be given to set a different error
    handling scheme. Default is 'strict' meaning that encoding errors raise
    a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
    'xmlcharrefreplace' as well as any other name registered with
    codecs.register_error that can handle UnicodeEncodeErrors.

In [11]: '马'.encode()
Out[11]: b'\xe9\xa9\xac'

1. Operations on bytes

In [12]: b = b'abc'

In [13]: type(b)
Out[13]: bytes

In [14]: b'abc'.find(b'b')
Out[14]: 1

In [15]: '马哥教育'.encode()
Out[15]: b'\xe9\xa9\xac\xe5\x93\xa5\xe6\x95\x99\xe8\x82\xb2'

In [16]: '马哥教育'.encode().find(b'\xa5')
Out[16]: 5

In [17]: b.hex()
Out[17]: '616263'
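Note that find on bytes returns a byte offset, not a character position. The sketch below contrasts the two and also round-trips through hex(); the variable names are illustrative.

```python
text = '马哥教育'
data = text.encode()               # UTF-8 by default

char_index = text.find('哥')        # position counted in characters
byte_index = data.find('哥'.encode())  # position counted in bytes
print(char_index, byte_index)      # 1 3

# hex() and bytes.fromhex() convert between bytes and a hex string.
hex_text = data.hex()
restored = bytes.fromhex(hex_text)
print(hex_text)                    # e9a9ace593a5e69599e882b2
print(restored == data)            # True
```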

2. bytearray

bytearray is the mutable counterpart of bytes.

Both str and bytes are immutable.

In [18]: s = 'abc'

In [19]: s[0]
Out[19]: 'a'

In [20]: s[0] = 'A'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-cc37915722b8> in <module>()
----> 1 s[0] = 'A'

TypeError: 'str' object does not support item assignment

In [21]: b = b'abc'

In [22]: b[0]
Out[22]: 97

In [23]: b[0] = b'A'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-9132e616117d> in <module>()
----> 1 b[0] = b'A'

TypeError: 'bytes' object does not support item assignment

In [24]: b
Out[24]: b'abc'

In [25]: b = bytearray(b)

In [26]: b
Out[26]: bytearray(b'abc')

In [27]: b[0]
Out[27]: 97

In [28]: b[0] = b'A'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-28-9132e616117d> in <module>()
----> 1 b[0] = b'A'

TypeError: an integer is required

In [29]: b[0] = int(b'A')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-29-d2cb8449404d> in <module>()
----> 1 b[0] = int(b'A')

ValueError: invalid literal for int() with base 10: b'A'

In [30]: b[0] = int(b'A'.hex(), 16)

In [31]: b
Out[31]: bytearray(b'Abc')
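Going through hex() and int(..., 16) works, but an element of a bytearray is just an integer in range(0, 256), so there are shorter ways to obtain one. A minimal sketch of some alternatives:

```python
b = bytearray(b'abc')

b[0] = ord('A')   # ord() gives the code point of a one-character str: 65
b[1] = b'B'[0]    # indexing a bytes object already yields an int: 66
b[2:3] = b'C'     # slice assignment accepts a bytes-like object directly
print(b)          # bytearray(b'ABC')
```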

bytearray methods

In [32]: b
Out[32]: bytearray(b'Abc')

In [33]: b.append(b'b')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-34bc8d98d616> in <module>()
----> 1 b.append(b'b')

TypeError: an integer is required

In [34]: b.append(int(b'd'.hex(), 16))       # convert b'd' to an integer via its hex string

In [35]: b
Out[35]: bytearray(b'Abcd')

In [36]: b.append(97)

In [37]: b
Out[37]: bytearray(b'Abcda')

In [38]: b.append(10000000000000000000000)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-38-885ab3f1d9d4> in <module>()
----> 1 b.append(10000000000000000000000)

ValueError: byte must be in range(0, 256)
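Besides append, a bytearray supports the usual mutable-sequence methods. The following sketch shows a few of them, assuming a fresh bytearray rather than the one from the session above.

```python
b = bytearray(b'Abc')

b.append(ord('d'))      # append takes one int in range(0, 256)
b.extend(b'ef')         # extend takes a bytes-like object or an iterable of ints
b += b'g'               # in-place concatenation also works
print(b)                # bytearray(b'Abcdefg')

last = b.pop()          # removes and returns the last byte as an int
print(last, chr(last))  # 103 g

b.insert(0, ord('>'))   # insert(index, int)
print(bytes(b))         # b'>Abcdef'

# Values outside range(0, 256) are rejected.
try:
    b.append(256)
    accepted = True
except ValueError:
    accepted = False
print(accepted)         # False
```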

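Linear structures and slicing

Which types are linear structures?

str
list
tuple
bytes
bytearray

Common features:

Iterable
len() returns the length
Elements can be accessed by index
Slicing is supported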
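The four shared features can be checked on one sample of each type. This is a small sketch; the sample values are arbitrary.

```python
samples = ['abc', ['a', 'b', 'c'], ('a', 'b', 'c'), b'abc', bytearray(b'abc')]

for seq in samples:
    items = [x for x in seq]   # iterable
    length = len(seq)          # len() works
    first = seq[0]             # index access
    rest = seq[1:]             # slicing returns the same type
    print(type(seq).__name__, length, first, rest, items)

# For bytes and bytearray, indexing yields an int while slicing yields bytes.
print(b'abc'[0], b'abc'[0:1])  # 97 b'a'
```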

enumerate

In [45]: items = ['a', 'b', 'c']

In [46]: for i, item in enumerate(items):
    ...:     if item == 'a' and i == 0:
    ...:         print(item)
    ...:
a

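enumerate pairs each element with its index. A simplified equivalent (it builds a list eagerly, whereas the built-in returns a lazy iterator):

def _enumerator(iterator):
    # Collect an (index, value) pair for every element of the iterable.
    ret = []
    i = 0
    for v in iterator:
        ret.append((i, v))
        i += 1
    return ret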
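The built-in enumerate is lazy and accepts a start argument. A generator-based sketch that mirrors this behavior more closely is shown here; lazy_enumerate is a hypothetical name used only for this illustration.

```python
def lazy_enumerate(iterable, start=0):
    """Yield (index, value) pairs one at a time, counting from start."""
    i = start
    for v in iterable:
        yield i, v
        i += 1


pairs = list(lazy_enumerate(['a', 'b', 'c'], start=1))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]

# Same result as the built-in.
same = pairs == list(enumerate(['a', 'b', 'c'], start=1))
print(same)   # True
```

Because nothing is stored in a list, this version also works on very large or unbounded iterables.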

Slicing

In [47]: lst = list(range(10))

In [48]: lst
Out[48]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

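In [49]: lst[0:2]    # half-open interval: start is included, stop is excluded
Out[49]: [0, 1]

In [50]: lst[0:-1]   # a stop of -1 means len(lst) - 1, so the last element is excluded
Out[50]: [0, 1, 2, 3, 4, 5, 6, 7, 8]

In [51]: lst[:100]   # a stop beyond the end raises no error; it is clamped to len(lst)
Out[51]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [52]: lst[-100:]  # a start before the beginning raises no error; it is clamped to 0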
Out[52]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

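Rules:

A negative index i is equivalent to len(lst) + i
start can be omitted when it is 0, and stop can be omitted when it is len(lst) (a stop of -1 excludes the last element)
With a positive step, an empty list is returned when start >= stop after negative indices have been converted
When start or stop is out of range, no error is raised: start is clamped to 0 and stop to len(lst)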
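Each of these slicing rules can be verified on a small list. The sketch below assumes the default step of 1; the variable names are illustrative.

```python
lst = list(range(10))
n = len(lst)

# A negative index i refers to the same element as len(lst) + i.
tail = lst[-3:]
same_tail = lst[n - 3:]
print(tail, same_tail)   # [7, 8, 9] [7, 8, 9]

# Omitted start means 0, omitted stop means len(lst).
full_copy = lst[:]
print(full_copy == lst[0:n])  # True

# Out-of-range bounds are clamped instead of raising IndexError.
clamped = lst[-100:100]
print(clamped == lst)    # True

# start >= stop (after converting negative indices) gives an empty list.
empty = lst[5:2]
empty_neg = lst[-2:-5]
print(empty, empty_neg)  # [] []
```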

lst[start:stop:step]

In [53]: lst
Out[53]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [54]: lst[1:9:2]   # take every step-th element
Out[54]: [1, 3, 5, 7]

In [55]: lst[-1:-9:-2]
Out[55]: [9, 7, 5, 3]

When step is negative, start must be greater than stop, otherwise the result is empty; elements are taken from right to left.
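A few cases with a negative step, as a minimal sketch. slice.indices is a standard method that shows how start, stop and step are resolved for a sequence of a given length.

```python
lst = list(range(10))

reversed_copy = lst[::-1]        # omitted bounds with step -1 reverse the list
right_to_left = lst[-1:-9:-2]    # start at the last element, move left by 2
wrong_order = lst[1:9:-2]        # start < stop with a negative step: empty
print(reversed_copy)   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(right_to_left)   # [9, 7, 5, 3]
print(wrong_order)     # []

# slice.indices(length) resolves negative values into concrete range() arguments.
resolved = slice(-1, -9, -2).indices(len(lst))
picked = [lst[i] for i in range(*resolved)]
print(resolved)        # (9, 1, -2)
print(picked)          # [9, 7, 5, 3]
```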