[Python Cookbook] Chapter 4: Iterators and Generators

Latest recommended article published on 2024-09-16 14:49:33

取个名字就这么难

Category: Reading Notes. Tags: python, generators, lists

Article link: https://blog.csdn.net/yyt_1995/article/details/106399673

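For iterators, iterables, the iterator protocol, and generators; the next() function, the __iter__() method, and the yield statement; and how to build your own iterator or iterable, see: Iterators, Generators, and Coroutines.
Reverse iteration: the reversed() function. It works only when the object's size can be determined in advance or the object itself implements the __reversed__() method.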
```
>>> a = [1, 2, 3, 4]
>>> for x in reversed(a):
...     print(x)
...
4
3
2
1
```
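As a minimal sketch of the second case, a class can define __reversed__() itself, so reversed() works without first building a list. The class name Countdown is hypothetical and used only for illustration.

```python
class Countdown:
    """Hypothetical example class: counts down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Forward iteration: start, start-1, ..., 1
        n = self.start
        while n > 0:
            yield n
            n -= 1

    def __reversed__(self):
        # Reverse iteration: 1, 2, ..., start
        n = 1
        while n <= self.start:
            yield n
            n += 1


forward = list(Countdown(3))
backward = list(reversed(Countdown(3)))
print(forward)   # [3, 2, 1]
print(backward)  # [1, 2, 3]
```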
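When a generator needs to expose extra state to the outside, implement it as a class and put the generator code in the __iter__() method.
Iterator slicing: itertools.islice(). Iterators and generators do not support the standard slice syntax, because their length is not known in advance. To take everything from index 10 onward, use islice(c, 10, None).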
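One possible sketch of this idea follows. The class name LineHistory, its histlen parameter, and the sample data are all hypothetical; the point is only that the generator code sits in __iter__() while the state stays reachable as an attribute.

```python
from collections import deque


class LineHistory:
    """Hypothetical example: iterate over lines while remembering the last few."""

    def __init__(self, lines, histlen=3):
        self.lines = lines
        self.history = deque(maxlen=histlen)  # state visible from outside

    def __iter__(self):
        # The generator logic lives here; each line is recorded before it is yielded.
        for lineno, line in enumerate(self.lines, 1):
            self.history.append((lineno, line))
            yield line

    def clear(self):
        self.history.clear()


lines = LineHistory(["alpha", "beta", "python", "delta"])
for line in lines:
    if "python" in line:
        break

recent = list(lines.history)
print(recent)  # [(1, 'alpha'), (2, 'beta'), (3, 'python')]
```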
```
>>> import itertools
>>> c = itertools.count(0)  # assumed definition of c: an iterator counting up from 0
>>> for x in itertools.islice(c, 10, 20):
...     print(x)
...
10
11
12
13
14
15
16
17
18
19
```
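The open-ended form mentioned above can be sketched as follows, assuming c is an endless counter built with itertools.count(). It also shows a point that is easy to miss: islice() consumes the underlying iterator, so the skipped items cannot be read again.

```python
import itertools

c = itertools.count(0)                # infinite iterator: 0, 1, 2, ...
tail = itertools.islice(c, 10, None)  # skip the first 10 items, keep the rest

first_three = [next(tail) for _ in range(3)]
print(first_three)  # [10, 11, 12]

# islice() advanced c itself, so c continues from where tail stopped.
after = next(c)
print(after)  # 13
```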

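Skipping the start of an iterable: itertools.dropwhile(). It takes a function and an iterable and returns an iterator that discards leading elements for as long as the function returns True; from the first element for which it returns False, every remaining element is yielded.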

>>> from itertools import dropwhile
>>> with open('/etc/passwd') as f:
...     for line in dropwhile(lambda line: line.startswith('#'), f):
...         print(line, end='')
...
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
...
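
A small self-contained sketch of the same behaviour, using an in-memory list of made-up lines instead of a file. It illustrates that dropwhile() only skips the leading run: a comment that appears later is kept.

```python
from itertools import dropwhile

# Hypothetical sample data standing in for the lines of a file.
lines = [
    "# comment one\n",
    "# comment two\n",
    "root:0\n",
    "# a later comment\n",
    "nobody:-2\n",
]

kept = list(dropwhile(lambda line: line.startswith("#"), lines))
print(kept)  # ['root:0\n', '# a later comment\n', 'nobody:-2\n']
```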

All permutations of a collection's elements: itertools.permutations()

>>> items = ['a', 'b', 'c']
>>> from itertools import permutations
>>> for p in permutations(items):
...     print(p)
...
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')

Permutations of a given length:

>>> for p in permutations(items, 2):
...     print(p)
...
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')

All combinations of a collection's elements: itertools.combinations()

>>> from itertools import combinations
>>> for c in combinations(items, 3):
...     print(c)
...
('a', 'b', 'c')
>>> for c in combinations(items, 2):
...     print(c)
...
('a', 'b')
('a', 'c')
('b', 'c')
>>> for c in combinations(items, 1):
...     print(c)
...
('a',)
('b',)
('c',)

itertools.combinations_with_replacement() allows the same element to be chosen more than once.
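
For comparison with the combinations() output above, a short sketch with the same three items; note the pairs such as ('a', 'a') that plain combinations() would never produce.

```python
from itertools import combinations_with_replacement

items = ['a', 'b', 'c']
pairs = list(combinations_with_replacement(items, 2))
for p in pairs:
    print(p)
# ('a', 'a')
# ('a', 'b')
# ('a', 'c')
# ('b', 'b')
# ('b', 'c')
# ('c', 'c')
```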

Getting both index and value while iterating: enumerate(). The starting index can be specified.
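
A minimal illustration, assuming a small sample list; the second argument sets the first index, which is handy for 1-based line numbers.

```python
items = ['a', 'b', 'c']

# Start counting at 1 instead of the default 0.
numbered = list(enumerate(items, 1))
for idx, val in numbered:
    print(idx, val)
# 1 a
# 2 b
# 3 c
```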

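Iterating over several sequences at once: the zip() function. zip() accepts more than two sequences and returns an iterator. Iteration stops as soon as the shortest input is exhausted.

>>> xpts = [1, 5, 4, 2, 10, 7]
>>> ypts = [101, 78, 37, 15, 62, 99]
>>> for x, y in zip(xpts, ypts):
...     print(x, y)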
...
1 101
5 78
4 37
2 15
10 62
7 99

Alternatively, use itertools.zip_longest() to iterate to the length of the longest sequence; fillvalue sets the value used for the missing entries.

>>> from itertools import zip_longest
>>> a = [1, 2, 3]
>>> b = ['w', 'x', 'y', 'z']
>>> for i in zip_longest(a, b):
...     print(i)
...
(1, 'w')
(2, 'x')
(3, 'y')
(None, 'z')
>>> for i in zip_longest(a, b, fillvalue=0):
...     print(i)
...
(1, 'w')
(2, 'x')
(3, 'y')
(0, 'z')

A neater way to build a dict:

headers = ['name', 'shares', 'price']
values = ['ACME', 100, 490.1]
s = dict(zip(headers,values))

Iterating over all elements of several collections: itertools.chain()

>>> from itertools import chain
>>> a = [1, 2, 3, 4]
>>> b = ['x', 'y', 'z']
>>> for x in chain(a, b):
...     print(x)
...
1
2
3
4
x
y
z

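Generator functions are a good way to implement a processing pipeline: the yield statement acts as the data producer and the for loop as the data consumer.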
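
A rough sketch of such a pipeline, with two stages chained together. The function names gen_numbers and gen_even and the sample input are hypothetical; each stage pulls items from the previous one only when the final consumer asks for them.

```python
def gen_numbers(lines):
    # Stage 1: turn non-blank text lines into integers.
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)


def gen_even(numbers):
    # Stage 2: pass through even numbers only.
    for n in numbers:
        if n % 2 == 0:
            yield n


raw = ["1\n", "2\n", "\n", "4\n", "7\n", "10\n"]

# sum() is the final consumer that drives the whole pipeline.
total = sum(gen_even(gen_numbers(raw)))
print(total)  # 16
```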

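Flattening a nested sequence into a single flat sequence: yield from. A template:

from collections.abc import Iterable

# The extra ignore_types argument and the isinstance(x, ignore_types) check keep strings and bytes
# from being treated as iterables, so they are not expanded into individual characters
def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            # Pass ignore_types along so nested levels follow the same rule
            yield from flatten(x, ignore_types)
        else:
            yield x

items = [1, 2, [3, 4, [5, 6], 7], 8]
# Produces 1 2 3 4 5 6 7 8
for x in flatten(items):
    print(x)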

Merging several sorted sequences: heapq.merge(). The inputs must already be sorted. It returns an iterator, so the overhead is low.

>>> import heapq
>>> a = [1, 4, 7, 10]
>>> b = [2, 5, 6, 11]
>>> for c in heapq.merge(a, b):
...     print(c)
...
1
2
4
5
6
7
10
11

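A little-known feature of iter() is that it optionally accepts a callable and a sentinel (terminating) value. Used this way, it creates an iterator that calls the callable repeatedly until the return value equals the sentinel. When processing data, this form can often replace a while True loop.

>>> import sys
>>> f = open('/etc/passwd')
>>> for chunk in iter(lambda: f.read(10), ''):
...     n = sys.stdout.write(chunk)
...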
nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false
root:*:0:0:System Administrator:/var/root:/bin/sh
daemon:*:1:1:System Services:/var/root:/usr/bin/false
_uucp:*:4:4:Unix to Unix Copy Protocol:/var/spool/uucp:/usr/sbin/uucico
...
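
The same two-argument form can be tried without touching a real file. The sketch below assumes an in-memory io.StringIO object with made-up content; read(10) returns an empty string at the end of the data, which matches the sentinel and stops the loop.

```python
import io

f = io.StringIO("abcdefghijklmnopqrstuvwxy")  # 25 characters of sample data

# Call f.read(10) repeatedly until it returns the sentinel ''.
chunks = list(iter(lambda: f.read(10), ''))
print(chunks)  # ['abcdefghij', 'klmnopqrst', 'uvwxy']
```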