[Python Cookbook] Chapter 4: Iterators and Generators

Prymce-Q

Article link: https://blog.csdn.net/weixin_47691066/article/details/127131629

Contents

I. Iterators
II. Generators
Summary

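I. Iterators

1.1 Manually Consuming an Iterator

The following interactive session shows what happens at a basic level during iteration:

items = [1, 2, 3]
# Get an iterator
it = iter(items)
# Advance the iterator
next(it)
1
next(it)
2
next(it)
3
next(it)
StopIteration:       # the exception signals that iteration is finished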
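
The exception above can either be caught explicitly or avoided by giving next() a default value. The following is a minimal sketch of both approaches; the helper names drain_with_except and drain_with_default and the _SENTINEL marker are hypothetical and exist only for illustration.

```python
def drain_with_except(iterable):
    """Collect items by calling next() until StopIteration is raised."""
    it = iter(iterable)
    result = []
    while True:
        try:
            result.append(next(it))
        except StopIteration:
            # The iterator is exhausted
            break
    return result


# A unique marker object, so that values such as None can still be yielded
_SENTINEL = object()


def drain_with_default(iterable):
    """Collect items using next(it, default) instead of catching the exception."""
    it = iter(iterable)
    result = []
    while True:
        item = next(it, _SENTINEL)
        if item is _SENTINEL:
            break
        result.append(item)
    return result


print(drain_with_except([1, 2, 3]))   # [1, 2, 3]
print(drain_with_default([1, 2, 3]))  # [1, 2, 3]
```

In everyday code a for loop does this work for you; manual next() calls are mostly useful when finer control over the iteration is needed.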

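1.2 Delegating Iteration

To make a custom container object iterable, all you need to do is define an __iter__() method that delegates the iteration request to the container held inside the object: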

class Node:
    def __init__(self, value):
        self._value = value
        self._children = []
    
    def __repr__(self):
        return 'Node({!r})'.format(self._value)
    
    def add_child(self, node):
        self._children.append(node)
    
    def __iter__(self):
        return iter(self._children)
    
root = Node(0)
child1 = Node(1)
child2 = Node(2)
root.add_child(child1)
root.add_child(child2)
for ch in root:
    print(ch)
Node(1)
Node(2)

The __iter__() method simply forwards the iteration request to the object's internal _children attribute.

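1.3 Implementing the Iterator Protocol

Suppose we are building a custom object that should support iteration, and we want a simple way to implement the iterator protocol. Here a generator method does the job: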

class Node:
    def __init__(self, value):
        self._value = value
        self._children = []
    
    def __repr__(self):
        return 'Node({!r})'.format(self._value)
    
    def add_child(self, node):
        self._children.append(node)
    
    def __iter__(self):
        return iter(self._children)
    
    def depth_first(self):
        yield self
        for c in self:
            yield from c.depth_first()
# Build
root = Node(0)
child1 = Node(1)
child2 = Node(2)
root.add_child(child1)
root.add_child(child2)
child1.add_child(Node(3))
child1.add_child(Node(4))
child2.add_child(Node(5))

for ch in root.depth_first():
    print(ch)
Node(0)
Node(1)
Node(3)
Node(4)
Node(2)
Node(5)

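In the code above, depth_first() is easy to read: it first yields the node itself, then iterates over each child and uses yield from to delegate to the child's own depth_first() method, which produces the remaining items.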

1.4 Iterating in Reverse

To iterate over the elements of a sequence in reverse, use the built-in reversed() function:

a = [1, 2, 3, 4]
for x in reversed(a):
    print(x)
4
3
2
1

Going further, a custom class can support reverse iteration by implementing a __reversed__() method:

class Countdown:
    def __init__(self, start):
        self.start = start
    # Forward iterator
    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1
    # Reverse iterator
    def __reversed__(self):
        n = 1
        while n <= self.start:
            yield n
            n += 1

a = Countdown(5)
print(list(a))
[5, 4, 3, 2, 1]
print(list(reversed(a)))
[1, 2, 3, 4, 5]

1.5 Slicing an Iterator

Iterators and generators do not support the usual slice syntax. To take a slice of one, use the itertools.islice() function:

def count(n):
    while True:
        yield n
        n += 1
        
c = count(0)
c[10: 20]
TypeError: 'generator' object is not subscriptable

import itertools
for i in itertools.islice(c, 10, 12):
    print(i)
10
11
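
One detail worth keeping in mind is that islice() consumes the underlying iterator: the skipped items are discarded and cannot be revisited. The sketch below reuses the count() generator from above; the variable names first and second are only illustrative.

```python
import itertools


def count(n):
    # Endless counter, same as in the example above
    while True:
        yield n
        n += 1


c = count(0)

# Discards 0..9, then yields 10 and 11
first = list(itertools.islice(c, 10, 12))

# The generator was advanced by the previous slice, so this continues from 12
second = list(itertools.islice(c, 0, 2))

print(first)   # [10, 11]
print(second)  # [12, 13]
```

If the data has to be traversed again, it is usually simplest to convert it to a list first.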

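1.5 Skipping the First Part of an Iterable

If you know how many items to skip, use islice() and pass None as the stop argument. Here that selects everything after the first three items: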

import itertools
items = ['a', 'b', 'c', 1, 2, 3]

for x in itertools.islice(items, 3, None):
    print(x)
1
2
3

Alternatively, use itertools.dropwhile(), which discards leading items for as long as the supplied predicate returns True:

import itertools
items = ['a', 'b', 'c', 1, 2, 3]

for x in itertools.dropwhile(lambda x: isinstance(x, str), items):
    print(x)
1
2
3
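
Note that dropwhile() only skips items at the start of the iterable; once the predicate returns False for the first time, everything else passes through unchanged. A small sketch with made-up data illustrates the difference from filtering:

```python
from itertools import dropwhile

# Hypothetical input with a string appearing again after the first number
items = ['a', 'b', 1, 'c', 2]

# Only the leading strings are dropped; 'c' is kept
dropped = list(dropwhile(lambda x: isinstance(x, str), items))

# Filtering, by contrast, removes every string
filtered = [x for x in items if not isinstance(x, str)]

print(dropped)   # [1, 'c', 2]
print(filtered)  # [1, 2]
```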

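1.6 Iterating over All Possible Combinations or Permutations

itertools.permutations() takes a collection of items and iterates over every possible ordering of them, yielding each ordering as a tuple: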

import itertools
items = ['a', 'b', 'c']

for p in itertools.permutations(items):
    print(p)
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')

To get all permutations of a shorter length, pass an optional length argument:

for p in itertools.permutations(items, 2):
    print(p)
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')

If only the selection of elements matters and their order does not, use itertools.combinations():

import itertools
items = ['a', 'b', 'c']

for p in itertools.combinations(items, 2):
    print(p)
('a', 'b')
('a', 'c')
('b', 'c')

All of the functions above pick each element at most once. To allow the same element to be chosen more than once, use itertools.combinations_with_replacement():

import itertools
items = ['a', 'b', 'c']

for p in itertools.combinations_with_replacement(items, 2):
    print(p)
('a', 'a')
('a', 'b')
('a', 'c')
('b', 'b')
('b', 'c')
('c', 'c')

As the output shows, the combinations produced here may contain the same element more than once.
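
As a sanity check, the number of results from each function can be compared against the standard counting formulas. This is only a sketch: the helper n_choose_r is hypothetical and is written with math.factorial so that it does not depend on newer math functions.

```python
import itertools
import math


def n_choose_r(n, r):
    # Binomial coefficient n! / (r! * (n - r)!)
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


items = ['a', 'b', 'c']
n, r = len(items), 2

n_perm = len(list(itertools.permutations(items, r)))
n_comb = len(list(itertools.combinations(items, r)))
n_cwr = len(list(itertools.combinations_with_replacement(items, r)))

# Ordered selections without repetition: n! / (n - r)!
print(n_perm, math.factorial(n) // math.factorial(n - r))  # 6 6
# Unordered selections without repetition: C(n, r)
print(n_comb, n_choose_r(n, r))                            # 3 3
# Unordered selections with repetition: C(n + r - 1, r)
print(n_cwr, n_choose_r(n + r - 1, r))                     # 6 6
```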

1.7 Iterating over a Sequence as Index-Value Pairs

This is one of the most common idioms: the built-in enumerate() function yields (index, value) pairs:

list1 = ['a', 'b', 'c']
for idx, val in enumerate(list1):
    print(idx, val)
0 a
1 b
2 c

A custom starting index can also be given:

list1 = ['a', 'b', 'c']
for idx, val in enumerate(list1, 1):
    print(idx, val)
1 a
2 b
3 c

1.8 Iterating over Multiple Sequences Simultaneously

To iterate over the items of several sequences at the same time, use the zip() function:

list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]
for x1, x2 in zip(list1, list2):
    print(x1, x2)
a 1
b 2
c 3

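zip(a, b) creates an iterator that produces tuples (x, y), where x is taken from a and y from b. Iteration stops as soon as the shortest input is exhausted, so the result is only as long as the shortest input sequence: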

list1 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3]
for x in zip(list1, list2):
    print(x)
('a', 1)
('b', 2)
('c', 3)

If truncating to the shortest input is not what you want, use itertools.zip_longest() instead:

import itertools
list1 = ['a', 'b', 'c', 'd']
list2 = [1, 2, 3]
for x in itertools.zip_longest(list1, list2):
    print(x)
('a', 1)
('b', 2)
('c', 3)
('d', None)

for x in itertools.zip_longest(list1, list2, fillvalue=0):
    print(x)
('a', 1)
('b', 2)
('c', 3)
('d', 0)

zip() has many other uses; for example, it can be used to build a dict or a list:

list1 = ['a', 'b', 'c']
list2 = [1, 2, 3]
# Build a dictionary
x = dict(zip(list1, list2))
x
{'a': 1, 'b': 2, 'c': 3}

# Build a list
y = list(zip(list1, list2))
y
[('a', 1), ('b', 2), ('c', 3)]

# Three input sequences
list3 = ['A', 'B', 'C']
z = zip(list1, list2, list3)
for i in z:
    print(i)
('a', 1, 'A')
('b', 2, 'B')
('c', 3, 'C')

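1.9 Iterating over Items in Separate Containers

Suppose the same operation must be performed on many objects, but the objects live in different containers. To avoid writing the same loop several times, use itertools.chain() (here a and b are two separate lists):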

a = [1, 2]
b = ['x', 'y', 'z']
from itertools import chain
for x in chain(a, b):
    print(x)
1
2
x
y
z

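When the inputs are large, chain() is more memory-efficient than concatenating them first (for example with a + b), because it never builds a combined sequence. It also works when the iterables are of different types.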
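
The point about mixed types can be illustrated with a set and a list, which cannot be joined with the + operator. This is a minimal sketch; the names active, inactive and names are made up for the example.

```python
from itertools import chain

active = {'alice'}            # a set
inactive = ['bob', 'carol']   # a list

# Concatenation with + is not defined between a set and a list
try:
    active + inactive
except TypeError as exc:
    print('cannot concatenate:', type(exc).__name__)

# chain() only needs each argument to be iterable, and copies nothing
names = sorted(chain(active, inactive))
print(names)  # ['alice', 'bob', 'carol']
```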

1.10 Iterating in Sorted Order over Merged Sorted Sequences

Given several sequences that are each already sorted, use heapq.merge() to iterate over a sorted merge of them:

import heapq
a = [1, 4, 6]
b = [2, 5, 11]
for c in heapq.merge(a, b):
    print(c)
1
2
4
5
6
11
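
heapq.merge() relies on every input already being sorted. It does not check this; it simply compares the items at the front of each input and emits the smallest. The sketch below, using made-up data, shows what happens when that assumption is violated:

```python
import heapq

# Both inputs sorted: the merged result is sorted
good = list(heapq.merge([1, 4, 6], [2, 5, 11]))
print(good)  # [1, 2, 4, 5, 6, 11]

# First input NOT sorted: no error is raised, but the output is out of order
bad = list(heapq.merge([4, 1, 6], [2, 5, 11]))
print(bad)   # [2, 4, 1, 5, 6, 11]
```

Because it only ever looks at the front items, heapq.merge() never reads a whole input into memory, which makes it suitable for very long sequences.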

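II. Generators

2.1 Creating New Iteration Patterns with Generators

To implement a custom iteration pattern that differs from the common built-ins (such as range() and reversed()), define it with a generator function: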

def frange(start, stop, increment):
    x = start
    while x < stop:
        yield x
        x += increment

list(frange(0, 4, 0.5))
[0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]

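The mere presence of a yield statement in a function turns it into a generator. Unlike a normal function, a generator only runs in response to iteration: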

def countdown(n):
    print('Start to count from ', n)
    while n > 0:
        yield n
        n -= 1
    print('Done')

c = countdown(3)
c
<generator object countdown at 0x000001B5F3E2A7B0>

next(c)
Start to count from  3
3
next(c)
2
next(c)
1
next(c)
Done
StopIteration:

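As the example shows, a generator function only runs in response to next() operations during iteration. Once the generator function returns, iteration stops. Normally, however, the for statement takes care of these details for us.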
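
A related consequence is that a generator object can only be consumed once. The following sketch uses a version of countdown() without the print() calls; the names first_pass and second_pass are illustrative.

```python
def countdown(n):
    # Same counting logic as above, without the print() calls
    while n > 0:
        yield n
        n -= 1


c = countdown(3)

# The for machinery calls next() and handles StopIteration for us
first_pass = [x for x in c]

# The generator is now exhausted, so a second pass yields nothing
second_pass = [x for x in c]

print(first_pass)   # [3, 2, 1]
print(second_pass)  # []
```

To iterate again, call the generator function again to obtain a fresh generator object.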

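2.2 Defining Generators with Extra State

Sometimes a generator function needs to involve some extra state that should be exposed to the user. In that case, implement it as a class and put the generator code in the __iter__() method: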

from collections import deque

class linehistory:
    def __init__(self, lines, histlen=3):
        self.lines = lines
        self.history = deque(maxlen=histlen)
    
    def __iter__(self):
        for lineno, line in enumerate(self.lines, 1):
            self.history.append((lineno, line))
            yield line
            
    def clear(self):
        self.history.clear()

a = linehistory('LOVE')
list(a)
['L', 'O', 'V', 'E']

a.history
deque([(2, 'O'), (3, 'V'), (4, 'E')], maxlen=3)

a.clear()
a.history
deque([], maxlen=3)

# Build an iterator explicitly
a = linehistory('LOVE')
b = iter(a)

next(b)
'L'
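
The benefit of exposing the state shows up when it is inspected while iteration is still in progress. The sketch below is one possible use, assuming a small hypothetical list of lines in place of a real file: when a line matches, the recent history gives its surrounding context.

```python
from collections import deque


class linehistory:
    def __init__(self, lines, histlen=3):
        self.lines = lines
        self.history = deque(maxlen=histlen)

    def __iter__(self):
        for lineno, line in enumerate(self.lines, 1):
            self.history.append((lineno, line))
            yield line


# Hypothetical sample input standing in for the lines of a file
lines = ['import os', 'x = 1', 'print(x)', 'y = 2', 'print(y)']

hist = linehistory(lines, histlen=2)
context = None
for line in hist:
    if 'print(y)' in line:
        # Snapshot the history at the moment of the match
        context = list(hist.history)

print(context)  # [(4, 'y = 2'), (5, 'print(y)')]
```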

2.3 Flattening a Nested Sequence

A nested sequence is, for example, a list that contains other lists. To flatten it into a single sequence of values, write a recursive generator function that uses yield from:

from collections.abc import Iterable

def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            yield from flatten(x, ignore_types)
        else:
            yield x
            
items = [1, 2, [3, 4, [5, 6], 7], 8]
for x in flatten(items):
    print(x)

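In the code above, isinstance(x, Iterable) checks whether x is iterable, while isinstance(x, ignore_types) prevents strings (str) and byte strings (bytes), which are themselves iterable, from being expanded into individual characters. yield from delegates to the recursive call as a kind of subroutine and emits every value it produces.

The same code can also be written as follows, although the yield from version is more concise: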

def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            for i in flatten(x, ignore_types):
                yield i
        else:
            yield x
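
To see the effect of ignore_types, the function can be tried on nested data that contains strings. This is a small sketch with made-up input; it imports Iterable from collections.abc and passes ignore_types on to the recursive call so that a caller's custom setting is respected at every level.

```python
from collections.abc import Iterable


def flatten(items, ignore_types=(str, bytes)):
    for x in items:
        if isinstance(x, Iterable) and not isinstance(x, ignore_types):
            # Recurse into nested containers, keeping the same ignore_types
            yield from flatten(x, ignore_types)
        else:
            yield x


numbers = [1, 2, [3, 4, [5, 6], 7], 8]
names = ['Dave', 'Paula', ['Thomas', 'Lewis']]

flat_numbers = list(flatten(numbers))
flat_names = list(flatten(names))

print(flat_numbers)  # [1, 2, 3, 4, 5, 6, 7, 8]
print(flat_names)    # ['Dave', 'Paula', 'Thomas', 'Lewis']
```

Without the ignore_types check, each string would be split into characters, and because a one-character string is itself iterable the recursion would never bottom out.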