Python Generators Made Simple

Charles_lgc

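The previous article showed how to build an iterator with a class: you must implement __iter__() and __next__(), keep the internal state yourself, and explicitly raise StopIteration when the iteration ends. That is a lot of ceremony, so this article introduces a simpler way.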

Python generators are a simple way of creating iterators.

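A Python generator is a function that returns an iterator.

How do you create a generator?

If a function definition uses yield at least once, the function becomes a generator function.
A generator function can contain any number of yield and return statements. The difference is:

A yield statement pauses the function, saves all of its state, and resumes from that point on the next call.
A return statement ends the generator for good; the caller then gets StopIteration.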
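
To make the difference concrete, here is a minimal sketch. The function name countdown and its parameters are made up for illustration: it yields values one at a time and uses return to end the iteration early.

```python
def countdown(start, stop_at):
    """Yield start, start - 1, ... and end early once stop_at is reached."""
    n = start
    while n > 0:
        if n == stop_at:
            return  # ends the generator; the caller sees StopIteration
        yield n     # pauses here and resumes on the next call to next()
        n -= 1


print(list(countdown(5, 2)))  # [5, 4, 3]
```

list() keeps calling next() until StopIteration arrives, which is why the values 2 and 1 never show up.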

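Differences between generator functions and normal functions

A generator function differs from a normal function in the following ways:

It contains at least one yield.
When called, it does not run immediately; it returns an iterator. Each call to next() runs it up to the next yield, so the first call runs it to the first yield.
The __iter__() and __next__() methods are created automatically.
When the generator function yields, it is paused and control returns to the caller.
Local variables and their state are preserved between calls.
Finally, when the function ends, StopIteration is raised automatically.
Here is an example: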

# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n

Now let's try it out:

>>> # It returns an object but does not start execution immediately.
>>> a = my_gen()

>>> # We can iterate through the items using next().
>>> next(a)
This is printed first
1
>>> # Once the function yields, the function is paused and the control is transferred to the caller.

>>> # Local variables and their states are remembered between successive calls.
>>> next(a)
This is printed second
2

>>> next(a)
This is printed at last
3

>>> # Finally, when the function terminates, StopIteration is raised automatically on further calls.
>>> next(a)
Traceback (most recent call last):
...
StopIteration
>>> next(a)
Traceback (most recent call last):
...
StopIteration
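
The claim that __iter__() and __next__() are created automatically can be checked directly. A small sketch, using a stripped-down my_gen so that it runs on its own:

```python
def my_gen():
    yield 1


a = my_gen()

# Both protocol methods exist even though we never wrote them.
print(hasattr(a, '__iter__'), hasattr(a, '__next__'))  # True True

# A generator object is its own iterator.
print(iter(a) is a)  # True
```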

Note that a generator object can be iterated only once. To iterate again, create a new object with a = my_gen().
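
A quick sketch of what "only once" means in practice (three_numbers is a hypothetical name used only here):

```python
def three_numbers():
    yield 1
    yield 2
    yield 3


gen = three_numbers()
first_pass = list(gen)         # consumes every value
second_pass = list(gen)        # the same object is already exhausted
fresh = list(three_numbers())  # a new generator object starts over

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fresh)        # [1, 2, 3]
```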

Generators are usually consumed in a for loop:

# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n

# Using for loop
for item in my_gen():
    print(item)

Running the code above prints:

This is printed first
1
This is printed second
2
This is printed at last
3

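This pause-and-resume behavior is also the basis of coroutines: with generators, some asynchronous operations can be written so that they read like synchronous code. The original post links to a concrete example.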
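
As a small illustration of the coroutine idea (this is not the example the original links to): besides producing values, a paused generator can also receive values through its send() method. The name running_average is hypothetical.

```python
def running_average():
    """Receive numbers via send() and yield the average seen so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # pause; resume when the caller sends a value
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)            # prime the coroutine: run it to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0
```

The caller and the generator take turns, and total and count survive between turns without any class or global variable.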

Yielding inside a loop

A generator is typically built by yielding inside a loop. Here is an example:

def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1,-1,-1):
        yield my_str[i]

# For loop to reverse the string
# Output:
# o
# l
# l
# e
# h
for char in rev_str("hello"):
    print(char)
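
Since the result is an ordinary iterator, anything that accepts an iterable can consume it, not just a for loop. A quick sketch that repeats the rev_str definition so it runs on its own:

```python
def rev_str(my_str):
    # Walk the indices from the last character down to the first.
    for i in range(len(my_str) - 1, -1, -1):
        yield my_str[i]


print(''.join(rev_str("hello")))  # olleh
print(list(rev_str("abc")))       # ['c', 'b', 'a']
```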

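Generator expressions

Just as a lambda creates an anonymous function, a generator expression creates an anonymous generator.
The syntax of a generator expression is like that of a list comprehension, except that the square brackets [ ] are replaced with parentheses ( ).
The difference is that a list comprehension builds the entire list at once, while a generator expression produces one item at a time.
As a result, a generator expression uses less memory.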

# Initialize the list
my_list = [1, 3, 6, 10]

# square each term using list comprehension
# Output: [1, 9, 36, 100]
[x**2 for x in my_list]

# same thing can be done using generator expression
# Output: <generator object <genexpr> at 0x0000000002EBDAF8>
(x**2 for x in my_list)

As you can see, the generator expression above does not return a list right away. It returns a generator object, and you call next() to produce each value.

# Initialize the list
my_list = [1, 3, 6, 10]

a = (x**2 for x in my_list)
# Output: 1
print(next(a))

# Output: 9
print(next(a))

# Output: 36
print(next(a))

# Output: 100
print(next(a))

# Raises StopIteration: the generator is exhausted
next(a)

Note: when a generator expression is the only argument of a function call, its own parentheses can be omitted:

>>> sum(x**2 for x in my_list)
146

>>> max(x**2 for x in my_list)
100
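
The shortcut only applies when the generator expression is the sole argument. As soon as the call takes another argument, the parentheses are required again. A short sketch using the second (start) argument of sum():

```python
my_list = [1, 3, 6, 10]

# Sole argument: the generator expression needs no parentheses of its own.
total = sum(x**2 for x in my_list)

# With a second argument, the generator expression must be parenthesized.
total_with_start = sum((x**2 for x in my_list), 10)

print(total)             # 146
print(total_with_start)  # 156
```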

Advantages of generators

1. Concise

Compare PowTwo implemented as an iterator class with the generator version:

class PowTwo:
    def __init__(self, max=0):
        self.max = max

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n > self.max:
            raise StopIteration

        result = 2 ** self.n
        self.n += 1
        return result

The generator version does the same thing far more concisely:

def PowTwo(max=0):
    n = 0
    while n <= max:
        yield 2 ** n
        n += 1

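This works because a generator takes care of many details automatically: __iter__(), __next__(), saving state, and raising StopIteration.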
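
To check that the two styles really are interchangeable, here is a side-by-side sketch. The names PowTwoIter and pow_two are made up so that both versions can live in one file, and both are assumed to treat max as an inclusive upper exponent.

```python
class PowTwoIter:
    """Iterator class yielding 2**0 .. 2**max (inclusive)."""

    def __init__(self, max=0):
        self.max = max

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n > self.max:
            raise StopIteration
        result = 2 ** self.n
        self.n += 1
        return result


def pow_two(max=0):
    """Generator yielding 2**0 .. 2**max (inclusive)."""
    n = 0
    while n <= max:
        yield 2 ** n
        n += 1


print(list(PowTwoIter(3)))  # [1, 2, 4, 8]
print(list(pow_two(3)))     # [1, 2, 4, 8]
```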

2. Memory efficient

This was already explained when comparing list comprehensions with generator expressions.
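
A rough way to see the difference is sys.getsizeof, which reports the size of the container object itself (not of the items it refers to). The exact numbers vary between Python versions, so this sketch only compares them:

```python
import sys

squares_list = [x**2 for x in range(100_000)]  # all items exist now
squares_gen = (x**2 for x in range(100_000))   # items are produced on demand

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(list_size, gen_size)
print(gen_size < list_size)  # True
```

The generator object stays the same small size no matter how large the range is, because it only stores where it is in the iteration.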

3. Infinite generators can represent infinite streams

This was also covered in the previous article.
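
For readers who do not have the previous article at hand, a minimal sketch of the idea (all_even is a hypothetical name). The generator never ends on its own, so the caller decides how much to take, here with itertools.islice:

```python
from itertools import islice


def all_even():
    """Yield 0, 2, 4, ... forever."""
    n = 0
    while True:
        yield n
        n += 2


# Take only the first five values from the infinite stream.
print(list(islice(all_even(), 5)))  # [0, 2, 4, 6, 8]
```

A list could never hold this sequence, but the generator only ever computes the values that are actually requested.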

4. Pipelining Generators

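With generators, a series of operations can be chained into a pipeline, which can be more efficient because each item flows through every stage one at a time and no intermediate list is built.
In the example below, assume every line of a file is either a number or 'N/A', and we want to add up all the numbers: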

with open('afile.log') as file:
    # Strip the trailing newline so that 'N/A' lines compare equal
    col = (line.strip() for line in file)
    num = (float(x) for x in col if x != 'N/A')
    print("The sum is:", sum(num))
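
To try the pipeline without creating afile.log, here is a self-contained sketch that feeds it from an in-memory io.StringIO. The sample data and the helper name sum_numeric_lines are invented for illustration, and skipping blank lines is an extra assumption of mine, not something the original states.

```python
import io


def sum_numeric_lines(lines):
    """Sum every numeric line, skipping 'N/A' and blank lines."""
    stripped = (line.strip() for line in lines)                 # stage 1: clean up
    numbers = (float(x) for x in stripped if x and x != 'N/A')  # stage 2: filter and convert
    return sum(numbers)                                         # stage 3: consume


sample = io.StringIO("1.5\nN/A\n2.5\n\n4\n")
print("The sum is:", sum_numeric_lines(sample))  # The sum is: 8.0
```

Nothing is read from the input until sum() starts pulling values, and at any moment only one line is in flight through the stages.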

Open question: is this actually more efficient? I have not verified it yet. How could it be verified?
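
One possible way to check, at least for memory, is the standard tracemalloc module. The sketch below is an assumption-laden experiment rather than a benchmark: the data is synthetic, and the helper names peak_bytes, with_lists and with_generators are made up. It compares the peak memory of a list-based pipeline with a generator-based one.

```python
import tracemalloc

N = 100_000  # synthetic input size, chosen arbitrarily


def peak_bytes(func):
    """Run func and return the peak memory traced while it ran."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def with_lists():
    col = [str(i) for i in range(N)]  # full intermediate list
    num = [float(x) for x in col]     # another full list
    return sum(num)


def with_generators():
    col = (str(i) for i in range(N))  # nothing computed yet
    num = (float(x) for x in col)
    return sum(num)


list_peak = peak_bytes(with_lists)
gen_peak = peak_bytes(with_generators)
print("lists:     ", list_peak, "bytes")
print("generators:", gen_peak, "bytes")
```

Running time is a separate question. It could be measured in a similar way with the timeit module, and the result may well depend on the data size.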
