Basic definitions
Iterables
An iterable (Iterable) is any object that can be used directly in a for loop. Use isinstance() to check whether an object is an Iterable:
>>> from collections.abc import Iterable
>>> isinstance([], Iterable)
True
>>> isinstance({}, Iterable)
True
>>> isinstance('abc', Iterable)
True
>>> isinstance((x for x in range(10)), Iterable)
True
>>> isinstance(100, Iterable)
False
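Iterators
An iterator (Iterator) in Python represents a stream of data. Passing an Iterator to next() returns the next item each time, and once the data is exhausted next() raises StopIteration. You can think of the stream as an ordered sequence whose length is not known in advance: the only way to advance is to call next() and compute the next item on demand. Iterator evaluation is therefore lazy, and a value is computed only when it is requested.
An Iterator can even represent an infinite stream, such as all natural numbers, which a list could never store.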
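As a minimal sketch of such an infinite stream, the class below implements the iterator protocol by hand. The name Naturals is hypothetical and only serves this illustration; itertools.islice is used to take a finite slice, since converting the whole stream to a list would never finish.

```python
from collections.abc import Iterator
from itertools import islice


class Naturals:
    """Hypothetical iterator over 0, 1, 2, ... that never raises StopIteration."""

    def __init__(self):
        self._next_value = 0

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):
        value = self._next_value
        self._next_value += 1  # computed lazily, one value per next() call
        return value


stream = Naturals()
first = next(stream)                 # 0
following = list(islice(stream, 5))  # [1, 2, 3, 4, 5]
print(isinstance(stream, Iterator), first, following)
```

Only the values actually requested are ever computed, which is what makes the infinite stream representable.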
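Generators
A generator (generator) can be used in a for loop, and it can also be passed to next() repeatedly to get the next value, until it finally raises StopIteration to signal that no more values are available. In other words, a generator is an Iterator. Use isinstance() to check whether an object is an Iterator: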
>>> from collections.abc import Iterator
>>> isinstance((x for x in range(10)), Iterator)
True
>>> isinstance([], Iterator)
False
>>> isinstance({}, Iterator)
False
>>> isinstance('abc', Iterator)
False
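The difference between the two checks can be sketched as follows: a list is an Iterable but not an Iterator, the built-in iter() builds an Iterator from it, and a generator is already an Iterator that raises StopIteration once exhausted. The variable names are illustrative only.

```python
from collections.abc import Iterator

numbers = [10, 20]
it = iter(numbers)  # iter() builds an Iterator from an Iterable

print(isinstance(numbers, Iterator))  # False: a list is only an Iterable
print(isinstance(it, Iterator))       # True

gen = (x * x for x in range(2))       # generator expression
print(next(gen), next(gen))           # 0 1

try:
    next(gen)                         # nothing left to produce
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)                      # True
```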
yield
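The Oxford dictionary gives two general senses for yield: to produce and to give way. Both apply when yield value appears in Python code. First, yield value produces value for the caller of next(...). Second, it gives way: the code after yield value is suspended and control returns to the caller, until the caller asks for the next value by calling next(...) again. The caller receives value from yield value.
If the code uses var = yield, it can also receive data from the caller, but the caller must then call .send(data) rather than next(...) to deliver the data to var.
The yield keyword does not even have to receive or produce data. However the data flows, yield is a flow-control tool that enables cooperative multitasking: a coroutine can yield control to a central scheduler so that other coroutines can be activated.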
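A basic coroutine example
The following demonstrates the simple_coroutine coroutine and main switching execution back and forth. As soon as either side randomly draws 6, execution stops; otherwise the two keep alternating.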
import random
def simple_coroutine():
    print('coroutine started')
    while True:
        send = random.choice(range(8))
        print('coroutine send',send)
        if send == 6:
            # Produce the final 6, then finish once resumed
            yield send
            break
        # Produce send, pause, and wait for the caller's value
        receive = yield send
        print('coroutine receive', receive)
if __name__ == '__main__':
    print('main started')
    coroutine = simple_coroutine()
    print('main declare',coroutine)
    # Start the coroutine and run it to its first yield
    receive = next(coroutine)
    print('main next',receive)
    while receive != 6:
        send = random.choice(range(8))
        print('main send', send)
        if send == 6:
            coroutine.send(send)
            coroutine.close()
            break
        receive = coroutine.send(send)
        print('main receive',receive)
The output looks like this:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10c862480>
coroutine started
coroutine send 6
main next 6
Or:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10c1d8480>
coroutine started
coroutine send 0
main next 0
main send 1
coroutine receive 1
coroutine send 3
main receive 3
main send 2
coroutine receive 2
coroutine send 6
main receive 6
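From the output above we can conclude:
- A coroutine is defined like a generator function: its body contains the yield keyword.
- A function containing yield is no longer an ordinary function; the Python interpreter treats it as a generator function.
- coroutine = simple_coroutine() works the same way as creating a generator: calling the function only returns a generator object, <generator object simple_coroutine at 0x10c1d8480>, and does not start executing the coroutine's code.
- We must call next(coroutine) first. The newly created generator object has not started and is not yet suspended at a yield, so we cannot call coroutine.send(send) to send it data (other than None).
- After next(coroutine) is called, the generator object starts and runs to yield send, where it produces send and then pauses (it does not terminate), giving way to the caller main. main keeps running until it needs the next value and calls receive = coroutine.send(send), which gives way to the coroutine again.
- value = yield means the coroutine only needs to receive data from the caller. The value it produces is None, implicitly, because there is no expression to the right of yield.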
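The need to start the coroutine before sending data can be checked with a small sketch. The echo coroutine below is a hypothetical example, not taken from the program above; it yields the number of values received so far.

```python
def echo():
    """Receive values through send() and collect them."""
    received = []
    while True:
        value = yield len(received)   # produce the count, then wait for data
        received.append(value)


coro = echo()

try:
    coro.send('too early')            # not started yet, so this raises TypeError
except TypeError as exc:
    error = str(exc)

first = next(coro)                    # prime: run to the first yield
second = coro.send('hello')           # deliver 'hello', run to the next yield
print(error)
print(first, second)
```

The failed send() does not start the generator, so the later next(coro) still runs it from the beginning.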
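Coroutine states
A coroutine is always in one of the following four states, which can be obtained with inspect.getgeneratorstate(...):
- GEN_CREATED: created, waiting to start execution
- GEN_RUNNING: currently being executed by the interpreter
- GEN_SUSPENDED: suspended at a yield expression
- GEN_CLOSED: execution has completed and the generator is closed
Now we modify main as follows: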
from inspect import getgeneratorstate  # needed for the state checks below
if __name__ == '__main__':
    print('main started')
    coroutine = simple_coroutine()
    print('main declare',coroutine)
    print(getgeneratorstate(coroutine))
    receive = next(coroutine)
    print('main next',receive)
    while receive != 6:
        send = random.choice(range(8))
        print('main send', send)
        if send == 6:
            coroutine.send(send)
            print(getgeneratorstate(coroutine))
            coroutine.close()
            print(getgeneratorstate(coroutine))
            break
        receive = coroutine.send(send)
        print(getgeneratorstate(coroutine))
        print('main receive',receive)
The output is as follows:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10ca96480>
GEN_CREATED
coroutine started
coroutine send 7
main next 7
main send 6
coroutine receive 6
coroutine send 5
GEN_SUSPENDED
GEN_CLOSED
Or:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10effc480>
GEN_CREATED
coroutine started
coroutine send 3
main next 3
main send 2
coroutine receive 2
coroutine send 2
GEN_SUSPENDED
main receive 2
main send 7
coroutine receive 7
coroutine send 6
GEN_SUSPENDED
main receive 6
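From the output above we can conclude:
- coroutine = simple_coroutine() creates the coroutine, which is in the GEN_CREATED state, waiting to start.
- After next(coroutine) is called, the coroutine is in the GEN_RUNNING state while its code executes, until it reaches a yield, produces a value, and enters the GEN_SUSPENDED state.
- Once the coroutine finishes via break, or coroutine.close() is called, it is in the GEN_CLOSED state: execution has ended and the coroutine is closed.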
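GEN_RUNNING never shows up in the outputs above because main can only inspect the coroutine while the coroutine is paused. A minimal sketch that observes all four states follows; the observer generator is hypothetical and checks its own state from inside its body.

```python
from inspect import getgeneratorstate

states = []


def observer():
    # While this body is executing, the generator reports GEN_RUNNING.
    states.append(getgeneratorstate(gen))
    yield


gen = observer()
states.append(getgeneratorstate(gen))  # created, not started yet
next(gen)                              # runs the body up to the yield
states.append(getgeneratorstate(gen))  # paused at the yield
gen.close()
states.append(getgeneratorstate(gen))  # closed
print(states)
```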
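Computing a running average with a coroutine
Next is an example that uses a coroutine to compute a running average. The coroutine runs an infinite loop: as long as the caller keeps sending values, it keeps receiving them and updating total and count. It terminates only when the caller calls .close(), or when no references to it remain and it is garbage collected.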
import random
def average():
    total = 0.0
    count = 0
    average = None
    while True:
        # Produce the current average, then wait for the next term
        term = yield average
        total += term
        count += 1
        average = total/count
if __name__ == '__main__':
    coro_avg = average()
    # Start the coroutine so that it is paused at yield and can accept send()
    next(coro_avg)
    no_list = []
    for i in range(6):
        no = random.choice(range(100))
        no_list.append(no)
        print(no_list,' avg: ',coro_avg.send(no))
Output:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
[51] avg: 51.0
[51, 45] avg: 48.0
[51, 45, 44] avg: 46.666666666666664
[51, 45, 44, 81] avg: 55.25
[51, 45, 44, 81, 49] avg: 54.0
[51, 45, 44, 81, 49, 50] avg: 53.333333333333336
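The termination through .close() mentioned earlier can be sketched with fixed inputs instead of random ones. The function is renamed averager here purely for illustration; after close(), the generator reports GEN_CLOSED and any further send() raises StopIteration.

```python
from inspect import getgeneratorstate


def averager():
    total, count, average = 0.0, 0, None
    while True:
        term = yield average
        total += term
        count += 1
        average = total / count


avg = averager()
next(avg)                                    # start the coroutine
results = [avg.send(10), avg.send(30), avg.send(5)]
avg.close()                                  # end the infinite loop from outside
state = getgeneratorstate(avg)

try:
    avg.send(1)                              # a closed generator cannot be resumed
    still_alive = True
except StopIteration:
    still_alive = False
print(results, state, still_alive)
```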
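Returning a value from a coroutine
For a coroutine to return a value, it must terminate normally. Let's rework the running-average program above: sending None now tells the coroutine to stop, and the returned value arrives in the value attribute of the StopIteration exception.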
import random
from collections import namedtuple
Result = namedtuple('Result','count average')
def average():
    total = 0.0
    count = 0
    average = None
    while True:
        term = yield
        # None is the signal to stop and return the result
        if term is None:
            break
        total += term
        count += 1
        average = total/count
    return Result(count,average)
if __name__ == '__main__':
    coro_avg = average()
    next(coro_avg)
    no_list = []
    for i in range(6):
        no = random.choice(range(100))
        no_list.append(no)
        coro_avg.send(no)
    try:
        coro_avg.send(None)
    except StopIteration as e:
        # The return value travels in the exception's value attribute
        result = e.value
        print(no_list,' avg: ',result)
The output is as follows:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
[71, 2, 73, 55, 74, 67] avg: Result(count=6, average=57.0)
yield from
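- The yield from syntax makes it easy to call another generator. For simple iteration, yield from Iterable is equivalent to for i in Iterable: yield i.
- The yield from construct automatically catches StopIteration internally, the same way a for loop handles it. In addition, the interpreter turns the value attribute of the StopIteration exception into the value of the yield from expression.
- yield from cannot be used outside a function (and neither can yield).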
def gen():
    for c in "AB":
        yield c
    for i in range(1,3):
        yield i
def gen2():
    yield from "AB"
    yield from range(1,3)
def gen3():
    yield from gen()
if __name__ == '__main__':
    l_result = []
    coro = gen()
    while True:
        try:
            a = next(coro)
            l_result.append(a)
        except StopIteration as e:
            break
    print(l_result)
    print(list(gen3()))
    print(list(gen()))
    print(list(gen2()))
Output:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
['A', 'B', 1, 2]
['A', 'B', 1, 2]
['A', 'B', 1, 2]
['A', 'B', 1, 2]
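The examples above only cover iteration. The other property of yield from, turning the subgenerator's return value into the value of the yield from expression, can be sketched by wrapping the averaging coroutine in a delegating generator. The names averager and grouper are illustrative; the input numbers are the ones from the earlier sample run.

```python
from collections import namedtuple

Result = namedtuple('Result', 'count average')


def averager():
    total, count, average = 0.0, 0, None
    while True:
        term = yield
        if term is None:
            break
        total += term
        count += 1
        average = total / count
    return Result(count, average)


def grouper(results, key):
    """Delegating generator: forwards send() to averager and stores its return value."""
    results[key] = yield from averager()


results = {}
group = grouper(results, 'scores')
next(group)                      # start grouper, which starts averager
for score in (71, 2, 73, 55, 74, 67):
    group.send(score)            # values pass straight through to averager
try:
    group.send(None)             # averager returns, then grouper finishes
except StopIteration:
    pass
print(results)
```

The caller never handles averager's StopIteration itself; yield from absorbs it and hands the Result to grouper.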
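yield from and asynchronous non-blocking execution
By itself, yield from only makes it convenient to call another generator. When network I/O would otherwise block, however, we can combine yield from with asyncio to get asynchronous, non-blocking behavior.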
import asyncio
import threading
import time
# Note: @asyncio.coroutine was deprecated in Python 3.8 and removed in 3.11; this example targets Python 3.7.
# Stand-in for a computation that takes a while
@asyncio.coroutine
def count_no():
    return 2**1000
@asyncio.coroutine
def count(no):
    print(time.time(),'count %d! %s' % (no,threading.current_thread()))
    yield from count_no()
    print(time.time(),'count %d end! %s' % (no,threading.current_thread()))
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    tasks = [count(i) for i in range(100)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
The output is as follows:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
1545651898.067989 count 9! <_MainThread(MainThread, started 140736385631168)>
1545651898.068131 count 9 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068198 count 75! <_MainThread(MainThread, started 140736385631168)>
1545651898.06828 count 75 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0683289 count 10! <_MainThread(MainThread, started 140736385631168)>
1545651898.068373 count 10 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068413 count 76! <_MainThread(MainThread, started 140736385631168)>
1545651898.068458 count 76 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068498 count 11! <_MainThread(MainThread, started 140736385631168)>
1545651898.068539 count 11 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068577 count 77! <_MainThread(MainThread, started 140736385631168)>
1545651898.0689468 count 77 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0690439 count 12! <_MainThread(MainThread, started 140736385631168)>
1545651898.069103 count 12 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.069149 count 78! <_MainThread(MainThread, started 140736385631168)>
1545651898.069193 count 78 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0692348 count 13! <_MainThread(MainThread, started 140736385631168)>
1545651898.069278 count 13 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0693178 count 79! <_MainThread(MainThread, started 140736385631168)>
1545651898.069359 count 79 end! <_MainThread(MainThread, started 140736385631168)>
The computation case behaves as intended, but what about network requests?
import asyncio
import requests
import threading
import time
# asyncio.coroutine wraps the function as a generator-based coroutine
@asyncio.coroutine
def request_net(url):
    return requests.get(url)
@asyncio.coroutine
def hello(url):
    print(time.time(),'Hello %s! %s' % (url,threading.current_thread()))
    resp = yield from request_net(url)
    print(time.time(),resp.request.url)
    print(time.time(),'Hello %s again! %s' % (url,threading.current_thread()))
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    # Request 12306 to simulate a long-running network operation
    tasks = [hello('https://kyfw.12306.cn'), hello('http://www.baidu.com')]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
The output is as follows:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
1545649725.3060732 Hello https://kyfw.12306.cn! <_MainThread(MainThread, started 140736385631168)>
1545649725.96431 https://kyfw.12306.cn/otn/passport?redirect=/otn/
1545649725.96435 Hello https://kyfw.12306.cn again! <_MainThread(MainThread, started 140736385631168)>
1545649725.9646132 Hello http://www.baidu.com! <_MainThread(MainThread, started 140736385631168)>
1545649726.023788 http://www.baidu.com/
1545649726.023826 Hello http://www.baidu.com again! <_MainThread(MainThread, started 140736385631168)>
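As the output shows, this is no different from ordinary requests: they still execute one after another. However slow 12306 is, the Baidu request is only scheduled after the first response has returned, because requests.get is a blocking call that holds the event loop's thread until it finishes. To get truly asynchronous behavior, the request itself must be made with a library that supports asynchronous operation.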
import asyncio
import threading
import time
import aiohttp
async def get(url):
    async with aiohttp.ClientSession() as session:
        rsp = await session.get(url)
        result = await rsp.text()
        return result
async def request(url):
    print(time.time(), 'Hello %s! %s' % (url, threading.current_thread()))
    result = await get(url)
    print(time.time(),'Get response from', url)
    print(time.time(), 'Hello %s again! %s' % (url, threading.current_thread()))
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    tasks = [asyncio.ensure_future(request('http://www.163.com/')), asyncio.ensure_future(request('http://www.baidu.com'))]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
The output is as follows:
#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
1545651249.1526 Hello http://www.163.com/! <_MainThread(MainThread, started 140736385631168)>
1545651249.1627488 Hello http://www.baidu.com! <_MainThread(MainThread, started 140736385631168)>
1545651250.150497 Get response from http://www.baidu.com
1545651250.1505191 Hello http://www.baidu.com again! <_MainThread(MainThread, started 140736385631168)>
1545651250.165774 Get response from http://www.163.com/
1545651250.165801 Hello http://www.163.com/ again! <_MainThread(MainThread, started 140736385631168)>
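The difference between the two network examples can be reproduced without any network access. In the sketch below, time.sleep stands in for a blocking call such as requests.get and asyncio.sleep stands in for an awaitable call such as an aiohttp request; both stand-ins and all function names are assumptions made for illustration. It requires Python 3.7 or later for asyncio.run.

```python
import asyncio
import time


async def blocking_task(delay):
    time.sleep(delay)            # blocks the whole event loop thread


async def cooperative_task(delay):
    await asyncio.sleep(delay)   # suspends only this coroutine


async def timed(task_factory, delay, n):
    """Run n tasks together and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(*(task_factory(delay) for _ in range(n)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(timed(blocking_task, 0.1, 3))
cooperative_elapsed = asyncio.run(timed(cooperative_task, 0.1, 3))
print(f'blocking: {blocking_elapsed:.2f}s, cooperative: {cooperative_elapsed:.2f}s')
```

The blocking variant should take roughly the sum of the delays, while the cooperative variant should take roughly one delay, since the waits overlap.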
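The @asyncio.coroutine decorator provided by asyncio marks a generator as a coroutine; inside that coroutine, yield from calls another coroutine to perform the asynchronous operation.
To simplify this and make asynchronous I/O easier to identify, Python 3.5 introduced the new async and await syntax, which makes coroutine code more concise and readable.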
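A minimal sketch of the async and await syntax follows, assuming Python 3.7 or later for asyncio.run. The fetch coroutine is a hypothetical stand-in for a network request and uses asyncio.sleep to simulate the wait; the events list records the order in which the coroutines start and finish.

```python
import asyncio


async def fetch(name, delay, events):
    """Hypothetical stand-in for a network request; asyncio.sleep simulates the wait."""
    events.append(f'start {name}')
    await asyncio.sleep(delay)   # give way to the event loop while waiting
    events.append(f'end {name}')
    return name.upper()


async def main(events):
    # gather runs both coroutines concurrently and keeps the result order
    return await asyncio.gather(
        fetch('slow', 0.2, events),
        fetch('fast', 0.05, events),
    )


events = []
results = asyncio.run(main(events))
print(events)
print(results)
```

Both coroutines start before either one finishes, and the faster one completes first, which is the interleaving that the yield from version aimed for.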