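A Deep Dive into Python Coroutines

Basic Definitions

Iterables

Iterable: any object that can be used directly in a for loop is called an iterable. You can use isinstance() to check whether an object is an Iterable. Import Iterable from collections.abc; importing it from collections directly was deprecated and no longer works as of Python 3.10.

>>> from collections.abc import Iterable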
>>> isinstance([], Iterable)
True
>>> isinstance({}, Iterable)
True
>>> isinstance('abc', Iterable)
True
>>> isinstance((x for x in range(10)), Iterable)
True
>>> isinstance(100, Iterable)
False

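Iterators

Iterator: a Python Iterator object represents a stream of data. It can be passed to next(), which keeps returning the next item until no data is left, at which point StopIteration is raised. You can think of the stream as an ordered sequence whose length is not known in advance; the only way to advance is to call next() and compute the next item on demand. An Iterator is therefore lazy: it computes a value only when that value is requested.

An Iterator can even represent an infinite stream, such as all the natural numbers, which a list could never store.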
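As a minimal sketch of the infinite-stream idea, the class below implements the iterator protocol by hand. The name Naturals is hypothetical and chosen only for illustration; itertools.islice is used to take a finite slice of the endless stream.

```python
from itertools import islice


class Naturals:
    """Iterator over 0, 1, 2, ... that never raises StopIteration."""

    def __init__(self):
        self.current = 0

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Each value is computed only when it is requested.
        value = self.current
        self.current += 1
        return value


numbers = Naturals()
first_five = list(islice(numbers, 5))
print(first_five)      # [0, 1, 2, 3, 4]
print(next(numbers))   # 5
```

Calling list(Naturals()) directly would never finish, which is why the example slices the stream instead.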

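Generators

Generator: a generator can be used in a for loop, and it can also be passed to next() repeatedly to get the next value, until StopIteration is raised to signal that no more values are available. In other words, every generator is an Iterator. We can use isinstance() to check whether an object is an Iterator:

>>> from collections.abc import Iterator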
>>> isinstance((x for x in range(10)), Iterator)
True
>>> isinstance([], Iterator)
False
>>> isinstance({}, Iterator)
False
>>> isinstance('abc', Iterator)
False
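The checks above show that a list is an Iterable but not an Iterator. A short sketch of how the two relate: the built-in iter() turns an iterable into an iterator, which can then be driven with next() until it is exhausted. The variable names are illustrative only.

```python
from collections.abc import Iterator

items = [1, 2]
it = iter(items)  # ask the list for an iterator over its elements

is_list_iterator = isinstance(items, Iterator)  # False: a list is only iterable
is_iter_iterator = isinstance(it, Iterator)     # True: iter() returned an iterator

first = next(it)   # 1
second = next(it)  # 2

try:
    next(it)
    exhausted = False
except StopIteration:
    # No data left: this is how an iterator signals the end of the stream.
    exhausted = True

print(is_list_iterator, is_iter_iterator, first, second, exhausted)
```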

yield

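The Oxford dictionary gives yield two main senses: to produce and to give way. Both apply when we write yield value in Python. First, yield value produces value for the caller of next(...). Second, it gives way: execution of the code after yield value is suspended and control returns to the caller, until the caller needs the next value and calls next(...) again. The caller receives value from yield value.

If we write var = yield, the generator can also receive data from the caller, but the caller must call .send(data) instead of next(...) for the data to reach var.

The yield keyword does not even have to receive or produce data. However the data flows, yield is a control-flow tool that enables cooperative multitasking: a coroutine can hand control to a central scheduler, which can then activate other coroutines.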
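To make the scheduler idea concrete, here is a small sketch of round-robin cooperative multitasking built only on bare yield. The functions worker and run_round_robin are hypothetical names for illustration, not part of any library.

```python
from collections import deque


def worker(name, steps, log):
    for step in range(steps):
        log.append((name, step))
        yield  # bare yield: no data in or out, it only hands control back


def run_round_robin(tasks):
    """Resume each task in turn until every task has finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)          # let the task run up to its next yield
        except StopIteration:
            continue            # the task is done, so drop it
        queue.append(task)      # otherwise put it back at the end of the queue


log = []
run_round_robin([worker('a', 2, log), worker('b', 3, log)])
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1), ('b', 2)]
```

The interleaved log shows the two workers taking turns, even though everything runs in a single thread.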

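A Basic Coroutine Example

The following demo shows the simple_coroutine coroutine and main taking turns to execute. Each side picks a random number on every turn; as soon as either side produces 6, execution stops. Otherwise the two keep switching back and forth.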

import random
def simple_coroutine():
    print('coroutine started')
    while True:
        send = random.choice(range(8))
        print('coroutine send',send)
        if send == 6:
            yield send
            break

        receive = yield send
        print('coroutine receive', receive)

if __name__ == '__main__':
    print('main started')
    coroutine = simple_coroutine()
    print('main declare',coroutine)
    receive = next(coroutine)
    print('main next',receive)
    while receive != 6:
        send = random.choice(range(8))
        print('main send', send)
        if send == 6:
            coroutine.send(send)
            coroutine.close()
            break
        receive = coroutine.send(send)
        print('main receive',receive)

The output is as follows:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10c862480>
coroutine started
coroutine send 6
main next 6

Or:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10c1d8480>
coroutine started
coroutine send 0
main next 0
main send 1
coroutine receive 1
coroutine send 3
main receive 3
main send 2
coroutine receive 2
coroutine send 6
main receive 6

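From the output above we can draw the following conclusions:

  • A coroutine is defined like a generator function: its body contains the yield keyword.
  • A function containing yield is no longer an ordinary function; the Python interpreter treats it as a generator function.
  • coroutine = simple_coroutine() works just like creating a generator: calling the function only returns a generator object, <generator object simple_coroutine at 0x10c1d8480>, and does not start running the coroutine body.
  • We must call next(coroutine) first. The newly created generator has not started and is not yet suspended at a yield, so we cannot call coroutine.send(send) to send it data.
  • After next(coroutine) is called, the generator starts and runs up to yield send, where it produces the value of send (which main stores in receive). It is then suspended, not terminated, and control returns to main. When main needs the next value, it calls receive = coroutine.send(send), handing control back to the coroutine.
  • value = yield means the coroutine only receives data from the caller. The value it yields is None, implicitly, because there is no expression to the right of yield.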
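The need to start a coroutine before sending to it can be checked directly. The sketch below uses a hypothetical echo coroutine: sending a non-None value to a generator that has not started raises TypeError, while next(...) advances it to the first yield so that later send(...) calls succeed.

```python
def echo():
    received = None
    while True:
        # Yield back whatever was sent in on the previous step.
        received = yield received


coro = echo()

try:
    coro.send('too early')  # the generator is not yet suspended at a yield
except TypeError as exc:
    error_message = str(exc)

primed_value = next(coro)     # runs to the first yield, which produces None
echoed = coro.send('hello')   # now the value reaches `received` and is yielded back

print(error_message)
print(primed_value, echoed)   # None hello
```

Calling coro.send(None) on a fresh generator has the same effect as next(coro), which is why None is the one value that may be sent before the generator has started.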

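Coroutine States

A coroutine is always in one of the following four states, which can be queried with inspect.getgeneratorstate(...):

  • GEN_CREATED: created, waiting to start execution
  • GEN_RUNNING: the generator is currently executing
  • GEN_SUSPENDED: suspended at a yield expression
  • GEN_CLOSED: execution has finished and the generator is closed

Now we modify main as follows: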

from inspect import getgeneratorstate

if __name__ == '__main__':
    print('main started')
    coroutine = simple_coroutine()
    print('main declare',coroutine)
    print(getgeneratorstate(coroutine))
    receive = next(coroutine)
    print('main next',receive)
    while receive != 6:
        send = random.choice(range(8))
        print('main send', send)
        if send == 6:
            coroutine.send(send)
            print(getgeneratorstate(coroutine))
            coroutine.close()
            print(getgeneratorstate(coroutine))
            break
        receive = coroutine.send(send)
        print(getgeneratorstate(coroutine))
        print('main receive',receive)

The output is as follows:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10ca96480>
GEN_CREATED
coroutine started
coroutine send 7
main next 7
main send 6
coroutine receive 6
coroutine send 5
GEN_SUSPENDED
GEN_CLOSED

Or:

/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
main started
main declare <generator object simple_coroutine at 0x10effc480>
GEN_CREATED
coroutine started
coroutine send 3
main next 3
main send 2
coroutine receive 2
coroutine send 2
GEN_SUSPENDED
main receive 2
main send 7
coroutine receive 7
coroutine send 6
GEN_SUSPENDED
main receive 6

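From the output above we can draw the following conclusions:

  • coroutine = simple_coroutine() creates the coroutine, which is in the GEN_CREATED state, waiting to start.
  • After next(coroutine) is called, the coroutine is in the GEN_RUNNING state while its body executes, until it reaches a yield and produces a value, at which point it enters GEN_SUSPENDED. GEN_RUNNING never appears in the output because main only gets control back once the coroutine has yielded or finished.
  • Once the coroutine body runs to completion (for example after the break) or coroutine.close() is called, the coroutine is in the GEN_CLOSED state.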
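All four states can be observed in one short run if the generator inspects itself while its body is executing. This is only a sketch: the reporter function is a hypothetical name, and it relies on the module-level variable gen being assigned before the body first runs.

```python
from inspect import getgeneratorstate


def reporter():
    # Queried from inside the body, the generator is currently executing.
    state_inside = getgeneratorstate(gen)
    yield state_inside


gen = reporter()
states = [getgeneratorstate(gen)]       # before the first next(): GEN_CREATED
states.append(next(gen))                # value observed inside the body: GEN_RUNNING
states.append(getgeneratorstate(gen))   # paused at the yield: GEN_SUSPENDED
gen.close()
states.append(getgeneratorstate(gen))   # after close(): GEN_CLOSED

print(states)
# ['GEN_CREATED', 'GEN_RUNNING', 'GEN_SUSPENDED', 'GEN_CLOSED']
```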

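Computing a Running Average with a Coroutine

Next is an example that uses a coroutine to compute a running average. The coroutine runs an infinite loop: as long as the caller keeps sending values, it keeps accepting them and updating total and count. It ends only when the caller calls .close() on it, or when no references to it remain and it is garbage collected.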

import random

def average():
    total = 0.0
    count = 0
    average = None
    while True:
        term = yield average
        total += term
        count += 1
        average = total/count


if __name__ == '__main__':
    coro_avg = average()
    next(coro_avg)
    no_list = []
    for i in range(6):
        no =random.choice(range(100))
        no_list.append(no)
        print(no_list,' avg:  ',coro_avg.send(no))

Output:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
[51]  avg:   51.0
[51, 45]  avg:   48.0
[51, 45, 44]  avg:   46.666666666666664
[51, 45, 44, 81]  avg:   55.25
[51, 45, 44, 81, 49]  avg:   54.0
[51, 45, 44, 81, 49, 50]  avg:   53.333333333333336

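Returning a Value from a Coroutine

For a coroutine to return a value, it must terminate normally. Here we rework the running-average program above so that it does. The returned value is carried by the value attribute of the StopIteration exception that is raised when the coroutine finishes.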

import random
from collections import namedtuple

Result = namedtuple('Result','count average')

def average():
    total = 0.0
    count = 0
    average = None
    while True:
        term = yield
        if term is None:
            break
        total += term
        count += 1
        average = total/count
    return Result(count,average)

if __name__ == '__main__':
    coro_avg = average()
    next(coro_avg)
    no_list = []
    for i in range(6):
        no =random.choice(range(100))
        no_list.append(no)
        coro_avg.send(no)

    try:
        coro_avg.send(None)
    except StopIteration as e:
        result = e.value

    print(no_list,' avg:  ',result)

The output is as follows:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
[71, 2, 73, 55, 74, 67]  avg:   Result(count=6, average=57.0)

yield from

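  • The yield from syntax makes it convenient to delegate to another generator.
  • For plain iteration, yield from Iterable is equivalent to for i in Iterable: yield i.
  • The yield from construct catches StopIteration internally, the same way a for loop does. In addition to catching the exception, the interpreter makes the exception's value attribute the value of the yield from expression.
  • yield from cannot be used outside a function (and neither can yield).
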
def gen():
    for c in "AB":
        yield c
    for i in range(1,3):
        yield i

def gen2():
    yield from "AB"
    yield from range(1,3)

def gen3():
    yield from gen()
    
if __name__ == '__main__':
    l_result = []
    coro = gen()
    while True:
        try:
            a = next(coro)
            l_result.append(a)
        except StopIteration as e:
            break

    print(l_result)

    print(list(gen3()))

    print(list(gen()))
    print(list(gen2()))

Output:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
['A', 'B', 1, 2]
['A', 'B', 1, 2]
['A', 'B', 1, 2]
['A', 'B', 1, 2]
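The examples above only use yield from for iteration. The sketch below illustrates the other property mentioned earlier: yield from forwards send() calls to the subgenerator and evaluates to the subgenerator's return value. The names averager and delegator are hypothetical, and the averager is a simplified variant of the earlier average coroutine.

```python
def averager():
    total = 0.0
    count = 0
    while True:
        term = yield
        if term is None:   # None is the signal to stop
            break
        total += term
        count += 1
    return total / count if count else None


def delegator(results):
    # send() calls made on delegator are passed through to averager();
    # when averager() returns, its return value becomes the value of yield from.
    result = yield from averager()
    results.append(result)


results = []
coro = delegator(results)
next(coro)                 # start the delegator, and through it the averager
for value in (10, 20, 30):
    coro.send(value)

try:
    coro.send(None)        # ends averager; delegator then finishes as well
except StopIteration:
    pass

print(results)  # [20.0]
```

The caller never has to read StopIteration.value for the inner coroutine; yield from does that on its behalf.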

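yield from and Asynchronous Non-blocking Execution

By itself, yield from only makes it convenient to delegate to another generator. But when network I/O would otherwise block, we can combine yield from with asyncio to get asynchronous, non-blocking behavior.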

import asyncio
import threading
import time

# A computation that takes a while
@asyncio.coroutine
def count_no():
    return 2**1000

@asyncio.coroutine
def count(no):
    print(time.time(),'count %d! %s' % (no,threading.currentThread()))
    yield from count_no()
    print(time.time(),'count %d end! %s' % (no,threading.currentThread()))

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    tasks=[count(i) for i in range(100)]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()

The output is as follows:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
1545651898.067989 count 9! <_MainThread(MainThread, started 140736385631168)>
1545651898.068131 count 9 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068198 count 75! <_MainThread(MainThread, started 140736385631168)>
1545651898.06828 count 75 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0683289 count 10! <_MainThread(MainThread, started 140736385631168)>
1545651898.068373 count 10 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068413 count 76! <_MainThread(MainThread, started 140736385631168)>
1545651898.068458 count 76 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068498 count 11! <_MainThread(MainThread, started 140736385631168)>
1545651898.068539 count 11 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.068577 count 77! <_MainThread(MainThread, started 140736385631168)>
1545651898.0689468 count 77 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0690439 count 12! <_MainThread(MainThread, started 140736385631168)>
1545651898.069103 count 12 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.069149 count 78! <_MainThread(MainThread, started 140736385631168)>
1545651898.069193 count 78 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0692348 count 13! <_MainThread(MainThread, started 140736385631168)>
1545651898.069278 count 13 end! <_MainThread(MainThread, started 140736385631168)>
1545651898.0693178 count 79! <_MainThread(MainThread, started 140736385631168)>
1545651898.069359 count 79 end! <_MainThread(MainThread, started 140736385631168)>

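The computation tasks are scheduled and run by the event loop as intended (each one finishes right after it starts, because count_no() never waits on anything). But what about network requests?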

import asyncio
import requests
import threading
import time

# asyncio.coroutine wraps the function as a generator-based coroutine
@asyncio.coroutine
def request_net(url):
    return requests.get(url)

@asyncio.coroutine
def hello(url):
    print(time.time(),'Hello %s! %s' % (url,threading.currentThread()))
    resp = yield from request_net(url)
    print(time.time(),resp.request.url)
    print(time.time(),'Hello %s again! %s' % (url,threading.currentThread()))


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    # Request 12306 to simulate a slow network operation
    tasks = [hello('https://kyfw.12306.cn'), hello('http://www.baidu.com')]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()

The output is as follows:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
1545649725.3060732 Hello https://kyfw.12306.cn! <_MainThread(MainThread, started 140736385631168)>
1545649725.96431 https://kyfw.12306.cn/otn/passport?redirect=/otn/
1545649725.96435 Hello https://kyfw.12306.cn again! <_MainThread(MainThread, started 140736385631168)>
1545649725.9646132 Hello http://www.baidu.com! <_MainThread(MainThread, started 140736385631168)>
1545649726.023788 http://www.baidu.com/
1545649726.023826 Hello http://www.baidu.com again! <_MainThread(MainThread, started 140736385631168)>

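As you can see, this is no different from making ordinary requests: they still run one after another. 12306 is slow, and the Baidu request does not even start until the 12306 response has come back. The reason is that requests.get() is a blocking call that never hands control back to the event loop. To get truly asynchronous behavior, we must use a request library that supports asynchronous operation, such as aiohttp: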
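The difference between a blocking call and an awaitable one can be reproduced without any network access. In this sketch, time.sleep stands in for a blocking call such as requests.get, and asyncio.sleep stands in for a non-blocking request; the function names are hypothetical, and the code assumes Python 3.7 or later for asyncio.run.

```python
import asyncio
import time


async def blocking_task(name, log):
    log.append(f'{name} start')
    time.sleep(0.05)            # blocks the whole event loop
    log.append(f'{name} end')


async def cooperative_task(name, log):
    log.append(f'{name} start')
    await asyncio.sleep(0.05)   # suspends only this coroutine
    log.append(f'{name} end')


async def run_pair(task_func):
    log = []
    await asyncio.gather(task_func('a', log), task_func('b', log))
    return log


blocking_log = asyncio.run(run_pair(blocking_task))
cooperative_log = asyncio.run(run_pair(cooperative_task))

print(blocking_log)      # ['a start', 'a end', 'b start', 'b end']
print(cooperative_log)   # both tasks start before either one ends
```

With the blocking call, task b cannot start until task a has finished. With the awaitable call, both tasks are in flight at the same time.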

import asyncio
import threading
import time

import aiohttp


async def get(url):
    async with aiohttp.ClientSession() as session:
        rsp = await session.get(url)
        result = await rsp.text()
        return result


async def request(url):
    print(time.time(), 'Hello %s! %s' % (url, threading.currentThread()))
    result = await get(url)
    print(time.time(),'Get response from', url)
    print(time.time(), 'Hello %s again! %s' % (url, threading.currentThread()))

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    tasks = [ asyncio.ensure_future(request('http://www.163.com/')),asyncio.ensure_future(request('http://www.baidu.com'))]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()

The output is as follows:

#/usr/local/bin/python3.7 /data/code/python/test/coroutine/coroutinue.py
1545651249.1526 Hello http://www.163.com/! <_MainThread(MainThread, started 140736385631168)>
1545651249.1627488 Hello http://www.baidu.com! <_MainThread(MainThread, started 140736385631168)>
1545651250.150497 Get response from http://www.baidu.com
1545651250.1505191 Hello http://www.baidu.com again! <_MainThread(MainThread, started 140736385631168)>
1545651250.165774 Get response from http://www.163.com/
1545651250.165801 Hello http://www.163.com/ again! <_MainThread(MainThread, started 140736385631168)>

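asyncio's @asyncio.coroutine decorator marks a generator as a coroutine, which can then use yield from inside to call another coroutine and perform asynchronous operations. Note that @asyncio.coroutine was deprecated in Python 3.8 and removed in Python 3.11, so the decorator-based examples above only run on older versions.
To simplify asynchronous I/O and make it easier to recognize, Python 3.5 introduced the new async and await syntax, which makes coroutine code more concise and readable.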
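As a rough illustration of the newer syntax, the count example from earlier might be rewritten as follows: async def replaces @asyncio.coroutine, and await replaces yield from. This is a sketch that assumes Python 3.7 or later for asyncio.run; returning the digit count instead of printing timestamps is a change made only to keep the example short.

```python
import asyncio


async def count_no():
    return 2 ** 1000


async def count(no):
    value = await count_no()      # await takes the place of yield from
    return no, len(str(value))    # task number and number of decimal digits


async def main():
    # gather() runs the coroutines concurrently and keeps results in call order
    return await asyncio.gather(*(count(i) for i in range(3)))


results = asyncio.run(main())
print(results)  # [(0, 302), (1, 302), (2, 302)]
```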
