Async Programming 101: Writing an Event Loop

The code in this article comes from: https://snarky.ca/how-the-heck-does-async-await-work-in-python-3-5/

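Recap

The previous article covered the history of Python's async and await keywords and made the point that async and await are an API, not an implementation. Many event loops are built on async/await, including asyncio and curio. Under the hood, asyncio is built on future objects, while curio is built on tuples.

In this article we use a min-heap to implement a simple event loop.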

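The heapq module

Heaps are arrays for which a[k] <= a[2*k+1] and a[k] <= a[2*k+2] for all k, counting elements from 0. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that a[0] is always its smallest element. (From the source code of Python's built-in heapq module.)

Put simply, a heap is a Python list with a special property: a[k] <= a[2*k+1] and a[k] <= a[2*k+2], so the first element is always the smallest.

As you have probably noticed, this describes a binary tree stored in a list, where the children of a[k] are a[2*k+1] and a[2*k+2]: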

35a3887d8f5480e55e1e73d90454e416.png

The heapq module mainly provides the following API:

Usage:

heap = [] # creates an empty heap
heappush(heap, item) # pushes a new item on the heap
item = heappop(heap) # pops the smallest item from the heap
item = heap[0] # smallest item on the heap without popping it
heapify(x) # transforms list into a heap, in-place, in linear time
item = heapreplace(heap, item) # pops and returns smallest item, and adds new item; the heap size is unchanged
  • Initialize a heap: heap = []
  • Push an item onto the heap: heappush(heap, item)
  • Pop the smallest item off the heap: item = heappop(heap)
  • Read the smallest item without removing it: item = heap[0]
  • Turn a list into a heap in place: heapify(x)
  • Pop the smallest item and push a new one: item = heapreplace(heap, item)
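
To make the calls above concrete, here is a minimal sketch that pushes a few numbers and reads them back. The values in resume_times are made up for illustration; they stand in for the resume times our event loop will store later.

```python
import heapq

# Hypothetical resume times in seconds, used only for illustration.
resume_times = [5, 1, 4, 2, 3]

heap = []
for t in resume_times:
    heapq.heappush(heap, t)

# The smallest item always sits at index 0; reading it does not remove it.
smallest = heap[0]

# heapreplace pops the smallest item and pushes a new one in a single step.
replaced = heapq.heapreplace(heap, 6)

# Popping repeatedly returns the remaining items in ascending order.
drained = [heapq.heappop(heap) for _ in range(len(heap))]

print(smallest, replaced, drained)
```

Note that the list itself is not fully sorted at any point; only the heap property holds, which is enough to always find the next item cheaply.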

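The generator send() method

Another recap, since this one can be hard to grasp.

next_value = generator.send(value)

Three things happen:

  • The generator resumes execution.
  • value becomes the value of the yield expression the generator is currently paused at.
  • The value produced by the generator's next yield expression is returned as next_value.

Look at this example: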

>>> def double_inputs():
...     while True:
...         x = yield
...         yield x * 2
...
>>> gen = double_inputs()
>>> next(gen)  # run up to the first yield
>>> gen.send(10)  # goes into 'x' variable
20
>>> next(gen)  # run up to the next yield
>>> gen.send(6)  # goes into 'x' again
12
>>> next(gen)  # run up to the next yield
>>> gen.send(94.3)  # goes into 'x' again
188.6

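Here is what happens when gen.send(10) executes:

  • The generator resumes running.
  • 10 is assigned to x in x = yield.
  • x * 2 evaluates to 20. The generator then reaches the next yield and pauses again, handing x * 2 back as the return value of send(), which is why the statement prints 20.

next(g) is equivalent to g.send(None). It is commonly used to advance a generator to a yield and pause it there.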
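
The same mechanics can be checked with a generator that uses a single yield for both directions. This is only a sketch: running_total is a hypothetical example, not part of the original code.

```python
def running_total():
    """Yield the running sum of every value sent in (hypothetical example)."""
    total = 0
    while True:
        # Pause here and hand `total` out; the argument of the next
        # send() call becomes the value of this yield expression.
        received = yield total
        total += received


gen = running_total()
first = next(gen)       # same as gen.send(None): run to the first yield
second = gen.send(10)   # 10 goes into `received`, the loop yields 10
third = gen.send(5)     # 5 goes into `received`, the loop yields 15

# A generator that has not reached its first yield cannot receive a value.
fresh = running_total()
try:
    fresh.send(1)
    needs_priming = False
except TypeError:
    needs_priming = True

print(first, second, third, needs_priming)
```

The last part shows why the first call is always next(g) or g.send(None): there is no paused yield expression yet to receive a value.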

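Designing the event loop

The event loop we are going to build is simple. Its core features are:

  • It handles many delayed tasks.
  • The task with the earliest scheduled time runs first.
  • If an earlier task takes a long time to complete, it does not block later tasks (that is, they run concurrently, interleaved on a single thread).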
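
Before looking at the real code, the scheduling idea can be sketched with simulated time and plain tuples. Everything here is an assumption made for illustration: simulate, its (label, steps, delay) job format, and the rule that each step takes one simulated second are not part of the original code.

```python
import heapq


def simulate(jobs):
    """Return the (time, label) order in which jobs would be resumed.

    jobs is a list of (label, steps, delay) tuples; each step is assumed
    to take one simulated second. No real waiting happens.
    """
    heap = []
    for label, steps, delay in jobs:
        heapq.heappush(heap, (delay, label, steps))

    timeline = []
    while heap:
        # Always resume whichever job is due earliest.
        when, label, steps = heapq.heappop(heap)
        timeline.append((when, label))
        if steps > 0:
            # The job is not done: reschedule it one second later.
            heapq.heappush(heap, (when + 1, label, steps - 1))
    return timeline


timeline = simulate([('A', 2, 0), ('B', 1, 1)])
print(timeline)
```

Job B gets its turn at second 1 even though job A has not finished, which is the "does not block later tasks" property from the list above.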

Code

The Task class

You can think of this as asyncio.Task/curio.Task.
class Task:
    def __init__(self, wait_until, coro):
        self.coro = coro
        self.waiting_until = wait_until

    def __eq__(self, other):
        return self.waiting_until == other.waiting_until

    def __lt__(self, other):
        return self.waiting_until < other.waiting_until

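Two special methods are defined here, __eq__ and __lt__, so that Task instances can be compared with == and <. Because we use heapq's min-heap, the "smallest" item comes first. Task instances are ordered by waiting_until (the time at which the task should next resume).

So at some moment the min-heap might look like this: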

e48904987bfa1080faee534d74a2a618.png

TaskA resumes after 0 seconds. Its resume time (waiting_until) is the "smallest", so it is popped and run first, after which TaskB takes its place as the "smallest" element.
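
Here is a small sketch of that ordering in action. The Task class is copied from above; the fixed base time and the use of string labels in place of coroutines are assumptions made to keep the example short. The second half shows the problem the full listing's docstring mentions: plain (time, coroutine) tuples break when two times are equal.

```python
import datetime
import heapq


class Task:
    def __init__(self, wait_until, coro):
        self.coro = coro
        self.waiting_until = wait_until

    def __eq__(self, other):
        return self.waiting_until == other.waiting_until

    def __lt__(self, other):
        return self.waiting_until < other.waiting_until


base = datetime.datetime(2024, 1, 1)  # arbitrary fixed time for the demo

heap = []
for label, delay in [('B', 2), ('A', 0), ('C', 1)]:
    when = base + datetime.timedelta(seconds=delay)
    # A string label stands in for the coroutine here.
    heapq.heappush(heap, Task(when, label))

order = [heapq.heappop(heap).coro for _ in range(3)]
print(order)

# With plain tuples, equal times make Python compare the second items,
# and objects without ordering methods (like coroutines) raise TypeError.
plain = []
try:
    heapq.heappush(plain, (base, object()))
    heapq.heappush(plain, (base, object()))
    tuples_work = True
except TypeError:
    tuples_work = False
print(tuples_work)
```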

The actual tasks

@types.coroutine
def sleep(seconds):
    now = datetime.datetime.now()
    wait_until = now + datetime.timedelta(seconds=seconds)
    actual = yield wait_until
    return actual - now

async def countdown(label, length, *, delay=0):
    print(label, 'waiting', delay, 'seconds before starting countdown')
    delta = await sleep(delay)
    print(label, 'starting after waiting', delta)
    while length:
        print(label, 'T-minus', length)
        waited = await sleep(1)
        length -= 1
    print(label, 'lift-off!')

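After delay seconds, this runs a task that takes length seconds. A brief walkthrough of the code:

One thing to be clear about: calling countdown() returns a coroutine object. Unless you await it (or drive it with send()), nothing actually runs.

The line delta = await sleep(delay) enters the sleep() coroutine and pauses at its first yield. To resume it, something has to "send stuff back" (see the previous article), that is, call send() on the generator. As we will see, that is the event loop's job.

Also, each task should first resume after delay seconds, so the event loop should call send() delay seconds after the program starts.

The while loop that follows repeats the run/pause cycle once per second until length seconds have elapsed, at which point the task is finished.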
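
To see the "nothing runs until you drive it" behaviour directly, we can play the role of the event loop by hand. This is a sketch: sleep() is copied from above, while nap() is a hypothetical one-step coroutine used instead of countdown() to keep the output short.

```python
import datetime
import inspect
import types


@types.coroutine
def sleep(seconds):
    now = datetime.datetime.now()
    wait_until = now + datetime.timedelta(seconds=seconds)
    actual = yield wait_until
    return actual - now


async def nap(seconds):
    # Hypothetical helper: a one-step stand-in for countdown().
    return await sleep(seconds)


coro = nap(1)
state_before = inspect.getcoroutinestate(coro)  # nothing has run yet

# The first send(None) runs the body up to the yield inside sleep().
requested = coro.send(None)
state_after = inspect.getcoroutinestate(coro)

print(state_before, '->', state_after)
print('asked to be resumed at', requested)

# Discard the paused coroutine; a real loop would call send() again later.
coro.close()
```

The value that comes back from send(None) is the wait_until timestamp, which is exactly what the event loop needs in order to know when to resume the task.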

The event loop code

class SleepingLoop:
    def __init__(self, *coros):
        self._new = coros
        self._waiting = []

    def run_until_complete(self):
        for coro in self._new:
            wait_for = coro.send(None)
            heapq.heappush(self._waiting, Task(wait_for, coro))
        while self._waiting:
            now = datetime.datetime.now()
            task = heapq.heappop(self._waiting)
            if now < task.waiting_until:
                delta = task.waiting_until - now
                time.sleep(delta.total_seconds())
                now = datetime.datetime.now()
            try:
                # It's time to resume the coroutine.
                wait_until = task.coro.send(now)
                heapq.heappush(self._waiting, Task(wait_until, task.coro))
            except StopIteration:
                # The coroutine is done.
                pass

def main():
    """Start the event loop, counting down 3 separate launches.

    This is what a user would typically write.
    """
    loop = SleepingLoop(
        countdown('A', 5, delay=0),
        countdown('B', 3, delay=2),
        countdown('C', 4, delay=1)
    )
    start = datetime.datetime.now()
    loop.run_until_complete()
    print('Total elapsed time is', datetime.datetime.now() - start)

if __name__ == '__main__':
    main()

That is all the code there is. Simple, isn't it? Let's walk through it:

for coro in self._new:
    wait_for = coro.send(None)
    heapq.heappush(self._waiting, Task(wait_for, coro))

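wait_for = coro.send(None) is the first send() call on each coroutine object. As described above, this runs the coroutine until it pauses inside sleep() at actual = yield wait_until. The value of wait_until is returned into wait_for; it is the time at which the task should first resume. The Task objects are then pushed onto the min-heap.

Next comes a while loop. Each iteration pops the "smallest" element off the min-heap, that is, the task whose next resume time is soonest. If that time has not arrived yet, the loop calls the blocking time.sleep(). (Blocking is acceptable here because the loop is so simple: we know that no other task needs to resume during this interval.)

Then send() is called on the coroutine. If StopIteration is not raised, a new Task is pushed onto the min-heap. (In other words, the task that was popped earlier gets an updated resume time and is pushed back if it has not finished.)

So when is StopIteration raised? When the while loop in the countdown() coroutine finishes and the coroutine returns, that is, when there are no more yields.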
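
The StopIteration behaviour can be observed without any timing at all. In this sketch, pause() is a hypothetical stand-in for sleep() that yields a placeholder string, and countdown() is trimmed down to return the ticks it saw.

```python
import types


@types.coroutine
def pause():
    # Hypothetical stand-in for sleep(): yield once, then return
    # whatever the driver sends back.
    received = yield 'paused'
    return received


async def countdown(length):
    ticks = []
    while length:
        ticks.append(length)
        await pause()
        length -= 1
    return ticks


coro = countdown(3)
sends = 0
try:
    while True:
        coro.send(None)  # each successful send() ends at the next yield
        sends += 1
except StopIteration as stop:
    # The while loop finished and the coroutine returned;
    # its return value travels on the exception.
    result = stop.value

print(sends, result)
```

Three calls to send() end at a yield, and the fourth raises StopIteration, which is the signal SleepingLoop uses to drop the task instead of pushing it back onto the heap.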

The final code

import datetime
import heapq
import types
import time


class Task:
    """Represent how long a coroutine should wait before starting again.

    Comparison operators are implemented for use by heapq. Two-item
    tuples unfortunately don't work because when the datetime.datetime
    instances are equal, comparison falls to the coroutine and they don't
    implement comparison methods, triggering an exception.

    Think of this as being like asyncio.Task/curio.Task.
    """

    def __init__(self, wait_until, coro):
        self.coro = coro
        self.waiting_until = wait_until

    def __eq__(self, other):
        return self.waiting_until == other.waiting_until

    def __lt__(self, other):
        return self.waiting_until < other.waiting_until


class SleepingLoop:
    """An event loop focused on delaying execution of coroutines.

    Think of this as being like asyncio.BaseEventLoop/curio.Kernel.
    """

    def __init__(self, *coros):
        self._new = coros
        self._waiting = []

    def run_until_complete(self):
        # Start all the coroutines.
        for coro in self._new:
            wait_for = coro.send(None)
            heapq.heappush(self._waiting, Task(wait_for, coro))
        # Keep running until there is no more work to do.
        while self._waiting:
            now = datetime.datetime.now()
            # Get the coroutine with the soonest resumption time.
            task = heapq.heappop(self._waiting)
            if now < task.waiting_until:
                # We're ahead of schedule; wait until it's time to resume.
                delta = task.waiting_until - now
                time.sleep(delta.total_seconds())
                now = datetime.datetime.now()
            try:
                # It's time to resume the coroutine.
                wait_until = task.coro.send(now)
                heapq.heappush(self._waiting, Task(wait_until, task.coro))
            except StopIteration:
                # The coroutine is done.
                pass


@types.coroutine
def sleep(seconds):
    """Pause a coroutine for the specified number of seconds.

    Think of this as being like asyncio.sleep()/curio.sleep().
    """
    now = datetime.datetime.now()
    wait_until = now + datetime.timedelta(seconds=seconds)
    # Make all coroutines on the call stack pause; the need to use `yield`
    # necessitates this be generator-based and not an async-based coroutine.
    actual = yield wait_until
    # Resume the execution stack, sending back how long we actually waited.
    return actual - now


async def countdown(label, length, *, delay=0):
    """Countdown a launch for `length` seconds, waiting `delay` seconds.

    This is what a user would typically write.
    """
    print(label, 'waiting', delay, 'seconds before starting countdown')
    delta = await sleep(delay)
    print(label, 'starting after waiting', delta)
    while length:
        print(label, 'T-minus', length)
        waited = await sleep(1)
        length -= 1
    print(label, 'lift-off!')


def main():
    """Start the event loop, counting down 3 separate launches.

    This is what a user would typically write.
    """
    loop = SleepingLoop(
        countdown('A', 5, delay=0),
        countdown('B', 3, delay=2),
        countdown('C', 4, delay=1)
    )
    start = datetime.datetime.now()
    loop.run_until_complete()
    print('Total elapsed time is', datetime.datetime.now() - start)



if __name__ == '__main__':
    main()

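Summary

Here is how the pieces of this example map onto asyncio:

  • The Task class corresponds to asyncio.Task. The Task in this article uses waiting_until to decide when to resume; asyncio.Task is a future object, and the asyncio event loop runs the appropriate logic when it detects that the future's state has changed.
  • The sleep() function corresponds to asyncio.sleep(). It does not block other coroutines.
  • SleepingLoop corresponds to asyncio.BaseEventLoop. SleepingLoop uses a min-heap; asyncio.BaseEventLoop is more complex, built on future objects, the selectors module, and so on.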
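
For comparison, here is roughly what the same countdown looks like when asyncio provides the loop. This is a sketch, not a line-by-line port: the tick and log parameters are additions made here so that the example finishes quickly and its result can be inspected, and the delays are shortened for the same reason.

```python
import asyncio


async def countdown(label, length, *, delay=0.0, tick=0.01, log=None):
    # `tick` and `log` are hypothetical additions for this sketch.
    await asyncio.sleep(delay)
    while length:
        log.append((label, length))
        await asyncio.sleep(tick)
        length -= 1
    log.append((label, 'lift-off'))


async def main():
    log = []
    # gather() plays the part of passing several coroutines to SleepingLoop.
    await asyncio.gather(
        countdown('A', 2, log=log),
        countdown('B', 1, delay=0.005, log=log),
    )
    return log


events = asyncio.run(main())
print(events)
```

The user-facing code barely changes, which is the point made at the start of the article: async and await are an API, and the event loop behind them can be swapped.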
