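This post is mainly a walkthrough of the article above, together with my own summary.
asyncio is a standard library module introduced in Python 3.4 that provides built-in support for asynchronous IO.
The programming model of asyncio is a message loop. We obtain a reference to an EventLoop from the asyncio module, hand the coroutines we want to run to that EventLoop, and get asynchronous IO as a result.
Example 1: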
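import threading
import asyncio

# Note: @asyncio.coroutine and "yield from" are the older generator-based style.
# The decorator was deprecated in Python 3.8 and removed in 3.11, so this
# example needs an older interpreter; newer code uses async def / await.
@asyncio.coroutine
def hello():
    print('Hello world! (%s)' % threading.current_thread())
    yield from asyncio.sleep(1)
    print('Hello again! (%s)' % threading.current_thread())

loop = asyncio.get_event_loop()
tasks = [hello(), hello(), hello()]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()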
Output:
Hello world! (<_MainThread(MainThread, started 8340)>)
Hello world! (<_MainThread(MainThread, started 8340)>)
Hello world! (<_MainThread(MainThread, started 8340)>)
Hello again! (<_MainThread(MainThread, started 8340)>)
Hello again! (<_MainThread(MainThread, started 8340)>)
Hello again! (<_MainThread(MainThread, started 8340)>)
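Notice that the three "Hello world!" lines are printed together, then the three "Hello again!" lines are printed together, and all of them run on the same thread, MainThread.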
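For newer interpreters, the same behaviour can be sketched with async def / await. This is only a sketch, assuming Python 3.7+ (for asyncio.run); the idx and log parameters are my own additions so the order of steps can be inspected rather than just printed, and the sleep is shortened to 0.1 s.

```python
import asyncio
import threading

async def hello(idx, log):
    # Record each step together with the name of the thread that ran it.
    log.append(("world", idx, threading.current_thread().name))
    await asyncio.sleep(0.1)  # hands control back to the event loop
    log.append(("again", idx, threading.current_thread().name))

async def main():
    log = []
    # gather wraps each coroutine in a task and runs them concurrently.
    await asyncio.gather(*(hello(i, log) for i in range(3)))
    return log

log = asyncio.run(main())
for entry in log:
    print(entry)
```

All three "world" entries are recorded before any "again" entry, and every entry carries the same thread name, which matches the output above.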
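1. @asyncio.coroutine marks a generator as a coroutine, which we then hand to the EventLoop to run.
2. loop = asyncio.get_event_loop() obtains an event loop to run the coroutines on.
3. tasks is the collection of coroutines to run.
4. loop.run_until_complete(asyncio.wait(tasks)) adds tasks to the loop and runs until they have all finished. Pay attention to what asyncio.wait does here, because there is an important point behind it:
asyncio.wait and asyncio.gather both accept multiple futures or coroutines. The differences: asyncio.gather takes them as positional arguments, wraps any coroutine that is not already a task in a future, and returns the results in input order; asyncio.wait takes an iterable and returns two sets of futures, (done, pending).
In the Python versions this example targets, loop.run_until_complete(asyncio.wait(tasks)) also converts the coroutines in the tasks list to futures before waiting on them. Passing bare coroutines to asyncio.wait was deprecated in Python 3.8 and is no longer allowed from Python 3.11 on.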
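The difference between the two calls can be sketched as follows. This is a minimal illustration, assuming Python 3.7+; square is a made-up coroutine used only for this comparison.

```python
import asyncio

async def square(x):
    await asyncio.sleep(0)  # give other tasks a chance to run
    return x * x

async def main():
    # gather: bare coroutines as positional arguments, results in input order.
    results = await asyncio.gather(square(1), square(2), square(3))

    # wait: an iterable of tasks/futures, returns two sets (done, pending).
    tasks = [asyncio.ensure_future(square(x)) for x in (1, 2, 3)]
    done, pending = await asyncio.wait(tasks)
    return results, done, pending

results, done, pending = asyncio.run(main())
print(results)
print(sorted(t.result() for t in done), len(pending))
```

With gather you get the return values directly; with wait you get the finished tasks back and have to call result() on each of them, and the done set carries no ordering.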
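What ensure_future does: ensure_future(b()), for example, wraps the coroutine b() in a task and schedules it on the event loop. Once the event loop starts, the tasks run in the order in which they were created.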
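#!/usr/bin/env python3
import asyncio

async def a():
    print("a")

async def b():
    print("b")

# Create the loop first and pass it explicitly, so ensure_future does not
# rely on an implicitly created current loop (deprecated in newer Pythons).
loop = asyncio.new_event_loop()
asyncio.ensure_future(a(), loop=loop)
bb = asyncio.ensure_future(b(), loop=loop)
loop.run_until_complete(bb)  # only task bb is passed in, yet task a runs as well,
# and it runs first: "a" is printed, then "b"
loop.close()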
So ensure_future(a()) simply schedules coroutine a on the event loop directly.
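The same scheduling order can be checked from inside a running loop. The sketch below assumes Python 3.7+ and records the order in a list (the order variable is my own addition) instead of printing from each coroutine.

```python
import asyncio

order = []

async def a():
    order.append("a")

async def b():
    order.append("b")

async def main():
    asyncio.ensure_future(a())       # scheduled first, never awaited directly
    bb = asyncio.ensure_future(b())  # scheduled second
    await bb                         # only bb is awaited

asyncio.run(main())
print(order)
```

Even though main only awaits bb, the list ends up as ['a', 'b']: task a was queued on the loop before task b, so it gets its turn first.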
Example 2:
import asyncio

async def wget(host):
    print('wget %s...' % host)
    connect = asyncio.open_connection(host, 80)
    reader, writer = await connect
    header = 'GET / HTTP/1.0\r\nHOST:%s\r\n\r\n' % host
    writer.write(header.encode('utf-8'))  # send the request
    await writer.drain()
    while True:
        line = await reader.readline()
        # A blank line ends the headers; b'' means the server closed the connection.
        if line in (b'\r\n', b''):
            break
        print('%s header > %s' % (host, line.decode('utf-8').rstrip()))
    writer.close()

loop = asyncio.get_event_loop()
tasks = [wget(host) for host in ['www.sina.com.cn', 'www.sohu.com', 'www.163.com']]
# Note: on Python 3.11+ asyncio.wait no longer accepts bare coroutines;
# wrap each one in a task (or use asyncio.gather) on newer versions.
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
Output:
wget www.163.com...
wget www.sohu.com...
wget www.sina.com.cn...
www.sohu.com header > HTTP/1.1 200 OK
www.sohu.com header > Content-Type: text/html;charset=UTF-8
www.sohu.com header > Connection: close
www.sohu.com header > Server: nginx
www.sohu.com header > Date: Thu, 24 Jan 2019 14:10:17 GMT
www.sohu.com header > Cache-Control: max-age=60
www.sohu.com header > X-From-Sohu: X-SRC-Cached
www.sohu.com header > Content-Encoding: gzip
www.sohu.com header > FSS-Cache: HIT from 3287345.4532539.4668740
www.sohu.com header > FSS-Proxy: Powered by 3025197.4008247.4406588
www.163.com header > HTTP/1.0 302 Moved Temporarily
www.163.com header > Server: Cdn Cache Server V2.0
www.163.com header > Date: Thu, 24 Jan 2019 14:10:42 GMT
www.163.com header > Content-Length: 0
www.163.com header > Location: http://www.163.com/special/0077jt/error_isp.html
www.163.com header > X-Via: 1.0 PSzjnbyd2ew126:8 (Cdn Cache Server V2.0)
www.163.com header > Connection: close
www.sina.com.cn header > HTTP/1.1 302 Moved Temporarily
www.sina.com.cn header > Server: nginx
www.sina.com.cn header > Date: Thu, 24 Jan 2019 14:10:42 GMT
www.sina.com.cn header > Content-Type: text/html
www.sina.com.cn header > Content-Length: 154
www.sina.com.cn header > Connection: close
www.sina.com.cn header > Location: https://www.sina.com.cn/
www.sina.com.cn header > X-Via-CDN: f=edge,s=cmcc.hangzhou.ha2ts4.95.nb.sinaedge.com,c=117.148.127.211;
www.sina.com.cn header > X-Via-Edge: 1548339042266d37f94757eae0d705745b656
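Notes:
1. tasks=[wget(host) for host in['www.sina.com.cn', 'www.sohu.com', 'www.163.com']] is our list of tasks, one wget coroutine per host.
2. asyncio.open_connection gives us a reader and a writer: reader receives the data the server sends back, and writer sends the request to the server.
3. writer.drain(): my understanding is that its job here is to prevent IO blocking. The official documentation describes it as:
This is a flow control method that interacts with the underlying IO write buffer. When the size of the buffer reaches the high watermark, drain() blocks until the size of the buffer is drained down to the low watermark and writing can be resumed. When there is nothing to wait for, the drain() returns immediately.
4. line=await reader.readline() reads the response one line at a time.
5. rstrip() strips trailing whitespace from the string, which here removes the line ending at the end of each header line.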
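The notes above can be tried without touching the network. The following is a sketch only, assuming Python 3.7+ and that binding a socket on 127.0.0.1 is allowed; handle, fetch_headers and the "Server: demo" response are hypothetical names and data invented for this illustration, not part of the original example.

```python
import asyncio

async def handle(reader, writer):
    # Toy server: read the request headers up to the blank line, then reply.
    while True:
        line = await reader.readline()
        if line in (b"\r\n", b""):
            break
    writer.write(b"HTTP/1.0 200 OK\r\nServer: demo\r\n\r\nbody")
    await writer.drain()
    writer.close()

async def fetch_headers(host, port):
    reader, writer = await asyncio.open_connection(host, port)
    request = "GET / HTTP/1.0\r\nHost: %s\r\n\r\n" % host
    writer.write(request.encode("utf-8"))
    await writer.drain()  # wait here if the write buffer is above its high watermark
    headers = []
    while True:
        line = await reader.readline()  # one line per call
        if line in (b"\r\n", b""):      # blank line ends the headers, b"" is EOF
            break
        headers.append(line.decode("utf-8").rstrip())  # drop the trailing "\r\n"
    writer.close()
    return headers

async def main():
    # Port 0 lets the OS pick a free port for the toy server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        return await fetch_headers("127.0.0.1", port)
    finally:
        server.close()

headers = asyncio.run(main())
print(headers)
```

The client side mirrors wget from Example 2: write the request, drain, then call readline in a loop until the blank line, stripping the line ending from each header with rstrip().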