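Discussions of Python coroutines tend to revolve around a handful of key terms:
- Blocking
- Non-blocking
- Synchronous
- Asynchronous
- Concurrency
- Parallelism
- Coroutine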
- asyncio
- aiohttp
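To make a few of these terms concrete, here is a minimal sketch that needs no network access. It assumes a hypothetical helper `fake_io` that stands in for an I/O wait using `asyncio.sleep`, and compares awaiting coroutines one after another with scheduling them together via `asyncio.gather`. Everything runs on a single thread, so this illustrates concurrency rather than parallelism.

```python
import asyncio
import time


async def fake_io(delay):
    # Hypothetical stand-in for an I/O wait. asyncio.sleep is non-blocking:
    # it hands control back to the event loop while waiting.
    await asyncio.sleep(delay)
    return delay


async def run_sequential(delays):
    # Await each coroutine in turn; total time is roughly the sum of the delays.
    return [await fake_io(d) for d in delays]


async def run_concurrent(delays):
    # Schedule all coroutines together; total time is roughly the longest delay.
    return await asyncio.gather(*(fake_io(d) for d in delays))


def timed(coro):
    # Run a coroutine to completion and measure the wall-clock time it took.
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


delays = [0.1, 0.1, 0.1]
seq_result, seq_elapsed = timed(run_sequential(delays))
con_result, con_elapsed = timed(run_concurrent(delays))
print(f"sequential: {seq_elapsed:.2f}s, concurrent: {con_elapsed:.2f}s")
```

Replacing `asyncio.sleep` with the blocking `time.sleep` inside `fake_io` would stall the whole event loop, and the concurrent version would lose its advantage.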
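For the concepts themselves, I think the following two blog posts explain them well, so I won't repost them here and will just link to them:
http://python.jobbole.com/87310/
http://python.jobbole.com/88291/
https://aiohttp.readthedocs.io/en/stable/client_quickstart.html  # this one is the aiohttp documentation
The rest of this post can be treated as a set of exercises.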
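- Define a coroutine function that requests qq.com (Tencent) once
import asyncio
import aiohttp
async def get_page(url):
    # Open a client session, send a GET request, and print the decoded body
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            # Decode the response explicitly as GB18030
            page = await resp.text(encoding='GB18030')
            print(page)
url = 'https://www.qq.com'
# Drive the coroutine to completion on the event loop
loop = asyncio.get_event_loop()
loop.run_until_complete(get_page(url))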
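The exercise above drives the coroutine with an explicit event loop. On Python 3.7 and later, `asyncio.run` is the commonly recommended entry point and manages the loop for you. The sketch below compares the two styles; `get_page_stub` is a hypothetical stand-in for the aiohttp request so the example runs without network access.

```python
import asyncio


async def get_page_stub(url):
    # Hypothetical stand-in for the aiohttp request: no network access.
    await asyncio.sleep(0)
    return f"<html><title>stub for {url}</title></html>"


url = "https://www.qq.com"

# Style 1: create and close an event loop explicitly.
loop = asyncio.new_event_loop()
try:
    page_from_loop = loop.run_until_complete(get_page_stub(url))
finally:
    loop.close()

# Style 2: asyncio.run creates a fresh loop, runs the coroutine, then closes it.
page_from_run = asyncio.run(get_page_stub(url))

print(page_from_loop == page_from_run)
```

Both styles return the coroutine's result; `asyncio.run` simply removes the loop bookkeeping.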
- Attach a callback to the coroutine to extract the title of the qq.com page
import asyncio
import aiohttp
import re
async def get_page(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            page = await resp.text(encoding='GB18030')
            return page
def callback(future):
    # Called once the task finishes; future.result() is the page text
    pattern = r'<title>(.*?)</title>'
    item = re.findall(pattern, future.result())
    print(item)
url = 'https://www.qq.com'
loop = asyncio.get_event_loop()
# Wrap the coroutine in a task so a done-callback can be attached
task = asyncio.ensure_future(get_page(url))
task.add_done_callback(callback)
loop.run_until_complete(task)
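The callback mechanism itself can be tried without touching the network. The following is a minimal sketch under stated assumptions: `get_page_stub` and its HTML string are hypothetical, and the callback collects titles into a list instead of printing them so the result is easy to inspect.

```python
import asyncio
import re

TITLE_PATTERN = re.compile(r"<title>(.*?)</title>", re.S)
titles = []


async def get_page_stub():
    # Hypothetical stand-in for the aiohttp request, returning fixed HTML.
    await asyncio.sleep(0)
    return "<html><head><title>Example Title</title></head></html>"


def callback(future):
    # future.result() returns the coroutine's return value,
    # or re-raises the exception if the task failed.
    titles.extend(TITLE_PATTERN.findall(future.result()))


async def main():
    task = asyncio.ensure_future(get_page_stub())
    task.add_done_callback(callback)
    await task
    # Done callbacks are scheduled on the loop rather than called inline;
    # one extra yield gives the callback a chance to run before main returns.
    await asyncio.sleep(0)


asyncio.run(main())
print(titles)  # ['Example Title']
```

Because the callback receives the finished task, any exception raised inside the coroutine would surface at `future.result()`, so a real callback may want to check `future.exception()` first.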
- Request qq.com 100 times concurrently; the average elapsed time was about 2.349 seconds
import asyncio
import aiohttp
import re
import time
async def get_page(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            page = await resp.text(encoding='GB18030')
            return page
def callback(future):
    pattern = r'<title>(.*?)</title>'
    item = re.findall(pattern, future.result())
    print(item)
url = 'https://www.qq.com'
loop = asyncio.get_event_loop()
# Create 100 tasks up front so the requests can overlap on the event loop
tasks = [asyncio.ensure_future(get_page(url)) for _ in range(100)]
for task in tasks:
    task.add_done_callback(callback)
start = time