Liao Xuefeng's official website: Python automation notes
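The @asyncio.coroutine decorator provided by asyncio marks a generator as a coroutine; inside that coroutine, yield from calls another coroutine to perform an asynchronous operation.
To simplify asynchronous IO and make it easier to recognize, Python 3.5 introduced the new async and await syntax, which makes coroutine code more concise and readable.
Note that async and await are new syntax for coroutines. Switching to them takes two simple replacements:
- replace @asyncio.coroutine with async;
- replace yield from with await.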
# Python 3.4 style (@asyncio.coroutine was deprecated in 3.8 and removed in 3.11)
import asyncio

@asyncio.coroutine
def hello():
    print("Hello world!")
    r = yield from asyncio.sleep(1)
    print("Hello again!")

# Python 3.5 and later
# Rewritten with the new syntax:
async def hello():
    print("Hello world!")
    r = await asyncio.sleep(1)
    print("Hello again!")
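As a quick check of the new syntax, the sketch below (not part of the original notes) shows that async def produces a coroutine function, and that calling it only builds a coroutine object; the body does not run until an event loop drives it. The shortened sleep and the "done" return value are assumptions added for illustration.

```python
import asyncio
import inspect

async def hello():
    print("Hello world!")
    await asyncio.sleep(0.01)  # short delay chosen only to keep the demo fast
    print("Hello again!")
    return "done"  # illustrative return value, not in the original example

# async def defines a coroutine function ...
is_coro_func = inspect.iscoroutinefunction(hello)

# ... and calling it builds a coroutine object without running the body.
coro = hello()
is_coro_obj = inspect.iscoroutine(coro)

# The body only executes once an event loop runs the coroutine.
result = asyncio.run(coro)
```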
How to write asynchronous functions in Python
import asyncio

async def async_func():
    print("Start async function")
    await asyncio.sleep(1)
    print("Finish async function")

asyncio.run(async_func())
import asyncio

async def task(name, delay):
    print(f"Executing task: {name}")
    await asyncio.sleep(delay)
    print(f"Task {name} finished")

# Get the event loop
loop = asyncio.get_event_loop()
# Build the list of tasks
tasks = [
    task("Task 1", 2),
    task("Task 2", 1),
    task("Task 3", 3)
]
# Run the tasks concurrently
loop.run_until_complete(asyncio.gather(*tasks))
# Close the event loop
loop.close()
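To make the effect of running the tasks concurrently visible, here is a minimal sketch using asyncio.run and asyncio.gather with timing added. The delays are scaled down to fractions of a second and the return values are assumptions for illustration. The total elapsed time should be close to the longest delay rather than the sum of all delays, and gather returns the results in the order the coroutines were passed.

```python
import asyncio
import time

async def task(name, delay):
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.perf_counter()
    # All three coroutines wait at the same time, not one after another.
    results = await asyncio.gather(
        task("Task 1", 0.2),
        task("Task 2", 0.1),
        task("Task 3", 0.3),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))
```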
The most commonly used tools are the asyncio library and the aiofiles module.
import asyncio
import aiofiles

async def async_function(x):
    async with aiofiles.open('output.txt', 'a') as f:
        await f.write(str(x) + '\n')
        await f.write('done')

async def main():
    tasks = [asyncio.create_task(async_function(i)) for i in range(1, 11)]
    await asyncio.gather(*tasks)

asyncio.run(main())
The httpx request library: asynchronous requests (Python 3.10 async programming)
import httpx
import asyncio

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36'
}

async def function():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://www.baidu.com', headers=headers)
        print(response.text)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(function())
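Python 3.10 async programming: asynchronous crawling with asyncio
Two httpx pitfalls (httpx.ReadTimeout; SSL: CERTIFICATE_VERIFY_FAILED). The client settings below sidestep both by turning off certificate verification (verify=False) and the timeout (timeout=None):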
client = httpx.AsyncClient(verify=False, timeout=None)
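With the timeout disabled entirely, a stalled request can wait forever. As a library-independent sketch of the alternative (keep a timeout and handle it), the example below uses asyncio.wait_for from the standard library. This is not httpx's own timeout mechanism; slow_fetch and fetch_with_timeout are hypothetical names, and the sleep stands in for a real network call.

```python
import asyncio

async def slow_fetch(delay):
    # Hypothetical stand-in for a network request that takes `delay` seconds.
    await asyncio.sleep(delay)
    return "response body"

async def fetch_with_timeout(delay, timeout):
    try:
        # Cancel the request if it does not finish within `timeout` seconds.
        return await asyncio.wait_for(slow_fetch(delay), timeout=timeout)
    except asyncio.TimeoutError:
        # Report the timeout as None instead of hanging or crashing the run.
        return None

fast = asyncio.run(fetch_with_timeout(delay=0.01, timeout=0.5))
slow = asyncio.run(fetch_with_timeout(delay=0.5, timeout=0.05))
print(fast, slow)
```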
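Why asyncio.run() raises RuntimeError: Event loop is closed, and how to work around it
Error seen: raise RuntimeError('Event loop is closed') RuntimeError: Event loop is closed
# main_coroutine stands for the name of your main coroutine function
asyncio.run(main_coroutine())  # the call that reported the error
loop = asyncio.get_event_loop()  # using the loop directly avoids the error
loop.run_until_complete(main_coroutine())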
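The difference between the two styles can be observed directly. The sketch below (an illustration, not from the original notes) shows that asyncio.run() closes its event loop before returning, whereas a manually managed loop stays open until close() is called. One plausible reading of the error, stated here as an assumption, is that some cleanup code still tries to use the loop after asyncio.run() has already closed it. The name main_coroutine is a placeholder.

```python
import asyncio

async def main_coroutine():
    # Return the loop that is running this coroutine so it can be inspected later.
    return asyncio.get_running_loop()

# asyncio.run() creates a fresh loop and closes it before returning.
loop_used_by_run = asyncio.run(main_coroutine())
closed_after_run = loop_used_by_run.is_closed()

# A manually managed loop stays open until close() is called explicitly.
loop = asyncio.new_event_loop()
same_loop = loop.run_until_complete(main_coroutine())
closed_after_manual = loop.is_closed()
loop.close()

print(closed_after_run, closed_after_manual)
```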
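The ahttp request library: asynchronous, requests-style HTTP
Background: a large number of API endpoints need to be called in parallel.
ahttp: every HTTP request in ahttp runs as a coroutine. While a request is waiting on IO, the CPU is free to work on compute-bound tasks, which can save a great deal of waiting time.
ahttp is used in much the same way as requests, except that requests is synchronous and ahttp is asynchronous. The practical difference is that requests sends a request immediately, whereas with ahttp you first build the requests and then "run" them in a separate step.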
import ahttp

urls1 = [f"https://movie.douban.com/top250?start={i*25}" for i in range(2)]
urls = [
    'http://www.heroku.com',
    'http://python-tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://fakedomain/',
    'http://kennethreitz.com'
]
reqs = [ahttp.post(i) for i in urls]
resqs = ahttp.run(reqs, order=True, pool=3)  # order=True keeps request order; pool caps the number of concurrent requests
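Compared with building a list of requests directly with ahttp as above, requests made through a session are faster and share cookies, because a session keeps a persistent connection. Because the requests are asynchronous, the responses do not necessarily come back in the same order as reqs; to get them in request order, pass the order argument to ahttp.run.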
import ahttp

urls1 = [f"https://movie.douban.com/top250?start={i*25}" for i in range(2)]
urls = [
    'http://www.heroku.com',
    'http://python-tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://fakedomain/',
    'http://kennethreitz.com'
]
sess = ahttp.Session()  # a session keeps a persistent connection, so it is faster and shares cookies
reqs = [sess.post(i) for i in urls]
resqs = ahttp.run(reqs, order=True, pool=3)  # order=True keeps request order; pool caps the number of concurrent requests
print('all', resqs)
print('first', resqs[0])
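The ordering issue described above is not specific to ahttp. As a standard-library sketch of the same idea (it does not show how ahttp implements order=True), the example below contrasts asyncio.as_completed, which yields results in completion order, with asyncio.gather, which returns them in request order. fake_request and its delays are hypothetical stand-ins for real HTTP calls.

```python
import asyncio

async def fake_request(index, delay):
    # Hypothetical stand-in for an HTTP request; returns its own index.
    await asyncio.sleep(delay)
    return index

async def main():
    delays = [0.15, 0.05, 0.1]  # request 0 is the slowest, request 1 the fastest

    # Completion order: results arrive as each request finishes.
    coros = [fake_request(i, d) for i, d in enumerate(delays)]
    completion_order = [await fut for fut in asyncio.as_completed(coros)]

    # Request order: gather returns results in the order the awaitables were passed.
    request_order = await asyncio.gather(
        *(fake_request(i, d) for i, d in enumerate(delays))
    )
    return completion_order, request_order

completion_order, request_order = asyncio.run(main())
print(completion_order, request_order)
```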
The aiohttp request library: asynchronous, requests-style HTTP
import asyncio
import aiohttp

async def get_html(semaphore, session, url, delay=6):
    # async with releases the semaphore even if the request raises
    async with semaphore:
        async with session.get(url) as res:
            html = await res.text()
        # Calling asyncio.sleep(delay) without await does not pause and produces:
        # RuntimeWarning: coroutine 'sleep' was never awaited
        # RuntimeWarning: Enable tracemalloc to get the object allocation traceback
        await asyncio.sleep(delay)  # sleep is a coroutine and must be awaited
    return html

async def main():
    categories = {
        "makeup": "https://www.sephora.com/shop/"
    }
    semaphore = asyncio.Semaphore(value=1)
    tasks = []
    # The session picks up the running loop on its own, so no loop argument is passed
    async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(ssl=False)) as session:
        for category, url in categories.items():
            # Get HTML of all pages
            tasks.append(get_html(semaphore, session, url))
        res = await asyncio.gather(*tasks)
    return res

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
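The role of the semaphore is easier to see with more than one request. Below is a minimal standard-library sketch, under the assumption that a short sleep stands in for the real HTTP call; worker, state, limit, and n_workers are hypothetical names. It records the peak number of workers inside the semaphore at the same time, which should never exceed the limit.

```python
import asyncio

async def worker(semaphore, state):
    # At most `limit` workers can be inside this block at once.
    async with semaphore:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for the real request
        state["active"] -= 1

async def main(limit, n_workers):
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(worker(semaphore, state) for _ in range(n_workers)))
    return state["peak"]

peak = asyncio.run(main(limit=3, n_workers=10))
print(peak)
```

With Semaphore(value=1), as in the aiohttp example, the requests run strictly one at a time, which together with the delay works as a simple rate limit.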
import asyncio
import aiohttp

async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('http://httpbin.org/get') as resp:
            print(resp.status)
            print(await resp.text())

if __name__ == '__main__':
    # asyncio.run() requires Python 3.7+; it is the entry point and here runs the event loop in debug mode
    asyncio.run(main(), debug=True)
    # Equivalent for Python 3.6 and earlier (use instead of asyncio.run, not in addition to it):
    # event_loop = asyncio.get_event_loop()
    # results = event_loop.run_until_complete(asyncio.gather(main()))
    # event_loop.close()