The following code fetches web pages asynchronously:
import asyncio
import aiohttp

async def fetch_page(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            return response.status

tasks = [fetch_page('http://books.toscrape.com') for i in range(50)]
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(*tasks))
With Python 3.10.5, running the code above produces the following warnings:
DeprecationWarning: There is no current event loop
  loop = asyncio.get_event_loop()
DeprecationWarning: There is no current event loop
  loop.run_until_complete(asyncio.gather(*tasks))
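In short, neither asyncio.get_event_loop() nor asyncio.gather(*tasks) can be called this way: both are invoked at module level, where no event loop is running, and Python 3.10 deprecates that usage.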
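The "no current event loop" condition can be observed directly. This is a minimal sketch using only the standard library; loop_is_running and check_inside are illustrative names that do not appear in the original code:

```python
import asyncio


def loop_is_running() -> bool:
    """Return True if an event loop is running in the current thread."""
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        # Raised when no event loop is running.
        return False


async def check_inside() -> bool:
    # Called from within a coroutine, so asyncio.run() has already started a loop.
    return loop_is_running()


outside = loop_is_running()            # module level: no loop is running yet
inside = asyncio.run(check_inside())   # inside asyncio.run(): a loop is running
print(outside, inside)
```

Code at module level runs before any loop exists, which is the situation the warnings describe. Inside a coroutine driven by asyncio.run(), a loop is always available.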
After rewriting the code to use asyncio.run(), it tested OK:
import asyncio
import aiohttp

async def fetch_page(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            print(response.status)

async def main():
    tasks = [fetch_page('https://quotes.toscrape.com/') for i in range(50)]
    await asyncio.gather(*tasks)  # argument unpacking: (tasks[0], tasks[1], ... tasks[49])

asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
asyncio.run(main())
The following statement must be included:
# important
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
Otherwise, on Windows, requests to https sites fail with the following error:
RuntimeError: Event loop is closed
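One possible way to keep the script portable is to apply the selector policy only on Windows, since asyncio.WindowsSelectorEventLoopPolicy is not defined on other platforms. The sketch below is one approach under that assumption, not the only one. use_selector_loop_on_windows and fake_fetch are hypothetical names, and fake_fetch stands in for fetch_page so the example runs without aiohttp or network access:

```python
import asyncio
import sys


def use_selector_loop_on_windows() -> bool:
    """Switch to the selector event loop policy, but only on Windows.

    Returns True if the policy was changed, False on other platforms.
    """
    if sys.platform.startswith("win"):
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


async def fake_fetch(i: int) -> int:
    # Hypothetical stand-in for fetch_page(): yields to the loop once,
    # then returns a fixed status code instead of making a request.
    await asyncio.sleep(0)
    return 200


async def main(n: int = 5) -> list:
    # gather() is awaited inside a coroutine, so a loop is already running.
    return await asyncio.gather(*(fake_fetch(i) for i in range(n)))


use_selector_loop_on_windows()
statuses = asyncio.run(main())
print(statuses)
```

Replacing fake_fetch with the real fetch_page leaves the structure unchanged: the policy is set once, before asyncio.run() creates the loop.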