asyncio concurrency: aiohttp.TCPConnector (with the limit argument) vs asyncio.Semaphore for limiting the number of concurrent connections...

Latest recommended article published on 2023-02-11 11:29:33

bug射击师

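Tags: asyncio concurrency

Copyright notice: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/weixin_34463209/article/details/113375331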

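This article looks at two ways to limit the number of concurrent connections when using aiohttp: setting a limit on the TCPConnector and using a Semaphore. The author asks whether the two approaches are interchangeable, how they compare in performance, and how to handle retries on error and the processing of results, and provides example code that includes both options.

Summary generated by CSDN using AI technology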

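I wanted to learn the new Python async/await syntax, and more specifically the asyncio module, by writing a simple script that downloads multiple resources at once.

But now I'm stuck.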

While researching I came across two options to limit the number of concurrent requests:

Passing an aiohttp.TCPConnector (with the limit argument) to an aiohttp.ClientSession, or

using an asyncio.Semaphore.

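Is there a preferred option, or can they be used interchangeably if all you want is to limit the number of concurrent connections? Are they (roughly) equal in terms of performance?

Both also seem to have a default value of 100 concurrent connections/operations. If I use only a Semaphore with a limit of, say, 500, will the aiohttp internals implicitly lock me down to 100 concurrent connections?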
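On the defaults: aiohttp.TCPConnector does default to limit=100, whereas asyncio.Semaphore() starts at a value of 1 unless told otherwise. Because the connector limit applies regardless of any semaphore wrapped around the request, a semaphore of 500 on top of a default connector would still leave at most 100 connections open at once. The sketch below illustrates that stacking effect with the standard library only. It rests on assumptions: `app_limit` and `pool_limit` are hypothetical names, the inner semaphore is merely a stand-in for the connector's pool (it is not aiohttp), and `asyncio.sleep` replaces the real request.

```python
import asyncio


async def peak_concurrency(app_limit, pool_limit, n_jobs=50):
    """Run n_jobs fake requests behind two stacked limiters and return
    the highest number that were in flight at the same time."""
    app_sem = asyncio.Semaphore(app_limit)    # the semaphore in user code
    pool_sem = asyncio.Semaphore(pool_limit)  # stand-in for the connector pool
    in_flight = 0
    peak = 0

    async def fake_request():
        nonlocal in_flight, peak
        async with app_sem:
            async with pool_sem:
                in_flight += 1
                peak = max(peak, in_flight)
                await asyncio.sleep(0.01)  # pretend to wait on the network
                in_flight -= 1

    await asyncio.gather(*(fake_request() for _ in range(n_jobs)))
    return peak


if __name__ == "__main__":
    # A generous outer limit does not lift the tighter inner one.
    print(asyncio.run(peak_concurrency(app_limit=50, pool_limit=10)))  # 10
    # A tighter outer limit wins when it is the smaller of the two.
    print(asyncio.run(peak_concurrency(app_limit=5, pool_limit=10)))   # 5
```

In both runs the effective limit is the smaller of the two values, so when both mechanisms are in play it is worth setting them deliberately rather than relying on either default.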

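This is all very new and unclear to me. Please feel free to point out any misunderstandings on my part or flaws in my code.

The code at the end of this post currently includes both options (which one should I remove?).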

Bonus Questions:

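How do I handle coros that raised an error (ideally retrying x times)?

What is the best way to save the returned data (i.e. notify my DataHandler) as soon as a coro has finished? I don't want everything to be saved at the very end, because I could start processing the results as early as possible.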
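For the first bonus question, one common pattern is to wrap the coroutine call in a small retry helper that pauses a little longer after each failure. The following is a minimal sketch using only the standard library, not a definitive implementation: `retry`, `attempts`, `base_delay` and the `flaky` stand-in are hypothetical names introduced for illustration. In the downloader, the wrapped call would be something like `self.fetch(session, url)`, with the exception tuple narrowed to, for example, `aiohttp.ClientError` and `asyncio.TimeoutError`.

```python
import asyncio


async def retry(make_coro, attempts=3, base_delay=0.01, exceptions=(Exception,)):
    """Await make_coro() up to `attempts` times, doubling the pause after
    each failure. Re-raises the last error if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            # A coroutine object can only be awaited once, so a fresh one
            # is created for every attempt.
            return await make_coro()
        except exceptions:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def demo():
    calls = 0

    async def flaky():
        # Hypothetical stand-in for a request: fails twice, then succeeds.
        nonlocal calls
        calls += 1
        if calls < 3:
            raise ConnectionError("temporary failure")
        return b"payload"

    data = await retry(flaky, attempts=5)
    return data, calls


if __name__ == "__main__":
    print(asyncio.run(demo()))  # (b'payload', 3)
```

If the wrapped coroutine acquires the semaphore itself, as `fetch` does, then calling `retry(lambda: self.fetch(session, url))` keeps the pause between attempts outside the semaphore, so a request waiting to be retried does not occupy a slot.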

import asyncio

import uvloop
from aiohttp import BasicAuth, ClientSession, TCPConnector
from tqdm import tqdm


# You can ignore this class
# (the original derives from a DataHandler base class that is not shown in the post)
class DummyDataHandler:
    """Takes data and stores it somewhere"""

    def take(self, origin_url, data):
        return True

    def done(self):
        return None


class AsyncDownloader:
    def __init__(self, concurrent_connections=100, silent=False, data_handler=None, loop_policy=None):
        self.concurrent_connections = concurrent_connections
        self.silent = silent
        self.data_handler = data_handler or DummyDataHandler()
        self.sending_bar = None
        self.receiving_bar = None

        asyncio.set_event_loop_policy(loop_policy or uvloop.EventLoopPolicy())
        # Create the loop explicitly; get_event_loop() without a running loop
        # is deprecated in recent Python versions.
        self.loop = asyncio.new_event_loop()
        asyncio.set_event_loop(self.loop)
        self.semaphore = asyncio.Semaphore(concurrent_connections)

    async def fetch(self, session, url):
        # This is option 1: The semaphore, limiting the number of concurrent coros,
        # thereby limiting the number of concurrent requests.
        # ("with (await self.semaphore)" no longer works on current Python.)
        async with self.semaphore:
            async with session.get(url) as response:
                # Bonus Question 1: What is the best way to retry a request that failed?
                self.sending_bar.update(1)
                resp = await response.read()
                # No explicit response.release() is needed: leaving the
                # "async with" block releases the connection.
                if not self.silent:
                    self.receiving_bar.update(1)
                return resp

    async def batch_download(self, urls, auth=None):
        # This is option 2: Limiting the number of open connections directly via the TCPConnector
        conn = TCPConnector(limit=self.concurrent_connections, keepalive_timeout=60)
        async with ClientSession(connector=conn, auth=auth) as session:
            # gather() schedules the coroutines itself, so ensure_future() is not needed
            await asyncio.gather(*(self.download_and_save(session, url) for url in urls))

    async def download_and_save(self, session, url):
        content = await self.fetch(session, url)
        # Bonus Question 2: This is blocking, I know. Should this be wrapped in another coro
        # or should I use something like asyncio.as_completed in the download function?
        self.data_handler.take(origin_url=url, data=content)

    def download(self, urls, auth=None):
        if isinstance(auth, tuple):
            auth = BasicAuth(*auth)
        print('Running on concurrency level {}'.format(self.concurrent_connections))
        self.sending_bar = tqdm(urls, total=len(urls), desc='Sent ', unit='requests')
        self.sending_bar.update(0)
        self.receiving_bar = tqdm(urls, total=len(urls), desc='Received', unit='requests')
        self.receiving_bar.update(0)

        tasks = self.batch_download(urls, auth)
        self.loop.run_until_complete(tasks)
        return self.data_handler.done()


### call like so ###

URL_PATTERN = 'https://www.example.com/{}.html'


def gen_url(lower=0, upper=None):
    for i in range(lower, upper):
        yield URL_PATTERN.format(i)


ad = AsyncDownloader(concurrent_connections=30)
data = ad.download(list(gen_url(upper=1000)))
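For Bonus Question 2, asyncio.as_completed yields awaitables in the order in which they finish, so each result can be handed to the data handler right away instead of after the whole batch. The sketch below is a stand-alone illustration under stated assumptions: `fake_fetch` and its delays replace the real `fetch`, `download_all` is a hypothetical name, and `handle` plays the role of `data_handler.take`. Each job returns its URL together with the data because as_completed does not say which input produced a given result.

```python
import asyncio


async def fake_fetch(url, delay):
    """Hypothetical stand-in for the real download coroutine."""
    await asyncio.sleep(delay)
    return url, "body of {}".format(url)


async def download_all(jobs, handle):
    """Start every job, then pass each result to `handle` as soon as it is ready."""
    tasks = [asyncio.create_task(fake_fetch(url, delay)) for url, delay in jobs]
    for finished in asyncio.as_completed(tasks):
        url, data = await finished  # resolves in completion order
        handle(url, data)


if __name__ == "__main__":
    seen = []
    jobs = [("a", 0.15), ("b", 0.05), ("c", 0.10)]
    asyncio.run(download_all(jobs, lambda url, data: seen.append(url)))
    print(seen)  # ['b', 'c', 'a']
```

If the handler itself blocks for a noticeable time (disk or database writes, for instance), running it through `loop.run_in_executor` is one way to keep the event loop responsive while downloads continue.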
