Python Automation Notes: Asynchronous Web Requests

kunwen123

Last modified: 2023-07-05 06:59:33

Tags: python, automated testing tools

First published: 2023-07-04 04:56:13

Article link: https://blog.csdn.net/kunwen123/article/details/131526800

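This article introduces asynchronous programming in Python, from coroutines in the asyncio library to the async and await syntax introduced in Python 3.5. It shows how to run tasks concurrently with asyncio and how the httpx and aiohttp libraries handle asynchronous HTTP requests. It also touches on asynchronous crawling, error handling, and concurrency-control strategies.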

Liao Xuefeng's official website: Python automation notes

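With asyncio, the @asyncio.coroutine decorator marks a generator as a coroutine, and inside that coroutine yield from calls another coroutine to perform an asynchronous operation.

To simplify asynchronous I/O and make it easier to recognize, Python 3.5 introduced the async and await syntax, which makes coroutine code more concise and readable. The decorator form was deprecated in Python 3.8 and removed in Python 3.11, so only the async/await form runs on current interpreters.

Note that async and await are new syntax for coroutines. Switching to the new syntax takes two simple substitutions:

Replace @asyncio.coroutine with async;
Replace yield from with await.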

# Python 3.4 style
import asyncio

@asyncio.coroutine
def hello():
    print("Hello world!")
    r = yield from asyncio.sleep(1)
    print("Hello again!")
# Python 3.5 and later
# Rewritten with the new syntax:

async def hello():
    print("Hello world!")
    r = await asyncio.sleep(1)
    print("Hello again!")

How to implement an async function in Python

import asyncio

async def async_func():
    print("Start async function")
    await asyncio.sleep(1)
    print("Finish async function")
 
asyncio.run(async_func())
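
One detail the example above relies on: calling an async function does not run its body. The call only creates a coroutine object, which executes once it is awaited or handed to asyncio.run(). A minimal sketch of that behavior follows; the names add_later and demo are made up for illustration.

```python
import asyncio
import inspect


async def add_later(a, b):
    # Hypothetical helper: pause briefly, then return a result
    await asyncio.sleep(0.01)
    return a + b


def demo():
    coro = add_later(1, 2)               # nothing has run yet
    is_coro = inspect.iscoroutine(coro)  # the call only built a coroutine object
    result = asyncio.run(coro)           # the event loop drives it to completion
    return is_coro, result


if __name__ == "__main__":
    print(demo())
```

The value returned by the coroutine comes back as the return value of asyncio.run(), which is how a synchronous entry point can collect results from async code.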

Asynchronous programming in detail

import asyncio

async def task(name, delay):
    print(f"Executing task: {name}")
    await asyncio.sleep(delay)
    print(f"Task {name} finished")

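# Create an event loop and make it the current one
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)

# Build the list of tasks (coroutine objects; nothing runs yet)
tasks = [
    task("Task 1", 2),
    task("Task 2", 1),
    task("Task 3", 3)
]

# Run the tasks concurrently
loop.run_until_complete(asyncio.gather(*tasks))

# Close the event loop
loop.close()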
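
To see that the tasks above really overlap, the total wall-clock time can be measured: with gather() it should be close to the longest delay rather than the sum of all delays. The sketch below shortens the delays to fractions of a second and uses asyncio.run() for brevity; run_all is a name introduced here for illustration.

```python
import asyncio
import time


async def task(name, delay):
    await asyncio.sleep(delay)
    return name


async def run_all():
    start = time.perf_counter()
    # gather() schedules all three coroutines on the same event loop
    results = await asyncio.gather(
        task("Task 1", 0.2),
        task("Task 2", 0.1),
        task("Task 3", 0.3),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(run_all())
    print(results, f"{elapsed:.2f}s")
```

Note that gather() returns results in the order the coroutines were passed in, even though "Task 2" finishes first.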

The most commonly used tools are the asyncio library and the aiofiles module

import asyncio
import aiofiles

async def async_function(x):
    # Open in append mode so every task adds to the same file
    async with aiofiles.open('output.txt', 'a') as f:
        await f.write(str(x) + '\n')
        await f.write('done\n')

async def main():
    # Schedule ten writer tasks and wait for all of them
    tasks = [asyncio.create_task(async_function(i)) for i in range(1, 11)]
    await asyncio.gather(*tasks)

asyncio.run(main())

The httpx request library: asynchronous requests (Python 3.10 async programming)

import httpx
import asyncio

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36'
}

async def function():
    async with httpx.AsyncClient() as client:
        response = await client.get('https://www.baidu.com', headers=headers)
        print(response.text)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(function())

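Python 3.10 async programming: asynchronous crawling with asyncio
Two httpx pitfalls (httpx.ReadTimeout; SSL: CERTIFICATE_VERIFY_FAILED)
Both errors go away with the settings below, at a cost: verify=False turns off TLS certificate verification and timeout=None removes every timeout, so use them only against endpoints you trust.
client = httpx.AsyncClient(verify=False, timeout=None)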
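
With timeout=None a stalled request can wait forever. An alternative is to keep a finite timeout and retry a few times. The sketch below shows that pattern with the standard library only, so it runs without a network; fetch_with_retry and make_flaky are hypothetical names, and the "request" is a stand-in coroutine. With httpx, the same loop would wrap client.get(...) and catch httpx.ReadTimeout instead.

```python
import asyncio


async def fetch_with_retry(make_request, timeout=0.05, retries=3):
    """Await make_request() under a time limit, retrying on timeout."""
    for attempt in range(1, retries + 1):
        try:
            return await asyncio.wait_for(make_request(), timeout)
        except asyncio.TimeoutError:
            if attempt == retries:
                raise  # out of attempts: let the caller see the timeout
            # Back off a little longer after each failed attempt
            await asyncio.sleep(0.01 * attempt)


def make_flaky(stalls):
    """Build a stand-in request that stalls on its first `stalls` calls."""
    state = {"calls": 0}

    async def flaky():
        state["calls"] += 1
        if state["calls"] <= stalls:
            await asyncio.sleep(1)  # longer than the timeout, so it is cancelled
        return "ok"

    return flaky, state


if __name__ == "__main__":
    flaky, state = make_flaky(stalls=2)
    print(asyncio.run(fetch_with_retry(flaky)), state["calls"])
```

The timeout and retry counts here are arbitrary illustration values; real values depend on the server being called.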

asyncio.run() raises RuntimeError: Event loop is closed (cause and workaround)

The error: raise RuntimeError('Event loop is closed') RuntimeError: Event loop is closed

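asyncio.run(main_coroutine())  # the call that raised the error; main_coroutine stands for your entry coroutine

loop = asyncio.get_event_loop()  # running on an explicit loop avoids the error
loop.run_until_complete(main_coroutine())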

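Background: a large number of API calls need to run in parallel.
ahttp: ahttp is a library in which every HTTP request runs as a coroutine. I/O during a request is handed off to other hardware while the CPU concentrates on compute-bound work, which saves a great deal of waiting time.
ahttp is used almost the same way as requests, except that requests is synchronous and ahttp is asynchronous. The difference is that requests sends a request immediately, whereas with ahttp you first construct the requests and then "run" them in a separate step.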

import ahttp
 
urls1 = [ f"https://movie.douban.com/top250?start={i*25}" for i in range(2)]
urls=[
      'http://www.heroku.com',
      'http://python-tablib.org',
      'http://httpbin.org',
      'http://python-requests.org',
      'http://fakedomain/',
      'http://kennethreitz.com'
]
 
reqs = [ahttp.post(i) for i in urls]
resqs = ahttp.run(reqs, order=True, pool=3)  # order=True keeps results in request order; pool sets the maximum concurrency
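
The URL list above includes http://fakedomain/, which is expected to fail. However the requests are issued, one failure should not discard the other results. The sketch below illustrates that idea with plain asyncio rather than ahttp; fake_fetch is a stand-in that performs no network I/O, and fetch_all is a hypothetical helper name.

```python
import asyncio


async def fake_fetch(url):
    # Stand-in for a real request: fail for the unresolvable host
    await asyncio.sleep(0.01)
    if "fakedomain" in url:
        raise ConnectionError(f"cannot resolve {url}")
    return f"200 {url}"


async def fetch_all(urls):
    # return_exceptions=True places exceptions in the result list
    # instead of propagating the first one that occurs
    results = await asyncio.gather(
        *(fake_fetch(u) for u in urls), return_exceptions=True
    )
    ok = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return ok, failed


if __name__ == "__main__":
    ok, failed = asyncio.run(
        fetch_all(["http://httpbin.org", "http://fakedomain/"])
    )
    print(ok, failed)
```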

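Compared with building a plain list of ahttp requests as above, going through a session is faster and shares cookies, because a session keeps a persistent connection open. Because the requests are asynchronous, the responses do not come back in the order the requests were created. To process them in request order, pass the order parameter to ahttp.run.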

import ahttp
 
urls1 = [ f"https://movie.douban.com/top250?start={i*25}" for i in range(2)]
urls=[
      'http://www.heroku.com',
      'http://python-tablib.org',
      'http://httpbin.org',
      'http://python-requests.org',
      'http://fakedomain/',
      'http://kennethreitz.com'
]
 
# A session keeps a persistent connection, so it is faster than a plain request list and shares cookies
sess = ahttp.Session()
reqs = [sess.post(i) for i in urls]
resqs = ahttp.run(reqs, order=True, pool=3)  # order=True keeps results in request order; pool sets the maximum concurrency
print('all', resqs)
print('first', resqs[0])
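
The ordering issue that order=True addresses can be pictured with the standard library alone. This is not ahttp's implementation, only an analogy: asyncio.gather() returns results in the order the requests were created, while asyncio.as_completed() yields them as they finish. The names respond and compare are made up for illustration.

```python
import asyncio


async def respond(label, delay):
    # Stand-in for a request that takes `delay` seconds to answer
    await asyncio.sleep(delay)
    return label


async def compare():
    jobs = [("a", 0.15), ("b", 0.05), ("c", 0.10)]

    # gather(): results line up with the order the requests were created in
    in_request_order = await asyncio.gather(
        *(respond(label, delay) for label, delay in jobs)
    )

    # as_completed(): results arrive as each request finishes
    in_finish_order = []
    for fut in asyncio.as_completed([respond(label, delay) for label, delay in jobs]):
        in_finish_order.append(await fut)

    return in_request_order, in_finish_order


if __name__ == "__main__":
    print(asyncio.run(compare()))
```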

The aiohttp request library: asynchronous requests-style HTTP

import asyncio
import aiohttp

async def get_html(semaphore, session, url, delay=6):
    # "async with" acquires the semaphore and releases it even if the request raises
    async with semaphore:
        async with session.get(url) as res:
            html = await res.text()
            # asyncio.sleep() is a coroutine and must be awaited. A bare
            # asyncio.sleep(delay) only produces
            # "RuntimeWarning: coroutine 'sleep' was never awaited"
            await asyncio.sleep(delay)
            return html

async def main():
    categories = {
        "makeup": "https://www.sephora.com/shop/"
    }
    # value=1 lets only one request run at a time
    semaphore = asyncio.Semaphore(value=1)
    tasks = []
    # ssl=False skips certificate verification for this connector
    async with aiohttp.ClientSession(connector=aiohttp.TCPConnector(ssl=False)) as session:
        for category, url in categories.items():
            # Get HTML of all pages
            tasks.append(get_html(semaphore, session, url))
        res = await asyncio.gather(*tasks)
    return res

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
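
The semaphore in the example above is what caps concurrency: with value=1 the requests run one at a time, and a larger value lets that many run at once. The sketch below checks this without any network access by recording the peak number of jobs running at the same moment; limited_job and run_jobs are hypothetical names and the sleep stands in for the HTTP request.

```python
import asyncio


async def limited_job(semaphore, stats):
    # "async with" releases the semaphore even if the body raises
    async with semaphore:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.02)  # stand-in for the HTTP request
        stats["running"] -= 1


async def run_jobs(n_jobs=10, limit=3):
    # Create the semaphore inside the running event loop
    semaphore = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}
    await asyncio.gather(*(limited_job(semaphore, stats) for _ in range(n_jobs)))
    return stats["peak"]


if __name__ == "__main__":
    print(asyncio.run(run_jobs()))
```

Ten jobs are started, but the recorded peak never exceeds the limit passed to the semaphore.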

import asyncio
import aiohttp
 
 
async def main():
    async with aiohttp.ClientSession() as session:
        async with session.get('http://httpbin.org/get') as resp:
            print(resp.status)
            print(await resp.text())
 
 
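if __name__ == '__main__':
    # Python 3.7+ only: asyncio.run() is the entry point; debug=True runs the event loop in debug mode
    asyncio.run(main(), debug=True)
    # Python 3.6 and earlier (use this instead of asyncio.run(), not in addition to it):
    # event_loop = asyncio.get_event_loop()
    # results = event_loop.run_until_complete(asyncio.gather(main()))
    # event_loop.close()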