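Sending requests with coroutines:
The requests library is serial: each call blocks until its IO completes.
grequests is not truly parallel either. It runs user-level "threads" (gevent greenlets), and when one of them hits IO it switches to the next one.
So if you only send a single HTTP request, the two behave the same, but with many requests the difference is large. In theory a whole batch should take about as long as one request; in practice that stops holding once the batch gets big (more on this later).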
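To make the switching behaviour concrete, here is a minimal sketch that simulates it without touching the network. It uses the standard library's asyncio as a stand-in for the gevent greenlets that grequests relies on (an assumption made purely for illustration; grequests itself does not use asyncio), and `fake_request` with its 0.05-second delay is a hypothetical placeholder for a real HTTP call.

```python
import asyncio
import time


async def fake_request(delay):
    # Stand-in for an HTTP call: awaiting the sleep hands control to other tasks.
    await asyncio.sleep(delay)
    return delay


async def run_serial(delays):
    # One request at a time: the waits add up.
    return [await fake_request(d) for d in delays]


async def run_concurrent(delays):
    # All requests in flight at once: the waits overlap.
    return await asyncio.gather(*(fake_request(d) for d in delays))


def timed(coro_factory, delays):
    # Run one of the coroutines above and return the elapsed wall-clock time.
    start = time.perf_counter()
    asyncio.run(coro_factory(delays))
    return time.perf_counter() - start


delays = [0.05] * 5
serial_time = timed(run_serial, delays)
concurrent_time = timed(run_concurrent, delays)
print("serial: {:.2f}s, concurrent: {:.2f}s".format(serial_time, concurrent_time))
```

The serial run should take roughly the sum of the delays, while the concurrent run should take roughly the longest single delay.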
How to use it:
Skipping the small talk, here is the official repo: https://github.com/spyoungtech/grequests#usage
import time

import grequests  # import before requests so that gevent's monkey patching is applied first
import requests

BASE_URLS = ["https://www.baidu.com/", "https://www.csdn.net/", "https://www.sina.com.cn/"]
# `header` and `adata` are not shown in this post; placeholder values are used here.
header = {"User-Agent": "Mozilla/5.0"}
adata = None


def build_urls(num):
    # Cycle through the sample sites until there are `num` URLs.
    return [BASE_URLS[i % len(BASE_URLS)] for i in range(num)]


def exception_handler(request, exception):
    # Called by grequests when a request fails outright (no response at all).
    print("request failed: {} ({})".format(request.url, exception))


def use_grequests(num):
    # Build the unsent requests first, then send them concurrently with map().
    task = [grequests.request("GET", url, data=adata, headers=header) for url in build_urls(num)]
    return grequests.map(task, exception_handler=exception_handler)


def use_requests(num):
    # Send the same requests one by one; each call blocks until it finishes.
    resp = []
    for url in build_urls(num):
        resp.append(requests.get(url=url, headers=header, data=adata))
    return resp


def main(num):
    time1 = time.time()
    finall_res = use_requests(num)
    print(finall_res)
    T = time.time() - time1
    print('use_requests sent {} requests in {} seconds'.format(num, T))

    print('Sending requests with grequests...')
    time3 = time.time()
    finall_res2 = use_grequests(num)
    for aa in finall_res2:
        print(aa, "\n")
    T2 = time.time() - time3
    print('use_grequests sent {} requests in {} seconds'.format(num, T2))


if __name__ == '__main__':
    main(3)
Points to note
A few things about this call:
grequests.map(task, exception_handler=exception_handler)
map returns the responses in the same order the requests were submitted.
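imap is an iterator: it handles whichever response comes back first. Each has its pros and cons, so pick according to your use case. Note that imap's size should be set explicitly; otherwise the default of 2 means only two requests are in flight at a time.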
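The ordering difference can be seen in a small simulation. This sketch uses asyncio rather than grequests (a stand-in chosen so it runs without network access): `asyncio.gather` plays the role of `map` (results in submission order) and `asyncio.as_completed` plays the role of `imap` (results as they finish). `fake_request`, the job names and the delays are all made up for illustration.

```python
import asyncio


async def fake_request(name, delay):
    # Hypothetical stand-in for an HTTP call that takes `delay` seconds.
    await asyncio.sleep(delay)
    return name


JOBS = [("slow", 0.3), ("fast", 0.05), ("medium", 0.15)]


async def in_submission_order():
    # Like map: results line up with the order the requests were submitted.
    return await asyncio.gather(*(fake_request(n, d) for n, d in JOBS))


async def in_completion_order():
    # Like imap: each result is handled as soon as it is ready.
    tasks = [asyncio.create_task(fake_request(n, d)) for n, d in JOBS]
    return [await finished for finished in asyncio.as_completed(tasks)]


map_like = asyncio.run(in_submission_order())
imap_like = asyncio.run(in_completion_order())
print(map_like)   # ['slow', 'fast', 'medium']
print(imap_like)  # ['fast', 'medium', 'slow']
```

When responses arrive out of order, each one has to be matched back to its request some other way, for example by looking at the response's URL.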
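exception_handler=exception_handler is the hook function called when an HTTP request raises an exception. Note that a response that comes back with a status code does not count as an exception: a 4xx is treated as a normal result. Only requests that fail outright, for example because the URL is wrong, reach the handler.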
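Because an error status does not reach the exception handler, the status code still has to be checked on the returned list. The sketch below sorts results into three buckets. It assumes that failed requests show up as `None` in the list returned by `map` when the handler returns nothing, and it uses made-up stand-in objects instead of real responses so that it runs without network access; `classify` is a hypothetical helper, not part of grequests.

```python
from types import SimpleNamespace


def classify(responses):
    # Split results into successes, HTTP error statuses, and outright failures.
    ok, http_errors, failed = [], [], 0
    for resp in responses:
        if resp is None:
            failed += 1               # no response at all (bad URL, timeout, ...)
        elif resp.status_code >= 400:
            http_errors.append(resp)  # the server answered, but with an error status
        else:
            ok.append(resp)
    return ok, http_errors, failed


# Stand-ins for what grequests.map(task, exception_handler=...) might return.
fake_results = [
    SimpleNamespace(status_code=200),
    SimpleNamespace(status_code=404),
    None,
]
ok, http_errors, failed = classify(fake_results)
print(len(ok), len(http_errors), failed)  # 1 1 1
```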
There is also a size parameter, which controls how many requests run concurrently:
def map(requests, stream=False, size=None, exception_handler=None, gtimeout=None):
    """Concurrently converts a list of Requests to Responses.
    :param requests: a collection of Request objects.
    :param stream: If True, the content will not be downloaded immediately.
    :param size: Specifies the number of requests to make at a time. If None, no throttling occurs.
    :param exception_handler: Callback function, called when exception occured. Params: Request, Exception
    :param gtimeout: Gevent joinall timeout in seconds. (Note: unrelated to requests timeout)
    """

def imap(requests, stream=False, size=2, exception_handler=None):
    """Concurrently converts a generator object of Requests to
    a generator of Responses.
    :param requests: a generator of Request objects.
    :param stream: If True, the content will not be downloaded immediately.
    :param size: Specifies the number of requests to make at a time. default is 2
    :param exception_handler: Callback function, called when exception occurred. Params: Request, Exception
    """
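A rough way to see why `size` matters: if every request took the same time and at most `size` of them ran at once, the requests would finish in waves. The sketch below is an idealized back-of-the-envelope model, not a measurement; `estimate_seconds` and the 0.14-second latency are assumptions made for illustration.

```python
import math


def estimate_seconds(num_requests, size, latency):
    # Idealized model: requests run in waves of at most `size`,
    # and every request takes exactly `latency` seconds.
    if num_requests == 0:
        return 0.0
    if size is None:
        size = num_requests  # no throttling: everything goes out in a single wave
    waves = math.ceil(num_requests / size)
    return waves * latency


for size in (2, 10, None):
    print(size, round(estimate_seconds(30, size, 0.14), 2))
```

Under these assumed numbers, 30 requests with a size of 2 need 15 waves, a size of 10 needs 3 waves, and no throttling needs a single wave. Real requests vary in latency, so treat this only as an intuition for the trend.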
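In my tests, the advantage starts to disappear when dozens of requests are sent at once. A handful, or a dozen or so, is no problem.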
use_requests sent 3 requests in 0.4109797477722168 seconds
use_grequests sent 3 requests in 0.21564006805419922 seconds (imap)
use_requests sent 10 requests in 1.2701311111450195 seconds
use_grequests sent 10 requests in 0.4723951816558838 seconds (imap)
use_requests sent 30 requests in 4.247822523117065 seconds
use_grequests sent 30 requests in 1.4679083824157715 seconds (imap)
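With up to 30 requests, imap and map perform about the same; at 100 requests imap comes out ahead. The more requests there are, the smaller the gap to serial requests becomes.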
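The speedup can be read straight off the figures above. This small sketch just divides the serial time by the grequests time for each run; the numbers are copied from the measurements, and `speedup` is a helper name of my own.

```python
# (number of requests, requests seconds, grequests imap seconds), copied from the runs above
RUNS = [
    (3, 0.4109797477722168, 0.21564006805419922),
    (10, 1.2701311111450195, 0.4723951816558838),
    (30, 4.247822523117065, 1.4679083824157715),
]


def speedup(serial_seconds, concurrent_seconds):
    # How many times faster the concurrent run was than the serial one.
    return serial_seconds / concurrent_seconds


for count, serial, concurrent in RUNS:
    print("{} requests: {:.2f}x faster, {:.3f}s per serial request".format(
        count, speedup(serial, concurrent), serial / count))
```

In these three runs the serial cost stays at roughly 0.13 to 0.14 seconds per request, and the ratio comes out at roughly 1.9x, 2.7x and 2.9x.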