Python: multi-coroutine asynchronous I/O makes fetching web pages about 3x faster

森林光头强大叔叔

Original article: http://www.cnblogs.com/hushuning/p/7922220.html

from gevent import monkey

# Replace the blocking standard-library I/O calls (socket, ssl, ...) with
# gevent's cooperative versions. Call this before the other imports; without
# it the greenlets below would still run one request at a time.
monkey.patch_all()

import time
from urllib import request

import gevent


def f(url):
    print('GET:%s' % url)
    resp = request.urlopen(url)  # fetch the page
    data = resp.read()  # read the response body
    print('%d bytes received from  %s' % (len(data), url))  # print its size


urls = ['https://www.yahoo.com/', 'https://www.python.org/',
        'https://github.com/']

# Serial run: fetch the pages one after another.
start = time.time()
for url in urls:
    f(url)
print('串行执行时间：', time.time() - start)  # "serial execution time"

# Concurrent run: one greenlet (coroutine) per URL, then wait for all of them.
async_start = time.time()
gevent.joinall([gevent.spawn(f, url) for url in urls])
print('异步执行时间async time:', time.time() - async_start)  # "async execution time"
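
The comments on monkey.patch_all() say it patches the program's I/O operations. As a rough illustration of what "patching" means here, the sketch below rebinds a module attribute at runtime using only the standard library. It is not gevent's implementation; slow_task, fake_sleep and run_with_patched_sleep are hypothetical names made up for this example.

```python
import time


def slow_task():
    """Library-style code that looks up time.sleep on the module when called."""
    time.sleep(5)
    return "done"


def run_with_patched_sleep():
    """Temporarily swap time.sleep for a stand-in and run slow_task()."""
    recorded = []

    def fake_sleep(seconds):
        # Hypothetical replacement: record the request instead of blocking.
        recorded.append(seconds)

    original_sleep = time.sleep
    time.sleep = fake_sleep  # the "monkey patch": rebind the module attribute
    try:
        result = slow_task()
    finally:
        time.sleep = original_sleep  # always restore the real function
    return result, recorded


if __name__ == "__main__":
    print(run_with_patched_sleep())
```

slow_task() was never edited, yet it ran the replacement, because it looks up time.sleep at call time. Code that had already done `from time import sleep` would keep the old function, which is one reason patching is usually done as early as possible.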

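The output below shows the benefit of running the requests as coroutines: the serial loop took about 4.7 seconds, while the three coroutines together finished in about 1.65 seconds, a speedup of roughly 2.9x. If monkey.patch_all() is not called, each urlopen call blocks the whole program, so the "asynchronous" requests still execute one after another.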

C:\Users\hushuning\Anaconda3\python.exe C:/Users/hushuning/PycharmProjects/untitled/njx/把当前程序的所有的io操作单独标记，进行异步操作.py
GET:https://www.yahoo.com/
510125 bytes received from  https://www.yahoo.com/
GET:https://www.python.org/
48857 bytes received from  https://www.python.org/
GET:https://github.com/
51373 bytes received from  https://github.com/
串行执行时间： 4.710935354232788
GET:https://www.yahoo.com/
GET:https://www.python.org/
GET:https://github.com/
48857 bytes received from  https://www.python.org/
512422 bytes received from  https://www.yahoo.com/
51373 bytes received from  https://github.com/
异步执行时间async time: 1.6521050930023193

Process finished with exit code 0
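
The timing pattern in the log (serial time close to the sum of the requests, concurrent time close to the slowest one) can be reproduced without network access. The sketch below uses the standard-library asyncio module rather than gevent, and replaces real downloads with sleeps. fake_fetch and the delays in SIMULATED_DELAYS are assumptions made up for illustration, not measurements.

```python
import asyncio
import time

# Hypothetical stand-ins for network latency, in seconds (not measured values).
SIMULATED_DELAYS = {"page-a": 0.2, "page-b": 0.1, "page-c": 0.15}


async def fake_fetch(name, delay):
    """Pretend to download a page by waiting `delay` seconds."""
    await asyncio.sleep(delay)  # yields to the event loop while "waiting on I/O"
    return name


async def run_serial():
    """Await each fake download in turn; total time is about the sum of delays."""
    start = time.perf_counter()
    names = [await fake_fetch(n, d) for n, d in SIMULATED_DELAYS.items()]
    return names, time.perf_counter() - start


async def run_concurrent():
    """Start all fake downloads at once; total time is about the longest delay."""
    start = time.perf_counter()
    names = await asyncio.gather(
        *(fake_fetch(n, d) for n, d in SIMULATED_DELAYS.items())
    )
    return list(names), time.perf_counter() - start


if __name__ == "__main__":
    _, serial_time = asyncio.run(run_serial())
    _, concurrent_time = asyncio.run(run_concurrent())
    print("serial:     %.2f s" % serial_time)
    print("concurrent: %.2f s" % concurrent_time)
```

The speedup comes only from overlapping the waiting. Work that keeps the CPU busy instead of waiting on I/O would not get faster this way.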

Reposted from: https://www.cnblogs.com/hushuning/p/7922220.html
