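People talk about the Python GIL all the time, but I had never actually measured its effect myself.
Today I compared how fast Python downloads images with multiple threads, multiple processes, and a single thread.
In my tests, multithreading was still much faster than a single thread for IO-bound work. To quote another blogger's explanation of why: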
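IO falls into network IO and disk IO, and an IO operation generally has two phases: sending data (output) and receiving data (input). Taking a browser as the example: the browser sends a request to the server (output), and the server returns the result to the browser (input). When a Python thread blocks on IO it releases the GIL (global interpreter lock), so another thread can go on to send its own request (output) while the first is blocked waiting for its response; a third thread can then send its request while the second is waiting, and so on. In other words, within the same time slice one thread can be waiting for data while another is sending data, which reduces the total time spent on IO.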
---------------------
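Author: daijiguo
Source: CSDN
Original: https://blog.csdn.net/daijiguo/article/details/78042309
Copyright notice: this is the blogger's original article; please include a link to the original post when reposting.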
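The quoted explanation can be tried out without any network access. The following is a minimal sketch, not the test used in this post: `fake_download`, `run_serial` and `run_threaded` are hypothetical names, and `time.sleep` stands in for a blocking network request (like blocking IO, it releases the GIL while waiting).

```python
import threading
import time


def fake_download(delay):
    """Stand-in for a blocking request: sleeping releases the GIL."""
    time.sleep(delay)


def run_serial(n, delay):
    """Run n fake downloads one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fake_download(delay)
    return time.perf_counter() - start


def run_threaded(n, delay):
    """Run n fake downloads in n threads; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_download, args=(delay,))
               for _ in range(n)]
    for t in threads:
        t.start()
    # Join only after every thread has started, so the waits overlap
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("serial:   %.2f s" % run_serial(5, 0.2))
    print("threaded: %.2f s" % run_threaded(5, 0.2))
```

In the serial version the waits add up, while in the threaded version they overlap, so the threaded total should be close to the duration of a single wait.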
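As for multithreading versus multiprocessing: when downloading many images (say, more images than CPU cores) and each image is fairly small, multithreading appears to be faster.
My guess is that because there are more images than CPU cores, both processes and threads have to be switched. Processes are switched less often, but each process costs more to create and switch; with threads, the GIL means only one CPU core runs Python code at a time, but threads are cheap, and since the downloaded files are small there are also relatively few thread switches, so multithreading comes out ahead.
When downloading only a few images (say, fewer images than CPU cores) and each image is fairly large, multiprocessing can make full use of the CPU without switching between processes, which cuts overhead. The threads, on the other hand, have to keep switching tasks, and with large images that constant switching adds overhead and lowers efficiency, so multithreading ends up slower than multiprocessing.
That is my understanding so far, at least.
Some day I will also benchmark multithreading against multiprocessing on CPU-bound work.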
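For the CPU-bound comparison mentioned above, a starting point might look like the sketch below. It is an assumption about how such a test could be set up, not a measurement from this post: `count_primes`, `timed`, `run_serial` and `run_threads` are hypothetical helpers, and the workload is simple pure-Python trial division. On standard CPython with the GIL, the threaded run would be expected to take roughly as long as the serial one, since only one thread executes Python bytecode at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Pure-Python CPU-bound work: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed(label, func):
    """Call func(), print how long it took, and return its result."""
    start = time.perf_counter()
    result = func()
    print("%s: %.3f s" % (label, time.perf_counter() - start))
    return result


def run_serial(limits):
    return [count_primes(x) for x in limits]


def run_threads(limits):
    # One worker thread per task; results come back in input order
    with ThreadPoolExecutor(max_workers=len(limits)) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    limits = [20000] * 4
    timed("serial", lambda: run_serial(limits))
    timed("threads", lambda: run_threads(limits))
```

To add the multiprocess case, `ProcessPoolExecutor` from the same `concurrent.futures` module offers the same `map` interface; it should be created under the `if __name__ == "__main__":` guard.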
import requests
import time
from threading import Thread
import threading
import multiprocessing
python_list = [
    'https://www.python.org/ftp/python/3.5.7/Python-3.5.7.tgz',
    'https://www.python.org/ftp/python/3.7.3/python-3.7.3.exe',
    'https://www.python.org/ftp/python/2.7.16/python-2.7.16.amd64.msi'
]
large_url_list = [
    # python.org URL: the file is not large, but overseas servers are relatively slow
    'https://www.python.org/ftp/python/3.7.3/python-3.7.3.exe',
    # 'https://raw.githubusercontent.com/mymmsc/books/master/%E7%AE%97%E6%B3%95%E5%AF%BC%E8%AE%BA%E4%B8%AD%E6%96%87%E7%89%88.pdf',
    'http://codown.youdao.com/cidian/YoudaoDict_webdict_default.exe',
    'https://down.360safe.com/setup.exe',
    'https://d1.music.126.net/dmusic/cloudmusicsetup_2.5.2.197409.exe',
    'http://pcclient.download.youku.com/youkuclient/youkuclient_setup_7.7.7.4191.exe',
    'http://dl2.xmind.cn/xmind-8-update8-windows.exe',
    'https://cdn-dl.yinxiang.com/YXWin6/public/Evernote_6.17.20.667.exe',
]
url_list = [
    'https://images7.alphacoders.com/333/333388.jpg',
    'https://images2.alphacoders.com/597/597309.jpg',
    'https://images8.alphacoders.com/562/562449.jpg',
    'https://images.alphacoders.com/562/562450.jpg',
    'https://images3.alphacoders.com/562/562451.jpg',
    'https://images.alphacoders.com/562/562452.jpg',
    'https://images2.alphacoders.com/101/1011957.jpg',
    'https://images6.alphacoders.com/101/1011958.jpg',
    'https://images5.alphacoders.com/101/1011959.jpg',
    'https://images8.alphacoders.com/101/1011961.jpg',
    'https://images3.alphacoders.com/692/692439.jpg',
    'https://images4.alphacoders.com/940/940881.jpg',
    'https://images5.alphacoders.com/689/689398.jpg',
    'https://images5.alphacoders.com/757/757038.jpg',
]
time_path='time_compare.txt'
# Request a URL and save the response body to a file
def save_pic(url, count):
    # print(url)
    # print('save_pic', threading.current_thread())
    file_name = str(count + 1) + '.jpg'
    res = requests.get(url)
    # Print the download size in whole MiB, followed by the URL
    print(len(res.content) // 1024 // 1024, url)
    with open(file_name, 'wb') as f:
        f.write(res.content)
# Single-threaded
def single_download(url_list):
    # print(threading.current_thread())
    s_time = time.time()
    for i in range(len(url_list)):
        res = requests.get(url_list[i])
        print(len(res.content) // 1024 // 1024)
        file_name = str(i + 1) + '.jpg'
        with open(file_name, 'wb') as f:
            f.write(res.content)
    e_time = time.time()
    t_time = e_time - s_time
    # with open('single_download.txt', 'a') as f:
    with open(time_path, 'a') as f:
        f.write('Single-threaded total time: %r' % t_time + '\n' + '\n')
    print('Single-threaded total time: %r' % t_time)
# Multithreaded
def thread_download(save_pic, url_list):
    threads = []
    start = time.time()
    for i in range(len(url_list)):
        # Create one thread per URL
        t = Thread(target=save_pic, args=[url_list[i], i])
        # t.setDaemon(True)
        t.start()
        threads.append(t)
        # Joining here would make the threads run one after another
        # t.join()
    # Run the threads concurrently: join only after all of them have started
    # print('thread_download', threading.current_thread())
    for t in threads:
        t.join()
    end = time.time()
    print('Multithreaded total time: %r' % (end - start))
    # with open('thread_download.txt', 'a') as f:
    with open(time_path, 'a') as f:
        f.write('Multithreaded total time: %r' % (end - start) + '\n')
# Multiprocess
def process_download(save_pic, url_list):
    processes = []
    start = time.time()
    for i in range(len(url_list)):
        # Create one process per URL
        p = multiprocessing.Process(target=save_pic, args=[url_list[i], i])
        p.start()
        processes.append(p)
        # Joining here would make the processes run one after another
        # p.join()
    # Run the processes concurrently: join only after all of them have started
    # print('process_download', threading.current_thread())
    for p in processes:
        p.join()
    end = time.time()
    print('Multiprocess total time: %r' % (end - start))
    # with open('thread_download.txt', 'a') as f:
    with open(time_path, 'a') as f:
        f.write('Multiprocess total time: %r' % (end - start) + '\n')
if __name__ == '__main__':
    thread_download(save_pic, python_list)
    process_download(save_pic, python_list)
    single_download(large_url_list)
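The functions above start one thread or process per URL. As a hedged alternative, here is a minimal sketch of the same download loop with a bounded thread pool. `download_all` and `fetch_with_requests` are hypothetical names that are not part of the script above, and the fetch function is passed in as a parameter so the timing logic can be tried with a fake fetcher before touching the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def download_all(urls, fetch, max_workers=8):
    """Fetch every URL with at most `max_workers` threads.

    Returns (sizes_in_bytes, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `urls`
        sizes = [len(body) for body in pool.map(fetch, urls)]
    return sizes, time.perf_counter() - start


def fetch_with_requests(url):
    """Real fetcher; requests is imported lazily so the sketch loads without it."""
    import requests
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.content
```

A call such as `sizes, seconds = download_all(url_list, fetch_with_requests)` would then time the whole batch. Capping the number of workers keeps the thread count fixed as the URL list grows, and `time.perf_counter` is a monotonic clock intended for measuring intervals.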
Timing comparison (total time in seconds, rounded to milliseconds; one line per run in log order, "-" where no value was logged for that mode):
Multithreaded    Multiprocess    Single-threaded
22.478           31.263          25.108
21.918           28.181          21.529
6.333            6.328           21.681
4.705            7.363           22.166
4.493            5.243           20.290
7.165            6.343           40.977
10.406           11.692          39.746
11.070           13.828          55.355
12.453           15.382          -
14.733           17.788          -
67.048           65.770          -
11.711           13.263          -
150.037          87.615          -
207.852          85.442          -
14.031           16.915          16.836
16.922           24.299          20.826
24.262           25.591          39.543
42.156           43.080          -
45.170           39.575          -
50.487           54.604          -
55.681           57.113          -
51.347           -               -
68.936           60.925          -
53.099           55.612          -
52.460           51.268          -
226.486          211.467         -
11.333           15.307          -
11.495           11.548          -
9.816            10.998          -
162.459          180.019         -
214.367          157.903         -
152.771          136.439         -
108.962          104.806         -
81.695           82.852          176.912