Simulating concurrent tasks to observe CPU usage with multiprocessing vs. multithreading (plus the process pool chunksize parameter)

Article link: https://blog.csdn.net/kai404/article/details/127615008

1. Multiprocessing vs. multithreading

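import concurrent.futures

def kai(n):
    # Busy loop that never returns: it exists only to keep one CPU core busy
    # so the load can be watched in top/htop. Stop the script with Ctrl+C.
    while True:
        if n != 0:
            n += 1


if __name__ == "__main__":
    #ex = concurrent.futures.ThreadPoolExecutor(max_workers=5)  # use threads
    ex = concurrent.futures.ProcessPoolExecutor(max_workers=5)  # use processes

    # map() returns a lazy iterator right away; the five tasks keep running in the workers
    results = ex.map(kai, [1, 2, 3, 4, 5])

    print(results)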

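The script above shows how multiprocessing and multithreading differ for purely CPU-bound work, which is really a difference in how fully the CPUs get used. With ThreadPoolExecutor, the GIL lets only one thread execute Python bytecode at a time, so total CPU usage stays at roughly one core. With ProcessPoolExecutor, each of the five workers can keep its own core busy.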

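That does not make multithreading useless. In I/O-bound scenarios, a thread releases the GIL while it waits on I/O, which gives other threads a chance to run, so threads can still raise throughput considerably. Threads are also lighter than processes, which makes them a better fit for highly concurrent I/O-bound workloads.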
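The I/O-bound case can be sketched as follows. This is a minimal illustration, not a benchmark: `fake_io` and `run_io_tasks` are hypothetical helpers, and `time.sleep` stands in for a real blocking call such as a network request. Five tasks that each wait 0.2 s should finish in roughly the time of one wait, not five.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id, delay=0.2):
    # time.sleep stands in for a blocking I/O call (assumption for the demo);
    # like real blocking I/O, it releases the GIL while waiting.
    time.sleep(delay)
    return task_id

def run_io_tasks(task_ids, workers=5):
    # Run the simulated I/O tasks in a thread pool and time the whole batch.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as ex:
        results = list(ex.map(fake_io, task_ids))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = run_io_tasks([1, 2, 3, 4, 5])
print(results, round(elapsed, 2))
```

Running the same five calls one after another would take about 1 s, so an elapsed time near 0.2 s shows the waits overlapping.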

2. Verifying the effect of the chunksize parameter of executor.map()

Filter the integers in range(100000) (0 through 99999) that are divisible by 3, using 4 worker processes

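import concurrent.futures

def li(n):
    # Return n if it is divisible by 3; otherwise fall through and return None
    if n % 3 == 0:
        return n

if __name__ == "__main__":
    data = range(100000)
    ex = concurrent.futures.ProcessPoolExecutor(max_workers=4)
    #results = ex.map(li, data, chunksize=25000)   # with chunksize
    results = ex.map(li, data)                     # without chunksize (default is 1)

    #with open('/tmp/result_with_chunk', 'w') as f:  # result file when using chunksize
    with open('/tmp/result_no_chunk', 'w') as f:     # result file when not using chunksize
        for i in results:
            if i is not None:  # skip None but keep 0, which is also divisible by 3
                f.write('{} '.format(i))

    ex.shutdown()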

Then measure the elapsed time with the system time command

~]# time python3 /tmp/tt.py     #no chunksize

real    0m18.694s
user    0m26.665s
sys     0m4.836s

~]# time python3 /tmp/tt.py     #with chunksize

real    0m0.227s
user    0m0.288s
sys     0m0.055s

Comparing the two result files shows they are identical
~]# diff /tmp/result_with_chunk /tmp/result_no_chunk 
~]#

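As the timings show, the run with chunksize=25000 finished in about 0.23 s, versus about 18.7 s without it, roughly an 80x difference. When chunksize is omitted it defaults to 1, so each of the 100000 items is sent to a worker as its own task and the inter-process communication overhead dwarfs the actual computation.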
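To make the task counts concrete, here is a simplified illustration of batching. `split_into_chunks` is a hypothetical helper written for this note, not the library's internal code; it only shows how many separate work items the parent process would have to hand to the workers under each setting.

```python
from itertools import islice

def split_into_chunks(iterable, chunksize):
    # Yield successive tuples of at most `chunksize` items (illustrative only).
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return
        yield chunk

data = range(100000)
tasks_default = sum(1 for _ in split_into_chunks(data, 1))      # chunksize=1
tasks_chunked = sum(1 for _ in split_into_chunks(data, 25000))  # chunksize=25000
print(tasks_default, tasks_chunked)  # 100000 4
```

With 4 workers and chunksize=25000, each worker receives exactly one batch, so the per-task messaging cost is paid 4 times instead of 100000 times.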

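To explore further: how to choose a reasonable chunksize (based on factors such as the number of worker processes and the total number of items to process)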
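One possible starting point, offered as a rough heuristic and not a recommendation from the library: aim for a few chunks per worker so that work stays balanced without creating too many tasks. `suggest_chunksize` and its `chunks_per_worker=4` default are assumptions made for this sketch; as far as I can tell, multiprocessing.Pool.map uses a similar rule when chunksize is left unset, but the right value still depends on how expensive each item is.

```python
import math

def suggest_chunksize(total_items, workers, chunks_per_worker=4):
    # Hypothetical helper: split the data into about
    # workers * chunks_per_worker batches, never going below 1.
    if total_items <= 0:
        return 1
    return max(1, math.ceil(total_items / (workers * chunks_per_worker)))

print(suggest_chunksize(100000, 4))  # 6250
```

The value can then be passed as `chunksize=` to `ex.map(...)` and tuned by timing real runs.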

3. Two ways to write a process pool

concurrent.futures.ProcessPoolExecutor()
multiprocessing.pool.Pool()

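# test1: approach 1
from multiprocessing import Pool

def li(n):
    # Return n if it is divisible by 3; otherwise return None
    if n % 3 == 0:
        return n

if __name__ == "__main__":
    data = range(10000000)
    with Pool(5) as p:
        # Pool.map() blocks until all items are processed and returns a list
        p.map(li, data, chunksize=2000000)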

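# test2: approach 2
import concurrent.futures

def li(n):
    # Return n if it is divisible by 3; otherwise return None
    if n % 3 == 0:
        return n

if __name__ == "__main__":
    data = range(10000000)
    ex = concurrent.futures.ProcessPoolExecutor(max_workers=4)
    # Executor.map() returns an iterator; the results are not consumed here
    ex.map(li, data, chunksize=2500000)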

Results (each script run 3 times)

 ~]$ time python3 /tmp/test1

real    0m1.582s
user    0m2.786s
sys     0m0.709s
 ~]$ time python3 /tmp/test1

real    0m1.627s
user    0m2.779s
sys     0m0.717s
 ~]$ time python3 /tmp/test1

real    0m1.583s
user    0m2.776s
sys     0m0.711s
 ~]$ time python3 /tmp/test2

real    0m8.056s
user    0m10.067s
sys     0m1.783s
 ~]$ time python3 /tmp/test2

real    0m7.961s
user    0m10.094s
sys     0m1.736s
 ~]$ time python3 /tmp/test2

real    0m7.981s
user    0m10.090s
sys     0m1.671s

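Approach 1 comes out about 5 times faster than approach 2 (roughly 1.6 s versus 8.0 s). Note that the two scripts are not strictly like-for-like: test1 uses 5 workers with chunksize=2000000, while test2 uses 4 workers with chunksize=2500000.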
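One API difference is worth keeping in mind when timing the two styles: Pool.map() returns a fully built list, while Executor.map() returns an iterator whose results are pulled as you iterate. The sketch below shows only that difference. To keep it runnable anywhere, it swaps in the thread-backed equivalents (multiprocessing.pool.ThreadPool and ThreadPoolExecutor), which expose the same map() interfaces; it is not a reproduction of the benchmark above.

```python
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.pool import ThreadPool

def li(n):
    # Same filter as in the scripts above: n if divisible by 3, else None
    if n % 3 == 0:
        return n

data = range(30)

# Pool-style map(): blocks until every item is processed, returns a list
with ThreadPool(4) as p:
    eager = p.map(li, data, chunksize=10)

# Executor-style map(): returns an iterator; results are pulled as you iterate
with ThreadPoolExecutor(max_workers=4) as ex:
    lazy = ex.map(li, data)
    collected = list(lazy)

print(type(eager).__name__, eager == collected)  # list True
```

For a fair comparison of the process-backed versions, it seems reasonable to give both scripts the same worker count and chunksize and to consume the results in both.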