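I never knew how to use multiprocessing before. Once I tried it, I found it boosts productivity enormously (and my happiness along with it). It is especially handy when the work involves processes that block.
I suggest reading Liao Xuefeng's multiprocessing tutorial first: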
https://www.liaoxuefeng.com/wiki/1016959663602400/1017628290184064
Below is a template; swap in your own function and it is ready to use.
1. Template 1
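from multiprocessing import Pool
from functools import partial

params_list = [
    # Replace these with your own items
    item1,
    item2,
    item3,
    ...
]

if __name__ == "__main__":
    # Start 8 worker processes. Set this yourself: the CPU core count is usually enough,
    # and slightly more than the core count works well when the tasks block.
    with Pool(processes=8) as pool:
        # parallel_fun is your own function; replace ... with the arguments that are already fixed
        parallel_warp = partial(parallel_fun, ...)
        pool.map(parallel_warp, params_list)

    # For single-step debugging, run the line below; for multiprocessing, use the block above
    # parallel_warp(params_list[0])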
Using Pool(processes=8) as a context manager is really convenient.
Here is an example:
from multiprocessing import Pool
from functools import partial

params_list = [
    # Replace these with your own items
    1, 2, 3, 4, 5
]

def print_ABC_num(num, A, B, C):
    print(A, B, C, num)

parallel_fun = print_ABC_num

if __name__ == "__main__":
    with Pool(processes=8) as pool:  # start 8 worker processes
        # Bind the fixed arguments; num is supplied by pool.map
        parallel_warp = partial(parallel_fun, A="a", B="bb", C="CcC")
        pool.map(parallel_warp, params_list)

    # For single-step debugging, run the line below; for multiprocessing, use the block above
    # parallel_warp(params_list[0])
It is worth mentioning that you can also collect the return value of each parallel call like this:
for result in pool.map(worker, [1, 2, 3]):
    print(result)
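To make the result-collecting idea concrete, here is a minimal self-contained sketch. The worker `square` and the helper `run_squares` are hypothetical names made up for this illustration. `pool.map` blocks until all tasks are done and returns a list in the same order as the input.

```python
from multiprocessing import Pool


def square(x):
    """Hypothetical worker: return the square of x."""
    return x * x


def run_squares(values, processes=2):
    """Run square() over values in a pool and return the results as a list."""
    with Pool(processes=processes) as pool:
        # map() blocks until every task has finished and keeps input order.
        return pool.map(square, values)


if __name__ == "__main__":
    for result in run_squares([1, 2, 3]):
        print(result)
```

Note that the worker has to return its value; a function that only prints, like the earlier example, yields a list of None.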
2. Template 2
Sometimes the parallel function takes more than one varying argument. Besides the workaround of reshaping params_list, you can also do this:
from multiprocessing import Pool

params_list1 = [
    # Replace these with your own items
    item1,
    item2,
    item3,
    ......
]
params_list2 = [
    # Replace these with your own items
    item1,
    item2,
    item3,
    ......
]

if __name__ == '__main__':
    p = Pool(8)
    # Submit one task per (a, b) pair; parallel_fun is your own function
    for a, b in zip(params_list1, params_list2):
        p.apply_async(parallel_fun, args=(a, b))
    p.close()
    p.join()
Again, here is an example:
from multiprocessing import Pool

def print_A_and_B(A, B):
    print("{}\t{}\t{}".format(A, B, A+B))

if __name__ == '__main__':
    params_list1 = [
        # Replace these with your own items
        1, 2, 3, 4, 5
    ]
    params_list2 = [
        # Replace these with your own items
        6, 7, 8, 9, 10
    ]
    p = Pool(8)
    # Submit one task per (a, b) pair
    for a, b in zip(params_list1, params_list2):
        p.apply_async(print_A_and_B, args=(a, b))
    p.close()  # no more tasks will be submitted
    p.join()   # wait for all submitted tasks to finish
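Another option for several varying arguments, not used in the template above, is `Pool.starmap`, which unpacks each tuple into positional arguments. The sketch below is only an illustration; `add_pair` and `run_pairs` are hypothetical names.

```python
from multiprocessing import Pool


def add_pair(a, b):
    """Hypothetical worker with two varying arguments."""
    return a + b


def run_pairs(list1, list2, processes=2):
    """Pair up the two lists and add each pair in a worker process."""
    with Pool(processes=processes) as pool:
        # zip() pairs the two lists; starmap() unpacks each pair into (a, b).
        return pool.starmap(add_pair, zip(list1, list2))


if __name__ == "__main__":
    print(run_pairs([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```

Like `map`, `starmap` blocks until everything is done and returns the results in input order, so it fits inside the `with Pool(...)` pattern from Template 1.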
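3. Notes on Template 2
About the p.close() and p.join() calls above:
Calling join() on a Pool object waits for all child processes to finish. You must call close() before join(), and once close() has been called no new Process can be added.
(Quoted from Liao Xuefeng's blog)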
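One thing Template 2 does not show: `apply_async` returns an `AsyncResult` handle, and an exception raised inside a worker only surfaces when you call `.get()` on that handle. If the handles are thrown away, failures go unnoticed. Below is a rough sketch of collecting both results and errors; `divide` and `run_divisions` are hypothetical names for illustration.

```python
from multiprocessing import Pool


def divide(a, b):
    """Hypothetical worker; raises ZeroDivisionError when b == 0."""
    return a / b


def run_divisions(pairs, processes=2):
    """Submit one task per pair, then collect each result or error."""
    pool = Pool(processes)
    # Keep every AsyncResult handle so results and errors can be read later.
    handles = [pool.apply_async(divide, args=pair) for pair in pairs]
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for all submitted tasks to finish

    outcomes = []
    for handle in handles:
        try:
            outcomes.append(handle.get())  # re-raises the worker's exception here
        except ZeroDivisionError as exc:
            outcomes.append(repr(exc))
    return outcomes


if __name__ == "__main__":
    print(run_divisions([(6, 3), (1, 0)]))
```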
4. if __name__ == '__main__'
This common if statement plays a crucial role. Without it, you get this error:
Exception has occurred: RuntimeError (note: full exception trace is shown but execution is paused at: <module>)
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.
        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:
            if __name__ == '__main__':
                freeze_support()
                ...
        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
  File "xxxxx/try.py", line 21, in <module>
    p = Pool(8)
  File "xxxxx", line 1, in <module> (Current frame)
Reference:
https://stackoverflow.com/questions/18204782/runtimeerror-on-windows-trying-python-multiprocessing
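On Windows, each child process first imports the main module (that is, the current file).
We need the if __name__ == '__main__' guard so that this import does not make child processes recursively create more child processes.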
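If you are curious which start method your own platform uses, the standard library can tell you. The sketch below is only an illustration; `describe_start_method` is a hypothetical helper name. With "spawn" or "forkserver" the main module is imported again in the children, so the guard is required; with "fork" it is still good practice.

```python
import multiprocessing as mp


def describe_start_method():
    """Return the current start method and whether the guard is strictly required."""
    method = mp.get_start_method()  # one of "fork", "spawn", "forkserver"
    # "spawn" and "forkserver" re-import the main module in child processes.
    guard_required = method in ("spawn", "forkserver")
    return method, guard_required


if __name__ == "__main__":
    method, guard_required = describe_start_method()
    print("start method:", method)
    print("__main__ guard strictly required:", guard_required)
```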
5. Checking the number of CPU cores
Taken from:
https://www.cnblogs.com/emanlee/p/3587571.html
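# Total cores = number of physical CPUs x cores per physical CPU
# Total logical CPUs = number of physical CPUs x cores per physical CPU x hyper-threads per core
# Number of physical CPUs
cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l
# Number of cores in each physical CPU
cat /proc/cpuinfo| grep "cpu cores"| uniq
# Number of logical CPUs
cat /proc/cpuinfo| grep "processor"| wc -l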
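The commands above are Linux-only. From inside Python, `os.cpu_count()` reports the number of logical CPUs on any platform, which is handy for choosing the Pool size. A small sketch follows; `suggested_pool_size` and its `extra_for_blocking` parameter are hypothetical, made up here to mirror the "slightly more than the core count for blocking work" advice from Template 1.

```python
import os


def suggested_pool_size(extra_for_blocking=0):
    """Return the logical CPU count plus an optional margin for blocking tasks."""
    logical_cpus = os.cpu_count() or 1  # cpu_count() may return None
    return logical_cpus + extra_for_blocking


if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count())
    print("pool size for blocking work:", suggested_pool_size(extra_for_blocking=2))
```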