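Recently I ran into a tedious problem: a set of data metrics had to be computed in advance, and the computation was slow, so I thought of parallelizing it. I had never done parallelization in Python before, so I searched around. Inspired by Caspar's translation (the original article is here), I came up with the following implementation: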
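from multiprocessing import Pool


def compute(params):
    '''
    params: [param1, param2, param3, ...]
    '''
    # ...
    pass


def get_params(dataset):
    '''
    Return the list of params; each param is itself a list.
    '''
    # ... (num_batch and param1..param3 are derived from dataset here)
    params = []
    for batch_num in range(num_batch):  # range works on both Python 2 and 3
        param = []
        param.append(param1)
        param.append(param2)
        param.append(param3)
        params.append(param)
    return params


if __name__ == '__main__':
    dataset = 'xxxxxx'
    params = get_params(dataset)
    pool = Pool()              # worker count defaults to the number of CPUs
    pool.map(compute, params)  # each element of params is one call to compute
    pool.close()               # no more tasks will be submitted
    pool.join()                # wait for the workers to finish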
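A list is used here to bundle multiple parameters together, because pool.map passes exactly one argument to compute for each task.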
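
To make the pattern concrete, here is a minimal runnable sketch of the same idea. The arithmetic inside compute, the helper build_params (a stand-in for get_params), and run_parallel are hypothetical and exist only for illustration; the real work and parameters would come from your own dataset.

```python
from multiprocessing import Pool


def compute(params):
    """Hypothetical worker: unpack one packed parameter list and combine it."""
    a, b, c = params
    return a * b + c


def build_params(num_batch):
    """Hypothetical stand-in for get_params: one [a, b, c] list per batch."""
    return [[i, i + 1, i + 2] for i in range(num_batch)]


def run_parallel(param_list, processes=2):
    """Map compute over param_list in worker processes, preserving input order."""
    # The with-block shuts the pool down once map has returned all results.
    with Pool(processes=processes) as pool:
        return pool.map(compute, param_list)


if __name__ == "__main__":
    params = build_params(4)
    results = run_parallel(params)
    print(results)
```

pool.map returns the results in the same order as the input list, so they can be matched back to their batches by index. On Python 3.3 and later, Pool.starmap is an alternative: it unpacks each inner list into separate positional arguments, so the worker can be written as compute(a, b, c) instead of unpacking a list itself.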