Python multi-loop programs: how to use multiprocessing in Python to sum a loop in parallel

I am having difficulty understanding how to use Python's multiprocessing module.

I need to compute a sum from 1 to n where n = 10^10. That range is too large to fit into a list, and building a list seems to be what many online multiprocessing examples rely on.

Is there a way to "split up" the range into segments of a certain size and then perform the sum for each segment?

For instance

def sum_nums(low, high):
    result = 0
    for i in range(low, high+1):
        result += i
    return result

I want to compute sum_nums(1,10**10) by breaking it up into many pieces: sum_nums(1,1000) + sum_nums(1001,2000) + sum_nums(2001,3000)... and so on. I know there is a closed form, n(n+1)/2, but pretend we don't know that.

Here is what I've tried

import multiprocessing

def sum_nums(low, high):
    result = 0
    for i in range(low, high+1):
        result += i
    return result

if __name__ == "__main__":
    n = 1000
    procs = 2
    # integer division so the segment bounds stay ints
    sizeSegment = n // procs
    jobs = []
    for i in range(0, procs):
        process = multiprocessing.Process(target=sum_nums, args=(i*sizeSegment+1, (i+1)*sizeSegment))
        jobs.append(process)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    #where is the result?

Solution

First, the best way to get around the memory issue is to use an iterator/generator instead of a list:

def sum_nums(low, high):
    result = 0
    # xrange yields values lazily instead of building a list (Python 2)
    for i in xrange(low, high+1):
        result += i
    return result

In Python 3, range() returns a lazy range object rather than a list, so xrange is only needed in Python 2.
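To illustrate the point about lazy ranges, here is a minimal Python 3 sketch of the same function. It assumes only the standard library. The bounds in the usage lines are illustrative and are not taken from the original answer.

```python
def sum_nums(low, high):
    """Sum the integers from low to high, inclusive."""
    result = 0
    # In Python 3, range() produces values on demand instead of building a list.
    for i in range(low, high + 1):
        result += i
    return result


# A range object stores only start, stop and step, so creating a huge one is cheap.
big = range(1, 10**10 + 1)
print(big[0], big[-1])     # first and last term, without allocating a list
print(sum_nums(1, 1000))   # small illustrative run
```

Creating `big` uses almost no memory. Only the time spent iterating over it grows with n.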

Multiprocessing comes in when you want to split the work across different processes or CPU cores. If you don't need to control the individual workers, then the easiest method is to use a process pool. This lets you map a function over the pool and collect the output. Alternatively, you can use apply_async to submit jobs to the pool one at a time; each call returns a deferred result that you retrieve with .get():

# Python 2 syntax (print statement, xrange, tuple parameter in sn)
import multiprocessing
from multiprocessing import Pool
from time import time

def sum_nums(low, high):
    result = 0
    for i in xrange(low, high+1):
        result += i
    return result

# map requires a function to handle a single argument
def sn((low,high)):
    return sum_nums(low, high)

if __name__ == '__main__':
    #t = time()
    # takes forever
    #print sum_nums(1,10**10)
    #print '{} s'.format(time() -t)
    p = Pool(4)
    n = int(1e8)
    # segment boundaries: 0, n, 2n, ..., 10**10
    r = range(0,10**10+1,n)
    results = []

    # using apply_async
    t = time()
    for arg in zip([x+1 for x in r],r[1:]):
        results.append(p.apply_async(sum_nums, arg))

    # wait for results
    print sum(res.get() for res in results)
    print '{} s'.format(time() -t)

    # using process pool
    t = time()
    print sum(p.map(sn, zip([x+1 for x in r], r[1:])))
    print '{} s'.format(time() -t)

On my machine, just calling sum_nums with 10**10 takes almost 9 minutes, but using a Pool(8) and n=int(1e8) reduces this to just over a minute.
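For readers on Python 3, the pool approach might look like the sketch below. It is a port under stated assumptions, not the original author's code. `segments` and `parallel_sum` are hypothetical helper names introduced here, and `Pool.starmap` stands in for the single-argument wrapper `sn`, since tuple parameters are no longer valid syntax in Python 3. The sizes are deliberately small so the example finishes quickly.

```python
from multiprocessing import Pool


def sum_nums(low, high):
    """Sum the integers from low to high, inclusive."""
    result = 0
    for i in range(low, high + 1):
        result += i
    return result


def segments(n, size):
    """Yield inclusive (low, high) pairs covering 1..n in pieces of at most `size`."""
    # Hypothetical helper, not part of the original answer.
    for low in range(1, n + 1, size):
        yield low, min(low + size - 1, n)


def parallel_sum(n, size, workers):
    """Sum 1..n by handing each segment to a worker process."""
    with Pool(workers) as pool:
        # starmap unpacks each (low, high) tuple into sum_nums(low, high),
        # so no single-argument wrapper function is needed.
        return sum(pool.starmap(sum_nums, segments(n, size)))


if __name__ == "__main__":
    # Small illustrative sizes; the answer above used 10**10 with segments of 10**8.
    n = 10**7
    total = parallel_sum(n, size=10**6, workers=4)
    print(total == n * (n + 1) // 2)
```

The Process-based attempt in the question discards each worker's return value, which is why no result appears there. A pool collects the return values and hands them back to the parent process.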
