Python maximum iteration count: Python fails with a large number of iterations

Latest recommended article published on 2022-09-13 16:08:33

苏格拉晴

Article tags: python maximum iteration count

I wrote a simple Monte Carlo π calculation program in Python, using the multiprocessing module.

It works just fine, but when I pass 1E+10 iterations for each worker, something goes wrong and the result is incorrect. I can't understand what the problem is, because everything is fine with 1E+9 iterations!

import sys
from multiprocessing import Pool
from random import random

def calculate_pi(iters):
    """ Worker function """
    points = 0 # points inside circle
    for i in iters:
        x = random()
        y = random()
        if x ** 2 + y ** 2 <= 1:
            points += 1
    return points

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print "Usage: python pi.py workers_number iterations_per_worker"
        exit()
    procs = int(sys.argv[1])
    iters = float(sys.argv[2]) # 1E+8 is cool
    p = Pool(processes=procs)
    total = iters * procs
    total_in = 0
    for points in p.map(calculate_pi, [xrange(int(iters))] * procs):
        total_in += points
    print "Total: ", total, "In: ", total_in
    print "Pi: ", 4.0 * total_in / total

Solution

The problem seems to be that multiprocessing has a limit to the largest int it can pass to subprocesses inside an xrange. Here's a quick test:

import sys
from multiprocessing import Pool

def doit(n):
    print n

if __name__ == "__main__":
    procs = int(sys.argv[1])
    iters = int(float(sys.argv[2]))
    p = Pool(processes=procs)
    for points in p.map(doit, [xrange(int(iters))] * procs):
        pass

Now:

$ ./multitest.py 2 1E8
xrange(100000000)
$ ./multitest.py 2 1E9
xrange(1000000000)
$ ./multitest.py 2 1E10
xrange(1410065408)

This is part of a more general problem with multiprocessing: It relies on standard Python pickling, with some minor (and not well documented) extensions to pass values. Whenever things go wrong, the first thing to check is that the values are arriving the way you expected.

In fact, you can see this problem by playing with pickle, without even touching multiprocessing (which isn't always the case, because of those minor extensions, but often is):

>>> import pickle
>>> pickle.dumps(xrange(int(1E9)))
'c__builtin__\nxrange\np0\n(I0\nI1000000000\nI1\ntp1\nRp2\n.'
>>> pickle.dumps(xrange(int(1E10)))
'c__builtin__\nxrange\np0\n(I0\nI1410065408\nI1\ntp1\nRp2\n.'

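Even without learning all the details of the pickle protocol, it should be obvious that the I1000000000 in the first case is 1E9 as an int, while the equivalent chunk of the next case is 1410065408 (about 1.41E9), not 1E10, as an int. That number is exactly 10**10 % 2**32, which suggests the value was truncated to 32 bits somewhere along the way. You can experiment with other values to confirm this.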
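The check described above can be sketched as a small helper. This is only an illustration, written for Python 3 (where `range` replaces `xrange` and compares by value); the names `survives_pickle` and `truncate_to_32_bits` are hypothetical and not part of the original answer.

```python
import pickle


def survives_pickle(value):
    """Return True if value is unchanged after a pickle round trip."""
    return pickle.loads(pickle.dumps(value)) == value


def truncate_to_32_bits(n):
    """Keep only the low 32 bits of n, mimicking a 32-bit truncation."""
    return n % 2**32


# 1E9 fits in 32 bits, so truncation leaves it alone; 1E10 does not.
small = truncate_to_32_bits(10**9)
large = truncate_to_32_bits(10**10)
```

Here `large` matches the corrupted 1410065408 from the pickle output, while `small` is untouched. Running `survives_pickle` on any argument before handing it to `Pool.map` is a cheap way to see whether it will arrive intact.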

One obvious solution to try is to pass int(iters) instead of xrange(int(iters)), and let calculate_pi create the xrange from its argument. (Note: In some cases an obvious transformation like this can hurt performance, maybe badly. But in this case, it's probably slightly better if anything, since you pass a simpler object and parallelize the xrange construction, and of course the difference is so tiny it probably won't matter. Just make sure to think before blindly transforming.)

And a quick test shows that this now works:

import sys
from multiprocessing import Pool

def doit(n):
    print xrange(n)

if __name__ == "__main__":
    procs = int(sys.argv[1])
    iters = int(float(sys.argv[2]))
    p = Pool(processes=procs)
    for points in p.map(doit, [iters] * procs):
        pass

Then:

$ ./multitest.py 2 1E10

xrange(10000000000)
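Applied to the original π program, the same change might look like the sketch below. It assumes Python 3 (so `range` instead of `xrange`), and `estimate_pi` is a hypothetical wrapper introduced here for illustration. The point is that only plain ints cross the process boundary.

```python
from multiprocessing import Pool
from random import random


def calculate_pi(iters):
    """Worker: count random points that land inside the unit quarter circle."""
    points = 0
    for _ in range(iters):  # the range is built here, inside the worker
        x = random()
        y = random()
        if x * x + y * y <= 1:
            points += 1
    return points


def estimate_pi(procs, iters):
    """Run `procs` workers with `iters` samples each and return the estimate."""
    with Pool(processes=procs) as pool:
        # Each worker receives an int, not a range object.
        total_in = sum(pool.map(calculate_pi, [iters] * procs))
    # Both operands are ints, so no float rounding enters before the division.
    return 4.0 * total_in / (iters * procs)
```

To try it, call something like `estimate_pi(2, 10**6)` under an `if __name__ == "__main__":` guard, as in the original script.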

However, you will still run into a larger limit:

$ ./multitest.py 2 1E100

OverflowError: Python int too large to convert to C long

Again, it's the same basic problem. One way to solve that is to pass the arg all the way down as a string, and do the int(float(a)) inside the subprocesses.
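A minimal sketch of that idea follows, assuming Python 3. Python 3 ints are not bound to a C long, so this only shows the shape of the approach rather than reproducing the Python 2 overflow. The worker receives the raw command-line string and converts it itself.

```python
def doit(text):
    """Worker: parse the iteration count from its string form, then use it."""
    n = int(float(text))  # conversion happens inside the subprocess
    return repr(range(n))
```

With a pool `p`, the call would become `p.map(doit, [sys.argv[2]] * procs)`, so only a short string is pickled for each worker.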

As a side note: The reason I'm doing iters = int(float(sys.argv[2])) instead of just iters = float(sys.argv[2]) and then using int(iters) later is to avoid accidentally using the float iters value later on (as the OP's version does, in computing total and therefore total_in / total).

And keep in mind that if you get to big enough numbers, you run into the limits of the C double type: 1E23 is typically 99999999999999991611392, not 100000000000000000000000.
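The rounding is easy to reproduce, and one way around it (an assumption on my part, not something the original answer proposes) is to parse the argument with `decimal.Decimal` instead of `float`. The helper name `parse_exact` is hypothetical.

```python
from decimal import Decimal


def parse_exact(text):
    """Parse text such as "1E23" into an int without going through a C double."""
    return int(Decimal(text))


via_float = int(float("1E23"))  # rounded to the nearest representable double
via_decimal = parse_exact("1E23")  # exact power of ten
```

`via_float` comes out slightly below 10**23, while `via_decimal` is exact.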
