Python apply_async does not run the function: Pool.apply_async() does not execute a nested function


一不小心就来了

I am getting familiar with Python's multiprocessing module. The following code works as expected:

# outputs 0 1 2 3
from multiprocessing import Pool

def run_one(x):
    print x
    return

pool = Pool(processes=12)
for i in range(4):
    pool.apply_async(run_one, (i,))
pool.close()
pool.join()

However, if I wrap a function around the above code, the print statements are not executed (or at least the output is redirected):

# outputs nothing
def run():
    def run_one(x):
        print x
        return

    pool = Pool(processes=12)
    for i in range(4):
        pool.apply_async(run_one, (i,))
    pool.close()
    pool.join()

If I move the run_one definition outside of run, the output is as expected again when I call run():

# outputs 0 1 2 3
def run_one(x):
    print x
    return

def run():
    pool = Pool(processes=12)
    for i in range(4):
        pool.apply_async(run_one, (i,))
    pool.close()
    pool.join()

What am I missing here? Why isn't the second snippet printing anything? If I simply call run_one(i) directly instead of using apply_async, all three snippets produce the same output.

Solution

Pool needs to pickle (serialize) everything it sends to its worker processes. Pickling a function saves only its name (together with the name of its module), and unpickling re-imports the function by that name.
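The pickling-by-name behaviour can be checked in a few lines. This is a minimal sketch, assuming Python 3; json.dumps is used only as a convenient example of a function defined at the top level of an importable module.

```python
import json
import pickle

# json.dumps is a plain Python function defined at the top level of the json module.
payload = pickle.dumps(json.dumps)

# The payload contains the module name and the function name, not the function's code.
print(b"json" in payload, b"dumps" in payload)

# Unpickling looks the name up again and hands back the very same function object.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

Because only the name travels to the worker, the worker must be able to import that name itself.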

For that to work, the function needs to be defined at the top level of a module. Nested functions are not importable by the child, and merely trying to pickle them already raises an exception:

from multiprocessing.connection import _ForkingPickler

def run():
    def foo(x):
        pass
    _ForkingPickler.dumps(foo)  # multiprocessing's custom pickler;
                                # same effect with pickle.dumps(foo)

run()

# Out:
Traceback (most recent call last):
...
AttributeError: Can't pickle local object 'run.<locals>.foo'

You don't see an exception because Pool catches exceptions raised while pickling tasks in the parent and re-raises them only when you call .get() on the AsyncResult object that pool.apply_async() returns immediately.

That's why (with Python 2) you should always use it like this, even if your target function doesn't return anything (it still implicitly returns None):

results = [pool.apply_async(foo, (i,)) for i in range(4)]
# `pool.apply_async()` immediately returns AsyncResult (ApplyResult) object

for res in results:
    res.get()
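To make the silent failure concrete, here is a self-contained sketch, assuming Python 3. The helper name submit_nested is hypothetical and exists only for this illustration. The exact exception type for an unpicklable local function differs between Python versions, so the sketch catches both AttributeError and pickle.PicklingError.

```python
import pickle
from multiprocessing import Pool


def submit_nested():
    """Submit a nested function and return the exception that .get() re-raises."""

    def nested(x):  # local function: it cannot be pickled by name
        return x

    with Pool(processes=2) as pool:
        # Nothing is raised here; the pickling failure is stored in the AsyncResult.
        result = pool.apply_async(nested, (1,))
        try:
            # The stored exception surfaces only now; the timeout guards against hangs.
            result.get(timeout=30)
        except (AttributeError, pickle.PicklingError) as exc:
            return exc
    return None


if __name__ == "__main__":
    print(repr(submit_nested()))
```

Without the call to .get(), the function would return normally and the failure would go unnoticed, which is what the question observed.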

Non-async Pool methods like Pool.map() and Pool.starmap() use the same (asynchronous) low-level functions under the hood as their asynchronous siblings, but they additionally call .get() for you, so you will always see an exception with these methods.

Python 3 adds an error_callback parameter to the asynchronous Pool methods, which you can use instead to handle exceptions.
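A minimal sketch of the error_callback approach, assuming Python 3. The helper name collect_errors is hypothetical, and operator.truediv stands in for a real worker function because it is importable by name and fails predictably when dividing by zero.

```python
import operator
from multiprocessing import Pool


def collect_errors():
    """Run three divisions in a pool and return the exceptions reported via error_callback."""
    errors = []
    pool = Pool(processes=2)
    try:
        for divisor in (1, 0, 2):
            # error_callback runs in the parent process when the task fails.
            pool.apply_async(operator.truediv, (1, divisor),
                             error_callback=errors.append)
        pool.close()
        pool.join()  # wait until every task and its callback has finished
    finally:
        pool.terminate()
    return errors


if __name__ == "__main__":
    for error in collect_errors():
        print(repr(error))
```

The callback receives the exception instance as its only argument, so failures are reported even when nobody calls .get() on the result objects.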