python类中数据成员_Python:有效的解决方法,用于多处理一个类中的数据成员的函数...

I'm aware of various discussions of limitations of the multiprocessing module when dealing with functions that are data members of a class (due to Pickling problems).

But is there another module, or any sort of work-around in multiprocessing, that allows something specifically like the following (specifically without forcing the definition of the function to be applied in parallel to exist outside of the class)?

class MyClass():

def __init__(self):

self.my_args = [1,2,3,4]

self.output = {}

def my_single_function(self, arg):

return arg**2

def my_parallelized_function(self):

# Use map or map_async to map my_single_function onto the

# list of self.my_args, and append the return values into

# self.output, using each arg in my_args as the key.

# The result should make self.output become

# {1:1, 2:4, 3:9, 4:16}

foo = MyClass()

foo.my_parallelized_function()

print foo.output

Note: I can easily do this by moving my_single_function outside of the class, and passing something like foo.my_args to the map or map_async commands. But this pushes the parallelized execution of the function outside of instances of MyClass.

For my application (parallelizing a large data query that retrieves, joins, and cleans monthly cross-sections of data, and then appends them into a long time-series of such cross-sections), it is very important to have this functionality inside the class since different users of my program will instantiate different instances of the class with different time intervals, different time increments, different sub-sets of data to gather, and so on, that should all be associated with that instance.

Thus, I want the work of parallelizing to also be done by the instance, since it owns all the data relevant to the parallelized query, and it would just be silly to try write some hacky wrapper function that binds to some arguments and lives outside of the class (Especially since such a function would be non-general. It would need all kinds of specifics from inside the class.)

解决方案

Steven Bethard has posted a way to allow methods to be pickled/unpickled. You could use it like this:

import multiprocessing as mp

import copy_reg

import types

def _pickle_method(method):

# Author: Steven Bethard

# http://bytes.com/topic/python/answers/552476-why-cant-you-pickle-instancemethods

func_name = method.im_func.__name__

obj = method.im_self

cls = method.im_class

cls_name = ''

if func_name.startswith('__') and not func_name.endswith('__'):

cls_name = cls.__name__.lstrip('_')

if cls_name:

func_name = '_' + cls_name + func_name

return _unpickle_method, (func_name, obj, cls)

def _unpickle_method(func_name, obj, cls):

# Author: Steven Bethard

# http://bytes.com/topic/python/answers/552476-why-cant-you-pickle-instancemethods

for cls in cls.mro():

try:

func = cls.__dict__[func_name]

except KeyError:

pass

else:

break

return func.__get__(obj, cls)

# This call to copy_reg.pickle allows you to pass methods as the first arg to

# mp.Pool methods. If you comment out this line, `pool.map(self.foo, ...)` results in

# PicklingError: Can't pickle : attribute lookup

# __builtin__.instancemethod failed

copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)

class MyClass(object):

def __init__(self):

self.my_args = [1,2,3,4]

self.output = {}

def my_single_function(self, arg):

return arg**2

def my_parallelized_function(self):

# Use map or map_async to map my_single_function onto the

# list of self.my_args, and append the return values into

# self.output, using each arg in my_args as the key.

# The result should make self.output become

# {1:1, 2:4, 3:9, 4:16}

self.output = dict(zip(self.my_args,

pool.map(self.my_single_function, self.my_args)))

Then

pool = mp.Pool()

foo = MyClass()

foo.my_parallelized_function()

yields

print foo.output

# {1: 1, 2: 4, 3: 9, 4: 16}

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值