python instance method_Python Can't pickle type 'instancemethod' 问题

最近给Simiki从单进程改为多进程时, 遇到这个报错.

给一个正常的简化的版本:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import multiprocessing

def func(x):

print(x * x)

class Klass(object):

def __init__(self):

print "Constructor ... %s" % multiprocessing.current_process().name

def __del__(self):

print "... Destructor %s" % multiprocessing.current_process().name

def func(self, x):

print(x * x)

def run(self):

pool = multiprocessing.Pool(processes=3)

for num in range(8):

pool.apply_async(func, args=(num,))

pool.close()

pool.join()

if __name__ == '__main__':

_kls = Klass()

_kls.run()

这里加上构造函数和析构函数, 是因为后面在这里会有问题.

另外, 这里有个比较奇怪的现象, 原先实例类时, 变量名是kls, 但是析构出错, 改为als没有问题, 试了很多变量名, 有的有问题, 有的没问题, 还不清楚原因

Constructor ... MainProcess

Exception AttributeError: "'NoneType' object has no attribute 'current_process'" in > ignored

原因已经找到, 参考__del__文档:

Starting with version 1.5, Python guarantees that globals whose name begins with a single underscore are deleted from their module before other globals are deleted; if no other references to such globals exist, this may help in assuring that imported modules are still available at the time when the __del__() method is called.

对于普通变量, Python无法保证__del__和对象销毁的顺序, 如果变量名以下划线开头, 可以保证此变量在其它导入模块之前调用.

具体可以看看我在StackOverflow上的提问

正常执行结果:

Constructor ... MainProcess

0

1

4

9

16

25

36

49

... Destructor MainProcess

如果把func移到类中, 改为method, 则问题来了:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import multiprocessing

class Klass(object):

def __init__(self):

print "Constructor ... %s" % multiprocessing.current_process().name

def __del__(self):

print "... Destructor %s" % multiprocessing.current_process().name

def func(self, x):

print(x * x)

def run(self):

pool = multiprocessing.Pool(processes=3)

for num in range(8):

pool.apply_async(self.func, args=(num,))

pool.close()

pool.join()

if __name__ == '__main__':

_kls = Klass()

_kls.run()

执行报错:

Constructor ... MainProcess

Exception in thread Thread-2:

Traceback (most recent call last):

File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner

self.run()

File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 761, in run

self.__target(*self.__args, **self.__kwargs)

File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 342, in _handle_tasks

put(task)

PicklingError: Can't pickle : attribute lookup __builtin__.instancemethod failed

原因是multiprocessing会对调用的函数做序列化, 然后method不可被序列化. 具体参考python2 - pickle的文档, 里面列出了可被序列化的对象:

None, True, and False

integers, long integers, floating point numbers, complex numbers

normal and Unicode strings

tuples, lists, sets, and dictionaries containing only picklable objects

functions defined at the top level of a module

built-in functions defined at the top level of a module

classes that are defined at the top level of a module

instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section The pickle protocol for details).

关于解决思路, 众说纷坛, 有利有弊.

最简单的思路? 感谢这个兄弟的思路, 加一个全局的代理函数, 他也注意到了后面的几个方法会出现析构多次, 但是这个其实也会这样:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import multiprocessing

def proxy(cls_instance, i):

return cls_instance.func(i)

class Klass(object):

def __init__(self):

print "Constructor ... %s" % multiprocessing.current_process().name

def __del__(self):

print "... Destructor %s" % multiprocessing.current_process().name

def func(self, x):

print(x * x)

def run(self):

pool = multiprocessing.Pool(processes=3)

for num in range(8):

#pool.apply_async(self.func, args=(num,))

pool.apply_async(proxy, args=(self, num,))

pool.close()

pool.join()

if __name__ == '__main__':

_kls = Klass()

_kls.run()

执行代码:

Constructor ... MainProcess

0

1

4

... Destructor PoolWorker-1

9

... Destructor PoolWorker-2

16

... Destructor PoolWorker-1

36

... Destructor PoolWorker-3

25

... Destructor PoolWorker-2

49

... Destructor PoolWorker-1

... Destructor PoolWorker-3

... Destructor PoolWorker-2

... Destructor MainProcess

注意到这里线程池中的worker析构了8次, 每个worker进程析构了2-3次, 主进程析构了1次.

还有一个介绍的比较多的就是用copy_reg将MethodType注册为可序列化来解决:

参考这个回答的模板:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import copy_reg

import types

def _pickle_method(m):

if m.im_self is None:

return getattr, (m.im_class, m.im_func.func_name)

else:

return getattr, (m.im_self, m.im_func.func_name)

copy_reg.pickle(types.MethodType, _pickle_method)

或者这个回答的模板:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

from copy_reg import pickle

from types import MethodType

def _pickle_method(method):

func_name = method.im_func.__name__

obj = method.im_self

cls = method.im_class

return _unpickle_method, (func_name, obj, cls)

def _unpickle_method(func_name, obj, cls):

for cls in cls.mro():

try:

func = cls.__dict__[func_name]

except KeyError:

pass

else:

break

return func.__get__(obj, cls)

pickle(MethodType, _pickle_method, _unpickle_method)

这里用到了mro()

im_self is the class instance object;

im_func is the function object;

im_class is the class of im_self for bound methods or the class that asked for the method for unbound methods;

具体见文档

#!/usr/bin/env python

# -*- coding: utf-8 -*-

class Klass(object):

def func(self):

pass

def main(self):

print 'self.func.im_self: %s' % self.func.im_self

print 'self.func.im_class: %s' % self.func.im_class

print 'self.func.im_func: %s' % self.func.im_func

_kls = Klass()

_kls.main()

print '---'

print '_kls.func.im_self: %s' % _kls.func.im_self

print '_kls.func.im_class: %s' % _kls.func.im_class

print '_kls.func.im_func: %s' % _kls.func.im_func

print '---'

print 'Klass.func.im_self: %s' % Klass.func.im_self

print 'Klass.func.im_class: %s' % Klass.func.im_class

print 'Klass.func.im_func: %s' % Klass.func.im_func

执行结果:

self.func.im_self: <__main__.klass object at>

self.func.im_class:

self.func.im_func:

---

_kls.func.im_self: <__main__.klass object at>

_kls.func.im_class:

_kls.func.im_func:

---

Klass.func.im_self: None

Klass.func.im_class:

Klass.func.im_func:

接着回来, 上面的方法, 即:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import multiprocessing

import copy_reg

import types

def _pickle_method(m):

if m.im_self is None:

return getattr, (m.im_class, m.im_func.func_name)

else:

return getattr, (m.im_self, m.im_func.func_name)

copy_reg.pickle(types.MethodType, _pickle_method)

class Klass(object):

def __init__(self):

print "Constructor ... %s" % multiprocessing.current_process().name

def __del__(self):

print "... Destructor %s" % multiprocessing.current_process().name

def func(self, x):

print(x * x)

def run(self):

pool = multiprocessing.Pool(processes=3)

for num in range(8):

pool.apply_async(self.func, args=(num,))

pool.close()

pool.join()

if __name__ == '__main__':

_kls = Klass()

_kls.run()

执行代码:

Constructor ... MainProcess

0

1

4

... Destructor PoolWorker-1

9

... Destructor PoolWorker-2

16

... Destructor PoolWorker-1

36

... Destructor PoolWorker-3

25

... Destructor PoolWorker-1

49

... Destructor PoolWorker-2

... Destructor PoolWorker-1

... Destructor PoolWorker-3

... Destructor MainProcess

同上.

另一个方法是, 参考这个回答, 定义__call__. 这样就可以将类本身, 如这里的self传给apply_async, 因为顶层定义的类是可被序列化的:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import multiprocessing

class Klass(object):

def __init__(self):

print "Constructor ... %s" % multiprocessing.current_process().name

def __del__(self):

print "... Destructor %s" % multiprocessing.current_process().name

def __call__(self, x):

return self.func(x)

def func(self, x):

print(x * x)

def run(self):

pool = multiprocessing.Pool(processes=3)

for num in range(8):

#pool.apply_async(self.func, args=(num,))

pool.apply_async(self, args=(num,))

pool.close()

pool.join()

if __name__ == '__main__':

_kls = Klass()

_kls.run()

执行代码:

Constructor ... MainProcess

0

1

4

... Destructor PoolWorker-1

9

... Destructor PoolWorker-2

16

... Destructor PoolWorker-3

25

... Destructor PoolWorker-1

36

... Destructor PoolWorker-2

49

... Destructor PoolWorker-3

... Destructor PoolWorker-1

... Destructor PoolWorker-2

... Destructor MainProcess

这个回答的方法其实和之前设置的代理函数一样, 都是把method提升到global:

#!/usr/bin/env python

# -*- coding: utf-8 -*-

import multiprocessing

def call_it(instance, name, args=(), kwargs=None):

"""indirect caller for instance methods and multiprocessing"""

if kwargs is None:

kwargs = {}

return getattr(instance, name)(*args, **kwargs)

class Klass(object):

def __init__(self):

print "Constructor ... %s" % multiprocessing.current_process().name

def __del__(self):

print "... Destructor %s" % multiprocessing.current_process().name

def func(self, x):

print(x * x)

def run(self):

pool = multiprocessing.Pool(processes=3)

for num in range(8):

#pool.apply_async(self.func, args=(num,))

pool.apply_async(call_it, args=(self, 'func', (num,)))

pool.close()

pool.join()

if __name__ == '__main__':

_kls = Klass()

_kls.run()

执行代码:

Constructor ... MainProcess

0

1

4

... Destructor PoolWorker-1

9

... Destructor PoolWorker-1

36

... Destructor PoolWorker-3

25

... Destructor PoolWorker-2

16

... Destructor PoolWorker-1

49

... Destructor PoolWorker-3

... Destructor PoolWorker-1

... Destructor PoolWorker-2

... Destructor MainProcess

关于析构8次的问题,看了copy_reg, pickle, multiprocessing的代码, 还是没什么头绪. 坑爹!

这块考虑, 最还还是把多进程调用的函数抽离出来放在全局是最安全的; 毕竟其它的方案又丑陋, 并且最主要的是各种黑魔法, 完全不清楚内部细节.

另外, Python3.3+ 解决了这个问题, 所以在Python3下是可运行的, 这里有个讨论

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值