Python 3 multiprocessing: sharing Python objects between processes

Here I created a producer-consumer program: the parent process (producer) creates many child processes (consumers), then the parent process reads a file and passes the data to the child processes.

But here comes a performance problem: passing messages between processes costs too much time (I think).

For example, with 200MB of original data, the parent process reads and preprocesses it in less than 8 seconds, then just passing the data to the child processes via multiprocessing.Pipe costs another 8 seconds, while the child processes need only another 3-4 seconds for the remaining work.

So a complete run costs less than 18 seconds, and more than 40% of that time is spent on communication between processes, which is much more than I had expected. I also tried multiprocessing.Queue and Manager, and they were worse.
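
To make this concrete, the kind of measurement I mean looks roughly like the sketch below (a minimal, simplified example; the 200MB payload is just dummy bytes standing in for my real data):

    import time
    import multiprocessing

    def consumer(conn):
        data = conn.recv()   # unpickles the whole payload on the child side
        conn.close()

    if __name__ == '__main__':
        payload = b'x' * (200 * 1024 * 1024)   # ~200MB of dummy data
        parent_conn, child_conn = multiprocessing.Pipe()
        p = multiprocessing.Process(target=consumer, args=(child_conn,))
        p.start()
        t0 = time.time()
        parent_conn.send(payload)   # pickles and copies the data into the pipe
        p.join()
        print('transfer took %.1f seconds' % (time.time() - t0))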

I work on Windows 7 / Python 3.4.

I have googled for several days; POSH might be a good solution, but it can't be built with Python 3.4.

So I have three questions:

1. Is there any way to directly share Python objects between processes in Python 3.4, the way POSH does?

or

2. Is it possible to pass the "pointer" of an object to a child process, so that the child process can recover the "pointer" back into a Python object?

or

3. multiprocessing.Array may be a valid solution, but if I want to share a complex data structure such as a list, how does that work? Should I build a new class on top of it that provides list-like interfaces?

Edit 1:

I tried the third way, but it performs even worse.

I defined these values:

    p_pos = multiprocessing.Value('i')               # producer write position
    c_pos = multiprocessing.Value('i')               # consumer read position
    databuff = multiprocessing.Array('c', buff_len)  # shared buffer

and two functions:

    send_data(msg)
    get_data()

In the send_data function (parent process), it copies msg into databuff and sends the start and end positions (two integers) to the child process via a pipe.

Then, in the get_data function (child process), it receives the two positions and copies the msg back out of databuff.
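
In outline it looked like the sketch below (simplified: wrap-around handling is omitted, and the shared objects are passed to the child explicitly so it also works with Windows' spawn start method; names like worker and the buffer size are illustrative):

    import multiprocessing

    buff_len = 16 * 1024 * 1024                          # illustrative size

    def send_data(conn, databuff, p_pos, msg):
        # parent: copy msg into the shared buffer, then pass only the two
        # integer positions through the pipe
        start = p_pos.value
        end = start + len(msg)
        databuff[start:end] = msg
        p_pos.value = end
        conn.send((start, end))

    def get_data(conn, databuff, c_pos):
        # child: receive the two positions and copy the bytes back out
        start, end = conn.recv()
        msg = databuff[start:end]
        c_pos.value = end
        return msg

    def worker(conn, databuff, c_pos):
        msg = get_data(conn, databuff, c_pos)
        print('child got %d bytes' % len(msg))

    if __name__ == '__main__':
        p_pos = multiprocessing.Value('i', 0)            # producer write position
        c_pos = multiprocessing.Value('i', 0)            # consumer read position
        databuff = multiprocessing.Array('c', buff_len)  # shared buffer
        parent_conn, child_conn = multiprocessing.Pipe()
        p = multiprocessing.Process(target=worker,
                                    args=(child_conn, databuff, c_pos))
        p.start()
        send_data(parent_conn, databuff, p_pos, b'hello' * 1000)
        p.join()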

In the end, it cost about twice as much as just using the pipe @_@

Edit 2:

Yes, I tried Cython, and the result looks good.

I just changed my Python script's suffix to .pyx and compiled it, and the program sped up by 15%.
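
The compile step is just the standard Cython build (a minimal sketch; worker.pyx stands in for my renamed script):

    # setup.py -- minimal Cython build sketch; worker.pyx is a placeholder name
    from distutils.core import setup
    from Cython.Build import cythonize

    setup(ext_modules=cythonize('worker.pyx'))

Building it with "python setup.py build_ext --inplace" is the step that needs a C compiler on Windows, which is where the errors below came from.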

Unsurprisingly, I hit the "Unable to find vcvarsall.bat" and "The system cannot find the file specified" errors; I spent a whole day solving the first one and was then blocked by the second.

Finally, I found

Solution

I was in your place five months ago. I looked around a few times, but my conclusion is that multiprocessing with Python has exactly the problem you describe:

Pipes and Queues are good, but not for big objects, in my experience.

Manager() proxy objects are slow, except for arrays, and even those are limited. If you want to share a complex data structure, use a Namespace as is done here: multiprocessing in python - sharing large object (e.g. pandas dataframe) between multiple processes (see the sketch after this list).

There are no pointers and no real memory management in Python, so you can't share selected memory cells.
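
A minimal sketch of the Namespace idea (the attribute name data and the list attached to it are just placeholders for whatever object you want to share):

    import multiprocessing

    def worker(ns):
        # the child reads the shared attribute through the manager proxy
        print(len(ns.data))

    if __name__ == '__main__':
        mgr = multiprocessing.Manager()
        ns = mgr.Namespace()
        ns.data = list(range(1000))   # any picklable object can be attached
        p = multiprocessing.Process(target=worker, args=(ns,))
        p.start()
        p.join()

Keep in mind that every access to ns.data still transfers a pickled copy through the manager process.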

I solved this kind of problem by learning C++, but it's probably not what you want to read...
