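A Process can be distributed across multiple machines, whereas a Thread can at most be spread across the CPUs of a single machine.
Python's multiprocessing module supports more than local multiprocessing: its managers submodule also lets processes be distributed across multiple machines. One service process acts as the scheduler and hands tasks out to the other processes over the network. Because the managers module is well encapsulated, you can write a distributed multi-process program without knowing the details of the network communication.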
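The register/serve/connect pattern described above can be sketched in a single script. This is a minimal sketch, not the layout used in the listings that follow: the names `ServerManager`, `ClientManager`, `get_queue`, and `roundtrip` are made up for illustration, and the server runs in a background thread of the same process (via `get_server()` and `serve_forever()`) purely so the example is self-contained. Port 0 asks the operating system for any free port.

```python
import queue
import threading
from multiprocessing.managers import BaseManager

# The real queue lives on the serving side.
shared_queue = queue.Queue()


def get_queue():
    return shared_queue


class ServerManager(BaseManager):
    """Serving side: registers the name together with its callable."""


class ClientManager(BaseManager):
    """Connecting side: registers only the name."""


ServerManager.register('get_queue', callable=get_queue)
ClientManager.register('get_queue')


def roundtrip(item='hello'):
    # Port 0 lets the OS choose a free port; server.address reports the real one.
    server = ServerManager(address=('127.0.0.1', 0), authkey=b'abc').get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # The client must use the same address and authkey as the server.
    client = ClientManager(address=server.address, authkey=b'abc')
    client.connect()
    remote_queue = client.get_queue()   # a proxy, not the queue itself
    remote_queue.put(item)              # travels over the socket

    # The item has arrived in the server-side queue.
    return shared_queue.get(timeout=5)


if __name__ == '__main__':
    print(roundtrip())
```

The object returned by `client.get_queue()` is a proxy: every method call on it is forwarded over the socket to the queue held by the serving side, which is why the two sides can live on different machines.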
# task_master.py
import queue
import time
from multiprocessing.managers import BaseManager

que_num = 10
task_queue = queue.Queue(que_num)
result_queue = queue.Queue(que_num)


# Module-level functions (not lambdas) so they can be pickled
# on platforms that spawn the manager process.
def get_task():
    return task_queue


def get_result():
    return result_queue


BaseManager.register('get_task', callable=get_task)
BaseManager.register('get_result', callable=get_result)

if __name__ == '__main__':
    manager = BaseManager(address=('127.0.0.1', 5000), authkey=b'abc')
    manager.start()
    try:
        # Get the task queue and the result queue through the network
        task = manager.get_task()
        result = manager.get_result()
        # Add the tasks
        for i in range(que_num):
            print('Put task %d...' % i)
            task.put(i)
        # Check once per second whether all tasks have been executed
        while not result.full():
            time.sleep(1)
        for i in range(result.qsize()):
            ans = result.get()
            print('task %d is finished, runtime: %d s' % ans)
    except Exception as e:
        print('Manager error:', e)
    finally:
        # Always shut down, otherwise an "unclosed pipe" error is raised
        manager.shutdown()
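The master above decides that the work is done by polling `result.full()` once per second, which only works because the result queue's capacity equals the number of tasks. An alternative, sketched here with a local `queue.Queue` and a thread standing in for the remote worker, is to count results with a blocking `get(timeout=...)`. The names `collect_results` and `demo_collect` are hypothetical helpers, not part of the original program.

```python
import queue
import threading


def collect_results(result_queue, expected, timeout=5.0):
    """Return `expected` results; queue.Empty is raised if one takes too long."""
    results = []
    for _ in range(expected):
        # get() blocks until an item arrives or the timeout expires
        results.append(result_queue.get(timeout=timeout))
    return results


def demo_collect():
    result_queue = queue.Queue()

    def fake_worker():
        # Stand-in for a remote worker: reports (task number, runtime)
        for n in range(3):
            result_queue.put((n, 0))

    threading.Thread(target=fake_worker).start()
    return sorted(collect_results(result_queue, expected=3))


if __name__ == '__main__':
    print(demo_collect())
```

This removes the fixed one-second delay and does not depend on the queue's capacity. The timeout keeps the master from waiting forever if a worker dies.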
# task_worker.py
import queue
import random
import sys
import time
from multiprocessing.managers import BaseManager

# Register the names only; the callables live in task_master.py
BaseManager.register('get_task')
BaseManager.register('get_result')

if __name__ == '__main__':
    # Address, port, and authkey must match task_master.py
    m = BaseManager(address=('127.0.0.1', 5000), authkey=b'abc')
    try:
        m.connect()
    except Exception:
        print('connect failed')
        sys.exit(1)
    task = m.get_task()
    result = m.get_result()
    while not task.empty():
        try:
            n = task.get(timeout=1)
        except queue.Empty:
            # Another worker took the last task in the meantime
            break
        print('run task %d' % n)
        sleeptime = random.randint(0, 3)
        time.sleep(sleeptime)
        rt = (n, sleeptime)
        result.put(rt)
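Several workers can pull from the same task queue at once. In that case checking `empty()` and then calling `get()` is not atomic: another worker may take the last task between the two calls. The sketch below uses threads and a local `queue.Queue` as stand-ins for separate worker processes and the manager's queue; `run_worker` and `demo_workers` are illustrative names. It handles `queue.Empty` instead of relying on `empty()`.

```python
import queue
import threading


def run_worker(name, task_queue, result_queue):
    """Take tasks until the queue is drained, reporting (task, worker name)."""
    while True:
        try:
            n = task_queue.get_nowait()
        except queue.Empty:
            return  # nothing left to do
        result_queue.put((n, name))


def demo_workers(num_tasks=10, num_workers=3):
    task_queue = queue.Queue()
    result_queue = queue.Queue()
    for i in range(num_tasks):
        task_queue.put(i)

    threads = [
        threading.Thread(target=run_worker,
                         args=('worker-%d' % k, task_queue, result_queue))
        for k in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All workers have finished, so the result queue is no longer changing.
    done = [result_queue.get_nowait() for _ in range(result_queue.qsize())]
    return sorted(n for n, _ in done)


if __name__ == '__main__':
    print(demo_workers())
```

Each task number appears exactly once in the output, whichever worker handled it, because `get_nowait()` removes an item from the queue atomically.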