https://docs.python.org/2/library/multiprocessing.html
The multiprocessing module also introduces APIs which do not have analogs in the threading module. A prime example of this is the Pool object which offers a convenient means of parallelizing the execution of a function across multiple input values, distributing the input data across processes (data parallelism).
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))
outputs:
[1, 4, 9]
Process
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
from multiprocessing import Process
import os

def info(title):
    print title
    print 'module name:', __name__
    if hasattr(os, 'getppid'):  # only available on Unix
        print 'parent process:', os.getppid()
    print 'process id:', os.getpid()

def f(name):
    info('function f')
    print 'hello', name

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
outputs:
main line
module name: __main__
process id: 1888
function f
module name: __main__
process id: 12348
hello bob
Class Process
Process(group=None, target=None, name=None, args=(), kwargs={})
run():
Method representing the process’s activity.
start():
Start the process’s activity.
Start the process.
This must be called at most once per process object. It arranges for the object's run() method to be invoked in a separate process.
join([timeout]):
Block the calling thread until the process whose join() method is called terminates or until the optional timeout occurs.
If timeout is None, the calling process blocks until the child terminates; if a timeout is given, the caller resumes once the timeout elapses, even if the child is still running.
name:
The process’s name.
is_alive():
Return whether the child process is alive.
Roughly, a process object is alive from the moment start() returns until the child process terminates.
daemon:
When this is True, the child process is terminated when the main process exits.
Note that a daemonic process is not allowed to create child processes.
If a daemonic process were allowed to create child processes of its own, those children would become orphaned when the daemonic process is terminated along with its parent.
pid:
Return the process ID.
exitcode:
The child’s exit code
authkey:
The process’s authentication key (a byte string).
When multiprocessing is initialized the main process is assigned a random string using os.urandom().
terminate():
Terminate the process.
Note: if this method is used while the associated process is using a pipe or queue, the pipe or queue is liable to become corrupted and may become unusable by other processes. Similarly, if the process has acquired a lock or semaphore etc., terminating it is liable to cause other processes to deadlock.
Exchanging objects between processes(通信)
multiprocessing supports two types of communication channel between processes:
Queues:
Queues are thread and process safe.
from multiprocessing import Process, Queue

def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print q.get()  # prints "[42, None, 'hello']"
    p.join()
Pipes
The Pipe() function returns a pair of connection objects connected by a pipe which by default is duplex (two-way).
from multiprocessing import Process, Pipe

def f(conn):
    conn.send([42, None, 'hello'])
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print parent_conn.recv()  # prints "[42, None, 'hello']"
    p.join()
Synchronization between processes(同步)
Lock()
With the lock:
from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    print 'hello world', i
    l.release()

if __name__ == '__main__':
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()
outputs:
hello world 0
hello world 2
hello world 1
hello world 3
hello world 4
hello world 6
hello world 5
hello world 7
hello world 8
hello world 9
Without the lock (the garbled, interleaved output below is the point: unsynchronized prints from different processes can step on each other):
from multiprocessing import Process, Lock

def f(l, i):
    # l.acquire()
    print 'hello world', i
    # l.release()

if __name__ == '__main__':
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()
outputs:
hello world 1
hello world 4
hello world 2
hhello world 5
ello worldhello world 6
3h
ello world 0
hello world 7
hello world 8
hello world 9
Sharing state between processes
(When doing concurrent programming it is usually best to avoid using shared state as far as possible.) @_@
Shared memory
from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))
    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()
    print num.value
    print arr[:]
outputs:
3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
Server process
from multiprocessing import Process, Manager

def f(d, l):
    d[1] = '1'
    d['2'] = 2
    d[0.25] = None
    l.reverse()

if __name__ == '__main__':
    manager = Manager()
    d = manager.dict()
    l = manager.list(range(10))
    p = Process(target=f, args=(d, l))
    p.start()
    p.join()
    print d
    print l
outputs:
{0.25: None, 1: '1', '2': 2}
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Pool
Functionality within this package requires that the “__main__” module be importable by the children.
from multiprocessing import Pool, TimeoutError
import time
import os

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)  # start 4 worker processes

    # print "[0, 1, 4,..., 81]"
    print pool.map(f, range(10))

    # print same numbers in arbitrary order
    for i in pool.imap_unordered(f, range(10)):
        print i

    # evaluate "f(20)" asynchronously
    res = pool.apply_async(f, (20,))  # runs in *only* one process
    print res.get(timeout=1)          # prints "400"

    # evaluate "os.getpid()" asynchronously
    res = pool.apply_async(os.getpid, ())  # runs in *only* one process
    print res.get(timeout=1)               # prints the PID of that process

    # launching multiple evaluations asynchronously *may* use more processes
    multiple_results = [pool.apply_async(os.getpid, ()) for i in range(4)]
    print [res.get(timeout=1) for res in multiple_results]

    # make a single worker sleep for 10 secs
    res = pool.apply_async(time.sleep, (10,))
    try:
        print res.get(timeout=1)
    except TimeoutError:
        print "We lacked patience and got a multiprocessing.TimeoutError"