Multiprocessing and threading in Python

url: https://www.linuxjournal.com/content/multiprocessing-python

1 Basic concepts

  1. Python’s “multiprocessing” module feels like threads, but actually launches processes.
  2. The downside of threads in Python is the global interpreter lock (GIL): only one thread executes Python bytecode at a time. Because a thread releases the GIL whenever it waits on I/O, threads are still a good idea for I/O-bound programs.
  3. When dealing with lots of computation rather than I/O, it is preferable to take full advantage of a multicore system. In Python, that means using processes.
  4. The dilemma is whether to launch easy-to-use threads, which don’t really run in parallel, or new processes, over which we have little control. The answer is somewhere in the middle: the standard library’s “multiprocessing” module gives the feeling of working with threads, but actually works with processes.
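The point about I/O in the list above can be sketched as follows. This is a minimal illustration, not part of the original article: time.sleep stands in for a blocking I/O call, and the names wait_like_io and run_threads are made up for this example. Because a waiting thread does not hold the GIL, the waits overlap and the total time is close to one wait rather than the sum of all of them.

```python
import threading
import time

def wait_like_io(seconds):
    # time.sleep stands in for a blocking I/O call (an assumption of this
    # sketch); the thread does not hold the GIL while it waits.
    time.sleep(seconds)

def run_threads(count, seconds):
    """Start `count` threads that each wait `seconds`; return the elapsed time."""
    threads = [
        threading.Thread(target=wait_like_io, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = run_threads(5, 0.2)
    # Run one after another, the five waits would take about 1 second.
    print("5 threads waiting 0.2 s each took {0:.2f} s".format(elapsed))
```

A CPU-bound function in place of wait_like_io would not see this kind of overlap with threads, which is the case where processes pay off.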

2 First example: same result

2.1 threading

The “multiprocessing” module is designed to look and feel like the “threading” module, and it largely succeeds in doing so. For example, here is a simple multithreaded program:

import random
import threading
import time

def hello(n):
    time.sleep(random.randint(1,3))
    print("[{0}] Hello!".format(n))

for i in range(10):
    threading.Thread(target=hello, args=(i,)).start()

print("Done!")

But “Done!” is printed before the threads finish, because the main thread does not wait for them. To correct this, we can call join on each thread to wait for it to complete.

threads = [ ]
for i in range(10):
    t = threading.Thread(target=hello, args=(i,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

print("Done!")

2.2 multiprocessing

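Here is the same program written with processes, reusing the hello function from the threading example:

import multiprocessing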

processes = [ ]

for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

print("Done!")

The result is the same as with threads.
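The listing above relies on the hello function from the threading example and starts its processes at module level. As a hedged sketch of a standalone variant: on platforms where “multiprocessing” uses the “spawn” start method (Windows and macOS by default), child processes import the main module, so the code that starts processes should sit under an if __name__ == "__main__" guard. The helper name run_processes and the shorter sleep interval are choices made for this example, not part of the original article.

```python
import multiprocessing
import random
import time

def hello(n):
    # Short random pause so the processes finish in an unpredictable order.
    time.sleep(random.uniform(0.1, 0.3))
    print("[{0}] Hello!".format(n))

def run_processes(count):
    """Start `count` processes running hello, wait for all of them,
    and return their exit codes."""
    processes = []
    for i in range(count):
        p = multiprocessing.Process(target=hello, args=(i,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    return [p.exitcode for p in processes]

if __name__ == "__main__":
    # The guard keeps child processes from re-running this block when they
    # import the main module.
    exit_codes = run_processes(10)
    print("Done!")
    print(exit_codes)
```

An exit code of 0 for every process means each hello call returned normally.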
What’s the difference between threading and multiprocessing?

3 Second example: the difference

Perhaps the biggest difference is that threads share global variables while separate processes don’t.

3.1 Threads share global variables

Here’s a simple example of how a function running in a thread can modify a global variable. This is just to prove a point; if we really want to modify global variables from within a thread, we should use a lock.

import random
import threading
import time

mylist = [ ]

def hello(n):
    time.sleep(random.randint(1,3))
    mylist.append(threading.get_ident())   # bad in real code!
    print("[{0}] Hello!".format(n))

threads = [ ]
for i in range(10):
    t = threading.Thread(target=hello, args=(i,))
    threads.append(t)
    t.start()

for one_thread in threads:
    one_thread.join()

print("Done!")
print(len(mylist))
print(mylist)

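Each thread runs the function, which appends the thread’s ID to that list and then returns.

Don’t do that in real code without a lock, because operations on shared Python data structures are not guaranteed to be thread-safe!

Output (after the ten “Hello!” lines):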
Done!
10
[123145344081920, 123145354592256, 123145375612928, …] #mylist

The global variable mylist is shared by the threads.
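The advice to use a lock can be sketched as follows. This is a minimal illustration with made-up names (counter, counter_lock, add_many, run), not code from the original article. It uses a shared counter rather than a list, because counter += 1 is a read-modify-write sequence where unprotected updates from several threads can be lost.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may run the read-modify-write below.
        with counter_lock:
            counter += 1

def run(thread_count, times):
    """Reset the counter, increment it from several threads, return the total."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=add_many, args=(times,))
        for _ in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run(10, 10000))  # prints 100000
```

With the lock in place, every increment is counted, so the total is always thread_count * times.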

3.2 Processes don’t share global variables

import multiprocessing
import os
import random
import time

mylist = [ ]

def hello(n):
    time.sleep(random.randint(1,3))
    mylist.append(os.getpid())
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

print("Done!")
print(len(mylist))
print(mylist)

The output from this program (after the ten “Hello!” lines) ends as follows:
Done!
0
[] #mylist is empty

Each time a new process is created with “multiprocessing”, it gets its own copy of the global mylist list. The child appends to its own copy, which goes away when the process exits, so the list in the parent process stays empty.

3.3 Queues let processes share data

In multiprocessing, queues can bridge the gap between processes.

import multiprocessing
import os
import random
import time
from multiprocessing import Queue

q = Queue()

def hello(n):
    time.sleep(random.randint(1,3))
    q.put(os.getpid())
    print("[{0}] Hello!".format(n))

processes = [ ]
for i in range(10):
    t = multiprocessing.Process(target=hello, args=(i,))
    processes.append(t)
    t.start()

for one_process in processes:
    one_process.join()

mylist = [ ]
while not q.empty():
    mylist.append(q.get())

print("Done!")
print(len(mylist))
print(mylist)

The Queue instance is designed to be shared across the different processes. Moreover, it can handle any Python data that can be pickled.
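The listing above reads the queue from a global and drains it with q.empty() after joining. As a hedged alternative sketch, the queue can be passed to each process as an argument, which also works with the “spawn” start method, where a module-level queue would be created anew in every child. The results are collected before the processes are joined, because the “multiprocessing” documentation warns that joining a process that still has items buffered on a queue can block. The function name collect and the 30-second timeout are assumptions made for this example.

```python
import multiprocessing
import os

def hello(n, q):
    # The queue is passed in explicitly instead of being read from a global.
    q.put((n, os.getpid()))

def collect(count):
    """Run `count` processes and return the (n, pid) pairs they put on the queue."""
    q = multiprocessing.Queue()
    processes = [
        multiprocessing.Process(target=hello, args=(i, q))
        for i in range(count)
    ]
    for p in processes:
        p.start()
    # Each process puts exactly one item, so read exactly `count` items.
    # The timeout (an arbitrary choice) raises queue.Empty instead of
    # waiting forever if a child fails before putting its item.
    results = [q.get(timeout=30) for _ in range(count)]
    for p in processes:
        p.join()
    return results

if __name__ == "__main__":
    results = collect(10)
    print("Done!")
    print(len(results))
    print(sorted(results))
```

Counting the expected items avoids relying on q.empty(), which the documentation describes as unreliable when several processes are involved.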

4 Conclusion

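Threading is easy to work with, but because of the GIL, Python threads don’t truly execute in parallel. The “multiprocessing” module provides an API that’s almost identical to that of “threading”, while running the work in separate processes that do not share global variables.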
