Python: differences between multiprocessing.Process and threading.Thread, and some ways to use multiple processes and threads

alphanoblaker

Last modified: 2022-04-11 09:11:50

First published: 2022-03-24 16:20:35

Article link: https://blog.csdn.net/weixin_43698781/article/details/123699989

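This article discusses the out-of-memory problem encountered when using multithreading and multiprocessing with TensorFlow, and how it was solved. Example code illustrates the differences between threads and processes: memory sharing, warnings, creation speed, and CPU usage. The experiments show that threads suit I/O-bound tasks and processes suit CPU-bound tasks. When processing large amounts of data, threads may run only in "pseudo-parallel" because of the global interpreter lock (GIL), whereas processes can compute truly in parallel.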

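Background: this follows an earlier post of mine, "Out-of-memory (OOM) errors when training with TensorFlow in a for loop".

While working on that, I ran into some problems with TensorFlow and thought of using processes or threads to solve them. That attempt produced quite a few bugs of its own, which is how this article came about.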

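Overview

Some differences between Process and Thread are listed below (not necessarily complete, just what I have found in my own use):
(code illustrating them follows later)

1. Memory sharing. A child thread created with Thread can directly use the main thread's variables. A child process created with Process is isolated from the main process: any variable it needs from the main process has to be passed in as an argument. Declaring the variable global does not help, because the global exists only in the main process; it is invisible to the child, and the two do not share data. (This is the behaviour with the spawn start method, the default on Windows and macOS, where the child re-imports the module and never runs the code under if __name__ == '__main__'. With the fork start method on Unix, the child starts with a copy of the parent's globals, but changes made on either side are still not seen by the other.)
2. Plotting with matplotlib inside a Thread produces: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail. With Process there is no such warning.
(For the fix, see my other post: "pyplot.plot: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail".)
3. Creating a process is relatively slow, and CPU usage is high while Process objects are being created, because every new process gets its own copy of the data and the processes do not affect one another. Creating a thread is much simpler: it is fast and uses almost no CPU.
4. Precisely because threads share data, problems arise when several threads modify the same variable. Problems also arise when a child thread only reads a variable but that variable keeps changing.
5. In my original experiment the point of using Process / Thread was to keep memory overhead down, so only a single child process / child thread was created at a time. For real parallel computation, whether to choose Process or Thread depends on the workload. Python has the GIL (global interpreter lock, a CPython mechanism), which ensures that only one thread executes Python bytecode at any moment. So even with many threads, only one of them is running Python code at a time; this is "pseudo-parallelism". Processes, on the other hand, can compute on several CPU cores at once. So: for multiple CPU-bound tasks (matrix operations, decoding HD video, and so on), Process is the better choice. For multiple I/O-bound tasks (low CPU usage, with most of the time spent waiting on I/O such as file reads or network communication), multiprocessing and multithreading perform about the same because CPU usage is low, but threads are cheaper to create and to switch between, so Thread is the better choice.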
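
The fourth point (several threads modifying the same variable) can be sketched as follows. This is a minimal illustration, not code from the original experiment: the function name run_counter and its parameters are made up for the example, and threading.Lock is one common way to make the update safe.

```python
from threading import Lock, Thread


def run_counter(n_threads=8, n_increments=10_000):
    """Increment one shared counter from several threads, guarded by a Lock."""
    state = {"count": 0}  # shared by every thread in this process
    lock = Lock()

    def worker():
        for _ in range(n_increments):
            # The read-modify-write below is not atomic, so the lock makes
            # sure that only one thread performs it at a time.
            with lock:
                state["count"] += 1

    threads = [Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


if __name__ == "__main__":
    print(run_counter())  # 80000
```

Without the lock, two threads can read the same old value and one increment can be lost, so the final count may come out lower than expected.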

Code walkthrough

# Used by the code below:
from multiprocessing import Process
from threading import Thread
import os
import time
import numpy as np
import matplotlib.pyplot as plt

Code, part 1:

The code below illustrates the first two points above.

Using Thread:

def func():
    plt.figure(figsize=(8, 8))
    plt.plot(x, label='x')
    plt.plot(y, label='y')
    plt.legend(loc='upper left')
    print(loop)
    plt.savefig(str(loop) + '_thread_class_test.png')
    plt.clf()

if __name__ == '__main__':
    for loop in range(5):
        x = np.random.randint(1, 10, 10)
        y = np.random.randint(1, 10, 10)
        t = Thread(target=func, args=())
        t.start()
        t.join()

The code above runs normally. With Thread, no variables are passed to the child thread, yet it can still see the main thread's variables.
You will also see the message UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail.
To get rid of this warning, see my other post: "pyplot.plot: UserWarning: Starting a Matplotlib GUI outside of the main thread will likely fail".

Using Process:

def func():
    plt.figure(figsize=(8, 8))
    plt.plot(x, label='x')
    plt.plot(y, label='y')
    plt.legend(loc='upper left')
    print(loop)
    plt.savefig(str(loop) + '_thread_class_test.png')
    plt.clf()

if __name__ == '__main__':
    for loop in range(5):
        x = np.random.randint(1, 10, 10)
        y = np.random.randint(1, 10, 10)
        p = Process(target=func, args=())
        p.start()
        p.join()

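This fails immediately with: xxx is not defined
(This is what happens with the spawn start method, the default on Windows and macOS; with fork on Unix the child inherits a copy of the parent's globals.)
The error goes away only when every required value is put into args:
Change to: p = Process(target=func, args=(loop, x, y))
and remember to change func's parameter list to match, i.e. def func(loop, x, y).

Making loop a global variable does not help the child process either. You can try:
def func(x, y):
    ...
    global loop
    print(loop)
    plt.savefig(str(loop) + '_thread_class_test.png')
    plt.clf()

if __name__ == '__main__':
    global loop
    for loop in range(5):
        ...
        # x and y are passed in as arguments here
        p = Process(target=func, args=(x, y))
        ...

This fails immediately with: 'loop' is not defined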
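
The isolation described above can be seen without matplotlib. The sketch below is an illustration under assumptions, not the original code: the names counter, bump, bump_and_report, run_in_thread and run_in_process are invented for the example, and multiprocessing.Queue is used as one way of sending a result back from the child.

```python
import multiprocessing as mp
from threading import Thread

# Module-level state owned by the parent process.
counter = {"value": 0}


def bump():
    counter["value"] += 1


def bump_and_report(queue):
    # Runs in the child process: it changes the child's own copy of
    # `counter` and has to send the result back explicitly.
    bump()
    queue.put(counter["value"])


def run_in_thread():
    t = Thread(target=bump)
    t.start()
    t.join()
    return counter["value"]  # the thread changed the parent's object


def run_in_process():
    queue = mp.Queue()
    p = mp.Process(target=bump_and_report, args=(queue,))
    p.start()
    child_value = queue.get()  # fetch the result before join()
    p.join()
    return child_value, counter["value"]


if __name__ == "__main__":
    print("after thread :", run_in_thread())
    child_value, parent_value = run_in_process()
    print("child saw    :", child_value)
    print("parent still :", parent_value)
```

The value the child reports depends on the start method (a fresh import under spawn, a copy of the parent's state under fork), but in either case the parent's counter is not changed by the child.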

Code, part 2:

This part illustrates the third point.

Using Thread

def func(loop):
    total = 0
    for i in range(50):
        total += i
    time.sleep(100)
    print(loop)


if __name__ == '__main__':
    threads = []
    start = time.time()
    for loop in range(1000):
        t = Thread(target=func, args=(loop, ))
        t.start()
        threads.append(t)

    end = time.time()
    print('Time to create threads: {} s'.format(end - start))
    for i in range(1000):
        threads[i].join()

Output:
Time to create threads: 0.14698576927185059 s

CPU usage during the run:
[Figure: CPU usage screenshot]

Using Process

def func(loop):
    total = 0
    for i in range(50):
        total += i
    time.sleep(100)


if __name__ == '__main__':
    procs = []
    start = time.time()
    for loop in range(1000):
        p = Process(target=func, args=(loop, ))
        p.start()
        # p.join()
        procs.append(p)

    # print(threading.active_count())
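    end = time.time()
    print('Time to create processes: {} s'.format(end - start))
    for i in range(1000):
        procs[i].join()

Output:
Time to create processes: 22.241607666015625 s

CPU usage during the run:
[Figure: CPU usage screenshot]
Note: terminating processes is also fairly expensive. The code above took quite a while to shut its processes down as well.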
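
Since creating and tearing down a process is this expensive, a common alternative to starting 1000 separate Process objects is a fixed pool of reusable workers. The sketch below is not from the original experiment: small_task and run_with_pool are hypothetical names, and concurrent.futures is used because its thread and process executors share one interface.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def small_task(loop):
    # Same toy computation as func above, without the long sleep.
    total = 0
    for i in range(50):
        total += i
    return loop, total


def run_with_pool(executor_cls, n_tasks, max_workers=None):
    """Run n_tasks calls of small_task on a fixed number of reusable workers."""
    with executor_cls(max_workers=max_workers or os.cpu_count()) as pool:
        # map() yields the results in submission order.
        return list(pool.map(small_task, range(n_tasks)))


if __name__ == "__main__":
    results = run_with_pool(ProcessPoolExecutor, 1000)
    print(len(results), results[0], results[-1])
```

Passing ThreadPoolExecutor instead of ProcessPoolExecutor runs the same tasks on threads, which makes it easy to compare the two.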

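Code, part 3:

For more on the third point and on the first half of the fourth point, see: link
The code below illustrates the second half of the fourth point. (It is a small modification of the code above.)

Using Thread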

def func():
    time.sleep(1)
    print(loop)


if __name__ == '__main__':
    threads = []
    for loop in range(100):
        t = Thread(target=func, args=())
        t.start()
        threads.append(t)

    for i in range(100):
        threads[i].join()

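Expected output: 0 1 2 ... 99
Actual output: 99 99 99 ... 99

This is a problem you can run into with multiple threads. Each child thread reads the main thread's loop variable directly instead of receiving it as an argument, and loop keeps changing: by the time the threads wake up from sleep, the for loop has finished and loop is 99. The output therefore differs from what was expected. (This is easy to run into. For example, if each child thread uses the loop variable as part of the name of the file it saves, and loop is not passed in directly, 100 child threads may end up saving just one file.)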
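
A minimal sketch of the fix is to pass loop to each thread through args, so every thread keeps the value it was created with. The seen list and the run helper are additions for this example (they make the result easy to check) and are not part of the original code.

```python
import time
from threading import Thread


def func(loop, seen):
    time.sleep(0.1)
    # `loop` is this thread's own argument, bound when the Thread was created,
    # so later iterations of the for loop cannot change it.
    seen[loop] = loop


def run(n=100):
    seen = [None] * n  # one slot per thread, so no two threads write the same index
    threads = [Thread(target=func, args=(loop, seen)) for loop in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


if __name__ == "__main__":
    print(run(5))  # [0, 1, 2, 3, 4]
```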

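Using Process

Process does not have this problem: arguments have to be passed to a Process explicitly, and the child receives its own copy of them.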

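Code, part 4:

Below is a simple comparison of Thread and Process on I/O-bound and CPU-bound tasks.

I/O-bound task (file reading and writing as the example)

Note: in this part func contains a pause, time.sleep(1), to simulate other work happening between the read and the write.

First, use a func that does nothing but sleep to measure how long it takes to create and terminate the threads and the processes: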

def func(loop):
    time.sleep(1)
    return
The main part is the same as in the code further below; the for loop runs os.cpu_count() times (my machine has 12 logical cores).

First with Thread:
Output:
Multi-thread time: 1.0146193504333496 s

Then with Process:
Output:
Multi-process time: 6.255192995071411 s
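
Because Thread and Process take the same target/args arguments and both offer start() and join(), one timing helper can cover both measurements. This is a sketch under assumptions: timed_run and nap are names invented here, and the original timings were taken with separate scripts and time.time().

```python
import os
import time
from multiprocessing import Process
from threading import Thread


def nap(loop, seconds):
    time.sleep(seconds)


def timed_run(worker_cls, target, n_workers, *extra_args):
    """Create, start and join n_workers Thread or Process objects; return seconds."""
    start = time.perf_counter()
    workers = [worker_cls(target=target, args=(loop, *extra_args))
               for loop in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = os.cpu_count() or 1
    print("threads  : {:.3f} s".format(timed_run(Thread, nap, n, 1)))
    print("processes: {:.3f} s".format(timed_run(Process, nap, n, 1)))
```

Everything beyond the one-second sleep in each measurement is the cost of creating, starting and joining the workers.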

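Next, file reading and writing with multiple threads and multiple processes.

Using Thread

First, read and write the file with a single thread as a baseline;
just change the for loop to range(1).
Output:
Single-thread time: 1.0627291202545166 s

Then with multiple threads: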
def func(loop):
    # test.log is 5291 KB
    with open('test.log', 'r') as f_in:
        f_str = ''.join(f_in.readlines())
        time.sleep(1)
        with open('test_' + str(loop) + '.txt', 'w+') as f_out:
            f_out.write(f_str)


if __name__ == '__main__':
    threads = []
    start = time.time()
    for loop in range(os.cpu_count()):
        t = Thread(target=func, args=(loop, ))
        t.start()
        threads.append(t)

    # wait for every thread to finish
    for i in range(len(threads)):
        threads[i].join()

    end = time.time()
    print('Multi-thread time: {} s'.format(end - start))

Output:
Multi-thread time: 1.51499342918396 s

CPU core usage in the resource monitor:
[Figure: per-core CPU usage screenshot]

Using Process

First, a single process as a baseline;
change the for loop to range(1).
Output:
Single-process time: 2.5278077125549316 s

Then with multiple processes:
def func(loop):
    with open('tf_gpu_test.log', 'r') as f_in:
        # the input file is 5291 KB
        f_str = ''.join(f_in.readlines())
        time.sleep(1)
        with open('test_' + str(loop) + '.txt', 'w+') as f_out:
            f_out.write(f_str)

if __name__ == '__main__':
    procs = []
    start = time.time()
    for loop in range(os.cpu_count()):
        p = Process(target=func, args=(loop, ))
        p.start()
        procs.append(p)

    for i in range(len(procs)):
        procs[i].join()

    end = time.time()
    print('Multi-process time: {} s'.format(end - start))

Output:
Multi-process time: 6.341761112213135 s

CPU core usage in the resource monitor:
[Figure: per-core CPU usage screenshot]

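Conclusion

The comparison shows that, leaving creation and termination aside (in real use, workers are usually created once and then run for a long time before terminating), multiple processes and multiple threads spend about the same time on the actual work. However, as the results in part 2 already showed, creating and terminating processes consumes a lot of resources. So for I/O-bound tasks, multithreading with Thread is the better choice.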
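
For I/O-bound work like this, a thread pool is one convenient way to run the copies concurrently. The sketch below is an illustration rather than the benchmark code: copy_file and copy_many are hypothetical helpers, and the demo writes a small temporary file instead of the 5291 KB log used in the measurements.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_file(src, dst):
    # Read the whole source file, then write it out under a new name.
    with open(src, "r", encoding="utf-8") as f_in:
        text = f_in.read()
    with open(dst, "w", encoding="utf-8") as f_out:
        f_out.write(text)
    return dst


def copy_many(src, out_dir, n_copies):
    """Write n_copies copies of src into out_dir, one thread per copy."""
    targets = [os.path.join(out_dir, "test_{}.txt".format(i))
               for i in range(n_copies)]
    with ThreadPoolExecutor(max_workers=n_copies) as pool:
        return list(pool.map(lambda dst: copy_file(src, dst), targets))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "test.log")
        with open(src, "w", encoding="utf-8") as f:
            f.write("line\n" * 1000)
        print(len(copy_many(src, tmp, 4)))  # 4
```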

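CPU-bound task (matrix operations as the example)

Just change the func above to one that does matrix arithmetic.
One call to func generates ten random 10^4 × 10^4 matrices and multiplies each one by itself elementwise, about 10^9 multiplications in total. (On NumPy arrays x * x is elementwise; a true matrix product x @ x of this size would take on the order of 10^13 operations over the ten iterations.)
def func(loop):
    for i in range(10):
        x = np.random.randint(0, 10, size=(10 ** 4, 10 ** 4))
        # elementwise multiplication (a matrix product would be x @ x)
        x = x * x
        # print(i)

The main part of the code is the same as before, for Thread and Process respectively.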
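
One detail is worth spelling out: on NumPy arrays, x * x multiplies element by element and is not a matrix product. The pure-Python sketch below (written without NumPy so that it runs anywhere; all three function names are invented for this example) shows the difference on a 2 × 2 case, along with the multiplication counts for an n × n input.

```python
def elementwise_product(a, b):
    """What `a * b` does for NumPy arrays: one multiplication per element."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def matrix_product(a, b):
    """What `a @ b` does: each output element is a row-by-column dot product."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def multiplication_counts(n):
    """Multiplications needed for n x n inputs: (elementwise, matrix product)."""
    return n ** 2, n ** 3


if __name__ == "__main__":
    m = [[1, 2], [3, 4]]
    print(elementwise_product(m, m))       # [[1, 4], [9, 16]]
    print(matrix_product(m, m))            # [[7, 10], [15, 22]]
    print(multiplication_counts(10 ** 4))  # (100000000, 1000000000000)
```

With ten iterations per call, that is about 10^9 multiplications for the elementwise version, compared with about 10^13 for a true matrix product.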

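Using Thread:

Single-thread time: 14.035309076309204 s
Multi-thread time: 148.3268826007843 s

CPU usage:
[Figure: CPU usage screenshot]

Using Process:

Single-process time: 15.407793760299683 s
Multi-process time: 29.029470443725586 s

CPU usage:
[Figure: CPU usage screenshot]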

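Conclusion

The multi-thread and multi-process runs each call func 12 times (once per logical core), so they do 12 times the work of the single runs. With threads, the total time grows roughly in proportion to that work (about 148 s against 14 s); with processes it only about doubles (about 29 s against 15 s). Both the measured times and the CPU usage make it clear that, for CPU-bound tasks, multiple processes created with Process achieve real parallel computation, whereas multiple threads created with Thread are merely concurrent (they used only one of the 12 logical CPU cores) and cannot compute in parallel. So for CPU-bound tasks, parallel computation with Process has a clear advantage.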
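
For CPU-bound work in pure Python, a process pool is one way to spread independent tasks over the cores. The sketch below is not the benchmark above: count_primes and count_primes_parallel are hypothetical names, and trial division stands in for the NumPy workload so that the example needs only the standard library.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Pure-Python, CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, processes=None):
    """Run one count_primes call per entry of `limits` on a pool of worker processes."""
    # processes=None lets Pool pick the number of workers from the CPU count.
    with Pool(processes=processes) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    print(count_primes_parallel([20_000] * 4))
```

The same four calls issued from four threads would take turns holding the GIL, since all of the work is Python bytecode.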

References

https://docs.python.org/zh-cn/3/library/multiprocessing.html
https://docs.python.org/zh-cn/3/library/threading.html
https://www.liaoxuefeng.com/wiki/1016959663602400/1017629247922688