A Super-Simple Python Multiprocessing Tutorial: concurrent.futures.ProcessPoolExecutor


Last modified: 2024-09-24 14:22:10


Tags: python, programming languages, multiprocessing

First published: 2024-09-24 13:58:18

Article link: https://blog.csdn.net/m0_38096164/article/details/142487908


Python Concurrency: Comparing Single-Process and Multi-Process Performance

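Writing efficient, scalable code is essential in modern programming practice. For CPU-bound or I/O-bound tasks, making good use of a multi-core processor can noticeably improve program performance. Python offers several approaches to concurrent and parallel programming, including multithreading, multiprocessing, and asynchronous programming. This article uses one concrete example to show how Python's multiprocessing and concurrent.futures modules can be used to compare single-process and multi-process execution of the same tasks.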

Background

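Python's Global Interpreter Lock (GIL) limits the performance of multithreading on CPU-bound tasks, because it does not allow multiple threads to execute Python bytecode at the same time. For this kind of task, multiprocessing is therefore the better choice: each process has its own Python interpreter and its own memory space, which sidesteps the GIL.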
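To make the GIL point concrete, here is a minimal sketch that runs the same pure-Python workload through a thread pool and a process pool. The helper names cpu_bound_sum and timed_map and the workload size are assumptions made for illustration; they do not appear in the original example. On a standard GIL-enabled CPython build with several cores, the process pool is usually the faster of the two, but the exact timings depend on the machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound_sum(n):
    """Hypothetical CPU-bound task: sum of squares computed in pure Python."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(executor_cls, func, items, max_workers=4):
    """Map func over items with the given executor class; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=max_workers) as executor:
        results = list(executor.map(func, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    workloads = [1_000_000] * 4  # assumed size, large enough to take measurable time
    thread_results, thread_time = timed_map(ThreadPoolExecutor, cpu_bound_sum, workloads)
    process_results, process_time = timed_map(ProcessPoolExecutor, cpu_bound_sum, workloads)
    # Both pools compute the same values; only the wall-clock time differs.
    assert thread_results == process_results
    print(f"threads:   {thread_time:.2f} s")
    print(f"processes: {process_time:.2f} s")
```

For very small workloads the process pool can come out slower, since starting worker processes and pickling arguments and results has a fixed cost.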

Example code

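In this example we define a function, process_seed, that simulates a time-consuming operation. We then run a series of tasks in two ways, in a single process and in multiple processes, and compare the execution times.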

import time
import concurrent.futures
import multiprocessing

# def task(n):
#     time.sleep(1)  
#     return n * n

# def single_processing():
#     start_time = time.time()
#     results = []
#     for i in range(5):
#         results.append(task(i))
#     end_time = time.time()
#         print(f"Single-process time: {end_time - start_time:.2f} s")

# def multi_processing():
#     start_time = time.time()
#     with multiprocessing.Pool(processes=5) as pool:
#         results = pool.map(task, range(5))
#     end_time = time.time()
#     print(f"Multi-process time: {end_time - start_time:.2f} s")

# Task function
def process_seed(seed):
    time.sleep(1)  # simulate a time-consuming operation
    print(f"Processing seed {seed}")
    return f"seed {seed} result"

def process_seeds(seeds, task_function):
    # Number of logical cores
    num_cores = multiprocessing.cpu_count()
    print(f"Logical cores: {num_cores}")
    # Single-process run
    start_time = time.time()
    results = []
    for seed in seeds:
        result = task_function(seed)
        results.append(result)
    end_time = time.time()
    single_process_time = end_time - start_time
    print(f"Single-process time: {single_process_time:.2f} s")
    # Multi-process run
    start_time = time.time()
    with concurrent.futures.ProcessPoolExecutor(max_workers=num_cores) as executor:
        results = list(executor.map(task_function, seeds))
    end_time = time.time()
    multi_process_time = end_time - start_time
    print(f"Multi-process time: {multi_process_time:.2f} s")
    # Return the results (from the multi-process run)
    return results

# This guard is required where worker processes are started with "spawn"
# (the default on Windows and macOS): each worker imports this module, and
# without the guard it would re-execute the top-level code below.
if __name__ == "__main__":
    #single_processing()
    #multi_processing()
    seeds = range(5)  # assume there are 5 seed tasks
    results = process_seeds(seeds, process_seed)
    print(results)

Output:

Walkthrough

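Function definition:
- process_seed(seed): simulates a time-consuming operation by pausing for 1 second with time.sleep(1).
Processing function:
- process_seeds(seeds, task_function): takes a sequence of seeds and a task function, task_function.
- Single-process run: iterates over the seeds and calls the task function on each one in turn.
- Multi-process run: creates a process pool with ProcessPoolExecutor, with max_workers set to the number of processor cores. executor.map applies task_function to every seed in parallel.
Performance comparison:
- The single-process and multi-process times are recorded and printed so the two can be compared. With five tasks that each sleep for 1 second, the single-process run should take about 5 seconds, while the multi-process run should take about 1 second plus process start-up overhead when at least five workers are available.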
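One detail of executor.map worth spelling out: it returns results in the order of the inputs, even when tasks finish out of order. The sketch below contrasts that with submit plus as_completed, which hands back each result as soon as its task is done. The function names run_in_order and run_as_completed, the executor_cls parameter, and the shortened 0.1 s sleep are assumptions added for illustration and are not part of the original code.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed


def process_seed(seed):
    """Same shape as the article's task function, with a shorter sleep."""
    time.sleep(0.1)
    return f"seed {seed} result"


def run_in_order(seeds, executor_cls=ProcessPoolExecutor, max_workers=None):
    """executor.map yields results in the same order as the inputs."""
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(process_seed, seeds))


def run_as_completed(seeds, executor_cls=ProcessPoolExecutor, max_workers=None):
    """submit + as_completed yields each future as soon as its task finishes."""
    results = {}
    with executor_cls(max_workers=max_workers) as executor:
        future_to_seed = {executor.submit(process_seed, seed): seed for seed in seeds}
        for future in as_completed(future_to_seed):
            seed = future_to_seed[future]
            results[seed] = future.result()  # re-raises any exception from the worker
    return results


if __name__ == "__main__":
    seeds = range(5)
    print(run_in_order(seeds))
    print(run_as_completed(seeds))
    # The task only sleeps, so a thread pool gives a similar speed-up here.
    print(run_in_order(seeds, executor_cls=ThreadPoolExecutor))
```

The pool class is a parameter because ThreadPoolExecutor and ProcessPoolExecutor share the same Executor interface. For a task that only waits, as process_seed does, threads avoid the cost of starting processes; for real CPU-bound work the process pool remains the better fit.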