Python 64 Forms: Form 42, Process Pool Source Code Analysis

天地一扁舟

Article link: https://blog.csdn.net/qingyuanluofeng/article/details/103249528

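This article takes an in-depth look at how Python's process pool is implemented. It covers the basics of thread pools and process pools, the role of the `concurrent.futures` module, what the `Executor` class provides, and the usage of methods such as `submit`, `map`, and `shutdown`. It focuses on the internal structure of `ProcessPoolExecutor` (work items, the work id queue, the call queue, and the result queue) and walks through the process pool workflow from task submission to result delivery. It closes by noting that a process pool is not constrained by the GIL and can use multiple cores, and that it suits scenarios such as handling requests from multiple clients.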

Goal:
Understand how the process pool is implemented

0 Thread pool and process pool basics

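Key points:
1. Why thread pools exist: creating many threads at once consumes resources. Instead, create a small number of threads; the remaining tasks wait until a thread in the pool finishes its current task and are then processed.
Essence: tasks are submitted to the thread pool's task queue
Composition: a waiting queue and a set of threads
Benefit: the main thread can obtain the status and return value of each task, and it is notified when a task completes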
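The "waiting queue plus a set of threads" structure can be sketched in a few lines. This is only an illustration written for Python 3; `TinyThreadPool` and `square` are hypothetical names, not part of any library, and error handling is deliberately left out.

```python
import queue
import threading


class TinyThreadPool:
    """Illustrative pool: one shared task queue plus a fixed set of worker threads."""

    def __init__(self, num_workers):
        self._tasks = queue.Queue()   # the waiting queue
        self._results = {}            # task id -> return value
        self._lock = threading.Lock()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_workers)
        ]
        for thread in self._threads:
            thread.start()

    def _worker(self):
        while True:
            item = self._tasks.get()
            if item is None:          # sentinel: stop this worker
                self._tasks.task_done()
                break
            task_id, fn, args = item
            value = fn(*args)         # simplification: exceptions are not handled
            with self._lock:
                self._results[task_id] = value
            self._tasks.task_done()

    def submit(self, task_id, fn, *args):
        """Queue a task; it runs as soon as some worker thread is free."""
        self._tasks.put((task_id, fn, args))

    def join(self):
        """Block until every queued task is done, then stop the workers."""
        self._tasks.join()
        for _ in self._threads:
            self._tasks.put(None)
        for thread in self._threads:
            thread.join()
        return dict(self._results)


def square(x):
    return x * x


pool = TinyThreadPool(num_workers=2)
for i in range(5):
    pool.submit(i, square, i)     # five tasks share two threads
results = pool.join()
print(results)
```

Only two threads are ever created, yet all five tasks run; the ones that cannot start immediately simply wait in the queue.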

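2. concurrent.futures.Executor
Purpose: an abstract class that provides methods for executing calls asynchronously. It has two subclasses:
ThreadPoolExecutor(max_workers) and ProcessPoolExecutor(max_workers)
max_workers: the maximum number of workers that execute the submitted calls concurrently. For ProcessPoolExecutor, None means the number of processors on the machine.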
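Because both pools derive from the same abstract class, code written against the shared interface works with either one. A minimal sketch, assuming Python 3; `run_all` is a hypothetical helper introduced only for this illustration.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def run_all(executor, fn, values):
    """Works with any Executor subclass because it only uses the shared interface."""
    with executor:
        return list(executor.map(fn, values))


both_are_executors = (
    issubclass(ThreadPoolExecutor, Executor)
    and issubclass(ProcessPoolExecutor, Executor)
)
values = run_all(ThreadPoolExecutor(max_workers=2), abs, [-1, -2, 3])
print(both_are_executors, values)  # True [1, 2, 3]
```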

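3. Executor.submit(fn, *args, **kwargs)
Purpose: schedules fn to be executed as fn(*args, **kwargs)
Parameters: fn: the function to execute asynchronously; *args: positional arguments for fn; **kwargs: keyword arguments for fn
Return value: a Future object representing the execution of the callable
Note: submit returns immediately, without waiting for fn to finish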
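The "returns immediately" behaviour can be observed directly. In this sketch (Python 3 assumed) the task is held back by a `threading.Event`, so the Future is guaranteed to be unfinished right after `submit` returns; `wait_then_add` and `gate` are names made up for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

gate = threading.Event()


def wait_then_add(a, b=0):
    gate.wait()      # block until the main thread opens the gate
    return a + b


with ThreadPoolExecutor(max_workers=1) as executor:
    # Positional and keyword arguments are forwarded to the function.
    future = executor.submit(wait_then_add, 1, b=2)
    was_done_right_after_submit = future.done()  # the task is still blocked
    gate.set()                                   # let the task finish
    value = future.result()                      # now wait for the result

print(was_done_right_after_submit, value)  # False 3
```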

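4. Executor.map(func, *iterables, timeout=None):
Purpose: equivalent to map(func, *iterables), except that the calls to func run asynchronously and results are returned in input order.
If a result is not available within timeout seconds of the original map() call, iterating over the results raises TimeoutError; if timeout is not specified, there is no time limit.
Parameters: func: the function to execute asynchronously; *iterables: iterables such as lists; each call of func takes one argument from each iterable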
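A short sketch of both behaviours, assuming Python 3: arguments are drawn from each iterable in parallel, and a timeout surfaces as `TimeoutError` when the results are iterated. `power` and `slow_identity` are illustrative functions, and the sleep and timeout values are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def power(base, exponent):
    return base ** exponent


def slow_identity(x):
    time.sleep(0.5)
    return x


with ThreadPoolExecutor(max_workers=3) as executor:
    # One argument is taken from each iterable per call: power(2, 3), power(3, 2), ...
    results = list(executor.map(power, [2, 3, 4], [3, 2, 1]))

timed_out = False
with ThreadPoolExecutor(max_workers=1) as executor:
    try:
        list(executor.map(slow_identity, [1], timeout=0.05))
    except TimeoutError:
        timed_out = True   # raised while iterating, not when map() is called

print(results, timed_out)  # [8, 9, 4] True
```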

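5. Executor.shutdown(wait=True)
Purpose: releases the executor's resources; call it after asynchronous operations such as submit() or map(). Using a with statement avoids having to call this method explicitly.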
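An explicit `shutdown` looks like the following sketch (Python 3 assumed). It also shows that an executor rejects new work once it has been shut down.

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
future = executor.submit(sum, [1, 2, 3])
executor.shutdown(wait=True)   # blocks until pending work has finished
total = future.result()

try:
    executor.submit(sum, [4, 5])
    rejected = False
except RuntimeError:
    rejected = True            # no new futures can be scheduled after shutdown

print(total, rejected)  # 6 True
```

Wrapping the executor in a `with` statement performs the same `shutdown(wait=True)` automatically when the block exits.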

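6. concurrent.futures.as_completed(fs, timeout=None)
Purpose: takes a collection of futures and returns an iterator that yields each future once, as soon as it completes, so all results can be collected in a single loop
Essence: it is a generator; iteration blocks while no further task has completed, and tasks that finish first are delivered to the main thread first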
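The completion-order behaviour can be demonstrated without relying on timing. In this sketch (Python 3 assumed) the task submitted first is held back by an event until another result has arrived; `slow`, `fast`, and `release_slow` are names invented for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed

release_slow = threading.Event()


def slow():
    release_slow.wait()   # cannot finish until the main thread allows it
    return "slow"


def fast():
    return "fast"


order = []
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(slow), executor.submit(fast)]
    for future in as_completed(futures):
        order.append(future.result())
        release_slow.set()   # release the slow task after the first result

print(order)  # ['fast', 'slow']
```

Although `slow` was submitted first, `fast` is yielded first because it finished first.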

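7. About concurrent.futures.Future
concurrent.futures.Future: represents an operation that will complete in the future; the basis of this style of asynchronous programming
cancel(): attempts to cancel the call. A call that is already running or finished cannot be cancelled; the return value indicates whether the cancellation succeeded
cancelled(): returns whether the call was successfully cancelled
done(): returns whether the call has finished, either by completing (with a result or an exception) or by being cancelled
result(timeout=None): returns the result of the call; if the call has not finished, waits up to timeout seconds (indefinitely if timeout is None)
exception(timeout=None): returns the exception raised by the call, or None if it completed without raising
concurrent.futures.wait(fs, timeout=None, return_when=ALL_COMPLETED): a module-level function, not a Future method; it blocks the calling thread until the specified condition is met
Parameters: the sequence of futures to wait for, the timeout, and the wait condition. ALL_COMPLETED means waiting until all tasks have finished.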
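The following sketch (Python 3 assumed) exercises these methods on one successful and one failing call; `divide` is an illustrative function.

```python
from concurrent.futures import ALL_COMPLETED, ThreadPoolExecutor, wait


def divide(a, b):
    return a / b


with ThreadPoolExecutor(max_workers=2) as executor:
    ok = executor.submit(divide, 6, 3)
    bad = executor.submit(divide, 1, 0)
    # Block until both futures have finished, successfully or not.
    done, not_done = wait([ok, bad], return_when=ALL_COMPLETED)

ok_value = ok.result()                         # 2.0
bad_error = type(bad.exception()).__name__     # 'ZeroDivisionError'
bad_is_done = bad.done()                       # True: failing also counts as done
cancel_after_finish = ok.cancel()              # False: finished calls cannot be cancelled

print(ok_value, bad_error, bad_is_done, cancel_after_finish, len(done), len(not_done))
```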

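Summary:
Process pool: not limited by the GIL (global interpreter lock); it uses multiple cores and so shortens execution time for CPU-bound work; recommended
Thread pool: in CPython, no matter how many processors there are, only one thread executes Python bytecode at any moment. [Coroutines, by contrast, are tasks that cooperatively yield control to one another within a single thread]
Thread pools and process pools suit: the server side that handles requests from multiple clients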
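The difference between the two pools can be tried out with a CPU-bound function. This is a sketch under the assumption of Python 3 on a multi-core machine; `count_primes` and `worker_pid` are hypothetical helpers, and the workload size is arbitrary. No timings are claimed here, since they depend on the machine.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def worker_pid(_):
    return os.getpid()


if __name__ == "__main__":
    limits = [20000] * 4
    # Threads share one interpreter, so this CPU-bound work is serialized by the GIL.
    with ThreadPoolExecutor(max_workers=4) as pool:
        print("threads:  ", list(pool.map(count_primes, limits)))
    # Each worker process has its own interpreter and its own GIL.
    with ProcessPoolExecutor(max_workers=4) as pool:
        print("processes:", list(pool.map(count_primes, limits)))
        pids = set(pool.map(worker_pid, range(4)))
    print("parent pid among worker pids:", os.getpid() in pids)
```

Both pools return the same results; wrapping each `with` block in a timer is one way to compare them on a given machine.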

The architecture of the Python process pool is as follows:
Implements ProcessPoolExecutor.

The following diagram and text describe the data-flow through the system:

|======================= In-process =====================|== Out-of-process ==|

+----------+     +----------+       +--------+     +-----------+    +---------+
|          |  => | Work Ids |    => |        |  => | Call Q    | => |         |
|          |     +----------+       |        |     +-----------+    |         |
|          |     | ...      |       |        |     | ...       |    |         |
|          |     | 6        |       |        |     | 5, call() |    |         |
|          |     | 7        |       |        |     | ...       |    |         |
| Process  |     | ...      |       | Local  |     +-----------+    | Process |
|  Pool    |     +----------+       | Worker |                      |  #1..n  |
| Executor |                        | Thread |                      |         |
|          |     +------------+     |        |     +-----------+    |         |
|          | <=> | Work Items | <=> |        | <=  | Result Q  | <= |         |
|          |     +------------+     |        |     +-----------+    |         |
|          |     | 6: call()  |     |        |     | ...       |    |         |
|          |     |    future  |     |        |     | 4, result |    |         |
|          |     | ...        |     |        |     | 3, except |    |         |
+----------+     +------------+     +--------+     +-----------+    +---------+

Executor.submit() called:
- creates a uniquely numbered _WorkItem and adds it to the "Work Items" dict
- adds the id of the _WorkItem to the "Work Ids" queue

Local worker thread:
- reads work ids from the "Work Ids" queue and looks up the corresponding
  _WorkItem from the "Work Items" dict: if the work item has been cancelled then
  it is simply removed from the dict, otherwise it is repackaged as a
  _CallItem and put in the "Call Q". New _CallItems are put in the "Call Q"
  until "Call Q" is full. NOTE: the size of the "Call Q" is kept small because
  calls placed in the "Call Q" can no longer be cancelled with Future.cancel().
- reads _ResultItems from "Result Q", updates the future stored in the
  "Work Items" dict and deletes the dict entry

Process #1..n:
- reads _CallItems from "Call Q", executes the calls, and puts the resulting
  _ResultItems in "Result Q"
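The data flow described above can be imitated inside a single process. The sketch below is not the library's implementation: `ToyPool` is a hypothetical class, threads stand in for the worker processes, plain tuples stand in for `_WorkItem`, `_CallItem`, and `_ResultItem`, and the single local worker thread is split into a feeder and a collector for simplicity. Python 3 is assumed.

```python
import itertools
import queue
import threading
from concurrent.futures import Future


class ToyPool:
    """Single-process imitation of the data flow; threads stand in for processes."""

    def __init__(self, workers=2):
        self._workers = workers
        self._work_items = {}                            # "Work Items" dict
        self._work_ids = queue.Queue()                   # "Work Ids" queue
        self._call_q = queue.Queue(maxsize=workers + 1)  # "Call Q", kept small
        self._result_q = queue.Queue()                   # "Result Q"
        self._ids = itertools.count()
        self._lock = threading.Lock()
        targets = [self._feed, self._collect] + [self._work] * workers
        self._threads = [threading.Thread(target=t, daemon=True) for t in targets]
        for thread in self._threads:
            thread.start()

    def submit(self, fn, *args):
        future = Future()
        with self._lock:
            work_id = next(self._ids)                    # uniquely numbered item
            self._work_items[work_id] = (future, fn, args)
        self._work_ids.put(work_id)
        return future

    def _feed(self):
        """First half of the local worker thread: Work Ids -> Call Q."""
        while True:
            work_id = self._work_ids.get()
            if work_id is None:                          # shutdown sentinel
                for _ in range(self._workers):
                    self._call_q.put(None)
                return
            with self._lock:
                future, fn, args = self._work_items[work_id]
            if future.set_running_or_notify_cancel():
                self._call_q.put((work_id, fn, args))    # blocks while Call Q is full
            else:
                with self._lock:
                    del self._work_items[work_id]        # cancelled: just drop it

    def _work(self):
        """Stand-in for Process #1..n: Call Q -> Result Q."""
        while True:
            call = self._call_q.get()
            if call is None:
                self._result_q.put(None)
                return
            work_id, fn, args = call
            try:
                item = (work_id, fn(*args), None)
            except Exception as exc:
                item = (work_id, None, exc)
            self._result_q.put(item)

    def _collect(self):
        """Second half of the local worker thread: Result Q -> futures."""
        finished_workers = 0
        while finished_workers < self._workers:
            item = self._result_q.get()
            if item is None:
                finished_workers += 1
                continue
            work_id, value, exc = item
            with self._lock:
                future = self._work_items.pop(work_id)[0]  # delete the dict entry
            if exc is not None:
                future.set_exception(exc)
            else:
                future.set_result(value)

    def shutdown(self):
        self._work_ids.put(None)
        for thread in self._threads:
            thread.join()


pool = ToyPool(workers=2)
futures = [pool.submit(pow, 2, n) for n in range(5)]
failing = pool.submit(divmod, 1, 0)
values = [f.result() for f in futures]
error_name = type(failing.exception()).__name__
pool.shutdown()
print(values, error_name)  # [1, 2, 4, 8, 16] ZeroDivisionError
```

The bounded `Call Q` mirrors the note in the description: once an item has been moved into it, the future has been marked as running and can no longer be cancelled.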

The process pool architecture above comes from:
the source code of Python futures

1 Main entry point

Here I use a program I wrote myself

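import time

from concurrent import futures


# NOTE: run() is not shown in the original article; this stand-in is an
# assumption so that the example can be executed.
def run(times):
    time.sleep(0.1)
    return times * 2


def processPoolExecutor_submit(m, n):
    with futures.ProcessPoolExecutor(max_workers=m) as executor:
        # Map each Future to the argument it was submitted with.
        executor_dict = dict((executor.submit(run, times), times) for
                             times in range(m * n))

        # Handle the futures in completion order.
        for future in futures.as_completed(executor_dict):
            times = executor_dict[future]
            if future.exception() is not None:
                print("%d generated exception: %s" % (times, future.exception()))
            else:
                print("RunTimes: %d. Res: %s" % (times, future.result()))


def processPoolExecutor_demo():
    processPoolExecutor_submit(5, 1)


if __name__ == "__main__":
    processPoolExecutor_demo()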

Analysis:
1.1) Open
E:\developSoftware\python27\Lib\site-packages\concurrent\futures\process.py
The code is as follows:
class ProcessPoolExecutor(_base.Executor):
    def __init__(self, max_workers=None):
        """Initializes a new ProcessPoolExecutor instance.

        Args:
            max_workers: The maximum number of processes that can be used to
                execute the given calls. If None or not given then as many
                worker processes will be created as the machine has processors.
        """
        _check_system_limits()

        if max_workers is None:
            self._max_workers = multiprocessing.cpu_count()
        else:
            if max_workers <= 0:
                raise ValueError("max_workers must be greater than 0")

            self._max_workers = max_workers

        # Make the call queue slightly larger than the number of processes to
        # prevent the worker processes from idling. But don't make it
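The `max_workers` check in the constructor excerpt above can be observed from the outside. A minimal sketch, assuming the Python 3 standard-library `ProcessPoolExecutor`, which performs the same validation; no worker process is started because the constructor raises first.

```python
from concurrent.futures import ProcessPoolExecutor

try:
    ProcessPoolExecutor(max_workers=0)
    error_message = None
except ValueError as exc:
    error_message = str(exc)   # the constructor rejects non-positive values

print(error_message)
```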
