Java Thread Pool: ThreadPoolExecutor Source Code Analysis

If you repost this article, please credit clevergump's blog: http://blog.csdn.net/clevergump/article/details/50688008. Thanks!

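A thread pool manages threads, reuses them, and caps how many can exist. If you need several threads to run several asynchronous tasks, a thread pool is clearly a better choice than repeatedly calling new Thread().start().

In Java, a thread pool is represented by the ThreadPoolExecutor class. This article walks through that class's source code to analyze how it creates and manages threads and how it schedules background tasks. The source analyzed here is from Oracle JDK 1.8.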

1. ctl

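ThreadPoolExecutor has a very important field named ctl, which can be read as an abbreviation of "control". It controls two things: the pool's run state and the number of effective threads in the pool. Here is how the field is defined in the source: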

/**
 * The main pool control state, ctl, is an atomic integer packing
 * two conceptual fields
 *   workerCount, indicating the effective number of threads
 *   runState,    indicating whether running, shutting down etc
 *
 * In order to pack them into one int, we limit workerCount to
 * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
 * billion) otherwise representable. If this is ever an issue in
 * the future, the variable can be changed to be an AtomicLong,
 * and the shift/mask constants below adjusted. But until the need
 * arises, this code is a bit faster and simpler using an int.
 *
 * The workerCount is the number of workers that have been
 * permitted to start and not permitted to stop.  The value may be
 * transiently different from the actual number of live threads,
 * for example when a ThreadFactory fails to create a thread when
 * asked, and when exiting threads are still performing
 * bookkeeping before terminating. The user-visible pool size is
 * reported as the current size of the workers set.
 *
 * The runState provides the main lifecycle control, taking on values:
 *
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 *
 * The numerical order among these values matters, to allow
 * ordered comparisons. The runState monotonically increases over
 * time, but need not hit each state. The transitions are:
 *
 * RUNNING -> SHUTDOWN
 *    On invocation of shutdown(), perhaps implicitly in finalize()
 * (RUNNING or SHUTDOWN) -> STOP
 *    On invocation of shutdownNow()
 * SHUTDOWN -> TIDYING
 *    When both queue and pool are empty
 * STOP -> TIDYING
 *    When pool is empty
 * TIDYING -> TERMINATED
 *    When the terminated() hook method has completed
 *
 * Threads waiting in awaitTermination() will return when the
 * state reaches TERMINATED.
 *
 * Detecting the transition from SHUTDOWN to TIDYING is less
 * straightforward than you'd like because the queue may become
 * empty after non-empty and vice versa during SHUTDOWN state, but
 * we can only terminate if, after seeing that it is empty, we see
 * that workerCount is 0 (which sometimes entails a recheck -- see
 * below).
 */
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));

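ctl is an AtomicInteger, that is, an int whose updates are atomic: each modification either takes effect completely or not at all, and no other thread can interleave with a half-finished update. If the concept is unfamiliar, the visible effect is similar to guarding every update with synchronized. Note, however, that AtomicInteger does not actually take a lock; it relies on lock-free compare-and-set (CAS) operations.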
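To make the idea of an atomic update concrete, here is a minimal sketch of a lock-free increment built on AtomicInteger.compareAndSet(). The class and method names (AtomicCounterDemo, incrementWithCas, runConcurrently) are hypothetical and exist only for this illustration; it is not code taken from ThreadPoolExecutor.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterDemo {
    // Lock-free increment: read the value, compute the next one, and retry
    // compareAndSet until no other thread has changed the value in between.
    static int incrementWithCas(AtomicInteger counter) {
        for (;;) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Starts `threads` threads that each increment a shared counter
    // `perThread` times, then returns the final value.
    static int runConcurrently(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    incrementWithCas(counter);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // No update is lost, even though no lock is taken.
        System.out.println(runConcurrently(4, 10_000));
    }
}
```

The retry loop is the important part: a failed compareAndSet means another thread won the race, so the update is recomputed from the fresh value instead of blocking.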

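A single ctl value carries two pieces of information: the pool's run state (runState) and the number of effective threads in the pool (workerCount). An int has 32 bits, so the high 3 bits of ctl hold the run state and the low 29 bits hold the worker count. Both pieces are used throughout the class, so the code frequently extracts one or the other. The local variables that hold them are usually named after the initials of the English words: rs for the run state and wc for the worker count, while ctl itself is usually abbreviated to c. Keep these names in mind. When you meet a cryptic variable name somewhere in this class, check whether it is one of these before blaming the naming.

Because ctl is the combination of the run state (runState) and the worker count (workerCount), once both values are known, ctl can be computed with the ctlOf() method below: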

// rs: the pool's run state (rs abbreviates runState)
// wc: the number of effective threads in the pool (wc abbreviates workerCount)
private static int ctlOf(int rs, int wc) { return rs | wc; }

Conversely, given a ctl value, the runStateOf() and workerCountOf() methods below extract the run state and the worker count, respectively.

private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }

Here CAPACITY equals (2^29)-1, an int whose high 3 bits are 0 and whose low 29 bits are 1:

private static final int COUNT_BITS = Integer.SIZE - 3;     // 29
private static final int CAPACITY = (1 << COUNT_BITS) - 1;  // COUNT_BITS == 29

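With that, the two methods are easy to follow: c & CAPACITY keeps only the low 29 bits (the worker count), and c & ~CAPACITY keeps only the high 3 bits (the run state). If this is still unclear, review bitwise operations first and then come back. As its name suggests, CAPACITY represents a capacity. Since ctl uses only its low 29 bits for the worker count, the upper limit on the number of effective threads is the value of 29 binary ones, (2^29)-1 = 536,870,911 (about 500 million), and CAPACITY is the constant that holds this limit.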
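The packing and unpacking described above can be tried outside the JDK with a small standalone sketch. The class name CtlPackingDemo and the bits() helper are hypothetical; the three one-line methods and the constants copy the JDK definitions quoted in this article (RUNNING is defined in the JDK as -1 << COUNT_BITS).

```java
class CtlPackingDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29
    static final int CAPACITY = (1 << COUNT_BITS) - 1;   // low 29 bits set
    static final int RUNNING = -1 << COUNT_BITS;         // high 3 bits: 111

    // Same one-liners as in ThreadPoolExecutor.
    static int ctlOf(int rs, int wc) { return rs | wc; }
    static int runStateOf(int c) { return c & ~CAPACITY; }
    static int workerCountOf(int c) { return c & CAPACITY; }

    // Hypothetical helper: binary form of an int, left-padded to 32 digits.
    static String bits(int value) {
        String s = Integer.toBinaryString(value);
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < Integer.SIZE; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);                        // RUNNING pool, 5 workers
        System.out.println("ctl         = " + bits(c));
        System.out.println("CAPACITY    = " + bits(CAPACITY));
        System.out.println("runState    = " + bits(runStateOf(c)));
        System.out.println("workerCount = " + workerCountOf(c));  // 5
    }
}
```

Because the two fields occupy disjoint bit ranges, a bitwise OR combines them without loss, and each mask recovers exactly one of them.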

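Next, the pool's run states. There are five of them:

① RUNNING: accepts newly submitted tasks and also processes tasks in the blocking queue.
② SHUTDOWN: no longer accepts newly submitted tasks, but continues to process tasks already in the blocking queue. Calling shutdown() while the pool is RUNNING moves it to this state; shutdown() may also be invoked implicitly from finalize().
③ STOP: accepts no new tasks, does not process tasks already in the blocking queue, and interrupts tasks that are in progress. Calling shutdownNow() while the pool is RUNNING or SHUTDOWN moves it to this state.
④ TIDYING: all tasks have terminated and workerCount (the number of effective threads) is 0. The thread that moves the pool into this state runs the terminated() hook method, after which the pool becomes TERMINATED. A pool in SHUTDOWN enters TIDYING once both the pool and the blocking queue are empty; a pool in STOP enters TIDYING once the pool is empty.
⑤ TERMINATED: entered once terminated() has completed.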
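The run state itself is private, but some of these transitions can be observed through the public API. The sketch below is an illustration only: the class name LifecycleDemo and the method observeShutdown are hypothetical, and the rejection relies on the pool's default handler (AbortPolicy), which throws RejectedExecutionException.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class LifecycleDemo {
    // Returns three observations, in order:
    // [0] isShutdown() right after shutdown()
    // [1] whether a task submitted after shutdown() was rejected
    // [2] whether the pool reached TERMINATED within the timeout
    static boolean[] observeShutdown() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        pool.execute(() -> { });           // accepted: the pool is RUNNING
        pool.shutdown();                   // RUNNING -> SHUTDOWN
        boolean shutdown = pool.isShutdown();

        boolean rejected = false;
        try {
            pool.execute(() -> { });       // new tasks are no longer accepted
        } catch (RejectedExecutionException e) {
            rejected = true;
        }

        boolean terminated;
        try {
            // Returns true once the state has reached TERMINATED.
            terminated = pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new boolean[] { shutdown, rejected, terminated && pool.isTerminated() };
    }

    public static void main(String[] args) {
        boolean[] r = observeShutdown();
        System.out.println("isShutdown=" + r[0] + ", rejected=" + r[1] + ", terminated=" + r[2]);
    }
}
```

The task submitted before shutdown() still runs to completion, which matches the SHUTDOWN rule that already accepted work is processed.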

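Translated state names can be imprecise, may fail to convey everything the source means, and can even be ambiguous (STOP and TERMINATED, for example, translate to nearly the same word). When describing the pool's run state it is therefore best to use the five identifiers above directly. Their concrete values are: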

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

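As noted earlier, COUNT_BITS == 29. All we really need to know is that these five constants are listed in ascending order. So if the source contains rs < SHUTDOWN (with rs holding the run state), the pool is in the RUNNING state, because RUNNING is the only value below SHUTDOWN.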
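The ordering can be checked with a short standalone sketch. The class name RunStateOrderDemo is hypothetical; the constants copy the JDK values shown above, and the two helpers mirror the style of comparison the JDK source uses. Note that the comparison still works when a worker count is packed into the low bits, because the run state occupies the most significant bits.

```java
class RunStateOrderDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;
    static final int RUNNING    = -1 << COUNT_BITS;  // negative: sign bit is set
    static final int SHUTDOWN   =  0 << COUNT_BITS;  // zero
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    // A ctl value denotes a RUNNING pool exactly when it is below SHUTDOWN.
    static boolean isRunning(int c) { return c < SHUTDOWN; }

    // True when the state packed into c is s or any later state.
    static boolean runStateAtLeast(int c, int s) { return c >= s; }

    public static void main(String[] args) {
        int runningWith7Workers = RUNNING | 7;
        int stoppedWith7Workers = STOP | 7;
        System.out.println(isRunning(runningWith7Workers));                 // true
        System.out.println(isRunning(stoppedWith7Workers));                 // false
        System.out.println(runStateAtLeast(stoppedWith7Workers, SHUTDOWN)); // true
    }
}
```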

2. Several important parameters

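The constructors of ThreadPoolExecutor take several very important parameters, each of which corresponds to a field of the same name in the class. Understanding these parameters and fields helps in analyzing how the pool schedules threads. The class declares four overloaded constructors:

[Figure: the four overloaded constructors of ThreadPoolExecutor]

The first three all end up calling the fourth, the one with the most parameters, so here is the source of that fourth constructor: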

/**
 * Creates a new {@code ThreadPoolExecutor} with the given initial
 * parameters.
 *
 * @param corePoolSize the number of threads to keep in the pool, even
 *        if they are idle, unless {@code allowCoreThreadTimeOut} is set
 * @param maximumPoolSize the maximum number of threads to allow in the
 *        pool
 * @param keepAliveTime when the number of threads is greater than
 *        the core, this is the maximum time that excess idle threads
 *        will wait for new tasks before terminating.
 * @param unit the time unit for the {@code keepAliveTime} argument
 * @param workQueue the queue to use for holding tasks before they are
 *        executed.  This queue will hold only the {@code Runnable}
 *        tasks submitted by the {@code execute} method.
 * @param threadFactory the factory to use when the executor
 *        creates a new thread
 * @param handler the handler to use when execution is blocked
 *        because the thread bounds and queue capacities are reached
 * @throws IllegalArgumentException if one of the following holds:<br>
 *         {@code corePoolSize < 0}<br>
 *         {@code keepAliveTime < 0}<br>
 *         {@code maximumPoolSize <= 0}<br>
 *         {@code maximumPoolSize < corePoolSize}
 * @throws NullPointerException if {@code workQueue}
 *         or {@code threadFactory} or {@code handler} is null
 */
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

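As the source shows, the arguments must satisfy the following requirements:

(1) corePoolSize >= 0, maximumPoolSize > 0, maximumPoolSize >= corePoolSize, keepAliveTime >= 0 (otherwise an IllegalArgumentException is thrown)
(2) workQueue, threadFactory, and handler must not be null (otherwise a NullPointerException is thrown).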
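These checks are easy to confirm by experiment. The sketch below uses the five-argument constructor, which supplies a default thread factory and handler and delegates to the constructor shown above. The class name ConstructorValidationDemo and the method tryCreate are hypothetical helpers for this illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ConstructorValidationDemo {
    // Tries to construct a pool and returns the simple name of the exception
    // thrown by the constructor, or "none" if construction succeeded.
    static String tryCreate(int core, int max, long keepAliveSeconds, boolean nullQueue) {
        BlockingQueue<Runnable> queue = nullQueue ? null : new LinkedBlockingQueue<Runnable>();
        try {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    core, max, keepAliveSeconds, TimeUnit.SECONDS, queue);
            pool.shutdown();   // nothing was submitted, so this just releases the pool
            return "none";
        } catch (IllegalArgumentException | NullPointerException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCreate(0, 1, 0, false));   // valid: corePoolSize may be 0
        System.out.println(tryCreate(2, 1, 0, false));   // maximumPoolSize < corePoolSize
        System.out.println(tryCreate(0, 0, 0, false));   // maximumPoolSize <= 0
        System.out.println(tryCreate(1, 1, -1, false));  // keepAliveTime < 0
        System.out.println(tryCreate(1, 1, 0, true));    // workQueue == null
    }
}
```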

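If all the arguments meet these requirements, the six assignment statements that follow store them in six fields of the class (keepAliveTime and unit are combined into a single value in nanoseconds). Next we look at what each of them means. To understand these parameters and their fields correctly, read the constructor's Javadoc together with the comments on the corresponding fields and on their getters and setters; sometimes only the source itself settles the question. The following subsections go through the parameters one by one, which amounts to going through the six fields.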

corePoolSize

Collected below are the class-level comments on "Core and maximum pool sizes" and "On-demand construction", the comment on the corePoolSize field, and the constructor's comment on the corePoolSize parameter:

/**
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 */

/**
 * Core pool size is the minimum number of workers to keep alive
 * (and not allow to time out etc) unless allowCoreThreadTimeOut
 * is set, in which case the minimum is zero.
 */
private volatile int corePoolSize;

/**
 * Creates a new {@code ThreadPoolExecutor} with the given initial
 * parameters.
 *
 * @param corePoolSize the number of threads to keep in the pool, even
 *        if they are idle, unless {@code allowCoreThreadTimeOut} is set
 * ...
 */
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    // ...
    this.corePoolSize = corePoolSize;
    // ...
}

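From these comments, the corePoolSize field is the minimum number of threads the pool keeps alive; these always-alive threads are called core threads. By default this minimum is the configured corePoolSize. The exception is when allowCoreThreadTimeOut(true) has been called, which allows core threads to terminate after timing out. In that case, once all the core threads have timed out one after another, the number of core threads in the pool eventually drops to 0, which is then the minimum.

By default, core threads are created and started on demand: only when a task is submitted does the pool create and start a core thread to run it. If no tasks arrive, the pool creates no core threads on its own. This default helps reduce resource consumption; the pool reacts to submissions instead of acting proactively, loosely reminiscent of the observer pattern. This is only the default, though. If needed, call prestartCoreThread() or prestartAllCoreThreads() to have the pool create and start one or all core threads before any task has been submitted, and let them wait in the pool for tasks to arrive.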
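The on-demand behavior and the two ways of overriding it can be observed through getPoolSize(). The following is a minimal sketch with hypothetical names (CoreThreadsDemo, observePoolSizes, poolSizeAfterCoreTimeout); the 50 ms keep-alive time and the 5-second polling deadline are arbitrary values chosen for the illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreThreadsDemo {
    // Pool sizes right after construction, after prestartCoreThread(),
    // and after prestartAllCoreThreads(), for corePoolSize == 3.
    static int[] observePoolSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 5, 1L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        try {
            int initial = pool.getPoolSize();      // no task yet, so no thread yet
            pool.prestartCoreThread();             // starts one idle core thread
            int afterOne = pool.getPoolSize();
            pool.prestartAllCoreThreads();         // starts the remaining core threads
            int afterAll = pool.getPoolSize();
            return new int[] { initial, afterOne, afterAll };
        } finally {
            pool.shutdownNow();
        }
    }

    // With allowCoreThreadTimeOut(true), idle core threads also time out,
    // so the pool can shrink to zero threads.
    static int poolSizeAfterCoreTimeout() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        try {
            pool.allowCoreThreadTimeOut(true);     // requires keepAliveTime > 0
            pool.prestartAllCoreThreads();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);                  // poll until the idle threads exit
            }
            return pool.getPoolSize();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = observePoolSizes();
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
        System.out.println(poolSizeAfterCoreTimeout());
    }
}
```

Polling against a deadline keeps the second method from depending on exact thread timing.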

maximumPoolSize

Collected below are the class-level comment on "Core and maximum pool sizes", the constructor's comment on the maximumPoolSize parameter, and the comment on the maximumPoolSize field:

/**
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize
