Java Thread Pools: A Deep Dive into the Internals of ThreadPoolExecutor (a Complete Source-Code Walkthrough in One Article)

Article link: https://blog.csdn.net/GDUT_xin/article/details/147341723


What is ThreadPoolExecutor?

ThreadPoolExecutor is a class in JUC (java.util.concurrent). As the name suggests, it is an executor (Executor) backed by a pool (Pool) of threads (Thread).

The ThreadPoolExecutor source comments explained


ThreadPoolExecutor class Javadoc (source)

/**
 * An {@link ExecutorService} that executes each submitted task using
 * one of possibly several pooled threads, normally configured
 * using {@link Executors} factory methods.
 *
 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *
 * <p>To be useful across a wide range of contexts, this class
 * provides many adjustable parameters and extensibility
 * hooks. However, programmers are urged to use the more convenient
 * {@link Executors} factory methods {@link
 * Executors#newCachedThreadPool} (unbounded thread pool, with
 * automatic thread reclamation), {@link Executors#newFixedThreadPool}
 * (fixed size thread pool) and {@link
 * Executors#newSingleThreadExecutor} (single background thread), that
 * preconfigure settings for the most common usage
 * scenarios. Otherwise, use the following guide when manually
 * configuring and tuning this class:
 *
 * <dl>
 *
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 *
 * <dt>Creating new threads</dt>
 *
 * <dd>New threads are created using a {@link ThreadFactory}.  If not
 * otherwise specified, a {@link Executors#defaultThreadFactory} is
 * used, that creates threads to all be in the same {@link
 * ThreadGroup} and with the same {@code NORM_PRIORITY} priority and
 * non-daemon status. By supplying a different ThreadFactory, you can
 * alter the thread's name, thread group, priority, daemon status,
 * etc. If a {@code ThreadFactory} fails to create a thread when asked
 * by returning null from {@code newThread}, the executor will
 * continue, but might not be able to execute any tasks. Threads
 * should possess the "modifyThread" {@code RuntimePermission}. If
 * worker threads or other threads using the pool do not possess this
 * permission, service may be degraded: configuration changes may not
 * take effect in a timely manner, and a shutdown pool may remain in a
 * state in which termination is possible but not completed.</dd>
 *
 * <dt>Keep-alive times</dt>
 *
 * <dd>If the pool currently has more than corePoolSize threads,
 * excess threads will be terminated if they have been idle for more
 * than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
 * This provides a means of reducing resource consumption when the
 * pool is not being actively used. If the pool becomes more active
 * later, new threads will be constructed. This parameter can also be
 * changed dynamically using method {@link #setKeepAliveTime(long,
 * TimeUnit)}.  Using a value of {@code Long.MAX_VALUE} {@link
 * TimeUnit#NANOSECONDS} effectively disables idle threads from ever
 * terminating prior to shut down. By default, the keep-alive policy
 * applies only when there are more than corePoolSize threads. But
 * method {@link #allowCoreThreadTimeOut(boolean)} can be used to
 * apply this time-out policy to core threads as well, so long as the
 * keepAliveTime value is non-zero. </dd>
 *
 * <dt>Queuing</dt>
 *
 * <dd>Any {@link BlockingQueue} may be used to transfer and hold
 * submitted tasks.  The use of this queue interacts with pool sizing:
 *
 * <ul>
 *
 * <li> If fewer than corePoolSize threads are running, the Executor
 * always prefers adding a new thread
 * rather than queuing.</li>
 *
 * <li> If corePoolSize or more threads are running, the Executor
 * always prefers queuing a request rather than adding a new
 * thread.</li>
 *
 * <li> If a request cannot be queued, a new thread is created unless
 * this would exceed maximumPoolSize, in which case, the task will be
 * rejected.</li>
 *
 * </ul>
 *
 * There are three general strategies for queuing:
 * <ol>
 *
 * <li> <em> Direct handoffs.</em> A good default choice for a work
 * queue is a {@link SynchronousQueue} that hands off tasks to threads
 * without otherwise holding them. Here, an attempt to queue a task
 * will fail if no threads are immediately available to run it, so a
 * new thread will be constructed. This policy avoids lockups when
 * handling sets of requests that might have internal dependencies.
 * Direct handoffs generally require unbounded maximumPoolSizes to
 * avoid rejection of new submitted tasks. This in turn admits the
 * possibility of unbounded thread growth when commands continue to
 * arrive on average faster than they can be processed.  </li>
 *
 * <li><em> Unbounded queues.</em> Using an unbounded queue (for
 * example a {@link LinkedBlockingQueue} without a predefined
 * capacity) will cause new tasks to wait in the queue when all
 * corePoolSize threads are busy. Thus, no more than corePoolSize
 * threads will ever be created. (And the value of the maximumPoolSize
 * therefore doesn't have any effect.)  This may be appropriate when
 * each task is completely independent of others, so tasks cannot
 * affect each others execution; for example, in a web page server.
 * While this style of queuing can be useful in smoothing out
 * transient bursts of requests, it admits the possibility of
 * unbounded work queue growth when commands continue to arrive on
 * average faster than they can be processed.  </li>
 *
 * <li><em>Bounded queues.</em> A bounded queue (for example, an
 * {@link ArrayBlockingQueue}) helps prevent resource exhaustion when
 * used with finite maximumPoolSizes, but can be more difficult to
 * tune and control.  Queue sizes and maximum pool sizes may be traded
 * off for each other: Using large queues and small pools minimizes
 * CPU usage, OS resources, and context-switching overhead, but can
 * lead to artificially low throughput.  If tasks frequently block (for
 * example if they are I/O bound), a system may be able to schedule
 * time for more threads than you otherwise allow. Use of small queues
 * generally requires larger pool sizes, which keeps CPUs busier but
 * may encounter unacceptable scheduling overhead, which also
 * decreases throughput.  </li>
 *
 * </ol>
 *
 * </dd>
 *
 * <dt>Rejected tasks</dt>
 *
 * <dd>New tasks submitted in method {@link #execute(Runnable)} will be
 * <em>rejected</em> when the Executor has been shut down, and also when
 * the Executor uses finite bounds for both maximum threads and work queue
 * capacity, and is saturated.  In either case, the {@code execute} method
 * invokes the {@link
 * RejectedExecutionHandler#rejectedExecution(Runnable, ThreadPoolExecutor)}
 * method of its {@link RejectedExecutionHandler}.  Four predefined handler
 * policies are provided:
 *
 * <ol>
 *
 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>
 *
 * </ol>
 *
 * It is possible to define and use other kinds of {@link
 * RejectedExecutionHandler} classes. Doing so requires some care
 * especially when policies are designed to work only under particular
 * capacity or queuing policies. </dd>
 *
 * <dt>Hook methods</dt>
 *
 * <dd>This class provides {@code protected} overridable
 * {@link #beforeExecute(Thread, Runnable)} and
 * {@link #afterExecute(Runnable, Throwable)} methods that are called
 * before and after execution of each task.  These can be used to
 * manipulate the execution environment; for example, reinitializing
 * ThreadLocals, gathering statistics, or adding log entries.
 * Additionally, method {@link #terminated} can be overridden to perform
 * any special processing that needs to be done once the Executor has
 * fully terminated.
 *
 * <p>If hook or callback methods throw exceptions, internal worker
 * threads may in turn fail and abruptly terminate.</dd>
 *
 * <dt>Queue maintenance</dt>
 *
 * <dd>Method {@link #getQueue()} allows access to the work queue
 * for purposes of monitoring and debugging.  Use of this method for
 * any other purpose is strongly discouraged.  Two supplied methods,
 * {@link #remove(Runnable)} and {@link #purge} are available to
 * assist in storage reclamation when large numbers of queued tasks
 * become cancelled.</dd>
 *
 * <dt>Finalization</dt>
 *
 * <dd>A pool that is no longer referenced in a program <em>AND</em>
 * has no remaining threads will be {@code shutdown} automatically. If
 * you would like to ensure that unreferenced pools are reclaimed even
 * if users forget to call {@link #shutdown}, then you must arrange
 * that unused threads eventually die, by setting appropriate
 * keep-alive times, using a lower bound of zero core threads and/or
 * setting {@link #allowCoreThreadTimeOut(boolean)}.  </dd>
 *
 * </dl>
 *
 * <p><b>Extension example</b>. Most extensions of this class
 * override one or more of the protected hook methods. For example,
 * here is a subclass that adds a simple pause/resume feature:
 *
 *  <pre> {@code
 * class PausableThreadPoolExecutor extends ThreadPoolExecutor {
 *   private boolean isPaused;
 *   private ReentrantLock pauseLock = new ReentrantLock();
 *   private Condition unpaused = pauseLock.newCondition();
 *
 *   public PausableThreadPoolExecutor(...) { super(...); }
 *
 *   protected void beforeExecute(Thread t, Runnable r) {
 *     super.beforeExecute(t, r);
 *     pauseLock.lock();
 *     try {
 *       while (isPaused) unpaused.await();
 *     } catch (InterruptedException ie) {
 *       t.interrupt();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void pause() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = true;
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 *
 *   public void resume() {
 *     pauseLock.lock();
 *     try {
 *       isPaused = false;
 *       unpaused.signalAll();
 *     } finally {
 *       pauseLock.unlock();
 *     }
 *   }
 * }}</pre>
 *
 * @since 1.5
 * @author Doug Lea
 */
public class ThreadPoolExecutor extends AbstractExecutorService {
  //...
}

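An ExecutorService implementation that runs each submitted task on one of its pooled threads, normally configured through the Executors factory methods.

The two problems a thread pool solves: performance + management

It improves performance when executing large numbers of asynchronous tasks, by reducing per-task invocation overhead
It bounds and manages the resources (including threads) consumed when executing tasks

Preconfigured Executors factory methods: newCachedThreadPool, newFixedThreadPool, newSingleThreadExecutor

Each ThreadPoolExecutor also maintains some basic statistics, such as the number of completed tasks.

To be useful across a wide range of contexts, this class provides many adjustable parameters and extension points. The Javadoc nevertheless recommends reaching for the preconfigured Executors factory methods first:

Executors.newCachedThreadPool (unbounded thread pool with automatic reclamation of idle threads)
Executors.newFixedThreadPool (fixed-size thread pool)
Executors.newSingleThreadExecutor (single background thread)

To configure the pool manually, follow the guide below: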

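Core pool size (corePoolSize) and maximum pool size (maximumPoolSize)

ThreadPoolExecutor automatically adjusts the pool size according to the following rules:

When a new task is submitted through execute(Runnable):

If running threads < corePoolSize: a new thread is created even if other worker threads are idle.
If corePoolSize ≤ running threads < maximumPoolSize: a new thread is created only when the queue is full.

Dynamically adjustable parameters

Setting corePoolSize = maximumPoolSize creates a fixed-size thread pool.
Setting maximumPoolSize = Integer.MAX_VALUE allows an essentially unbounded number of threads.
The core parameters can be changed at runtime with setCorePoolSize and setMaximumPoolSize.

On-demand thread creation: prestarting with prestartCoreThread or prestartAllCoreThreads

By default, even core threads are created only when tasks arrive. They can be started ahead of time with prestartCoreThread or prestartAllCoreThreads. If you construct the pool with a non-empty queue, you probably want to prestart threads.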
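To make the on-demand behavior concrete, here is a minimal sketch. The class name PrestartDemo and the pool parameters (2 core threads, 4 maximum) are assumptions chosen for illustration; only public ThreadPoolExecutor methods are used.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class PrestartDemo {
    /** Returns {pool size after construction, threads started by prestart, pool size afterwards}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            int before = pool.getPoolSize();             // no task submitted yet, so no thread yet
            int started = pool.prestartAllCoreThreads(); // starts the idle core threads
            int after = pool.getPoolSize();
            return new int[] {before, started, after};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "0 2 2"
    }
}
```

The pool starts empty, and prestartAllCoreThreads reports how many core threads it actually started.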

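Creating threads through a ThreadFactory

New threads are created by a ThreadFactory. By default Executors.defaultThreadFactory is used, which creates threads in the same ThreadGroup, with NORM_PRIORITY priority and non-daemon status. A custom ThreadFactory can change the thread's name, group, priority, daemon status, and so on. If the ThreadFactory returns null from newThread, the executor keeps going but may be unable to execute any tasks. Threads should hold the "modifyThread" RuntimePermission; without it, configuration changes may not take effect in a timely manner, and a shut-down pool may stay in a state where termination is possible but never completes.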
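As an illustration, the sketch below plugs a custom factory into the pool. NamedThreadFactory and the prefix "demo-worker" are hypothetical names invented for this example; the only JDK API assumed is the ThreadFactory interface and the ThreadPoolExecutor constructor that accepts one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
class NamedThreadFactoryDemo {
    /** Names threads "<prefix>-<n>" and marks them as daemon threads. */
    static final class NamedThreadFactory implements ThreadFactory {
        private final String prefix;
        private final AtomicInteger seq = new AtomicInteger(1);

        NamedThreadFactory(String prefix) { this.prefix = prefix; }

        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, prefix + "-" + seq.getAndIncrement());
            t.setDaemon(true);
            return t;
        }
    }

    /** Runs one task and returns the name of the pool thread that executed it. */
    static String workerName() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(), new NamedThreadFactory("demo-worker"));
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(workerName()); // prints "demo-worker-1"
    }
}
```

Readable thread names like this make thread dumps and logs much easier to attribute to a specific pool.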

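Keep-alive time (keepAliveTime)

When the pool has more than corePoolSize threads, excess threads that stay idle longer than keepAliveTime are terminated. The value can be changed at runtime with setKeepAliveTime. Setting it to Long.MAX_VALUE nanoseconds effectively disables reclamation of idle threads. By default core threads are never reclaimed, but allowCoreThreadTimeOut(true) applies the same time-out policy to core threads (this requires keepAliveTime > 0).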
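The constraint between allowCoreThreadTimeOut and keepAliveTime can be checked with a small sketch. CoreTimeoutDemo is a hypothetical class name; the behavior relied on is the documented IllegalArgumentException of allowCoreThreadTimeOut(true) when the keep-alive time is not greater than zero.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class CoreTimeoutDemo {
    /** Returns true if allowCoreThreadTimeOut(true) is accepted for the given keep-alive time. */
    static boolean coreTimeoutAccepted(long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, keepAliveMillis, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            pool.allowCoreThreadTimeOut(true); // rejected when the keep-alive time is zero
            return pool.allowsCoreThreadTimeOut();
        } catch (IllegalArgumentException e) {
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(coreTimeoutAccepted(0));   // prints "false"
        System.out.println(coreTimeoutAccepted(500)); // prints "true"
    }
}
```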

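Pool sizing policy (the core flow in three steps)

The choice of queue interacts closely with pool sizing:
Running threads < corePoolSize: prefer creating a new thread over queuing
Running threads ≥ corePoolSize: prefer queuing over creating a new thread
If the task cannot be queued: create a new thread (as long as maximumPoolSize is not exceeded), otherwise trigger the rejection policy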
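The three steps can be observed directly with a deliberately tiny pool. This is a sketch under assumed parameters (1 core thread, 2 maximum threads, a queue of capacity 1); SizingFlowDemo is a hypothetical class name. Every task blocks on a latch so that no worker becomes free during the experiment.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class SizingFlowDemo {
    /** Submits four blocking tasks and logs what happened to each one. */
    static String run() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        StringBuilder log = new StringBuilder();
        try {
            for (int i = 1; i <= 4; i++) {
                String outcome;
                try {
                    pool.execute(blocker);
                    outcome = "threads=" + pool.getPoolSize() + ",queued=" + pool.getQueue().size();
                } catch (RejectedExecutionException e) {
                    outcome = "rejected"; // default AbortPolicy
                }
                log.append(outcome).append(i < 4 ? ";" : "");
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        // prints "threads=1,queued=0;threads=1,queued=1;threads=2,queued=1;rejected"
        System.out.println(run());
    }
}
```

Task 1 gets a core thread, task 2 is queued, task 3 forces a non-core thread because the queue is full, and task 4 is rejected.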

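Three typical queuing strategies: SynchronousQueue, LinkedBlockingQueue, ArrayBlockingQueue

Direct handoff (e.g. SynchronousQueue):
- No buffering; if no thread is immediately available to take the task, a new thread is created
- Generally needs an unbounded maximumPoolSize to avoid rejections
- Avoids lockups when requests have internal dependencies, but the thread count can grow without bound
Unbounded queue (e.g. LinkedBlockingQueue without a capacity):
- At most corePoolSize threads are ever created, so maximumPoolSize has no effect
- Suits tasks that are completely independent of each other (e.g. a web page server)
- The queue can grow without bound
Bounded queue (e.g. ArrayBlockingQueue):
- Queue size and maximum pool size must be traded off carefully
- Large queue + small pool: low CPU usage, OS resource usage, and context-switching overhead, but throughput can be artificially low
- Small queue + large pool: keeps CPUs busier, but scheduling overhead may become unacceptable
- If tasks block frequently (for example, I/O-bound tasks), the system may be able to schedule more threads than a small pool allows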
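The difference between a direct handoff and an unbounded queue shows up in the pool size. The following is a sketch with assumed parameters (1 core thread, 10 maximum threads, 5 blocking tasks); QueueStrategyDemo is a hypothetical class name.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class QueueStrategyDemo {
    /** Submits n blocking tasks to a pool using the given queue and returns the pool size. */
    static int poolSizeAfter(BlockingQueue<Runnable> queue, int n) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 10, 60L, TimeUnit.SECONDS, queue);
        try {
            for (int i = 0; i < n; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Unbounded queue: tasks wait in the queue, maximumPoolSize is never used.
        System.out.println(poolSizeAfter(new LinkedBlockingQueue<Runnable>(), 5)); // prints 1
        // Direct handoff: nothing can be queued, so each busy period adds a thread.
        System.out.println(poolSizeAfter(new SynchronousQueue<Runnable>(), 5));    // prints 5
    }
}
```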

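Four built-in rejection policies: AbortPolicy, CallerRunsPolicy, DiscardPolicy, DiscardOldestPolicy

New tasks are rejected when the pool has been shut down, or when it uses finite bounds for both maximum threads and queue capacity and is saturated. In either case, execute calls the rejectedExecution method of its RejectedExecutionHandler. Four predefined policies are provided:

ThreadPoolExecutor.AbortPolicy: the default; throws RejectedExecutionException
ThreadPoolExecutor.CallerRunsPolicy: the thread that called execute runs the task itself
ThreadPoolExecutor.DiscardPolicy: silently drops the rejected task
ThreadPoolExecutor.DiscardOldestPolicy: drops the task at the head of the queue (the oldest one) and retries the submission

When writing a custom policy, make sure it is compatible with the capacity and queuing policy in use.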
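As a quick illustration of CallerRunsPolicy, the sketch below saturates a one-thread pool and checks which thread runs the rejected task. CallerRunsDemo is a hypothetical class name, and the SynchronousQueue is an assumption chosen so that nothing can be buffered.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the JDK.
class CallerRunsDemo {
    /** Returns true if the rejected task ran on the submitting thread. */
    static boolean rejectedTaskRanOnCaller() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(), new ThreadPoolExecutor.CallerRunsPolicy());
        AtomicReference<Thread> ranOn = new AtomicReference<>();
        try {
            pool.execute(() -> { // occupies the only worker thread
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // The pool is saturated, so this task is rejected and runs synchronously here.
            pool.execute(() -> ranOn.set(Thread.currentThread()));
            return ranOn.get() == Thread.currentThread();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedTaskRanOnCaller()); // prints "true"
    }
}
```

Because the submitter is busy running the task, it cannot submit more work in the meantime, which is the feedback effect the Javadoc describes.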

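Hook methods: beforeExecute, afterExecute, terminated

You can extend the pool by overriding these protected methods:
- beforeExecute(Thread, Runnable): called before each task runs (e.g. to reinitialize ThreadLocals)
- afterExecute(Runnable, Throwable): called after each task runs (e.g. to gather statistics)
- terminated: called once the pool has fully terminated

Note: if a hook method throws an exception, the worker thread may fail and terminate abruptly.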
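Besides the pause/resume example in the Javadoc, a simpler way to see the hooks fire is to count them. CountingPool below is a hypothetical subclass written for this article, a minimal sketch rather than a production-ready extension.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical subclass that counts hook invocations.
class CountingPool extends ThreadPoolExecutor {
    final AtomicInteger before = new AtomicInteger();
    final AtomicInteger after = new AtomicInteger();
    final AtomicBoolean terminatedCalled = new AtomicBoolean();

    CountingPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        before.incrementAndGet();
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        after.incrementAndGet();
    }

    @Override
    protected void terminated() {
        super.terminated();
        terminatedCalled.set(true);
    }

    /** Runs n no-op tasks, shuts down, and reports "before/after/terminated". */
    static String run(int n) {
        CountingPool pool = new CountingPool(2);
        for (int i = 0; i < n; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown(); // queued tasks still run after shutdown()
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return pool.before.get() + "/" + pool.after.get() + "/" + pool.terminatedCalled.get();
    }

    public static void main(String[] args) {
        System.out.println(run(3)); // prints "3/3/true"
    }
}
```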

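Queue maintenance: getQueue, remove, purge

getQueue() gives access to the work queue for monitoring and debugging; using it for any other purpose is strongly discouraged. remove(Runnable) and purge help reclaim storage when large numbers of queued tasks are cancelled.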

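Finalization and resource reclamation: keep-alive time, corePoolSize = 0, core thread time-out

A pool that is no longer referenced and has no remaining threads is shut down automatically. To make sure an unreferenced pool is reclaimed even if nobody calls shutdown, arrange for idle threads to die eventually:
- Set an appropriate (finite) keepAliveTime; note that it must be greater than 0 if core thread time-out is enabled
- Set corePoolSize = 0, and/or
- Enable core thread time-out with allowCoreThreadTimeOut(true)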

A classic use case: Spring's ThreadPoolTaskExecutor

org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor is Spring's thread-pool task executor. Internally it also submits tasks through java.util.concurrent.ThreadPoolExecutor#execute.


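How does ThreadPoolExecutor execute tasks?

The control variable: one AtomicInteger holding 3 bits (pool lifecycle state) + 29 bits (thread count)

Let's start with this rather neat piece of code, which packs the thread count and the pool state into a single variable: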

/**
 * The main pool control state, ctl, is an atomic integer packing
 * two conceptual fields
 *   workerCount, indicating the effective number of threads
 *   runState,    indicating whether running, shutting down etc
 *
 * In order to pack them into one int, we limit workerCount to
 * (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
 * billion) otherwise representable. If this is ever an issue in
 * the future, the variable can be changed to be an AtomicLong,
 * and the shift/mask constants below adjusted. But until the need
 * arises, this code is a bit faster and simpler using an int.
 *
 * The workerCount is the number of workers that have been
 * permitted to start and not permitted to stop.  The value may be
 * transiently different from the actual number of live threads,
 * for example when a ThreadFactory fails to create a thread when
 * asked, and when exiting threads are still performing
 * bookkeeping before terminating. The user-visible pool size is
 * reported as the current size of the workers set.
 *
 * The runState provides the main lifecycle control, taking on values:
 *
 *   RUNNING:  Accept new tasks and process queued tasks
 *   SHUTDOWN: Don't accept new tasks, but process queued tasks
 *   STOP:     Don't accept new tasks, don't process queued tasks,
 *             and interrupt in-progress tasks
 *   TIDYING:  All tasks have terminated, workerCount is zero,
 *             the thread transitioning to state TIDYING
 *             will run the terminated() hook method
 *   TERMINATED: terminated() has completed
 *
 * The numerical order among these values matters, to allow
 * ordered comparisons. The runState monotonically increases over
 * time, but need not hit each state. The transitions are:
 *
 * RUNNING -> SHUTDOWN
 *    On invocation of shutdown(), perhaps implicitly in finalize()
 * (RUNNING or SHUTDOWN) -> STOP
 *    On invocation of shutdownNow()
 * SHUTDOWN -> TIDYING
 *    When both queue and pool are empty
 * STOP -> TIDYING
 *    When pool is empty
 * TIDYING -> TERMINATED
 *    When the terminated() hook method has completed
 *
 * Threads waiting in awaitTermination() will return when the
 * state reaches TERMINATED.
 *
 * Detecting the transition from SHUTDOWN to TIDYING is less
 * straightforward than you'd like because the queue may become
 * empty after non-empty and vice versa during SHUTDOWN state, but
 * we can only terminate if, after seeing that it is empty, we see
 * that workerCount is 0 (which sometimes entails a recheck -- see
 * below).
 */
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

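As the layout of ctl shows, the high 3 bits hold the run state. Three bits can encode 8 values, but the design uses only 5 states.

|<---- 3 bits --->|<--------- 29 bits -------->|
|   runState      |       workerCount          |

State values:

State	Decimal value (low 29 bits zeroed)	Binary	Description
RUNNING	-536870912	`111`00000…	Accepts new tasks and processes queued tasks
SHUTDOWN	0	`000`00000…	Accepts no new tasks, but still processes queued tasks
STOP	536870912	`001`00000…	Accepts no new tasks, does not process queued tasks, and interrupts in-progress tasks
TIDYING	1073741824	`010`00000…	All tasks have terminated and workerCount is 0; the transitioning thread will run the terminated() hook
TERMINATED	1610612736	`011`00000…	terminated() has completed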
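The packing and unpacking arithmetic can be reproduced outside the JDK. The sketch below copies the constants shown above into a hypothetical class CtlBits; the helpers ctlOf, runStateOf, and workerCountOf mirror the names used in the JDK 8 source but are reimplemented here for illustration, and describe is an invented helper.

```java
// Hypothetical stand-alone copy of the ctl bit layout, for experimentation only.
class CtlBits {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // low 29 bits set
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;

    /** Packs a run state (high 3 bits) and a worker count (low 29 bits) into one int. */
    static int ctlOf(int runState, int workerCount) { return runState | workerCount; }

    /** Masks away the low 29 bits, leaving only the run state. */
    static int runStateOf(int ctl) { return ctl & ~CAPACITY; }

    /** Masks away the high 3 bits, leaving only the worker count. */
    static int workerCountOf(int ctl) { return ctl & CAPACITY; }

    /** Renders a ctl value as "<high 3 bits>|<worker count>". */
    static String describe(int ctl) {
        String bits = String.format("%32s", Integer.toBinaryString(ctl)).replace(' ', '0');
        return bits.substring(0, 3) + "|" + workerCountOf(ctl);
    }

    public static void main(String[] args) {
        System.out.println(describe(ctlOf(RUNNING, 5))); // prints "111|5"
        System.out.println(describe(ctlOf(STOP, 2)));    // prints "001|2"
    }
}
```

Because RUNNING is the only negative value, a check such as c < SHUTDOWN is enough to test whether the pool is running, which is why the numerical order of the states matters.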

With this background in place, let's look at how the pool accepts and runs a submitted task.

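ThreadPoolExecutor#execute: submitting a task to the pool

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();

    // Read the packed control state (run state + worker count)
    int c = ctl.get();

    // ===== Stage 1: try to handle the task with a core thread =====
    if (workerCountOf(c) < corePoolSize) { // worker count < corePoolSize
        if (addWorker(command, true))      // try to create a core thread (true = corePoolSize is the bound)
            return;                        // created successfully, done
        c = ctl.get();  // creation failed (a concurrent submit took the slot, or the pool was shut down); re-read the state
    }

    // ===== Stage 2: try to enqueue the task =====
    if (isRunning(c) && workQueue.offer(command)) { // pool is still RUNNING and the task was enqueued
        int recheck = ctl.get(); // recheck (the state may have changed in the meantime)

        if (!isRunning(recheck) && remove(command)) // pool is no longer running: remove the task and reject it
            reject(command);
        else if (workerCountOf(recheck) == 0)       // no workers left (e.g. corePoolSize is 0)
            addWorker(null, false); // start a non-core thread with no first task; it will pull from the queue
    }

    // ===== Stage 3: try to create a non-core thread =====
    else if (!addWorker(command, false)) // could not enqueue; try a non-core thread (false = maximumPoolSize is the bound)
        reject(command);                 // creation failed: trigger the rejection policy
}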

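Stage 1: the pool first checks whether the current worker count is below corePoolSize, i.e. there is still room for core threads. If so, it calls addWorker(command, true) to create a core thread. According to the source, the boolean true means corePoolSize is used as the upper bound when creating the thread. I find it just as natural to read core as "is the thread being created a core thread", which is consistent with that description.

Stage 2: if the worker count is >= corePoolSize (no room for more core threads) and the pool is running, the task is offered to the work queue with workQueue.offer(command). The isRunning check comes first because a pool that is not running must not accept new tasks. If the offer succeeds, the state is checked a second time. offer itself does not block (it returns false immediately when the queue is full), but between the first check and the enqueue another thread may have called shutdown, or the last worker may have exited, so c may be stale; this is similar in spirit to double-checked locking. If the pool is no longer running, execute tries to remove the task; if the removal succeeds the task will not run, so it is rejected. If the removal fails, or the pool was still running all along, execute checks whether the worker count is 0. If it is, addWorker(null, false) creates a thread with no initial task to drain the queue; core = false means maximumPoolSize is the upper bound. So a task-less thread is created in two situations: either the pool stopped running after the task was enqueued and the task could no longer be removed, or the pool is running but has no workers at all (for example, corePoolSize is 0 or all workers have timed out).

Stage 3: if the task cannot be queued (the pool is not running, or the queue is full), execute tries to create a non-core thread with addWorker(command, false). Here false means maximumPoolSize is the upper bound. If that fails too, the task is rejected.

My understanding of the three stages:

Stage 1 fills up the core threads first. By default core threads are kept alive once created, so later tasks reuse them without paying for thread creation and destruction again.
Stage 2 applies once the core threads are all in place: the work queue buffers the task and the existing worker threads process it later. No thread is created or destroyed; the task just waits to be consumed.
Stage 3 applies when the load exceeds core threads + queue capacity: extra threads are started, up to maximumPoolSize, to squeeze out more throughput, and anything beyond that is rejected.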
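The workerCountOf(recheck) == 0 branch of stage 2 is easiest to see with corePoolSize = 0. In the sketch below (ZeroCoreDemo is a hypothetical class name, and the pool parameters are assumptions), stage 1 is skipped, the task is enqueued, and a task-less worker is started to drain the queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class ZeroCoreDemo {
    /** Returns {pool size right after execute, 1 if the task ran or 0 otherwise}. */
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 1, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch done = new CountDownLatch(1);
        try {
            pool.execute(done::countDown);             // stage 1 is skipped: 0 < 0 is false
            int sizeAfterExecute = pool.getPoolSize(); // the worker created for the queued task
            boolean ran = done.await(10, TimeUnit.SECONDS);
            return new int[] {sizeAfterExecute, ran ? 1 : 0};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "1 1"
    }
}
```

Without that branch, a pool with no core threads and an unbounded queue would accept the task and never run it.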

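This brings us to how all of these pieces are configured, and which of them are configurable. To make that easier to follow, we first go through how the pool adds a worker thread; configuration is covered later in this article.

ThreadPoolExecutor#addWorker: adding a worker thread to the pool (including starting it)

Within java.util.concurrent.ThreadPoolExecutor#execute, the method to focus on is ThreadPoolExecutor#addWorker, the core method for adding a worker thread.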

/**
 * Creates and starts a new worker thread that runs the given first task.
 * @param firstTask the task the new thread should run first (or null to pull from the queue)
 * @param core true to use corePoolSize as the bound, false to use maximumPoolSize
 * @return whether the thread was created and started
 */
private boolean addWorker(Runnable firstTask, boolean core) {
    // Outer loop: handles changes of the pool's run state
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c); // extract the run state

        // Admission check when the pool is not RUNNING
        if (rs >= SHUTDOWN && // state >= SHUTDOWN (i.e. not RUNNING)
            !(rs == SHUTDOWN && // exception: in SHUTDOWN a task-less worker may still be added
               firstTask == null && // provided it has no first task (it only drains the queue)
               !workQueue.isEmpty())) // and the queue is non-empty
            return false;

        // Inner loop: CAS on the worker count
        for (;;) {
            int wc = workerCountOf(c); // current worker count
            // Capacity check (theoretical maximum reached, or core/maximum limit reached)
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;

            // Atomically increment the worker count with CAS (guards against concurrent creation)
            if (compareAndIncrementWorkerCount(c))
                break retry; // incremented: leave both loops

            // CAS failed: retry
            c = ctl.get(); // re-read the latest state
            if (runStateOf(c) != rs) // run state changed: go back to the outer loop and recheck
                continue retry;
            // otherwise stay in the inner loop and retry the CAS
        }
    }

    // Bookkeeping flags
    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        w = new Worker(firstTask); // create the Worker (which holds the actual thread)
        final Thread t = w.thread;
        if (t != null) {
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock(); // the lock guards the workers set
            try {
                // Recheck (the pool state may have changed while acquiring the lock)
                int rs = runStateOf(ctl.get());

                // State validity check
                if (rs < SHUTDOWN || // RUNNING
                    (rs == SHUTDOWN && firstTask == null)) { // or SHUTDOWN while draining the queue
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();

                    workers.add(w); // add the new Worker to the worker set

                    // Update statistics
                    int s = workers.size();
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                mainLock.unlock();
            }

            // Start the thread once it has been added to the set
            if (workerAdded) {
                t.start(); // actually start the worker thread
                workerStarted = true;
            }
        }
    } finally {
        // Cleanup if the thread failed to start
        if (!workerStarted)
            addWorkerFailed(w); // roll back the worker count, remove the worker, etc.
    }
    return workerStarted;
}

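addWorker first reads the control variable with ctl.get() and checks the run state, then uses workerCountOf to get the thread count and performs the capacity check (the count has reached the theoretical maximum, or the core/maximum limit). If a check fails it returns false and no thread is created. (These checks are all self-contained inside the method.)

If the checks pass, compareAndIncrementWorkerCount uses CAS to add 1 to ctl. On failure it loops and retries until it succeeds, and then break retry jumps out of both loops. (The outer loop re-evaluates the admission check, and the inner loop retries the CAS. This CAS pattern is worth studying and imitating.)

After ctl has been incremented, new Worker(firstTask) creates the worker. Under the main lock, workers.add(w) adds it to the worker set for centralized management, and the lock is released. Only then does t.start() start the thread.

Rechecking under the lock: after creating the worker, addWorker takes the pool's main lock (this.mainLock.lock()) and evaluates the run-state condition once more. This follows the idea of double-checked locking: the condition is checked again inside the lock in case it changed, in either direction, while the lock was being acquired.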
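The check-then-CAS retry loop can be reused outside the thread pool. The following is a simplified sketch of the same idea on a plain AtomicInteger; BoundedCounter and its methods are hypothetical names, and it leaves out the run-state handling that addWorker performs in its outer loop.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class illustrating the bounded CAS-increment pattern.
class BoundedCounter {
    private final AtomicInteger count = new AtomicInteger();
    private final int limit;

    BoundedCounter(int limit) { this.limit = limit; }

    /** Reserves one slot unless the limit is reached; retries when the CAS loses a race. */
    boolean tryAcquire() {
        for (;;) {
            int c = count.get();
            if (c >= limit)
                return false;                  // capacity check on the value just read
            if (count.compareAndSet(c, c + 1))
                return true;                   // nobody changed the value in between
            // CAS failed: another thread updated count, so loop and re-read
        }
    }

    /** Lets the given number of threads race for the slots; returns how many succeeded. */
    static int race(int threads, int limit) {
        BoundedCounter counter = new BoundedCounter(limit);
        AtomicInteger successes = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                if (counter.tryAcquire()) successes.incrementAndGet();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return successes.get();
    }

    public static void main(String[] args) {
        System.out.println(race(32, 10)); // prints 10
    }
}
```

The capacity check and the increment stay consistent without a lock, because the increment only succeeds if the value is still the one that passed the check.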

The four built-in rejection policies of ThreadPoolExecutor

The reject method is implemented as follows:

/**
 * Handler called when saturated or shutdown in execute.
 */
private volatile RejectedExecutionHandler handler;

final void reject(Runnable command) {
    handler.rejectedExecution(command, this);
}

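ThreadPoolExecutor ships its own implementations of RejectedExecutionHandler as static nested classes.

/**
 * Runs the rejected task directly in the thread that called execute
 * (unless the pool has been shut down, in which case the task is discarded).
 *
 * Typical use: tasks must not be lost and synchronous execution is acceptable (e.g. critical logging)
 */
public static class CallerRunsPolicy implements RejectedExecutionHandler {
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {  // only if the pool has not been shut down
            r.run();  // run the task in the current (caller) thread
        }
    }
}

/**
 * Throws RejectedExecutionException when a task is rejected.
 *
 * Typical use: systems that must handle overload strictly (e.g. financial trading systems)
 */
public static class AbortPolicy implements RejectedExecutionHandler {
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        throw new RejectedExecutionException("Task " + r + " rejected from " + e);
        // forces the caller to handle the failure instead of failing silently
    }
}

/**
 * Silently discards the rejected task (no notification at all).
 *
 * Typical use: non-critical tasks that may be dropped (e.g. sampled monitoring data)
 */
public static class DiscardPolicy implements RejectedExecutionHandler {
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        // empty body == the task is dropped
    }
}

/**
 * Discards the oldest unhandled task in the queue, then retries the current task
 * (if the pool has been shut down, the task is simply discarded).
 *
 * Typical use: newer requests should take priority (e.g. real-time message push)
 */
public static class DiscardOldestPolicy implements RejectedExecutionHandler {
    public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        if (!e.isShutdown()) {
            e.getQueue().poll();  // remove the old task at the head of the queue
            e.execute(r);         // resubmit the current task
        }
    }
}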

The code inside each rejection policy is very simple; what matters is choosing the right one for the scenario.
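To see DiscardOldestPolicy in action, the sketch below uses a one-thread pool with a queue of capacity 1. DiscardOldestDemo and the task names A, B, C are hypothetical, chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class DiscardOldestDemo {
    /** Returns the names of the tasks that actually ran, in execution order. */
    static List<String> run() {
        CountDownLatch release = new CountDownLatch(1);
        List<String> ran = Collections.synchronizedList(new ArrayList<>());
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1), new ThreadPoolExecutor.DiscardOldestPolicy());
        pool.execute(() -> {              // A: occupies the only worker
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ran.add("A");
        });
        pool.execute(() -> ran.add("B")); // B: waits in the queue
        pool.execute(() -> ran.add("C")); // C: rejected, so B is dropped and C is queued instead
        release.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(ran);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "[A, C]"
    }
}
```

Task B is lost without any notification, which is the price of this policy.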

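How are tasks in the ThreadPoolExecutor work queue (workQueue) consumed?

They are consumed by the thread started with t.start() in addWorker. Why?

Because t is an ordinary Thread, but the Runnable it runs is the Worker itself: w.thread is created by the thread factory with the Worker as its target. Look at the Worker constructor: the parameter name Runnable firstTask is itself a hint. The task passed to new Worker(firstTask) is only the first task this worker executes, which implies that it goes on to process queued tasks afterwards (tasks are not stored anywhere other than the queue).

So what exactly does that thread run? Here is the constructor: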

Worker(Runnable firstTask) {
    setState(-1); // inhibit interrupts until runWorker
    this.firstTask = firstTask;
    this.thread = getThreadFactory().newThread(this);
}

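The Worker passes itself (this) to the thread factory as the Runnable and stores the resulting thread in this.thread. Worker implements Runnable, and its run method is shown below. The comment "Delegates main run loop to outer runWorker" sums up the delegation: the worker's main run loop (fetching and executing tasks) is not implemented in Worker.run itself but delegated to the runWorker method of the enclosing ThreadPoolExecutor. The pool has no dedicated "main thread"; each worker thread runs this loop on its own.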

/** Delegates main run loop to outer runWorker  */
public void run() {
    runWorker(this);
}
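/**
 * The main run loop of a worker thread: repeatedly gets tasks from the queue and executes them.
 * Key points:
 * 1. The first task runs directly; later tasks are obtained through getTask()
 * 2. The worker is locked while a task runs, so it is not interrupted as an idle worker
 * 3. beforeExecute/afterExecute provide extension points
 * 4. Exceptions thrown by tasks are handled uniformly
 *
 * @param w the Worker bound to this thread
 */
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;  // the Worker's first task
    w.firstTask = null;
    w.unlock(); // allow interrupts (resets the Worker lock state from -1 to 0)

    boolean completedAbruptly = true; // whether the loop was left because of an exception
    try {
        // Task loop: run the first task, then pull from the queue
        while (task != null || (task = getTask()) != null) {
            w.lock();  // mark the worker as busy while the task runs

            // State check: if the pool is stopping, make sure the thread is interrupted;
            // otherwise make sure it is not
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() && runStateAtLeast(ctl.get(), STOP)))
                && !wt.isInterrupted()) {
                wt.interrupt(); // interrupt the thread
            }

            try {
                beforeExecute(wt, task); // pre-execution hook
                Throwable thrown = null;
                try {
                    task.run();  // run the task
                } catch (RuntimeException | Error x) { // unchecked exceptions and errors are rethrown as-is
                    thrown = x;
                    throw x;
                } catch (Throwable x) { // any other Throwable is wrapped in an Error
                    thrown = x;
                    throw new Error(x);
                } finally {
                    afterExecute(task, thrown); // post-execution hook (also runs after an exception)
                }
            } finally {
                task = null; // clear the task reference
                w.completedTasks++;
                w.unlock(); // release the lock
            }
        }
        completedAbruptly = false; // normal exit
    } finally {
        processWorkerExit(w, completedAbruptly); // handle the thread's exit
    }
}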

Stripped down, the logic is

 while (task != null || (task = getTask()) != null) task.run();
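To show this loop running on its own, here is a toy worker that follows the same shape. MiniWorker is a hypothetical class written for this article; its getTask is a simplified stand-in (a timed poll of 200 ms) and it has none of the locking, state checks, or hooks of the real runWorker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical toy worker, loosely modeled on ThreadPoolExecutor.Worker.
class MiniWorker implements Runnable {
    private final BlockingQueue<Runnable> queue;
    private Runnable firstTask;
    final Thread thread;

    MiniWorker(Runnable firstTask, BlockingQueue<Runnable> queue) {
        this.firstTask = firstTask;
        this.queue = queue;
        this.thread = new Thread(this, "mini-worker"); // the thread's target is the worker itself
    }

    /** Simplified stand-in for getTask(): null after 200 ms without work means "exit". */
    private Runnable getTask() {
        try {
            return queue.poll(200, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            return null;
        }
    }

    @Override
    public void run() {
        Runnable task = firstTask;
        firstTask = null;
        while (task != null || (task = getTask()) != null) {
            try {
                task.run();
            } finally {
                task = null; // forces the next iteration to call getTask()
            }
        }
    }

    /** Runs A as the first task and B, C from the queue; returns the execution order. */
    static List<String> demo() {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        queue.add(() -> order.add("B"));
        queue.add(() -> order.add("C"));
        MiniWorker w = new MiniWorker(() -> order.add("A"), queue);
        w.thread.start();
        try {
            w.thread.join(); // returns once the timed poll finds the queue empty
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(order);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[A, B, C]"
    }
}
```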

getTask is what fetches a task from the queue:

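/**
 * The method a worker thread uses to get a task from the queue; it decides
 * whether the thread stays alive or exits.
 *
 * Returning null means the worker must terminate, in which case workerCount is decremented.
 * Conditions that lead to null:
 * 1. The worker count exceeds maximumPoolSize (because maximumPoolSize was lowered at runtime)
 * 2. The pool has entered the STOP state
 * 3. The pool is in the SHUTDOWN state and the queue is empty
 * 4. The wait for a task timed out and the thread is eligible for reclamation
 *    (core thread time-out is allowed, or the current thread count > corePoolSize)
 */
private Runnable getTask() {
    boolean timedOut = false; // did the last poll() time out?

    // Retry loop (copes with state changes and interrupts)
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c); // extract the run state

        // Check whether the pool is shut down and the exit condition holds
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount(); // decrement the worker count
            return null; // condition 2 or 3
        }

        int wc = workerCountOf(c); // current worker count

        // Is the current thread subject to time-out reclamation?
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

        // Check the reclamation condition (condition 1 or 4)
        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) { // never reclaim the last thread while the queue still has tasks
            if (compareAndDecrementWorkerCount(c)) // decrement the worker count with CAS
                return null;
            continue; // CAS failed: retry
        }

        try {
            // Choose how to fetch the task depending on the time-out policy
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) : // timed poll
                workQueue.take(); // blocking take

            if (r != null)
                return r; // got a task

            timedOut = true; // timed out (the next iteration may trigger reclamation)
        } catch (InterruptedException retry) {
            timedOut = false; // interrupted: reset the flag (possibly triggered by shutdownNow)
        }
    }
}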

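workQueue.poll or workQueue.take removes a task from the queue and returns it to the caller's loop for execution.

When is a Worker thread terminated and reclaimed?

The source Javadoc of java.util.concurrent.ThreadPoolExecutor#getTask says:

Performs blocking or timed wait for a task, depending on current configuration settings, or returns null if this worker must exit. In other words, getTask decides whether the thread stays alive or exits.

Returning null means the worker must terminate, and workerCount is decremented in that case. The conditions that lead to null are:

 * The worker count exceeds maximumPoolSize (because maximumPoolSize was lowered at runtime)
 * The pool has entered the STOP state
 * The pool is in the SHUTDOWN state and the queue is empty
 * The wait for a task timed out and the thread is eligible for reclamation (core thread time-out is allowed, or the current thread count > corePoolSize)

This makes sense: once getTask returns null, the condition of the outer loop while (task != null || (task = getTask()) != null) evaluates to false, the loop exits, and the thread finishes.

There are two places that return null. One of them is: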

// Is the current thread subject to time-out reclamation?
boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

// Check the reclamation condition (condition 1 or 4)
if ((wc > maximumPoolSize || (timed && timedOut))
    && (wc > 1 || workQueue.isEmpty())) { // never reclaim the last thread while the queue still has tasks
    if (compareAndDecrementWorkerCount(c)) // decrement the worker count with CAS
        return null;
    continue; // CAS failed: retry
}

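Two variables are key in this logic:

timed: whether the current thread is subject to time-out reclamation. It is true if the global policy flag allowCoreThreadTimeOut (whether core threads may time out) is set, or if the thread count exceeds corePoolSize.
timedOut: the previous timed poll on the queue timed out, i.e. the thread waited in timed mode and got no task.

The source comment on allowCoreThreadTimeOut reads: If false (default), core threads stay alive even when idle. If true, core threads use keepAliveTime to time out waiting for work.

My reading: "stay alive even when idle" here means the thread blocks while waiting for work.

Combined with the code that fetches from the queue, the mechanism becomes clear: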

try {
  // Choose how to fetch the task depending on the time-out policy
  Runnable r = timed ?
      workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) : // timed poll
      workQueue.take(); // blocking take

  if (r != null)
      return r; // got a task

  timedOut = true; // timed out (the next iteration may trigger reclamation)
} catch (InterruptedException retry) {
  timedOut = false; // interrupted: reset the flag (possibly triggered by shutdownNow)
}

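So a worker may exit in two situations:

The worker count is greater than the maximum pool size.
Timeout-based reclamation applies and the wait really did time out (the worker waited keepAliveTime and still got no task).

Both are subject to wc > 1 || workQueue.isEmpty(). Negated, the case in which a worker must never be retired is wc <= 1 && !workQueue.isEmpty(): the queue still holds tasks and the current thread is the only worker left.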
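The condition is easier to reason about when isolated as a pure function. The sketch below mirrors only the boolean expression; the names RetireCheck and shouldRetire are hypothetical, and the real check sits inline in getTask, where it is followed by a CAS on ctl that this sketch leaves out.

```java
class RetireCheck {

    /**
     * Mirrors the boolean condition in getTask that decides whether a worker should exit.
     *
     * @param wc               current worker count
     * @param core             corePoolSize
     * @param max              maximumPoolSize
     * @param allowCoreTimeout value of allowCoreThreadTimeOut
     * @param timedOut         whether the previous timed poll expired without a task
     * @param queueEmpty       whether the work queue is currently empty
     */
    static boolean shouldRetire(int wc, int core, int max,
                                boolean allowCoreTimeout, boolean timedOut, boolean queueEmpty) {
        boolean timed = allowCoreTimeout || wc > core;
        return (wc > max || (timed && timedOut)) && (wc > 1 || queueEmpty);
    }

    public static void main(String[] args) {
        // Surplus worker that timed out: retired
        System.out.println(shouldRetire(3, 2, 5, false, true, true));
        // Core worker, core timeout disabled: kept
        System.out.println(shouldRetire(2, 2, 5, false, true, true));
        // Last worker while the queue still holds tasks: kept
        System.out.println(shouldRetire(1, 0, 5, false, true, false));
    }
}
```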

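Conversely, if timeout-based reclamation does not apply (the worker count does not exceed corePoolSize and core threads are not allowed to time out), the thread stays blocked in workQueue.take() until a task arrives.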

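A few equivalent statements may help:

Leaving the worker loop is what it means for a thread to be reclaimed.
Timeout-based reclamation means that a thread eligible for reclamation waited on the queue for keepAliveTime, got no task, and was then reclaimed.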
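To see timeout-based reclamation of non-core threads in practice, here is a small sketch. It assumes corePoolSize = 1, maximumPoolSize = 3, a 200 ms keepAliveTime, and a SynchronousQueue so that every submitted task forces a new thread; the names KeepAliveDemo and observePoolSizes are illustrative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {

    /** Returns {pool size while all tasks are running, pool size after idling}. */
    static int[] observePoolSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 3, 200, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch started = new CountDownLatch(3);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 3; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await();
            int busy = pool.getPoolSize();
            release.countDown();

            // Poll until the surplus workers have timed out in getTask (5 s safety limit)
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                Thread.sleep(50);
            }
            return new int[] {busy, pool.getPoolSize()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = observePoolSizes();
        System.out.println("busy=" + sizes[0] + ", idle=" + sizes[1]);
    }
}
```

The two surplus workers exit through the timed poll, while the remaining worker no longer satisfies wc > corePoolSize and therefore falls back to the blocking take().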

Passive termination and reclamation in ThreadPoolExecutor

Recall that runWorker ends with a finally block that calls ThreadPoolExecutor#processWorkerExit.

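/**
 * Performs cleanup and bookkeeping for a dying worker. Called only from worker threads.
 * Unless completedAbruptly is set, assumes that workerCount has
 * already been adjusted. This method:
 * 1. Updates the statistics
 * 2. Removes the worker from the workers set
 * 3. Tries to terminate the pool
 * 4. Adds a replacement thread when needed (e.g. the queue still holds tasks but no worker is left)
 *
 * @param w the worker that is exiting
 * @param completedAbruptly true if the worker died because of an exception in a user task
 */
private void processWorkerExit(Worker w, boolean completedAbruptly) {
    // Case 1: on an abrupt exit the worker count must be decremented here
    // (on a normal exit getTask() has already done it)
    if (completedAbruptly)
        decrementWorkerCount();

    // Lock to keep the bookkeeping thread-safe
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        completedTaskCount += w.completedTasks; // add to the completed-task total
        workers.remove(w); // remove this Worker from the set
    } finally {
        mainLock.unlock();
    }

    tryTerminate(); // try to advance the pool toward termination (e.g. SHUTDOWN -> TIDYING)

    int c = ctl.get();
    // Only while the pool is still RUNNING/SHUTDOWN: check whether a replacement is needed
    if (runStateLessThan(c, STOP)) {
        if (!completedAbruptly) { // replacement logic for a normal exit
            // Minimum number of workers required
            int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
            // Special case: keep at least one worker while the queue is non-empty
            if (min == 0 && !workQueue.isEmpty())
                min = 1;

            // Enough workers remain, no replacement needed
            if (workerCountOf(c) >= min)
                return;
        }
        // Add a replacement worker (null first task, so it pulls from the queue)
        addWorker(null, false);
    }
}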

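This method does the following:

Decrements the worker count (only on an abrupt exit; on a normal exit getTask has already done so)
Adds the worker's completed tasks to the pool total and removes the worker from the workers set
Tries to advance the pool toward termination
Adds a replacement thread when needed: after an abrupt exit, or after a normal exit that left fewer workers than the required minimum (for example, a non-empty queue with no worker left).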
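The replacement step after an abrupt exit can be made visible with a single-thread pool whose first task throws. This is a sketch under assumptions: the thread names ("worker-1", "worker-2") come from a thread factory written for this example, which also installs a no-op uncaught-exception handler to keep the output quiet, and WorkerReplacementDemo is a hypothetical name. Note that tasks must be passed to execute; with submit the exception would be captured in the Future and the worker would survive.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerReplacementDemo {

    /** Returns the names of the threads that ran the failing task and the follow-up task. */
    static String[] run() {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            Thread t = new Thread(r, "worker-" + seq.incrementAndGet());
            t.setUncaughtExceptionHandler((thread, ex) -> { /* swallow the demo exception */ });
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>(), factory);
        BlockingQueue<String> names = new LinkedBlockingQueue<>();
        try {
            // First task records its thread, then kills the worker (completedAbruptly = true)
            pool.execute(() -> {
                names.add(Thread.currentThread().getName());
                throw new IllegalStateException("boom");
            });
            // Second task runs on whichever thread the pool created in its place
            pool.execute(() -> names.add(Thread.currentThread().getName()));

            String first = names.poll(5, TimeUnit.SECONDS);
            String second = names.poll(5, TimeUnit.SECONDS);
            return new String[] {first, second};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String[] names = run();
        System.out.println("failing task ran on " + names[0] + ", next task ran on " + names[1]);
    }
}
```

The second task ends up on a different thread than the first, which shows that the dead worker was not reused.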

Trying to advance the pool toward termination

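/**
 * Tries to move the pool to the TERMINATED state.
 * This happens if either:
 * 1. the pool is in SHUTDOWN and both the pool and the queue are empty, or
 * 2. the pool is in STOP and the pool is empty.
 *
 * If the pool is otherwise eligible to terminate but workers remain, interrupts one idle worker to propagate the shutdown signal.
 * This method must be called after any action that might make termination possible (reducing the worker count or removing tasks from the queue).
 * It is non-private so that ScheduledThreadPoolExecutor can access it.
 */
final void tryTerminate() {
    // Retry loop to cope with concurrent state changes
    for (;;) {
        int c = ctl.get();

        // Cases in which termination is not possible (return immediately)
        if (isRunning(c) ||                    // still RUNNING
            runStateAtLeast(c, TIDYING) ||      // already TIDYING/TERMINATED
            (runStateOf(c) == SHUTDOWN && !workQueue.isEmpty())) // SHUTDOWN but the queue is non-empty
            return;

        // Workers remain: interrupt one idle worker to propagate the signal
        if (workerCountOf(c) != 0) {
            interruptIdleWorkers(ONLY_ONE); // interrupt a single idle worker
            return;
        }

        // Lock so the state transition and the signalling happen together
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            // CAS the state to TIDYING (workerCount = 0)
            if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                try {
                    terminated(); // termination hook (subclasses may override)
                } finally {
                    // Final transition to TERMINATED
                    ctl.set(ctlOf(TERMINATED, 0));
                    termination.signalAll(); // wake up threads waiting in awaitTermination
                }
                return; // termination complete
            }
        } finally {
            mainLock.unlock();
        }
        // CAS failed: retry (another thread may have changed the state)
    }
}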

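tryTerminate re-reads ctl on every pass of its loop, so a state change made by another thread is picked up: if termination turns out to be unnecessary or not yet possible, the method simply returns. The transition itself happens in two steps: the CAS to TIDYING, then ctl.set(ctlOf(TERMINATED, 0)) once the terminated() hook has run.

Workers themselves never observe TERMINATED, because that state is only set after workerCount has dropped to 0. What they observe in getTask is SHUTDOWN or STOP: when rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty()) holds, getTask returns null and the worker leaves its loop. Each exiting worker then calls tryTerminate from processWorkerExit, and the last one to exit moves the pool to TIDYING and finally to TERMINATED.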
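The TIDYING to TERMINATED sequence can be observed by overriding the terminated() hook. The following is a minimal sketch; TerminationHookDemo is a hypothetical name, and the check inside the hook relies on isTerminated() reporting true only once the state has reached TERMINATED.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class TerminationHookDemo {

    /** Returns {hook called, isTerminated() inside hook, awaitTermination result, isTerminated() afterwards}. */
    static boolean[] run() {
        AtomicBoolean hookCalled = new AtomicBoolean(false);
        AtomicBoolean terminatedInsideHook = new AtomicBoolean(true);

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>()) {
            @Override
            protected void terminated() {
                hookCalled.set(true);
                // The hook runs while the pool is TIDYING, before TERMINATED is set
                terminatedInsideHook.set(isTerminated());
            }
        };

        try {
            pool.execute(() -> { /* trivial task so that a worker exists */ });
            pool.shutdown();
            boolean finished = pool.awaitTermination(5, TimeUnit.SECONDS);
            return new boolean[] {
                hookCalled.get(), terminatedInsideHook.get(), finished, pool.isTerminated()
            };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("hook called: " + r[0]);
        System.out.println("isTerminated() inside hook: " + r[1]);
        System.out.println("awaitTermination returned: " + r[2]);
        System.out.println("isTerminated() afterwards: " + r[3]);
    }
}
```

awaitTermination only returns true after termination.signalAll() has fired, which is why the hook is guaranteed to have run by then.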

Active termination and reclamation in ThreadPoolExecutor

A pool can also be shut down actively, via shutdown() or shutdownNow().

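/**
 * Initiates an orderly shutdown:
 * 1. No new tasks are accepted
 * 2. Previously submitted tasks (including queued ones) are still executed
 * 3. Has no additional effect if the pool is already shut down
 *
 * Note: this method does not block until the tasks complete; use awaitTermination() for that.
 *
 * @throws SecurityException if the caller is not permitted to shut down the pool
 */
public void shutdown() {
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock(); // lock for atomicity
    try {
        checkShutdownAccess();      // security check (e.g. security manager permissions)
        advanceRunState(SHUTDOWN);  // advance the state to SHUTDOWN
        interruptIdleWorkers();     // interrupt all idle workers (those blocked on the queue)
        onShutdown();              // no-op hook (for ScheduledThreadPoolExecutor)
    } finally {
        mainLock.unlock();
    }
    tryTerminate(); // try to advance toward termination (if the conditions hold)
}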

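/**
 * Shuts the pool down immediately:
 * 1. Attempts to stop all actively executing tasks by interrupting them
 * 2. Discards the tasks waiting in the queue
 * 3. Returns the list of tasks that never started
 *
 * Note:
 * - There is no guarantee that every task stops right away (it depends on how the task responds to interruption)
 * - Use awaitTermination() to wait for the pool to terminate fully
 *
 * @return the list of discarded tasks that never started
 * @throws SecurityException if the caller is not permitted to shut down the pool
 */
public List<Runnable> shutdownNow() {
    List<Runnable> tasks;
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock(); // lock for atomicity
    try {
        checkShutdownAccess();     // security check
        advanceRunState(STOP);     // advance the state to STOP
        interruptWorkers();       // interrupt every started worker thread (including active ones)
        tasks = drainQueue();    // empty the queue and collect the unprocessed tasks
    } finally {
        mainLock.unlock();
    }
    tryTerminate(); // try to advance toward termination
    return tasks;  // return the discarded tasks
}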

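shutdown does the following:

Advances the state to SHUTDOWN
Interrupts all idle workers (those blocked on the queue)
Finally calls tryTerminate to try to advance toward termination.

shutdownNow does the following:

Advances the state to STOP
Interrupts every started worker thread (including active ones); under the hood it iterates over workers and calls java.lang.Thread#interrupt on each
Empties the queue and returns the unprocessed tasks (hence the name drainQueue).
Finally calls tryTerminate to try to advance toward termination.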

Combined with the earlier walkthrough of the task submission and execution source, this gives the following comparison:

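Aspect	shutdown()	shutdownNow()
New task submission	Rejected (RejectedExecutionException under the default AbortPolicy)	Rejected (RejectedExecutionException under the default AbortPolicy)
Tasks already queued	Still executed	Queue is drained and the unexecuted tasks are returned
Running tasks	Not interrupted	Interrupted via Thread.interrupt(); a task that ignores interruption keeps running
Interrupt policy	Interrupts idle workers only (those blocked on the queue)	Interrupts all workers (including those executing a task)
Typical use	Graceful shutdown that lets queued tasks finish	Urgent stop that abandons queued work
Return value	None (void)	List of queued tasks that never ran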
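The differences in the table can be reproduced with one blocked task and three queued tasks on a single-thread pool. This is an illustrative sketch rather than a recommended shutdown routine; the class name ShutdownComparisonDemo and the shape of the returned array are assumptions made for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownComparisonDemo {

    /** Returns {tasks completed, tasks returned by shutdownNow, 1 if a late submission was rejected}. */
    static int[] run(boolean immediate) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch firstStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // One running task that blocks until released (or interrupted)
            pool.execute(() -> {
                firstStarted.countDown();
                try {
                    release.await();
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    // interrupted by shutdownNow(): give up without counting as completed
                }
            });
            // Three more tasks that wait in the queue behind it
            for (int i = 0; i < 3; i++) {
                pool.execute(completed::incrementAndGet);
            }
            firstStarted.await();

            int drained = 0;
            if (immediate) {
                drained = pool.shutdownNow().size(); // interrupts the worker, drains the queue
            } else {
                pool.shutdown();                     // running and queued tasks continue
            }

            int rejected = 0;
            try {
                pool.execute(() -> { });
            } catch (RejectedExecutionException e) {
                rejected = 1;                        // default AbortPolicy throws
            }

            release.countDown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return new int[] {completed.get(), drained, rejected};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] graceful = run(false);
        int[] abrupt = run(true);
        System.out.println("shutdown():    completed=" + graceful[0] + " drained=" + graceful[1] + " rejected=" + graceful[2]);
        System.out.println("shutdownNow(): completed=" + abrupt[0] + " drained=" + abrupt[1] + " rejected=" + abrupt[2]);
    }
}
```

In the shutdown() case the blocked task is left alone because interruptIdleWorkers skips workers that are executing a task, so all four tasks finish. In the shutdownNow() case the blocked task is interrupted and the three queued tasks come back in the returned list.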