线程池ThreadPoolExecutor底层原理源码分析
学习Java线程池可以先看下作者其他文章,可增强对该文章的理解。
Java线程池的状态
Java线程池核心线程数设置
java线程是如何优雅关闭
Java线程池执行任务的具体流程
Java线程池的队列为什么一定是阻塞队列
Java线程池发生异常后,会发生什么
Tomcat线程池和JDK线程池的区别
线程池源码的基础属性和方法
在JDK线程池中,通过一个原子变量java.util.concurrent.ThreadPoolExecutor#ctl来表示当前线程池的状态
和当前线程池中工作的线程数
一个Integer类型怎么能标示两个类型的值呢?
Java一个Integer占4个字节,也就是32个bit。线程池有5个状态,只需要3个bit位即可全部定义全。
/**
* The main pool control state, ctl, is an atomic integer packing
* two conceptual fields
* workerCount, indicating the effective number of threads
* runState, indicating whether running, shutting down etc
*
* In order to pack them into one int, we limit workerCount to
* (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2
* billion) otherwise representable. If this is ever an issue in
* the future, the variable can be changed to be an AtomicLong,
* and the shift/mask constants below adjusted. But until the need
* arises, this code is a bit faster and simpler using an int.
*
* The workerCount is the number of workers that have been
* permitted to start and not permitted to stop. The value may be
* transiently different from the actual number of live threads,
* for example when a ThreadFactory fails to create a thread when
* asked, and when exiting threads are still performing
* bookkeeping before terminating. The user-visible pool size is
* reported as the current size of the workers set.
*
* The runState provides the main lifecycle control, taking on values:
*
* RUNNING: Accept new tasks and process queued tasks
* SHUTDOWN: Don't accept new tasks, but process queued tasks
* STOP: Don't accept new tasks, don't process queued tasks,
* and interrupt in-progress tasks
* TIDYING: All tasks have terminated, workerCount is zero,
* the thread transitioning to state TIDYING
* will run the terminated() hook method
* TERMINATED: terminated() has completed
*
* The numerical order among these values matters, to allow
* ordered comparisons. The runState monotonically increases over
* time, but need not hit each state. The transitions are:
*
* RUNNING -> SHUTDOWN
* On invocation of shutdown(), perhaps implicitly in finalize()
* (RUNNING or SHUTDOWN) -> STOP
* On invocation of shutdownNow()
* SHUTDOWN -> TIDYING
* When both queue and pool are empty
* STOP -> TIDYING
* When pool is empty
* TIDYING -> TERMINATED
* When the terminated() hook method has completed
*
* Threads waiting in awaitTermination() will return when the
* state reaches TERMINATED.
*
* Detecting the transition from SHUTDOWN to TIDYING is less
* straightforward than you'd like because the queue may become
* empty after non-empty and vice versa during SHUTDOWN state, but
* we can only terminate if, after seeing that it is empty, we see
* that workerCount is 0 (which sometimes entails a recheck -- see
* below).
*/
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY = (1 << COUNT_BITS) - 1;
// runState is stored in the high-order bits
private static final int RUNNING = -1 << COUNT_BITS;
private static final int SHUTDOWN = 0 << COUNT_BITS;
private static final int STOP = 1 << COUNT_BITS;
private static final int TIDYING = 2 << COUNT_BITS;
private static final int TERMINATED = 3 << COUNT_BITS;
Integer.SIZE为32,所以COUNT_BITS为29,最终各个状态对应的二级制为:
- RUNNING:11100000 00000000 00000000 00000000
- SHUTDOWN:0000000000000000 00000000 00000000
- STOP:00100000 00000000 00000000 00000000
- TIDYING:01000000 00000000 00000000 00000000
- TERMINATED:01100000 00000000 00000000 00000000
所以,只需要使用一个Integer数字的最高三个bit,就可以表示5种线程池的状态,而剩下的29个bit就可以用来表示工作线程数。(如果的线程数不够用,在JDK源码注释中有说明,可将改类型改为AtomicLong)
状态举例
例如:11100000 00000000 00000000 00001010
表示线程池的状态为RUNNING,线程池中目前在工作的线程
有10个。
“在工作的线程”意思是线程活着,要么在执行任务,要么在阻塞等待任务。
相关方法
在线程池中也提供了一些方法用来获取线程池状态和工作线程数。
// Packing and unpacking ctl
//得到传入数字的高三位 ~CAPACITY为11100000 00000000 00000000 00000000 &操作之后,得到就是c的高3位
private static int runStateOf(int c) { return c & ~CAPACITY; }
//得到传入数字的低29位 &操作之后,得到就是c的低29位
private static int workerCountOf(int c) { return c & CAPACITY; }
// 把运行状态和工作线程数量进行合并的一个方法,不过传入这个方法的两个int数字有限制,rs的低29位都得为0,wc的高3位都得为0,这样经过或运算之后,才能得到准确的ctl。
private static int ctlOf(int rs, int wc) { return rs | wc; }
/*
* Bit field accessors that don't require unpacking ctl.
* These depend on the bit layout and on workerCount being never negative.
*/
// c状态是否小于s状态,比如RUNNING小于SHUTDOWN
private static boolean runStateLessThan(int c, int s) {
return c < s;
}
// c状态是否大于等于s状态,比如STOP大于SHUTDOWN
private static boolean runStateAtLeast(int c, int s) {
return c >= s;
}
// c状态是不是RUNNING,只有RUNNING是小于SHUTDOWN的
private static boolean isRunning(int c) {
return c < SHUTDOWN;
}
/**
* Attempts to CAS-increment the workerCount field of ctl.
*/
// 通过cas来增加工作线程数量,直接对ctl进行加1
// 这个方法没考虑是否超过最大工作线程数的(2的29次方)限制,源码中在调用该方法之前会进行判断的
private boolean compareAndIncrementWorkerCount(int expect) {
return ctl.compareAndSet(expect, expect + 1);
}
/**
* Attempts to CAS-decrement the workerCount field of ctl.
*/
// 通过cas来减少工作线程数量,直接对ctl进行减1
private boolean compareAndDecrementWorkerCount(int expect) {
return ctl.compareAndSet(expect, expect - 1);
}
/**
* Decrements the workerCount field of ctl. This is called only on
* abrupt termination of a thread (see processWorkerExit). Other
* decrements are performed within getTask.
*/
// 通过cas来减少工作线程数量,直接对ctl进行减1。仅在线程突然终止时调用。
private void decrementWorkerCount() {
do {} while (! compareAndDecrementWorkerCount(ctl.get()));
}
execute方法:提交线程任务
/**
* Executes the given task sometime in the future. The task
* may execute in a new thread or in an existing pooled thread.
*
* If the task cannot be submitted for execution, either because this
* executor has been shutdown or because its capacity has been reached,
* the task is handled by the current {@code RejectedExecutionHandler}.
*
* @param command the task to execute
* @throws RejectedExecutionException at discretion of
* {@code RejectedExecutionHandler}, if the task
* cannot be accepted for execution
* @throws NullPointerException if {@code command} is null
*/
public void execute(Runnable command) {
if (command == null)
throw new NullPointerException();
/*
* Proceed in 3 steps:
*
* 1. If fewer than corePoolSize threads are running, try to
* start a new thread with the given command as its first
* task. The call to addWorker atomically checks runState and
* workerCount, and so prevents false alarms that would add
* threads when it shouldn't, by returning false.
*
* 2. If a task can be successfully queued, then we still need
* to double-check whether we should have added a thread
* (because existing ones died since last checking) or that
* the pool shut down since entry into this method. So we
* recheck state and if necessary roll back the enqueuing if
* stopped, or start a new thread if there are none.
*
* 3. If we cannot queue task, then we try to add a new
* thread. If it fails, we know we are shut down or saturated
* and so reject the task.
*/
// 获取ctl
// ctl初始值是ctlOf(RUNNING, 0),表示线程池处于运行中,工作线程数为0
int c = ctl.get();
// 工作线程数小于corePoolSize,则添加工作线程,并把command作为该线程要执行的任务
if (workerCountOf(c) < corePoolSize) {
// true表示添加的是核心工作线程,具体一点就是,在addWorker内部会判断当前工作线程数是不是超过了corePoolSize
// 如果超过了则会添加失败,addWorker返回false,表示不能直接开启新的线程来执行任务,而是应该先入队
if (addWorker(command, true))
return;
// 如果添加核心工作线程失败,那就重新获取ctl,可能是线程池状态被其他线程修改了
// 也可能是其他线程也在向线程池提交任务,导致核心工作线程已经超过了corePoolSize
c = ctl.get();
}
// 线程池状态是否还是RUNNING,如果是就把任务添加到阻塞队列中
if (isRunning(c) && workQueue.offer(command)) {
int recheck = ctl.get();
// 在任务入队时,线程池的状态可能也会发生改变
// 再次检查线程池的状态,如果线程池不是RUNNING了,那就不能再接受任务了,就得把任务从队列中移除,并进行拒绝策略
if (! isRunning(recheck) && remove(command))
//执行拒绝策略
reject(command);
// 如果线程池的状态没有发生改变,仍然是RUNNING,那就不需要把任务从队列中移除掉
// 不过,为了确保刚刚入队的任务有线程会去处理它,需要判断一下工作线程数,如果为0,那就添加一个非核心的工作线程
// 添加的这个线程没有自己的任务,目的就是从队列中获取任务来执行
else if (workerCountOf(recheck) == 0)
addWorker(null, false);
}
// 如果线程池状态不是RUNNING,或者线程池状态是RUNNING但是队列满了,则去添加一个非核心工作线程
// 实际上,addWorker中会判断线程池状态如果不是RUNNING,是不会添加工作线程的
// false表示非核心工作线程,作用是,在addWorker内部会判断当前工作线程数已经超过了maximumPoolSize,如果超过了则会添加不成功,执行拒绝策略
else if (!addWorker(command, false))
reject(command);
}
addWorker方法:添加新线程
private boolean addWorker(Runnable firstTask, boolean core)
addWorker()方法是核心方法,是用来添加线程的,core参数表示添加的是核心线程还是非核心线程。
我们先分析一下,什么时候会添加线程呢?
实际上就要开启一个线程,不管是核心线程还是非核心线程其实都只是一个普通的线程,而核心和非核心的区别在于:
-
添加核心工作线程:判断目前的工作线程数是否超过corePoolSize
-
没有超过:直接开启新的工作线程执行任务
-
超过:不会开启新的工作线程,而是把任务进行入队
-
-
添加非核心工作线程,判断目前的工作线程数是否超过maximumPoolSize
-
没有超过:直接开启新的工作线程执行任务
-
超过:拒绝执行任务
-
所以在addWorker方法中,首先就要判断工作线程有没有超过限制,如果没有超过限制再去开启一个线程。
并且在addWorker方法中,还得判断线程池的状态,如果线程池的状态不是RUNNING状态了,那就没必要要去添加线程了。
当然有一种特例,就是线程池的状态是SHUTDOWN,但是队列中有任务,那此时还是需要添加添加一个线程的。那这种特例是如何产生的呢?
我们前面提到的都是开启新的工作线程,那么工作线程怎么回收呢?不可能开启的工作线程一直活着,因为如果任务由多变少,那也就不需要过多的线程资源,所以线程池中会有机制对开启的工作线程进行回收,如何回收的,后文会提到,我们这里先分析,有没有可能线程池中所有的线程都被回收了,答案的是有的。
首先非核心工作线程被回收是可以理解的,那核心工作线程要不要回收掉呢?其实线程池存在的意义,就是提前生成好线程资源,需要线程的时候直接使用就可以,而不需要临时去开启线程,所以正常情况下,开启的核心工作线程是不用回收掉的,就算暂时没有任务要处理,也不用回收,就让核心工作线程在那等着就可以了。
但是!在线程池中有这么一个参数:allowCoreThreadTimeOut,表示是否允许核心工作线程超时,意思就是是否允许核心工作线程回收,默认这个参数为false,但是我们可以调用allowCoreThreadTimeOut(boolean value)来把这个参数改为true,只要改了,那么核心工作线程也就会被回收了,那这样线程池中的所有工作线程都可能被回收掉,那如果所有工作线程都被回收掉之后,阻塞队列中来了一个任务,这样就形成了特例情况。
源码分析
/**
* Checks if a new worker can be added with respect to current
* pool state and the given bound (either core or maximum). If so,
* the worker count is adjusted accordingly, and, if possible, a
* new worker is created and started, running firstTask as its
* first task. This method returns false if the pool is stopped or
* eligible to shut down. It also returns false if the thread
* factory fails to create a thread when asked. If the thread
* creation fails, either due to the thread factory returning
* null, or due to an exception (typically OutOfMemoryError in
* Thread.start()), we roll back cleanly.
*
* @param firstTask the task the new thread should run first (or
* null if none). Workers are created with an initial first task
* (in method execute()) to bypass queuing when there are fewer
* than corePoolSize threads (in which case we always start one),
* or when the queue is full (in which case we must bypass queue).
* Initially idle threads are usually created via
* prestartCoreThread or to replace other dying workers.
*
* @param core if true use corePoolSize as bound, else
* maximumPoolSize. (A boolean indicator is used here rather than a
* value to ensure reads of fresh values after checking other pool
* state).
* @return true if successful
*/
private boolean addWorker(Runnable firstTask, boolean core) {
retry:
for (;;) {
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
// 线程池如果是SHUTDOWN状态并且队列非空则创建线程
// 线程池如果是SHUTDOWN状态并且队列为空则不创建线程
// 线程池如果是STOP状态则直接不创建线程了
if (rs >= SHUTDOWN &&
! (rs == SHUTDOWN &&
firstTask == null &&
! workQueue.isEmpty()))
return false;
for (;;) {
int wc = workerCountOf(c);
// 判断工作线程数是否超过了限制,如果超过限制了,则return false
if (wc >= CAPACITY ||
wc >= (core ? corePoolSize : maximumPoolSize))
return false;
// 如果没有超过限制,则修改ctl,增加工作线程数,cas成功则退出外层retry循环,去创建新的工作线程
if (compareAndIncrementWorkerCount(c))
break retry;
// 如果cas失败,则表示有其他线程也在提交任务,也在增加工作线程数,此时重新获取ctl
c = ctl.get(); // Re-read ctl
// 如果发现线程池的状态发生了变化,则继续回到retry,重新判断线程池的状态是不是SHUTDOWN或STOP
if (runStateOf(c) != rs)
continue retry;
// else CAS failed due to workerCount change; retry inner loop
// 如果状态没有变化,则继续for循环利用cas来增加工作线程数,直到cas成功
}
}
// ctl修改成功,也就是工作线程数+1成功,接下来就要开启一个新的工作线程了
boolean workerStarted = false;
boolean workerAdded = false;
Worker w = null;
try {
// Worker实现了Runnable接口
// 在构造一个Worker对象时,就会利用ThreadFactory新建一个线程
// Worker对象有两个属性:
// Runnable firstTask:表示Worker待执行的第一个任务,第二个任务会从阻塞队列中获取
// Thread thread:表示Worker对应的线程,就是这个线程来获取队列中的任务并执行的
w = new Worker(firstTask);
// 拿出线程对象,还没有start
final Thread t = w.thread;
if (t != null) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// Recheck while holding lock.
// Back out on ThreadFactory failure or if
// shut down before lock acquired.
//获取当前状态
int rs = runStateOf(ctl.get());
// 如果线程池的状态是RUNNING
// 或者线程池的状态变成了SHUTDOWN,但是当前线程没有自己的第一个任务,那就表示当前调用addWorker方法是为了从队列中获取任务来执行
// 正常情况下线程池的状态如果是SHUTDOWN,是不能创建新的工作线程的,但是队列中如果有任务,那就是上面说的特例情况
if (rs < SHUTDOWN ||
(rs == SHUTDOWN && firstTask == null)) {
// 测试此线程是否处于活动状态
// 如果Worker对象对应的线程已经在运行了,那就有问题,直接抛异常
if (t.isAlive()) // precheck that t is startable
throw new IllegalThreadStateException();
workers.add(w);
int s = workers.size();
// largestPoolSize用来跟踪线程池在运行过程中工作线程数的峰值
if (s > largestPoolSize)
largestPoolSize = s;
workerAdded = true;
}
} finally {
mainLock.unlock();
}
// 运行线程
if (workerAdded) {
t.start();
workerStarted = true;
}
}
} finally {
// 在上述过程中如果抛了异常,需要从works中移除所添加的work,并且还要修改ctl,工作线程数-1,表示新建工作线程失败
if (! workerStarted)
addWorkerFailed(w);
}
// 最后表示添加工作线程成功状态
return workerStarted;
}
addWorker方法核心逻辑
- 判断工作线程数是否超过了限制
- 修改ctl,使得工作线程数+1
- 构造Work对象,并把它添加到workers集合中
- 启动Work对象对应的工作线程
Work类:线程池内部实际创建用来执行任务的线程类
java.util.concurrent.ThreadPoolExecutor.Worker
可以看到,work类继承了AQS,同时实现了Runnable接口。
构造方法
/**
* Creates with given first task and thread from ThreadFactory.
* @param firstTask the first task (null if none)
*/
Worker(Runnable firstTask) {
setState(-1); // inhibit interrupts until runWorker
this.firstTask = firstTask;
// 在利用ThreadFactory创建线程时,会把this,也就是当前Work对象作为Runnable传给线程,所以工作线程运行时,就会执行Worker的run方法:
this.thread = getThreadFactory().newThread(this);
}
run()方法
/** Delegates main run loop to outer runWorker */
public void run() {
// 工作线程运行时的执行逻辑
runWorker(this);
}
runWorker()方法:实际运行线程任务的方法
/**
* Main worker run loop. Repeatedly gets tasks from queue and
* executes them, while coping with a number of issues:
*
* 1. We may start out with an initial task, in which case we
* don't need to get the first one. Otherwise, as long as pool is
* running, we get tasks from getTask. If it returns null then the
* worker exits due to changed pool state or configuration
* parameters. Other exits result from exception throws in
* external code, in which case completedAbruptly holds, which
* usually leads processWorkerExit to replace this thread.
*
* 2. Before running any task, the lock is acquired to prevent
* other pool interrupts while the task is executing, and then we
* ensure that unless pool is stopping, this thread does not have
* its interrupt set.
*
* 3. Each task run is preceded by a call to beforeExecute, which
* might throw an exception, in which case we cause thread to die
* (breaking loop with completedAbruptly true) without processing
* the task.
*
* 4. Assuming beforeExecute completes normally, we run the task,
* gathering any of its thrown exceptions to send to afterExecute.
* We separately handle RuntimeException, Error (both of which the
* specs guarantee that we trap) and arbitrary Throwables.
* Because we cannot rethrow Throwables within Runnable.run, we
* wrap them within Errors on the way out (to the thread's
* UncaughtExceptionHandler). Any thrown exception also
* conservatively causes thread to die.
*
* 5. After task.run completes, we call afterExecute, which may
* also throw an exception, which will also cause thread to
* die. According to JLS Sec 14.20, this exception is the one that
* will be in effect even if task.run throws.
*
* The net effect of the exception mechanics is that afterExecute
* and the thread's UncaughtExceptionHandler have as accurate
* information as we can provide about any problems encountered by
* user code.
*
* @param w the worker
*/
final void runWorker(Worker w) {
// 就是当前工作线程
Thread wt = Thread.currentThread();
// 把Worker要执行的第一个任务拿出来
Runnable task = w.firstTask;
w.firstTask = null;
// 这个地方,后面单独分析中断的时候来分析
w.unlock(); // allow interrupts
boolean completedAbruptly = true;
try {
// 判断当前线程是否有自己的第一个任务,如果没有就从阻塞队列中获取任务
// 如果阻塞队列中也没有任务,那线程就会阻塞在这里
// 但是并不会一直阻塞,在getTask方法中,会根据我们所设置的keepAliveTime来设置阻塞时间
// 如果当前线程去阻塞队列中获取任务时,等了keepAliveTime时间,还没有获取到任务,则getTask方法返回null,相当于退出循环
// 当然并不是所有线程都会有这个超时判断,主要还得看allowCoreThreadTimeOut属性和当前的工作线程数等等,后面单独分析
// 目前,我们只需要知道工作线程在执行getTask()方法时,可能能直接拿到任务,也可能阻塞,也可能阻塞超时最终返回null
while (task != null || (task = getTask()) != null) {
// 只要拿到了任务,就要去执行任务
// Work加锁(可参考shutdown()方法)
w.lock();
// If pool is stopping, ensure thread is interrupted;
// if not, ensure thread is not interrupted. This
// requires a recheck in second case to deal with
// shutdownNow race while clearing interrupt
// 工作线程在运行过程中
// 如果发现线程池的状态变成了STOP,正常来说当前工作线程的中断标记应该为true,如果发现中断标记不为true,则需要中断自己
// 如果线程池的状态不是STOP,要么是RUNNING,要么是SHUTDOWN
// 但是如果发现中断标记为true,那是不对的,因为线程池状态不是STOP,工作线程仍然是要正常工作的,不能中断掉
// 就算是SHUTDOWN,也要等任务都执行完之后,线程才结束,而目前线程还在执行任务的过程中,不能中断
// 所以需要重置线程的中断标记,不过interrupted方法会自动清空中断标记
// 清空为中断标记后,再次判断一下线程池的状态,如果又变成了STOP,那就仍然中断自己
// 中断了自己后,会把当前任务执行完,在下一次循环调用getTask()方法时,从阻塞队列获取任务时,阻塞队列会负责判断当前线程的中断标记
// 如果发现中断标记为true,那就会抛出异常,最终退出while循环,线程执行结束
if ((runStateAtLeast(ctl.get(), STOP)
|| (Thread.interrupted()
&& runStateAtLeast(ctl.get(), STOP)))
&& !wt.isInterrupted())
wt.interrupt();
try {
// 钩子方法,给自定义线程池来实现
beforeExecute(wt, task);
Throwable thrown = null;
try {
// 执行任务
// 执行任务时可能会抛异常,如果抛了异常会先依次执行三个finally,同时completedAbruptly = false这行代码也不会执行
task.run();
} catch (RuntimeException x) {
thrown = x; throw x;
} catch (Error x) {
thrown = x; throw x;
} catch (Throwable x) {
thrown = x; throw new Error(x);
} finally {
// 钩子方法,给自定义线程池来实现
afterExecute(task, thrown);
}
} finally {
task = null;
// 跟踪当前Work总共执行了多少了任务
w.completedTasks++;
w.unlock();
}
}
// 正常退出While循环
// 如果是执行任务的时候抛了异常,虽然也退出了循环,但是是不会执行这行代码的,只会直接进去下面的finally块中
// 所以,要么是线程从队列中获取任务时阻塞超时了从而退出了循环会进入到这里
// 要么是线程在阻塞的过程中被中断了,在getTask()方法中会处理中断的情况,如果被中断了,那么getTask()方法会返回null,从而退出循环
// completedAbruptly=false,表示线程正常退出
completedAbruptly = false;
} finally {
// 因为当前线程退出了循环,如果不做某些处理,那么这个线程就运行结束了,就是上文说的回收(自然消亡)掉了,线程自己运行完了也就结束了
// 但是如果是由于执行任务的时候抛了异常,那么这个线程不应该直接结束,而应该继续从队列中获取下一个任务
// 可是代码都执行到这里了,该怎么继续回到while循环呢,怎么实现这个效果呢?
// 当然,如果是由于线程被中断了,或者线程阻塞超时了,那就应该正常的运行结束
// 只不过有一些善后工作要处理,比如修改ctl,工作线程数-1
processWorkerExit(w, completedAbruptly);
}
}
processWorkerExit():线程退出获取循环任务时做的特殊处理
/**
* Performs cleanup and bookkeeping for a dying worker. Called
* only from worker threads. Unless completedAbruptly is set,
* assumes that workerCount has already been adjusted to account
* for exit. This method removes thread from worker set, and
* possibly terminates the pool or replaces the worker if either
* it exited due to user task exception or if fewer than
* corePoolSize workers are running or queue is non-empty but
* there are no workers.
*
* @param w the worker
* @param completedAbruptly if the worker died due to user exception
*/
private void processWorkerExit(Worker w, boolean completedAbruptly) {
// 如果completedAbruptly为true,表示是执行任务的时候抛了异常,那就修改ctl,工作线程数-1
// 如果completedAbruptly为false,表示是线程阻塞超时了或者被中断了,实际上也要修改ctl,工作线程数-1
// 只不过在getTask方法中已经做过了,这里就不用再做一次了
if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
decrementWorkerCount();
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// 当前Work要运行结束了,将完成的任务数累加到线程池上
completedTaskCount += w.completedTasks;
// 将当前Work对象从workers中移除
workers.remove(w);
} finally {
mainLock.unlock();
}
// 因为当前是处理线程退出流程中,所以要尝试去修改线程池的状态为TINDYING
tryTerminate();
int c = ctl.get();
// 如果线程池的状态为RUNNING或者SHUTDOWN,则可能要替补一个线程
if (runStateLessThan(c, STOP)) {
// completedAbruptly为false,表示线程是正常要退出了,则看是否需要保留线程
if (!completedAbruptly) {
// 如果allowCoreThreadTimeOut为true,同时阻塞队列中还有任务,那就至少得保留一个工作线程来处理阻塞队列中的任务
// 如果allowCoreThreadTimeOut为false,那min就是corePoolSize,表示至少得保留corePoolSize个工作线程活着
int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
if (min == 0 && ! workQueue.isEmpty())
min = 1;
// 如果当前工作线程数大于等于min,则表示符合所需要保留的最小线程数,那就直接return,不会调用下面的addWorker方法新开一个工作线程了
if (workerCountOf(c) >= min)
return; // replacement not needed
}
// 如果线程池的状态为RUNNING或者SHUTDOWN
// 如果completedAbruptly为true,表示当前线程是执行任务时抛了异常,那就得新开一个工作线程
// 如果completedAbruptly为false,但是不符合所需要保留的最小线程数,那也得新开一个工作线程
addWorker(null, false);
}
}
总结
某个工作线程正常情况下会不停的循环从阻塞队列中获取任务来执行,正常情况下就是通过阻塞来保证线程永远活着,但是会有一些特殊情况:
- 如果线程被中断了,那就会退出循环,然后做一些善后处理,比如ctl中的工作线程数-1,然后自己运行结束
- 如果线程阻塞超时了,那也会退出循环,此时就需要判断线程池中的当前工作线程够不够,比如是否有corePoolSize个工作线程,如果不够就需要新开一个线程,然后当前线程自己运行结束,这种看上去效率比较低,但是也没办法,当然如果当前工作线程数足够,那就正常,自己正常的运行结束即可
- 如果线程是在执行任务的时候抛了异常,从而退出循环,那就直接新开一个线程作为替补,当然前提是线程池的状态是RUNNING
getTask()方法:获取队列中任务
/**
* Performs blocking or timed wait for a task, depending on
* current configuration settings, or returns null if this worker
* must exit because of any of:
* 1. There are more than maximumPoolSize workers (due to
* a call to setMaximumPoolSize).
* 2. The pool is stopped.
* 3. The pool is shutdown and the queue is empty.
* 4. This worker timed out waiting for a task, and timed-out
* workers are subject to termination (that is,
* {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
* both before and after the timed wait, and if the queue is
* non-empty, this worker is not the last thread in the pool.
*
* @return task, or null if the worker must exit, in which case
* workerCount is decremented
*/
private Runnable getTask() {
boolean timedOut = false; // Did the last poll() time out?
for (;;) {
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
// 如果线程池状态是STOP,表示当前线程不需要处理任务了,那就修改ctl工作线程数-1
// 如果线程池状态是SHUTDOWN,但是阻塞队列中为空,表示当前任务没有任务要处理了,那就修改ctl工作线程数-1
// return null表示当前线程无需处理任务,线程退出
if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
decrementWorkerCount();
return null;
}
// 当前工作线程数
int wc = workerCountOf(c);
// Are workers subject to culling?
// 用来判断当前线程是无限阻塞还是超时阻塞,如果一个线程超时阻塞,一旦超时了,那么这个线程最终就会退出
// 如果是无限阻塞,那除非被中断了,不然这个线程就一直等着获取队列中的任务
// allowCoreThreadTimeOut为true,表示线程池中的所有线程都可以被回收掉,则当前线程应该直接使用超时阻塞,一旦超时就回收
// allowCoreThreadTimeOut为false,则要看当前工作线程数是否超过了corePoolSize,如果超过了,则表示超过部分的线程要用超时阻塞,一旦超时就回收
boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
// 如果工作线程数超过了工作线程的最大限制或者线程超时了,则要修改ctl,工作线程数减1,并且return null
// return null就会导致外层的while循环退出,从而导致线程直接运行结束
if ((wc > maximumPoolSize || (timed && timedOut))
&& (wc > 1 || workQueue.isEmpty())) {
if (compareAndDecrementWorkerCount(c))
return null;
continue;
}
try {
// 要么超时阻塞,要么无限阻塞
Runnable r = timed ?
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
// 表示没有超时,在阻塞期间获取到了任务
return r;
// 超时了,重新进入循环,上面的代码会判断出来当前线程阻塞超时了,最后return null,线程会运行结束
timedOut = true;
} catch (InterruptedException retry) {
// 从阻塞队列获取任务时,被中断了,也会再次进入循环,此时并不是超时,但是重新进入循环后,会判断线程池的状态
// 如果线程池的状态变成了STOP或者SHUTDOWN,最终也会return null,线程会运行结束
// 但是如果线程池的状态仍然是RUNNING,那当前线程会继续从队列中去获取任务,表示忽略了本次中断
// 只有通过调用线程池的shutdown方法或shutdownNow方法才能真正中断线程池中的线程
timedOut = false;
}
}
}
线程池关闭方法
只有通过调用线程池的shutdown方法或shutdownNow方法才能真正中断线程池中的线程。
在java中,中断一个线程,只是修改了该线程的一个标记,并不是直接kill了这个线程,被中断的线程到底要不要消失,由被中断的线程自己来判断,比如上面代码中,线程遇到了中断异常,它可以选择什么都不做,那线程就会继续进行外层循环,如果选择return,那就退出了循环,后续就会运行结束从而消失。
shutdown():关闭线程池,不在接受新的线程任务,但是会将阻塞队列中剩余的任务执行完
根据前面execute方法的源码,只要线程池的状态不是RUNNING,那么就表示线程池不接受新任务。
shutdown方法要做两件事:
- 修改线程池状态
- 中断线程池中的工作线程。此时又有两种情况:
- 对于正在阻塞等待任务的线程,直接中断
- 对于正在执行任务的线程,只要等它们把任务执行完,就可以中断了。因为此时线程池不能接受新任务,所以正在执行的任务就是最后剩余的任务
源码
/**
* Initiates an orderly shutdown in which previously submitted
* tasks are executed, but no new tasks will be accepted.
* Invocation has no additional effect if already shut down.
*
* <p>This method does not wait for previously submitted tasks to
* complete execution. Use {@link #awaitTermination awaitTermination}
* to do that.
*
* @throws SecurityException {@inheritDoc}
*/
public void shutdown() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
checkShutdownAccess();
// 修改ctl,将线程池状态改为SHUTDOWN
advanceRunState(SHUTDOWN);
// 中断工作线程
interruptIdleWorkers();
// 空方法,给子类扩展使用
onShutdown(); // hook for ScheduledThreadPoolExecutor
} finally {
mainLock.unlock();
}
// 调用terminated方法
tryTerminate();
}
/**
* Common form of interruptIdleWorkers, to avoid having to
* remember what the boolean argument means.
*/
private void interruptIdleWorkers() {
interruptIdleWorkers(false);
}
/**
* Interrupts threads that might be waiting for tasks (as
* indicated by not being locked) so they can check for
* termination or configuration changes. Ignores
* SecurityExceptions (in which case some threads may remain
* uninterrupted).
*
* @param onlyOne If true, interrupt at most one worker. This is
* called only from tryTerminate when termination is otherwise
* enabled but there are still other workers. In this case, at
* most one waiting worker is interrupted to propagate shutdown
* signals in case all threads are currently waiting.
* Interrupting any arbitrary thread ensures that newly arriving
* workers since shutdown began will also eventually exit.
* To guarantee eventual termination, it suffices to always
* interrupt only one idle worker, but shutdown() interrupts all
* idle workers so that redundant workers exit promptly, not
* waiting for a straggler task to finish.
*/
private void interruptIdleWorkers(boolean onlyOne) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// 遍历所有正在工作的线程,要么在执行任务,要么在阻塞等待任务
for (Worker w : workers) {
Thread t = w.thread;
// 如果线程没有被中断,并且能够拿到锁,就中断线程
// Worker在执行任务时会先加锁,执行完任务之后会释放锁
// 所以只要这里拿到了锁,就表示线程空出来了,可以中断了
if (!t.isInterrupted() && w.tryLock()) {
try {
t.interrupt();
} catch (SecurityException ignore) {
} finally {
w.unlock();
}
}
if (onlyOne)
break;
}
} finally {
mainLock.unlock();
}
}
通过源码,会发现一种特殊情况,就是目前所有工作线程都在执行任务,但是阻塞队列中还有剩余任务,那逻辑应该就是这些工作线程执行完当前任务后要继续执行队列中的剩余任务,但是根据我们看到的shutdown方法的逻辑,发现这些工作线程在执行完当前任务后,就会释放锁,那就可能(因为是尝试获取锁,但是又不一定能获取锁)会被中断掉,那队列中剩余的任务怎么办呢?
工作线程一旦被中断,就会进入processWorkerExit方法,根据前面的分析,我们发现,在这个方法中会对线程池状态为SHUTDOWN进行判断,会重新生成新的工作线程,那么这样就能保证队列中剩余的任务一定会被执行完。
shutdownNow():立即关闭线程池
/**
* Attempts to stop all actively executing tasks, halts the
* processing of waiting tasks, and returns a list of the tasks
* that were awaiting execution. These tasks are drained (removed)
* from the task queue upon return from this method.
*
* <p>This method does not wait for actively executing tasks to
* terminate. Use {@link #awaitTermination awaitTermination} to
* do that.
*
* <p>There are no guarantees beyond best-effort attempts to stop
* processing actively executing tasks. This implementation
* cancels tasks via {@link Thread#interrupt}, so any task that
* fails to respond to interrupts may never terminate.
*
* @throws SecurityException {@inheritDoc}
*/
public List<Runnable> shutdownNow() {
List<Runnable> tasks;
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
checkShutdownAccess();
// 修改ctl,将线程池状态改为STOP
advanceRunState(STOP);
// 中断工作线程
interruptWorkers();
// 返回阻塞队列中剩余的任务
tasks = drainQueue();
} finally {
mainLock.unlock();
}
// 调用terminated方法
tryTerminate();
return tasks;
}
/**
* Interrupts all threads, even if active. Ignores SecurityExceptions
* (in which case some threads may remain uninterrupted).
*/
private void interruptWorkers() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// 中断所有工作线程,不管有没有在执行任务
for (Worker w : workers)
w.interruptIfStarted();
} finally {
mainLock.unlock();
}
}
void interruptIfStarted() {
Thread t;
// 只要线程没有被中断,那就中断线程,中断的线程虽然也会进入processWorkerExit方法,但是该方法中判断了线程池的状态
// 线程池状态为STOP的情况下,不会再开启新的工作线程了
// 这里getState>=0表示,一个工作线程在创建好,但是还没运行时,这时state为-1(work构造方法设置的初始状态为-1),不能被中断,就算中断也没意义,都还没运行
if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
try {
t.interrupt();
} catch (SecurityException ignore) {
}
}
}
mainLock:控制workers集合的并发安全
/**
* Lock held on access to workers set and related bookkeeping.
* While we could use a concurrent set of some sort, it turns out
* to be generally preferable to use a lock. Among the reasons is
* that this serializes interruptIdleWorkers, which avoids
* unnecessary interrupt storms, especially during shutdown.
* Otherwise exiting threads would concurrently interrupt those
* that have not yet interrupted. It also simplifies some of the
* associated statistics bookkeeping of largestPoolSize etc. We
* also hold mainLock on shutdown and shutdownNow, for the sake of
* ensuring workers set is stable while separately checking
* permission to interrupt and actually interrupting.
*/
private final ReentrantLock mainLock = new ReentrantLock();
在源码中,会发现很多地方都会用到mainLock,它是线程池中的一把全局锁,主要是用来控制workers集合的并发安全,因为如果没有这把全局锁,就有可能多个线程共用同一个线程池对象,如果一个线程在向线程池提交任务,一个线程在shutdown线程池,如果不做并发控制,那就有可能线程池shutdown了,但是还有工作线程没有被中断,如果1个线程在shutdown,99个线程在提交任务,那么最终就可能导致线程池关闭了,但是线程池中的很多线程都没有停止,仍然在运行,这肯定是不行,所以需要这把全局锁来对workers集合的操作进行并发安全控制。