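Most of us have used thread pools, and the constructor parameters of ThreadPoolExecutor are probably familiar too:
corePoolSize: the number of core threads. Core threads stay alive even when they have nothing to do. If the allowCoreThreadTimeOut property of ThreadPoolExecutor is true, idle core threads are also terminated after keepAliveTime.
maximumPoolSize: the maximum number of threads, i.e. the upper bound on core threads plus non-core threads.
keepAliveTime: how long an idle thread may stay alive. By default it applies only to non-core threads; with allowCoreThreadTimeOut = true it applies to core threads as well.
unit: the time unit of keepAliveTime.
workQueue: the work queue. Once corePoolSize threads exist, new tasks go into the work queue; non-core threads are created only when the queue is full.
threadFactory: the factory that creates threads. It is an interface; you implement its newThread method, typically to give the threads meaningful names.
RejectedExecutionHandler: the rejection policy, applied when a task can be neither queued nor handed to a new thread.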
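To make the parameter list concrete, here is a minimal sketch that passes all seven arguments explicitly. The class name PoolParamsDemo, the helper newPool, the thread-name prefix and the chosen sizes are illustrative assumptions, not values taken from the JDK or from this article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PoolParamsDemo {
    // Hypothetical helper: builds a pool with every constructor argument spelled out.
    static ThreadPoolExecutor newPool() {
        AtomicInteger seq = new AtomicInteger(1);
        // threadFactory: only newThread has to be implemented; here it names the threads.
        ThreadFactory namedFactory =
                r -> new Thread(r, "demo-worker-" + seq.getAndIncrement());
        return new ThreadPoolExecutor(
                2,                                    // corePoolSize
                4,                                    // maximumPoolSize
                30L,                                  // keepAliveTime
                TimeUnit.SECONDS,                     // unit
                new ArrayBlockingQueue<>(10),         // workQueue (bounded, so maximumPoolSize matters)
                namedFactory,                         // threadFactory
                new ThreadPoolExecutor.AbortPolicy()  // RejectedExecutionHandler
        );
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        pool.execute(() ->
                System.out.println("running on " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

A bounded queue is used on purpose: with an unbounded queue the "queue is full" condition never occurs, so the pool would never grow past corePoolSize.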
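Before reading the source, here are a few questions that we will try to answer from it:
Why can a thread pool reuse threads?
How are core threads kept from being destroyed?
How do shutdown and shutdownNow close the pool?
Reading the source
Pool state bits
Let's start with how the pool's state is designed. The design is fairly intricate, and without a clear picture of it the state checks and state changes later on are hard to follow. The complication is that the run state and the worker count are packed into a single variable.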
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY = (1 << COUNT_BITS) - 1;
// runState is stored in the high-order bits
private static final int RUNNING = -1 << COUNT_BITS;
private static final int SHUTDOWN = 0 << COUNT_BITS;
private static final int STOP = 1 << COUNT_BITS;
private static final int TIDYING = 2 << COUNT_BITS;
private static final int TERMINATED = 3 << COUNT_BITS;
// Packing and unpacking ctl
private static int runStateOf(int c) { return c & ~CAPACITY; }
private static int workerCountOf(int c) { return c & CAPACITY; }
private static int ctlOf(int rs, int wc) { return rs | wc; }
private static boolean runStateLessThan(int c, int s) {
return c < s;
}
private static boolean runStateAtLeast(int c, int s) {
return c >= s;
}
private static boolean isRunning(int c) {
return c < SHUTDOWN;
}
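COUNT_BITS = 32 - 3 = 29. Why subtract 3? The pool has 5 states, which need 3 bits to encode, and those are the 3 high-order bits. The low 29 bits hold the worker count, i.e. the number of threads currently in the pool (not the number of tasks). So if someone asks how many threads a thread pool can hold at most, the answer is no longer a mystery: 2^29 - 1.
CAPACITY is that upper bound, (1 << COUNT_BITS) - 1 = 536870911. Its low 29 bits are all 1.
RUNNING through TERMINATED are the pool states, stored in the high 3 bits. Their numeric order (RUNNING is negative, SHUTDOWN is 0, the rest are positive) matches the order of the state transitions, and that is exactly what the three methods runStateLessThan, runStateAtLeast and isRunning rely on.
ctlOf combines a run state and a worker count into one int. So the initial value of ctl means: state RUNNING, 0 workers.
runStateOf extracts the run state from ctl; workerCountOf extracts the worker count.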
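The packing can be tried out in isolation. The sketch below copies the constants and the three pack/unpack methods from the excerpt above into a standalone class; the class name CtlDemo and the display helper highBits are hypothetical additions for illustration only.

```java
class CtlDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;     // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // low 29 bits set

    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)      { return c & ~CAPACITY; }
    static int workerCountOf(int c)   { return c & CAPACITY; }
    static int ctlOf(int rs, int wc)  { return rs | wc; }

    // Hypothetical display helper: the 3 high-order bits as a binary string.
    static String highBits(int state) {
        String bits = Integer.toBinaryString(state >>> COUNT_BITS);
        return "000".substring(bits.length()) + bits;
    }

    public static void main(String[] args) {
        System.out.println("CAPACITY   = " + CAPACITY);            // 536870911
        System.out.println("RUNNING    = " + highBits(RUNNING));    // 111
        System.out.println("SHUTDOWN   = " + highBits(SHUTDOWN));   // 000
        System.out.println("STOP       = " + highBits(STOP));       // 001
        System.out.println("TIDYING    = " + highBits(TIDYING));    // 010
        System.out.println("TERMINATED = " + highBits(TERMINATED)); // 011

        int ctl = ctlOf(RUNNING, 5);                 // RUNNING with 5 workers
        System.out.println(runStateOf(ctl) == RUNNING); // true
        System.out.println(workerCountOf(ctl));         // 5
        System.out.println(ctl < SHUTDOWN);             // true: this is the isRunning check
    }
}
```

Because RUNNING sets the sign bit, any ctl value in the RUNNING state is negative regardless of the worker count, which is why isRunning can be a single comparison against SHUTDOWN.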
With the groundwork done, on to the main topic, starting from the entry point: the execute method.
public void execute(Runnable command) {
if (command == null)
throw new NullPointerException();
/*
* Proceed in 3 steps:
*
* 1. If fewer than corePoolSize threads are running, try to
* start a new thread with the given command as its first
* task. The call to addWorker atomically checks runState and
* workerCount, and so prevents false alarms that would add
* threads when it shouldn't, by returning false.
*
* 2. If a task can be successfully queued, then we still need
* to double-check whether we should have added a thread
* (because existing ones died since last checking) or that
* the pool shut down since entry into this method. So we
* recheck state and if necessary roll back the enqueuing if
* stopped, or start a new thread if there are none.
*
* 3. If we cannot queue task, then we try to add a new
* thread. If it fails, we know we are shut down or saturated
* and so reject the task.
*/
int c = ctl.get();
if (workerCountOf(c) < corePoolSize) {
if (addWorker(command, true))
return;
c = ctl.get();
}
if (isRunning(c) && workQueue.offer(command)) {
int recheck = ctl.get();
if (! isRunning(recheck) && remove(command))
reject(command);
else if (workerCountOf(recheck) == 0)
addWorker(null, false);
}
else if (!addWorker(command, false))
reject(command);
}
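The comment already spells out the three steps; let's see how the code implements them. command is the Runnable task passed in. We will look at addWorker in detail later; for now, read it as "create and start a new worker thread".
Step 1: if the current number of threads (extracted by workerCountOf) is below corePoolSize, call addWorker with true, meaning the new thread counts against the core limit. In other words, as long as the core threads have not all been created, a core thread is created first.
Step 2: if the pool is RUNNING, offer the task to workQueue. After a successful offer the state is checked again: if the pool is no longer running, the task is removed and rejected; if no worker is left, a worker without a first task is started so that the queued task gets picked up.
Step 3: reached only when step 1 did not start a thread and the task could not be queued. addWorker tries to start a non-core thread; if that fails, the rejection policy is executed.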
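The three steps can be observed from the outside with a deliberately tiny pool. This is a sketch under assumed settings (1 core thread, at most 2 threads, a queue of capacity 1, the default AbortPolicy); the class name ExecuteFlowDemo and the log format are made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteFlowDemo {
    // Submits four blocking tasks and records what the pool did with each one.
    static List<String> run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await();                    // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        List<String> log = new ArrayList<>();
        for (int i = 1; i <= 4; i++) {
            try {
                pool.execute(blocking);
                log.add("task" + i + " threads=" + pool.getPoolSize()
                        + " queued=" + pool.getQueue().size());
            } catch (RejectedExecutionException e) {
                log.add("task" + i + " rejected");  // step 3 failed: pool is saturated
            }
        }

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        run().forEach(System.out::println);
        // task1 threads=1 queued=0   (step 1: core thread created)
        // task2 threads=1 queued=1   (step 2: queued)
        // task3 threads=2 queued=1   (step 3: queue full, non-core thread created)
        // task4 rejected             (step 3: addWorker failed)
    }
}
```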
The key method here is addWorker; let's see how it adds a thread.
private boolean addWorker(Runnable firstTask, boolean core) {
retry:
for (;;) {
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
if (rs >= SHUTDOWN &&
! (rs == SHUTDOWN &&
firstTask == null &&
! workQueue.isEmpty()))
return false;
for (;;) {
int wc = workerCountOf(c);
if (wc >= CAPACITY ||
wc >= (core ? corePoolSize : maximumPoolSize))
return false;
if (compareAndIncrementWorkerCount(c))
break retry;
c = ctl.get(); // Re-read ctl
if (runStateOf(c) != rs)
continue retry;
// else CAS failed due to workerCount change; retry inner loop
}
}
boolean workerStarted = false;
boolean workerAdded = false;
Worker w = null;
try {
w = new Worker(firstTask);
final Thread t = w.thread;
if (t != null) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
// Recheck while holding lock.
// Back out on ThreadFactory failure or if
// shut down before lock acquired.
int rs = runStateOf(ctl.get());
if (rs < SHUTDOWN ||
(rs == SHUTDOWN && firstTask == null)) {
if (t.isAlive()) // precheck that t is startable
throw new IllegalThreadStateException();
workers.add(w);
int s = workers.size();
if (s > largestPoolSize)
largestPoolSize = s;
workerAdded = true;
}
} finally {
mainLock.unlock();
}
if (workerAdded) {
t.start();
workerStarted = true;
}
}
} finally {
if (! workerStarted)
addWorkerFailed(w);
}
return workerStarted;
}
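First come two nested for loops. The outer loop carries the label retry, which the inner loop uses to break out of or restart the outer one. The outer loop checks the pool state and the queue: if the pool is in SHUTDOWN or a later state, it returns false without creating a worker, with one exception. When the state is exactly SHUTDOWN, firstTask is null and the queue is not empty, a worker is still allowed so that the remaining tasks can be drained. The inner loop checks the thread limit and then uses the CAS operation compareAndIncrementWorkerCount to increase the worker count.
Once the loops finish, the worker count has been incremented and we reach the second half. The Runnable we passed in is wrapped in a Worker object, Thread t = w.thread fetches the worker's thread field, the Worker is added to the workers set, and t.start() starts the thread.
One small detail: the core parameter, which says whether this is a core thread, is used in exactly one place, the inner-loop check wc >= (core ? corePoolSize : maximumPoolSize) on the worker count. The Worker object itself carries no such flag, which shows that the pool does not really distinguish core threads from non-core threads.
Next, let's see how the Worker class runs its thread.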
private final class Worker
extends AbstractQueuedSynchronizer
implements Runnable
{
/**
* This class will never be serialized, but we provide a
* serialVersionUID to suppress a javac warning.
*/
private static final long serialVersionUID = 6138294804551838833L;
/** Thread this worker is running in. Null if factory fails. */
final Thread thread;
/** Initial task to run. Possibly null. */
Runnable firstTask;
/** Per-thread task counter */
volatile long completedTasks;
/**
* Creates with given first task and thread from ThreadFactory.
* @param firstTask the first task (null if none)
*/
Worker(Runnable firstTask) {
setState(-1); // inhibit interrupts until runWorker
this.firstTask = firstTask;
this.thread = getThreadFactory().newThread(this);
}
/** Delegates main run loop to outer runWorker */
public void run() {
runWorker(this);
}
// Lock methods
//
// The value 0 represents the unlocked state.
// The value 1 represents the locked state.
protected boolean isHeldExclusively() {
return getState() != 0;
}
protected boolean tryAcquire(int unused) {
if (compareAndSetState(0, 1)) {
setExclusiveOwnerThread(Thread.currentThread());
return true;
}
return false;
}
protected boolean tryRelease(int unused) {
setExclusiveOwnerThread(null);
setState(0);
return true;
}
public void lock() { acquire(1); }
public boolean tryLock() { return tryAcquire(1); }
public void unlock() { release(1); }
public boolean isLocked() { return isHeldExclusively(); }
void interruptIfStarted() {
Thread t;
if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
try {
t.interrupt();
} catch (SecurityException ignore) {
}
}
}
}
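The constructor stores the Runnable in firstTask, then asks the thread factory for a new thread and stores it in thread. The run() method simply calls runWorker. So the class is a wrapper, and the thread is not built around the Runnable we submitted: the Worker passes itself (this) to newThread, the thread runs the Worker, and the Worker in turn runs our task. Worker also extends AbstractQueuedSynchronizer so that it can serve as a simple non-reentrant lock, which is used later to tell idle workers from busy ones.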
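The lock half of Worker can be studied on its own. The following is a minimal sketch of a non-reentrant mutex built on AbstractQueuedSynchronizer in the same style as the excerpt above; the class name SimpleMutex is hypothetical, and the thread and task handling of the real Worker is left out.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

class SimpleMutex extends AbstractQueuedSynchronizer {
    // State 0 = unlocked, 1 = locked (the same convention Worker uses).
    @Override
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false; // also fails when the current owner asks again: not reentrant
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    @Override
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }

    public static void main(String[] args) {
        SimpleMutex m = new SimpleMutex();
        System.out.println(m.tryLock()); // true
        System.out.println(m.tryLock()); // false, even on the same thread
        m.unlock();
        System.out.println(m.tryLock()); // true again
    }
}
```

Non-reentrancy is what makes tryLock usable as a "busy" probe: if the lock were reentrant, a worker calling pool methods from inside its own task could acquire its own lock and be mistaken for idle.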
Now the runWorker method.
final void runWorker(Worker w) {
Thread wt = Thread.currentThread();
Runnable task = w.firstTask;
w.firstTask = null;
w.unlock(); // allow interrupts
boolean completedAbruptly = true;
try {
while (task != null || (task = getTask()) != null) {
w.lock();
// If pool is stopping, ensure thread is interrupted;
// if not, ensure thread is not interrupted. This
// requires a recheck in second case to deal with
// shutdownNow race while clearing interrupt
if ((runStateAtLeast(ctl.get(), STOP) ||
(Thread.interrupted() &&
runStateAtLeast(ctl.get(), STOP))) &&
!wt.isInterrupted())
wt.interrupt();
try {
beforeExecute(wt, task);
Throwable thrown = null;
try {
task.run();
} catch (RuntimeException x) {
thrown = x; throw x;
} catch (Error x) {
thrown = x; throw x;
} catch (Throwable x) {
thrown = x; throw new Error(x);
} finally {
afterExecute(task, thrown);
}
} finally {
task = null;
w.completedTasks++;
w.unlock();
}
}
completedAbruptly = false;
} finally {
processWorkerExit(w, completedAbruptly);
}
}
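By the time addWorker returns, the thread has already been started; runWorker() is the logic that thread actually executes. task starts out as the Runnable we originally passed in. The while loop continues as long as task is non-null or getTask() returns a task. Inside the loop, the code first checks the pool state and the thread's interrupt status, then calls beforeExecute() (a hook the pool provides for subclasses to override; afterExecute() is another). It then calls the task's run method. Note that this is run, not start, so the task executes on the worker thread itself. After that afterExecute() is called and task is set to null. When the loop exits, processWorkerExit() is called.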
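As an illustration of the two hooks, here is a sketch of a subclass that counts hook invocations. The class name CountingPool, its counters and the helper runThreeTasks are assumptions made up for this example; only beforeExecute and afterExecute come from ThreadPoolExecutor.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CountingPool extends ThreadPoolExecutor {
    final AtomicInteger started = new AtomicInteger();
    final AtomicInteger finished = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger();

    CountingPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        started.incrementAndGet();       // runs on the worker thread, before task.run()
    }

    @Override
    protected void afterExecute(Runnable r, Throwable thrown) {
        super.afterExecute(r, thrown);
        finished.incrementAndGet();      // runs in the finally block around task.run()
        if (thrown != null) {
            failed.incrementAndGet();    // thrown is the exception raised by run(), if any
        }
    }

    // Hypothetical helper: runs three no-op tasks and waits for the pool to terminate.
    static CountingPool runThreeTasks() throws InterruptedException {
        CountingPool pool = new CountingPool(2);
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> { });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        CountingPool pool = runThreeTasks();
        System.out.println(pool.started + " started, " + pool.finished
                + " finished, " + pool.failed + " failed");
    }
}
```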
Next, how getTask() fetches a task.
private Runnable getTask() {
boolean timedOut = false; // Did the last poll() time out?
for (;;) {
int c = ctl.get();
int rs = runStateOf(c);
// Check if queue empty only if necessary.
if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
decrementWorkerCount();
return null;
}
int wc = workerCountOf(c);
// Are workers subject to culling?
boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
if ((wc > maximumPoolSize || (timed && timedOut))
&& (wc > 1 || workQueue.isEmpty())) {
if (compareAndDecrementWorkerCount(c))
return null;
continue;
}
try {
Runnable r = timed ?
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
return r;
timedOut = true;
} catch (InterruptedException retry) {
timedOut = false;
}
}
}
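It is a for loop that begins with a pool-state check. The timed flag decides whether waiting is subject to a timeout: either core threads are allowed to time out, or the current thread count exceeds corePoolSize. Then comes the condition (thread count exceeds maximumPoolSize, or timed is true and the last poll timed out) and (thread count is greater than 1, or the queue is empty). It looks complicated, but it boils down to "the wait timed out or there are too many threads". The second half only makes sure that the last worker does not exit while tasks are still queued, and in practice null is usually returned because of a timeout.
After that the task is taken from the queue. Depending on timed, either poll or take is used. Both block: poll waits at most keepAliveTime, take waits indefinitely.
Now the processWorkerExit method.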
private void processWorkerExit(Worker w, boolean completedAbruptly) {
if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
decrementWorkerCount();
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
completedTaskCount += w.completedTasks;
workers.remove(w);
} finally {
mainLock.unlock();
}
tryTerminate();
int c = ctl.get();
if (runStateLessThan(c, STOP)) {
if (!completedAbruptly) {
int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
if (min == 0 && ! workQueue.isEmpty())
min = 1;
if (workerCountOf(c) >= min)
return; // replacement not needed
}
addWorker(null, false);
}
}
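Reaching processWorkerExit means the while loop has ended, so this worker thread is finished and about to die. The method does the cleanup: it removes the worker from the workers set, and if the worker died abruptly it decrements the worker count (on a normal exit getTask has already done so). Unless the pool is stopping, it may then start a replacement worker. The completedAbruptly parameter is true whenever an exception escapes the loop in runWorker, whether it is thrown by the task's own run() or by beforeExecute() or afterExecute().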
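The abrupt-exit path can be seen by letting a task throw. In the sketch below, a single-thread pool runs a failing task and then a second task; the two tasks end up on different threads because the first worker dies and is replaced. AbruptExitDemo and its fields are hypothetical names, and the empty uncaught-exception handler is only there to keep the output quiet.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class AbruptExitDemo {
    // Returns the thread that ran the failing task and the thread that ran the follow-up task.
    static Thread[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(),
                r -> {
                    Thread t = new Thread(r);
                    // Swallow the uncaught exception so the demo does not print a stack trace.
                    t.setUncaughtExceptionHandler((thread, ex) -> { });
                    return t;
                });
        AtomicReference<Thread> first = new AtomicReference<>();
        AtomicReference<Thread> second = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        pool.execute(() -> {
            first.set(Thread.currentThread());
            throw new IllegalStateException("boom"); // escapes the loop in runWorker
        });
        pool.execute(() -> {
            second.set(Thread.currentThread());
            done.countDown();
        });

        done.await(5, TimeUnit.SECONDS);
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new Thread[] { first.get(), second.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = run();
        System.out.println("same thread? " + (threads[0] == threads[1])); // false
    }
}
```

Note that this applies to tasks passed to execute. A task passed to submit is wrapped in a FutureTask, which captures the exception, so the worker does not exit abruptly in that case.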
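Time to summarize what the thread pool actually does.
First, the Runnable we submit is wrapped in a Worker. The Worker owns a thread created by the thread factory, and that thread calls the Runnable's run method.
After run returns, the worker fetches the next task from the queue. If the thread count exceeds corePoolSize or allowCoreThreadTimeOut is true, it uses poll with a timeout; if nothing arrives in time, the loop exits and the thread dies. If a task arrives, it is executed. This answers the first two questions from the beginning, and it also shows that the pool does not tell core threads apart: it merely keeps at most corePoolSize threads alive indefinitely. (Why "at most"? With a core size of 10, ten tasks have to be submitted before ten threads exist; if only two tasks are submitted, there are only two threads.)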
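The shrinking behaviour described in this summary can be watched through getPoolSize. The sketch below assumes a pool with 1 core thread, at most 3 threads, a 100 ms keepAliveTime and a SynchronousQueue (so that every extra task forces a new thread); KeepAliveDemo and waitForPoolSize are hypothetical names, and the 5-second polling deadline is an arbitrary safety margin.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    // Polls until the pool has `expected` threads or 5 seconds have passed.
    static int waitForPoolSize(ThreadPoolExecutor pool, int expected)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (pool.getPoolSize() != expected && System.nanoTime() < deadline) {
            Thread.sleep(20);
        }
        return pool.getPoolSize();
    }

    // Returns {size while busy, size after idling, size after allowing core timeout}.
    static int[] run() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 3, 100L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> {
                try {
                    release.await();                 // hold the thread until released
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int busy = pool.getPoolSize();               // three workers, all busy

        release.countDown();
        int idle = waitForPoolSize(pool, 1);         // extra workers time out in poll()

        pool.allowCoreThreadTimeOut(true);           // now the last worker may time out too
        int afterCoreTimeout = waitForPoolSize(pool, 0);

        pool.shutdown();
        return new int[] { busy, idle, afterCoreTimeout };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] sizes = run();
        System.out.println(sizes[0] + " -> " + sizes[1] + " -> " + sizes[2]);
    }
}
```

Which of the three threads survives the first shrink is not determined by creation order: whichever worker loses the race to decrement the count is the one that stays, which matches the observation that Worker carries no "core" flag.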
Finally, let's see how the pool shuts down.
public void shutdown() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
checkShutdownAccess();
advanceRunState(SHUTDOWN);
interruptIdleWorkers();
onShutdown(); // hook for ScheduledThreadPoolExecutor
} finally {
mainLock.unlock();
}
tryTerminate();
}
checkShutdownAccess() checks permissions. advanceRunState(SHUTDOWN) sets the state to SHUTDOWN. interruptIdleWorkers() interrupts the idle worker threads. onShutdown() is a shutdown hook.
Let's look at interruptIdleWorkers.
private void interruptIdleWorkers() {
interruptIdleWorkers(false);
}
private void interruptIdleWorkers(boolean onlyOne) {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
for (Worker w : workers) {
Thread t = w.thread;
if (!t.isInterrupted() && w.tryLock()) {
try {
t.interrupt();
} catch (SecurityException ignore) {
} finally {
w.unlock();
}
}
if (onlyOne)
break;
}
} finally {
mainLock.unlock();
}
}
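This sets the interrupt flag on every idle thread. A successful w.tryLock() means the worker is idle, because runWorker holds the worker's lock for as long as a task is running.
For comparison, here is shutdownNow.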
public List<Runnable> shutdownNow() {
List<Runnable> tasks;
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
checkShutdownAccess();
advanceRunState(STOP);
interruptWorkers();
tasks = drainQueue();
} finally {
mainLock.unlock();
}
tryTerminate();
return tasks;
}
advanceRunState(STOP) sets the pool state to STOP.
private void interruptWorkers() {
final ReentrantLock mainLock = this.mainLock;
mainLock.lock();
try {
for (Worker w : workers)
w.interruptIfStarted();
} finally {
mainLock.unlock();
}
}
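This also sets the interrupt flag on the workers. The difference from interruptIdleWorkers is that it calls w.interruptIfStarted(), which interrupts every worker that has started, whether it is idle or not.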
private List<Runnable> drainQueue() {
BlockingQueue<Runnable> q = workQueue;
ArrayList<Runnable> taskList = new ArrayList<Runnable>();
q.drainTo(taskList);
if (!q.isEmpty()) {
for (Runnable r : q.toArray(new Runnable[0])) {
if (q.remove(r))
taskList.add(r);
}
}
return taskList;
}
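This method moves all tasks out of workQueue, leaving the queue empty.
At this point the two ways of shutting down the pool are clear.
shutdown: reject new tasks, interrupt idle workers, let the tasks already in the queue run to completion, then terminate.
shutdownNow: reject new tasks, interrupt all workers, drain the queue and return the drained tasks, then terminate.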
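The difference between the two calls can be checked with a single-thread pool that has one running task and two queued ones. This is a sketch under those assumptions; ShutdownDemo, gracefulCompleted and abrupt are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class ShutdownDemo {
    static ThreadPoolExecutor singleThreadPool() {
        return new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    // shutdown(): returns how many of the 3 submitted tasks completed.
    static int gracefulCompleted() throws InterruptedException {
        ThreadPoolExecutor pool = singleThreadPool();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();
        pool.execute(() -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                return;                      // would skip the increment if interrupted
            }
            completed.incrementAndGet();
        });
        pool.execute(completed::incrementAndGet);
        pool.execute(completed::incrementAndGet);
        started.await();
        pool.shutdown();                     // busy worker is not interrupted, queue is kept
        release.countDown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return completed.get();
    }

    // shutdownNow(): returns {tasks handed back, 1 if the running task was interrupted else 0}.
    static int[] abrupt() throws InterruptedException {
        ThreadPoolExecutor pool = singleThreadPool();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch never = new CountDownLatch(1);   // never counted down
        AtomicBoolean interrupted = new AtomicBoolean();
        pool.execute(() -> {
            started.countDown();
            try {
                never.await();
            } catch (InterruptedException e) {
                interrupted.set(true);       // the interrupt sent by shutdownNow arrives here
            }
        });
        pool.execute(() -> { });
        pool.execute(() -> { });
        started.await();
        List<Runnable> pending = pool.shutdownNow();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[] { pending.size(), interrupted.get() ? 1 : 0 };
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shutdown: completed = " + gracefulCompleted());
        int[] r = abrupt();
        System.out.println("shutdownNow: returned = " + r[0] + ", interrupted = " + (r[1] == 1));
    }
}
```

shutdownNow can only interrupt; a running task that ignores interruption keeps running, and the pool does not reach TERMINATED until it finishes.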
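One thing confused me while reading: getTask calls the queue's take(), which blocks, so how does an idle worker ever get out of it? Looking at the source of ArrayBlockingQueue.take() gave the answer: it acquires the lock with lockInterruptibly() rather than lock(), and notEmpty.await() responds to interruption as well. When the pool interrupts the worker, take() throws InterruptedException; getTask catches it, loops, re-checks the pool state and returns null.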
public E take() throws InterruptedException {
final ReentrantLock lock = this.lock;
lock.lockInterruptibly();
try {
while (count == 0)
notEmpty.await();
return dequeue();
} finally {
lock.unlock();
}
}
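The same exit path can be reproduced without a thread pool. The sketch below parks a plain thread in ArrayBlockingQueue.take() on an empty queue and then interrupts it; InterruptTakeDemo is a hypothetical name, and the thread stands in for an idle worker.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

class InterruptTakeDemo {
    // Returns true if the blocked take() was ended by an InterruptedException.
    static boolean run() throws InterruptedException {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean();

        Thread worker = new Thread(() -> {
            try {
                queue.take();                // blocks: the queue stays empty
            } catch (InterruptedException e) {
                sawInterrupt.set(true);      // the exit path that getTask relies on
            }
        });
        worker.start();

        // Wait until the thread is parked inside take().
        while (worker.getState() != Thread.State.WAITING) {
            Thread.sleep(10);
        }
        worker.interrupt();
        worker.join(5000);
        return sawInterrupt.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("take() interrupted: " + run()); // true
    }
}
```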