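I remember that in an interview four years ago, the interviewer pressed me on why one should use a thread pool. I gave the simple answer: a thread pool avoids creating threads over and over, and thread creation is a relatively expensive operation. But how does a thread pool actually work? I never looked into that question, so today I am writing up a summary.
Overall structure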
Whether you build a thread pool through Executors or with a Guava ThreadFactory, what you end up with is a ThreadPoolExecutor, so that class is the target of this analysis.
The ctl variable
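First, we have to mention a curious variable, ctl. For now, a simple way to understand it is that it packs two fields into a single int: the thread pool's run state in the high 3 bits, and the number of workers in the remaining 29 bits. See the figure below.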
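To make the bit layout concrete, here is a minimal sketch that mirrors the packing helpers as they appear in the JDK 8 ThreadPoolExecutor source. The class name CtlSketch and the main method are illustrative only; the real pool keeps the packed value in an AtomicInteger.

```java
/** Sketch of how ThreadPoolExecutor packs run state and worker count into one int (JDK 8 layout). */
class CtlSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 bits for the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // mask for the low 29 bits

    // Run states are stored in the high 3 bits and are numerically ordered.
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; } // keep only the high 3 bits
    static int workerCountOf(int c)  { return c & CAPACITY; }  // keep only the low 29 bits
    static int ctlOf(int rs, int wc) { return rs | wc; }       // combine both fields

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println(runStateOf(c) == RUNNING); // true
        System.out.println(workerCountOf(c));         // 5
        System.out.println(RUNNING < SHUTDOWN);       // true
    }
}
```

Because the states are ordered, a check such as rs >= SHUTDOWN means "no longer RUNNING", which is how the comparisons in getTask should be read.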
The core worker set
private final HashSet<Worker> workers = new HashSet<Worker>();
This is a HashSet of Worker objects, i.e. the set of our worker threads. Let's look at the Worker code:
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable {
    final Thread thread;
    /** Initial task to run. Possibly null. */
    Runnable firstTask;
    Worker(Runnable firstTask) {
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        this.thread = getThreadFactory().newThread(this);
    }
}
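A Worker object mainly holds two things: a Thread and a firstTask. Worker also implements Runnable, and the Runnable it passes to its Thread is itself.
Now let's look at the Thread code. After new, the constructor mainly calls the init method: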
private void init(ThreadGroup g, Runnable target, String name,
                  long stackSize, AccessControlContext acc,
                  boolean inheritThreadLocals) {
    if (name == null) {
        throw new NullPointerException("name cannot be null");
    }
    this.name = name;
    ...
}
There is no thread allocation to be found here, though; this only creates a Java object.
The task queue
private final BlockingQueue<Runnable> workQueue;
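A thread-safe queue; not much more to say here.
Let's see how a thread gets started and how it is kept alive.
Getting it running
From the java.util.concurrent.ThreadPoolExecutor#addWorker method we can see that if the worker was added successfully, the thread inside the Worker is started: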
if (workerAdded) {
    t.start();
    workerStarted = true;
}
So let's look at the Thread.start method.
try {
    start0();
    started = true;
} finally {
    ...
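Here we see a call to the native method start0. From the JVM's native source (jvm.cpp in HotSpot) we can see that it asks the operating system to create a thread, which then calls Thread.run. (I did not paste that code because, honestly, I can't really follow it, orz.)
So here comes the key question: what happens once the Thread has finished running this task?
The answer: it waits for the next task.
The relevant source method is java.util.concurrent.ThreadPoolExecutor#runWorker: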
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    ...
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            ...
            try {
                ...
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                ...
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}
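After running a task, the worker loops back and waits in task = getTask(). At this point we can see that the thread is in fact still running; it has merely switched from running a task to waiting for one, and it has not exited.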
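A small experiment shows this reuse from the outside. The sketch below runs several tasks one after another on a pool limited to a single thread and collects the names of the threads that executed them. The class and method names are made up for this illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class WorkerReuseDemo {
    /** Runs the given number of tasks sequentially on a 1-thread pool and returns the distinct thread names seen. */
    static Set<String> threadNamesUsed(int tasks) {
        ExecutorService pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        Callable<String> whoAmI = () -> Thread.currentThread().getName();
        Set<String> names = new HashSet<>();
        try {
            for (int i = 0; i < tasks; i++) {
                // get() waits for completion, so the worker is idle in getTask() between tasks.
                names.add(pool.submit(whoAmI).get());
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(threadNamesUsed(3).size()); // 1
    }
}
```

All three tasks report the same thread name: the worker finished one task, went back to waiting, and picked up the next.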
Now let's look at the getTask method.
for (;;) {
    int c = ctl.get();
    int rs = runStateOf(c);
    // Check if queue empty only if necessary.
    if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
        decrementWorkerCount();
        return null;
    }
    int wc = workerCountOf(c);
    // Are workers subject to culling?
    boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
    if ((wc > maximumPoolSize || (timed && timedOut))
        && (wc > 1 || workQueue.isEmpty())) {
        if (compareAndDecrementWorkerCount(c))
            return null;
        continue;
    }
    try {
        Runnable r = timed ?
            workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
            workQueue.take();
        if (r != null)
            return r;
        timedOut = true;
    } catch (InterruptedException retry) {
        timedOut = false;
    }
}
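From the code we can see:
- It first checks the pool state: if the pool is stopping (STOP or later), or is SHUTDOWN with an empty queue, it returns null.
- It then computes timed, i.e. whether this worker may be culled: either core threads are allowed to time out (allowCoreThreadTimeOut, false by default) or the current worker count exceeds corePoolSize.
- If timed is true, it polls the queue with keepAliveTime as the timeout. If the poll times out, timedOut is set and the next loop iteration normally returns null, so the worker exits and its thread ends. If timed is false, it blocks in workQueue.take() until a task arrives.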
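The culling path can be observed with a short, timing-based sketch. It turns on allowCoreThreadTimeOut so that timed is true even for the single core thread, then waits for the idle worker to disappear. The 200 ms keep-alive, the 5 second upper bound, and the class name are arbitrary choices for this demo.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class KeepAliveDemo {
    /** Returns the pool size right after one task finished, and again after the worker idled out. */
    static int[] poolSizeBeforeAndAfterIdle() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 200L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true); // core threads now use the timed poll as well
        try {
            pool.submit(() -> { }).get();    // one task starts one worker
            int before = pool.getPoolSize(); // the worker is now idle inside getTask()
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            // Wait (bounded) until the idle worker has timed out and exited.
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
            return new int[] { before, pool.getPoolSize() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizeBeforeAndAfterIdle();
        System.out.println(sizes[0] + " -> " + sizes[1]);
    }
}
```

On a machine that is not heavily loaded this should go from one worker to none. Without allowCoreThreadTimeOut(true), the core thread would block in take() and the pool size would stay at 1.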
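How do tasks get in?
This came out of a production issue. Our RPC calls let you configure the thread pool, and to reduce response time I made the pool's queue very small, fixed at 8. As a result we frequently ran into the thread pool overflowing, yet the number of active threads was low, so it should not have been possible to run out of threads.
Intuitively, we assume that an incoming task is handed straight to a worker thread, but is that really the case?
Let's look at the source once more.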
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task. The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread. If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    else if (!addWorker(command, false))
        reject(command);
}
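The three steps can be watched in action with a deliberately tiny pool. The sketch below uses sizing invented for the demo (core 1, max 2, queue capacity 1) and tasks that block on a latch, so every worker stays busy while four tasks are submitted. After each submission it records "pool size/queue size", or "rejected".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ExecuteOrderDemo {
    /** Submits four blocking tasks and records "poolSize/queueSize" (or "rejected") after each one. */
    static List<String> submitFourBlockingTasks() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until the demo is done
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Demo sizing (an assumption, not a recommendation): core 1, max 2, queue capacity 1.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        List<String> log = new ArrayList<>();
        try {
            for (int i = 0; i < 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add(pool.getPoolSize() + "/" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("rejected"); // default AbortPolicy: queue full and max threads reached
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(submitFourBlockingTasks()); // [1/0, 1/1, 2/1, rejected]
    }
}
```

The first task starts the core thread, the second goes into the queue even though the pool could still grow, the third creates the second thread only because the queue is full, and the fourth is rejected.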
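So a task is not simply handed to an idle worker: once corePoolSize threads exist, it is offered to the queue first, and an extra thread is created only when the queue refuses it. Is there, then, a setup that matches our intuition, where a task runs as long as some thread is free? There is. Reading our RPC framework's code, it relies on a key "virtual" queue: SynchronousQueue.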
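As a rough illustration of what a SynchronousQueue does to a pool, the sketch below submits three blocking tasks to a pool with core 0 and max 2. These are demo values, not our RPC framework's actual configuration. With no idle worker waiting on the queue, every offer fails, so each task either gets a brand-new thread or is rejected.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SynchronousQueuePoolDemo {
    /** Submits three blocking tasks to a pool capped at two threads and returns how many were accepted. */
    static int acceptedOutOfThree() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep both workers busy
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 2, 60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
        int accepted = 0;
        try {
            for (int i = 0; i < 3; i++) {
                try {
                    pool.execute(blocker);
                    accepted++;
                } catch (RejectedExecutionException e) {
                    // No idle worker to hand the task to, and no room for another thread.
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(acceptedOutOfThree()); // 2
    }
}
```

Nothing is ever buffered: a task is accepted only if a thread can take it right away.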
Let's look at this queue's put method:
public void put(E e) throws InterruptedException {
    if (e == null) throw new NullPointerException();
    if (transferer.transfer(e, false, 0) == null) {
        Thread.interrupted();
        throw new InterruptedException();
    }
}
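If transfer returns null, the put was interrupted and an exception is thrown; otherwise the hand-off succeeded. (ThreadPoolExecutor.execute itself calls offer, which goes through the same transfer method with timed = true and nanos = 0.)
Next, let's look at the transferer's transfer method. The version shown here is the queue-based one that works on QNode objects.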
E transfer(E e, boolean timed, long nanos) {
    /* Basic algorithm is to loop trying to take either of
     * two actions:
     *
     * 1. If queue apparently empty or holding same-mode nodes,
     * try to add node to queue of waiters, wait to be
     * fulfilled (or cancelled) and return matching item.
     *
     * 2. If queue apparently contains waiting items, and this
     * call is of complementary mode, try to fulfill by CAS'ing
     * item field of waiting node and dequeuing it, and then
     * returning matching item.
     *
     * In each case, along the way, check for and try to help
     * advance head and tail on behalf of other stalled/slow
     * threads.
     *
     * The loop starts off with a null check guarding against
     * seeing uninitialized head or tail values. This never
     * happens in current SynchronousQueue, but could if
     * callers held non-volatile/final ref to the
     * transferer. The check is here anyway because it places
     * null checks at top of loop, which is usually faster
     * than having them implicitly interspersed.
     */
    QNode s = null; // constructed/reused as needed
    boolean isData = (e != null);
    for (;;) {
        QNode t = tail;
        QNode h = head;
        // If head or tail is null, the queue is not fully initialized yet, so spin
        if (t == null || h == null) // saw uninitialized value
            continue; // spin
        // If head == tail (empty queue), or the tail node has the same mode as this request
        if (h == t || t.isData == isData) { // empty or same-mode
            // If tail.next is already set, tail is lagging: CAS tail forward to that node so that tail.next ends up null
            QNode tn = t.next;
            if (t != tail) // inconsistent read
                continue;
            if (tn != null) { // lagging tail
                advanceTail(t, tn);
                continue;
            }
            // Timed call with no time left to wait: return null
            if (timed && nanos <= 0) // can't wait
                return null;
            // Wrap our task in a QNode and point t.next at the new node
            if (s == null)
                s = new QNode(e, isData);
            if (!t.casNext(null, s)) // failed to link in
                continue;
            // Make our node the new tail
            advanceTail(t, s); // swing tail and wait
            // Wait for e to be matched (a producer paired with a consumer)
            Object x = awaitFulfill(s, e, timed, nanos);
            // If s was cancelled, clean it up
            if (x == s) { // wait was cancelled
                clean(t, s);
                return null;
            }
            // If s has not been unlinked yet, advance the head past it and clear s.waiter
            if (!s.isOffList()) { // not already unlinked
                advanceHead(t, s); // unlink if head
                if (x != null) // and forget fields
                    s.item = s;
                s.waiter = null;
            }
            // Return the data
            return (x != null) ? (E)x : e;
        } else {
            // Complementary mode: the queue holds waiting nodes of the opposite type
            QNode m = h.next; // node to fulfill
            if (t != tail || m == null || h != head)
                continue; // inconsistent read
            Object x = m.item;
            if (isData == (x != null) || // m already fulfilled
                x == m || // m cancelled
                !m.casItem(x, e)) { // lost CAS
                advanceHead(h, m); // dequeue and retry
                continue;
            }
            // Advance the head to m
            advanceHead(h, m); // successfully fulfilled
            // Wake up the thread waiting on m
            LockSupport.unpark(m.waiter);
            // Return the element
            return (x != null) ? (E)x : e;
        }
    }
}
awaitFulfill is the method that waits for a match:
Object awaitFulfill(QNode s, E e, boolean timed, long nanos) {
    /* Same idea as TransferStack.awaitFulfill */
    final long deadline = timed ? System.nanoTime() + nanos : 0L;
    // Get the current thread
    Thread w = Thread.currentThread();
    int spins = ((head.next == s) ?
                 (timed ? maxTimedSpins : maxUntimedSpins) : 0);
    for (;;) {
        // Cancel if interrupted (or, further down, timed out); otherwise keep spinning or parking until matched
        if (w.isInterrupted())
            s.tryCancel(e);
        Object x = s.item;
        // If s.item is no longer e, the node has been matched or cancelled: return x
        if (x != e)
            return x;
        if (timed) {
            nanos = deadline - System.nanoTime();
            if (nanos <= 0L) {
                s.tryCancel(e);
                continue;
            }
        }
        if (spins > 0)
            --spins;
        // If no waiter is recorded yet, register the current thread as s's waiter
        else if (s.waiter == null)
            s.waiter = w;
        else if (!timed)
            LockSupport.park(this);
        else if (nanos > spinForTimeoutThreshold)
            LockSupport.parkNanos(this, nanos);
    }
}
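The net effect of this queue: it stores zero elements. Producers and consumers line up in a single queue, where a node is enqueued only if it has the same mode as the tail; once a producer and a consumer are matched, the waiting node is removed from the queue and its thread is woken up.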
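The hand-off behaviour is easy to check directly on the queue. In the sketch below (class and method names are invented for the example), a non-blocking offer fails while nobody is waiting, and succeeds once a consumer thread is blocked in take().

```java
import java.util.concurrent.SynchronousQueue;

class HandOffDemo {
    /** offer() on a SynchronousQueue with no waiting consumer. */
    static boolean offerWithoutConsumer() {
        return new SynchronousQueue<String>().offer("task");
    }

    /** Starts a consumer blocked in take(), hands it one element, and returns what it received. */
    static String offerWithWaitingConsumer() {
        SynchronousQueue<String> queue = new SynchronousQueue<>();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = queue.take(); // waits here until a producer arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        // offer() succeeds only once the consumer is actually waiting, so retry until then.
        while (!queue.offer("task")) {
            Thread.yield();
        }
        try {
            consumer.join(); // join() also makes the consumer's write to received[0] visible
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(offerWithoutConsumer());     // false
        System.out.println(offerWithWaitingConsumer()); // task
    }
}
```

This is the same offer call that ThreadPoolExecutor.execute makes: it returns true only when an idle worker is already waiting in getTask().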