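The Concept of a Thread Pool
Before discussing thread pools, let us first look at how threads and processes differ and how they relate.
Every running Java program is a process. In some scenarios, to improve throughput and shorten processing time, a program can start multiple threads to execute tasks concurrently. Each process contains one or more threads, one of which is the main thread, and each thread belongs to exactly one process.
Multithreading was introduced mainly to make full use of CPU resources: on today's multi-core machines, tasks can run in parallel, which improves performance and shortens execution time. Threads, however, consume resources (CPU and memory). Creating and starting a thread has a cost, because the resources the thread needs must be allocated at that point, so frequently creating, starting, and destroying threads adds up. In addition, when there are too many threads, the CPU has to switch between them frequently, which incurs context-switching overhead. Too many threads therefore do not improve performance; they degrade it.
How can both problems be addressed, that is, avoiding frequent thread creation and destruction and keeping the thread count under control? A fairly direct idea is to keep the threads in a pool with a capacity limit. Pooling lets threads be reused, so they are not constantly created and destroyed; and because the pool has a capacity limit, it indirectly caps the number of threads. This is the original motivation for thread pools.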
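Key Attributes of a Thread Pool
The previous section identified the two main purposes of a thread pool:
- reusing threads;
- limiting the total number of threads.
From these two points we can largely deduce the key attributes of a thread pool. Because threads are reused, the pool needs a container for the threads it has already created; a collection chosen to fit the use case can serve for this. The reused threads exist to process tasks submitted by callers, so the pool also needs a collection of tasks (usually a queue) from which the threads fetch work. Because that task collection is usually bounded (to limit memory use), a rejection policy is needed for when it is full. Finally, to limit the total number of threads, the pool needs a maximum thread count; and to cope with sudden bursts, it can allow some standby threads (temp workers) that are destroyed once the burst has passed.
In summary, the main attributes of a thread pool are:
- the maximum number of threads
- the task queue, which must be thread-safe
- the rejection policy applied when the task queue is full
- the thread container, which must be thread-safe
- the pool state, which must be thread-safe
- the current number of threads in the pool, which must be thread-safe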
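To make the list above concrete, here is a deliberately small sketch of a pool built from those attributes. It is not how ThreadPoolExecutor is implemented: the class name TinyPool and all of its members are hypothetical, the thread count is fixed (no temp workers), and the 50 ms poll interval is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical toy pool: fixed number of threads, bounded queue, reject when full.
class TinyPool {
    private final BlockingQueue<Runnable> tasks;            // bounded, thread-safe task queue
    private final List<Thread> workers = new ArrayList<>(); // thread container
    private volatile boolean running = true;                // pool state

    TinyPool(int threads, int queueCapacity) {
        tasks = new ArrayBlockingQueue<>(queueCapacity);
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(this::workLoop, "tiny-" + i);
            workers.add(t);
            t.start();
        }
    }

    // Each worker thread is reused: it keeps pulling tasks until the pool
    // has been shut down and the queue has been drained.
    private void workLoop() {
        try {
            while (running || !tasks.isEmpty()) {
                Runnable r = tasks.poll(50, TimeUnit.MILLISECONDS);
                if (r != null) {
                    r.run();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Rejection policy of this sketch: throw when shut down or when the queue is full.
    void execute(Runnable r) {
        if (!running || !tasks.offer(r)) {
            throw new RejectedExecutionException("pool is shut down or queue is full");
        }
    }

    // Stop accepting tasks, let the workers drain the queue, then wait for them.
    void shutdown() {
        running = false;
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Runs five small tasks on two threads and returns how many completed.
    static int demo() {
        TinyPool pool = new TinyPool(2, 10);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 5; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + demo());
    }
}
```

The sketch ignores several things a real pool has to handle, such as a task submitted concurrently with shutdown and workers that die from an exception thrown by a task.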
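The Thread Pool Implementation in Java
Any discussion of Java's thread pool implementation has to mention ThreadPoolExecutor. Its inheritance hierarchy is: ThreadPoolExecutor extends AbstractExecutorService, which implements ExecutorService, which in turn extends Executor.
This section covers ThreadPoolExecutor's constructor, key fields, and execute method.
Constructor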
public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.acc = System.getSecurityManager() == null ?
            null :
            AccessController.getContext();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}
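The constructor sets several key attributes of the pool, listed below:
Name | Meaning |
corePoolSize | Number of core threads in the pool, i.e. the "permanent staff" |
maximumPoolSize | Maximum number of threads: core threads (permanent staff) + temporary threads (temp workers) |
keepAliveTime | How long an idle thread waits for a task before exiting. It applies to threads in excess of corePoolSize; if allowCoreThreadTimeOut is set, it applies to core threads as well |
unit | Time unit of keepAliveTime |
workQueue | Queue holding the tasks that are waiting to be executed |
threadFactory | Factory used to create threads |
handler | Rejection handler, invoked when a task cannot be accepted (the queue is full and the pool is at maximumPoolSize, or the pool has been shut down) |
Key Fields
Besides the attributes set in the constructor, the following fields of the thread pool also deserve attention: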
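As a usage illustration of the seven parameters above, the following sketch builds a pool with explicit values and runs one task on it. The class name PoolConstruction, the "demo-pool-" thread name prefix, and the particular sizes are assumptions chosen for the example, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; sizes and names are illustrative only.
class PoolConstruction {
    static ThreadPoolExecutor newPool() {
        AtomicInteger seq = new AtomicInteger();
        // threadFactory: gives every pool thread a recognizable name
        ThreadFactory factory = r -> new Thread(r, "demo-pool-" + seq.incrementAndGet());
        return new ThreadPoolExecutor(
                2,                                     // corePoolSize
                4,                                     // maximumPoolSize
                30L, TimeUnit.SECONDS,                 // keepAliveTime + unit
                new ArrayBlockingQueue<>(8),           // workQueue (bounded)
                factory,                               // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // handler: throw on rejection
    }

    // Submits one task and returns the name of the thread that ran it.
    static String runOne() {
        ThreadPoolExecutor pool = newPool();
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("task ran on: " + runOne());
    }
}
```

AbortPolicy is one of the handlers shipped with ThreadPoolExecutor; CallerRunsPolicy, DiscardPolicy, and DiscardOldestPolicy are the other built-in choices.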
/**
* The queue used for holding tasks and handing off to worker
* threads. We do not require that workQueue.poll() returning
* null necessarily means that workQueue.isEmpty(), so rely
* solely on isEmpty to see if the queue is empty (which we must
* do for example when deciding whether to transition from
* SHUTDOWN to TIDYING). This accommodates special-purpose
* queues such as DelayQueues for which poll() is allowed to
* return null even if it may later return non-null when delays
* expire.
*/
private final BlockingQueue<Runnable> workQueue;
/**
* Lock held on access to workers set and related bookkeeping.
* While we could use a concurrent set of some sort, it turns out
* to be generally preferable to use a lock. Among the reasons is
* that this serializes interruptIdleWorkers, which avoids
* unnecessary interrupt storms, especially during shutdown.
* Otherwise exiting threads would concurrently interrupt those
* that have not yet interrupted. It also simplifies some of the
* associated statistics bookkeeping of largestPoolSize etc. We
* also hold mainLock on shutdown and shutdownNow, for the sake of
* ensuring workers set is stable while separately checking
* permission to interrupt and actually interrupting.
*/
private final ReentrantLock mainLock = new ReentrantLock();
/**
* Set containing all worker threads in pool. Accessed only when
* holding mainLock.
*/
private final HashSet<Worker> workers = new HashSet<Worker>();
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY = (1 << COUNT_BITS) - 1;
// runState is stored in the high-order bits
private static final int RUNNING = -1 << COUNT_BITS;
private static final int SHUTDOWN = 0 << COUNT_BITS;
private static final int STOP = 1 << COUNT_BITS;
private static final int TIDYING = 2 << COUNT_BITS;
private static final int TERMINATED = 3 << COUNT_BITS;
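It looks like a lot, but the key fields fall into a few categories:
- State: ctl, a bit-packed value that combines the current worker count with the pool's run state;
- Lock: mainLock;
- Thread container: workers;
- Task queue: workQueue.
The state field ctl is a 32-bit AtomicInteger. Its high 3 bits hold the run state of the pool, and the remaining 29 bits hold the number of workers, i.e. the thread count. Packing them together lets both values be read and updated atomically with a single CAS. Personally, though, I tend to prefer single-responsibility designs, so I have reservations about folding two different meanings into one value.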
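The bit layout can be tried out in isolation. The constants below are copied from the listing above; the three helper methods mirror the private helpers of the same names that the listing calls (runStateOf, workerCountOf, ctlOf). The wrapper class CtlBits is hypothetical and exists only so the arithmetic can be run outside the JDK.

```java
// Hypothetical standalone copy of the ctl bit arithmetic, for experimentation.
class CtlBits {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 low bits for the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1; // mask for those 29 bits

    // Run states live in the 3 high-order bits.
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)      { return c & ~CAPACITY; } // keep the high 3 bits
    static int workerCountOf(int c)   { return c & CAPACITY; }  // keep the low 29 bits
    static int ctlOf(int rs, int wc)  { return rs | wc; }       // pack state and count

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println("packed ctl   = " + Integer.toBinaryString(c));
        System.out.println("worker count = " + workerCountOf(c));
        System.out.println("is RUNNING   = " + (runStateOf(c) == RUNNING));
    }
}
```

Because RUNNING is negative and the other states are increasing non-negative values, comparisons such as rs >= SHUTDOWN in the JDK code are plain integer comparisons on the state.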
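The lock, as will come up again in the analysis of execute, guards the critical sections that touch shared state (the workers set and its bookkeeping) so that they remain thread-safe.
The thread container and the task queue both hold Runnable-typed objects (workers stores Worker instances, which implement Runnable, and workQueue stores the submitted Runnable tasks), so beginners often mix them up. Here is an analogy: the thread container is a shop, and the threads in it are the staff, made up of permanent employees (core threads) and temp workers (brought in to cope with short bursts). The task queue is the line of customers: when callers invoke a pool method such as execute, their request is wrapped as a task and handed to the pool's threads for processing.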
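The execute Method
execute is the most important method of ThreadPoolExecutor. It submits a Runnable (which can be thought of as a task) so that ThreadPoolExecutor assigns a thread to run it. Note that execute returns void and does not hand back a result; to obtain the outcome of a task, use submit, which returns a Future.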
public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task. The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread. If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    else if (!addWorker(command, false))
        reject(command);
}
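From this, execute proceeds in the following steps:
First, it reads ctl (the field described above) and extracts the worker count from its low-order bits. If the count is below the corePoolSize set in the constructor, it tries to create and start a new thread and add it to the thread container. If that succeeds, execute returns. addWorker works as follows: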
private boolean addWorker(Runnable firstTask, boolean core) {
    retry:
    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN &&
            ! (rs == SHUTDOWN &&
               firstTask == null &&
               ! workQueue.isEmpty()))
            return false;

        for (;;) {
            int wc = workerCountOf(c);
            if (wc >= CAPACITY ||
                wc >= (core ? corePoolSize : maximumPoolSize))
                return false;
            if (compareAndIncrementWorkerCount(c))
                break retry;
            c = ctl.get();  // Re-read ctl
            if (runStateOf(c) != rs)
                continue retry;
            // else CAS failed due to workerCount change; retry inner loop
        }
    }

    boolean workerStarted = false;
    boolean workerAdded = false;
    Worker w = null;
    try {
        w = new Worker(firstTask);
        final Thread t = w.thread;
        if (t != null) {
            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                // Recheck while holding lock.
                // Back out on ThreadFactory failure or if
                // shut down before lock acquired.
                int rs = runStateOf(ctl.get());

                if (rs < SHUTDOWN ||
                    (rs == SHUTDOWN && firstTask == null)) {
                    if (t.isAlive()) // precheck that t is startable
                        throw new IllegalThreadStateException();
                    workers.add(w);
                    int s = workers.size();
                    if (s > largestPoolSize)
                        largestPoolSize = s;
                    workerAdded = true;
                }
            } finally {
                mainLock.unlock();
            }
            if (workerAdded) {
                t.start();
                workerStarted = true;
            }
        }
    } finally {
        if (! workerStarted)
            addWorkerFailed(w);
    }
    return workerStarted;
}
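As the code shows, addWorker first checks the pool's run state and then the current worker count, reserving a slot by incrementing the count with a CAS. Once those checks pass, it creates a Worker (and with it the thread), adds the worker to workers while holding mainLock for thread safety, and starts the thread after releasing the lock. If the thread could not be started, addWorkerFailed rolls the changes back.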
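The inner loop of addWorker is an instance of a common pattern: read the current value, check a bound, and retry the compare-and-set until it either succeeds or the bound is hit. The sketch below isolates that pattern with a plain AtomicInteger. BoundedCounter and its methods are hypothetical names; the real code additionally re-checks the run state packed into the same integer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the bounded CAS-increment used to reserve a worker slot.
class BoundedCounter {
    private final AtomicInteger count = new AtomicInteger();
    private final int limit;

    BoundedCounter(int limit) {
        this.limit = limit;
    }

    // Returns true if a slot was reserved, false if the limit is already reached.
    boolean tryIncrement() {
        for (;;) {
            int c = count.get();
            if (c >= limit) {
                return false;               // bound reached: give up, like "return false"
            }
            if (count.compareAndSet(c, c + 1)) {
                return true;                // slot reserved, like "break retry"
            }
            // CAS lost a race with another thread: re-read and try again
        }
    }

    int get() {
        return count.get();
    }

    // Lets `threads` threads compete for `limit` slots; returns how many succeeded.
    static int race(int threads, int limit) {
        BoundedCounter slots = new BoundedCounter(limit);
        AtomicInteger wins = new AtomicInteger();
        List<Thread> started = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                if (slots.tryIncrement()) {
                    wins.incrementAndGet();
                }
            });
            started.add(t);
            t.start();
        }
        for (Thread t : started) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return wins.get();
    }

    public static void main(String[] args) {
        System.out.println("slots won: " + race(16, 3));
    }
}
```

No matter how the threads interleave, the check and the increment happen as one atomic step, so the number of successful reservations never exceeds the limit.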
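The subsequent steps handle the case where the previous step did not create and start a new thread, either because the worker count had already reached corePoolSize (so no more core threads can be created) or because addWorker failed, for example when the pool is no longer in the RUNNING state. In this step, execute first checks whether the pool is still running; if so, it tries to offer the task to workQueue so that an existing thread will run it. After a successful offer it rechecks the state: if the pool has stopped running in the meantime, it rolls back the enqueue (removes the task and rejects it); otherwise, if no workers are left, it starts one with no initial task so that the queued task gets picked up.
If the command cannot be added to workQueue, typically because the queue is full, execute tries to handle it by adding a non-core thread (a temp worker). If that also fails, the pool is either shut down or saturated, and the task is rejected.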
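The order "core thread, then queue, then non-core thread, then reject" can be observed from the outside with a small experiment. The sketch below assumes a pool with corePoolSize 1, maximumPoolSize 2, and a queue of capacity 1, and submits four tasks that all block until released. The class ExecuteFlow is hypothetical; the five-argument constructor it uses falls back to the default thread factory and AbortPolicy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical experiment that records pool size and queue length after each execute call.
class ExecuteFlow {
    static List<String> trace() {
        CountDownLatch release = new CountDownLatch(1);
        // Every task blocks until the latch opens, so no thread becomes idle early.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("task" + i + ": rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

With these settings the first task starts the core thread, the second waits in the queue, the third finds the queue full and gets a non-core thread, and the fourth is rejected because both the queue and the thread limit are exhausted.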
Each pool thread's run method delegates to runWorker, shown below. In essence it takes tasks from workQueue and calls each task's run method:
final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted. This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}
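runWorker calls beforeExecute and afterExecute around every task. Both are protected hook methods of ThreadPoolExecutor that subclasses may override. As a rough sketch of how they can be used, the hypothetical CountingPool below counts finished tasks and tasks that ended with an exception; the class and its counters are assumptions made for this example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical subclass that uses the afterExecute hook for simple bookkeeping.
class CountingPool extends ThreadPoolExecutor {
    final AtomicInteger finished = new AtomicInteger();
    final AtomicInteger failed = new AtomicInteger();

    CountingPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        finished.incrementAndGet();
        if (t != null) {
            failed.incrementAndGet(); // t is the exception thrown by task.run()
        }
    }

    // Runs two normal tasks and one failing task; returns {finished, failed}.
    static int[] demo() {
        CountingPool pool = new CountingPool(2);
        pool.execute(() -> { });
        pool.execute(() -> { });
        // The uncaught exception ends this worker and is reported on stderr.
        pool.execute(() -> { throw new IllegalStateException("boom"); });
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { pool.finished.get(), pool.failed.get() };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println("finished=" + counts[0] + " failed=" + counts[1]);
    }
}
```

The Throwable argument is only non-null for tasks passed to execute directly. Tasks passed to submit are wrapped in a FutureTask, which captures the exception itself, so afterExecute sees null for them.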
Let us look at the most critical method here, getTask. Its flow is as follows:
private Runnable getTask() {
    boolean timedOut = false; // Did the last poll() time out?

    for (;;) {
        int c = ctl.get();
        int rs = runStateOf(c);

        // Check if queue empty only if necessary.
        if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
            decrementWorkerCount();
            return null;
        }

        int wc = workerCountOf(c);

        // Are workers subject to culling?
        boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

        if ((wc > maximumPoolSize || (timed && timedOut))
            && (wc > 1 || workQueue.isEmpty())) {
            if (compareAndDecrementWorkerCount(c))
                return null;
            continue;
        }

        try {
            Runnable r = timed ?
                workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                workQueue.take();
            if (r != null)
                return r;
            timedOut = true;
        } catch (InterruptedException retry) {
            timedOut = false;
        }
    }
}
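The whole procedure runs inside an infinite loop. It first checks the pool's run state:
    if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
        decrementWorkerCount();
        return null;
    }
In this case (the pool is in STOP or a later state, or it is in SHUTDOWN and the queue is empty) there is nothing left for this worker to do, so getTask decrements the worker count and returns null to the caller, which then lets the thread terminate.
Next it checks the worker count and whether the number of threads should be reduced:
    if ((wc > maximumPoolSize || (timed && timedOut))
        && (wc > 1 || workQueue.isEmpty())) {
        if (compareAndDecrementWorkerCount(c))
            return null;
        continue;
    }
Then comes the key step, fetching the next task from workQueue:
    Runnable r = timed ?
        workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
        workQueue.take();
This is where the keepAliveTime passed to the constructor comes into play. If allowCoreThreadTimeOut is set, keepAliveTime applies to every worker: when a worker has waited on workQueue for keepAliveTime without getting a task, poll returns null, and on the next loop iteration getTask returns null so that the caller ends the thread. If it is not set, keepAliveTime is used only while the current worker count exceeds corePoolSize (that is, it effectively applies only to the temp workers); otherwise the worker blocks in take() until a task arrives.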
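The effect of keepAliveTime on the temp workers can be watched from the outside. The sketch below assumes corePoolSize 1, maximumPoolSize 2, a 100 ms keep-alive, and a SynchronousQueue (which has no capacity, so the second task forces a non-core thread). It records the pool size at the peak and again after the idle period. KeepAliveDemo is a hypothetical class, and the 5-second polling deadline is an arbitrary safety margin.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical experiment: the pool grows to 2 threads under load and shrinks back to 1 when idle.
class KeepAliveDemo {
    // Returns {pool size at peak, pool size after the idle threads timed out}.
    static int[] poolSizes() {
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 100L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        try {
            pool.execute(blocker); // runs on the core thread
            pool.execute(blocker); // cannot be queued, so a non-core thread is added
            int peak = pool.getPoolSize();

            release.countDown();   // both tasks finish; both threads go idle

            // Poll until the surplus thread has timed out in getTask and exited.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
            return new int[] { peak, pool.getPoolSize() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizes();
        System.out.println("peak=" + sizes[0] + " afterIdle=" + sizes[1]);
    }
}
```

Which of the two threads exits is not fixed: the pool does not tag individual threads as core or temporary, it only compares the current worker count with corePoolSize. Calling pool.allowCoreThreadTimeOut(true) before the experiment would let the last thread time out as well.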