A Deep Dive into Thread Pools in Java

At moderate load levels, the thread-per-task approach is a decent improvement over sequential execution: as long as requests do not arrive faster than the server can handle them, it yields both better responsiveness and higher throughput.

In production, however, this approach has practical drawbacks that become especially pronounced when a large number of threads must be created:

  • Thread lifecycle overhead. Creating and tearing down threads is not free. The actual cost varies from platform to platform, but thread creation takes time, adds latency to request handling, and requires work from both the JVM and the operating system. If requests are frequent and lightweight, as in most server applications, creating a new thread per request consumes a significant amount of computing resources.
  • Resource consumption. Active threads consume system resources, especially memory. When there are more runnable threads than available processors, threads sit idle. A large number of idle threads ties up memory and puts pressure on the garbage collector, and having many threads compete for the CPU adds further performance overhead. If you already have enough threads to keep every CPU busy, creating more does nothing but harm.
  • Stability. The number of threads that can be created must be limited. The limit varies by platform and is also affected by JVM startup parameters, the stack size requested in the Thread constructor, and the thread limits of the underlying operating system. Creating threads without restraint can crash the application. To stay clear of that danger, set a bound on how many threads your application may create, and test the application thoroughly to make sure it does not exhaust its resources even when that bound is reached.

A thread pool allows threads to be reused: a thread is not destroyed after finishing a task but goes on to execute other tasks, which improves efficiency. It also decouples task submission from task execution.
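
A minimal sketch of this reuse, using the Executors.newFixedThreadPool convenience factory (the pool size and task bodies are arbitrary): six tasks run on only two worker threads, whose names repeat in the output instead of a new thread being created per task.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ReuseDemo {
    public static void main(String[] args) {
        // Two worker threads handle six tasks; the printed thread names repeat,
        // showing that the workers are reused rather than created per task.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 6; i++) {
            final int id = i;
            pool.execute(() ->
                    System.out.println("task-" + id + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown(); // no new tasks; the pool exits after the queued tasks finish
    }
}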

ThreadPoolExecutor

The most important class in the thread pool framework is ThreadPoolExecutor, which extends AbstractExecutorService.

Let's start the analysis from its most general constructor:

public ThreadPoolExecutor(int corePoolSize,
                          int maximumPoolSize,
                          long keepAliveTime,
                          TimeUnit unit,
                          BlockingQueue<Runnable> workQueue,
                          ThreadFactory threadFactory,
                          RejectedExecutionHandler handler) {
    if (corePoolSize < 0 ||
        maximumPoolSize <= 0 ||
        maximumPoolSize < corePoolSize ||
        keepAliveTime < 0)
        throw new IllegalArgumentException();
    if (workQueue == null || threadFactory == null || handler == null)
        throw new NullPointerException();
    this.corePoolSize = corePoolSize;
    this.maximumPoolSize = maximumPoolSize;
    this.workQueue = workQueue;
    this.keepAliveTime = unit.toNanos(keepAliveTime);
    this.threadFactory = threadFactory;
    this.handler = handler;
}

Key parameters (a construction sketch follows the list):

  • corePoolSize: the size of the core pool. By default the pool contains no threads after creation; threads are created only as tasks arrive. Once the number of threads reaches corePoolSize, newly arriving tasks are put into the work queue instead. Alternatively, if prestartCoreThread or prestartAllCoreThreads is called right after the pool is created, one core thread or corePoolSize core threads are created up front.
  • maximumPoolSize: the maximum number of threads the pool may hold.
  • keepAliveTime: the maximum time a thread may stay idle. By default it only takes effect while the pool holds more than corePoolSize threads: if such a thread stays idle for keepAliveTime it is terminated, until the thread count is no longer above corePoolSize. If allowCoreThreadTimeOut(boolean) has been called, however, keepAliveTime also applies when the thread count is at or below corePoolSize, so the pool can shrink all the way down to zero threads.
  • unit: the time unit of keepAliveTime.
  • workQueue: a blocking queue that holds tasks waiting to be executed. This choice matters, because it has a major influence on how the pool behaves. The usual options are:
    • ArrayBlockingQueue
    • LinkedBlockingQueue
    • SynchronousQueue
      ArrayBlockingQueue and PriorityBlockingQueue are used less often; LinkedBlockingQueue and SynchronousQueue are the common choices. The pool's queuing strategy is determined by the BlockingQueue.
  • threadFactory: the thread factory used to create worker threads.
  • handler: the policy applied when a task has to be rejected. Four built-in policies are available:
    • ThreadPoolExecutor.AbortPolicy: discard the task and throw a RejectedExecutionException.
    • ThreadPoolExecutor.DiscardPolicy: discard the task as well, but without throwing an exception.
    • ThreadPoolExecutor.DiscardOldestPolicy: discard the task at the head of the queue, then retry executing the task (repeating this process).
    • ThreadPoolExecutor.CallerRunsPolicy: run the task in the calling thread.
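
Putting the parameters together, here is a construction sketch; all of the numbers are illustrative rather than recommendations:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolConfig {
    // 2 core threads, at most 4 threads in total, non-core threads reclaimed
    // after 60s of idleness, a bounded queue of 100 waiting tasks, and overflow
    // executed in the submitting thread (CallerRunsPolicy).
    static ThreadPoolExecutor newBoundedPool() {
        return new ThreadPoolExecutor(
                2,                                         // corePoolSize
                4,                                         // maximumPoolSize
                60L, TimeUnit.SECONDS,                     // keepAliveTime + unit
                new ArrayBlockingQueue<>(100),             // workQueue
                Executors.defaultThreadFactory(),          // threadFactory
                new ThreadPoolExecutor.CallerRunsPolicy()  // handler
        );
    }
}

CallerRunsPolicy gives simple back-pressure here: once the pool is saturated, the submitting thread runs the task itself and is therefore slowed down.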

ThreadPoolExecutor has several very important methods:

  • execute()
  • submit()
  • shutdown()
  • shutdownNow()

execute() is declared in Executor and given a concrete implementation in ThreadPoolExecutor. It is the core method of ThreadPoolExecutor: it submits a task to the pool, which then executes it.

submit() is declared in ExecutorService and already implemented in AbstractExecutorService; ThreadPoolExecutor does not override it. It also submits a task to the pool, but unlike execute() it can return the result of the task. Looking at its implementation, submit() still ends up calling execute(); it simply wraps the task so the result can be obtained through a Future:

// AbstractExecutorService
public Future<?> submit(Runnable task) {
    if (task == null) throw new NullPointerException();
    RunnableFuture<Void> ftask = newTaskFor(task, null);
    execute(ftask);
    return ftask;
}

shutdown() and shutdownNow() are used to close the pool. The difference is that shutdown() does not terminate the pool immediately: it waits until all tasks in the work queue have been executed, while accepting no new tasks. shutdownNow() terminates the pool immediately, attempts to interrupt the tasks that are currently running, clears the work queue, and returns the list of tasks that were never executed (a List).
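
A small sketch of the difference (the task body and pool size are arbitrary): submit() returns a Future from which the result can be read, and shutdown() afterwards lets the queued work finish:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // submit() wraps the task in a FutureTask and hands it to execute();
        // the returned Future lets the caller retrieve the result later.
        Future<Integer> future = pool.submit(() -> 1 + 1);
        System.out.println("result = " + future.get()); // blocks until the task has run

        pool.shutdown(); // accept no new tasks, finish whatever is queued
    }
}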

Inside the Thread Pool Implementation

Thread pool states

ThreadPoolExecutor packs the pool's run state into an atomic control variable (backed by a volatile int) and defines several static final constants for the individual states (JDK 8):

    /* The runState provides the main lifecycle control, taking on values:
    *
    *   RUNNING:  Accept new tasks and process queued tasks
    *   SHUTDOWN: Don't accept new tasks, but process queued tasks
    *   STOP:     Don't accept new tasks, don't process queued tasks,
    *             and interrupt in-progress tasks
    *   TIDYING:  All tasks have terminated, workerCount is zero,
    *             the thread transitioning to state TIDYING
    *             will run the terminated() hook method
    *   TERMINATED: terminated() has completed
    *
    * The numerical order among these values matters, to allow
    * ordered comparisons. The runState monotonically increases over
    * time, but need not hit each state. The transitions are:
    *
    * RUNNING -> SHUTDOWN
    *    On invocation of shutdown(), perhaps implicitly in finalize()
    * (RUNNING or SHUTDOWN) -> STOP
    *    On invocation of shutdownNow()
    * SHUTDOWN -> TIDYING
    *    When both queue and pool are empty
    * STOP -> TIDYING
    *    When pool is empty
    * TIDYING -> TERMINATED
    *    When the terminated() hook method has completed
    *
    * Threads waiting in awaitTermination() will return when the
    * state reaches TERMINATED.
    *
    * Detecting the transition from SHUTDOWN to TIDYING is less
    * straightforward than you'd like because the queue may become
    * empty after non-empty and vice versa during SHUTDOWN state, but
    * we can only terminate if, after seeing that it is empty, we see
    * that workerCount is 0 (which sometimes entails a recheck -- see
    * below).
    */
// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;
private static final int SHUTDOWN   =  0 << COUNT_BITS;
private static final int STOP       =  1 << COUNT_BITS;
private static final int TIDYING    =  2 << COUNT_BITS;
private static final int TERMINATED =  3 << COUNT_BITS;

runState represents the current state of the pool. In JDK 8 it lives in the high-order bits of the AtomicInteger ctl, whose underlying value is a volatile int, so state changes are visible across threads. (See the comment above for what each state means; a small demo of observing these transitions follows the list below.)

  • RUNNING: accept new tasks and process tasks already placed in the queue
  • SHUTDOWN: accept no new tasks, but keep processing tasks already in the queue
  • STOP: accept no new tasks, stop processing queued tasks, and interrupt tasks that are in progress
  • TIDYING: all tasks have terminated and workerCount is zero; the thread that moves the pool into this state runs the terminated() hook
  • TERMINATED: the terminated() hook has finished running
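
A small demo of observing these transitions through the public API (a sketch; the pool size and wait time are arbitrary):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class StateDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.execute(() -> {});                     // the pool is RUNNING and processes the task

        pool.shutdown();                            // RUNNING -> SHUTDOWN
        System.out.println(pool.isShutdown());      // true: no new tasks are accepted from now on

        pool.awaitTermination(5, TimeUnit.SECONDS); // wait for the last worker to exit
        System.out.println(pool.isTerminated());    // true once the state has reached TERMINATED
    }
}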

Task execution

Before walking through what happens from the moment a task is submitted until it finishes, let's look at a few other important fields of the class:

private final BlockingQueue<Runnable> workQueue; // work queue holding tasks that are waiting to be executed
private final ReentrantLock mainLock = new ReentrantLock(); // main state lock; changes to pool state (pool size, runState, etc.) are made under this lock
private final HashSet<Worker> workers = new HashSet<Worker>(); // the set of worker threads
private volatile long keepAliveTime; // how long an idle thread is kept alive
private volatile boolean allowCoreThreadTimeOut; // whether core threads are also subject to the keep-alive timeout
private volatile int corePoolSize; // core pool size (once the thread count exceeds it, submitted tasks go into the work queue)
private volatile int maximumPoolSize; // maximum number of threads the pool can hold
private volatile int poolSize; // current number of threads in the pool
private volatile RejectedExecutionHandler handler; // task rejection policy
private volatile ThreadFactory threadFactory; // factory used to create threads
private int largestPoolSize; // largest number of threads the pool has ever held
private long completedTaskCount; // number of tasks that have completed

The variables to focus on are corePoolSize, maximumPoolSize, and largestPoolSize.

corePoolSize is usually translated as the "core pool size"; as I understand it, it is effectively the size of the thread pool. A simple example:

Suppose a factory has 10 workers, and each worker can only do one job at a time.

So as long as some of the 10 workers are free, incoming jobs are handed to the free workers;

when all 10 workers are busy and jobs keep arriving, the new jobs are queued up to wait;

if new jobs arrive far faster than the workers can finish them, the foreman may take remedial action, say bringing in 4 temporary workers,

and jobs are then assigned to those 4 temporary workers as well;

if even the 14 workers cannot keep up, the foreman may have to consider refusing new jobs or dropping some of the earlier ones;

and when some of the 14 workers become idle while new jobs arrive only slowly, the foreman may let the 4 temporary workers go and keep just the original 10, since extra workers cost money.

In this example corePoolSize is 10 and maximumPoolSize is 14 (10 + 4).

In other words, corePoolSize is the size of the pool, while maximumPoolSize is, as I see it, a remedial measure for the pool: a way to cope when the amount of work suddenly spikes.

largestPoolSize is just a bookkeeping variable that records the largest number of threads the pool has ever contained; it has nothing to do with the pool's capacity.
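
Translated into code, the analogy looks roughly like this (a sketch with arbitrary numbers: 10 "regular workers", up to 4 "temporary workers", a waiting area of 20 jobs, and the default AbortPolicy turning away the overflow):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FactoryAnalogyDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                10, 14, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(20));

        for (int i = 0; i < 40; i++) {
            try {
                pool.execute(() -> sleepQuietly(1000)); // each "job" takes about a second
            } catch (RejectedExecutionException e) {
                // 14 threads busy and 20 jobs queued: further submissions are rejected
                System.out.println("rejected, pool size = " + pool.getPoolSize());
            }
        }
        pool.shutdown();
    }

    static void sleepQuietly(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}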

Now let's get to the main topic: what does a task go through from submission to completion?

In ThreadPoolExecutor the central submission method is execute(). Tasks can also be submitted through submit(), but submit() ultimately calls execute() anyway, so we only need to study how execute() is implemented:

public void execute(Runnable command) {
    if (command == null)
        throw new NullPointerException();
    /*
     * Proceed in 3 steps:
     *
     * 1. If fewer than corePoolSize threads are running, try to
     * start a new thread with the given command as its first
     * task.  The call to addWorker atomically checks runState and
     * workerCount, and so prevents false alarms that would add
     * threads when it shouldn't, by returning false.
     *
     * 2. If a task can be successfully queued, then we still need
     * to double-check whether we should have added a thread
     * (because existing ones died since last checking) or that
     * the pool shut down since entry into this method. So we
     * recheck state and if necessary roll back the enqueuing if
     * stopped, or start a new thread if there are none.
     *
     * 3. If we cannot queue task, then we try to add a new
     * thread.  If it fails, we know we are shut down or saturated
     * and so reject the task.
     */
    int c = ctl.get();
    if (workerCountOf(c) < corePoolSize) {
        if (addWorker(command, true))
            return;
        c = ctl.get();
    }
    if (isRunning(c) && workQueue.offer(command)) {
        int recheck = ctl.get();
        if (! isRunning(recheck) && remove(command))
            reject(command);
        else if (workerCountOf(recheck) == 0)
            addWorker(null, false);
    }
    else if (!addWorker(command, false))
        reject(command);
}

Here ctl is a member field:

 private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));

and the ctlOf method looks like this:

// Packs runState and workerCount into a single int; note that runState has already been shifted left, so only its high-order 3 bits are significant
private static int ctlOf(int rs, int wc) { return rs | wc; }

The source documents the ctl variable as follows:

The main pool control state, ctl, is an atomic integer packing two conceptual fields
workerCount, indicating the effective number of threads
runState,    indicating whether running, shutting down etc

In order to pack them into one int, we limit workerCount to (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2 billion) otherwise representable. If this is ever an issue in the future, the variable can be changed to be an AtomicLong, and the shift/mask constants below adjusted. But until the need arises, this code is a bit faster and simpler using an int.

The workerCount is the number of workers that have been permitted to start and not permitted to stop.  The value may be transiently different from the actual number of live threads, for example when a ThreadFactory fails to create a thread when asked, and when exiting threads are still performing bookkeeping before terminating. The user-visible pool size is reported as the current size of the workers set.

This atomic variable carries the pool's control state: workerCount (the number of worker threads) and runState (whether the pool is running, shutting down, and so on). The design is clever: the high-order 3 bits hold runState and the remaining 29 bits hold workerCount, making full use of a single int.
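
For reference, the companion constants and unpacking helpers in the JDK 8 source are:

// 29 bits for the worker count, 3 high-order bits for the run state
private static final int COUNT_BITS = Integer.SIZE - 3;
private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

// Packing and unpacking ctl
private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }
private static int ctlOf(int rs, int wc) { return rs | wc; }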

The flow is actually explained very clearly in the comment inside execute().

First the submitted task is checked for null; if it is null, a NullPointerException is thrown.

if (command == null)
         throw new NullPointerException();

If the current number of threads in the pool is below corePoolSize, the pool tries to add a new worker thread to handle this task.

if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }

Let's take a look at the addWorker method:

/**
     * Checks if a new worker can be added with respect to current
     * pool state and the given bound (either core or maximum). If so,
     * the worker count is adjusted accordingly, and, if possible, a
     * new worker is created and started, running firstTask as its
     * first task. This method returns false if the pool is stopped or
     * eligible to shut down. It also returns false if the thread
     * factory fails to create a thread when asked.  If the thread
     * creation fails, either due to the thread factory returning
     * null, or due to an exception (typically OutOfMemoryError in
     * Thread.start()), we roll back cleanly.
     *
     * @param firstTask the task the new thread should run first (or
     * null if none). Workers are created with an initial first task
     * (in method execute()) to bypass queuing when there are fewer
     * than corePoolSize threads (in which case we always start one),
     * or when the queue is full (in which case we must bypass queue).
     * Initially idle threads are usually created via
     * prestartCoreThread or to replace other dying workers.
     *
     * @param core if true use corePoolSize as bound, else
     * maximumPoolSize. (A boolean indicator is used here rather than a
     * value to ensure reads of fresh values after checking other pool
     * state).
     * @return true if successful
     */
    private boolean addWorker(Runnable firstTask, boolean core) {
        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            w = new Worker(firstTask);
            final Thread t = w.thread;
            if (t != null) {
                final ReentrantLock mainLock = this.mainLock;
                mainLock.lock();
                try {
                    // Recheck while holding lock.
                    // Back out on ThreadFactory failure or if
                    // shut down before lock acquired.
                    int rs = runStateOf(ctl.get());

                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        workers.add(w);
                        int s = workers.size();
                        if (s > largestPoolSize)
                            largestPoolSize = s;
                        workerAdded = true;
                    }
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }

Inside addWorker, the pool's state is checked first to decide whether a new worker thread may be added at all; if not, the method returns false right away:

The condition takes a moment to unpack: it refuses to add a worker once the pool state is at least SHUTDOWN, with one exception. In the SHUTDOWN state a worker with a null firstTask may still be added as long as the queue is not empty, so that the remaining queued tasks can be drained.

if (rs >= SHUTDOWN &&
    ! (rs == SHUTDOWN &&
       firstTask == null &&
       ! workQueue.isEmpty()))
    return false;

If a worker may be added, the method then checks whether the current worker count exceeds the relevant bound (corePoolSize or maximumPoolSize); if it does, the method returns false:

int wc = workerCountOf(c);
if (wc >= CAPACITY ||
    wc >= (core ? corePoolSize : maximumPoolSize))
    return false;

If the bounds are satisfied, compareAndIncrementWorkerCount(int expect) is called to increment workerCount by one. As mentioned earlier, workerCount lives in the low 29 bits of the atomic ctl variable, so a CAS on that variable avoids taking a lock. If the CAS succeeds, the outer loop is exited; if it fails, ctl is re-read and the run state is examined: if the run state has changed, control jumps back to retry, the outer loop; otherwise the inner loop continues:

if (compareAndIncrementWorkerCount(c))
      break retry;
  c = ctl.get();  // Re-read ctl
  if (runStateOf(c) != rs)
      continue retry;
  // else CAS failed due to workerCount change; retry inner loop

compareAndIncrementWorkerCount:

/**
 * Attempts to CAS-increment the workerCount field of ctl.
 */
private boolean compareAndIncrementWorkerCount(int expect) {
    return ctl.compareAndSet(expect, expect + 1);
}

After breaking out of the loop, the method acquires the lock, rechecks the pool's run state, and tries to add the Worker, with the submitted task as its first task, storing it in the worker set (workers, a HashSet). A finally block guarantees that if adding the Worker fails, the operation is rolled back (the Worker is removed from the set, workerCount is decremented, and so on):

boolean workerStarted = false;
boolean workerAdded = false;
Worker w = null;
try {
   w = new Worker(firstTask);
   final Thread t = w.thread;
   if (t != null) {
       final ReentrantLock mainLock = this.mainLock;
       mainLock.lock();
       try {
           // Recheck while holding lock.
           // Back out on ThreadFactory failure or if
           // shut down before lock acquired.
           int rs = runStateOf(ctl.get());

           if (rs < SHUTDOWN ||
               (rs == SHUTDOWN && firstTask == null)) {
               if (t.isAlive()) // precheck that t is startable
                   throw new IllegalThreadStateException();
               workers.add(w);
               int s = workers.size();
               if (s > largestPoolSize)
                   largestPoolSize = s;
               workerAdded = true;
           }
       } finally {
           mainLock.unlock();
       }
       if (workerAdded) {
           t.start();
           workerStarted = true;
       }
   }
} finally {
   if (! workerStarted)
       addWorkerFailed(w);
}

Back in execute(): if the current thread count is greater than or equal to the core pool size, or adding a new worker failed (which can happen under concurrency, since no lock is held), the run state is rechecked and the task is offered to the queue:

 if (isRunning(c) && workQueue.offer(command))

The run state is then checked once more; if the pool is no longer running, the task that was just enqueued is removed and rejected:

int recheck = ctl.get();
if (! isRunning(recheck) && remove(command))
    reject(command);

If the pool is still running, the worker count is checked, and a new Worker is created if it is zero:

else if (workerCountOf(recheck) == 0)
     addWorker(null, false);

If the task could not be enqueued (the queue is full), the pool tries to create a new Worker and lets it handle the task; if that fails too, the task is rejected:

else if (!addWorker(command, false))
     reject(command);

Summed up briefly, there are three steps:

  1. If the current thread count is below corePoolSize, each incoming task gets a new thread, with that task as the thread's first task.
  2. If the current thread count is at or above corePoolSize, the task is placed into the queue (when a thread finishes its current task, it fetches the next one from the queue).
  3. If the task cannot be enqueued (the queue is full) and the current thread count is below maximumPoolSize, a new thread is created to run the task. If the thread count has already reached maximumPoolSize, the rejection policy is applied.

Each of these steps makes clever use of the atomic ctl variable; most of the work is lock-free, yet correctness under concurrency is preserved.

So how exactly does the pool manage to reuse its threads?

The key lies in the implementation of the Worker inner class:

/**
 * Class Worker mainly maintains interrupt control state for
 * threads running tasks, along with other minor bookkeeping.
 * This class opportunistically extends AbstractQueuedSynchronizer
 * to simplify acquiring and releasing a lock surrounding each
 * task execution.  This protects against interrupts that are
 * intended to wake up a worker thread waiting for a task from
 * instead interrupting a task being run.  We implement a simple
 * non-reentrant mutual exclusion lock rather than use
 * ReentrantLock because we do not want worker tasks to be able to
 * reacquire the lock when they invoke pool control methods like
 * setCorePoolSize.  Additionally, to suppress interrupts until
 * the thread actually starts running tasks, we initialize lock
 * state to a negative value, and clear it upon start (in
 * runWorker).
 */
private final class Worker
    extends AbstractQueuedSynchronizer
    implements Runnable
{
    /**
     * This class will never be serialized, but we provide a
     * serialVersionUID to suppress a javac warning.
     */
    private static final long serialVersionUID = 6138294804551838833L;

    /** Thread this worker is running in.  Null if factory fails. */
    final Thread thread;
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;
    /** Per-thread task counter */
    volatile long completedTasks;

    /**
     * Creates with given first task and thread from ThreadFactory.
     * @param firstTask the first task (null if none)
     */
    Worker(Runnable firstTask) {
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        this.thread = getThreadFactory().newThread(this);
    }

    /** Delegates main run loop to outer runWorker  */
    public void run() {
        runWorker(this);
    }

    // Lock methods
    //
    // The value 0 represents the unlocked state.
    // The value 1 represents the locked state.

    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;
    }

    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    public void lock()        { acquire(1); }
    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }

    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
}

Its core method is:

public void run() {
    runWorker(this);
}

which calls runWorker():

/**
* Main worker run loop.  Repeatedly gets tasks from queue and
* executes them, while coping with a number of issues:
*
* 1. We may start out with an initial task, in which case we
* don't need to get the first one. Otherwise, as long as pool is
* running, we get tasks from getTask. If it returns null then the
* worker exits due to changed pool state or configuration
* parameters.  Other exits result from exception throws in
* external code, in which case completedAbruptly holds, which
* usually leads processWorkerExit to replace this thread.
*
* 2. Before running any task, the lock is acquired to prevent
* other pool interrupts while the task is executing, and then we
* ensure that unless pool is stopping, this thread does not have
* its interrupt set.
*
* 3. Each task run is preceded by a call to beforeExecute, which
* might throw an exception, in which case we cause thread to die
* (breaking loop with completedAbruptly true) without processing
* the task.
*
* 4. Assuming beforeExecute completes normally, we run the task,
* gathering any of its thrown exceptions to send to afterExecute.
* We separately handle RuntimeException, Error (both of which the
* specs guarantee that we trap) and arbitrary Throwables.
* Because we cannot rethrow Throwables within Runnable.run, we
* wrap them within Errors on the way out (to the thread's
* UncaughtExceptionHandler).  Any thrown exception also
* conservatively causes thread to die.
*
* 5. After task.run completes, we call afterExecute, which may
* also throw an exception, which will also cause thread to
* die. According to JLS Sec 14.20, this exception is the one that
* will be in effect even if task.run throws.
*
* The net effect of the exception mechanics is that afterExecute
* and the thread's UncaughtExceptionHandler have as accurate
* information as we can provide about any problems encountered by
* user code.
*
* @param w the worker
*/
final void runWorker(Worker w) {
  Thread wt = Thread.currentThread();
  Runnable task = w.firstTask;
  w.firstTask = null;
  w.unlock(); // allow interrupts
  boolean completedAbruptly = true;
  try {
      while (task != null || (task = getTask()) != null) {
          w.lock();
          // If pool is stopping, ensure thread is interrupted;
          // if not, ensure thread is not interrupted.  This
          // requires a recheck in second case to deal with
          // shutdownNow race while clearing interrupt
          if ((runStateAtLeast(ctl.get(), STOP) ||
               (Thread.interrupted() &&
                runStateAtLeast(ctl.get(), STOP))) &&
              !wt.isInterrupted())
              wt.interrupt();
          try {
              beforeExecute(wt, task);
              Throwable thrown = null;
              try {
                  task.run();
              } catch (RuntimeException x) {
                  thrown = x; throw x;
              } catch (Error x) {
                  thrown = x; throw x;
              } catch (Throwable x) {
                  thrown = x; throw new Error(x);
              } finally {
                  afterExecute(task, thrown);
              }
          } finally {
              task = null;
              w.completedTasks++;
              w.unlock();
          }
      }
      completedAbruptly = false;
  } finally {
      processWorkerExit(w, completedAbruptly);
  }
}

The interesting part is what happens inside the while loop:

while (task != null || (task = getTask()) != null)

As long as the task (usually the worker's first task) is non-null, or a task fetched from the queue via getTask() is non-null, the worker first checks the pool's state; if the pool is still running and the thread has not been interrupted, it calls task.run() to execute the task:

while (task != null || (task = getTask()) != null) {
    w.lock();
    // If pool is stopping, ensure thread is interrupted;
    // if not, ensure thread is not interrupted.  This
    // requires a recheck in second case to deal with
    // shutdownNow race while clearing interrupt
    if ((runStateAtLeast(ctl.get(), STOP) ||
         (Thread.interrupted() &&
          runStateAtLeast(ctl.get(), STOP))) &&
        !wt.isInterrupted())
        wt.interrupt();
    try {
        beforeExecute(wt, task);
        Throwable thrown = null;
        try {
            task.run();
        } catch (RuntimeException x) {
            thrown = x; throw x;
        } catch (Error x) {
            thrown = x; throw x;
        } catch (Throwable x) {
            thrown = x; throw new Error(x);
        } finally {
            afterExecute(task, thrown);
        }
    } finally {
        task = null;
        w.completedTasks++;
        w.unlock();
    }
}

The implementation of getTask() also deserves close attention:

/**
* Performs blocking or timed wait for a task, depending on
* current configuration settings, or returns null if this worker
* must exit because of any of:
* 1. There are more than maximumPoolSize workers (due to
*    a call to setMaximumPoolSize).
* 2. The pool is stopped.
* 3. The pool is shutdown and the queue is empty.
* 4. This worker timed out waiting for a task, and timed-out
*    workers are subject to termination (that is,
*    {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
*    both before and after the timed wait, and if the queue is
*    non-empty, this worker is not the last thread in the pool.
*
* @return task, or null if the worker must exit, in which case
*         workerCount is decremented
*/
private Runnable getTask() {
   boolean timedOut = false; // Did the last poll() time out?

   for (;;) {
       int c = ctl.get();
       int rs = runStateOf(c);

       // Check if queue empty only if necessary.
       if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
           decrementWorkerCount();
           return null;
       }

       int wc = workerCountOf(c);

       // Are workers subject to culling?
       boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

       if ((wc > maximumPoolSize || (timed && timedOut))
           && (wc > 1 || workQueue.isEmpty())) {
           if (compareAndDecrementWorkerCount(c))
               return null;
           continue;
       }

       try {
           Runnable r = timed ?
               workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
               workQueue.take();
           if (r != null)
               return r;
           timedOut = true;
       } catch (InterruptedException retry) {
           timedOut = false;
       }
   }
}

The design is quite elegant. Based on allowCoreThreadTimeOut and how the current thread count compares to corePoolSize, getTask() decides whether to wait for a task with a timeout. As mentioned earlier, when allowCoreThreadTimeOut is set, or when the current thread count exceeds corePoolSize, an idle thread whose idle time exceeds the keep-alive time is reclaimed: if the timed poll expires, compareAndDecrementWorkerCount() is called to decrement workerCount and null is returned. When the while loop in runWorker receives null, it exits, and processWorkerExit(w, completedAbruptly) performs the remaining cleanup. That is how idle threads get reclaimed automatically after the timeout.

If no timeout is needed, workQueue.take() is called instead. Because workQueue is a BlockingQueue implementation, take() blocks: when the queue is empty the thread simply waits until a new task is added to the queue and wakes it up.
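
A sketch of the timeout path (the sizes and sleep times are arbitrary): with allowCoreThreadTimeOut enabled, even a core thread is reclaimed once it has been idle for longer than keepAliveTime:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreTimeoutDemo {
    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 1L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true); // getTask() now uses the timed poll() even for core threads

        pool.execute(() -> {});            // starts one worker
        Thread.sleep(100);
        System.out.println("pool size right after the task: " + pool.getPoolSize()); // expected: 1

        Thread.sleep(3000);                // idle for longer than keepAliveTime
        System.out.println("pool size after idling:         " + pool.getPoolSize()); // expected: 0
        pool.shutdown();
    }
}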

Thread initialization in the pool

By default, a newly created pool contains no threads; threads are created only after tasks are submitted.

If threads should be created as soon as the pool is constructed, two methods make that possible:

  • prestartCoreThread(): start one core thread
  • prestartAllCoreThreads(): start all core threads

/**
 * Starts a core thread, causing it to idly wait for work. This
 * overrides the default policy of starting core threads only when
 * new tasks are executed. This method will return {@code false}
 * if all core threads have already been started.
 *
 * @return {@code true} if a thread was started
 */
public boolean prestartCoreThread() {
    return workerCountOf(ctl.get()) < corePoolSize &&
        addWorker(null, true);
}
/**
 * Starts all core threads, causing them to idly wait for work. This
 * overrides the default policy of starting core threads only when
 * new tasks are executed.
 *
 * @return the number of threads started
 */
public int prestartAllCoreThreads() {
    int n = 0;
    while (addWorker(null, true))
        ++n;
    return n;
}

The addWorker() method has already been analyzed above, so we won't go over it again.
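
A quick usage sketch (the pool sizes are arbitrary) showing how the pool size changes as core threads are prestarted:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PrestartDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 8, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        System.out.println(pool.getPoolSize());            // 0: no threads until a task arrives
        pool.prestartCoreThread();
        System.out.println(pool.getPoolSize());            // 1: one core thread started eagerly
        System.out.println(pool.prestartAllCoreThreads()); // 3: the remaining core threads are started
        System.out.println(pool.getPoolSize());            // 4
        pool.shutdown();
    }
}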

The work queue and queuing strategies

As mentioned earlier, workQueue has type BlockingQueue<Runnable>. The commonly used implementations are listed below; the Executors factory methods shown after the list illustrate how the choice of queue shapes the pool:

  • ArrayBlockingQueue: an array-based FIFO queue whose capacity must be specified when it is created
  • LinkedBlockingQueue: a linked-list-based FIFO queue; if no capacity is specified at creation time, it defaults to Integer.MAX_VALUE
  • SynchronousQueue: a special queue that does not hold submitted tasks at all; each task is handed off directly to a thread, and the pool creates a new thread for it if no idle thread is available
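
The Executors factory methods in the JDK 8 source show how the choice of queue shapes the pool's behaviour:

// From java.util.concurrent.Executors (JDK 8)

// Fixed-size pool: an unbounded LinkedBlockingQueue, so extra tasks wait in the
// queue and are never rejected.
public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

// Cached pool: a SynchronousQueue holds nothing, so each task either reuses an
// idle thread or causes a new one to be created (up to Integer.MAX_VALUE).
public static ExecutorService newCachedThreadPool() {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>());
}

This is also why newFixedThreadPool never rejects tasks (its queue is unbounded), while newCachedThreadPool never queues them.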

Task rejection policies

When the work queue is full and the pool already holds maximumPoolSize threads, any further incoming task triggers the rejection policy. The four common built-in policies are listed below, followed by a sketch of a custom handler:

  • ThreadPoolExecutor.AbortPolicy: discard the task and throw a RejectedExecutionException.
  • ThreadPoolExecutor.DiscardPolicy: discard the task as well, but without throwing an exception.
  • ThreadPoolExecutor.DiscardOldestPolicy: discard the task at the head of the queue, then retry executing the task (repeating this process).
  • ThreadPoolExecutor.CallerRunsPolicy: run the task in the calling thread.
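
The handler is a one-method interface, so a custom policy is easy to plug in. A hypothetical example that merely logs the rejection and drops the task (the class name and log format are made up for illustration):

import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

public class LoggingRejectedHandler implements RejectedExecutionHandler {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        // Called by the pool when the queue is full and maximumPoolSize has been reached.
        System.err.println("Rejected " + r + "; pool size = " + executor.getPoolSize()
                + ", queue size = " + executor.getQueue().size());
    }
}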

Shutting down the pool

ThreadPoolExecutor provides two methods for shutting the pool down, shutdown() and shutdownNow(); a common shutdown pattern follows the list:

  • shutdown(): does not terminate the pool immediately; it waits until all tasks in the work queue have finished executing, but accepts no new tasks
  • shutdownNow(): terminates the pool immediately, attempts to interrupt tasks that are currently running, clears the work queue, and returns the tasks that were never executed
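
A common two-phase shutdown sketch built on these two methods, similar in spirit to the example in the ExecutorService javadoc (the 30-second timeouts are arbitrary):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

public class ShutdownUtil {
    // Stop accepting new tasks, give queued tasks a chance to finish,
    // and only then fall back to interrupting whatever is still running.
    static void shutdownGracefully(ExecutorService pool) {
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // interrupts running tasks and drains the queue
                if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                    System.err.println("Pool did not terminate");
                }
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}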

Resizing the pool dynamically

ThreadPoolExecutor provides methods for adjusting the pool's capacity at runtime: setCorePoolSize() and setMaximumPoolSize():

  • setCorePoolSize: sets the core pool size
  • setMaximumPoolSize: sets the maximum number of threads the pool may create

When these values are increased, ThreadPoolExecutor may immediately start new threads to execute queued tasks; when they are decreased, the excess threads are reclaimed as they become idle.
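
A sketch of resizing at runtime (all numbers are arbitrary). Note that maximumPoolSize must never drop below corePoolSize, which dictates the order of the calls:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ResizeDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        // Grow: raise the maximum first so it never falls below the core size.
        pool.setMaximumPoolSize(16);
        pool.setCorePoolSize(8);   // queued tasks may immediately get new threads
        System.out.println(pool.getCorePoolSize() + "/" + pool.getMaximumPoolSize()); // 8/16

        // Shrink: lower the core size first; excess threads are reclaimed as they idle.
        pool.setCorePoolSize(2);
        pool.setMaximumPoolSize(4);
        System.out.println(pool.getCorePoolSize() + "/" + pool.getMaximumPoolSize()); // 2/4

        pool.shutdown();
    }
}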

Choosing a reasonable pool size

The pool size generally needs to be chosen according to the type of task:

For CPU-bound tasks, you want to squeeze as much out of the CPU as possible; a reference value is NCPU + 1.

For IO-bound tasks, a reference value is 2 * NCPU.

These are only reference values; the actual setting should be tuned to your situation. For example, start with the reference value, then observe how the tasks behave along with system load and resource utilization, and adjust from there.
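
A small sketch that derives these starting points from the number of available processors (the formulas are the rule-of-thumb values above, not measured optima):

public class PoolSizing {
    public static void main(String[] args) {
        int nCpu = Runtime.getRuntime().availableProcessors();

        int cpuBoundSize = nCpu + 1; // CPU-bound tasks: keep every core busy with minimal context switching
        int ioBoundSize  = 2 * nCpu; // IO-bound tasks: threads spend much of their time blocked

        System.out.println("CPU-bound pool size: " + cpuBoundSize);
        System.out.println("IO-bound pool size:  " + ioBoundSize);
    }
}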

References

Java并发编程:线程池的使用
