Understanding the ThreadPoolExecutor Source (Part 1): Design Choices in ThreadPoolExecutor and Answers to Some Questions


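Bit operations in ThreadPoolExecutor

To save space, ThreadPoolExecutor packs the number of running threads and the pool state into a single int variable.

The implementation defines five pool states, so at least 3 bits are needed to represent all of them. That leaves 29 bits for the running thread count (a Java int is 4 bytes, or 32 bits, and 32 - 3 = 29). With that in mind, ThreadPoolExecutor lays out the two fields as follows:

+-----------+----------------------------+
| state (3) |     worker count (29)      |
+-----------+----------------------------+

This is why 3 is subtracted when COUNT_BITS is defined: those 3 bits are the ones reserved for the state.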

private static final int COUNT_BITS = Integer.SIZE - 3;

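The consequence is a trade-off. Keeping the running thread count and the pool state in one int saves space, but because the state takes 3 of the bits, the maximum number of threads the pool can track drops from what a full non-negative int could hold, $2^{31} - 1$ (about 2.1 billion), to $2^{29} - 1$ (about 500 million).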

With that, the following constants in the source make sense:

private static final int COUNT_BITS = Integer.SIZE - 3;      // bits used for workerCount: 29 = 32 - 3
private static final int CAPACITY   = (1 << COUNT_BITS) - 1; // capacity is 2^29 - 1

// runState is stored in the high-order bits
private static final int RUNNING    = -1 << COUNT_BITS;	// 11100...0
private static final int SHUTDOWN   =  0 << COUNT_BITS;	// 00000...0
private static final int STOP       =  1 << COUNT_BITS;	// 00100...0
private static final int TIDYING    =  2 << COUNT_BITS;	// 01000...0
private static final int TERMINATED =  3 << COUNT_BITS;	// 01100...0

How do we then extract these values? ThreadPoolExecutor provides the following methods:

private static int runStateOf(int c)     { return c & ~CAPACITY; }
private static int workerCountOf(int c)  { return c & CAPACITY; }

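Both accessors above use bit operations: each one masks out the part of the int that holds the field it wants. CAPACITY is effectively the boundary between the two fields, which is easiest to see in binary.

CAPACITY=00011111111111111111111111111111  // (3 zeros, 29 ones)
~CAPACITY=11100000000000000000000000000000 // (3 ones, 29 zeros)

ANDing the int with CAPACITY therefore yields the running thread count, while ANDing it with ~CAPACITY yields the state.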
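
To make the masking concrete, here is a minimal, self-contained sketch that packs a state and a count into one int and reads them back. The constants and the two accessors are copied from the listings above; the class name CtlDemo and the helper ctlOf are assumptions added for illustration (the article does not show how the two fields are combined, and a bitwise OR is simply the natural counterpart of the two masks).

```java
// Illustrative sketch, not the JDK class itself.
class CtlDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;       // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;  // low 29 bits set

    static final int RUNNING    = -1 << COUNT_BITS;       // high bits 111
    static final int SHUTDOWN   =  0 << COUNT_BITS;       // high bits 000
    static final int STOP       =  1 << COUNT_BITS;       // high bits 001

    // Keep only the high 3 bits (the state).
    static int runStateOf(int c)    { return c & ~CAPACITY; }
    // Keep only the low 29 bits (the worker count).
    static int workerCountOf(int c) { return c & CAPACITY; }
    // Assumed helper: combine a state and a count into one packed int.
    static int ctlOf(int rs, int wc) { return rs | wc; }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println(Integer.toBinaryString(c));  // 32-bit pattern of the packed value
        System.out.println(runStateOf(c) == RUNNING);   // true
        System.out.println(workerCountOf(c));           // 5
    }
}
```

Because the two masks never overlap, changing the count cannot disturb the state bits as long as the count stays within CAPACITY.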

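Let's look at what the two packed fields, runState and workerCount, actually mean.

workerCount

Literally, this is the number of workers, in other words the number of threads. The implementation notes describe it as follows: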

The workerCount is the number of workers that have been permitted to start and not permitted to stop. The value may be transiently different from the actual number of live threads, for example when a ThreadFactory fails to create a thread when asked, and when exiting threads are still performing bookkeeping before terminating. The user-visible pool size is reported as the current size of the workers set.

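My reading is that workerCount counts every worker that has been permitted to start and has not yet been permitted to stop, regardless of what the underlying thread is doing at that moment. It can therefore differ briefly from the number of live threads, for example when a ThreadFactory fails to create a thread, or when an exiting thread is still doing its final bookkeeping before it terminates. The pool size reported to users, by contrast, is taken from the current size of the workers set.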

runState

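This is the run state of the thread pool:

| State | Description |
| --- | --- |
| RUNNING | Accepts new tasks and processes queued tasks |
| SHUTDOWN | Does not accept new tasks, but still processes queued tasks |
| STOP | Does not accept new tasks, does not process queued tasks, and interrupts in-progress tasks |
| TIDYING | All tasks have terminated and workerCount is 0; the terminated() hook is about to run |
| TERMINATED | terminated() has completed |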

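Note the relative numeric order of the five state values: runState only ever increases over time, although a pool does not have to pass through every state. The transitions are:

RUNNING -> SHUTDOWN: on a call to shutdown(), possibly made implicitly in finalize()
(RUNNING or SHUTDOWN) -> STOP: on a call to shutdownNow()
SHUTDOWN -> TIDYING: when both the queue and the pool are empty
STOP -> TIDYING: when the pool is empty
TIDYING -> TERMINATED: when the terminated() hook has completed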
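
The internal states are not exposed directly, but part of the progression can be observed from outside through the public methods isShutdown(), isTerminated() and awaitTermination(). The sketch below is only an illustration of that; the class name LifecycleDemo and the method observe are made up for this example, and the intermediate TIDYING state is too short-lived to observe reliably this way.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: watch a pool move from RUNNING towards TERMINATED.
class LifecycleDemo {
    // Returns {isShutdown before shutdown(), isShutdown after it, isTerminated at the end}.
    static boolean[] observe() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.execute(() -> { });              // one trivial task
        boolean before = pool.isShutdown();   // pool is still RUNNING here
        pool.shutdown();                      // RUNNING -> SHUTDOWN
        boolean after = pool.isShutdown();
        try {
            // Waits until the pool has terminated, or gives up after 5 seconds.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] { before, after, pool.isTerminated() };
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```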

Because the states are ordered this way, the check for whether the pool is running is written as a simple comparison:

private static boolean isRunning(int c) {
	return c < SHUTDOWN;
}
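
One step that is easy to miss: isRunning receives the whole packed int, not just the state bits, and the comparison still works. RUNNING occupies the sign bit, so any packed value in the RUNNING state is negative, while every other state is non-negative. The sketch below checks this with the constants from the listings above. The class name is hypothetical, and the body of runStateAtLeast is an assumption: the method is called in runWorker but its body is not shown in this article, so a plain comparison consistent with the ordering is used here.

```java
// Illustrative sketch of why whole-int comparisons are enough.
class RunStateOrderDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static boolean isRunning(int c) { return c < SHUTDOWN; }
    // Assumed body: true when the packed value is in state s or a later one.
    static boolean runStateAtLeast(int c, int s) { return c >= s; }

    public static void main(String[] args) {
        // Even with the maximum worker count, a RUNNING pool compares below SHUTDOWN.
        System.out.println(isRunning(RUNNING | CAPACITY));               // true
        // A SHUTDOWN pool with the maximum worker count is still below STOP.
        System.out.println(runStateAtLeast(SHUTDOWN | CAPACITY, STOP));  // false
        System.out.println(runStateAtLeast(TIDYING | 3, STOP));          // true
    }
}
```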

What some of the fields are for

/**
 * Lock held on access to workers set and related bookkeeping.
 * While we could use a concurrent set of some sort, it turns out
 * to be generally preferable to use a lock. Among the reasons is
 * that this serializes interruptIdleWorkers, which avoids
 * unnecessary interrupt storms, especially during shutdown.
 * Otherwise exiting threads would concurrently interrupt those
 * that have not yet interrupted. It also simplifies some of the
 * associated statistics bookkeeping of largestPoolSize etc. We
 * also hold mainLock on shutdown and shutdownNow, for the sake of
 * ensuring workers set is stable while separately checking
 * permission to interrupt and actually interrupting.
 */
private final ReentrantLock mainLock = new ReentrantLock();

/**
 * Tracks largest attained pool size. Accessed only under
 * mainLock.
 */
private int largestPoolSize;

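Why does ThreadPoolExecutor use a ReentrantLock to synchronize the workers set when a concurrent collection could have been used instead? One of the reasons is that a lock serializes interruptIdleWorkers (that is, calls run one after another), which avoids interrupt storms, especially while the pool is shutting down.

The same lock also protects largestPoolSize, the largest pool size reached so far.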
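
The value tracked by largestPoolSize can be read through the public getLargestPoolSize() method. The following sketch is only an illustration of what the field records; the class name LargestPoolSizeDemo, the method run and the latch-based blocking task are assumptions made for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: largestPoolSize is a high-water mark that outlives the threads.
class LargestPoolSizeDemo {
    // Returns {largest size while busy, largest size after termination, pool size after termination}.
    static int[] run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();   // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(blocker);     // starts the first core thread
        pool.execute(blocker);     // starts the second core thread
        int whileBusy = pool.getLargestPoolSize();

        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { whileBusy, pool.getLargestPoolSize(), pool.getPoolSize() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```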

One important point: every user-controlled parameter in ThreadPoolExecutor is declared volatile, so that each read sees the most recent value without any locking.

/*
 * All user control parameters are declared as volatiles so that
 * ongoing actions are based on freshest values, but without need
 * for locking, since no internal invariants depend on them
 * changing synchronously with respect to other actions.
 */
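
In practice this means the tuning parameters can be changed on a live pool through the public setters, and later reads see the new values. A small sketch follows, using only public ThreadPoolExecutor methods; the class name TuningDemo and the specific numbers are arbitrary choices for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: adjust user-controlled parameters after construction.
class TuningDemo {
    // Returns {corePoolSize, maximumPoolSize, keepAliveTime in seconds} after retuning.
    static int[] retune() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            pool.setMaximumPoolSize(8);   // raise the maximum first so core <= maximum always holds
            pool.setCorePoolSize(4);
            pool.setKeepAliveTime(10L, TimeUnit.SECONDS);
            return new int[] {
                pool.getCorePoolSize(),
                pool.getMaximumPoolSize(),
                (int) pool.getKeepAliveTime(TimeUnit.SECONDS)
            };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = retune();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```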

How does ThreadPoolExecutor reuse threads?

The answer can be found in the body of this method:

final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    w.unlock(); // allow interrupts
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            w.lock();
            // If pool is stopping, ensure thread is interrupted;
            // if not, ensure thread is not interrupted.  This
            // requires a recheck in second case to deal with
            // shutdownNow race while clearing interrupt
            if ((runStateAtLeast(ctl.get(), STOP) ||
                 (Thread.interrupted() &&
                  runStateAtLeast(ctl.get(), STOP))) &&
                !wt.isInterrupted())
                wt.interrupt();
            try {
                beforeExecute(wt, task);
                Throwable thrown = null;
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                w.completedTasks++;
                w.unlock();
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}

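The logic of this method is fairly clear, and it answers several of our earlier questions:

  1. How are the threads in the pool maintained?

    A: The threads are kept in a Set, but it is not a plain set of Thread objects. Its elements are instances of a Worker data structure:

    Worker: holds the fields Thread thread and Runnable firstTask (among others), implements Runnable, and extends AbstractQueuedSynchronizer.

    The Worker is the entity that actually carries the thread.

  2. How does the thread pool reuse threads?

    A: ThreadPoolExecutor handles a submitted task in one of three ways: run it on a core thread, put it in the blocking queue, or run it on a thread beyond the core size, up to the maximum. In the first and third cases a new thread is created for the incoming task. The key question is therefore how those core and non-core threads go on to run the tasks waiting in the blocking queue.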

    private boolean addWorker(Runnable firstTask, boolean core) {
        ...
        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            w = new Worker(firstTask);
            final Thread t = w.thread;
            if (t != null) {
                final ReentrantLock mainLock = this.mainLock;
                mainLock.lock();
                try {
                    ...
                    workers.add(w);
                    ...
                    workerAdded = true;
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }
    

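    Note that the thread is started in addWorker, and the task it runs is the Worker object itself. The Worker in turn ends up calling the runWorker method shown above.

    Look at the condition of the while loop in runWorker: if there is a firstTask it is run directly; otherwise the next task is fetched through getTask. Here is how getTask is implemented: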

    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?
    
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);
    
            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }
    
            int wc = workerCountOf(c);
    
            // Are workers subject to culling?
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
    
            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                if (compareAndDecrementWorkerCount(c))
                    return null;
                continue;
            }
    
            try {
                // The key part: fetch a task from the blocking queue; poll waits with a timeout, while take waits indefinitely
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
                timedOut = true;
            } catch (InterruptedException retry) {
                timedOut = false;
            }
        }
    }
    

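    So the overall flow is: once a core or non-core thread finishes its own task, it fetches new tasks from the blocking queue and runs them on the same thread. That is how threads are reused.

  3. How does the pool make sure surplus threads are destroyed once the configured time has passed?

    A: The background here is that we pass a timeout (the keep-alive time) when constructing the pool. It means that when the thread count exceeds the core size, threads that have been idle for longer than this time are destroyed.

    The relevant code is in getTask(): fetching a task from the blocking queue involves a timed wait, declared as follows: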

    /**
     * Retrieves and removes the head of this queue, waiting up to the
     * specified wait time if necessary for an element to become available.
     *
     * @param timeout how long to wait before giving up, in units of
     *        {@code unit}
     * @param unit a {@code TimeUnit} determining how to interpret the
     *        {@code timeout} parameter
     * @return the head of this queue, or {@code null} if the
     *         specified waiting time elapses before an element is available
     * @throws InterruptedException if interrupted while waiting
     */
    E poll(long timeout, TimeUnit unit) throws InterruptedException;
    

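    From the definition of poll we can see that it returns null when the wait times out. getTask() then goes around its loop once more with timedOut set and, if the worker is still eligible for culling, returns null itself. runWorker sees its while condition fail, leaves the loop, and hands the thread to processWorkerExit for final cleanup.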
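
The reuse and keep-alive behaviour described in answers 2 and 3 can be reduced to a very small sketch: one thread keeps polling a blocking queue with a timeout, runs whatever it gets, and exits once poll returns null. This is a deliberately simplified stand-in, not the JDK implementation; the class name ReuseDemo, the method runOnSingleWorker and the thread name are all made up for illustration, and there is no state handling, locking or worker-count bookkeeping.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: a single worker thread reused for many queued tasks.
class ReuseDemo {
    // Returns {tasks executed, distinct threads that executed them}.
    static int[] runOnSingleWorker(int tasks, long keepAliveMillis) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicInteger executed = new AtomicInteger();
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            queue.add(() -> {
                executed.incrementAndGet();
                threadNames.add(Thread.currentThread().getName());
            });
        }

        Thread worker = new Thread(() -> {
            try {
                Runnable task;
                // Same shape as the runWorker loop: keep going while a task is available.
                while ((task = queue.poll(keepAliveMillis, TimeUnit.MILLISECONDS)) != null) {
                    task.run();
                }
                // poll returned null: idle for longer than the keep-alive time, so the thread ends.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "mini-worker");
        worker.start();
        try {
            worker.join();   // returns once the worker has timed out and exited
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { executed.get(), threadNames.size() };
    }

    public static void main(String[] args) {
        int[] r = runOnSingleWorker(5, 100);
        System.out.println(r[0] + " tasks ran on " + r[1] + " thread(s)");
    }
}
```

Swapping poll for take in this loop would give the behaviour of a core thread without allowCoreThreadTimeOut: it would wait indefinitely instead of exiting when idle.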
