A Simple Analysis of How Thread Pools Work

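I remember an interview four years ago where I was pressed on why one should use a thread pool. I gave the easy answer: a pool avoids creating threads over and over, and creating a thread is a comparatively expensive operation. But how does a thread pool actually work? I never looked into that question, so today I am writing up a summary.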

Overall structure

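Whether you build a thread pool through Executors or pass in a ThreadFactory built with Guava, what you end up constructing is a ThreadPoolExecutor, so that class is the target of this analysis.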

The ctl variable

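First, we have to mention a rather magical variable, ctl. For our purposes it is enough to understand it as a single int made of two parts: the pool's run state in the highest 3 bits and the number of workers in the remaining 29 bits. The figure below shows the layout.

[Figure: bit layout of ctl; image not available]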
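As a rough illustration of the packing described above, here is a minimal sketch. The class name CtlSketch is hypothetical, and the constants are an assumption modeled on the JDK 8 ThreadPoolExecutor fields; the real class keeps the value in an AtomicInteger rather than a plain int.

```java
final class CtlSketch {
    // Assumption: these constants mirror the JDK 8 ThreadPoolExecutor fields.
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 low bits hold the worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;   // mask selecting the low 29 bits

    // Run states live in the high 3 bits and are ordered numerically.
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    /** Packs a run state and a worker count into one int. */
    static int ctlOf(int runState, int workerCount) {
        return runState | workerCount;
    }

    /** Extracts the high 3 bits (the run state). */
    static int runStateOf(int ctl) {
        return ctl & ~CAPACITY;
    }

    /** Extracts the low 29 bits (the worker count). */
    static int workerCountOf(int ctl) {
        return ctl & CAPACITY;
    }

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 5);
        System.out.println(runStateOf(ctl) == RUNNING); // true
        System.out.println(workerCountOf(ctl));         // 5
    }
}
```

Because RUNNING is negative and the other states grow from zero, a check such as rs >= SHUTDOWN in getTask reads as "the pool is no longer running".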

The core worker set

private final HashSet<Worker> workers = new HashSet<Worker>();

This is a HashSet of Worker objects, i.e. the set of our worker threads. Let's look at the Worker code:

private final class Worker
        extends AbstractQueuedSynchronizer
        implements Runnable {

    final Thread thread;
    /** Initial task to run.  Possibly null. */
    Runnable firstTask;

    Worker(Runnable firstTask) {
        setState(-1); // inhibit interrupts until runWorker
        this.firstTask = firstTask;
        this.thread = getThreadFactory().newThread(this);
    }
}

A Worker mainly holds two things: thread and firstTask. Worker also implements Runnable, and the Runnable it hands to the Thread is itself.

Now let's look at the Thread code. After new, the constructor mainly calls the init method:

private void init(ThreadGroup g, Runnable target, String name,
                      long stackSize, AccessControlContext acc,
                      boolean inheritThreadLocals) {
        if (name == null) {
            throw new NullPointerException("name cannot be null");
        }

        this.name = name;

        ...
}

But there is no sign of any thread being allocated here; it only creates a new object.
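The difference between constructing a Thread and starting it can be observed from the outside through Thread.getState(). The following is a small sketch; the class name ThreadLifecycleSketch is made up for illustration.

```java
final class ThreadLifecycleSketch {

    /** Returns the thread's state right after construction and after it has finished. */
    static Thread.State[] observe() {
        Thread t = new Thread(() -> { });        // only a Java object so far
        Thread.State beforeStart = t.getState(); // NEW: start() has not been called

        t.start();                               // this is where start0() is reached
        try {
            t.join();                            // wait for run() to return
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State afterJoin = t.getState();   // TERMINATED once join() returns normally

        return new Thread.State[] { beforeStart, afterJoin };
    }

    public static void main(String[] args) {
        Thread.State[] states = observe();
        System.out.println(states[0] + " -> " + states[1]);
    }
}
```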

The task queue

private final BlockingQueue<Runnable> workQueue;

A thread-safe queue; nothing more to say here.

Let's see how a thread gets started and how it is kept alive.

Getting it running

In java.util.concurrent.ThreadPoolExecutor#addWorker we can see that if the worker was added successfully, the thread inside the Worker is started:

if (workerAdded) {
   t.start();
   workerStarted = true;
}

So let's look at Thread.start:

try {
    start0();
    started = true;
} finally {
...

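Here we see a call to the native method start0. From jvm.cpp in HotSpot we can tell that it asks the operating system to create a thread, which then calls Thread.run. (I am not pasting that code because, honestly, I can't quite follow it.)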

Now the key question: what happens once the Thread has finished running this task?

The answer: it waits for another task.

The method in the source is java.util.concurrent.ThreadPoolExecutor#runWorker:

final void runWorker(Worker w) {
    Thread wt = Thread.currentThread();
    Runnable task = w.firstTask;
    w.firstTask = null;
    ...
    boolean completedAbruptly = true;
    try {
        while (task != null || (task = getTask()) != null) {
            ...
            try {
                ...
                try {
                    task.run();
                } catch (RuntimeException x) {
                    thrown = x; throw x;
                } catch (Error x) {
                    thrown = x; throw x;
                } catch (Throwable x) {
                    thrown = x; throw new Error(x);
                } finally {
                    afterExecute(task, thrown);
                }
            } finally {
                task = null;
                ...
            }
        }
        completedAbruptly = false;
    } finally {
        processWorkerExit(w, completedAbruptly);
    }
}

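Once a task has run, the loop comes back to task = getTask() and waits there. This tells us the thread is in fact still running: it has merely gone from running a task to waiting for one, and it has not exited.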
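The same idea can be reduced to a few lines. The sketch below is not the JDK implementation: MiniWorkerLoop and its STOP sentinel are hypothetical names, and the real pool ends the loop when getTask() returns null rather than with a sentinel. It only shows that one long-lived thread, blocked in take() between tasks, ends up running every task.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MiniWorkerLoop {
    // Hypothetical sentinel that ends the loop; ThreadPoolExecutor uses a null from getTask() instead.
    private static final Runnable STOP = () -> { };

    /** Runs two tasks on a single worker thread and returns the thread name each task saw. */
    static List<String> runTwoTasks() {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        List<String> ranOn = Collections.synchronizedList(new ArrayList<>());

        Thread worker = new Thread(() -> {
            try {
                Runnable task;
                // take() blocks while the queue is empty, so the thread waits instead of exiting
                while ((task = queue.take()) != STOP) {
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "mini-worker");
        worker.start();

        Runnable recordThread = () -> ranOn.add(Thread.currentThread().getName());
        queue.add(recordThread);
        queue.add(recordThread);
        queue.add(STOP);

        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn;
    }

    public static void main(String[] args) {
        System.out.println(runTwoTasks()); // [mini-worker, mini-worker]
    }
}
```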

Let's look at the getTask method next.

for (;;) {
    int c = ctl.get();
    int rs = runStateOf(c);

    // Check if queue empty only if necessary.
    if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
        decrementWorkerCount();
        return null;
    }

    int wc = workerCountOf(c);

    // Are workers subject to culling?
    boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

    if ((wc > maximumPoolSize || (timed && timedOut))
        && (wc > 1 || workQueue.isEmpty())) {
        if (compareAndDecrementWorkerCount(c))
            return null;
        continue;
    }

    try {
        Runnable r = timed ?
            workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
            workQueue.take();
        if (r != null)
            return r;
        timedOut = true;
    } catch (InterruptedException retry) {
        timedOut = false;
    }
}

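From the code we can see:

  1. It first checks the pool state: if the pool is STOP or beyond, or is SHUTDOWN with an empty queue, it returns null.
  2. It then checks whether core threads may also be reclaimed (allowCoreThreadTimeOut, false by default) or whether the current worker count is greater than corePoolSize.
  3. If that check is true, it polls the queue with keepAliveTime as the timeout; otherwise it blocks in take() until a task arrives. When the timed poll comes back empty, timedOut is set and the next pass through the loop returns null, so the worker exits and its thread ends.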
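The effect of these steps can be watched on a real ThreadPoolExecutor. The sketch below makes some assumptions for illustration: the class name KeepAliveSketch, the pool sizes (core 1, maximum 2), the 100 ms keep-alive and the 5 second polling limit are all arbitrary choices. It grows the pool to two threads, lets them go idle, and waits for the extra thread to time out in getTask.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class KeepAliveSketch {

    /** Returns { pool size while both tasks are running, pool size after the threads went idle }. */
    static int[] growThenShrink() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 100L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // started as the core thread's first task
            pool.execute(blocker); // no idle worker to hand off to, so a second thread is added
            int busy = pool.getPoolSize();

            release.countDown();   // both tasks finish; both workers go back to getTask()

            // Wait (bounded) for the non-core worker to time out and exit.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                Thread.sleep(20);
            }
            return new int[] { busy, pool.getPoolSize() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = growThenShrink();
        System.out.println(sizes[0] + " -> " + sizes[1]);
    }
}
```

Only one of the two workers exits: once the count drops back to corePoolSize, the remaining worker is no longer "timed" and blocks in take().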

How do tasks get in?

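This question grew out of a production issue. Our RPC calls can be configured with a thread pool, and to reduce response time I set the pool's queue very small, fixed at 8. As a result we frequently ran into the pool overflowing, yet the number of active threads was low, so running short of threads should not have been possible.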

Intuitively, a task that comes in is handed straight to a worker thread. But is that really the case?
Let's look at the source once more.

public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        /*
         * Proceed in 3 steps:
         *
         * 1. If fewer than corePoolSize threads are running, try to
         * start a new thread with the given command as its first
         * task.  The call to addWorker atomically checks runState and
         * workerCount, and so prevents false alarms that would add
         * threads when it shouldn't, by returning false.
         *
         * 2. If a task can be successfully queued, then we still need
         * to double-check whether we should have added a thread
         * (because existing ones died since last checking) or that
         * the pool shut down since entry into this method. So we
         * recheck state and if necessary roll back the enqueuing if
         * stopped, or start a new thread if there are none.
         *
         * 3. If we cannot queue task, then we try to add a new
         * thread.  If it fails, we know we are shut down or saturated
         * and so reject the task.
         */
        int c = ctl.get();
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                reject(command);
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        else if (!addWorker(command, false))
            reject(command);
    }
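The three steps in that comment can be observed directly. The following sketch uses deliberately tiny, made-up settings (core 1, maximum 2, an ArrayBlockingQueue of capacity 1) and a hypothetical class name, ExecuteOrderDemo; every task blocks on a latch so that nothing finishes while we submit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecuteOrderDemo {

    /** Submits four blocking tasks and records what the pool did with each one. */
    static List<String> observe() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    // default AbortPolicy: queue full and maximumPoolSize reached
                    log.add("task" + i + ": rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        observe().forEach(System.out::println);
        // task1: threads=1 queued=0   (step 1: new core thread)
        // task2: threads=1 queued=1   (step 2: queued, no new thread)
        // task3: threads=2 queued=1   (step 3: queue full, extra thread)
        // task4: rejected             (step 3: saturated)
    }
}
```

Note that the second task waits in the queue even though the pool is allowed a second thread; extra threads are only created once the queue refuses the task.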

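So is there a setup that matches our intuition, where a task runs as soon as there is an idle thread? There is. Reading the code of our RPC framework, it relies on a key "virtual" queue: SynchronousQueue.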

Let's look at this queue's put method:

public void put(E e) throws InterruptedException {
    if (e == null) throw new NullPointerException();
    if (transferer.transfer(e, false, 0) == null) {
        Thread.interrupted();
        throw new InterruptedException();
    }
}

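If transfer returns null, an exception is thrown; otherwise the hand-off succeeded. Note that ThreadPoolExecutor#execute calls offer rather than put; in this version offer goes through the same transfer method with timed = true and nanos = 0, so it returns false right away instead of waiting when no consumer is ready.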

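Next, let's look at the implementation of the transferer's transfer method. The version shown here is TransferQueue, which is used in fair mode; the default non-fair mode uses TransferStack, which follows the same idea.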

E transfer(E e, boolean timed, long nanos) {
    /* Basic algorithm is to loop trying to take either of
     * two actions:
     *
     * 1. If queue apparently empty or holding same-mode nodes,
     *    try to add node to queue of waiters, wait to be
     *    fulfilled (or cancelled) and return matching item.
     *
     * 2. If queue apparently contains waiting items, and this
     *    call is of complementary mode, try to fulfill by CAS'ing
     *    item field of waiting node and dequeuing it, and then
     *    returning matching item.
     *
     * In each case, along the way, check for and try to help
     * advance head and tail on behalf of other stalled/slow
     * threads.
     *
     * The loop starts off with a null check guarding against
     * seeing uninitialized head or tail values. This never
     * happens in current SynchronousQueue, but could if
     * callers held non-volatile/final ref to the
     * transferer. The check is here anyway because it places
     * null checks at top of loop, which is usually faster
     * than having them implicitly interspersed.
     */

    QNode s = null; // constructed/reused as needed
    boolean isData = (e != null);

    for (;;) {
        QNode t = tail;
        QNode h = head;
        // if either head or tail is null, the queue is not fully initialized yet, so spin
        if (t == null || h == null)         // saw uninitialized value
            continue;                       // spin

        // if head == tail (empty queue), or the tail node has the same mode as the incoming one
        if (h == t || t.isData == isData) { // empty or same-mode
            // if tail.next is already set, the tail is lagging: CAS tail forward to it so that tail.next ends up null
            QNode tn = t.next;
            if (t != tail)                  // inconsistent read
                continue;
            if (tn != null) {               // lagging tail
                advanceTail(t, tn);
                continue;
            }
            // timed call with no time left to wait: return null
            if (timed && nanos <= 0)        // can't wait
                return null;
            // wrap our item in a QNode and link it in as t.next
            if (s == null)
                s = new QNode(e, isData);
            if (!t.casNext(null, s))        // failed to link in
                continue;

            // make our node the new tail
            advanceTail(t, s);              // swing tail and wait
            // wait for the node to be matched (a producer paired with a consumer)
            Object x = awaitFulfill(s, e, timed, nanos);
            // if s was cancelled, clean it up
            if (x == s) {                   // wait was cancelled
                clean(t, s);
                return null;
            }

            // if s has not been unlinked yet, advance the head past it and clear s.waiter
            if (!s.isOffList()) {           // not already unlinked
                advanceHead(t, s);          // unlink if head
                if (x != null)              // and forget fields
                    s.item = s;
                s.waiter = null;
            }

            // return the item
            return (x != null) ? (E)x : e;
        } else {
            // different mode, and there is a waiting node (complementary-mode)
            QNode m = h.next;               // node to fulfill
            if (t != tail || m == null || h != head)
                continue;                   // inconsistent read

            Object x = m.item;
            if (isData == (x != null) ||    // m already fulfilled
                x == m ||                   // m cancelled
                !m.casItem(x, e)) {         // lost CAS
                advanceHead(h, m);          // dequeue and retry
                continue;
            }

            // advance head past m
            advanceHead(h, m);              // successfully fulfilled
            // wake up the thread waiting on m
            LockSupport.unpark(m.waiter);
            // return the item
            return (x != null) ? (E)x : e;
        }
    }
}

Within it, awaitFulfill is the method that waits for a match.

Object awaitFulfill(QNode s, E e, boolean timed, long nanos) {
    /* Same idea as TransferStack.awaitFulfill */
    final long deadline = timed ? System.nanoTime() + nanos : 0L;
    // the current thread
    Thread w = Thread.currentThread();
    int spins = ((head.next == s) ?
                 (timed ? maxTimedSpins : maxUntimedSpins) : 0);
    for (;;) {
        // cancel the node if this thread was interrupted (a timeout is handled below); otherwise keep spinning or parking
        if (w.isInterrupted())
            s.tryCancel(e);
        Object x = s.item;
        // once s.item is no longer e, the node was matched or cancelled: return x
        if (x != e)
            return x;
        if (timed) {
            nanos = deadline - System.nanoTime();
            if (nanos <= 0L) {
                s.tryCancel(e);
                continue;
            }
        }
        if (spins > 0)
            --spins;
        // if no waiter is recorded yet, register the current thread as s's waiter
        else if (s.waiter == null)
            s.waiter = w;
        else if (!timed)
            LockSupport.park(this);
        else if (nanos > spinForTimeoutThreshold)
            LockSupport.parkNanos(this, nanos);
    }
}

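The net effect of this queue: it stores zero elements. Producers and consumers line up in a single-mode queue (a node can only be appended when it has the same mode as the tail), and once a producer and a consumer are matched, the node is removed from the queue and the waiting thread is woken up.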
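The hand-off behavior is easy to check from the outside. This is a small sketch with a hypothetical class name, HandoffSketch, and an arbitrary 5 second timeout: a plain offer with nobody waiting fails immediately, while a timed offer is matched with a consumer thread sitting in take().

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

final class HandoffSketch {

    /** offer() with no consumer waiting: returns false at once and nothing is stored. */
    static boolean offerWithoutConsumer() {
        SynchronousQueue<String> q = new SynchronousQueue<>();
        return q.offer("task");
    }

    /** A consumer in take() is matched with the producer; returns what the consumer received. */
    static String offerWithConsumer() {
        SynchronousQueue<String> q = new SynchronousQueue<>();
        String[] received = new String[1];
        Thread consumer = new Thread(() -> {
            try {
                received[0] = q.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            // Timed offer: waits until the consumer arrives, whichever side got there first.
            boolean handedOff = q.offer("task", 5, TimeUnit.SECONDS);
            if (!handedOff) {
                consumer.interrupt(); // do not leave the consumer blocked forever
            }
            consumer.join();          // join() makes the write to received[0] visible here
        } catch (InterruptedException e) {
            consumer.interrupt();
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(offerWithoutConsumer()); // false
        System.out.println(offerWithConsumer());    // task
    }
}
```

This is why a pool built on SynchronousQueue hands a task to an idle worker when one is waiting in getTask, and otherwise falls through to creating a thread or rejecting.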
