AQS: The Wait Queue in Detail (AbstractQueuedSynchronizer)

Helloworld先生

Category: java. Tags: java, aqs, AbstractQueuedSynchronizer

Article link: https://blog.csdn.net/u010841296/article/details/88625517

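The AQS wait queue: a doubly linked list that borrows the ideas of the CLH lock

I. The CLH lock

II. The Node data structure: wait status, current thread, and references to the previous and next nodes

III. How a Node is enqueued

acquire(int arg): acquiring the resource

acquireQueued(final Node node, int arg): acquiring the resource from within the queue

shouldParkAfterFailedAcquire(Node pred, Node node): checking the predecessor's status

Why the loop do { node.prev = pred = pred.prev; } while (pred.waitStatus > 0); in shouldParkAfterFailedAcquire is safe under concurrency

IV. How a Node is dequeued

release(int arg)

unparkSuccessor(Node node)

Why does unparkSuccessor traverse backwards from tail to find the next node to wake?

V. Node cancellation

cancelAcquire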

The AQS wait queue: a doubly linked list that borrows the ideas of the CLH lock

I. The CLH lock

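(1) The CLH lock is a spin lock that guarantees freedom from starvation and grants the lock in first-come, first-served order.

(2) It is a scalable, high-performance, fair spin lock based on a linked list. A requesting thread spins only on a local variable: it keeps polling the state of its predecessor and stops spinning as soon as it sees that the predecessor has released the lock.

(3) In AQS, each node in the wait queue watches the status of its predecessor in a similar way. To avoid wasting CPU, however, AQS parks the thread (puts it into the waiting state) instead of letting it spin indefinitely. Polling is therefore not the only way to obtain the resource: when the head node releases the resource, it also actively wakes its waiting successor.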
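To make the description of the CLH lock concrete, here is a minimal sketch of a textbook-style CLH spin lock. It is not code from AQS: the class name `ClhLock`, the `QNode` type, and the use of `Thread.yield()` inside the spin loop are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative CLH spin lock; ClhLock and QNode are hypothetical names.
final class ClhLock {
    private static final class QNode {
        volatile boolean locked; // true while the owner holds or wants the lock
    }

    // tail always points at the node of the most recent requester
    private final AtomicReference<QNode> tail = new AtomicReference<>(new QNode());
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);
    private final ThreadLocal<QNode> myPred = new ThreadLocal<>();

    void lock() {
        QNode node = myNode.get();
        node.locked = true;
        // Atomically append our node and learn who is ahead of us (FIFO order).
        QNode pred = tail.getAndSet(node);
        myPred.set(pred);
        // Spin only on the predecessor's flag.
        while (pred.locked) {
            Thread.yield(); // assumption: yield keeps the demo friendly to small machines
        }
    }

    void unlock() {
        QNode node = myNode.get();
        node.locked = false;      // lets the successor stop spinning
        myNode.set(myPred.get()); // recycle the predecessor's node for the next lock()
    }
}
```

Each thread spins on a different node, which is the property item (2) describes. AQS keeps the queue-of-predecessors idea but, as item (3) notes, parks waiting threads rather than spinning.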

II. The Node data structure: wait status, current thread, and references to the previous and next nodes
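The fields named in this heading can be sketched as follows. This is a simplified model based on the JDK 8 `AbstractQueuedSynchronizer.Node` class, not a copy of it: `WaitNode` is a hypothetical name, and shared-mode and condition-queue fields are left out. The status constants use the same values as JDK 8.

```java
// Simplified model of an AQS queue node; WaitNode is a hypothetical name.
final class WaitNode {
    static final int CANCELLED = 1;  // thread gave up waiting (the only status > 0)
    static final int SIGNAL = -1;    // successor needs to be unparked on release
    static final int CONDITION = -2; // node sits on a condition queue
    static final int PROPAGATE = -3; // shared release should propagate

    volatile int waitStatus;   // 0 for a freshly enqueued node
    volatile WaitNode prev;    // predecessor; set before the node is published
    volatile WaitNode next;    // successor; only a hint, may lag behind
    volatile Thread thread;    // the thread waiting in this node

    WaitNode(Thread thread) {
        this.thread = thread;
    }

    boolean isCancelled() {
        return waitStatus > 0;
    }
}
```

Because CANCELLED is the only positive value, the JDK code shown later can test `waitStatus > 0` instead of comparing against the constant.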

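III. How a Node is enqueued

Enqueuing a Node can be broken down into the following steps:

(1) In acquire(int arg), if tryAcquire(arg) fails to obtain the resource, addWaiter(Node.EXCLUSIVE) wraps the current thread in a Node and appends it to the tail of the queue. (The code shown here is the exclusive-mode path.)

(2) acquireQueued is then called with that Node. Inside acquireQueued, the thread loops on its own node, repeatedly checking the status of its predecessor.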

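acquire(int arg): acquiring the resource

1. AQS maintains the wait queue with two fields: head and tail.

2. acquire first tries to obtain the resource through tryAcquire. If that fails, it constructs a node and puts it into the wait queue.

3. In addWaiter, AQS first makes one quick CAS attempt to install the node as tail. If that fails, or the queue has not been initialized yet, it falls back to enq, which retries in an infinite loop until the node becomes tail.

(1) The CAS guarantees that tail is updated correctly under concurrency.

(2) node.prev is assigned before the CAS, so it is already valid by the time the node becomes reachable from tail. pred.next is assigned only after the CAS succeeds, so for a brief window it may still be null.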

    public final void acquire(int arg) {
        if (!tryAcquire(arg) &&
            acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
            selfInterrupt();
    }

    private Node addWaiter(Node mode) {
        Node node = new Node(Thread.currentThread(), mode);
        // Try the fast path of enq; backup to full enq on failure
        Node pred = tail;
        if (pred != null) {
            node.prev = pred;
            if (compareAndSetTail(pred, node)) {
                pred.next = node;
                return node;
            }
        }
        enq(node);
        return node;
    }

    private Node enq(final Node node) {
        for (;;) {
            Node t = tail;
            if (t == null) { // Must initialize
                if (compareAndSetHead(new Node()))
                    tail = head;
            } else {
                node.prev = t;
                if (compareAndSetTail(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }
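The code above only calls tryAcquire; a concrete synchronizer has to supply it. As a minimal sketch, assuming a non-reentrant mutex where state 0 means free and 1 means held, a subclass might look like this. `SimpleMutex` is a hypothetical name; the AQS methods it calls (`compareAndSetState`, `setExclusiveOwnerThread`, `acquire`, `release`) are the real ones.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Illustrative non-reentrant mutex on top of AQS; SimpleMutex is a hypothetical name.
final class SimpleMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int arg) {
            // state: 0 = free, 1 = held. A single CAS decides the winner.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false; // acquire() will now call addWaiter and acquireQueued
        }

        @Override
        protected boolean tryRelease(int arg) {
            if (getState() == 0) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);
            return true; // release() will now wake the successor of head
        }
    }

    private final Sync sync = new Sync();

    void lock() { sync.acquire(1); }

    boolean tryLock() { return sync.tryAcquire(1); }

    void unlock() { sync.release(1); }
}
```

A thread that loses the CAS in tryAcquire goes through exactly the addWaiter and enq path shown above.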

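acquireQueued(final Node node, int arg): acquiring the resource from within the queue

Once the node has been enqueued, acquireQueued loops, checking the predecessor's status or parking the thread, until the resource is obtained. There are three branches:

(1) If the node's predecessor is the head node, the node is next in line as soon as the thread holding the resource releases it, so it can call tryAcquire to try to obtain the resource.

(2) If (1) does not succeed, shouldParkAfterFailedAcquire checks the node's predecessor (that method has further logic of its own, explained later). Once the predecessor's status is SIGNAL, parkAndCheckInterrupt parks the node's thread.

(3) If the loop exits abnormally, the finally block cancels the node. In this method that means tryAcquire threw an exception; in the interruptible variants it can also be an interrupt. Cancellation itself is not covered here.

The boolean return value tells the caller whether the thread was interrupted while it was waiting for the resource.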


    final boolean acquireQueued(final Node node, int arg) {
        boolean failed = true;
        try {
            boolean interrupted = false;
            for (;;) {
                final Node p = node.predecessor();
                // The predecessor is head, so try to acquire the resource
                if (p == head && tryAcquire(arg)) {
                    setHead(node);
                    p.next = null; // help GC
                    failed = false;
                    return interrupted;
                }
                // Check the predecessor's status and park the thread if appropriate
                if (shouldParkAfterFailedAcquire(p, node) &&
                    parkAndCheckInterrupt())
                    interrupted = true;
            }
        } finally {
            if (failed)
                cancelAcquire(node);
        }
    }
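The effect of the returned boolean can be observed from outside. The sketch below uses `ReentrantLock`, which is built on AQS, to interrupt a thread while it is queued: `lock()` does not throw, but once the thread finally gets the lock its interrupt flag is set again (the selfInterrupt() call in acquire). The class name `InterruptWhileQueuedDemo` and its `run` method are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative demo; InterruptWhileQueuedDemo is a hypothetical name.
final class InterruptWhileQueuedDemo {
    // Returns true if the waiter saw its interrupt flag set after acquiring.
    static boolean run() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        boolean[] sawInterrupt = {false};

        lock.lock(); // main thread holds the lock so the waiter must queue
        Thread waiter = new Thread(() -> {
            lock.lock(); // uninterruptible acquire: waits in the AQS queue
            try {
                sawInterrupt[0] = Thread.currentThread().isInterrupted();
            } finally {
                lock.unlock();
            }
        });
        try {
            waiter.start();
            // Wait until the waiter's node is actually in the queue.
            while (!lock.hasQueuedThread(waiter)) {
                Thread.yield();
            }
            waiter.interrupt(); // does not abort the wait, only gets remembered
        } finally {
            lock.unlock(); // release wakes the queued waiter
        }
        waiter.join();
        return sawInterrupt[0];
    }
}
```

If the interrupt should abort the wait instead, `lockInterruptibly()` is the variant that throws InterruptedException.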

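shouldParkAfterFailedAcquire(Node pred, Node node): checking the predecessor's status

(1) ws == SIGNAL: return true, and the caller parks the thread.
(2) ws > 0: the predecessor has been cancelled (CANCELLED = 1). Follow the prev pointers backwards, skipping the run of consecutive cancelled nodes, until a node with ws <= 0 is found, then set that node's next pointer to node. At that point node.prev refers to a node with ws <= 0, and the cancelled nodes in between are no longer linked into the queue as seen from node.
(3) Otherwise (ws is 0 or PROPAGATE): CAS the predecessor's status to SIGNAL and return false, so that the caller retries the acquire once more before parking.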

    private static boolean shouldParkAfterFailedAcquire(Node pred, Node node) {
        int ws = pred.waitStatus;
        if (ws == Node.SIGNAL)
            /*
             * This node has already set status asking a release
             * to signal it, so it can safely park.
             */
            return true;
        if (ws > 0) {
            /*
             * Predecessor was cancelled. Skip over predecessors and
             * indicate retry.
             */
            do {
                node.prev = pred = pred.prev;
            } while (pred.waitStatus > 0);
            pred.next = node;
        } else {
            /*
             * waitStatus must be 0 or PROPAGATE.  Indicate that we
             * need a signal, but don't park yet.  Caller will need to
             * retry to make sure it cannot acquire before parking.
             */
            compareAndSetWaitStatus(pred, ws, Node.SIGNAL);
        }
        return false;
    }

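Why the loop do { node.prev = pred = pred.prev; } while (pred.waitStatus > 0); in shouldParkAfterFailedAcquire is safe under concurrency:

(1) node.prev is only ever assigned by the thread that owns node, so there is no write race on node.prev = pred = pred.prev.
(2) A cancelled node never returns to a normal state, so skipping it is always safe.
(3) Nodes enqueued later are irrelevant to the current node: enq only touches the new node's prev pointer and the AQS tail reference, never the prev pointer of a node that is already in the queue.
(4) If a predecessor that was not cancelled becomes cancelled while the current node is executing the loop, nothing breaks, because the loop is only an optimization that shortens the chain.
(5) Whenever pred.waitStatus > 0 holds, pred.prev cannot be null. Only head has a null prev, and the comment on the head field states: "If head exists, its waitStatus is guaranteed not to be CANCELLED". CANCELLED (1) is the only status greater than 0, so when the backward traversal reaches head the condition pred.waitStatus > 0 is false and the loop stops.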
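The pointer manipulation discussed in the points above can be tried out on a plain, single-threaded model of the queue. In this sketch `SkipCancelledDemo`, `N`, and `skipCancelledPredecessors` are hypothetical names; only the loop body mirrors the JDK code.

```java
// Single-threaded model of the "skip cancelled predecessors" step.
// SkipCancelledDemo and N are hypothetical names.
final class SkipCancelledDemo {
    static final class N {
        final String name;
        volatile int waitStatus; // > 0 means cancelled, as in AQS
        volatile N prev, next;

        N(String name, int waitStatus, N prev) {
            this.name = name;
            this.waitStatus = waitStatus;
            this.prev = prev;
            if (prev != null) {
                prev.next = this;
            }
        }
    }

    // Relinks node to its nearest non-cancelled predecessor and returns it.
    // Relies on the invariant that head is never cancelled, so the loop
    // always stops before running off the front of the list.
    static N skipCancelledPredecessors(N node) {
        N pred = node.prev;
        while (pred.waitStatus > 0) {
            node.prev = pred = pred.prev;
        }
        pred.next = node;
        return pred;
    }
}
```

For a queue head <- a <- b <- c in which a and b are cancelled, calling skipCancelledPredecessors(c) leaves c.prev and head.next pointing at each other.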

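IV. How a Node is dequeued

Dequeuing a node takes two steps:

(1) Release the state: tryRelease(int arg)

(2) Wake the successor: unparkSuccessor(Node node)

release(int arg)

1. First release the state. tryRelease is supplied by the implementer of the synchronizer and is expected to undo the corresponding change to state.

2. If the release succeeds, unparkSuccessor wakes the successor of head, because that successor may have parked.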

    public final boolean release(int arg) {
        if (tryRelease(arg)) {
            Node h = head;
            if (h != null && h.waitStatus != 0)
                unparkSuccessor(h);
            return true;
        }
        return false;
    }

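unparkSuccessor(Node node)

(1) If node's status is negative, reset it to 0 with a CAS, which clears the pending signal.
(2) Check whether node's successor (next) is valid. If it is null or cancelled, traverse backwards from tail and keep the valid node that is closest to node.
Why the search runs backwards from tail has to do with how the prev and next pointers actually behave in the wait queue; the details follow in the next part.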

    private void unparkSuccessor(Node node) {
        /*
         * If status is negative (i.e., possibly needing signal) try
         * to clear in anticipation of signalling.  It is OK if this
         * fails or if status is changed by waiting thread.
         */
        int ws = node.waitStatus;
        if (ws < 0)
            compareAndSetWaitStatus(node, ws, 0);

        /*
         * Thread to unpark is held in successor, which is normally
         * just the next node.  But if cancelled or apparently null,
         * traverse backwards from tail to find the actual
         * non-cancelled successor.
         */
        Node s = node.next;
        if (s == null || s.waitStatus > 0) {
            s = null;
            for (Node t = tail; t != null && t != node; t = t.prev)
                if (t.waitStatus <= 0)
                    s = t;
        }
        if (s != null)
            LockSupport.unpark(s.thread);
    }

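Why does unparkSuccessor traverse backwards from tail to find the next node to wake?

(1) The AQS paper has a passage explaining why, when the next node is missing or cancelled, the search for the thread to wake starts from the tail of the list and scans backwards: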

An AbstractQueuedSynchronizer queue node contains a next link to its successor. But because there are no applicable techniques for lock-free atomic insertion of double-linked list nodes using compareAndSet, this link is not atomically set as part of insertion; it is simply assigned: pred.next = node; after the insertion. This is reflected in all usages. The next link is treated only as an optimized path. If a node's successor does not appear to exist (or appears to be cancelled) via its next field, it is always possible to start at the tail of the list and traverse backwards using the pred field to accurately check if there really is one.

In other words, next is written with a plain assignment after the CAS on tail has already published the node, so it is only an optimization hint. When next is null or points to a cancelled node, walking back from tail through the prev links (the "pred field" in the paper) is what reliably tells whether a successor really exists.

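(2) cancelAcquire also shows that the next pointer is unreliable:

When a node is cancelled, its status is set to CANCELLED. If node == tail at that moment (the cancelled node is the tail), cancelAcquire calls compareAndSetNext(pred, predNext, null) to set pred.next = null. If a new thread enqueues right then, there is a window between its successful CAS on tail and the following t.next = node assignment in which the old tail t still has next == null even though tail has already moved on. In that window unparkSuccessor sees node.next == null and, if it trusted next, would wrongly conclude that no thread is waiting in the queue.

When a node is cancelled and its status set to CANCELLED, the end of cancelAcquire may also execute node.next = node;. If the node's predecessor then calls unparkSuccessor and, on finding node cancelled, followed next pointers forwards instead of prev pointers backwards, it would loop forever, because node.next refers to node itself.

The prev pointer, by contrast, remains reliable: the changes cancelAcquire makes to prev only tidy the queue by skipping runs of consecutive cancelled nodes, so a backward traversal still covers the whole queue.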
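Both failure modes of next described above can be reproduced on a single-threaded model. The sketch below copies the search logic of unparkSuccessor into a hypothetical `findSuccessor` method (`SuccessorSearchDemo` and `N` are also made-up names) and returns the node that would be unparked instead of unparking it.

```java
// Single-threaded model of the successor search in unparkSuccessor.
// SuccessorSearchDemo, N, and findSuccessor are hypothetical names.
final class SuccessorSearchDemo {
    static final class N {
        final String name;
        volatile int waitStatus; // > 0 means cancelled
        volatile N prev, next;

        N(String name) {
            this.name = name;
        }
    }

    // Returns the node whose thread would be unparked, or null if none.
    static N findSuccessor(N node, N tail) {
        N s = node.next; // fast path: trust next only if it looks usable
        if (s == null || s.waitStatus > 0) {
            s = null;
            // Slow path: walk prev links from tail; the last match found
            // is the valid node closest to `node`.
            for (N t = tail; t != null && t != node; t = t.prev) {
                if (t.waitStatus <= 0) {
                    s = t;
                }
            }
        }
        return s;
    }
}
```

With prev links intact, the search finds the right node both when next has not been assigned yet and when a cancelled node points next at itself.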