ConcurrentSkipListMap Source Code Analysis

Latest recommended article published on 2024-08-04 11:07:52

SolitudeCoding

Tags: skiplist, java, data structures, jdk

Article link: https://blog.csdn.net/palkia1998/article/details/135660874

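The previous article introduced the SkipList (skip list) data structure. The JDK also uses skip lists to build concurrent containers, and this article walks through the source code of one of them, ConcurrentSkipListMap.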

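1. ConcurrentSkipListMap and ConcurrentSkipListSet

The JDK contains two skip-list-based data structures, ConcurrentSkipListSet and ConcurrentSkipListMap. Both are concurrent containers; one implements Set and the other implements Map.
ConcurrentSkipListSet holds a ConcurrentSkipListMap internally and stores each element as a key of that map, so it is ultimately backed by ConcurrentSkipListMap as well. This article therefore focuses on the source code of ConcurrentSkipListMap.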
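
To make the delegation concrete, here is a minimal sketch of a set that stores its elements as keys of a ConcurrentSkipListMap. The class name KeyBackedSet and the choice of Boolean.TRUE as a placeholder value are assumptions made for illustration; this is not the JDK's ConcurrentSkipListSet, only the same idea in miniature.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical wrapper: every set element is a key of the backing map.
final class KeyBackedSet<E> {
    private final ConcurrentSkipListMap<E, Boolean> m = new ConcurrentSkipListMap<>();

    // putIfAbsent returns null only when the key was not present before.
    boolean add(E e) {
        return m.putIfAbsent(e, Boolean.TRUE) == null;
    }

    boolean contains(Object o) {
        return m.containsKey(o);
    }

    // Two-argument remove succeeds only if the key maps to the placeholder.
    boolean remove(Object o) {
        return m.remove(o, Boolean.TRUE);
    }

    // The smallest element, because the backing map keeps keys sorted.
    E first() {
        return m.firstKey();
    }

    int size() {
        return m.size();
    }
}
```

Because the map keeps its keys sorted and unique, the wrapper gets ordering and duplicate rejection without any extra code.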

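2. Comparison with ConcurrentHashMap

Both ConcurrentSkipListMap and ConcurrentHashMap are thread-safe containers suitable for concurrent reads and writes.
ConcurrentHashMap is built on a hash table, while ConcurrentSkipListMap is built on a skip list. A skip list keeps its keys in sorted order, so ConcurrentSkipListMap is a sorted map as well, whereas ConcurrentHashMap makes no ordering guarantee.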
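
The practical consequence of the ordering can be seen in a short example. It only uses public methods of ConcurrentSkipListMap; the class and method names (SortedMapDemo, sample, keysInIterationOrder) are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

final class SortedMapDemo {
    // Builds a small map; the insertion order is deliberately unsorted.
    static ConcurrentSkipListMap<Integer, String> sample() {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(30, "c");
        map.put(10, "a");
        map.put(20, "b");
        return map;
    }

    // Iteration follows key order, not insertion order.
    static List<Integer> keysInIterationOrder() {
        return new ArrayList<>(sample().keySet());
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> map = sample();
        System.out.println(keysInIterationOrder()); // [10, 20, 30]
        System.out.println(map.firstKey());         // 10
        System.out.println(map.ceilingKey(15));     // 20 (smallest key >= 15)
        System.out.println(map.headMap(20));        // {10=a} (keys strictly below 20)
    }
}
```

Navigation methods such as firstKey, ceilingKey, and headMap are available only because the keys are kept sorted; ConcurrentHashMap has no equivalent.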

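3. The lock-free deletion algorithm used by ConcurrentSkipListMap

The base list uses a variant of the HM linked ordered set algorithm (see Tim Harris, “A pragmatic implementation of non-blocking linked lists”). The basic idea is to mark the next pointer of a node that is being deleted, so that a concurrent insertion cannot link a new node after it. The source comments also mention the alternative of keeping a mark bit with AtomicMarkableReference, but the author considers that slow and space-intensive and uses directly CAS-able next pointers instead.

Deletion itself is lazy: a node whose value is null is treated as logically deleted.

Let n be the node to delete, b its predecessor, and f its successor.

Step 1: CAS n's value to null. From this point on the node is logically deleted, and insertions, lookups, and other operations ignore it.

Step 2: CAS n's next pointer to a new marker node (a node whose value field points to itself) whose next is f. After this, no operation can append a node directly after n.

Step 3: CAS b's next pointer from n to f, which unlinks n together with its marker and completes the deletion.

If step 1 fails, the operation retries. A failure in step 2 or step 3 does not affect other operations: n's value is already null, so they ignore the node and it is effectively deleted. Other operations also help with deletion: when they encounter a node whose value is null, they carry out step 2 or step 3 on its behalf.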
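
The three steps can be sketched on a stripped-down node type. This is an illustrative, single-threaded walk-through that uses AtomicReference fields in place of the JDK's internal CAS helpers. The names DeletionSketch, delete, and isMarker are hypothetical, and the sketch leaves out the retry and helping logic of the real class.

```java
import java.util.concurrent.atomic.AtomicReference;

final class DeletionSketch {
    static final class Node {
        final String key;
        final AtomicReference<Object> value;
        final AtomicReference<Node> next;

        // Regular node.
        Node(String key, Object value, Node next) {
            this.key = key;
            this.value = new AtomicReference<>(value);
            this.next = new AtomicReference<>(next);
        }

        // Marker node: no key, and its value refers to the marker itself.
        Node(Node next) {
            this.key = null;
            this.value = new AtomicReference<>();
            this.value.set(this);
            this.next = new AtomicReference<>(next);
        }

        boolean isMarker() {
            return value.get() == this;
        }
    }

    // Deletes n, whose predecessor is b. Returns false if n was already
    // deleted or another thread won the race on step 1.
    static boolean delete(Node b, Node n) {
        Node f = n.next.get();
        Object v = n.value.get();
        if (v == null || !n.value.compareAndSet(v, null)) // step 1: logical delete
            return false;
        if (n.next.compareAndSet(f, new Node(f)))         // step 2: append marker
            b.next.compareAndSet(n, f);                   // step 3: unlink n and marker
        return true;
    }

    public static void main(String[] args) {
        Node f = new Node("c", 3, null);
        Node n = new Node("b", 2, f);
        Node b = new Node("a", 1, n);
        System.out.println(delete(b, n));              // true
        System.out.println(n.next.get().isMarker());   // true: marker sits between n and f
        System.out.println(b.next.get() == f);         // true: b now skips n
    }
}
```

In this sketch a failed step 2 or step 3 is simply left alone, which mirrors the point made above: once the value is null the node already counts as deleted, and finishing the unlinking can be left to whichever operation passes by next.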

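4. Source code analysis

This section walks through the main operations of ConcurrentSkipListMap (lookup, insertion, and deletion), with detailed comments added to the original source to make it easier to follow. The listings appear to correspond to the JDK 8 implementation.

4.1 Basic data structures

Node (base-level node) and Index (index node):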

static final class Node<K,V> {
    final K key;
    volatile Object value;
    volatile Node<K,V> next;

    /**
        * Creates a new regular node.
        */
    Node(K key, Object value, Node<K,V> next) {
        this.key = key;
        this.value = value;
        this.next = next;
    }
}

static class Index<K,V> {
    final Node<K,V> node;
    final Index<K,V> down;
    volatile Index<K,V> right;

    /**
     * Creates index node with given values.
     */
    Index(Node<K,V> node, Index<K,V> down, Index<K,V> right) {
        this.node = node;
        this.down = down;
        this.right = right;
    }
}

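A Node is an element of the base-level linked list and holds the actual key and value. Index nodes build the upper levels of the skip list on top of those nodes: each one holds the base-level node it refers to (node), the index node one level below (down), and the index node to its right (right). As in an ordinary skip list, the whole structure is implemented purely with linked lists.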

/* ---------------- Head nodes -------------- */

/**
 * Nodes heading each level keep track of their level.
 */
static final class HeadIndex<K,V> extends Index<K,V> {
    final int level;
    HeadIndex(Node<K,V> node, Index<K,V> down, Index<K,V> right, int level) {
        super(node, down, right);
        this.level = level;
    }
}

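HeadIndex is the head of one index level. Compared with an ordinary index node, it adds a level field that records which level it heads.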

/**
 * Initializes or resets state. Needed by constructors, clone,
 * clear, readObject. and ConcurrentSkipListSet.clone.
 * (Note that comparator must be separately initialized.)
 */
private void initialize() {
    keySet = null;
    entrySet = null;
    values = null;
    descendingMap = null;
    head = new HeadIndex<K,V>(new Node<K,V>(null, BASE_HEADER, null),
                                null, null, 1);
}

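This initializes the whole skip list. The head is a level-1 HeadIndex whose down and right pointers are both null, and whose base-level node has a null key and the special BASE_HEADER object as its value.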

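4.2 Meaning of the local variables in the code

To make the linked-list manipulation easier to follow, the author uses consistent local variable names throughout the code. Two groups appear most often:

b, n, f: many operations work on a (predecessor, current node, successor) triple, where b is the predecessor, n is the current node, and f is the successor (b -> n -> f).
q, r, d: for operations that move between levels, q is the current index node, r is the index node to the right of q, and d is the index node one level below q.

The full notation guide from the source is: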

Notation guide for local variables
* Node:         b, n, f    for predecessor, node, successor
* Index:        q, r, d    for index node, right, down
*               t          for another index node
* Head:         h
* Levels:       j
* Keys:         k, key
* Values:       v, value
* Comparisons:  c

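4.3 Common operations

These helpers are used frequently by the insertion, deletion, and lookup operations. The main ones are:

findPredecessor: uses the index levels to find a base-level node whose key is strictly less than the given key. As in an ordinary skip list, the search starts at the head index; callers then walk the base-level list from the returned node to reach the exact position.
helpDelete: helps finish a deletion. When an operation finds a node that has already entered the deletion steps described above, it helps remove that node. The main operations of ConcurrentSkipListMap (insertion, deletion, lookup) all call helpDelete, which is what makes the lazy deletion strategy work.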

/* ---------------- Traversal -------------- */

/**
 * Returns a base-level node with key strictly less than given key,
 * or the base-level header if there is no such node.  Also
 * unlinks indexes to deleted nodes found along the way.  Callers
 * rely on this side-effect of clearing indices to deleted nodes.
 * @param key the key
 * @return a predecessor of key
 */
private Node<K,V> findPredecessor(Object key, Comparator<? super K> cmp) {
    if (key == null)
        throw new NullPointerException(); // don't postpone errors
    for (;;) {
        for (Index<K,V> q = head, r = q.right, d;;) {
            if (r != null) {
                Node<K,V> n = r.node;
                K k = n.key;

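                // A null value means node n has been logically deleted,
                // so it must be skipped (see the deletion algorithm above).
                if (n.value == null) {
                    // Try to CAS-unlink index node r; this is also a form of helping a deletion.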
                    if (!q.unlink(r))
                        break;           // restart
                    
                    // After r is unlinked, reread the new right neighbor of q.
                    r = q.right;         // reread r
                    continue;
                }

                // cpr is the compare helper. cpr > 0 means the target key is greater
                // than the key of node n, so move one step to the right and loop again.
                if (cpr(cmp, key, k) > 0) {
                    q = r;
                    r = r.right;
                    continue;
                }
            }
            // q.down == null means the bottom index level has been reached. The index
            // search is done and q's key is still less than the target key, so return q's node.
            if ((d = q.down) == null)
                return q.node;
                
            // Not at the bottom level yet: move down one level and continue.
            q = d;
            r = d.right;
        }
    }
}

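Compared with an ordinary skip list, findPredecessor adds a check for deleted nodes. The inner loop is also wrapped in an outer for (;;) loop: if the traversal conflicts with a concurrent deletion (the unlink CAS fails), it breaks out of the inner loop only and restarts the search from the head.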
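
Stripped of the deletion checks and CAS retries, the right/down traversal looks like the following sketch. It is a single-threaded simplification on a hand-built two-level list; the class TraversalSketch, its Node and Index types, and the sample data are assumptions for illustration, not the JDK code.

```java
final class TraversalSketch {
    static final class Node {
        final Integer key; // null for the base-level header
        Node next;
        Node(Integer key, Node next) { this.key = key; this.next = next; }
    }

    static final class Index {
        final Node node;
        final Index down;
        Index right;
        Index(Node node, Index down, Index right) {
            this.node = node; this.down = down; this.right = right;
        }
    }

    // Base level: header -> 10 -> 20 -> 30 -> 40.
    // Level 1 indexes 20 and 40; level 2 indexes 20 only.
    static Index sample() {
        Node n40 = new Node(40, null);
        Node n30 = new Node(30, n40);
        Node n20 = new Node(20, n30);
        Node n10 = new Node(10, n20);
        Node header = new Node(null, n10);
        Index i40 = new Index(n40, null, null);
        Index i20 = new Index(n20, null, i40);
        Index head1 = new Index(header, null, i20);
        Index top20 = new Index(n20, i20, null);
        return new Index(header, head1, top20);
    }

    // Go right while the next indexed key is smaller than the target, otherwise go down.
    static Node findPredecessor(Index head, int key) {
        Index q = head, r = q.right;
        for (;;) {
            if (r != null && key > r.node.key) {
                q = r;
                r = r.right;
                continue;
            }
            Index d = q.down;
            if (d == null)
                return q.node;
            q = d;
            r = d.right;
        }
    }

    public static void main(String[] args) {
        Index head = sample();
        System.out.println(findPredecessor(head, 35).key); // 20
        System.out.println(findPredecessor(head, 45).key); // 40
        System.out.println(findPredecessor(head, 5).key);  // null (the header)
    }
}
```

Note that for key 35 the result is node 20, not node 30: node 30 has no index node, so the index search cannot land on it. The returned node is only guaranteed to come before the key, and the caller finishes the search by walking the base-level list.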

void helpDelete(Node<K,V> b, Node<K,V> f) {
    /*
        * Rechecking links and then doing only one of the
        * help-out stages per call tends to minimize CAS
        * interference among helping threads.
        */
    if (f == next && this == b.next) {
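        // f is null or f is not yet a marker node: append a marker after this node.
        if (f == null || f.value != f) // not already marked
            casNext(f, new Node<K,V>(f));
        // f is already a marker: CAS b.next from this node to f.next,
        // which unlinks both this node and its marker.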
        else
            b.casNext(this, f.next);
    }
}

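helpDelete carries out the deletion algorithm described above. Because it runs inside the main operations such as insertion and deletion, the source comment notes that each call rechecks the links and then performs only one of the helping steps, which minimizes CAS interference among helping threads.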

4.4 get(doGet)

/**
     * Gets value for key. Almost the same as findNode, but returns
     * the found value (to avoid retries during re-reads)
     *
     * @param key the key
     * @return the value, or null if absent
     */
    private V doGet(Object key) {
        if (key == null)
            throw new NullPointerException();
        Comparator<? super K> cmp = comparator;
        outer: for (;;) {
            for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
                Object v; int c;

                // n == null means b is the last node: there is nothing left to search,
                // so leave the outer loop (and return null).
                if (n == null)
                    break outer;
                
                // f is the successor of n.
                Node<K,V> f = n.next;

                // If n is no longer b's successor, a concurrent operation has changed the list.
                // Break out of the inner loop and search for the predecessor again.
                if (n != b.next)                // inconsistent read
                    break;

                // If n's value is null, n has been logically deleted. Help delete it,
                // then break out of the inner loop and search for the predecessor again.
                if ((v = n.value) == null) {    // n is deleted
                    n.helpDelete(b, f);
                    break;
                }

                // If b's value is null, b has been deleted; if v == n, n is a marker node,
                // which also means b has been deleted. Break out and search again.
                if (b.value == null || v == n)  // b is deleted
                    break;
                
                // cpr is the compare helper: compare the target key with n's key.
                // If they are equal the key has been found, so return its value.
                if ((c = cpr(cmp, key, n.key)) == 0) {
                    @SuppressWarnings("unchecked") V vv = (V)v;
                    return vv;
                }

                // c < 0 means n's key is greater than the target key: the key is absent.
                if (c < 0)
                    break outer;

                // Reaching here means c is neither 0 nor negative, so c > 0: n's key is
                // less than the target key. Move one node to the right and loop again.
                b = n;
                n = f;
            }
        }
        return null;
    }

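doGet first uses findPredecessor to locate a base-level node before the target key, then walks the base-level list from there, checking at each step whether the nodes involved have been deleted. Because concurrent operations may change the structure at any time, the author uses loops that recheck the links and retry from findPredecessor whenever an inconsistency is detected.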

4.5 put(doPut)

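The first part of doPut (finding the predecessor, checking for deleted nodes, and comparing keys) is similar to doGet, so comments are only added to the parts that differ. Because doPut is long, it is analyzed in three parts that follow the logic of the code.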

/* ---------------- Insertion -------------- */

/**
 * Main insertion method.  Adds element if not present, or
 * replaces value if present and onlyIfAbsent is false.
 * @param key the key
 * @param value the value that must be associated with key
 * @param onlyIfAbsent if should not insert if already present
 * @return the old value, or null if newly inserted
 */
private V doPut(K key, V value, boolean onlyIfAbsent) {
    Node<K,V> z;             // added node
    if (key == null)
        throw new NullPointerException();
    Comparator<? super K> cmp = comparator;
    outer: for (;;) {
        for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
            if (n != null) {
                Object v; int c;
                Node<K,V> f = n.next;
                if (n != b.next)               // inconsistent read
                    break;
                if ((v = n.value) == null) {   // n is deleted
                    n.helpDelete(b, f);
                    break;
                }
                if (b.value == null || v == n) // b is deleted
                    break;
                if ((c = cpr(cmp, key, n.key)) > 0) {
                    b = n;
                    n = f;
                    continue;
                }

                // Found a node whose key equals the key being inserted.
                if (c == 0) {

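                    // onlyIfAbsent controls whether the value of an existing key is replaced.
                    // If true, the condition short-circuits and the existing value is returned
                    // unchanged; if false, CAS in the new value and return the old one.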
                    if (onlyIfAbsent || n.casValue(v, value)) {
                        @SuppressWarnings("unchecked") V vv = (V)v;
                        return vv;
                    }

                    // The CAS lost a race: break out of the inner loop and retry.
                    break; // restart if lost race to replace value
                }
                // else c < 0; fall through
            }

            // Create the new node to insert, with n as its successor.
            z = new Node<K,V>(key, value, n);

            // CAS b.next from n to the new node z.
            if (!b.casNext(n, z))
                break;         // restart if lost race to append to b
            break outer;
        }
    }

When this part completes, the insertion position has been found and the new base-level node z has been linked into the bottom list.

    // Each thread draws from ThreadLocalRandom; the random number decides how many index levels to build.
    int rnd = ThreadLocalRandom.nextSecondarySeed();

    if ((rnd & 0x80000001) == 0) { // test highest and lowest bits
        int level = 1, max;

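        // Only when the highest and lowest bits are both 0 (probability 1/4) does the node get an index.
        // The level is 1 plus the number of consecutive 1 bits starting at bit 1:
        // each extra level needs one more 1 bit, so, given that the node is indexed,
        // the level is 1 with probability 1/2, 2 with probability 1/4, 3 with 1/8, ...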
        while (((rnd >>>= 1) & 1) != 0)
            ++level;
        Index<K,V> idx = null;
        HeadIndex<K,V> h = head;

        // The random level does not exceed the current maximum level.
        if (level <= (max = h.level)) {
            for (int i = 1; i <= level; ++i)
                // Build the Index nodes from the bottom up: Index(node, down, right).
                // The loop sets node and down; the right pointers are not set yet.
                idx = new Index<K,V>(z, idx, null);
        }
        
        // The random level exceeds the current maximum level.
        else { // try to grow by one level
            // Grow by at most one level: the new level is the old maximum plus one.
            level = max + 1; // hold in array and later pick the one to use
            
            // Create level Index nodes (slot 0 of the array is unused).
            @SuppressWarnings("unchecked")Index<K,V>[] idxs =
                (Index<K,V>[])new Index<?,?>[level+1];
            for (int i = 1; i <= level; ++i)
                // Build the Index nodes from the bottom up: Index(node, down, right).
                // The loop sets node and down; the right pointers are not set yet.
                idxs[i] = idx = new Index<K,V>(z, idx, null);
            
            for (;;) {
                h = head;
                int oldLevel = h.level;
                
                // Another thread's put has already raised the head level, so level no longer exceeds it: stop.
                if (level <= oldLevel) // lost race to add level
                    break;
                HeadIndex<K,V> newh = h;
                Node<K,V> oldbase = h.node;
                
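                // Create the HeadIndex for each level above the old maximum and point its
                // right at the new Index node created above. After this step, the levels
                // above the old maximum are fully built and linked to their head nodes.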
                for (int j = oldLevel+1; j <= level; ++j)
                    newh = new HeadIndex<K,V>(oldbase, newh, idxs[j], j);
                if (casHead(h, newh)) {
                    h = newh;
                    idx = idxs[level = oldLevel];
                    break;
                }
            }
        }

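This part generates a random level for the newly inserted base-level node z and builds its index node for each level. If the level exceeds the current maximum level of the skip list, new head nodes are also created for the extra level and linked to the new index nodes. When this part completes, all index nodes have been created and the part above the previous maximum level is already linked to its head; only the levels at or below the previous maximum remain to be linked.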
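
The bit tests that pick the level can be tried out in isolation. The sketch below copies the two expressions from the listing into a hypothetical helper, indexLevel, which takes the random number as a parameter instead of drawing it from ThreadLocalRandom, and returns 0 when no index nodes would be built.

```java
final class RandomLevelSketch {
    // Returns 0 if the node gets no index, otherwise the number of index levels.
    static int indexLevel(int rnd) {
        if ((rnd & 0x80000001) != 0) // highest or lowest bit set: base-level node only
            return 0;
        int level = 1;
        while (((rnd >>>= 1) & 1) != 0) // count consecutive 1 bits starting at bit 1
            ++level;
        return level;
    }

    public static void main(String[] args) {
        System.out.println(indexLevel(0));          // 1 (binary 000: no 1 bits after bit 0)
        System.out.println(indexLevel(1));          // 0 (lowest bit set)
        System.out.println(indexLevel(2));          // 2 (binary 010)
        System.out.println(indexLevel(4));          // 1 (binary 100: bit 1 is 0)
        System.out.println(indexLevel(6));          // 3 (binary 110)
        System.out.println(indexLevel(0x80000000)); // 0 (highest bit set)
    }
}
```

In doPut the value computed this way is then capped: if it exceeds the current head level, the list grows by only one level.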

        // Link the index nodes at or below the previous maximum level.
        // find insertion points and splice in
        splice: for (int insertionLevel = level;;) {
            int j = h.level;
            
            // insertionLevel starts at level, the highest level that still needs linking, and moves down one level at a time.
            for (Index<K,V> q = h, r = q.right, t = idx;;) {
                if (q == null || t == null)
                    break splice;
                if (r != null) {
                    Node<K,V> n = r.node;
                    // compare before deletion check avoids needing recheck
                    int c = cpr(cmp, key, n.key);
                    
                    // n's value is null (n has been deleted): help unlink its index node r.
                    if (n.value == null) {
                        if (!q.unlink(r))
                            break;
                        r = q.right;
                        continue;
                    }
                    
                    // c > 0: keep moving right until the target key is no greater than r's key.
                    if (c > 0) {
                        q = r;
                        r = r.right;
                        continue;
                    }
                }
                
                // The current level j is the level to insert at.
                if (j == insertionLevel) {
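                    // t is the new node's index node for this level. q is its
                    // predecessor, and r (if any), whose key is not less than the new
                    // key, becomes its successor. link(r, t) sets t.right to r and then
                    // CASes q.right from r to t, i.e. link(old successor, new successor).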
                    if (!q.link(r, t))
                        break; // restart
                   
                    // The new node was deleted while being inserted: call findNode to clean up, then stop.
                    if (t.node.value == null) {
                        findNode(key);
                        break splice;
                    }
                    
                    // All levels have been linked: the insertion is complete.
                    if (--insertionLevel == 0)
                        break splice;
                }
               
                // Done with this level: move down to prepare for the next one.
                if (--j >= insertionLevel && j < level)
                    t = t.down;
                q = q.down;
                r = q.right;
            }
        }
    }
    return null;
}

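This part links the new index nodes into each level. When it completes, the index nodes have been built, the new head nodes (if any) are connected, and the left and right neighbors at every level are linked, so the whole insertion is finished.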
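
From the caller's side, the onlyIfAbsent flag is the difference between put and putIfAbsent. The short example below only uses the public API to show the return values described in the doPut Javadoc; the class name PutReturnValues and its run method are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

final class PutReturnValues {
    static List<Integer> run() {
        ConcurrentSkipListMap<String, Integer> map = new ConcurrentSkipListMap<>();
        Integer first = map.put("a", 1);         // key absent: inserted, returns null
        Integer second = map.put("a", 2);        // key present: value replaced, returns 1
        Integer third = map.putIfAbsent("a", 3); // key present: value kept, returns 2
        Integer current = map.get("a");          // still 2
        return Arrays.asList(first, second, third, current);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [null, 1, 2, 2]
    }
}
```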

4.6 remove(doRemove)

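The first part of doRemove (finding the predecessor, checking for deleted nodes, and comparing keys) is similar to the other operations, so comments are only added to the parts that differ.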

/* ---------------- Deletion -------------- */

/**
 * Main deletion method. Locates node, nulls value, appends a
 * deletion marker, unlinks predecessor, removes associated index
 * nodes, and possibly reduces head index level.
 *
 * Index nodes are cleared out simply by calling findPredecessor.
 * which unlinks indexes to deleted nodes found along path to key,
 * which will include the indexes to this node.  This is done
 * unconditionally. We can't check beforehand whether there are
 * index nodes because it might be the case that some or all
 * indexes hadn't been inserted yet for this node during initial
 * search for it, and we'd like to ensure lack of garbage
 * retention, so must call to be sure.
 *
 * @param key the key
 * @param value if non-null, the value that must be
 * associated with key
 * @return the node, or null if not found
 */
final V doRemove(Object key, Object value) {
    if (key == null)
        throw new NullPointerException();
    Comparator<? super K> cmp = comparator;
    outer: for (;;) {
        for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
            Object v; int c;
            if (n == null)
                break outer;
            Node<K,V> f = n.next;
            if (n != b.next)                    // inconsistent read
                break;
            if ((v = n.value) == null) {        // n is deleted
                n.helpDelete(b, f);
                break;
            }
            if (b.value == null || v == n)      // b is deleted
                break;
            if ((c = cpr(cmp, key, n.key)) < 0)
                break outer;
            if (c > 0) {
                b = n;
                n = f;
                continue;
            }
            
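            // Reaching here means c is neither negative nor positive, so c == 0: the key is found.
            // A non-null value argument means the node must match both key and value.
            // If the node's current value differs, the removal fails: leave the outer loop.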
            if (value != null && !value.equals(v))
                break outer;
           
            // CAS the node's value to null: step 1 of the deletion algorithm.
            if (!n.casValue(v, null))
                break;
            
            // Append the marker (step 2) and CAS b.next from n to f (step 3).
            // If either fails, call findNode, which helps finish the deletion while searching.
            if (!n.appendMarker(f) || !b.casNext(n, f))
                findNode(key);                  // retry via findNode
            else {
                // The node was unlinked successfully. Call findPredecessor for its
                // side effect of unlinking the index nodes that point to the deleted node.
                findPredecessor(key, cmp);      // clean index
                
                // If the top level has no index nodes left, try to reduce the level.
                if (head.right == null)
                    tryReduceLevel();
            }

            // Reaching here means the removal succeeded: return the old value.
            @SuppressWarnings("unchecked") V vv = (V)v;
            return vv;
        }
    }
    return null;
}

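The first half of doRemove is similar to the other operations: it has to locate the key first. The second half implements the deletion algorithm described above.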
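
The value parameter of doRemove is what distinguishes the two public remove methods. The example below shows the observable behavior through the public API only; the class name RemoveReturnValues and its run method are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

final class RemoveReturnValues {
    static List<Object> run() {
        ConcurrentSkipListMap<String, Integer> map = new ConcurrentSkipListMap<>();
        map.put("a", 1);
        boolean wrongValue = map.remove("a", 2);   // value does not match: nothing removed
        boolean stillThere = map.containsKey("a"); // the entry is untouched
        Integer removed = map.remove("a");         // unconditional remove returns the old value
        Integer again = map.remove("a");           // key is gone now: returns null
        return Arrays.<Object>asList(wrongValue, stillThere, removed, again);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [false, true, 1, null]
    }
}
```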

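5. Summary

ConcurrentSkipListMap relies on CAS operations instead of locks to implement a thread-safe concurrent Map. Its overall structure is similar to a basic skip list, but to stay thread-safe it adds the concurrent deletion algorithm on top of the usual insertion and lookup operations, checks for concurrent conflicts inside each of these operations, and retries with CAS until the operation succeeds.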
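
As a closing illustration, the sketch below has several threads insert disjoint keys into one map without any external locking and then checks the result. The class name ConcurrentInsertDemo, the fill method, and the thread and key counts are arbitrary choices for this example.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentInsertDemo {
    // Each of the given threads inserts perThread distinct keys into a shared map.
    static ConcurrentSkipListMap<Integer, Integer> fill(int threads, int perThread) {
        ConcurrentSkipListMap<Integer, Integer> map = new ConcurrentSkipListMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    int key = i * threads + offset; // interleaved, never shared between threads
                    map.put(key, key);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS))
                throw new IllegalStateException("workers did not finish in time");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, Integer> map = fill(4, 1000);
        System.out.println(map.size());     // 4000
        System.out.println(map.firstKey()); // 0
        System.out.println(map.lastKey());  // 3999
    }
}
```

The threads write interleaved keys, so neighboring nodes are constantly being inserted by different threads; no insert is lost and the keys still come out sorted.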
