Source Code Analysis of ConcurrentHashMap in Java 8

returnTrue999

Article link: https://blog.csdn.net/dap769815768/article/details/96596287

Java architect discussion group: 793825326

Java version: JDK 1.8

IDE: IDEA 18

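ConcurrentHashMap is a collection class in Java's concurrency library: a thread-safe HashMap that implements the ConcurrentMap<K,V> interface. Its basic operations are almost identical to HashMap's. The difference is that it is thread-safe, and in Java 8 it achieves this with a combination of CAS, volatile, and synchronized. Let's walk through the source. Because much of it resembles HashMap, I focus here on the parts that deal with thread safety; for the rest, see my earlier article on the HashMap source.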

1. Write some test code:

ConcurrentHashMap<String,String> map=new ConcurrentHashMap<>();
for (int i=0;i<20000000;i++)
{
    map.put(Integer.toString(i),Integer.toString(i));
}
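
The test above is single-threaded, so it does not exercise the thread safety this article is about. As a minimal sketch, the same loop can be split across several writer threads. The class name ConcurrentPutDemo and the helper fill are made up for this illustration; only ConcurrentHashMap and Thread come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentPutDemo {
    // Starts `threads` writers, each inserting `perThread` distinct keys,
    // waits for all of them, and returns the final size of the map.
    static int fill(int threads, int perThread) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;   // each thread gets its own key range
            workers[t] = new Thread(() -> {
                for (int i = base; i < base + perThread; i++) {
                    map.put(Integer.toString(i), Integer.toString(i));
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fill(4, 50_000));
    }
}
```

Because the key ranges do not overlap, no insertion may be lost, so the returned size should equal threads * perThread. Running the same experiment with a plain HashMap gives no such guarantee.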

2. Step into the put method (it delegates to putVal):

final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }

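Here we see the first difference from HashMap: neither the key nor the value may be null, whereas HashMap places no such restriction. I explain the reason for this restriction later on.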
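
The null check on the first line of putVal is easy to observe from the outside. The following is a small sketch; the class NullHandlingDemo and its helper rejects are names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NullHandlingDemo {
    // Returns true if the map refuses the pair with a NullPointerException.
    static boolean rejects(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(new ConcurrentHashMap<>(), null, "v")); // null key
        System.out.println(rejects(new ConcurrentHashMap<>(), "k", null)); // null value
        System.out.println(rejects(new HashMap<>(), null, null));          // HashMap accepts both
    }
}
```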

3. The first method reached is initTable(). Here is how it is implemented:

    private final Node<K,V>[] initTable() {
        Node<K,V>[] tab; int sc;
        while ((tab = table) == null || tab.length == 0) {
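            // sizeCtl defaults to 0. A negative value means another thread is
            // initializing or resizing the table, so this thread calls
            // Thread.yield() and retries; it spins rather than blocking.
            if ((sc = sizeCtl) < 0)
                Thread.yield(); // lost initialization race; just spin
            // Otherwise CAS sizeCtl to -1 to claim the initialization.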
            else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                try {
                    if ((tab = table) == null || tab.length == 0) {
                        // use the capacity stored in sizeCtl if there is one, otherwise the default of 16
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        @SuppressWarnings("unchecked")
                        Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                        table = tab = nt;
                        sc = n - (n >>> 2);
                    }
                } finally {
                    // publish the next resize threshold (or restore the old value if initialization failed)
                    sizeCtl = sc;
                }
                break;
            }
        }
        return tab;
    }

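This code relies on CAS. It is a while loop governed by the control variable sizeCtl: a negative value means the table is being initialized or resized, so any other thread that sees it yields and retries the loop until the table is ready (it spins rather than truly blocking). When no initialization or resize is in progress, sizeCtl holds the map's current resize threshold. Its value is n minus n shifted right by two bits, i.e. n - n/4 = 0.75n, which matches the default load factor of 0.75.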
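
The same claim-by-CAS pattern can be reproduced without Unsafe. The sketch below is not the JDK code: it assumes an AtomicInteger as a stand-in for the sizeCtl field, and the class name LazyTableSketch and the method threshold() are invented for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyTableSketch {
    private static final int DEFAULT_CAPACITY = 16;

    // Stand-in for sizeCtl: 0 = not initialized, -1 = initializing,
    // positive = requested capacity (before init) or threshold (after init).
    private final AtomicInteger sizeCtl = new AtomicInteger(0);
    private volatile Object[] table;

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            int sc = sizeCtl.get();
            if (sc < 0) {
                Thread.yield();                       // lost the race; spin until the winner is done
            } else if (sizeCtl.compareAndSet(sc, -1)) {
                try {
                    if ((tab = table) == null) {      // re-check after winning the CAS
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        table = tab = new Object[n];
                        sc = n - (n >>> 2);           // 0.75 * n
                    }
                } finally {
                    sizeCtl.set(sc);                  // publish threshold, or restore on failure
                }
                break;
            }
        }
        return tab;
    }

    int threshold() {
        return sizeCtl.get();
    }

    public static void main(String[] args) {
        LazyTableSketch s = new LazyTableSketch();
        System.out.println(s.initTable().length + " / " + s.threshold());
    }
}
```

Only the thread whose compareAndSet succeeds allocates the array. Every other thread either sees the negative marker and yields, or sees the finished table on its next pass through the loop.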

4. Next, the first element is stored:

// Read the bin at this key's index with a volatile read; if it is empty, try to CAS a new node into that slot
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }

//tabAt
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
        return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
    }

// source of casTabAt
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                        Node<K,V> c, Node<K,V> v) {
        return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
    }

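If the CAS succeeds, the loop exits. If another thread happens to insert into the same bin at the same moment, the CAS fails and the thread goes around the loop again.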
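
As a rough illustration of the lock-free insert into an empty bin, the sketch below uses AtomicReferenceArray in place of the Unsafe-based tabAt and casTabAt. The class EmptyBinCasSketch, its Node type, and the method tryInsertIntoEmptyBin are assumptions made for this example and do not appear in the JDK.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class EmptyBinCasSketch {
    static final class Node {
        final String key;
        final String val;
        Node(String key, String val) { this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node> table = new AtomicReferenceArray<>(16);

    // Same bit mixing as ConcurrentHashMap.spread: fold the high bits down, clear the sign bit.
    static int spread(int h) {
        return (h ^ (h >>> 16)) & 0x7fffffff;
    }

    // Installs a node only if the bin is still empty. Returns false when the bin
    // is occupied, in which case the real putVal would loop and take another branch.
    boolean tryInsertIntoEmptyBin(String key, String val) {
        int i = (table.length() - 1) & spread(key.hashCode());
        return table.get(i) == null
            && table.compareAndSet(i, null, new Node(key, val));
    }

    public static void main(String[] args) {
        EmptyBinCasSketch m = new EmptyBinCasSketch();
        System.out.println(m.tryInsertIntoEmptyBin("a", "1"));
        System.out.println(m.tryInsertIntoEmptyBin("a", "2"));
    }
}
```

The first call finds the bin empty and wins the CAS. The second call targets the same bin, finds it occupied, and reports failure without taking any lock.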

5. When adding to an existing chain, the strategy is to lock the bin's first node with synchronized:

V oldVal = null;
synchronized (f) {
    if (tabAt(tab, i) == f) {
        if (fh >= 0) {
            binCount = 1;
            for (Node<K,V> e = f;; ++binCount) {
                K ek;
                if (e.hash == hash &&
                    ((ek = e.key) == key ||
                     (ek != null && key.equals(ek)))) {
                    oldVal = e.val;
                    if (!onlyIfAbsent)
                        e.val = value;
                    break;
                }
                Node<K,V> pred = e;
                if ((e = e.next) == null) {
                    pred.next = new Node<K,V>(hash, key,
                                              value, null);
                    break;
                }
            }
        }
        else if (f instanceof TreeBin) {
            Node<K,V> p;
            binCount = 2;
            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                           value)) != null) {
                oldVal = p.val;
                if (!onlyIfAbsent)
                    p.val = value;
            }
        }
    }
}
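
To make the head-node locking concrete, here is a heavily simplified sketch: a fixed table of four bins, int keys, no resizing, and no trees. BinLockSketch and all of its members are hypothetical names, and the design only mirrors the shape of putVal rather than reproducing it.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class BinLockSketch {
    static final class Node {
        final int key;
        volatile int val;
        volatile Node next;
        Node(int key, int val) { this.key = key; this.val = val; }
    }

    private final AtomicReferenceArray<Node> table = new AtomicReferenceArray<>(4);

    void put(int key, int val) {
        int i = key & (table.length() - 1);
        for (;;) {
            Node f = table.get(i);
            if (f == null) {
                // Empty bin: lock-free CAS, retry the loop if another thread got there first.
                if (table.compareAndSet(i, null, new Node(key, val))) return;
            } else {
                synchronized (f) {                 // lock only this bin's head node
                    if (table.get(i) == f) {       // head unchanged since it was read
                        for (Node e = f;; e = e.next) {
                            if (e.key == key) { e.val = val; return; }
                            if (e.next == null) { e.next = new Node(key, val); return; }
                        }
                    }
                }
            }
        }
    }

    int size() {
        int count = 0;
        for (int i = 0; i < table.length(); i++)
            for (Node e = table.get(i); e != null; e = e.next) count++;
        return count;
    }

    // Fills one sketch map from several threads with disjoint keys and returns its size.
    static int concurrentFill(int threads, int perThread) {
        BinLockSketch map = new BinLockSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int k = base; k < base + perThread; k++) map.put(k, k);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentFill(4, 1000));
    }
}
```

Writers that land in different bins never contend, because each one synchronizes on a different head node. That is the practical benefit of locking per bin instead of locking the whole table.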

6. After adding to the chain, check whether the bin needs to be treeified:

if (binCount != 0) {
    if (binCount >= TREEIFY_THRESHOLD)
        treeifyBin(tab, i);
    if (oldVal != null)
        return oldVal;
    break;
}

As we can see, the treeify threshold is the same as in HashMap: 8. The source of treeifyBin is as follows:

private final void treeifyBin(Node<K,V>[] tab, int index) {
    Node<K,V> b; int n, sc;
    if (tab != null) {
        if ((n = tab.length) < MIN_TREEIFY_CAPACITY)
            tryPresize(n << 1);
        else if ((b = tabAt(tab, index)) != null && b.hash >= 0) {
            synchronized (b) {
                if (tabAt(tab, index) == b) {
                    TreeNode<K,V> hd = null, tl = null;
                    for (Node<K,V> e = b; e != null; e = e.next) {
                        TreeNode<K,V> p =
                            new TreeNode<K,V>(e.hash, e.key, e.val,
                                              null, null);
                        if ((p.prev = tl) == null)
                            hd = p;
                        else
                            tl.next = p;
                        tl = p;
                    }
                    setTabAt(tab, index, new TreeBin<K,V>(hd));
                }
            }
        }
    }
}

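The thread-safety strategy is still synchronized plus Unsafe operations: the bin's head node is locked, tabAt and setTabAt read and write the slot with volatile semantics, and tryPresize updates sizeCtl with CAS. The remaining logic is essentially the same as in HashMap. Likewise, treeification is not performed straight away: the method first checks whether the table length has reached the minimum treeify capacity and, if it has not, resizes the table instead. That minimum is 64, the same as in HashMap.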
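
The two thresholds combine into a three-way decision. The sketch below states it as a pure function; the enum Action and the method afterInsert are invented for this illustration, while the constant values 8 and 64 are the ones discussed above.

```java
class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // binCount: value of binCount after the insert; tableLength: current table size.
    static Action afterInsert(int binCount, int tableLength) {
        if (binCount < TREEIFY_THRESHOLD) return Action.NONE;
        // A long chain in a small table is treated as a sign that the table is too small.
        return (tableLength < MIN_TREEIFY_CAPACITY) ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterInsert(8, 16));
        System.out.println(afterInsert(8, 64));
    }
}
```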

7. Once the data has been stored, addCount() is called to increase the map's element count and to check whether a resize is needed. The resize check is:

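// The while condition is the resize condition, controlled by sizeCtl. While a resize is
// in progress sizeCtl is negative, and other threads that arrive here help with the transfer.
// For how the normal resize condition relates to the load factor, see the summary below.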
while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
       (n = tab.length) < MAXIMUM_CAPACITY) {
    int rs = resizeStamp(n);
    if (sc < 0) {
        if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
            sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
            transferIndex <= 0)
            break;
        if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
            transfer(tab, nt);
    }
    else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                 (rs << RESIZE_STAMP_SHIFT) + 2))
        transfer(tab, null);
    s = sumCount();
}
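
The expressions (rs << RESIZE_STAMP_SHIFT) + 2 and sc + 1 in the loop above are easier to follow with concrete numbers. The sketch below assumes the JDK 8 constant RESIZE_STAMP_BITS = 16 and the JDK 8 definition of resizeStamp; the class ResizeStampSketch and the helpers sizeCtlForFirstResizer and activeResizers are made-up names for this example.

```java
class ResizeStampSketch {
    static final int RESIZE_STAMP_BITS = 16;                       // assumed JDK 8 value
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Stamp identifying a resize of a table of length n; bit 15 is always set,
    // so shifting it left by 16 yields a negative int.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    // Value the first resizing thread CASes into sizeCtl.
    static int sizeCtlForFirstResizer(int n) {
        return (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2;
    }

    // The low 16 bits hold (1 + number of threads currently resizing).
    static int activeResizers(int sizeCtl) {
        return (sizeCtl & 0xffff) - 1;
    }

    public static void main(String[] args) {
        int sc = sizeCtlForFirstResizer(16);
        System.out.println(Integer.toHexString(sc) + " resizers=" + activeResizers(sc));
        System.out.println("after one helper: resizers=" + activeResizers(sc + 1));
    }
}
```

Under these assumptions the high half of sizeCtl identifies which resize is in progress, and the low half counts participants. This is why a helper only needs to CAS sc to sc + 1 before calling transfer.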

Summary of the ConcurrentHashMap source:

1. The constructors differ. With an initial capacity of 10 and a load factor of 0.5, the resulting table size is 32 rather than 16, unlike HashMap.

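2. Neither the key nor the value may be null. If you look closely, you will find that Hashtable, which also supports concurrent access, does not allow null keys or values either. As for the reason, I found the following passage online, reportedly in the author's own words: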

The main reason that nulls aren't allowed in ConcurrentMaps (ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may be just barely tolerable in non-concurrent maps can't be accommodated. The main one is that if map.get(key) returns null, you can't detect whether the key explicitly maps to null vs the key isn't mapped. In a non-concurrent map, you can check this via map.contains(key), but in a concurrent one, the map might have changed between calls

In other words, an ambiguity that is tolerable without concurrency is not tolerable with it. If map.get(key) returns null, you cannot tell whether the key is mapped to null or is not in the map at all. A non-concurrent map lets you settle this with a separate containsKey check, but in a concurrent map the contents may change between the two calls.

What does this mean in practice? Consider the following code:

HashMap<String,String> hashMap=new HashMap<>();
if (hashMap.containsKey("ceshi"))
{
    String str=hashMap.get("ceshi");
}
ConcurrentHashMap<String,String> concurrentHashMap=new ConcurrentHashMap<>(5,0.00001f);
if (concurrentHashMap.containsKey("ceshi")) 
{
    String str = concurrentHashMap.get("ceshi");
}

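The code uses containsKey to check whether the key "ceshi" exists and, if it does, fetches its value from the map. With HashMap this is fine, because it is used without concurrency. With ConcurrentHashMap there is a problem: after the check has passed and just before get is called, another thread may change the map and remove the key "ceshi". The current thread then gets null, and if null values were allowed it could not tell whether the value mapped to the key is null or whether another thread removed the key. That explains why the value cannot be null.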

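Then why can the key not be null either? I do not know for certain. My personal guess is that storing an entry under a null key is not meaningful in the first place, so when the author prohibited null values he simply prohibited null keys as well.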
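
Because null values are forbidden, the check-then-get sequence can be collapsed into a single call whose result is unambiguous. The following is a small sketch of that idiom; the class SingleLookupDemo, the method lookup, and the placeholder string "<absent>" are invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

class SingleLookupDemo {
    // One read instead of containsKey + get: since null values cannot be stored,
    // a null result can only mean that the key was absent at the time of the call.
    static String lookup(ConcurrentHashMap<String, String> map, String key) {
        String value = map.get(key);
        return (value != null) ? value : "<absent>";
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("ceshi", "1");
        System.out.println(lookup(map, "ceshi"));
        map.remove("ceshi");
        System.out.println(lookup(map, "ceshi"));
    }
}
```

No other thread can slip in between a check and a read here, because there is only one operation.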

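3. Reads are made thread-safe with volatile; writes and operations such as treeification use CAS plus synchronized; resizing and initialization use sizeCtl (the control variable) plus CAS. The CAS operations are provided by the Unsafe class.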

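4. During a resize, a thread that tries to modify a bin that has already been moved cannot write to it in the old table; instead it joins in and helps the resizing thread finish sooner. Bins that have not been moved yet can still be modified under their head-node lock. All of this is coordinated through sizeCtl. The field's comment in the source describes a negative value as -(1 + the number of active resizing threads), so -2 would mean one thread is resizing and -3 would mean two. In the Java 8 code, however, the first resizer sets sizeCtl to (rs << RESIZE_STAMP_SHIFT) + 2 and each helper adds 1, so the count of resizing threads plus one is carried in the low bits while the high bits hold the resize stamp.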

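A resize does not get in the way of reads, because it builds a new array and then replaces the old one with it. Lookups that are already in progress are therefore unaffected.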

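5. The relationship between the resize condition and the load factor differs from HashMap. HashMap multiplies the capacity by the load factor to obtain the threshold. In ConcurrentHashMap, if no load factor is given, the first threshold is capacity * 0.75; if one is given, the constructor computes the initial value of sizeCtl as: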

long size = (long)(1.0 + (long)initialCapacity / loadFactor);
int cap = (size >= (long)MAXIMUM_CAPACITY) ?
    MAXIMUM_CAPACITY : tableSizeFor((int)size);
this.sizeCtl = cap;

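For example, with an initial capacity of 5 and a load factor of 20f, size evaluates to (long)(1.0 + 5 / 20f) = 1 and tableSizeFor(1) returns 1, so sizeCtl starts at 1, and initTable then uses that value as the table length. All later thresholds are computed as capacity * 0.75.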
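
The constructor arithmetic can be checked in isolation. The sketch below copies the formula quoted above and pairs it with a tableSizeFor written from memory of the JDK 8 source, so treat that helper as an assumption. The class InitialCapacitySketch and the method initialSizeCtl are names made up for this illustration.

```java
class InitialCapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= c (assumed to match the JDK 8 helper of the same name).
    static int tableSizeFor(int c) {
        int n = c - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Value the (initialCapacity, loadFactor) constructor stores in sizeCtl.
    static int initialSizeCtl(int initialCapacity, float loadFactor) {
        long size = (long) (1.0 + (long) initialCapacity / loadFactor);
        return (size >= (long) MAXIMUM_CAPACITY)
            ? MAXIMUM_CAPACITY : tableSizeFor((int) size);
    }

    public static void main(String[] args) {
        System.out.println(initialSizeCtl(10, 0.5f));
        System.out.println(initialSizeCtl(5, 20f));
    }
}
```

For the arguments (10, 0.5f), size is 21 and the next power of two is 32, which is the table size mentioned in item 1 of this summary. The load factor therefore only scales the initial table size; it is not stored anywhere for later use.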

6. The thread-safety strategy carries a risk of deadlock. For details, see another of my posts: https://blog.csdn.net/dap769815768/article/details/96481502
