HashMap与ConcurrentHashMap深入解析

最新推荐文章于 2022-11-17 13:05:23 发布

丿剑过血洒泪轻吟

最新推荐文章于 2022-11-17 13:05:23 发布

阅读量226

点赞数

文章标签： java

本文链接：https://blog.csdn.net/weixin_46200547/article/details/119606452

版权

HashMap
在这里插入图片描述
关于红黑树和链表转换一些细节
1)链表转红黑树
时机：发生在链表长度大于等于 8 时
原因：此时，红黑树的平均查找长度为 O(log n)=3 < 链表平均查找长度 O(n/2)=4。
发生：插入时进行判断与转换
2）红黑树转链表
时机：红黑树节点数小于等于 6 时
原因：链表平均查找长度更小。
发生：删除节点时进行判断和转换
如果留心点，会发现两个临界值之间隔了一个 7。原因是由于转换操作比较耗时，中间隔个 7，可以防止转换操作过于频繁。
HashMap是线程不安全的，其主要体现：

#1.在jdk1.7中，在多线程环境下，扩容时会造成环形链或数据丢失。
jdk1.7扩容时采用头插法链表顺序会反转
jdk1.8后采用尾插法

 1   void transfer(Entry[] newTable, boolean rehash) {
 2         int newCapacity = newTable.length;
 3         for (Entry<K,V> e : table) {
 4             while(null != e) {
 5                 Entry<K,V> next = e.next;
 6                 if (rehash) {
 7                     e.hash = null == e.key ? 0 : hash(e.key);
 8                 }
 9                 int i = indexFor(e.hash, newCapacity);
10                 e.next = newTable[i];
				//多线程环境下此处被挂起
11                 newTable[i] = e;
12                 e = next;
13             }
14         }
15     }

#2.在jdk1.8中，在多线程环境下，会发生数据覆盖的情况。

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
           // 如果没有hash碰撞则直接插入元素
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

如果没有hash碰撞则会直接插入元素。如果线程A和线程B同时进行put操作，刚好这两条不同的数据hash值一样，并且该位置数据为null，所以这线程A、B都会进入第6行代码中。假设一种情况，线程A进入后还未进行数据插入时挂起，而线程B正常执行，从而正常插入数据，然后线程A获取CPU时间片，此时线程A不用再进行hash判断了，问题出现：线程A会把线程B插入的数据给覆盖，发生线程不安

ConcurrentHashMap
在 jdk1.7 和 1.8 中具体实现稍有不同
jdk1.7
如图所示，是由 Segment 数组、HashEntry 组成，和 HashMap 一样，仍然是数组加链表。
在这里插入图片描述
Segment 是 ConcurrentHashMap 的一个内部类，主要的组成如下：

static final class Segment<K,V> extends ReentrantLock implements Serializable {

    private static final long serialVersionUID = 2249069246763182397L;

    // 和 HashMap 中的 HashEntry 作用一样，真正存放数据的桶
    transient volatile HashEntry<K,V>[] table;

    transient int count;
       
    transient int modCount;
        // 大小
    transient int threshold;
        // 负载因子
    final float loadFactor;

}

ConcurrentHashMap的初始化是会通过位与运算来初始化Segment的大小，用ssize来表示，源码如下所示

private static final int DEFAULT_CONCURRENCY_LEVEL = 16;
private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException {
        // For serialization compatibility
        // Emulate segment calculation from previous version of this class
        int sshift = 0;
        int ssize = 1;
        while (ssize < DEFAULT_CONCURRENCY_LEVEL) {
            ++sshift;
            ssize <<= 1;
        }
        int segmentShift = 32 - sshift;
        int segmentMask = ssize - 1;

由此可以看出：因为ssize用位于运算来计算（ssize <<=1），所以Segment的大小取值都是以2的N次方，无关concurrencyLevel的取值，当然concurrencyLevel最大只能用16位的二进制来表示，即65536，换句话说，Segment的大小最多65536个，没有指定concurrencyLevel元素初始化，Segment的大小ssize默认为 DEFAULT_CONCURRENCY_LEVEL =16

HashEntry跟HashMap差不多的，但是不同点是，他使用volatile去修饰了他的数据Value还有下一个节点next。

原理上来说，ConcurrentHashMap 采用了分段锁技术，其中 Segment 继承于 ReentrantLock。

不会像 HashTable 那样不管是 put 还是 get 操作都需要做同步处理，理论上 ConcurrentHashMap 支持 CurrencyLevel (Segment 数组数量)的线程并发。

每当一个线程占用锁访问一个 Segment 时，不会影响到其他的 Segment。

public V put(K key, V value) {
    Segment<K,V> s;
    //值不能为空
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key);
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          
         (segments, (j << SSHIFT) + SBASE)) == null) 
        s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

他先定位到Segment，然后再进行put操作。

  final V put(K key, int hash, V value, boolean onlyIfAbsent) {
          // 将当前 Segment 中的 table 通过 key 的 hashcode 定位到 HashEntry
            HashEntry<K,V> node = tryLock() ? null :
                scanAndLockForPut(key, hash, value);
            V oldValue;
            try {
                HashEntry<K,V>[] tab = table;
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> first = entryAt(tab, index);
                for (HashEntry<K,V> e = first;;) {
                    if (e != null) {
                        K k;
 // 遍历该 HashEntry，如果不为空则判断传入的 key 和当前遍历的 key 是否相等，相等则覆盖旧的 value。
                        if ((k = e.key) == key ||
                            (e.hash == hash && key.equals(k))) {
                            oldValue = e.value;
                            if (!onlyIfAbsent) {
                                e.value = value;
                                ++modCount;
                            }
                            break;
                        }
                        e = e.next;
                    }
                    else {
                 // 不为空则需要新建一个 HashEntry 并加入到 Segment 中，同时会先判断是否需要扩容。
                        if (node != null)
                            node.setNext(first);
                        else
                            node = new HashEntry<K,V>(hash, key, value, first);
                        int c = count + 1;
                        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                            rehash(node);
                        else
                            setEntryAt(tab, index, node);
                        ++modCount;
                        count = c;
                        oldValue = null;
                        break;
                    }
                }
            } finally {
               //释放锁
                unlock();
            }
            return oldValue;
        }

首先第一步的时候会尝试获取锁，如果获取失败肯定就有其他线程存在竞争，则利用 scanAndLockForPut() 自旋获取锁。

尝试自旋获取锁。
如果重试的次数达到了 MAX_SCAN_RETRIES 则改为阻塞锁获取，保证能获取成功。

get 逻辑比较简单，只需要将 Key 通过 Hash 之后定位到具体的 Segment ，再通过一次 Hash 定位到具体的元素上。

由于 HashEntry 中的 value 属性是用 volatile 关键词修饰的，保证了内存可见性，所以每次获取时都是最新值。

ConcurrentHashMap 的 get 方法是非常高效的，因为整个过程都不需要加锁。

该结构的优劣势
好处
写操作的时候可以只对元素所在的Segment进行加锁即可，不会影响到其他的Segment，这样，在最理想的情况下，ConcurrentHashMap可以最高同时支持Segment数量大小的写操作（刚好这些写操作都非常平均地分布在所有的Segment上）。
所以，通过这一种结构，ConcurrentHashMap的并发能力可以大大的提高。

劣处
这一种结构的带来的副作用是Hash的过程要比普通的HashMap要长

jdk1.8
其中抛弃了原有的 Segment 分段锁，而采用了 CAS + synchronized 来保证并发安全性。

跟HashMap很像，也把之前的HashEntry改成了Node，但是作用不变，把值和next采用了volatile去修饰，保证了可见性，并且也引入了红黑树，在链表大于一定值的时候会转换（默认是8）。

ConcurrentHashMap在进行put操作的还是比较复杂的，大致可以分为以下步骤：

根据 key 计算出 hashcode 。
判断是否需要进行初始化。
即为当前 key 定位出的 Node，如果为空表示当前位置可以写入数据，利用 CAS 尝试写入，失败则自旋保证成功。
如果当前位置的 hashcode == MOVED == -1,则需要进行扩容。
如果都不满足，则利用 synchronized 锁写入数据。
如果数量大于 TREEIFY_THRESHOLD 则要转换为红黑树。

   final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        //根据 key 计算出 hashcode 。
        int hash = spread(key.hashCode());
        int binCount = 0;
        //对这个table进行迭代
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
           //这里就是上面构造方法没有进行初始化，在这里进行判断，为null就调用initTable进行初始化，属于懒汉模式初始化
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            //即为当前 key 定位出的 Node，如果为空表示当前位置可以写入数据，利用 CAS 尝试写入，失败则自旋保证成功。
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            //如果当前位置的 hashcode == MOVED == -1,则需要进行扩容。
            else if ((fh = f.hash) == MOVED)
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                //如果都不满足，那就要进行加锁操作，也就是存在hash冲突，锁住链表或者红黑树的头结点
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                    //表示该节点是链表结构
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                 //这里涉及到相同的key进行put就会覆盖原先的value
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                 //插入链表尾部
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        //红黑树结构
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                             //红黑树结构旋转插入
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
               // 如果数量大于 TREEIFY_THRESHOLD 则要转换为红黑树。
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        //统计size，并且检查是否需要扩容
        addCount(1L, binCount);
        return null;
    }

get方法

    public V get(Object key) {
        Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
        int h = spread(key.hashCode());
        //根据计算出来的 hashcode 寻址，如果就在桶上那么直接返回值。
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) != null) {
            //如果是红黑树那就按照树的方式获取值。
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            }
            //hash值为负值表示正在扩容，这个时候查的是ForwardingNode的find方法来定位到nextTable来
             //查找，查找到就返回
            //不满足那就按照链表的方式遍历获取值。
            else if (eh < 0)
                return (p = e.find(h, key)) != null ? p.val : null;
            while ((e = e.next) != null) {
                if (e.hash == h &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek))))
                    return e.val;
            }
        }
        return null;
    }

丿剑过血洒泪轻吟

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
HashMap与ConcurrentHashMap深入解析

ConcurrentHashMap在 jdk1.7 和 1.8 中具体实现稍有不同jdk1.7如图所示，是由 Segment 数组、HashEntry 组成，和 HashMap 一样，仍然是数组加链表。Segment 是 ConcurrentHashMap 的一个内部类，主要的组成如下：static final class Segment<K,V> extends ReentrantLock implements Serializable { private static
复制链接

扫一扫