HashMap and ConcurrentHashMap: Principles and Source Code Analysis

Latest recommended article published on 2023-05-27 16:20:51

XiaoYeYe003

Article link: https://blog.csdn.net/xueshanfeihu555/article/details/105186840

HashMap: Principles and Source Code Analysis

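Data structure

In JDK 8, HashMap is backed by an array of bins, each holding either a linked list or a red-black tree. When a list grows past the threshold of 8 nodes (TREEIFY_THRESHOLD), it is converted to a red-black tree, provided the table has at least 64 slots (MIN_TREEIFY_CAPACITY); otherwise the table is resized instead. When a tree bin shrinks to 6 nodes or fewer (UNTREEIFY_THRESHOLD), which is checked when the bin is split during a resize, it is converted back to a list. The default initial capacity is 16 and the default load factor is 0.75. Each entry is stored in the following structure: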

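// Map.Entry supplies the basic accessors: getKey(), getValue(), setValue(value)
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;    // cached hash of the key
    final K key;
    V value;
    Node<K,V> next;    // next node in the same bin
}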

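How it works

The put process

hash = (h = key.hashCode()) ^ (h >>> 16) (the high 16 bits stay unchanged; the low 16 bits are XORed with the high 16 bits)
index = hash & (n - 1) (the array index; because the table length n is a power of two, this equals hash mod n without the cost of a division)
If the bin is already occupied and no node has an equal key, a new node is created and appended at the tail of the list (JDK 7 inserted at the head). If the list then exceeds 8 nodes, it is converted to a red-black tree.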
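
The two formulas above can be tried out in isolation. The following is a minimal sketch rather than the JDK source: HashIndexDemo, spread and indexFor are names chosen for this illustration. It also checks the index against Math.floorMod, which is the "hash mod n" that the bit mask reproduces, including for negative hashes.

```java
class HashIndexDemo {
    // Same perturbation as described above: XOR the high 16 bits into the low 16 bits.
    // A null key hashes to 0, as in HashMap.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int n = 16; // default initial capacity
        for (String key : new String[] {"a", "hello", "HashMap"}) {
            int hash = spread(key);
            System.out.println(key + " -> hash=" + hash
                    + ", index=" + indexFor(hash, n)
                    + ", floorMod=" + Math.floorMod(hash, n));
        }
    }
}
```

For every key the printed index and floorMod values agree, which is why the mask can stand in for the modulo operation.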

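The get process

hash = (h = key.hashCode()) ^ (h >>> 16) (the high 16 bits stay unchanged; the low 16 bits are XORed with the high 16 bits)
index = hash & (n - 1) (the array index)
If the bin holds more than one node, the list or red-black tree is searched by comparing the hash first and then the key with == or equals.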

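Resizing

The capacity always doubles (power-of-two expansion), so after a resize each element either stays at its original index or moves by exactly oldCap, the old capacity.
(hash & oldCap) == 0: the index is unchanged
(hash & oldCap) != 0: the new index is the original index + oldCap
Within each of the two resulting lists, the nodes keep their original relative order.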
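
The split rule can be checked against a full recomputation of the index. This is a small sketch under the assumption that oldCap is a power of two; ResizeSplitDemo and newIndex are hypothetical names used only here.

```java
class ResizeSplitDemo {
    // New bucket index after the table doubles from oldCap to 2 * oldCap,
    // decided by the single bit that oldCap contributes to the mask.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share old index 5; the oldCap bit decides where they go.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println("hash=" + hash
                    + " oldIndex=" + (hash & (oldCap - 1))
                    + " newIndex=" + newIndex(hash, oldCap)
                    + " recomputed=" + (hash & (2 * oldCap - 1)));
        }
    }
}
```

The newIndex and recomputed columns always match, so a resize only has to test one bit per node instead of recomputing each index.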

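JDK 8 optimizations

Resize optimization: indices are not recomputed from scratch. If (hash & oldCap) == 0 the node keeps its index; otherwise it moves to the original index + oldCap.

Red-black trees were introduced so that a single overlong list does not degrade lookup performance.

The infinite loop that concurrent resizes could cause is fixed, but HashMap is still not thread-safe.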

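Not thread-safe

Concurrent use of HashMap can go wrong in two main ways:

Inconsistent data caused by concurrent put
Suppose threads A and B insert into the same HashMap. A computes the bucket index for its key-value pair
and reads the head node of that bucket's list, and is then suspended. B is scheduled, does the same work, and its key maps to the same bucket index as A's.
B links its entry into the bucket. When A resumes, it still works from the stale head it read earlier and overwrites the entry B inserted, so B's update is lost.
Infinite loop caused by resize (no longer occurs in JDK 1.8)
This happens in JDK 1.7 when HashMap grows automatically: if two threads resize at the same time, both modify the same list and can turn it into a circular list
(in JDK 1.7, a resize reverses the order of the elements in a list). A later get() that walks this list then loops forever.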
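
The lost-update interleaving described above depends on thread scheduling and is hard to reproduce on demand, so the sketch below shows the other side instead: with a ConcurrentHashMap, concurrent inserts of distinct keys never lose an entry. ConcurrentPutDemo and fill are hypothetical names for this illustration; the thread and key counts are arbitrary.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentPutDemo {
    // Starts `threads` workers; each inserts `perThread` keys that no other
    // worker uses. Returns the map size after all workers have finished.
    static int fill(Map<Integer, Integer> map, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // With a thread-safe map every one of the 4 * 10_000 entries is present.
        System.out.println(fill(new ConcurrentHashMap<>(), 4, 10_000));
    }
}
```

Passing a plain HashMap to fill is not safe: depending on timing, the size may come out smaller than expected or the map may be left in a corrupted state.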

ConcurrentHashMap

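Structure

The underlying structure is likewise an array plus linked lists and red-black trees. The table reference and the val and next fields of Node are declared volatile to guarantee visibility between threads.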

ConcurrentHashMap{
transient volatile Node<K,V>[] table;
static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K,V> next;
}
}

The put process
JDK 8 uses CAS plus synchronized in place of the segment locks (Segment) used in JDK 7.
Source code

 // The for loop spins (retries) until the insert succeeds
 final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // CAS the new node into the empty bin; if the CAS fails, another thread has already written tab[i], so the loop retries and takes the synchronized path
                if (casTabAt(tab, i, null,
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            // A head node whose hash is MOVED (-1) is a ForwardingNode: the table is being resized, so this thread helps with the transfer
            else if ((fh = f.hash) == MOVED) // MOVED = -1
                tab = helpTransfer(tab, f);
                // The bin is not empty: lock its head node with synchronized
            else {
                V oldVal = null;
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
 }
 // Spread the hash code; the result is used to locate the bin
 static final int spread(int h) {
          return (h ^ (h >>> 16)) & HASH_BITS;
      }
 // Atomic CAS on a table slot via Unsafe
 static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
                                         Node<K,V> c, Node<K,V> v) {
         return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
 }

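1. Compute the bin index as (n - 1) & ((h ^ (h >>> 16)) & 0x7fffffff), where h = key.hashCode().
2. Locate the head node of that bin. If it is null, the slot is free and the new node is written with CAS inside the retry loop; if it is not null,
the write happens under a synchronized lock on the head node. If the bin then holds at least TREEIFY_THRESHOLD nodes, it is converted to a red-black tree.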
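
The two steps above can be sketched with standard atomics. This is a simplified illustration, not the JDK implementation: TinyConcurrentMap is a hypothetical class with a fixed 16-slot table, no resizing, no treeification and no removal, and it uses AtomicReferenceArray in place of the Unsafe calls.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class TinyConcurrentMap<K, V> {
    static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K, V> next;
        Node(int hash, K key, V val) { this.hash = hash; this.key = key; this.val = val; }
    }

    // Fixed-size table (assumption of this sketch: it never grows).
    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    static int spread(int h) { return (h ^ (h >>> 16)) & 0x7fffffff; }

    // Returns the previous value for the key, or null if there was none.
    V put(K key, V value) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());
        int i = (table.length() - 1) & hash;
        for (;;) {
            Node<K, V> f = table.get(i);
            if (f == null) {
                // Step 2a: empty bin, try to publish the node with a single CAS.
                if (table.compareAndSet(i, null, new Node<>(hash, key, value)))
                    return null;
                // CAS lost the race: loop again and take the locked path.
            } else {
                // Step 2b: non-empty bin, lock the head node and walk the list.
                synchronized (f) {
                    if (table.get(i) == f) { // re-check the head after locking
                        for (Node<K, V> e = f; ; e = e.next) {
                            if (e.hash == hash && key.equals(e.key)) {
                                V old = e.val;
                                e.val = value;
                                return old;
                            }
                            if (e.next == null) {
                                e.next = new Node<>(hash, key, value); // append at tail
                                return null;
                            }
                        }
                    }
                }
            }
        }
    }

    // Lock-free read: relies on the volatile val and next fields.
    V get(K key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = table.get((table.length() - 1) & hash); e != null; e = e.next)
            if (e.hash == hash && key.equals(e.key)) return e.val;
        return null;
    }
}
```

Locking the head node keeps contention local to one bin, and the CAS path means that inserts into empty bins take no lock at all.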

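CAS:
CAS (compare-and-swap) is a lock-free operation and the basis of optimistic locking. Locks in Java are commonly divided into optimistic and pessimistic ones. A pessimistic lock locks the resource itself: the next thread can access it only after the thread that currently holds the lock releases it.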

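A CAS operation takes three operands: the memory location (V), the expected old value (A) and the new value (B). If the value at the memory location equals A,
it is atomically replaced with B; otherwise nothing is written. CAS itself does not loop; callers usually wrap it in a retry loop. If thread a reads the value and thread b changes it before a's CAS runs,
a's CAS fails and a spins: it re-reads the value and tries again on the next iteration.
In operating-system terminology CAS is a primitive: a procedure made up of one or more instructions that carries out a single function
and must run to completion without being interrupted. On mainstream hardware CAS maps to an atomic CPU instruction, so it cannot leave the value in an inconsistent state.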
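
A retry loop of this kind can be written with AtomicInteger.compareAndSet. The sketch below is only an illustration: CasCounterDemo, increment and run are hypothetical names, and in practice AtomicInteger.incrementAndGet already does the same job.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasCounterDemo {
    private final AtomicInteger value = new AtomicInteger();

    // Increment with an explicit CAS retry loop.
    int increment() {
        for (;;) {
            int expected = value.get();   // A: the value we believe is stored
            int updated = expected + 1;   // B: the value we want to store
            if (value.compareAndSet(expected, updated))
                return updated;           // CAS succeeded
            // Another thread changed the value in between: spin and retry.
        }
    }

    int get() { return value.get(); }

    // Runs `threads` workers that each call increment() `perThread` times.
    static int run(int threads, int perThread) {
        CasCounterDemo counter = new CasCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) counter.increment();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // No increment is lost, although no lock is ever taken.
        System.out.println(run(4, 10_000));
    }
}
```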

Spin lock

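import java.util.concurrent.atomic.AtomicReference;

public class SpinLock {
    // Holds the owning thread, or null while the lock is free.
    private final AtomicReference<Thread> sign = new AtomicReference<>();

    public void lock() {
        Thread current = Thread.currentThread();
        // Busy-wait until the CAS from null (free) to the current thread succeeds.
        while (!sign.compareAndSet(null, current)) {
        }
    }

    public void unlock() {
        Thread current = Thread.currentThread();
        // Only the owner can release: the CAS fails for any other thread.
        sign.compareAndSet(current, null);
    }
}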

Reentrant lock

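import java.util.concurrent.atomic.AtomicReference;

public class ReentrantSpinLock {
    // Holds the owning thread, or null while the lock is free.
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    // Number of nested re-acquisitions; only the owning thread touches it.
    private int count = 0;

    public void lock() {
        Thread current = Thread.currentThread();
        if (current == owner.get()) {
            // Already held by this thread: just count the re-entry.
            count++;
            return;
        }
        // Busy-wait until the lock is free and this thread claims it.
        while (!owner.compareAndSet(null, current)) {
        }
    }

    public void unlock() {
        Thread current = Thread.currentThread();
        if (current == owner.get()) {
            if (count != 0) {
                // Leave one nested level; the lock stays held.
                count--;
            } else {
                // Outermost unlock: release the lock.
                owner.compareAndSet(current, null);
            }
        }
    }
}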

The get process

 public V get(Object key) {
        Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
        int h = spread(key.hashCode());
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) != null) {
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            }
            else if (eh < 0)
                return (p = e.find(h, key)) != null ? p.val : null;
            while ((e = e.next) != null) {
                if (e.hash == h &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek))))
                    return e.val;
            }
        }
        return null;
    }
 static final int spread(int h) {
         return (h ^ (h >>> 16)) & HASH_BITS;
     }