HashMap Source Code Analysis (JDK 1.8)

langwuzhe

Article link: https://blog.csdn.net/langwuzhe/article/details/117653896

I. Fields

When n is a power of two and hash is non-negative: (n - 1) & hash == hash % n
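The identity above can be checked with a few lines of Java. This is a minimal sketch; the class name `IndexMaskDemo` and the helper `indexFor` are illustrative names, not part of `HashMap`. It also shows why the mask is preferable to `%`: for a negative hash the mask still gives a valid array index.

```java
final class IndexMaskDemo {
    // Bucket index via bit mask; only meaningful when n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        for (int hash : new int[] {0, 5, 16, 37, 1023}) {
            // For non-negative hashes both columns agree.
            System.out.println(hash + ": mask=" + indexFor(hash, n) + " mod=" + (hash % n));
        }
        // For a negative hash, % returns a negative remainder, the mask does not.
        System.out.println(-7 + ": mask=" + indexFor(-7, n) + " mod=" + (-7 % n));
    }
}
```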

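/**
 1. Capacity: the length of the array inside HashMap
     Range: must be a power of two; maximum capacity: 2^30
 */

// Default capacity = 16 = 1 << 4
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

// Maximum capacity = 2^30 (a larger requested capacity is replaced by this value)
static final int MAXIMUM_CAPACITY = 1 << 30;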

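/**
 2. Load factor: a measure of how full the HashMap may get before its capacity is automatically increased
    a. A larger load factor allows more entries before a resize: better space utilization, but more collisions and slower lookups (longer chains)
    b. A smaller load factor allows fewer entries before a resize: lower space utilization, but fewer collisions and faster lookups (shorter chains)
 */

// Load factor
final float loadFactor;

// Default load factor = 0.75
static final float DEFAULT_LOAD_FACTOR = 0.75f;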

/**
 3. Resize threshold: when the number of key-value pairs (size) exceeds the threshold, the table is resized
    a. Resize = rebuild the internal table via resize(), doubling the capacity of the HashMap
    b. threshold = capacity (table.length) x loadFactor
 */
// Resize threshold: a resize is triggered when size > threshold (= table.length * loadFactor)
int threshold;
  
  
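// 4. When adding an element makes a bucket's linked list reach 8 nodes, the list is converted to a red-black tree
//    (only if the table has at least MIN_TREEIFY_CAPACITY buckets; otherwise the table is resized instead).
//    A tree bucket is converted back to a list when it holds UNTREEIFY_THRESHOLD nodes or fewer.
static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;
static final int MIN_TREEIFY_CAPACITY = 64;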

// 5. The bucket array (default length 16): the central data structure
transient Node<K,V>[] table;

// 6. Others

// Size of the HashMap, i.e. the number of key-value pairs stored
transient int size;
// Number of structural modifications (used by the fail-fast mechanism)
transient int modCount;
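To make the relation threshold = capacity x loadFactor concrete, here is a small arithmetic sketch. `ThresholdDemo` and `thresholdFor` are hypothetical names used only for illustration; the cast to `int` mirrors how the default threshold is derived from the two default constants.

```java
final class ThresholdDemo {
    // threshold = capacity * loadFactor, truncated to an int.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        // Capacity doubles on every resize, and so does the threshold.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + thresholdFor(capacity, loadFactor));
        }
    }
}
```

With the defaults (16 and 0.75) the threshold is 12, so under the `size > threshold` rule the 13th distinct key is the one that triggers the first doubling.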

II. Constructors

// Constructor 1
public HashMap(int initialCapacity, float loadFactor) {
    // The requested initial capacity must be non-negative
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    // Clamp the requested initial capacity to the maximum capacity
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // The load factor must be positive
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);

    // 0.75 when called through the single-argument constructor
    this.loadFactor = loadFactor;

    // tableSizeFor returns the smallest power of two >= initialCapacity.
    // The result is parked in threshold until the first resize() allocates the table.
    this.threshold = tableSizeFor(initialCapacity);
}

// Constructor 2
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

// Constructor 3
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

// Constructor 4: initializes the map with the entries of m
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
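Constructor 1 relies on `tableSizeFor`, which the listing does not show. The following is a standalone sketch of the bit-smearing technique used for this purpose in JDK 8, reproduced from memory rather than copied from the listing above; the wrapper class `TableSizeDemo` is a hypothetical name. The idea is to copy the highest set bit of `cap - 1` into every lower position and then add one.

```java
final class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= cap (1 for cap <= 1), capped at MAXIMUM_CAPACITY.
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // so that an exact power of two maps to itself
        n |= n >>> 1;      // each step doubles the run of 1 bits below the top bit
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {0, 1, 10, 16, 17, 1000}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

So `new HashMap<>(10)` ends up with a 16-slot table and `new HashMap<>(17)` with a 32-slot table once the first `put` triggers `resize()`.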

III. How HashMap Reads and Writes

1. How HashMap gets a value: the source

   public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    /**
     * Implements Map.get and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab;
        Node<K,V> first, e; 
        int n; 
        K k;
        
        // Note the assignments embedded in the condition.
        // 1. Check that the bucket array table is non-null, its length is > 0, and the first node of the target bucket is non-null
        // 2. Along the way, assign table to tab and the first node of the bucket to first
        // Note: (n - 1) & hash is hash modulo the table length, because n is a power of two
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            // If the hash matches and the key is equal, return the first node of the bucket
            if (first.hash == hash &&  ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            // Otherwise the first node is not the one we want; continue below
            if ((e = first.next) != null) {
                // If the first node is a TreeNode, the bucket is a red-black tree
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // Walk the linked list in this bucket until a node with a matching key is found
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

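get(key) computes the hash of key, then hash & (n - 1) to locate the bucket: first = tab[hash & (n - 1)]. It first checks whether the key of first equals the given key; if not, it walks the rest of the chain (or searches the tree) until it finds an equal key and returns the corresponding value, or null if there is none.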
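Both `get` and `put` call `hash(key)` before indexing, but the listing does not show that method. As a sketch based on how JDK 8 spreads hash codes (the names `HashSpreadDemo`, `spread` and `bucketIndex` are illustrative assumptions), the high 16 bits are XORed into the low 16 bits so that they still influence the index when the table is small:

```java
final class HashSpreadDemo {
    // XOR the high half of hashCode() into the low half; a null key hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int bucketIndex(Object key, int n) {
        return (n - 1) & spread(key);
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself, which keeps the numbers easy to follow.
        Integer key = 65536; // 0x10000: all low 16 bits are zero
        System.out.println("without spreading: " + ((16 - 1) & key.hashCode()));
        System.out.println("with spreading:    " + bucketIndex(key, 16));
    }
}
```

Without the spreading step, every key whose hash codes differ only above bit 15 would land in bucket 0 of a 16-slot table.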

2. How HashMap puts (key, value): the source

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

   final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; 
        Node<K,V> p;
        int n, i;
        
        // 1. If table is null or empty, allocate it via resize() (this happens on the first put)
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // 2. If the target bucket is empty, create a new node there.
        //    Otherwise p now points to the first node of that bucket (important for the else branch below)
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            // 3. Collision: the bucket already holds a node, i.e. (n - 1) & hash is the same.
            //    Two cases: (1) the keys are equal; (2) the keys differ but map to the same index
            Node<K,V> e; K k;
            // 3.1 The first node has the same key: let e refer to it. e and p reference the same node object,
            //     so updating e.value at ★ ★ ★ ★ ★ ★ below is visible through the table; the old value is returned
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k)))){
                e = p;
            }else if (p instanceof TreeNode){
                // 3.2 The bucket is a red-black tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            }else {
                // 3.3 Key case: same bucket index but a different key.
                //     The loop walks to the last node of the bucket and appends the new node after it
                for (int binCount = 0; ; ++binCount) {
                    // Reached the tail of the list without finding the key
                    if ((e = p.next) == null) {
                        // Append the new node at the tail
                        p.next = newNode(hash, key, value, null);
                        // Convert the bucket to a red-black tree once the chain is long enough
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // Found an existing node with the same key; e refers to it
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k)))){
                            break;
                        }

                    // From here on p no longer points to the first node of the bucket
                    p = e;
                }
            }
            // ★ ★ ★ ★ ★ ★  Replace the value
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

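1. Check whether the bucket array tab[] is null or empty; if so, call resize() to allocate it with the default (or requested) size.
2. Compute the hash from key to get the array index i; if tab[i] == null, create a new node there directly; otherwise go to step 3.
3. Determine whether the bucket resolves collisions with a linked list or a red-black tree (checking the type of the first node is enough) and handle each case accordingly.
4. If it is a linked list, append the new element at the tail (or overwrite the value if the key already exists).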
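The steps above can be sketched in a stripped-down map. This is not the JDK implementation: `MiniChainedMap` is a hypothetical class with a fixed 16-slot table, no resizing and no tree bins, kept only to show the empty-bucket case, the overwrite case and tail insertion.

```java
final class MiniChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value) {
            this.hash = hash;
            this.key = key;
            this.value = value;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16]; // fixed size: assumption of this sketch
    private int size;

    private static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Returns the previous value for key, or null if the key was absent.
    V put(K key, V value) {
        int h = hash(key);
        int i = (table.length - 1) & h;
        Node<K, V> p = table[i];
        if (p == null) {                       // step 2: empty bucket
            table[i] = new Node<>(h, key, value);
            size++;
            return null;
        }
        while (true) {                         // steps 3 and 4: walk the chain
            if (p.hash == h && (p.key == key || (key != null && key.equals(p.key)))) {
                V old = p.value;               // same key: overwrite, return old value
                p.value = value;
                return old;
            }
            if (p.next == null) {              // reached the tail: append there
                p.next = new Node<>(h, key, value);
                size++;
                return null;
            }
            p = p.next;
        }
    }

    V get(Object key) {
        int h = hash(key);
        for (Node<K, V> e = table[(table.length - 1) & h]; e != null; e = e.next) {
            if (e.hash == h && (e.key == key || (key != null && key.equals(e.key)))) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

With Integer keys, 1 and 17 share bucket 1 of the 16-slot table, so inserting both exercises the tail-insertion path.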

IV. HashMap's Resize Mechanism: resize()

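If no initial size is specified when the map is constructed, the default is 16 (a Node array of length 16). Once the number of entries exceeds loadFactor * table.length, the HashMap is resized to twice its previous capacity. Resizing is expensive.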

    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        // Record the state before resizing
        // Length of the bucket array table (the capacity)
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // Resize threshold
        int oldThr = threshold;
        // New capacity and new threshold
        int newCap, newThr = 0;
        // If the old table capacity is > 0
        if (oldCap > 0) {
            // If the old capacity is already >= the maximum capacity (1 << 30)
            if (oldCap >= MAXIMUM_CAPACITY) {
                // Set the threshold to Integer.MAX_VALUE (> MAXIMUM_CAPACITY): once the table reaches maximum capacity it is never resized again
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // 1. Double the old capacity
            // 2. If the new capacity is < the maximum and the old capacity is >= the default (16), double the threshold as well
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY){
                         newThr = oldThr << 1; // double threshold
                     }
                
        }
        
        // The else if and else branches below handle the two initialization cases
        // (Capacity-taking constructor) the old capacity is 0 and the threshold is > 0: use the initial capacity stored by the constructor as newCap
        else if (oldThr > 0){// initial capacity was placed in threshold
            // When a capacity is passed to the constructor it is stored in threshold, so hand it back to newCap here; it is used below to allocate the Node array
            newCap = oldThr;
        }
        // (No-arg constructor) both the old capacity and the threshold are 0: the capacity is 16 and the threshold is 16 * 0.75 = 12
        else { // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // newThr is still 0 (e.g. after initialization through a capacity-taking constructor): newThr = newCap * loadFactor
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        // Store the final threshold
        threshold = newThr;
        // Allocate the new Node array and make table point to it
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
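                // 1. Hand the chain of the current bucket to e, then clear the old bucket
                // 2. If the current bucket is empty, there is nothing to do
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    // The bucket holds a single node
                    if (e.next == null){
                        newTab[e.hash & (newCap - 1)] = e;
                    }
                    // Red-black tree
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // The key to this branch is (e.hash & oldCap)
                        // Example: with oldCap = 8 and hashes 3, 11, 19, 27, (e.hash & oldCap) yields 0, 8, 0, 8. So 3 and 19 form one list that stays at index 3, while 11 and 27 form another list that moves to index 3 + 8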
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            // e.hash & oldCap splits the old chain into two lists: nodes with (e.hash & oldCap) == 0 stay at the same index; the rest go to index + oldCap
                            // Each pass of the do-while loop appends one node to one of the two lists
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null){
                                    loHead = e;
                                }else{
                                    // ① and ② together advance the tail pointer
                                    loTail.next = e; // ①
                                }
                                loTail = e; // ②
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            // Nodes with (e.hash & oldCap) == 0 stay at the original index
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            // The remaining nodes go to index + oldCap
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

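In JDK 1.7, resize recomputes each entry's index as e.hash & (newCapacity - 1), i.e. the hash modulo the new capacity.
In JDK 1.8, (e.hash & oldCap) is tested against 0 to split the old chain into two lists: nodes where it is 0 stay at the same index, and the rest move to index + oldCap.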
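The two rules give the same result, which a short check makes visible. The sketch below replays the example from the resize listing (oldCap = 8, hashes 3, 11, 19, 27); `ResizeSplitDemo`, `splitIndex` and `recomputedIndex` are illustrative names, not JDK methods.

```java
final class ResizeSplitDemo {
    // JDK 1.8 style: one extra bit decides between "stay" and "move by oldCap".
    static int splitIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // JDK 1.7 style: recompute the index against the doubled capacity.
    static int recomputedIndex(int hash, int oldCap) {
        return hash & ((oldCap << 1) - 1);
    }

    public static void main(String[] args) {
        int oldCap = 8;
        for (int hash : new int[] {3, 11, 19, 27}) {
            System.out.println("hash=" + hash
                    + " bit=" + (hash & oldCap)
                    + " split=" + splitIndex(hash, oldCap)
                    + " recomputed=" + recomputedIndex(hash, oldCap));
        }
    }
}
```

Because doubling the capacity adds exactly one bit to the mask, checking that single bit is enough, and the relative order of nodes within each half is preserved.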

V. The Red-Black Tree Improvement in JDK 1.8

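In JDK 7, HashMap always stores colliding nodes in a linked list, so a lookup costs O(n) when many nodes collide.
In JDK 8, HashMap adds a red-black tree for collisions: a bucket with few colliding nodes uses a linked list, and a bucket with many (8 or more) uses a red-black tree, which brings lookups in that bucket down to O(log n).
A bucket is converted to a red-black tree when its chain reaches 8 nodes (provided the table has at least 64 buckets; otherwise the table is resized instead), and converted back to a linked list when it shrinks to 6 nodes or fewer.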

VI. Where HashMap Is Not Thread-Safe

JDK 1.7

1. The transfer function called during resize (note the lines marked ①②③)

void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i]; //①
                newTable[i] = e;//②
                e = next;//③
            }
        }
    }

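If two threads resize at the same time, for example thread A finishes its resize and thread B then continues its own resize on the structure A has already rearranged, the head insertion used by transfer can link the nodes of a bucket into a cycle, and a later get on that bucket loops forever. Even when no cycle forms, entries can be lost.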
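The precondition for the cycle is that head insertion reverses the order of a chain on every transfer. The single-threaded sketch below shows only that reversal; it assumes, for simplicity, that every entry of the old bucket lands in the same new bucket. `HeadInsertDemo` and its `Entry` class are hypothetical stand-ins for the JDK 1.7 types.

```java
final class HeadInsertDemo {
    static final class Entry {
        final int key;
        Entry next;

        Entry(int key, Entry next) {
            this.key = key;
            this.next = next;
        }
    }

    // Moves a chain into a new (initially empty) bucket using head insertion.
    static Entry transferHeadInsert(Entry head) {
        Entry newBucket = null;
        Entry e = head;
        while (e != null) {
            Entry next = e.next;
            e.next = newBucket; // (1) point the entry at the current head of the new bucket
            newBucket = e;      // (2) make the entry the new head
            e = next;           // (3) continue with the rest of the old chain
        }
        return newBucket;
    }

    static String render(Entry head) {
        StringBuilder sb = new StringBuilder();
        for (Entry e = head; e != null; e = e.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(e.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Entry chain = new Entry(1, new Entry(2, new Entry(3, null)));
        System.out.println(render(transferHeadInsert(chain)));
    }
}
```

If one thread is suspended holding `e` and `next` from the original order while another thread completes this reversal, the suspended thread resumes with stale pointers into a chain that now runs the other way, which is how two entries can end up pointing at each other.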

2. put is not synchronized either, so concurrent puts can lose entries

JDK 1.8

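JDK 1.8 optimizes HashMap: on a hash collision it no longer uses head insertion but appends to the tail of the list, so the cyclic list no longer occurs. It is still not safe under multiple threads, though. Consider the put source in JDK 1.8: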

 1  final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
 2                    boolean evict) {
 3         Node<K,V>[] tab; Node<K,V> p; int n, i;
 4         if ((tab = table) == null || (n = tab.length) == 0)
 5             n = (tab = resize()).length;
 6         if ((p = tab[i = (n - 1) & hash]) == null) // no hash collision: insert the element directly
 7             tab[i] = newNode(hash, key, value, null);
 8         else {
                // ... remaining code omitted

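This is the core of the put operation in JDK 1.8 HashMap. Look at line 6: if there is no hash collision, the element is inserted directly. Suppose threads A and B call put at the same time with two different keys that map to the same bucket index, and that bucket is currently null, so both threads pass the check on line 6. Now suppose thread A is suspended after the check but before inserting, while thread B runs normally and inserts its node. When thread A gets the CPU again it does not repeat the null check, so it writes its own node into the bucket and overwrites the one thread B inserted. B's entry is lost, which is the thread-safety problem.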
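Real races are hard to reproduce on demand, so the sketch below replays the interleaving described above in a single thread, one step at a time. Everything in it is illustrative: `LostUpdateReplay`, its `Node` class, the key names and the bucket index 3 are assumptions, and the two "threads" are just labelled statements.

```java
final class LostUpdateReplay {
    static final class Node {
        final String key;
        final String value;
        Node next;

        Node(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Replays: A checks, B checks, B inserts, A inserts.
    static Node[] replay() {
        Node[] tab = new Node[16];
        int i = 3; // hypothetical index that both keys map to

        boolean aSawEmpty = (tab[i] == null); // thread A passes the check, then is suspended
        boolean bSawEmpty = (tab[i] == null); // thread B passes the same check

        if (bSawEmpty) {
            tab[i] = new Node("keyB", "valueB"); // thread B inserts its node
        }
        if (aSawEmpty) {
            tab[i] = new Node("keyA", "valueA"); // thread A resumes and overwrites B's node
        }
        return tab;
    }

    public static void main(String[] args) {
        Node[] tab = replay();
        System.out.println("bucket head: " + tab[3].key + ", next: " + tab[3].next);
    }
}
```

After the replay the bucket holds only A's node with no successor, so the entry thread B inserted is unreachable even though B's put appeared to succeed.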

