HashMap Source Code Analysis

Latest recommended article published on 2024-08-19 16:01:20

野鸭丁真

Tags: java

Article link: https://blog.csdn.net/codepupil/article/details/120310736

Included in the column JAVA集合源码 (Java collections source code)

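This article explains how HashMap works internally: its storage structure (array + linked list + red-black tree), how collisions are resolved, the resize strategy, and the relevant source code. HashMap uses a hash function to place key-value pairs and resolves collisions with linked lists or red-black trees. When a list grows longer than 8 nodes, it is converted to a red-black tree to improve performance. The article also covers the HashMap constructors, the put method, and the roles of the related constants and member fields.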

HashMap overview

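1. HashMap is a collection that stores key-value pairs.

2. HashMap places data according to the hashCode of the key. In most cases a lookup goes straight to the right bucket, so access is fast, but the iteration order is not guaranteed.

3. HashMap allows at most one entry whose key is null; any number of entries may have a null value.

4. HashMap is not thread-safe: nothing prevents several threads from writing to the same HashMap at the same moment, and doing so without external synchronization can leave the data inconsistent.

5. capacity: the current length of the array. To make resizing efficient, its value is always of the form 2^n. Each resize increases n by 1, so the array becomes twice as large as before. The default initial value is 16.

6. loadFactor: the load factor, 0.75 by default. It is used together with threshold.

7. threshold: the resize threshold, equal to capacity * loadFactor. Once the number of stored entries exceeds this value, the array is resized.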
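The relationship between capacity, loadFactor and threshold in the list above can be checked with a few lines of arithmetic. The following is a minimal sketch, not HashMap's own code; the class name ThresholdDemo and the helper threshold() are made up for illustration.

```java
final class ThresholdDemo {
    /** Resize threshold for a given capacity and load factor (capacity * loadFactor). */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16; // default initial capacity
        for (int i = 0; i < 4; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, 0.75f));
            capacity <<= 1; // every resize doubles the capacity
        }
    }
}
```

With the defaults, a table of 16 slots is resized once it holds more than 12 entries, a table of 32 slots once it holds more than 24, and so on.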

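Structurally, HashMap is implemented as an array plus linked lists plus red-black trees (the red-black tree part was added in JDK 1.8).

Array:

HashMap is a collection for storing key-value pairs, and each pair is also called an Entry (in the JDK 1.8 source the class is named Node). These entries are spread across an array, which is the backbone of the HashMap.

Linked list:

Because the length of the array (table) is limited, the hash function can map different keys to the same index, so linked lists are used to resolve these collisions. Each element of table is not just a single Entry object but also the head node of a linked list; every Entry points to the next Entry through its next reference. When a new Entry maps to a slot that is already occupied, it only needs to be linked into the list at that slot.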

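An index collision looks like this:

Suppose we call hashMap.put("China", 0) to insert an element whose key is "China". A hash function is needed to determine where the Entry goes (its index): index = Hash("China"). Assuming the computed index is 2, the result of the insertion is:

Figure 5. Index collision, step 1

However, because the length of the HashMap's array is limited, as more and more entries are inserted even a perfect hash function will eventually produce index collisions, for example:

Figure 6. Index collision, step 2

The hash function reports that the index of the Entry about to be inserted is also 2, so it collides with the previously inserted Entry whose key is "China". A linked list resolves this: the new Entry is simply linked into the list at the colliding slot. Where it lands in the list depends on the JDK version. Up to JDK 1.7, new entries were inserted at the head of the list ("head insertion"), on the reasoning that recently inserted entries are more likely to be looked up. Since JDK 1.8, the version analyzed in this article, putVal() appends the new node to the tail of the list.

Figure 7. Index collision, step 3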
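To make the collision handling concrete, here is a minimal sketch of a chained hash table. It is not HashMap's implementation: the class ChainedTableDemo and its methods are hypothetical, keys are restricted to non-null strings, there is no resizing, and the hash spreading step is left out. New nodes are appended at the tail of the chain, which is what JDK 1.8 does.

```java
final class ChainedTableDemo {
    /** Simplified node: key, value, and a link to the next node in the same bucket. */
    static final class Node {
        final String key;
        int value;
        Node next;
        Node(String key, int value) { this.key = key; this.value = value; }
    }

    final Node[] table = new Node[16]; // length is a power of two

    /** Bucket index: the low bits of the hash code (keys must be non-null here). */
    int indexFor(String key) {
        return (table.length - 1) & key.hashCode();
    }

    /** Inserts or overwrites; a new node is appended at the tail of the chain. */
    void put(String key, int value) {
        int i = indexFor(key);
        if (table[i] == null) {
            table[i] = new Node(key, value);
            return;
        }
        Node p = table[i];
        while (true) {
            if (p.key.equals(key)) { p.value = value; return; }            // same key: overwrite
            if (p.next == null) { p.next = new Node(key, value); return; } // end of chain: append
            p = p.next;
        }
    }

    /** Renders the chain stored in one bucket. */
    String chain(int index) {
        StringBuilder sb = new StringBuilder();
        for (Node p = table[index]; p != null; p = p.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(p.key).append('=').append(p.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChainedTableDemo t = new ChainedTableDemo();
        t.put("Aa", 1); // "Aa" and "BB" share the hash code 2112, so they collide
        t.put("BB", 2);
        System.out.println(t.chain(t.indexFor("Aa"))); // prints "Aa=1 -> BB=2"
    }
}
```

The two keys were picked because their String hash codes are identical, which forces both entries into the same bucket regardless of the table length.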

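Red-black tree:

When a linked list grows past the threshold (8 nodes), it is converted into a red-black tree, which further improves HashMap's performance. The conversion only happens once the array has at least MIN_TREEIFY_CAPACITY (64) slots; below that, the table is resized instead.

Source code of HashMap's underlying storage structures:

The Node<K,V> class implements the data structure for the array slots and the linked lists: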

/**
 * Data structure for the array slots and linked lists.
 * Basic hash bin node, used for most entries.  (See below for
 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
 */
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;  // hash value of this node
    final K key;     // key of this node
    V value;         // value of this node
    // next node in the same bin's linked list; red-black tree TreeNodes also use next
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() { // hash code of the entry: key hash XOR value hash
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) { // overwrite the value and return the old one
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) { // o is a Map.Entry
            Map.Entry<?,?> e = (Map.Entry<?,?>)o; // cast
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
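The hashCode(), equals() and setValue() methods of Node can be observed from outside through the public Map.Entry interface. The sketch below is only an illustration; the class EntryContractDemo and its helper firstEntry() are hypothetical names, and it relies on standard java.util classes only.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class EntryContractDemo {
    /** Returns the first entry of the map's entry set (backed by a HashMap node). */
    static <K, V> Map.Entry<K, V> firstEntry(Map<K, V> map) {
        return map.entrySet().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("China", 0);
        Map.Entry<String, Integer> stored = firstEntry(map);

        // equals(): any Map.Entry with an equal key and an equal value is considered equal
        Map.Entry<String, Integer> other = new AbstractMap.SimpleEntry<>("China", 0);
        System.out.println(stored.equals(other));

        // hashCode(): hash of the key XOR hash of the value
        int expected = Objects.hashCode("China") ^ Objects.hashCode(0);
        System.out.println(stored.hashCode() == expected);

        // setValue(): returns the old value and writes through to the map
        Integer old = stored.setValue(1);
        System.out.println(old + " -> " + map.get("China"));
    }
}
```

Because equals() only requires the other object to be a Map.Entry, a node compares equal to entries produced by other Map implementations as long as key and value match.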

TreeNode<K,V> implements the storage structure for red-black trees:

/**
 * Extends LinkedHashMap.Entry<K,V>; storage structure for red-black trees.
 * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
 * extends Node) so can be used as extension of either regular or
 * linked node.
 */
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    TreeNode<K,V> parent;  // parent of this node
    TreeNode<K,V> left;    // left child of this node
    TreeNode<K,V> right;   // right child of this node
    TreeNode<K,V> prev;    // previous node of this node
    boolean red;           // color of this node (red or black)
    TreeNode(int hash, K key, V val, Node<K,V> next) {
        super(hash, key, val, next);
    }
    // ... (rest of the class not shown)
}

// LinkedHashMap.Entry, the parent class of TreeNode:
public class LinkedHashMap<K,V>
    extends HashMap<K,V>
    implements Map<K,V>
{

    /**
     * HashMap.Node subclass for normal LinkedHashMap entries.
     */
    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }
    // ... (rest of the class not shown)
}

Roles of HashMap's constants and member fields

HashMap constants:

/**
 * Default capacity used when no initial capacity is given at construction.
 * The default initial capacity - MUST be a power of two.
 */
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

/**
 * Maximum capacity of a HashMap.
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;  // 1 << 30 = 1073741824

/**
 * Default load factor. When the number of elements exceeds
 * capacity * load factor, resize() is called to grow the table.
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**
 * Bin size at which a linked list is converted into a red-black tree.
 * The bin count threshold for using a tree rather than list for a
 * bin.  Bins are converted to trees when adding an element to a
 * bin with at least this many nodes. The value must be greater
 * than 2 and should be at least 8 to mesh with assumptions in
 * tree removal about conversion back to plain bins upon
 * shrinkage.
 */
static final int TREEIFY_THRESHOLD = 8;

/**
 * Bin size at which a red-black tree is converted back into a linked list.
 * The bin count threshold for untreeifying a (split) bin during a
 * resize operation. Should be less than TREEIFY_THRESHOLD, and at
 * most 6 to mesh with shrinkage detection under removal.
 */
static final int UNTREEIFY_THRESHOLD = 6;

/**
 * Before a bin's linked list is converted into a red-black tree, the table
 * capacity is checked. If the collisions are caused by a table that is too
 * small (capacity below MIN_TREEIFY_CAPACITY), the list is not treeified;
 * the HashMap is grown with resize() instead.
 * The smallest table capacity for which bins may be treeified.
 * (Otherwise the table is resized if too many nodes in a bin.)
 * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
 * between resizing and treeification thresholds.
 */
static final int MIN_TREEIFY_CAPACITY = 64;
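How TREEIFY_THRESHOLD and MIN_TREEIFY_CAPACITY interact can be summarized as a small decision function. This is a simplified sketch based on the Javadoc comments above, not the JDK code itself (the real checks live in putVal() and treeifyBin(), which this article does not list); the class TreeifyDecisionDemo, the enum Action and the method afterAppend() are hypothetical names.

```java
final class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE, TREEIFY }

    /**
     * Decides what happens after a node is appended to a bin.
     *
     * @param nodesBeforeAppend number of nodes the bin held before the new node was added
     * @param tableLength       current length of the table array
     */
    static Action afterAppend(int nodesBeforeAppend, int tableLength) {
        if (nodesBeforeAppend < TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;   // bin is still short: stay a linked list
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE;      // table too small: grow it instead of treeifying
        }
        return Action.TREEIFY;         // long bin in a large table: convert to a tree
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64)); // KEEP_LIST
        System.out.println(afterAppend(8, 16)); // RESIZE
        System.out.println(afterAppend(8, 64)); // TREEIFY
    }
}
```

The resize branch reflects the idea that a long chain in a small table is more likely caused by the table size than by genuinely bad hash codes, so spreading the entries over more slots is tried first.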

HashMap member fields:

/**
 * Array that holds the Node<K,V> nodes.
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;

/**
 * Cached Set view of the entries (Node<K,V>) in the HashMap.
 * Holds cached entrySet(). Note that AbstractMap fields are used
 * for keySet() and values().
 */
transient Set<Map.Entry<K,V>> entrySet;

/**
 * Number of elements currently stored in the HashMap.
 * The number of key-value mappings contained in this map.
 */
transient int size;

/**
 * Number of structural modifications made to the HashMap
 * (overwriting a value is not a structural modification).
 * The number of times this HashMap has been structurally modified
 * Structural modifications are those that change the number of mappings in
 * the HashMap or otherwise modify its internal structure (e.g.,
 * rehash).  This field is used to make iterators on Collection-views of
 * the HashMap fail-fast.  (See ConcurrentModificationException).
 */
transient int modCount;

/**
 * threshold should equal table.length * loadFactor; once size exceeds
 * this value, resize() is called to grow the table.
 * The next size value at which to resize (capacity * load factor).
 *
 * @serial
 */
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;

/**
 * Load factor of the HashMap.
 * The load factor for the hash table.
 *
 * @serial
 */
final float loadFactor;
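The effect of modCount is visible from user code: changing the number of mappings while iterating makes the iterator fail fast, whereas overwriting the value of an existing key does not. The following sketch illustrates this with the public HashMap API; the class FailFastDemo and the method throwsCme() are hypothetical names.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

final class FailFastDemo {
    /**
     * Iterates over a three-entry map and modifies it inside the loop.
     * Returns true if the iteration ends with a ConcurrentModificationException.
     */
    static boolean throwsCme(boolean addNewKey) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                if (addNewKey) {
                    map.put("new-" + key, 0); // structural change: modCount is incremented
                } else {
                    map.put(key, 99);         // value overwrite: modCount stays the same
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("adding keys while iterating fails: " + throwsCme(true));
        System.out.println("overwriting values while iterating fails: " + throwsCme(false));
    }
}
```

The exception is a best-effort debugging aid rather than a synchronization mechanism, so code should not depend on it for correctness.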

How HashMap works internally

HashMap constructors

Constructor 1

public HashMap(int initialCapacity, float loadFactor) { // initial capacity, load factor
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // the table is not allocated yet, so threshold temporarily holds the initial array capacity
    this.threshold = tableSizeFor(initialCapacity);
}

// tableSizeFor(cap) returns the smallest power of two that is greater than or equal to cap
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
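The bit trick in tableSizeFor() is easier to follow with concrete inputs. The sketch below copies the method shown above into a stand-alone class so it can be run outside HashMap; the class name TableSizeForDemo is hypothetical.

```java
final class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Copy of the tableSizeFor() method listed above. */
    static int tableSizeFor(int cap) {
        int n = cap - 1;  // subtract 1 so that an exact power of two maps to itself
        n |= n >>> 1;     // the shifts smear the highest set bit into every lower bit
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] inputs = {0, 1, 10, 16, 17, 1000};
        for (int cap : inputs) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

For example, 10 becomes 16, 16 stays 16 and 17 becomes 32, so the result is rounded up to a power of two rather than to the nearest one.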

Constructor 2

Only the initial capacity is specified; the load factor takes the default value 0.75.

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

Constructor 3

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

Constructor 4

Builds a HashMap from the mappings of an existing Map.

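public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}

// Inserts the entries of Map<? extends K, ? extends V> m into this HashMap

final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    int s = m.size();
    if (s > 0) {
        // When putMapEntries() is called from the constructor, the table is still null
        if (table == null) { // pre-size
            // Derive the capacity of the new HashMap from the size of the source map.
            // s / loadFactor is usually fractional, and the cast to int below truncates
            // it, which could leave the capacity too small. A capacity has to be
            // rounded up, so 1.0F is added before the cast.
            float ft = ((float)s / loadFactor) + 1.0F;
            // Use (int)ft if it is below the maximum capacity, otherwise the maximum capacity
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);
            // t is larger than the capacity currently recorded in threshold
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        // The table already exists and s exceeds the threshold: resize
        else if (s > threshold)
            resize();
        // Insert every entry of m (enhanced for loop)
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}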
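The pre-sizing arithmetic in putMapEntries() can be traced with a few sample sizes. The sketch below reproduces only the table == null branch in a stand-alone class, assuming threshold is still 0 as it is when the copy constructor calls the method; the class PresizeDemo and the method presizedCapacity() are hypothetical names.

```java
final class PresizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Copy of tableSizeFor(): smallest power of two >= cap. */
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    /** Table capacity chosen for a source map of size s (pre-size branch only). */
    static int presizedCapacity(int s, float loadFactor) {
        float ft = ((float) s / loadFactor) + 1.0F;  // entries / load factor, plus 1
        int t = (ft < (float) MAXIMUM_CAPACITY) ? (int) ft : MAXIMUM_CAPACITY;
        return tableSizeFor(t);                      // round up to a power of two
    }

    public static void main(String[] args) {
        int[] sizes = {3, 6, 12, 100};
        for (int s : sizes) {
            System.out.println("source size " + s + " -> capacity "
                    + presizedCapacity(s, 0.75f));
        }
    }
}
```

With the default load factor, a source map of 12 entries yields 12 / 0.75 + 1 = 17, which is rounded up to a capacity of 32.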

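HashMap's put method

Suppose hashMap.put("apple", 0) is called; this inserts an element whose key is "apple" into the HashMap's table array. The hash() function is used to locate the Entry: it calls hashCode() on "apple" and mixes the high 16 bits of the result into the low 16 bits. putVal() then derives the final insertion index from that hash (in the JDK 1.8 source, (n - 1) & hash, where n is the table length) and stores the Entry at position index of table.

// Inserts a node into the HashMap for the given key and value

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}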
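Why hash() XORs the hash code with its own upper half becomes clear when two hash codes differ only in their high bits. The sketch below is an illustration under that assumption; the class HashSpreadDemo and the helpers spread() and index() are hypothetical names, and index() uses the (n - 1) & hash formula from JDK 1.8's putVal().

```java
final class HashSpreadDemo {
    /** Same mixing step as HashMap.hash(), applied to a raw hash code. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Bucket index for a table whose length is a power of two. */
    static int index(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;        // table length
        int h1 = 0x10000;  // two hash codes that differ only above bit 15
        int h2 = 0x20000;

        // Without spreading, only the low 4 bits are used and both land in bucket 0.
        System.out.println(index(h1, n) + " " + index(h2, n));

        // With spreading, the high bits influence the index: buckets 1 and 2.
        System.out.println(index(spread(h1), n) + " " + index(spread(h2), n));
    }
}
```

Since small tables only look at the lowest bits of the hash, folding the upper 16 bits into the lower 16 is a cheap way to let the whole hash code take part in bucket selection.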
