JDK 8 Source Code Walkthrough: HashMap Constructors

乐之终曲

Category: JDK8 Source Code Walkthrough. Tags: linked list, java, hashmap, red-black tree, source code

Article link: https://blog.csdn.net/qq_37143673/article/details/110196639

Source Code Reading Q&A
Constructors and Related Methods
Simulating a Constructor Run
- HashMap()
- HashMap(Map<? extends K, ? extends V> m)
Conclusion
Source Code

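Source Code Reading Q&A

These are my notes on what I have learned from reading source code, along with answers to the questions that confused me along the way. Feel free to ask if you have questions.
Related post: Source Code Reading Q&A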

Constructors and Related Methods

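I find HashMap's constructors hard to read, because a single piece of functionality is scattered across several methods instead of being handled in one place.

For example,
take the resize threshold field, threshold:

Some constructors initialize the threshold, but not to its final value: HashMap(int initialCapacity, float loadFactor)

Others leave the threshold calculation entirely to put: HashMap()

Each constructor handles it differently, which is unfriendly to newcomers and makes it easy to get lost in the source.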

HashMap(int initialCapacity, float loadFactor)

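The only line that needs explaining is this.threshold = tableSizeFor(initialCapacity);

The usual threshold formula would be: this.threshold = tableSizeFor(initialCapacity) * this.loadFactor;

However, this constructor does not allocate table, so threshold temporarily stores the initial capacity instead. The first put sees table == null and calls resize(), which uses that value as the array length and then recomputes threshold as capacity * loadFactor.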

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and load factor.
 *
 * @param  initialCapacity the initial capacity of the array
 * @param  loadFactor      the load factor
 * @throws IllegalArgumentException if the initial capacity is negative
 *         or the load factor is nonpositive
 */
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    // If the requested initial capacity exceeds the maximum, use the maximum instead
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
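    // Store the value that the first resize() will read.
    // You may wonder about this line:
    // threshold means the size at which the next resize happens, not a capacity,
    // so the formula should be: this.threshold = tableSizeFor(initialCapacity) * this.loadFactor;
    // The reason is that allocating table is deferred to put, where threshold is recomputed.
    // Note that size is 0 and table is null at this point, so the first put calls resize() first.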
    this.threshold = tableSizeFor(initialCapacity);
}
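
To make the deferred initialization concrete, here is a minimal sketch that replays only the capacity and threshold bookkeeping of this constructor and of the first resize() call. DeferredThresholdDemo, its capacity field, and firstResize() are hypothetical stand-ins invented for illustration (they are not part of the JDK); only the formulas are taken from the JDK 8 listings in this article.

```java
// Hypothetical stand-in: mirrors only the capacity/threshold bookkeeping of
// HashMap(int, float) and of the first resize(). It stores no entries.
class DeferredThresholdDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    int capacity;            // stands in for table.length; 0 means "table == null"
    int threshold;
    final float loadFactor;

    DeferredThresholdDemo(int initialCapacity, float loadFactor) {
        this.loadFactor = loadFactor;
        // Same trick as the JDK constructor: park the capacity in threshold.
        this.threshold = tableSizeFor(initialCapacity);
    }

    // Same body as the JDK 8 tableSizeFor method.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Mirrors the "oldCap == 0 and oldThr > 0" path of resize().
    void firstResize() {
        capacity = threshold;                        // newCap = oldThr
        float ft = (float) capacity * loadFactor;    // the real threshold
        threshold = (capacity < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                ? (int) ft : Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        DeferredThresholdDemo m = new DeferredThresholdDemo(10, 0.75f);
        System.out.println("after constructor:  capacity=" + m.capacity
                + ", threshold=" + m.threshold);
        m.firstResize();
        System.out.println("after first resize: capacity=" + m.capacity
                + ", threshold=" + m.threshold);
    }
}
```

With an initial capacity of 10 and a load factor of 0.75, the sketch reports threshold=16 right after construction and capacity=16, threshold=12 after the first resize, so the value stored by the constructor is really the future array length.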

tableSizeFor(int cap)

tableSizeFor(int cap) computes an array capacity that meets HashMap's requirement: the capacity must be a power of two (2 ^ N).

Example: if cap is 10, the result must be at least 10 and a power of two, so it comes out as 16.

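/**
 * Returns a power of two size for the given target capacity.
 * For someone like me who is not great at math, the calculation is hard to follow,
 * but knowing what it does is enough:
 * the result is the smallest power of two (2^n) that is greater than or equal to cap.
 *
 * @param  cap the requested initial capacity of the array
 */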
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
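
For readers who want to see the bit manipulation at work, the sketch below copies the method into a standalone class (TableSizeForDemo is a hypothetical name) and traces the intermediate values for cap = 10. The idea is that each shift-and-OR step copies the highest set bit into the positions to its right, so n ends up as a run of ones, and n + 1 is then a power of two. Starting from cap - 1 is what lets an exact power of two map to itself.

```java
// Standalone copy of tableSizeFor for experimentation; the class name is made up.
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same body as the JDK 8 method quoted above.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        // Trace the first steps for cap = 10 (later shifts change nothing here).
        int n = 10 - 1;
        System.out.println("cap - 1  : " + Integer.toBinaryString(n)); // 1001
        n |= n >>> 1;
        System.out.println("n>>>1 OR : " + Integer.toBinaryString(n)); // 1101
        n |= n >>> 2;
        System.out.println("n>>>2 OR : " + Integer.toBinaryString(n)); // 1111
        System.out.println("n + 1    : " + (n + 1));                   // 16

        // A few inputs, including the edge cases 0 and an exact power of two.
        for (int cap : new int[] {0, 1, 10, 16, 17}) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

The loop at the end maps 0 and 1 to 1, 10 and 16 to 16, and 17 to 32.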

HashMap(int initialCapacity)

/**
 * Constructs an empty <tt>HashMap</tt> with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param  initialCapacity the initial capacity of the array
 * @throws IllegalArgumentException if the initial capacity is negative.
 */
public HashMap(int initialCapacity) {
    // No load factor was given, so use the default
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

HashMap()

/**
 * Constructs an empty <tt>HashMap</tt> with the default initial capacity
 * (16) and the default load factor (0.75).
 */
public HashMap() {
    // No load factor was given, so use the default
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

HashMap(Map<? extends K, ? extends V> m)

The method to focus on here is putMapEntries

/**
 * Constructs a new <tt>HashMap</tt> with the same mappings as the
 * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
 * default load factor (0.75) and an initial capacity sufficient to
 * hold the mappings in the specified <tt>Map</tt>.
 *
 * @param   m the map whose mappings are to be placed in this map
 * @throws  NullPointerException if the specified map is null
 */
public HashMap(Map<? extends K, ? extends V> m) {
    // No load factor was given, so use the default
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    // Copy the entries into this map; this also pre-sizes the array and sets the next resize threshold
    putMapEntries(m, false);
}

putMapEntries(Map<? extends K, ? extends V> m, boolean evict)

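The purpose of this method is to merge the map passed as an argument into the current map.

Main logic:

Compute the required array capacity (the float ft variable)
Check that the value does not exceed the maximum (the int t variable)
Round t up to a power of two with tableSizeFor and store the result in threshold
In resize(), the array capacity is then read back from threshold
Finally, a loop puts the entries of the argument map into the current map

PS: Overall this is quite roundabout. The computed initial capacity is not used directly; it is parked in threshold, and resize() later reads it back as the array capacity and replaces threshold with capacity * loadFactor.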

/**
 * Copies the entries of the given map into this map.
 * Implements Map.putAll and Map constructor.
 *
 * @param m the map
 * @param evict false when initially constructing this map, else
 * true (relayed to method afterNodeInsertion).
 */
final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    // Number of entries in the argument map
    int s = m.size();
    if (s > 0) {
        // table is null in two cases:
        // Case 1: this method is called from the constructor
        // Case 2: putAll is called on a map whose table has not been allocated yet
        if (table == null) { // pre-size
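            // Compute the capacity the array should have.
            // How to read this?
            // Suppose the argument map holds 12 entries and the load factor is 0.75.
            // Let x be the required capacity, which gives 0.75x = 12,
            // so x = 16, and the code uses 16 + 1 = 17.
            // Why + 1? The (int) cast below truncates the fraction, so + 1 rounds up;
            // for s = 12 this yields 17, which tableSizeFor then rounds up to 32.
            float ft = ((float)s / loadFactor) + 1.0F;
            // Only checks whether the value exceeds the maximum; if so, use the maximum
            // (Note: at this point the value does not necessarily meet HashMap's capacity requirement, i.e. 2 ^ N)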
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);

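            // If the computed capacity exceeds the current threshold, store the new value.
            // Note: loadFactor is only a ratio;
            // threshold is normally the concrete entry count at which the next resize happens,
            // but here it temporarily holds the initial capacity (as in the constructor above).
            if (t > threshold)
                // Round up to a power of two, as explained above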
                threshold = tableSizeFor(t);
        }
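        // Reaching here means entries are being put into a map whose table already exists
        // Check whether the incoming size exceeds the threshold
        else if (s > threshold)
            // Resize
            resize();
        // Insert the entries one by one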
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}
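
The pre-sizing arithmetic is easier to follow with numbers. The sketch below is an illustration, not the JDK's code: PresizeDemo and presizedCapacity are hypothetical names, and the method assumes the copy-constructor case where threshold is still 0, so the check t > threshold always passes. It computes the value that would be parked in threshold and the real threshold that the first resize() derives from it.

```java
// Illustrative only: replays the "table == null" branch of putMapEntries
// for the copy-constructor case (threshold starts at 0).
class PresizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same body as the JDK 8 tableSizeFor method.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Hypothetical helper: the value putMapEntries would store in threshold.
    static int presizedCapacity(int s, float loadFactor) {
        float ft = ((float) s / loadFactor) + 1.0F;
        int t = (ft < (float) MAXIMUM_CAPACITY) ? (int) ft : MAXIMUM_CAPACITY;
        return tableSizeFor(t);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        for (int s : new int[] {1, 6, 12, 13}) {
            int capacity = presizedCapacity(s, loadFactor);
            // The first resize() turns the parked capacity into the real threshold.
            int realThreshold = (int) ((float) capacity * loadFactor);
            System.out.println("s=" + s + " -> capacity=" + capacity
                    + ", threshold after resize=" + realThreshold);
        }
    }
}
```

For 12 entries this gives a capacity of 32 and a threshold of 24 rather than 16 and 12: the + 1.0F pushes 16.0 to 17, and tableSizeFor rounds 17 up to the next power of two.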

resize()

The resize method. Its core jobs are to grow the array, redistribute the existing entries across the new array, and split red-black tree bins via split().

/**
 * Resize.
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
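    // The current table
    Node<K,V>[] oldTab = table;
    // Old array capacity
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    // Old threshold
    int oldThr = threshold;
    int newCap, newThr = 0;

    // A normal resize takes this branch: the capacity doubles, and so does the threshold
    if (oldCap > 0) {
        // If the old capacity has already reached MAXIMUM_CAPACITY (1 << 30), the array cannot grow;
        // threshold is set to Integer.MAX_VALUE so that no further resize is triggered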
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
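        // If the capacity shifted left by 1 (i.e. doubled) stays below the maximum and the old capacity
        // is at least the default initial capacity, the threshold is shifted left by 1 (doubled) as well
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            // As shown here, later thresholds are no longer computed from the load factor; they are simply doubled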
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // If the new threshold is still 0, compute it from the load factor
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    // A resize allocates a new array with the new capacity
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // Iterate over the old array
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
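                // No hash collision at this array position
                if (e.next == null)
                    // Compute the position in the new array and store the node there
                    newTab[e.hash & (newCap - 1)] = e;
                // Hash collision, and the bin is a red-black tree
                else if (e instanceof TreeNode)
                    // Split the tree bin so that its nodes are distributed between two buckets of the new table
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else {
                    // Hash collision, and the bin is a linked list
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    // See the notes in split; the goal is the same: divide the nodes into a lo list and a hi list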
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
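
The lo/hi split relies on one observation: when the capacity doubles from oldCap to 2 * oldCap, the new index differs from the old one only by the bit whose value is oldCap. A small sketch can check this against the direct computation hash & (newCap - 1). ResizeIndexDemo and newIndex are hypothetical names, and the sample values are treated as already-spread hashes (what HashMap.hash(key) would return), which is an assumption made to keep the example short.

```java
// Illustrative check of the index rule used by resize(); not JDK code.
class ResizeIndexDemo {
    // Index of a hash after the table doubles from oldCap to 2 * oldCap.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        // Only the bit with value oldCap decides: 0 keeps the index ("lo"),
        // 1 moves the node by exactly oldCap ("hi").
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53};   // all share index 5 while the capacity is 16
        for (int h : hashes) {
            int direct = h & (2 * oldCap - 1);   // full recomputation for comparison
            System.out.println("hash " + h
                    + ": old index " + (h & (oldCap - 1))
                    + ", new index " + newIndex(h, oldCap)
                    + ", direct " + direct);
        }
    }
}
```

Hashes 5 and 37 stay at index 5, while 21 and 53 move to index 21 (5 + 16). This is why resize() never needs to recompute a full index for each node: one bit test is enough.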

split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit)

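An important method: it splits a red-black tree bin.

It redistributes the tree's nodes between two buckets, and any resulting bin with 6 or fewer nodes is converted back to a linked list.

PS: In general, only a resize converts an undersized red-black tree back into a linked list. When a node is removed, the tree is converted to a list only if the root, the root's right child, the root's left child, or the left child's left child is null.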

/**
 * Tree bin splitting.
 * Splits nodes in a tree bin into lower and upper tree bins,
 * or untreeifies if now too small. Called only from resize;
 * see above discussion about split bits and indices.
 *
 * @param map the map
 * @param tab the table for recording bin heads (the new table that stores the bin head nodes)
 * @param index the index of the table being split (array index)
 * @param bit the bit of hash to split on (the old array capacity)
 */
final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
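    // The head node of the tree bin at the current array position
    TreeNode<K,V> b = this;

    TreeNode<K,V> loHead = null, loTail = null;
    TreeNode<K,V> hiHead = null, hiTail = null;
    int lc = 0, hc = 0;
    // Walk the bin through its next pointers (a tree bin is also a linked list)
    // e: the node currently being visited
    // next: the node after it
    for (TreeNode<K,V> e = b, next; e != null; e = next) {
        // Save the next node in a temporary variable
        next = (TreeNode<K,V>)e.next;
        // Detach the current node from its successor
        e.next = null;
        // The tree bin is cut into two groups here:
        // if e.hash & bit is 0, the node goes to the lo group
        // if e.hash & bit is not 0, the node goes to the hi group
        // e.hash & bit has only two possible results: 0 or bit,
        // because bit is the old array capacity, a power of two, so exactly one bit of its binary form is 1
        // The goal is to place each node in the bucket that matches its hash in the new table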
        if ((e.hash & bit) == 0) {
            if ((e.prev = loTail) == null)
                loHead = e;
            else
                loTail.next = e;
            loTail = e;
            ++lc;
        }
        else {
            if ((e.prev = hiTail) == null)
                hiHead = e;
            else
                hiTail.next = e;
            hiTail = e;
            ++hc;
        }
    }

    // The lo bin stays at position index
    if (loHead != null) {
        if (lc <= UNTREEIFY_THRESHOLD)
            // At or below the untreeify threshold: convert to a linked list
            tab[index] = loHead.untreeify(map);
        else {
            tab[index] = loHead;
            if (hiHead != null) // (else is already treeified)
                loHead.treeify(tab);
        }
    }
    // The hi bin goes to position index + bit (in the newly added part of the array)
    if (hiHead != null) {
        if (hc <= UNTREEIFY_THRESHOLD)
            // At or below the untreeify threshold: convert to a linked list
            tab[index + bit] = hiHead.untreeify(map);
        else {
            tab[index + bit] = hiHead;
            if (loHead != null)
                hiHead.treeify(tab);
        }
    }
}
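
The decision logic at the end of split() can be summarized with a small sketch. It does not relink any nodes; it only partitions a list of hashes on the split bit, counts both groups, and reports which shape each resulting bin would take. TreeBinSplitDemo, describeSplit, and shape are hypothetical names; UNTREEIFY_THRESHOLD uses the value 6 mentioned above, and the hashes are assumed to be already-spread values.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative summary of the decisions made by split(); not JDK code.
class TreeBinSplitDemo {
    static final int UNTREEIFY_THRESHOLD = 6;

    // Partitions the hashes on the split bit and describes both resulting bins.
    static String describeSplit(int[] hashes, int bit) {
        List<Integer> lo = new ArrayList<>();
        List<Integer> hi = new ArrayList<>();
        for (int h : hashes) {
            if ((h & bit) == 0) lo.add(h);   // stays at index
            else hi.add(h);                  // moves to index + bit
        }
        return "lo=" + shape(lo.size()) + ", hi=" + shape(hi.size());
    }

    // A bin with at most UNTREEIFY_THRESHOLD nodes becomes a plain list.
    static String shape(int count) {
        if (count == 0) return "empty";
        return count <= UNTREEIFY_THRESHOLD
                ? "list(" + count + ")" : "tree(" + count + ")";
    }

    public static void main(String[] args) {
        int bit = 64;   // old capacity; every hash below has index 3 in a 64-slot table
        int[] mixed = {3, 67, 131, 195, 259, 323, 387, 451, 515};
        int[] allLo = {3, 131, 259, 387, 515, 643, 771, 899, 1027};
        System.out.println(describeSplit(mixed, bit));
        System.out.println(describeSplit(allLo, bit));
    }
}
```

The first bin splits 5 to 4, so both halves fall back to lists. In the second bin every node lands in the lo group; this corresponds to the "(else is already treeified)" case in the real method, where the existing tree is kept as it is because the hi group is empty.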

untreeify(HashMap<K,V> map)

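PS: This shows once more why a tree bin also keeps its nodes linked as a doubly linked list: converting back to a singly linked list only requires walking the existing next pointers.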

/**
 * Converts a red-black tree bin back to a singly linked list.
 * Returns a list of non-TreeNodes replacing those linked from
 * this node.
 */
final Node<K,V> untreeify(HashMap<K,V> map) {
    Node<K,V> hd = null, tl = null;
    // Replace each tree node with a plain list node
    // The order is taken directly from the tree bin's linked-list order
    for (Node<K,V> q = this; q != null; q = q.next) {
        Node<K,V> p = map.replacementNode(q, null);
        if (tl == null)
            hd = p;
        else
            tl.next = p;
        tl = p;
    }
    return hd;
}

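Simulating a Constructor Run

Let's follow specific constructors to see how the code actually runs.

HashMap()

The most commonly used constructor. It does nothing up front except initialize the load factor.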

public HashMap() {
    // No load factor was given, so use the default
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

When put is called

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

Since table has not been initialized, it is null, so the code enters resize()

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    // ...
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // ...
}

At this point table == null, size == 0, threshold == 0

final Node<K,V>[] resize() {
    // ...
    // table == null, so oldCap == 0
    if (oldCap > 0) {
        // ...
    }
    // threshold == 0, so oldThr == 0
    else if (oldThr > 0) {
        // ...
    }
    else {
        // Nothing has been set yet, so this branch initializes the values
        // from the static default constants
        newCap = DEFAULT_INITIAL_CAPACITY;
        // Threshold: load factor * default capacity, which comes out as exactly 12
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // ...
}
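
The same arithmetic can be replayed outside the JDK to see how capacity and threshold evolve over several resizes. The sketch below copies only the capacity/threshold part of resize() into a hypothetical helper, next(oldCap, oldThr, loadFactor), which returns the pair {newCap, newThr}; ResizeMathDemo is likewise a made-up name, and no table is allocated.

```java
// Illustrative only: the capacity/threshold arithmetic of resize(), without any table.
class ResizeMathDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns {newCap, newThr} for the given old state.
    static int[] next(int oldCap, int oldThr, float loadFactor) {
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY)
                return new int[] {oldCap, Integer.MAX_VALUE};  // cannot grow further
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1;                          // double the threshold
        } else if (oldThr > 0) {
            newCap = oldThr;                // initial capacity was parked in threshold
        } else {
            newCap = DEFAULT_INITIAL_CAPACITY;                 // HashMap() defaults
            newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float) newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                    ? (int) ft : Integer.MAX_VALUE;
        }
        return new int[] {newCap, newThr};
    }

    public static void main(String[] args) {
        int[] state = {0, 0};   // new HashMap(): table == null, threshold == 0
        for (int i = 0; i < 3; i++) {
            state = next(state[0], state[1], DEFAULT_LOAD_FACTOR);
            System.out.println("capacity=" + state[0] + ", threshold=" + state[1]);
        }
    }
}
```

Starting from the state left by HashMap(), the three calls yield (16, 12), (32, 24), and (64, 48): the first resize takes the defaults branch, and every later one simply doubles both values.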

HashMap(Map<? extends K, ? extends V> m)

Enter the constructor, which initializes a new map from an existing map

Initialize the load factor

public HashMap(Map<? extends K, ? extends V> m) {
    // No load factor was given, so use the default
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    // Copy the entries into this map; this also pre-sizes the array and sets the next resize threshold
    putMapEntries(m, false);
}

Call putMapEntries

final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    // Number of entries in the argument map
    int s = m.size();
    if (s > 0) {
        // table has not been initialized yet, so it is still null
        if (table == null) {
            // Compute the capacity, but do not use it directly
            float ft = ((float)s / loadFactor) + 1.0F;
            // Cap at the maximum
            int t = ((ft < (float)MAXIMUM_CAPACITY) ? (int)ft : MAXIMUM_CAPACITY);
            if (t > threshold)
                // Park the computed capacity in threshold
                threshold = tableSizeFor(t);
        }
        // ...
        // Insert the entries one by one
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}

putVal is called in a loop to insert each key-value pair

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    // ...
    // table is null, so resize first
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // ...
}

Enter resize()

final Node<K,V>[] resize() {
    // The current table
    Node<K,V>[] oldTab = table;
    // Old array capacity
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    // Old threshold
    int oldThr = threshold;
    int newCap, newThr = 0;

    // table has not been initialized, so oldCap is 0
    if (oldCap > 0) {
        // ...
    }
    // putMapEntries executed threshold = tableSizeFor(t),
    // parking the computed capacity in threshold instead of using it directly,
    // so oldThr > 0
    else if (oldThr > 0)
        // The new capacity is simply the value that was parked in threshold
        newCap = oldThr;
    else {
        // ...
    }
    // newThr has not been assigned on this path, so it is still 0
    if (newThr == 0) {
        // Compute the real threshold:
        // it changes from threshold == newCap to newCap * loadFactor,
        // i.e. it gets smaller
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    // ...
    // From here on it is an ordinary resize
}