A Brief Look at the HashMap Source Code Series: the put Process (JDK 1.8)

一码事

Last modified 2024-05-26 22:50:02

Category: Java Basics. Tags: HashMap source code, source code, put process

First published 2019-07-20 12:11:42

Article link: https://blog.csdn.net/qq_42742861/article/details/96576271

1 Overview of the put method

In JDK 1.8, HashMap is backed by an array plus linked lists plus red-black trees.

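Compute a hash from the key being inserted. The key's hashCode() (an int) has its high 16 bits XORed into its low 16 bits, i.e. h ^ (h >>> 16), so that the high bits also influence the bucket index and collisions become less likely. A null key hashes to 0.
Check whether the node array (table) has been initialized. If not, create it via resize().
Compute the array index from the hash as (capacity - 1) & hash. Because the capacity is always a power of two, the result always lies in 0~(capacity - 1). It gives the same result as a non-negative modulo (%) by the capacity, only cheaper.
Look at that index (the bucket). There are two cases:
1. The bucket is empty (null). Create a new node holding the key-value pair and put it there.
2. The bucket already holds elements. There are three sub-cases:
  1. The key of the first node equals the key being inserted (same hash, and == or equals()). Keep a reference to this node in e so that the old value can be returned later.
  2. The first node is an instance of TreeNode, i.e. the bucket is a red-black tree. Call the tree's putTreeVal method.
  3. The bucket is a linked list. Walk the list until node.next is null and append the key-value pair at the tail. If the list is longer than 8 after the append, treeifyBin is called. If a node whose key equals the inserted key is found along the way, the walk stops immediately. Note: treeifyBin only builds a tree when the table length (not the element count) is at least 64. For a smaller table it just resizes to reduce hash collisions.
If an existing node e was found, set the new value on e and return the old value. Note: put() always overwrites and returns the value previously stored in the node, even if it equals the new one.
Otherwise a new node was added: size is incremented and, if it now exceeds the threshold, resize() expands the table.
When the key had no previous mapping, null is returned. This covers every case in which a new node was added, not only the empty-bucket case.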
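
The first and third steps above can be tried out in isolation. The following is a minimal sketch, not the JDK code itself: the class name HashIndexSketch and the method names spread and indexFor are made up for illustration, while the two expressions mirror the ones described above.

```java
final class HashIndexSketch {
    // Same spreading step as described above: XOR the high 16 bits into the low 16 bits.
    // A null key hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a power-of-two capacity.
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    public static void main(String[] args) {
        int capacity = 16; // must be a power of two for the mask to work
        for (String key : new String[] {"a", "hello", "HashMap"}) {
            int hash = spread(key);
            // The masked index and the non-negative modulo always agree.
            System.out.println(key + " -> index " + indexFor(hash, capacity)
                    + ", floorMod " + Math.floorMod(hash, capacity));
        }
    }
}
```

Math.floorMod is used for the comparison because the plain % operator returns a negative result for a negative hash, whereas the mask never does.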

2 Source code walkthrough

2.1 put()

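// put(key, value) simply calls putVal(hash(key), key, value, false, true).
// onlyIfAbsent is false, so an existing value for the key is overwritten.
// evict is true, meaning the table is not in creation mode (false would mean creation mode).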
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
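        // The table has not been initialized yet, or has length 0
        if ((tab = table) == null || (n = tab.length) == 0)
            // resize() also performs the initial allocation
            n = (tab = resize()).length;
        // (n - 1) & hash yields an index in 0..(n - 1). If that slot is null,
        // put the new node there directly.
        if ((p = tab[i = (n - 1) & hash]) == null)
            // newNode() simply constructs a plain Node
            tab[i] = newNode(hash, key, value, null);
        // The slot is occupied: look for an existing node with the same key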
        else {
            Node<K,V> e; K k;
            // The first node has the same hash and an equal key: remember it in e.
            // The value itself is replaced further down.
            if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // The bin is a red-black tree
            else if (p instanceof TreeNode)
                // Insert the TreeNode way (see 2.2)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // The bin is a linked list whose first node does not match the key.
                // Walk the list node by node.
                for (int binCount = 0; ; ++binCount) {
                    // Reached the tail without finding the key
                    if ((e = p.next) == null) {
                        // Append the new key-value pair at the tail
                        p.next = newNode(hash, key, value, null);
                        // The list already held at least TREEIFY_THRESHOLD (8) nodes
                        // before this append, so treeify the bin
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // Treeification runs in two steps (see 2.3 and 2.4)
                            treeifyBin(tab, hash);
                        break;
                    }
                    // A node with the same hash and an equal key already exists:
                    // stop walking, e now points at it
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    // Advance to the next node
                    p = e;
                }
            }
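            // e != null means the key was already mapped, so no node was added
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // put() passes onlyIfAbsent = false, so the value is always overwritten
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                // Empty hook in HashMap; LinkedHashMap overrides it
                afterNodeAccess(e);
                // Return the previous value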
                return oldValue;
            }
        }
        ++modCount;
        // A new node was added: resize once size exceeds the threshold
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
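
The return-value behaviour of putVal can be observed through the public API. The sketch below is only a demonstration: the class name PutReturnValueDemo and the method run are invented here, and putIfAbsent is used because HashMap implements it by calling putVal with onlyIfAbsent set to true.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PutReturnValueDemo {
    // Runs a few puts against one key and collects what each call returned.
    static List<Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> returned = new ArrayList<>();
        returned.add(map.put("k", 1));          // no previous mapping: null
        returned.add(map.put("k", 2));          // key exists: overwritten, old value 1 returned
        returned.add(map.put("k", 2));          // same value again: old value 2 is still returned
        returned.add(map.putIfAbsent("k", 3));  // onlyIfAbsent = true: 2 is kept and returned
        returned.add(map.get("k"));             // the mapping is still 2
        return returned;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [null, 1, 2, 2, 2]
    }
}
```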

2.2 putTreeVal(): inserting a node into a tree bin

final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab, int h, K k, V v) {
    Class<?> kc = null;
    boolean searched = false;
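    // Start from the root of the tree: this node (p in putVal) if it has no parent,
    // otherwise walk up via root()
    TreeNode<K,V> root = (parent != null) ? root() : this;
    // Descend from the root looking for the insertion point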
    for (TreeNode<K,V> p = root;;) {
        int dir, ph; K pk;
        // A smaller hash goes left, a larger hash goes right
        if ((ph = p.hash) > h)
            dir = -1;
        else if (ph < h)
            dir = 1;
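        // Same hash and an equal key: the key already exists, return that node
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        // Same hash, different key: try to order the keys via Comparable.
        // kc caches the key's comparable class after the first lookup.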
        else if ((kc == null && (kc = comparableClassFor(k)) == null) ||
                 (dir = compareComparables(kc, k, pk)) == 0) {
            // The keys cannot be ordered: search both subtrees once for an existing equal key
            if (!searched) {
                TreeNode<K,V> q, ch;
                searched = true;
                // find() walks a subtree iteratively in a do-while loop
                if (((ch = p.left) != null &&
                     (q = ch.find(h, k, kc)) != null) ||
                    ((ch = p.right) != null &&
                     (q = ch.find(h, k, kc)) != null))
                    return q;
            }
            dir = tieBreakOrder(k, pk);
        }

        TreeNode<K,V> xp = p;
        // dir is now decided: descend left if dir <= 0, otherwise right
        if ((p = (dir <= 0) ? p.left : p.right) == null) {
            Node<K,V> xpn = xp.next;
            // Create the new TreeNode
            TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
            if (dir <= 0)
                xp.left = x;
            else
                xp.right = x;
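            // Link x to its parent xp and splice it into the next/prev list right after xp
            xp.next = x;
            x.parent = x.prev = xp;
            // If xp had a successor, it now follows x in the list
            if (xpn != null)
                ((TreeNode<K,V>)xpn).prev = x;
            // Rebalance (recolor and rotate), then move the root to the front of the bin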
            moveRootToFront(tab, balanceInsertion(root, x));
            return null;
        }
    }
}
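
To exercise this path from the outside, one can force every key into the same bin. The sketch below is a hedged illustration: BadKey is a hypothetical key type with a constant hashCode, and whether the bin is actually a tree at any given moment is not visible through the public API, so the code only checks that inserts and lookups stay correct. Because BadKey implements Comparable, the compareComparables branch shown above would be the one that orders the keys.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    // Hypothetical key type for illustration: every instance has the same hashCode.
    static final class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Inserts `count` keys that all collide in one bin.
    static Map<BadKey, String> fill(int count) {
        Map<BadKey, String> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new BadKey(i), "v" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, String> map = fill(100);
        System.out.println(map.size());              // 100 distinct keys despite one shared hash
        System.out.println(map.get(new BadKey(57))); // v57
    }
}
```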

2.3 treeifyBin(): first step of treeification

final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    // If the table is null or shorter than MIN_TREEIFY_CAPACITY (64), do not treeify;
    // resize instead to spread the nodes out
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    // Otherwise turn the nodes of this bin into TreeNodes
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
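            // Replace the plain Node with a TreeNode
            TreeNode<K,V> p = replacementTreeNode(e, null);
            // The first TreeNode becomes the head of the new list
            if (tl == null)
                hd = p;
            else {
                // Keep the TreeNodes doubly linked in the original list order
                p.prev = tl;
                // Append p after the previous TreeNode
                tl.next = p;
            }
            // p is the new tail; move on to the next node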
            tl = p;
        } while ((e = e.next) != null);
        // Store the new head in the bin; if the bin is non-empty
        if ((tab[index] = hd) != null)
            // build the actual red-black tree (see 2.4)
            hd.treeify(tab);
    }
}
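
Combining the check in putVal with the one at the top of treeifyBin gives a small decision rule. The following sketch models only that rule: the class TreeifyDecisionSketch, the Action enum and the method afterAppend are invented for illustration, and the two constants are copies of the JDK 1.8 values named in the source above.

```java
final class TreeifyDecisionSketch {
    // Copies of the JDK 1.8 HashMap constants referenced in putVal and treeifyBin.
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // nodesBeforeAppend: length of the linked list before the new node is appended.
    // tableLength: current length of the bucket array.
    static Action afterAppend(int nodesBeforeAppend, int tableLength) {
        // putVal checks binCount >= TREEIFY_THRESHOLD - 1, which is reached
        // once the list already held at least 8 nodes.
        if (nodesBeforeAppend < TREEIFY_THRESHOLD) {
            return Action.NONE;
        }
        // treeifyBin prefers resizing while the table is still small.
        return tableLength < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 16));  // NONE
        System.out.println(afterAppend(8, 16));  // RESIZE
        System.out.println(afterAppend(8, 64));  // TREEIFY
    }
}
```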

2.4 treeify(): second step of treeification (recoloring, rotation, fixing the root)

// Build the tree from all nodes of one bin, starting at this node (the head)
final void treeify(Node<K,V>[] tab) {
    TreeNode<K,V> root = null;
    
    for (TreeNode<K,V> x = this, next; x != null; x = next) {
        // Remember the next TreeNode in the list
        next = (TreeNode<K,V>)x.next;
        // Clear the left and right children
        x.left = x.right = null;
        // No root yet: this node becomes the root and is colored black
        if (root == null) {
            x.parent = null;
            x.red = false;
            root = x;
        } else { // A root already exists: insert x below it
            K k = x.key;
            int h = x.hash;
            Class<?> kc = null;
            for (TreeNode<K,V> p = root;;) {
                int dir, ph;
                K pk = p.key;
                // As in putTreeVal: a smaller hash goes left, a larger hash goes right
                if ((ph = p.hash) > h)
                    dir = -1;
                else if (ph < h)
                    dir = 1;
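                // Equal hashes: try Comparable ordering first
                else if ((kc == null && (kc = comparableClassFor(k)) == null) ||
                         (dir = compareComparables(kc, k, pk)) == 0)
                    // k is the key being inserted, pk the key of the current node p.
                    // tieBreakOrder() compares the class names first and falls back to
                    // System.identityHashCode: -1 if k's is <= pk's, otherwise 1.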
                    dir = tieBreakOrder(k, pk);

                TreeNode<K,V> xp = p;
                // dir <= 0 goes to the left child, otherwise to the right child
                if ((p = (dir <= 0) ? p.left : p.right) == null) {
                    x.parent = xp;
                    if (dir <= 0)
                        xp.left = x;
                    else
                        xp.right = x;
                    // Recolor and rotate; compare with the rebalancing in TreeMap
                    root = balanceInsertion(root, x);
                    break;
                }
            }
        }
    }
    // Make sure the root is the first node of the bin
    moveRootToFront(tab, root);
}
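
The tie-breaking helper called above is short enough to look at on its own. The version below is a standalone sketch written to match the JDK 1.8 behaviour as I understand it; the wrapper class TieBreakSketch is invented here, so check it against your JDK's HashMap source before relying on the details.

```java
final class TieBreakSketch {
    // Orders two keys that have equal hashes and cannot be compared via Comparable.
    // The class names are compared first; if they are equal (or a key is null),
    // System.identityHashCode decides, and a tie counts as "left" (-1).
    static int tieBreakOrder(Object a, Object b) {
        int d;
        if (a == null || b == null ||
            (d = a.getClass().getName().compareTo(b.getClass().getName())) == 0) {
            d = (System.identityHashCode(a) <= System.identityHashCode(b)) ? -1 : 1;
        }
        return d;
    }

    public static void main(String[] args) {
        Object o = new Object();
        System.out.println(tieBreakOrder(o, o));       // -1: identical identity hash
        System.out.println(tieBreakOrder("a", 1) > 0); // true: "java.lang.String" sorts after "java.lang.Integer"
    }
}
```

The result is never 0, which is why the tree code can always pick a side. The ordering has no meaning beyond keeping insertions consistent.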

3 HashMap allows null keys and values, so why doesn't ConcurrentHashMap?

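The first statement of ConcurrentHashMap's putVal makes the check explicit: if the key or the value is null, a NullPointerException is thrown. But why?
Ambiguity: a null value in a ConcurrentHashMap would be ambiguous. Suppose ConcurrentHashMap did allow null values. A lookup that returns null could then mean two things:
1. the key is not in the map, so the result is null;
2. the key is mapped to null, so the result is that stored null.
That is the ambiguity.
HashMap allows null values, so is it not affected by the same ambiguity? HashMap is designed for single-threaded use, so when get returns null we can call containsKey(key) to tell whether null was stored as the value or the key is simply absent.
ConcurrentHashMap is different: it is used from multiple threads, which makes the situation more complicated.
Between a get(key) that returns null and a following containsKey(key), another thread may have added or removed the key, so a true result cannot tell us whether the null came from a stored null value or from a key that was absent at the time of the get.
Doug Lea, the author of ConcurrentHashMap, replied: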

The main reason that nulls aren't allowed in ConcurrentMaps
(ConcurrentHashMaps, ConcurrentSkipListMaps) is that ambiguities that may
be just barely tolerable in non-concurrent maps can't be accommodated. The
main one is that if map.get(key) returns null, you can't detect whether the key
explicitly maps to null vs the key isn't mapped. In a non-concurrent map, you
can check this via map.contains(key), but in a concurrent one, the map might
have changed between calls.

Further digressing: I personally think that allowing nulls in Maps (also Sets) is an
open invitation for programs to contain errors that remain undetected until they
break at just the wrong time. (Whether to allow nulls even in non-concurrent
Maps/Sets is one of the few design issues surrounding Collections that Josh
Bloch and I have long disagreed about.)
It is very difficult to check for null keys and values in my entire application.
Would it be easier to declare somewhere static final Object NULL = new
Object(); and replace all use of nulls in uses of maps with NULL?

In Doug Lea's view, the main reason for this design is that ambiguity cannot be tolerated in concurrent scenarios.
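
The difference is easy to reproduce. The sketch below is illustrative only: NullHandlingDemo, rejectsNull and lookup are names invented here, and lookup is only meaningful when no other thread touches the map between its two calls, which is exactly the guarantee a concurrent map cannot give.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NullHandlingDemo {
    // Returns true if the map refuses a null value with a NullPointerException.
    static boolean rejectsNull(Map<String, String> map) {
        try {
            map.put("k", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Single-threaded disambiguation of a null result: get first, then containsKey.
    static String lookup(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "value";
        }
        return map.containsKey(key) ? "mapped to null" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> plain = new HashMap<>();
        plain.put("present", null);
        System.out.println(lookup(plain, "present")); // mapped to null
        System.out.println(lookup(plain, "missing")); // absent

        System.out.println(rejectsNull(new HashMap<>()));           // false
        System.out.println(rejectsNull(new ConcurrentHashMap<>())); // true
    }
}
```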