[Java] JDK 1.8 HashMap Source Code Analysis

chy-7

Article link: https://blog.csdn.net/chan272/article/details/103262141

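HashMap is a hash-table-based implementation of the Map interface. It stores key-value pairs and permits null as both a key and a value. Internally it is an array of buckets, and each bucket holds a linked list (or, since JDK 1.8, a red-black tree) that resolves hash collisions. If the located bucket holds a single node (the Node's next is null), lookups and insertions need only one array access. If the bucket holds a chain, an insertion is O(n) in the chain length: the chain is traversed, an existing key has its value overwritten, and otherwise a new node is appended. A lookup also traverses the chain, comparing each node's key with equals. Shorter chains therefore mean better performance, which is why JDK 1.8 introduced red-black trees as an optimization for long chains.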

Basic principles

We start with the most commonly used methods, put and get. First, the basic fields of HashMap.

1. Basic fields

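    // Default initial capacity of the array: 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

    // Maximum capacity of the array; 1 << 30 is the largest power of two a positive int can hold
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Default load factor; the array is resized once the element count exceeds
    // the threshold, where threshold = table length * 0.75
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Treeify threshold: a chain that already holds this many nodes is converted
    // to a red-black tree when another node is appended
    static final int TREEIFY_THRESHOLD = 8;

    // When a tree bin is left with <= 6 nodes after a resize split, it is converted back to a singly linked list
    static final int UNTREEIFY_THRESHOLD = 6;

    // Minimum table capacity for treeification; below 64 the array is resized instead
    static final int MIN_TREEIFY_CAPACITY = 64;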
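
As a small illustration of how the load factor and capacity combine, the sketch below computes the resize threshold for a given table length. The class ThresholdDemo and its threshold method are hypothetical names used only for this example; they are not part of HashMap.

```java
class ThresholdDemo {
    // Hypothetical helper mirroring "threshold = table length * load factor".
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 1 << 4; // DEFAULT_INITIAL_CAPACITY
        // With the default load factor, a 16-slot table is resized
        // once it holds more than 12 entries.
        System.out.println(threshold(capacity, 0.75f));
        // After doubling to 32 slots, the threshold doubles as well.
        System.out.println(threshold(capacity << 1, 0.75f));
    }
}
```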

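2. put

First, a look at the Node class:
The nested class Node<K,V> is a singly linked list node that chains together the key-value pairs of one bucket. Its fields are the hash, the key, the value, and a reference to the next node.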

static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

Now the implementation of put:

 public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
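
put passes hash(key) to putVal, and putVal then derives the bucket index with (n - 1) & hash. The sketch below reproduces both steps outside HashMap to show why the high bits are mixed in. The spread expression matches HashMap.hash in JDK 1.8; the class HashSpreadDemo and the method names spread and indexFor are assumptions made for this example.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(Object) in JDK 1.8:
    // XOR the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index as computed in putVal; n must be a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        Integer key = 0x10000; // differs from 0 only in bit 16
        // Without spreading, 0 and 0x10000 share bucket 0 in a 16-slot table.
        System.out.println(indexFor(key.hashCode(), 16));
        // With spreading, bit 16 influences the low bits and the key moves to another bucket.
        System.out.println(indexFor(spread(key), 16));
    }
}
```

A null key always hashes to 0, which is why it lands in bucket 0.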

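When a key-value pair is put into the map, putVal() is called:

Each put first fetches the array of Node<K,V> (the hash table); if the table does not exist yet, it is created by resize().
Then the head node of the bucket is located from the hash. If the bucket is empty, the new node becomes its head. If it is not empty, a node with an equal key has its value overwritten; otherwise the new node is appended to the tail. After appending, if the bucket already held TREEIFY_THRESHOLD (8) nodes before the new one (binCount >= TREEIFY_THRESHOLD - 1), the method "tries" to convert the chain to a red-black tree ("tries" because if the array is still small, it is resized instead).
Finally, if the element count exceeds threshold, the array is resized.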

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        // tab: the internal array, i.e. the hash table;
        // p: the head node of the bucket the hash maps to;
        // n: array length;
        // i: bucket index for the hash;
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // Create the table if it does not exist or has length 0;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // Fetch the head node at the index; if it is null, place a new node there;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            // The head node has the same key: remember it so its value is overwritten below;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // The bucket is a red-black tree: insert into the tree; otherwise walk the list;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // Traverse the chain;
                for (int binCount = 0; ; ++binCount) {
                    // Reached the tail: append a new node;
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // Treeify check: true when the chain already held 8 nodes before this append;
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // The chain already contains this key: stop;
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    // Advance p
                    p = e;
                }
            }
            // A node with this key exists: replace its value and return the old value;
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // Count the structural modification; iterators use modCount for fail-fast detection;
        ++modCount;
        // Resize the array if the element count exceeds the threshold;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

The put method involves treeification, treeifyBin(), and resizing, resize().
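
The return-value behaviour of putVal can be observed through the public API. The following is a minimal sketch using java.util.HashMap; the class PutSemanticsDemo and its run method are names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

class PutSemanticsDemo {
    // Returns the values handed back by four successive insert calls.
    static Integer[] run() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("a", 1);         // no previous mapping, returns null
        Integer second = map.put("a", 2);        // key exists, value overwritten, returns old value
        Integer third = map.putIfAbsent("a", 3); // onlyIfAbsent path: value kept, returns current value
        Integer fourth = map.put(null, 4);       // null is a legal key
        return new Integer[] { first, second, third, fourth, map.get("a"), map.get(null) };
    }

    public static void main(String[] args) {
        for (Integer value : run()) {
            System.out.println(value);
        }
    }
}
```

putIfAbsent reaches the same putVal code with onlyIfAbsent set to true, so an existing non-null value is left untouched.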

First, the definition of the red-black tree node:
A TreeNode has a parent, a left child, a right child, a predecessor (prev), and a color.

static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }
        // ......
}

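Next, treeification: treeifyBin(Node<K,V>[] tab, int hash)
To improve access efficiency, when the chain in a bucket grows beyond 8 nodes (and the table has at least 64 slots), the chain is converted to a red-black tree. Each list node is replaced with a tree node, and the tree nodes are also kept linked as a doubly linked list through next and prev.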

final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        // If the table does not exist or is below the minimum treeify capacity, resize instead of treeifying;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                // Replace the current list node with a tree node;
                TreeNode<K,V> p = replacementTreeNode(e, null);
                // The first converted node becomes the head; later nodes are linked after the tail;
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            // Build the red-black tree from the converted list;
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }
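
The interplay between the check in putVal and the capacity check in treeifyBin can be summarised as a single decision. The sketch below is a simplified model of that decision, not HashMap code; the class TreeifyDecisionDemo, the Action enum, and the afterAppend method are hypothetical.

```java
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // nodesBeforeAppend: chain length before the new node is added to the tail.
    // tableLength: current length of the bucket array.
    static Action afterAppend(int nodesBeforeAppend, int tableLength) {
        if (nodesBeforeAppend < TREEIFY_THRESHOLD) {
            return Action.NONE;      // chain still short, keep the linked list
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE;    // small table: spreading nodes out is cheaper than a tree
        }
        return Action.TREEIFY;       // long chain in a large table: build a red-black tree
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64));
        System.out.println(afterAppend(8, 16));
        System.out.println(afterAppend(8, 64));
    }
}
```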

treeify(): performs the actual tree construction.

final void treeify(Node<K,V>[] tab) {
            TreeNode<K,V> root = null;
            // Traverse the list;
            for (TreeNode<K,V> x = this, next; x != null; x = next) {
                next = (TreeNode<K,V>)x.next;
                x.left = x.right = null;
                // The first node becomes the root and is colored black;
                if (root == null) {
                    x.parent = null;
                    x.red = false;
                    root = x;
                }
                // Otherwise insert x into the tree built so far;
                else {
                    K k = x.key;
                    int h = x.hash;
                    Class<?> kc = null;
                    // Starting from the root, descend while comparing each node with x;
                    for (TreeNode<K,V> p = root;;) {
                        int dir, ph;
                        K pk = p.key;
                        // If the node's hash is greater than x's, dir is -1 (go left);
                        // if it is smaller, dir is 1 (go right);
                        if ((ph = p.hash) > h)
                            dir = -1;
                        else if (ph < h)
                            dir = 1;
                        // Equal hashes:
                        // comparableClassFor() checks whether the key's class implements Comparable;
                        // if kc != null, compareComparables() compares the keys with compareTo();
                        // if that is not possible or still yields 0, tieBreakOrder() decides;
                        else if ((kc == null &&
                                  (kc = comparableClassFor(k)) == null) ||
                                 (dir = compareComparables(kc, k, pk)) == 0)
                            dir = tieBreakOrder(k, pk);

                        TreeNode<K,V> xp = p;
                        // Descend left or right according to dir; insert once an empty child slot is reached;
                        if ((p = (dir <= 0) ? p.left : p.right) == null) {
                            x.parent = xp;
                            if (dir <= 0)
                                xp.left = x;
                            else
                                xp.right = x;
                            // Rebalance the red-black tree;
                            root = balanceInsertion(root, x);
                            break;
                        }
                    }
                }
            }
            // Make sure the root is the first node of the bin in the table;
            moveRootToFront(tab, root);
        }
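
The ordering rule used while descending the tree can be illustrated in isolation. The sketch below is a simplified stand-in, not the JDK code: it compares hashes first, then falls back to compareTo when both keys are Comparable and of the same class, and finally to the identity hash code. The real comparableClassFor and tieBreakOrder are stricter (for example, tieBreakOrder compares class names before identity hash codes). The class TreeOrderDemo and the direction method are hypothetical names.

```java
class TreeOrderDemo {
    // Returns -1 to descend left and 1 to descend right when inserting
    // key k (hash h) below a node with key pk (hash ph).
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int direction(int h, Object k, int ph, Object pk) {
        if (ph > h) return -1;   // node hash is larger: go left
        if (ph < h) return 1;    // node hash is smaller: go right
        // Equal hashes: try natural ordering for same-class Comparable keys.
        if (k instanceof Comparable && pk != null && k.getClass() == pk.getClass()) {
            int c = ((Comparable) k).compareTo(pk);
            if (c != 0) return c < 0 ? -1 : 1;
        }
        // Last resort (simplified): identity hash codes give a consistent order.
        return System.identityHashCode(k) <= System.identityHashCode(pk) ? -1 : 1;
    }

    public static void main(String[] args) {
        System.out.println(direction(1, "a", 2, "b")); // decided by hash
        System.out.println(direction(5, "b", 5, "a")); // decided by compareTo
    }
}
```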

Rebalancing the red-black tree after an insertion: balanceInsertion(TreeNode<K,V> root, TreeNode<K,V> x)

static <K,V> TreeNode<K,V> balanceInsertion(TreeNode<K,V> root,
                                                    TreeNode<K,V> x) {
            // A newly inserted node is red by default;
            x.red = true;
            for (TreeNode<K,V> xp, xpp, xppl, xppr;;) {
                // x has no parent, so it is the root: color it black and return it;
                if ((xp = x.parent) == null) {
                    x.red = false;
                    return x;
                }
                // The parent is black, or the parent has no parent: nothing to adjust;
                else if (!xp.red || (xpp = xp.parent) == null)
                    return root;
                // The parent is the left child of the grandparent
                if (xp == (xppl = xpp.left)) {
                    // The uncle (right child of the grandparent) exists and is red;
                    if ((xppr = xpp.right) != null && xppr.red) {
                        // Color the uncle black;
                        xppr.red = false;
                        // Color the parent black;
                        xp.red = false;
                        // Color the grandparent red;
                        xpp.red = true;
                        // Continue the loop from the grandparent;
                        x = xpp;
                    }
                    else {
                        // x is the right child of its parent;
                        if (x == xp.right) {
                            // Rotate left around the parent;
                            root = rotateLeft(root, x = xp);
                            // Refresh the parent and grandparent references;
                            xpp = (xp = x.parent) == null ? null : xp.parent;
                        }
                        // The parent exists;
                        if (xp != null) {
                            // Color the parent black;
                            xp.red = false;
                            // The grandparent exists;
                            if (xpp != null) {
                                // Color the grandparent red;
                                xpp.red = true;
                                // Rotate right around the grandparent;
                                root = rotateRight(root, xpp);
                            }
                        }
                    }
                }
                // The parent is the right child of the grandparent
                else {
                    // The uncle (left child of the grandparent) exists and is red;
                    if (xppl != null && xppl.red) {
                        xppl.red = false;
                        xp.red = false;
                        xpp.red = true;
                        x = xpp;
                    }
                    else {
                        // x is the left child of its parent;
                        if (x == xp.left) {
                            root = rotateRight(root, x = xp);
                            xpp = (xp = x.parent) == null ? null : xp.parent;
                        }
                        // The parent exists;
                        if (xp != null) {
                            xp.red = false;
                            if (xpp != null) {
                                xpp.red = true;
                                root = rotateLeft(root, xpp);
                            }
                        }
                    }
                }
            }
        }
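
balanceInsertion relies on rotateLeft and rotateRight, which are not shown above. As an illustration of what a left rotation does to the links, here is a minimal sketch on a bare node type without colors. The class RotateDemo, its nested node class N, and the build helper are assumptions made for this example; the JDK version operates on TreeNode and also maintains the red flag.

```java
class RotateDemo {
    static final class N {
        final int key;
        N parent, left, right;
        N(int key) { this.key = key; }
    }

    // Rotates left around p: p's right child r takes p's place and
    // p becomes r's left child. Returns the (possibly new) root.
    static N rotateLeft(N root, N p) {
        N r = (p == null) ? null : p.right;
        if (r == null) return root;      // nothing to rotate
        N rl = r.left;
        p.right = rl;                    // r's left subtree moves under p
        if (rl != null) rl.parent = p;
        N pp = p.parent;
        r.parent = pp;                   // r is attached where p used to be
        if (pp == null) root = r;
        else if (pp.left == p) pp.left = r;
        else pp.right = r;
        r.left = p;
        p.parent = r;
        return root;
    }

    // Builds the right-leaning chain 1 -> 2 -> 3 and rotates left around 1.
    static N build() {
        N a = new N(1), b = new N(2), c = new N(3);
        a.right = b; b.parent = a;
        b.right = c; c.parent = b;
        return rotateLeft(a, a);
    }

    public static void main(String[] args) {
        N root = build();
        System.out.println(root.left.key + " <- " + root.key + " -> " + root.right.key);
    }
}
```

rotateRight is the mirror image, with left and right swapped.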

moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) makes sure the tree root is the first node of the bin that its hash maps to.

static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {
            int n;
            if (root != null && tab != null && (n = tab.length) > 0) {
                int index = (n - 1) & root.hash;
                TreeNode<K,V> first = (TreeNode<K,V>)tab[index];
                if (root != first) {
                    Node<K,V> rn;
                    tab[index] = root;
                    TreeNode<K,V> rp = root.prev;
                    // If the root has a next node, set that node's prev to the root's prev;
                    if ((rn = root.next) != null)
                        ((TreeNode<K,V>)rn).prev = rp;
                    // If the root has a prev node, set that node's next to the root's next;
                    if (rp != null)
                        rp.next = rn;
                    // If first exists, the root becomes its prev;
                    if (first != null)
                        first.prev = root;
                    root.next = first;
                    root.prev = null;
                }
                assert checkInvariants(root);
            }
        }

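3. Resizing

JDK 1.8 optimizes resizing. The explanation below draws on the following reference.
Reference: https://zhuanlan.zhihu.com/p/21673805
Because the array length always grows by a power of two, after a resize each element either stays at its original index or moves by exactly the old capacity.
[Figure: (a) how the indexes of key1 and key2 are determined before the resize; (b) how they are determined after the resize, where hash1 is the result of XOR-ing key1's hash code with its high bits.]

The hash itself is not recomputed; only the index is. Because n doubles, the mask n - 1 gains one more high bit, so the bitwise AND takes one additional bit of the hash into account.

If that bit is 1, the index becomes the original index + the old table length; if it is 0, the index does not change.
So when HashMap is resized, it only needs to check whether the newly significant bit of the existing hash is 1 or 0: for 0 the index is unchanged, and for 1 the index becomes "original index + oldCap".
[Figure: resize (rehash) from 16 to 32 buckets.]

This design avoids recomputing the hash, and because the newly significant bit can be regarded as effectively random, a resize spreads previously colliding nodes evenly across the new buckets. This is the optimization added in JDK 1.8. One difference is worth noting: in JDK 1.7, when a chain was migrated to the new table and its nodes ended up at the same index, their order was reversed (each node was inserted at the head of the new chain during traversal). JDK 1.8 preserves the original order.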
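
The index rule described above can be checked with a few lines of arithmetic. The sketch below compares the "check one bit" shortcut with a full recomputation of the index against the doubled table length. The class ResizeIndexDemo and its method names are hypothetical.

```java
class ResizeIndexDemo {
    // Index after doubling, derived only from the old index and one bit of the hash.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // Index after doubling, recomputed from scratch with the new mask.
    static int recomputed(int hash, int oldCap) {
        return hash & ((oldCap << 1) - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // 5, 21 and 37 all sit in bucket 5 of a 16-slot table.
        for (int hash : new int[] { 5, 21, 37 }) {
            System.out.println(hash + " -> " + newIndex(hash, oldCap)
                    + " (recomputed: " + recomputed(hash, oldCap) + ")");
        }
    }
}
```

For hash 21 the bit with value 16 is set, so the node moves from bucket 5 to bucket 21; hashes 5 and 37 have that bit clear and stay in bucket 5.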

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        // Capacity of the old table
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // Old resize threshold
        int oldThr = threshold;
        // New capacity and new threshold
        int newCap, newThr = 0;
        // The table has already been allocated
        if (oldCap > 0) {
            // The array is already at its maximum length: do not grow, only raise the threshold.
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // Double the capacity; if the doubled capacity is below the maximum and the
            // old capacity is at least the default 16, double the threshold as well;
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        // The table is not created yet, but threshold holds the requested initial capacity: use it as the array length;
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        // Use the default capacity and default threshold;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        // Allocate the new table
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        // If the old table is not null, move its nodes to the new table;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    // A single node: compute its index in the new table and store it;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    // A tree bin: split the tree into the new table;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    // A plain linked list:
                    // split it into two lists;
                    else { // preserve order
                        // loHead, loTail: the list that stays at the original index j
                        Node<K,V> loHead = null, loTail = null;
                        // hiHead, hiTail: the list that moves to index j + oldCap;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        // Traverse the chain;
                        do {
                            next = e.next;
                            // Nodes whose newly significant bit is 0 go to the lo list;
                            if ((e.hash & oldCap) == 0) {
                                // Original index
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                // Original index + oldCap
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        // Store the lo list at the original index;
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        // Store the hi list at original index + old table length;
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
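
The lo/hi split of a chain can be modelled with plain lists to show that relative order is preserved. This is a simplified sketch that works on hash values only rather than on Node objects; the class SplitChainDemo and its split method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SplitChainDemo {
    // Splits the hashes of one bucket into a "lo" list (stays at index j)
    // and a "hi" list (moves to j + oldCap), appending at the tail so that
    // the original order is kept in both lists.
    static List<List<Integer>> split(List<Integer> hashes, int oldCap) {
        List<Integer> lo = new ArrayList<>();
        List<Integer> hi = new ArrayList<>();
        for (int hash : hashes) {
            if ((hash & oldCap) == 0) {
                lo.add(hash);
            } else {
                hi.add(hash);
            }
        }
        List<List<Integer>> result = new ArrayList<>();
        result.add(lo);
        result.add(hi);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> bucket5 = new ArrayList<>();
        bucket5.add(5); bucket5.add(21); bucket5.add(37); bucket5.add(53);
        List<List<Integer>> parts = split(bucket5, 16);
        System.out.println("lo (index 5):  " + parts.get(0));
        System.out.println("hi (index 21): " + parts.get(1));
    }
}
```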

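Tree pruning: split()
When HashMap resizes, tree bins are split as well.
If a bin is a red-black tree and, after the split, one of the resulting halves has no more than UNTREEIFY_THRESHOLD (6 by default) nodes, that half is converted back to a linked list; otherwise it is rebuilt as a smaller tree.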

final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
            TreeNode<K,V> b = this;
            // Relink into lo and hi lists, preserving order
            TreeNode<K,V> loHead = null, loTail = null;
            TreeNode<K,V> hiHead = null, hiTail = null;
            int lc = 0, hc = 0;
            for (TreeNode<K,V> e = b, next; e != null; e = next) {
                next = (TreeNode<K,V>)e.next;
                e.next = null;
                // If the hash bit selected by bit (the old capacity) is 0,
                // the node stays in the lo list at the original index;
                if ((e.hash & bit) == 0) {
                    if ((e.prev = loTail) == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                    // Count the nodes in the lo list;
                    ++lc;
                }
                else {
                    // Otherwise the node goes into the hi list,
                    // which moves to index + bit;
                    if ((e.prev = hiTail) == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                    // Count the nodes in the hi list;
                    ++hc;
                }
            }
            if (loHead != null) {
                // If the lo list has at most 6 nodes, convert its tree nodes
                // back to plain nodes and store the result as an ordinary
                // linked list at the original index;
                if (lc <= UNTREEIFY_THRESHOLD)
                    tab[index] = loHead.untreeify(map);
                else {
                    // Otherwise keep a tree at the original index; it only needs rebuilding if nodes moved to the hi list;
                    tab[index] = loHead;
                    if (hiHead != null) // (else is already treeified)
                        loHead.treeify(tab);
                }
            }
            if (hiHead != null) {
                // The hi list is stored at index + bit, either as a linked list or as a rebuilt tree;
                if (hc <= UNTREEIFY_THRESHOLD)
                    tab[index + bit] = hiHead.untreeify(map);
                else {
                    tab[index + bit] = hiHead;
                    if (loHead != null)
                        hiHead.treeify(tab);
                }
            }
        }

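Pruning has two parts: first classify the nodes, then decide by node count whether each half reverts to a linked list or remains a smaller red-black tree.
1. For the given index and bit, nodes with (hash & bit) == 0 go into the original (lo) tree, and the others go into the new (hi) tree.
2. If the lo tree has at most 6 nodes, it is converted back to a linked list and tab[index] points to it; with more than 6 nodes it stays a red-black tree, only with fewer nodes. The hi tree is handled the same way, but it is stored at tab[index + bit].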

4. get

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

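getNode looks the key up in the hash table and, depending on how the bucket is stored, searches either the linked list or the red-black tree.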

final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
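
Because get returns null both when the key is absent and when the key is mapped to null, the two cases cannot be told apart by get alone. The sketch below shows the difference using containsKey; the class GetNullDemo and its sample method are names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

class GetNullDemo {
    // A map in which one key is explicitly mapped to null.
    static Map<String, String> sample() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null);
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = sample();
        // Both calls return null, although only one key is stored.
        System.out.println(map.get("present"));
        System.out.println(map.get("absent"));
        // containsKey distinguishes the two cases.
        System.out.println(map.containsKey("present"));
        System.out.println(map.containsKey("absent"));
    }
}
```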

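Summary

1. HashMap is a chained hash table: a combination of an array, linked lists, and red-black trees.

2. The initial capacity of HashMap is 16, and the capacity always grows as a power of two (2^n). First, a sufficiently large array keeps collisions low, which helps performance; second, a power-of-two length allows a bitwise AND to replace the modulo operation when computing the index.

3. Resize threshold = table length x load factor; the default load factor is 0.75.

4. HashMap is not thread-safe.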
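
The second point can be verified directly: for a power-of-two length n, (n - 1) & hash gives the same result as a non-negative modulo, including for negative hashes. The sketch below compares the mask with Math.floorMod; the class MaskVsModDemo and its method names are hypothetical.

```java
class MaskVsModDemo {
    // Index via bit mask, as HashMap does; n must be a power of two.
    static int maskIndex(int hash, int n) {
        return (n - 1) & hash;
    }

    // True when the mask agrees with the mathematical (non-negative) modulo.
    static boolean sameAsFloorMod(int hash, int n) {
        return maskIndex(hash, n) == Math.floorMod(hash, n);
    }

    public static void main(String[] args) {
        System.out.println(maskIndex(-7, 16));      // mask result
        System.out.println(Math.floorMod(-7, 16));  // same value
        System.out.println(-7 % 16);                // the % operator keeps the sign, so it could not be used as an index
    }
}
```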
