HashMap Source Code Notes

一枚螺丝钉

Tags: java

Article link: https://blog.csdn.net/ymlsd/article/details/112072571

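This article walks through HashMap's internal data structures, including the constructors, hash computation, and the insert, get, and remove operations. It pays particular attention to treeification, from the basic data structure through the treeifyBin and treeify methods to tree lookup, removal, and the strategy for keeping the tree balanced. For tree node removal, it describes how to find a replacement node and keep the tree balanced.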

HashMap class diagram

[Figure: HashMap class diagram]

Important comments

<p>As a general rule, the default load factor (.75) offers a good
 * tradeoff between time and space costs.  Higher values decrease the
 * space overhead but increase the lookup cost (reflected in most of
 * the operations of the {@code HashMap} class, including
 * {@code get} and {@code put}).  The expected number of entries in
 * the map and its load factor should be taken into account when
 * setting its initial capacity, so as to minimize the number of
 * rehash operations.  If the initial capacity is greater than the
 * maximum number of entries divided by the load factor, no rehash
 * operations will ever occur.
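 * 1. The default load factor of 0.75 balances time and space costs well. A higher load factor reduces space overhead but slows down lookups.
2. If the initial capacity passed at construction is greater than the actual number of entries divided by the load factor, no resize will ever occur.
So when using a HashMap, estimate the number of entries in advance and pass a matching initial capacity to the constructor to avoid rehashing.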
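
As a rough illustration of the sizing advice above, the sketch below computes an initial capacity from an expected entry count. The class name `CapacitySizing` and the helper `capacityFor` are hypothetical names introduced here, not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

final class CapacitySizing {
    /** Smallest initial capacity whose threshold (capacity * loadFactor) covers expectedEntries. */
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) Math.ceil(expectedEntries / (double) loadFactor);
    }

    public static void main(String[] args) {
        int expected = 1000;
        // 1000 / 0.75 rounds up to 1334; HashMap itself rounds that up to the next power of two
        int initialCapacity = capacityFor(expected, 0.75f);
        Map<Integer, String> map = new HashMap<>(initialCapacity);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i); // no resize is needed after the table is first allocated
        }
        System.out.println(initialCapacity + " -> " + map.size());
    }
}
```

The cast to `double` keeps the division out of integer arithmetic, and `Math.ceil` makes sure the result never undershoots.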

* Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
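     When a bin's list grows past the threshold (TREEIFY_THRESHOLD), the list is converted to a red-black tree to speed up lookups. With well-distributed hash codes, the probability of a bin reaching 8 nodes is tiny (about 6 in 100 million according to the table above), so this conversion only guards against extreme cases such as badly distributed hash codes.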
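
The table in the comment can be reproduced directly from the formula it quotes. The following is a minimal sketch; `BinSizeDistribution` and `poisson` are names chosen here for illustration.

```java
final class BinSizeDistribution {
    /** Poisson probability P(k) = exp(-lambda) * lambda^k / k!. */
    static double poisson(double lambda, int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // lambda = 0.5 is the average parameter quoted in the HashMap comment
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(0.5, k));
        }
    }
}
```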

Internal data structures

static class Node<K,V> implements Map.Entry<K,V> {

        final int hash;
        final K key;
        V value;
        Node<K,V> next;
}
transient Node<K,V>[] table;

Constructors

/**
     * Constructs an empty {@code HashMap} with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     *
     * Constructor that lets the caller set both the initial capacity and the load factor
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        // the maximum capacity of the map is 1 << 30
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;

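        // loadFactor can be NaN when the caller computes it (e.g. 0.0f / 0.0f);
        // NaN <= 0 is false, so NaN needs its own check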
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
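        // round the requested initial capacity up to a power of two;
        // threshold temporarily holds the table capacity until the table is allocated in resize()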
        this.threshold = tableSizeFor(initialCapacity);
    }

/**
     * Returns a power of two size for the given target capacity.
     * Why must the capacity be a power of two?
     * A power of two minus 1 has all of its low bits set to 1, so it acts as a mask for computing a key's bucket index.
     */
    static final int tableSizeFor(int cap) {
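        // count the leading zero bits of cap - 1, then shift -1 (all ones) right by that amount
        // -1 in sign-magnitude:   10000000000000000000000000000001
        // -1 in two's complement: 11111111111111111111111111111111
        // suppose cap == 16: 00000000000000000000000000010000, 27 leading zeros
        // cap - 1 == 15:     00000000000000000000000000001111, 28 leading zeros
        // n == -1 >>> 28 ==  00000000000000000000000000001111 == 15
        // n always has the form 2^x - 1, so n + 1 is a power of two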
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
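
To see the rounding in action, the method can be copied into a standalone class and called with a few capacities. This is only a sketch: `TableSizeForDemo` is a hypothetical wrapper class, and the method body is the one shown above.

```java
final class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Copy of HashMap.tableSizeFor as listed above: rounds cap up to a power of two. */
    static int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 2, 3, 16, 17, 1000};
        for (int cap : samples) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

For `cap` of 0 or 1 the shift distance is 0 or 32 (which Java treats as 0), so `n` stays -1 and the `n < 0` branch returns 1.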

hash()

	/**
	* XORs the high bits of the hashCode into the low bits, so that the bucket index also reflects the high bits
	*/
	static final int hash(Object key) {
        int h;
        // XOR the key's hashCode with its own upper 16 bits
        // to spread the hash as much as possible and reduce collisions
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
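
The effect of the XOR is easiest to see with hash codes that differ only in their upper bits. In the sketch below, `HashSpreadDemo`, `spread`, and `indexFor` are illustrative names; the two sample hash codes are made up for the example.

```java
final class HashSpreadDemo {
    /** Same mixing step as HashMap.hash(): fold the upper 16 bits into the lower 16. */
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    /** Bucket index for a table whose length is a power of two. */
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int h1 = 0x10000; // differs from h2 only above bit 15
        int h2 = 0x20000;
        int n = 16;
        // without spreading, both land in bucket 0
        System.out.println(indexFor(h1, n) + " " + indexFor(h2, n));
        // with spreading, the high bits influence the index
        System.out.println(indexFor(spread(h1), n) + " " + indexFor(spread(h2), n));
    }
}
```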

put()

/**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with {@code key}, or
     *         {@code null} if there was no mapping for {@code key}.
     *         (A {@code null} return can also indicate that the map
     *         previously associated {@code null} with {@code key}.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
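            // lazily initialize the table on first insertion
            n = (tab = resize()).length;
        // bucket index = (table length - 1) & hash
        // when key == null, hash is 0, so a null key always lands in bucket 0
        if ((p = tab[i = (n - 1) & hash]) == null)
            // the bucket is empty: create the node and store it directly at this index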
            tab[i] = newNode(hash, key, value, null);
        else {
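            // the bucket is already occupied (index collision)
            Node<K,V> e; K k;
            // the head node has the same hash and an equal key
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                // p and e now refer to the same node
                e = p;
            else if (p instanceof TreeNode)
                // p is a tree node: insert through the tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // the key is not the head node and the bin is not a tree: walk the list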
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
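                        // reached the last node of the list: append the new node
                        // (tail insertion)
                        p.next = newNode(hash, key, value, null);
                        // when binCount is 7, p is the 8th node, so the node just appended is the 9th;
                        // at that point the bin is treeified (or the table resized, see treeifyBin)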
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // check whether the key being inserted equals the current node's key
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // remember the old value
                V oldValue = e.value;
                // replace the value if overwriting is allowed or the old value is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
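        // structural modification counter (used by fail-fast iterators)
        ++modCount;
        // resize when the new size exceeds the threshold;
        // a size equal to the threshold does not trigger a resize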
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
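
The return-value and `onlyIfAbsent` behavior of `putVal` can be observed through the public API. The sketch below uses only standard `java.util.Map` methods; `PutSemanticsDemo` and `demo` are names introduced for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PutSemanticsDemo {
    /** Collects the values returned by a short sequence of put-style calls. */
    static List<Integer> demo() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        results.add(map.put("a", 1));         // no previous mapping: returns null
        results.add(map.put("a", 2));         // overwrites and returns the old value 1
        results.add(map.putIfAbsent("a", 3)); // onlyIfAbsent: keeps 2 and returns it
        results.add(map.put(null, 0));        // a null key is allowed (hash 0, bucket 0)
        results.add(map.get("a"));            // still 2
        return results;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```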

get() & remove()

public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    /**
     * Implements Map.get and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
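        // first check that the table has been initialized and is not empty,
        // and that the bucket for the requested key holds a node
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            // same hash, and the key is the same reference or equal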
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            // the bin has more than one node
            if ((e = first.next) != null) {
                // the bin is a red-black tree: search the tree
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // e starts at first.next; check each node for a match
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

final Node<K,V> removeNode(int hash, Object key, Object value,
                               boolean matchValue, boolean movable) {
        Node<K,V>[] tab; Node<K,V> p; int n, index;

        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            // locate the node to remove, using the same lookup as get
            Node<K,V> node = null, e; K k; V v;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                node = p;
            else if ((e = p.next) != null) {
                if (p instanceof TreeNode)
                    node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
                else {
                    do {
                        if (e.hash == hash &&
                            ((k = e.key) == key ||
                             (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }


            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode)
                    ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
                else if (node == p)
                    // the node being removed is the first node of the bin
                    tab[index] = node.next;
                else
                    // node is the node being removed
                    // p is the node just before it
                    p.next = node.next;
                ++modCount;
                --size;
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }
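
A short usage sketch for the lookup and removal paths above. It relies only on public `java.util.Map` methods; `GetRemoveDemo` and `demo` are hypothetical names. `remove(key, value)` corresponds to calling `removeNode` with `matchValue` set to true.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GetRemoveDemo {
    /** Collects the results of a few get/containsKey/remove calls. */
    static List<Object> demo() {
        Map<String, Integer> map = new HashMap<>();
        map.put("k", 1);
        map.put("nullValue", null);

        List<Object> results = new ArrayList<>();
        results.add(map.get("nullValue"));         // null: the key maps to null
        results.add(map.containsKey("nullValue")); // true: the node exists
        results.add(map.get("missing"));           // null: no node at all
        results.add(map.containsKey("missing"));   // false
        results.add(map.remove("k", 2));           // false: value does not match, nothing removed
        results.add(map.remove("k", 1));           // true: key and value both match
        results.add(map.remove("k"));              // null: already removed
        return results;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because `get` returns null both for a missing key and for a key mapped to null, `containsKey` is the way to tell the two cases apart.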

resize()

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        // length of the old table (0 if it has not been allocated yet)
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                // the capacity is already at the maximum: set threshold to Integer.MAX_VALUE and stop resizing
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
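                // for oldCap < 16 the threshold is not doubled here but recomputed below as newCap * loadFactor,
                // presumably because doubling a small truncated threshold would compound the rounding error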
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            // an initial capacity was given at construction: use it as newCap
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // no initial capacity was given: use the defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
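            // an initial capacity was given and the table is being created for the first time,
            // or oldCap is below 16: recompute the threshold
            float ft = (float)newCap * loadFactor;
            // loadFactor is user-supplied, so ft must also be checked against (float)MAXIMUM_CAPACITY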
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
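                    // clear the old slot so the old table stops referencing this bin (presumably to help GC)
                    oldTab[j] = null;
                    if (e.next == null)
                        // the bin holds a single node: place it directly
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // the bin is a tree: split it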
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
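                            // AND the hash with the old capacity
                            /**
                             * Suppose the old length is 16 (binary 10000).
                             * An element stored at index 0 satisfies hash & 1111 == 0.
                             * Its hash may be 10000, 100000, 110000, ...; the array index already fixes the low 4 bits.
                             *
                             * To decide whether the element stays at its current index or moves to index 16,
                             * it is enough to check whether bit 5 (the oldCap bit) is 1.
                             */
                            // nodes are appended at the tail, so their relative order is preserved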
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                /**
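                                 * The first time we get here, loHead and loTail refer to the same node,
                                 * so assigning loTail.next is the same as assigning loHead.next.
                                 * The second time, loTail refers to loHead.next,
                                 * so assigning loTail.next is equivalent to loHead.next.next = e.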
                                 */
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            // cut the link left over from the original list
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
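
The lo/hi split in `resize()` relies on the fact that, after doubling, a node either keeps its index `j` or moves to `j + oldCap`, depending on a single hash bit. The sketch below checks that shortcut against the direct computation `hash & (newCap - 1)`. `ResizeSplitDemo` and `newIndex` are names made up for this example.

```java
final class ResizeSplitDemo {
    /** Index after doubling, derived from the old index and the oldCap bit of the hash. */
    static int newIndex(int hash, int oldIndex, int oldCap) {
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int newCap = oldCap << 1;
        int[] hashes = {5, 21, 37, 53}; // all map to index 5 in a table of length 16
        for (int h : hashes) {
            int oldIndex = h & (oldCap - 1);
            int viaSplit = newIndex(h, oldIndex, oldCap);
            int direct = h & (newCap - 1);
            System.out.println(h + ": old=" + oldIndex + " split=" + viaSplit + " direct=" + direct);
        }
    }
}
```

With these sample hashes, 5 and 37 stay at index 5 (the "lo" list) while 21 and 53 move to index 21 (the "hi" list).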

Treeification

Basic data structure

/**
* TreeNode extends LinkedHashMap.Entry, which in turn extends HashMap.Node
*/
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        // parent node
        TreeNode<K,V> parent;  // red-black tree links
        // left child
        TreeNode<K,V> left;
        // right child
        TreeNode<K,V> right;
        // previous node in the linked-list order
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;
        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }
}

treeifyBin()

/**
     * Replaces all linked nodes in bin at index for given hash unless
     * table is too small, in which case resizes instead.
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        // if the table length is below 64 (MIN_TREEIFY_CAPACITY), resize instead of treeifying
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            // hd marks the head of the list
            // tl is the tail used to link nodes together
            TreeNode<K,V> hd = null, tl = null;
            do {
                // convert the node to a TreeNode with its next pointer set to null
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            // index == (n - 1) & hash
            if ((tab[index] = hd) != null)
                // build the tree
                hd.treeify(tab);
        }
    }

treeify()

/**
         * Forms tree of the nodes linked from this node.
         */
        final void treeify(Node<K,V>[] tab) {
            TreeNode<K,V> root = null;
            for (TreeNode<K,V> x = this, next; x != null; x = next) {
                next = (TreeNode<K,V>)x.next;
                x.left = x.right = null;
                // no root yet: the first node becomes the root
                if (root == null) {
                    x.parent = null;
                    // red-black tree property: the root is always black
                    x.red = false;
                    root = x;
                }
                else {
                    K k = x.key;
                    int h = x.hash;
                    Class<?> kc = null;
                    for (TreeNode<K,V> p = root;;) {
                        int dir, ph;
                        K pk = p.key;
                        if ((ph = p.hash) > h)
                            dir = -1;
                        else if (ph < h)
                            dir = 1;
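                        // if the hashes are equal and the key's class implements Comparable for its own type, compare with compareTo
                        // if the key is not comparable, or compareTo returns 0, fall back to tieBreakOrder,
                        // which compares class names and then System.identityHashCode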
                        else if ((kc == null &&
                                  (kc = comparableClassFor(k)) == null) ||
                                 (dir = compareComparables(kc, k, pk)) == 0)
                            dir = tieBreakOrder(k, pk);

                        TreeNode<K,V> xp = p;
                        // if dir <= 0, descend to the left of the current node, otherwise to the right
                        if ((p = (dir <= 0) ? p.left : p.right) == null) {
                            // found an empty slot: set x's parent
                            x.parent = xp;
                            if (dir <= 0)
                                xp.left = x;
                            else
                                xp.right = x;
                            // rebalance the tree after the insertion
                            root = balanceInsertion(root, x);
                            break;
                        }
                    }
                }
            }
            // adjust the list pointers so that the root is the first node of the bin
            moveRootToFront(tab, root);
        }
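
Tree bins only matter when many keys share a bucket. One way to exercise that path from user code is a key type whose hash codes all collide. The sketch below is an illustration under that assumption: `CollidingKeysDemo`, `Key`, and `fill` are hypothetical names, and the example only checks that lookups stay correct, since the tree itself is not visible through the public API.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    /** Hypothetical key type: every instance reports the same hashCode. */
    static final class Key implements Comparable<Key> {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // deliberate collision: all keys end up in one bin
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int compareTo(Key other) {
            // gives the tree a total order when hashes are equal
            return Integer.compare(id, other.id);
        }
    }

    /** Inserts count colliding keys, each mapped to its own id. */
    static Map<Key, Integer> fill(int count) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < count; i++) {
            map.put(new Key(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Key, Integer> map = fill(100);
        System.out.println(map.size() + " entries, get(57) = " + map.get(new Key(57)));
    }
}
```

Implementing `Comparable` is optional, but without it `treeify` has to fall back to `tieBreakOrder` for every comparison between equal hashes.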

Rebalancing the tree

static <K,V> TreeNode<K,V> balanceInsertion(TreeNode<K,V> root,
                                                    TreeNode<K,V> x) {
            // a newly inserted node is red
            x.red = true;
            for (TreeNode<K,V> xp, xpp, xppl, xppr;;) {
                // x has no parent, so x is the root: color it black
                if ((xp = x.parent) == null) {
                    x.red = false;
                    return x;
                }
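                // the parent is black, or the parent is the root (no grandparent):
                // nothing to rebalance
                else if (!xp.red || (xpp = xp.parent) == null)
                    return root;
                // the parent is the grandparent's left child
                if (xp == (xppl = xpp.left)) {
                    // the uncle (grandparent's right child) exists and is red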
                    if ((xppr = xpp.right) != null && xppr.red) {
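                        // recolor
                        // the uncle (grandparent's right child) becomes black
                        xppr.red = false;
                        // the parent (grandparent's left child) becomes black
                        xp.red = false;
                        // the grandparent becomes red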
                        xpp.red = true;
                        x = xpp;
                    }
                    // the uncle is null or black
                    else {
                        // x is its parent's right child
                        if (x == xp.right) {
                            // rotate left around xp
                            root = rotateLeft(root, x = xp);
                            // re-read xp and xpp after the rotation
                            xpp = (xp = x.parent) == null ? null : xp.parent;
                        }
                        if (xp != null) {
                            // red-black tree property: two red nodes may not be adjacent
                            // recolor xp black (x is red)
                            xp.red = false;
                            if (xpp != null) {
                                xpp.red = true;
                                // rotate right around xpp
                                root = rotateRight(root, xpp);
                            }
                        }
                    }
                }
                // the parent is the grandparent's right child (mirror image of the case above)
                else {
                    if (xppl != null && xppl.red) {
                        xppl.red = false;
                        xp.red = false;
                        xpp.red = true;
                        x = xpp;
                    }
                    else {
                        if (x == xp.left) {
                            root = rotateRight(root, x = xp);
                            xpp = (xp = x.parent) == null ? null : xp.parent;
                        }
                        if (xp != null) {
                            xp.red = false;
                            if (xpp != null) {
                                xpp.red = true;
                                root = rotateLeft(root, xpp);
                            }
                        }
                    }
                }
            }
        }

/* ------------------------------------------------------------ */
        // Red-black tree methods, all adapted from CLR
        /**
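         * Left rotation (right rotation is symmetric):
         * 1. attach the left subtree of r (r = p.right) as p's right child
         * 2. set r.parent = pp (p's old parent)
         * 3. point pp's child link at r
         * 4. set r.left = p and p.parent = r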
         */
        static <K,V> TreeNode<K,V> rotateLeft(TreeNode<K,V> root,
                                              TreeNode<K,V> p) {
            TreeNode<K,V> r, pp, rl;
            // p is not null and p has a right child
            if (p != null && (r = p.right) != null) {
                // p.right = r.left:
                // r's left subtree becomes p's right subtree;
                // if that subtree is not empty
                if ((rl = p.right = r.left) != null)
                    // update its parent pointer
                    rl.parent = p;
                // r.parent becomes pp;
                // if pp is null, p was the root
                if ((pp = r.parent = p.parent) == null)
                    // r is now the root, so color it black
                    (root = r).red = false;
                // otherwise make pp's child link point at r
                else if (pp.left == p)
                    pp.left = r;
                else
                    pp.right = r;
                // p becomes r's left child
                r.left = p;
                p.parent = r;
            }
            return root;
        }
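
The pointer updates are easier to follow on a tiny tree. The following sketch reimplements the rotation on a stripped-down node type without colors; `RotateLeftDemo` and `N` are hypothetical names, and the rotation body mirrors the listing above minus the recoloring of the root.

```java
final class RotateLeftDemo {
    /** Minimal node with the three links a rotation touches. */
    static final class N {
        final String name;
        N parent, left, right;

        N(String name) {
            this.name = name;
        }
    }

    /** Rotates left around p and returns the (possibly new) root. */
    static N rotateLeft(N root, N p) {
        N r, pp, rl;
        if (p != null && (r = p.right) != null) {
            if ((rl = p.right = r.left) != null)
                rl.parent = p;          // step 1: r's left subtree moves under p
            if ((pp = r.parent = p.parent) == null)
                root = r;               // step 2: p was the root, so r takes over
            else if (pp.left == p)
                pp.left = r;            // step 3: relink pp to r
            else
                pp.right = r;
            r.left = p;                 // step 4: p becomes r's left child
            p.parent = r;
        }
        return root;
    }

    public static void main(String[] args) {
        N p = new N("p"), r = new N("r"), rl = new N("rl");
        p.right = r;
        r.parent = p;
        r.left = rl;
        rl.parent = r;
        N root = rotateLeft(p, p);
        System.out.println("root=" + root.name + ", root.left=" + root.left.name
                + ", p.right=" + p.right.name);
    }
}
```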

Splitting tree bins on resize

 /**
         * Splits nodes in a tree bin into lower and upper tree bins,
         * or untreeifies if now too small. Called only from resize;
         * see above discussion about split bits and indices.
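         * Tree nodes keep their prev/next links, so the bin can be split just like a linked list;
         * the low and high lists are then each checked to decide whether to treeify or untreeify.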
         *
         * @param map the map
         * @param tab the table for recording bin heads
         * @param index the index of the table being split (the current bucket index)
         * @param bit the bit of hash to split on (the old capacity)
         */
        final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
            TreeNode<K,V> b = this;
            // Relink into lo and hi lists, preserving order
            TreeNode<K,V> loHead = null, loTail = null;
            TreeNode<K,V> hiHead = null, hiTail = null;
            int lc = 0, hc = 0;
            for (TreeNode<K,V> e = b, next; e != null; e = next) {
                next = (TreeNode<K,V>)e.next;
                e.next = null;
                // low list: the node keeps its index
                if ((e.hash & bit) == 0) {
                    if ((e.prev = loTail) == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                    // count the nodes in the low list
                    ++lc;
                }
                // high list: the node moves to index + bit
                else {
                    if ((e.prev = hiTail) == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                    ++hc;
                }
            }

            if (loHead != null) {
                // 6 nodes or fewer (UNTREEIFY_THRESHOLD): convert back to a plain linked list
                if (lc <= UNTREEIFY_THRESHOLD)
                    tab[index] = loHead.untreeify(map);
                else {
                    tab[index] = loHead;
                    // if hiHead is null, nothing moved to the high list, so this is still the original tree
                    if (hiHead != null) // (else is already treeified)
                        // rebuild the tree from the low list
                        loHead.treeify(tab);
                }
            }
            if (hiHead != null) {
                if (hc <= UNTREEIFY_THRESHOLD)
                    tab[index + bit] = hiHead.untreeify(map);
                else {
                    tab[index + bit] = hiHead;
                    if (loHead != null)
                        hiHead.treeify(tab);
                }
            }
        }