HashMap Source Code Analysis

大音~希声

Last modified: 2024-04-12 22:11:30

Tags: hash algorithm, algorithm

First published: 2024-04-12 17:40:43

Article link: https://blog.csdn.net/qq_37752382/article/details/137683484

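I. Background
People keep asking about this, so here I read through the JDK 1.8 implementation myself. I will not cover 1.7 or the basic concepts; this is for readers who already have the fundamentals.
II. Summary
The summary comes first.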
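1) When a resize happens

At initialization: the table is allocated lazily by the first put.
After a new mapping is inserted, when the number of key-value pairs exceeds the threshold (capacity × load factor). (It is worth asking why the threshold is set so low: an array plus linked lists could hold far more entries than the array length, and the threshold is smaller still after multiplying by the load factor. Resizing early keeps the bins short.)
When a list is about to be converted to a tree: if the table length is less than 64 (MIN_TREEIFY_CAPACITY), the bin is not treeified and the table is resized instead. (Many people forget this one. Note that the check is against the table length, not the number of elements.)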

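2) When a list is converted to a tree
My first reading was that the list is treeified as soon as it grows to 8 nodes, that is, at length >= 8, and I wondered why so many articles online say "greater than 8".
It turns out I was wrong (the verification code is at the end): the conversion is triggered when the ninth node is appended to a bin that already holds 8. In addition, before treeifying, the code checks the table length: if it is less than 64, the table is resized instead of treeifying the bin.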

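3) When is a tree converted back to a list? The two places that come to mind are removal and resize. Perhaps surprisingly, removal does not untreeify simply because the bin has dropped below 6 nodes.
Most people just remember the two numbers 6 and 8. The 8 is easy to find in the source, but the 6 does not appear in the removal logic, which checks this instead: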

if (root == null || root.right == null ||
                (rl = root.left) == null || rl.left == null)

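As the condition above shows, untreeifying on removal does not depend on the node count, only on whether these few fixed nodes exist.
During a resize, each red-black tree is split and the halves are checked: a half with 6 or fewer nodes (UNTREEIFY_THRESHOLD) is converted back to a list.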

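4) Resize
The table capacity doubles, but indexes do not have to be recomputed from scratch as before: ANDing the hash with the old table length is enough to decide the new position.

If the bin is a linked list, it is split into a low list and a high list; if it is a red-black tree, it is split into a low tree and a high tree.
(hash & oldCap) == 0 decides whether each node in the bin belongs to low or high.
low is placed in the new table at the same index, and high at [current index + old table length].
If a resulting low or high tree has 6 or fewer nodes, it is converted back to a list before being placed at its index in the new table.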
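
The index rule in this summary can be checked by brute force. The following is a minimal sketch, assuming power-of-two capacities as HashMap uses; the class and method names are illustrative and do not come from the JDK.

```java
// Illustrative check, not JDK code: after doubling, an entry either stays at
// its old index or moves to (old index + old capacity).
class ResizeIndexDemo {
    /** Index the way putVal does: (n - 1) & hash, where n is a power of two. */
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    /** New index predicted from the old index plus the single bit (hash & oldCap). */
    static int predictedIndex(int hash, int oldCap) {
        int oldIndex = indexFor(hash, oldCap);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    /** True if the prediction matches a full recomputation for every hash in [0, maxHash]. */
    static boolean ruleHolds(int oldCap, int maxHash) {
        for (int hash = 0; hash <= maxHash; hash++) {
            if (predictedIndex(hash, oldCap) != indexFor(hash, oldCap << 1)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(ruleHolds(16, 100_000));  // true
        System.out.println(predictedIndex(5, 16));   // 5: bit 16 is clear, index unchanged
        System.out.println(predictedIndex(21, 16));  // 21: bit 16 is set, 5 + 16
    }
}
```

The rule holds because the new mask (2 × oldCap - 1) differs from the old mask by exactly one bit, the oldCap bit.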

III. Analysis

HashMap<Integer,Integer> map = new HashMap<>();
map.put(1,1);

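This constructs a map without an initial capacity. Under the hood, the constructor only sets the load factor. So when is the capacity initialized? Not at construction time, at least.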

public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; 
    }

Start from put, which delegates to putVal:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

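table is the backing array, transient Node<K,V>[] table, and it is null at first. So look at what n = (tab = resize()).length; does. As the name suggests, resize() resizes the table. (Only the initialization path is shown here; the growth path is analyzed later.)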

// the default capacity is 16
newCap = DEFAULT_INITIAL_CAPACITY;
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];

With the capacity and the array initialized, continue reading putVal.

// If this slot is empty, put a new node there directly
if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);
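
To make the slot calculation concrete, here is a small sketch. The body of hash() follows the JDK 8 HashMap source, which the article does not quote; the class name and the indexOf helper are illustrative only.

```java
// hash() mirrors JDK 8's HashMap.hash(); BucketIndexDemo and indexOf are
// illustrative names that do not exist in the JDK.
class BucketIndexDemo {
    /** Spreads the high 16 bits of hashCode into the low 16 bits. */
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    /** Slot index as computed in putVal: (n - 1) & hash. */
    static int indexOf(Object key, int tableLength) {
        return (tableLength - 1) & hash(key);
    }

    public static void main(String[] args) {
        System.out.println(indexOf(1, 16));     // 1
        System.out.println(indexOf(17, 16));    // 1, collides with key 1
        System.out.println(indexOf(null, 16));  // 0, the null key always maps to slot 0
    }
}
```

With a table of length 16, only the low four bits of the spread hash select the slot, which is why keys 1 and 17 land in the same bin.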

Now look closely at the linked list and red-black tree part.

// Same key: assign the existing node p at this slot to e
if (p.hash == hash &&
    ((k = p.key) == key || (key != null && key.equals(k))))
    e = p;

If the node is a TreeNode:

else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);

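Stepping into putTreeVal, it simply adds a node to the tree (or returns the existing node for the key). Since this path only adds an element, no conversion between tree and list happens here. Continue with the next branch, shown again below.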

for (int binCount = 0; ; ++binCount) {
      if ((e = p.next) == null) {
          p.next = newNode(hash, key, value, null);
          // The treeify check is here
          if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
              treeifyBin(tab, hash);
          break;
      }
      if (e.hash == hash &&
          ((k = e.key) == key || (key != null && key.equals(k))))
          break;
      p = e;
  }

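Look here: binCount starts at 0 and the check fires when it reaches 7. At first glance p.next has been followed 7 times, which suggests that the conversion happens on the eighth node.
In fact, when binCount equals 7, p is already the eighth node, so the new node appended at p.next is the ninth. The conversion is therefore triggered when the ninth node is inserted.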

if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
          treeifyBin(tab, hash);

But not so fast; the logic inside treeifyBin matters too.

final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }

if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
    resize();

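As you can see, having more than 8 nodes is only one condition. The table length is also checked, and if it is less than 64 the preferred action is a resize, not treeification.
So exceeding capacity × load factor is not the only thing that triggers a resize; an attempt to treeify can trigger one too.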
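
The interaction between the two thresholds can be traced with a small simulation. This is a sketch under stated assumptions, not JDK code: every key is assumed to have the same hash (so all nodes stay in one bin after each resize), and the separate size > threshold resize is ignored because it is not reached within the 11 inserts shown. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simulation only: models the decision made in putVal and treeifyBin for a
// single bin whose keys all share one hash (an assumption of this sketch).
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    /** Inserts colliding keys one by one and logs each resize or treeify. */
    static List<String> simulate(int tableLength, int inserts) {
        List<String> log = new ArrayList<>();
        int binSize = 0;
        boolean treeified = false;
        for (int i = 1; i <= inserts; i++) {
            int sizeBeforeInsert = binSize;
            binSize++;
            // putVal reaches the tail with binCount == sizeBeforeInsert - 1, so
            // binCount >= TREEIFY_THRESHOLD - 1 means sizeBeforeInsert >= 8.
            if (treeified || sizeBeforeInsert < TREEIFY_THRESHOLD) {
                continue;
            }
            if (tableLength < MIN_TREEIFY_CAPACITY) {
                tableLength <<= 1;  // treeifyBin chooses resize()
                log.add("insert " + i + ": resize to " + tableLength);
            } else {
                treeified = true;   // treeifyBin converts the bin
                log.add("insert " + i + ": treeify");
            }
        }
        return log;
    }

    public static void main(String[] args) {
        // Prints: insert 9: resize to 32, insert 10: resize to 64, insert 11: treeify
        simulate(16, 11).forEach(System.out::println);
    }
}
```

Starting from the default table length of 16, the ninth and tenth colliding inserts each double the table, and only the eleventh actually builds a tree.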
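That covers the case where the slot already holds a node. When the key already exists, the default behavior is to overwrite the value.
For a new mapping there is a little more logic: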

// size: number of key-value mappings; threshold: resize threshold
if (++size > threshold)
            resize();

This is where the resize happens. Where is threshold initialized? In the first resize, as table length × load factor:

newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
threshold = newThr;
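
As a worked example of the threshold arithmetic, the sketch below replays the ++size > threshold check for a default-constructed map. It assumes at least one insert, ignores resizes triggered by treeifyBin, and ignores MAXIMUM_CAPACITY; the class and method names are illustrative.

```java
// Sketch of the size-driven resize only; not JDK code.
class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Table capacity after the given number of distinct keys has been inserted. */
    static int capacityAfter(int inserts) {
        int capacity = DEFAULT_INITIAL_CAPACITY;
        int threshold = (int) (DEFAULT_LOAD_FACTOR * capacity);  // 12
        int size = 0;
        for (int i = 0; i < inserts; i++) {
            if (++size > threshold) {
                capacity <<= 1;   // table doubles
                threshold <<= 1;  // threshold doubles with it
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12));  // 16: 12 is not greater than 12
        System.out.println(capacityAfter(13));  // 32: the 13th insert exceeds the threshold
        System.out.println(capacityAfter(25));  // 64: the 25th insert exceeds 24
    }
}
```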

One more point, tableSizeFor:

static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

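tableSizeFor rounds a requested capacity up to the next power of two, so the initial capacity may be set to a very small value; 16 is only the default.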
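
A few sample inputs make the rounding visible. The method body below is copied from the snippet above; the wrapper class is illustrative, and MAXIMUM_CAPACITY is set to 1 << 30 as in the JDK source.

```java
// tableSizeFor is the JDK 8 method quoted in the article; the demo class is illustrative.
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /** Smallest power of two that is >= cap, with a minimum of 1. */
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] requested = {0, 1, 10, 16, 17};
        for (int cap : requested) {
            // Prints 0 -> 1, 1 -> 1, 10 -> 16, 16 -> 16, 17 -> 32
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

Subtracting 1 first is what keeps an exact power of two such as 16 from being rounded up to 32.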

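That completes the resize triggers covered so far; compare them with the resize conditions in the summary at the top. Next comes removal; the resize logic itself is left for last because it is the classic part.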

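There is not much to say about removal. One point is that the return value is the old value; the other is whether a tree that loses elements is converted back to a list.
The code in removal that converts a tree back to a list is: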

if (root == null || root.right == null ||
                (rl = root.left) == null || rl.left == null) {
                tab[index] = first.untreeify(map);  // too small
                return;
            }

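After staring at this for a while, there is no mention of the number 6. Strange?
A red-black tree is an approximately balanced binary tree, so reading this code requires a rough picture of what one looks like.
The trees that trip this check can be considerably smaller than 6 nodes. After trying various red-black tree shapes, the check cannot be mapped to a specific node count (the JDK comment on this test says it triggers somewhere between 2 and 6 nodes, depending on tree structure), so the conclusion is that it is independent of the count and depends only on these few nodes.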
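
To show that the check is about shape rather than count, the sketch below applies the same condition to two trees with four nodes each. TNode is a hypothetical stand-in for HashMap.TreeNode with colors and keys omitted; both shapes are valid for a red-black tree if the single grandchild is red.

```java
// Structural sketch only; TNode and the helper methods are hypothetical.
class UntreeifyCheckDemo {
    static final class TNode {
        TNode left, right;
    }

    /** Same condition as the "too small" test in the removal code quoted above. */
    static boolean tooSmall(TNode root) {
        TNode rl;
        return root == null || root.right == null
                || (rl = root.left) == null || rl.left == null;
    }

    /** Builds root + two children + one grandchild under root.left. */
    static TNode fourNodes(boolean grandchildOnLeft) {
        TNode root = new TNode();
        root.left = new TNode();
        root.right = new TNode();
        if (grandchildOnLeft) {
            root.left.left = new TNode();
        } else {
            root.left.right = new TNode();
        }
        return root;
    }

    public static void main(String[] args) {
        System.out.println(tooSmall(fourNodes(true)));   // false: root.left.left exists
        System.out.println(tooSmall(fourNodes(false)));  // true: root.left.left is null
    }
}
```

Both trees hold four nodes, yet only one of them is treated as too small, which matches the conclusion that the node count alone does not decide the conversion.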

Resize code analysis

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

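The core of the resize code is the index calculation.
If the current node (the table slot being visited) has no next, the bin holds a single node rather than a list: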

newTab[e.hash & (newCap - 1)] = e;

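If the bin is a tree, split also compares against 6: a half with 6 or fewer nodes (UNTREEIFY_THRESHOLD) is untreeified.
The remaining case is a linked list, which is split into the lo and hi lists.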
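
The list case can be sketched with plain integers standing in for node hashes. This is an illustration of the (e.hash & oldCap) test only, not the JDK's in-place relinking; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: splits the hashes of one bin into lo and hi, keeping order.
class SplitDemo {
    /** Returns [lo, hi]: lo stays at index j, hi moves to index j + oldCap. */
    static List<List<Integer>> split(List<Integer> binHashes, int oldCap) {
        List<Integer> lo = new ArrayList<>();
        List<Integer> hi = new ArrayList<>();
        for (int hash : binHashes) {
            if ((hash & oldCap) == 0) {
                lo.add(hash);
            } else {
                hi.add(hash);
            }
        }
        List<List<Integer>> result = new ArrayList<>();
        result.add(lo);
        result.add(hi);
        return result;
    }

    public static void main(String[] args) {
        // All four hashes share index 5 in a table of length 16.
        List<List<Integer>> parts = split(Arrays.asList(5, 21, 37, 53), 16);
        System.out.println(parts);  // [[5, 37], [21, 53]]
    }
}
```

In the new table of length 32, hashes 5 and 37 still map to index 5, while 21 and 53 map to index 21, which is 5 + 16.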

IV. Verification

public class M2 {
    public static void main(String[] args) {
        Node head = new Node(1);
        head.next = new Node(2);
        head.next.next = new Node(3);
        head.next.next.next = new Node(4);
        head.next.next.next.next = new Node(5);
        head.next.next.next.next.next = new Node(6);
        head.next.next.next.next.next.next = new Node(7);
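        // Uncomment the next line to give the list 8 nodes; the append below then adds the 9th and prints "treeify"
        //head.next.next.next.next.next.next.next = new Node(8);

        Node p = head;
        Node e ;
        for (int binCount = 0; ; ++binCount) {
            // At this point binCount is the zero-based position of p in the list
            System.out.println(binCount);
            if ((e = p.next) == null) {
                p.next = new Node(9999);
                // With 8 existing nodes, binCount is 7 here: p is the 8th node and the new node is the 9th
                if (binCount >= 8 - 1) // -1 for 1st
                    System.out.println("treeify");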
                break;
            }
            p = e;
        }
    }

    public static class Node{
        public Integer val;
        public Node next;

        public Node(Integer val) {
            this.val = val;
        }
    }
}
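
The same experiment can be parameterized by the number of nodes already in the bin. This is a hedged variation on the verification above, with hypothetical names, that returns whether the treeify check fires instead of printing.

```java
// Variation on the verification code; names are illustrative, not from the JDK.
class TreeifyTriggerDemo {
    static final int TREEIFY_THRESHOLD = 8;

    static final class Node {
        Node next;
    }

    /** Appends one node to a list of existingNodes (>= 1) and reports whether the check fires. */
    static boolean appendTriggersTreeify(int existingNodes) {
        Node head = new Node();
        Node tail = head;
        for (int i = 1; i < existingNodes; i++) {
            tail.next = new Node();
            tail = tail.next;
        }
        Node p = head;
        Node e;
        // Same loop shape as putVal: binCount counts the hops taken before reaching the tail.
        for (int binCount = 0; ; ++binCount) {
            if ((e = p.next) == null) {
                p.next = new Node();
                return binCount >= TREEIFY_THRESHOLD - 1;
            }
            p = e;
        }
    }

    public static void main(String[] args) {
        System.out.println(appendTriggersTreeify(7));  // false: the new node is the 8th
        System.out.println(appendTriggersTreeify(8));  // true: the new node is the 9th
    }
}
```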
