HashMap Source Code Reading Notes

showMeAllCode

Last modified 2022-10-28 10:22:21


First published 2022-10-27 21:31:20

Article link: https://blog.csdn.net/qq_42825158/article/details/127509290


1. HashMap fields:

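// Default initial capacity (16); must be a power of two
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
// Default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// Bin length at which a linked list becomes a candidate for treeification
static final int TREEIFY_THRESHOLD = 8;
// A tree bin that shrinks to 6 nodes or fewer during a resize split is converted back to a linked list
static final int UNTREEIFY_THRESHOLD = 6;
// Number of key-value mappings currently stored in the map
transient int size;
// Size at which the next resize is triggered (capacity * load factor)
int threshold;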

Initial capacity: if we do not specify a capacity, the default initial capacity is 16.

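Default load factor (0.75) and treeify threshold (8):
According to the implementation notes in the source, with a load factor of 0.75 and well-distributed hash codes, the number of nodes per bin roughly follows a Poisson distribution with a mean of about 0.5. Under that assumption a bin reaches 8 nodes with a probability of about 0.00000006, so tree bins should be rare in normal use, and the extra space cost of TreeNodes is seldom paid. See the source comment for details: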

    /*
     * Implementation notes.
     *
     * This map usually acts as a binned (bucketed) hash table, but
     * when bins get too large, they are transformed into bins of
     * TreeNodes, each structured similarly to those in
     * java.util.TreeMap. Most methods try to use normal bins, but
     * relay to TreeNode methods when applicable (simply by checking
     * instanceof a node).  Bins of TreeNodes may be traversed and
     * used like any others, but additionally support faster lookup
     * when overpopulated. However, since the vast majority of bins in
     * normal use are not overpopulated, checking for existence of
     * tree bins may be delayed in the course of table methods.
     *
     * Tree bins (i.e., bins whose elements are all TreeNodes) are
     * ordered primarily by hashCode, but in the case of ties, if two
     * elements are of the same "class C implements Comparable<C>",
     * type then their compareTo method is used for ordering. (We
     * conservatively check generic types via reflection to validate
     * this -- see method comparableClassFor).  The added complexity
     * of tree bins is worthwhile in providing worst-case O(log n)
     * operations when keys either have distinct hashes or are
     * orderable, Thus, performance degrades gracefully under
     * accidental or malicious usages in which hashCode() methods
     * return values that are poorly distributed, as well as those in
     * which many keys share a hashCode, so long as they are also
     * Comparable. (If neither of these apply, we may waste about a
     * factor of two in time and space compared to taking no
     * precautions. But the only known cases stem from poor user
     * programming practices that are already so slow that this makes
     * little difference.)
     *
     * Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
     *
     * The root of a tree bin is normally its first node.  However,
     * sometimes (currently only upon Iterator.remove), the root might
     * be elsewhere, but can be recovered following parent links
     * (method TreeNode.root()).
     *
     * All applicable internal methods accept a hash code as an
     * argument (as normally supplied from a public method), allowing
     * them to call each other without recomputing user hashCodes.
     * Most internal methods also accept a "tab" argument, that is
     * normally the current table, but may be a new or old one when
     * resizing or converting.
     *
     * When bin lists are treeified, split, or untreeified, we keep
     * them in the same relative access/traversal order (i.e., field
     * Node.next) to better preserve locality, and to slightly
     * simplify handling of splits and traversals that invoke
     * iterator.remove. When using comparators on insertion, to keep a
     * total ordering (or as close as is required here) across
     * rebalancings, we compare classes and identityHashCodes as
     * tie-breakers.
     *
     * The use and transitions among plain vs tree modes is
     * complicated by the existence of subclass LinkedHashMap. See
     * below for hook methods defined to be invoked upon insertion,
     * removal and access that allow LinkedHashMap internals to
     * otherwise remain independent of these mechanics. (This also
     * requires that a map instance be passed to some utility methods
     * that may create new nodes.)
     *
     * The concurrent-programming-like SSA-based coding style helps
     * avoid aliasing errors amid all of the twisty pointer operations.
     */
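The table of probabilities in the comment above can be reproduced directly from the formula it quotes, exp(-0.5) * pow(0.5, k) / factorial(k). The following is a minimal sketch; the class and method names are made up for illustration and are not part of the JDK.

```java
class PoissonBinSizes {
    // Probability that a bin holds exactly k nodes, assuming bin sizes follow
    // a Poisson distribution with the given mean (0.5 in the source comment).
    static double probability(int k, double lambda) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // Prints one line per bin size, in the same layout as the source comment.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, probability(k, 0.5));
        }
    }
}
```

The value for k = 8 is the one that motivates TREEIFY_THRESHOLD: with random hash codes, a bin of that size is expected less than once in ten million bins.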

2. HashMap.put

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // The table is null or empty: allocate it via resize()
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            // Compute the bucket index with (n - 1) & hash; if the bucket is
            // empty, store a new Node there and the insertion is done
            tab[i] = newNode(hash, key, value, null);
        else {
            // The bucket is not empty
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                // The first Node has the same hash, and its key is either the
                // same reference as the new key or equals() it: remember it in e
                e = p;
            else if (p instanceof TreeNode)
                // The bin is a red-black tree: putTreeVal returns the existing
                // node for this key, or null if it inserted a new node
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // The bin is a linked list: walk it
                for (int binCount = 0; ; ++binCount) {
                    // e is the node after p
                    if ((e = p.next) == null) {
                        // Reached the tail without finding the key: append a
                        // new Node (e stays null)
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // The bin already held at least 8 nodes before this
                            // append: call treeifyBin, which converts the bin to a
                            // red-black tree, or only resizes the table if it is
                            // shorter than MIN_TREEIFY_CAPACITY (64)
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        // Found a node with the same key: stop searching
                        break;
                    // Advance p to the next node
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // A mapping for this key already exists; read its old value
                V oldValue = e.value;
                // Overwrite unless onlyIfAbsent is set and the old value is non-null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                // Return the old value
                return oldValue;
            }
        }
        // A new node was added: record the structural modification
        ++modCount;
        if (++size > threshold)
            // size now exceeds the resize threshold: grow the table
            resize();
        afterNodeInsertion(evict);
        return null;
    }
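To see why hash() folds the high bits into the low bits, it may help to run the two expressions from the code above in isolation. This is a small sketch; the class HashSpreadDemo and its method names are hypothetical, while the expressions themselves are copied from hash() and putVal.

```java
class HashSpreadDemo {
    // Same expression as HashMap.hash(): XOR the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index for a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        // Two Integer keys whose hash codes differ only above bit 15.
        Integer a = 0x10000;
        Integer b = 0x20000;
        System.out.println(indexFor(a.hashCode(), n)); // 0 (raw hashCode)
        System.out.println(indexFor(b.hashCode(), n)); // 0 (raw hashCode, collides)
        System.out.println(indexFor(spread(a), n));    // 1 (after spreading)
        System.out.println(indexFor(spread(b), n));    // 2 (after spreading)
    }
}
```

With a 16-slot table only the lowest 4 bits pick the bucket, so without the XOR step both keys would share bucket 0.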

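1. When we call put, it first calls the hash method to compute the key's hash.
2. The hash method: if the key is null it returns 0; otherwise it returns the key's hashCode XORed with that hashCode unsigned-right-shifted by 16 bits. The bucket index only uses the low bits of the hash, so this step mixes the high 16 bits into the low 16 bits.
3. With the hash computed, put calls putVal with two fixed arguments: onlyIfAbsent = false and evict = true.
Its counterpart is: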

    public V putIfAbsent(K key, V value) {
        return putVal(hash(key), key, value, true, true);
    }

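Here onlyIfAbsent = true.
When this flag is true, an existing non-null value is not overwritten; a mapping whose current value is null is still replaced.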
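The difference between the two flags can be observed through the public API. The sketch below uses only java.util.HashMap; the class name PutIfAbsentDemo and the keys and values are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class PutIfAbsentDemo {
    static Map<String, String> demo() {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("a", "2");          // onlyIfAbsent = false: overwrites "1"
        map.putIfAbsent("a", "3");  // onlyIfAbsent = true: "2" is kept
        map.put("b", null);
        map.putIfAbsent("b", "4");  // old value is null, so it is replaced
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = demo();
        System.out.println(map.get("a")); // 2
        System.out.println(map.get("b")); // 4
    }
}
```

The second case follows from the condition `!onlyIfAbsent || oldValue == null` in putVal: a key mapped to null is treated like an absent key.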
4. resize:

final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            // The table already exists. If it has reached the maximum capacity,
            // set the threshold to Integer.MAX_VALUE and return the table unchanged.
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // Otherwise double the capacity. The threshold is doubled here only if
            // the new capacity is below the maximum and the old capacity is at least
            // the default (16); otherwise newThr stays 0 and is computed below.
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            // No table yet, but the threshold is positive: it holds the requested
            // initial capacity, so use it as the new capacity.
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // Otherwise use the default capacity (16) and set the threshold to
            // default load factor (0.75) * default capacity (16) = 12.
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            // newThr is still 0 in the second case, and in the first case when the
            // threshold was not doubled. Compute ft = newCap * loadFactor. If both
            // newCap and ft are below the maximum capacity, use ft (truncated to
            // int); otherwise use Integer.MAX_VALUE.
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        // Store the new threshold.
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        // Allocate a new Node<K,V>[] of length newCap.
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        // Point the table field at the new array (oldTab still refers to the old one).
        table = newTab;
        if (oldTab != null) {
            // Walk every bucket of the old table
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                // If the bucket is not empty, take its head node into e
                if ((e = oldTab[j]) != null) {
                    // Clear the old slot so the old array no longer references the nodes
                    oldTab[j] = null;
                    if (e.next == null)
                        // The bucket holds a single node: place it directly at
                        // its index in the new table
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // The bucket holds more than one node and is a red-black
                        // tree: split the tree between the two candidate buckets
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // e heads a linked list with more than one node: split
                        // the list into a "lo" list and a "hi" list
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
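The lo/hi split in the loop above relies on the table length being a power of two: after doubling, a node either stays at index j or moves to j + oldCap, and the single bit `hash & oldCap` decides which. The following sketch checks that rule against the direct computation; the class ResizeSplitDemo and the sample hashes are illustrative only.

```java
class ResizeSplitDemo {
    // New bucket index after doubling, decided the way resize() splits a bin:
    // test the single bit (hash & oldCap) instead of recomputing the full index.
    static int newIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1);           // index in the old table
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {5, 21, 37, 53};        // all in bucket 5 of a 16-slot table
        for (int h : hashes) {
            // The second number is the direct computation hash & (newCap - 1).
            System.out.println(h + " -> " + newIndex(h, oldCap)
                    + " (direct: " + (h & (2 * oldCap - 1)) + ")");
        }
    }
}
```

For these inputs, 5 and 37 stay in bucket 5 (the "lo" list) while 21 and 53 move to bucket 21 (the "hi" list), and both computations agree.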

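	resize first reads the current table (Node<K,V>[], oldTab), its length (oldCap) and the current resize threshold (oldThr), then works out the length (newCap) and threshold (newThr) of the new table.
	(1) If the current table's length is greater than 0: if it has already reached the maximum capacity, set the threshold to Integer.MAX_VALUE and return the current table. Otherwise double the length; if the new length is below the maximum and the old length is at least the default (16), also double the threshold.
	(2) If the table's length is 0 but the current threshold is greater than 0, use the threshold as the new length (the constructor stored the requested initial capacity there).
	(3) Otherwise use the default length (16) and set the new threshold to default load factor (0.75) * default length (16) = 12.
	Next, if newThr is still 0 (case (2), or case (1) when the threshold was not doubled), compute ft = newCap * loadFactor (0.75 by default). If both newCap and ft are below the maximum capacity, newThr becomes ft truncated to an int; otherwise it becomes Integer.MAX_VALUE.
	Store newThr in threshold.
	Allocate a new Node<K,V>[] of length newCap.
	Assign it to the table field (oldTab still refers to the old array).
	If oldTab is not null, go through every bucket. For each non-empty bucket, take its head node into e and clear the old slot so the old array no longer keeps the nodes reachable. If e has no next node, place it directly in the new table at e.hash & (newCap - 1). If e is a TreeNode, split the tree between the two candidate buckets. Otherwise e heads a linked list with more than one node, which is split into a "lo" list that stays at index j and a "hi" list that moves to j + oldCap, preserving relative order.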
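The sizing rules described above can be isolated into a pure function to try out individual cases. This is a simplified sketch, not the JDK method: the class ResizeSizing and the method next are hypothetical, the constants are copied from HashMap (MAXIMUM_CAPACITY is 1 << 30 there), and the actual table reallocation is left out.

```java
import java.util.Arrays;

class ResizeSizing {
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Returns {newCap, newThr} for the given old capacity, old threshold
    // and load factor, following the branches of resize().
    static int[] next(int oldCap, int oldThr, float loadFactor) {
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                // Cannot grow any further: keep the capacity, disable resizing.
                return new int[] {oldCap, Integer.MAX_VALUE};
            } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY
                    && oldCap >= DEFAULT_INITIAL_CAPACITY) {
                newThr = oldThr << 1;
            }
        } else if (oldThr > 0) {
            newCap = oldThr;   // initial capacity was stored in threshold
        } else {
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float) newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                    ? (int) ft : Integer.MAX_VALUE;
        }
        return new int[] {newCap, newThr};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(next(0, 0, 0.75f)));   // [16, 12]
        System.out.println(Arrays.toString(next(16, 12, 0.75f))); // [32, 24]
        System.out.println(Arrays.toString(next(0, 8, 0.75f)));   // [8, 6]
        System.out.println(Arrays.toString(next(8, 6, 0.75f)));   // [16, 12]
    }
}
```

The last call shows case (1) falling through to the `newThr == 0` branch: the old capacity (8) is below the default, so the threshold is recomputed from the load factor instead of being doubled.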