HashMap Source Code Analysis (Part 3)
JDK 1.8
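In the previous posts, HashMap Source Code Analysis (Part 1) and (Part 2), we covered HashMap's basic data structure and how its linked lists are built. We also saw that HashMap stores its entries internally in a fixed-length array. Next, let's look at how that array is resized.
First, the test code:
HashMap<String, String> hashMap = new HashMap<>();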
hashMap.put("1","1");
hashMap.put("2","1");
hashMap.put("3","1");
hashMap.put("4","1");
hashMap.put("5","1");
hashMap.put("6","1");
hashMap.put("7","1");
hashMap.put("8","1");
hashMap.put("9","1");
hashMap.put("10","1");
hashMap.put("11","1");
hashMap.put("12","1");
hashMap.put("13","1");
Now look at the end of the putVal function:
if (++size > threshold)
    resize();
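size is the number of key-value pairs, and threshold was set to 12 (16 * 0.75) when the table was first initialized. So when hashMap.put("13","1") runs, the end of putVal calls resize(). This also shows that HashMap inserts the new entry first and resizes afterwards.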
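To make the trigger point concrete, here is a minimal simulation of that check. It is only a sketch: ResizeTriggerSketch and firstResizingPut are names made up for this post, the code does not look inside a real java.util.HashMap, and it assumes every put adds a new, distinct key.

```java
// Simulation only: mirrors the `if (++size > threshold) resize();` check.
// It does not inspect a real java.util.HashMap (names here are hypothetical).
class ResizeTriggerSketch {
    /** Returns the 1-based number of the put that first pushes size past the threshold. */
    static int firstResizingPut(int capacity, float loadFactor) {
        int threshold = (int) (capacity * loadFactor); // 16 * 0.75 = 12
        int size = 0;
        for (int put = 1; ; put++) {
            // Same order as putVal: the entry is counted first, then the check runs.
            if (++size > threshold) {
                return put;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(firstResizingPut(16, 0.75f)); // prints 13
    }
}
```

With the default capacity of 16 and load factor of 0.75, the 13th distinct key is the one that trips the check, which matches the test code above.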
/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length; // oldCap = 16
    int oldThr = threshold; // oldThr = 12
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) { // MAXIMUM_CAPACITY = 1 << 30
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && // newCap = 16 * 2 = 32
                 oldCap >= DEFAULT_INITIAL_CAPACITY) // DEFAULT_INITIAL_CAPACITY = 16
            newThr = oldThr << 1; // double threshold; newThr = 12 * 2 = 24
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    ......
    return newTab;
}
The first expansion takes the following branch:
if (oldCap > 0) {
    ...
    else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && // newCap = 16 * 2 = 32
             oldCap >= DEFAULT_INITIAL_CAPACITY) // DEFAULT_INITIAL_CAPACITY = 16
        newThr = oldThr << 1; // double threshold; newThr = 12 * 2 = 24
}
This determines the new array length newCap and the new resize threshold newThr, and then allocates an array of length newCap:
threshold = newThr;
@SuppressWarnings({"rawtypes","unchecked"})
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
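The capacity and threshold arithmetic can be tried in isolation. The sketch below copies only the doubling branch and the newThr == 0 fallback from resize(); DoublingSketch and next are hypothetical names, and the oldCap >= MAXIMUM_CAPACITY case and the initial-allocation branches are deliberately left out.

```java
// Sketch of the doubling path in resize(); class and method names are hypothetical.
// Assumes 0 < oldCap < MAXIMUM_CAPACITY, i.e. the table already exists.
class DoublingSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /** Returns {newCap, newThr} for one doubling step. */
    static int[] next(int oldCap, int oldThr, float loadFactor) {
        int newCap = oldCap << 1;
        int newThr = 0;
        if (newCap < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) {
            newThr = oldThr << 1; // the threshold doubles together with the capacity
        }
        if (newThr == 0) {
            // Fallback: recompute the threshold from the load factor.
            float ft = (float) newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY)
                    ? (int) ft : Integer.MAX_VALUE;
        }
        return new int[] {newCap, newThr};
    }

    public static void main(String[] args) {
        int[] first = next(16, 12, 0.75f);
        System.out.println(first[0] + " / " + first[1]);   // prints 32 / 24
        int[] second = next(first[0], first[1], 0.75f);
        System.out.println(second[0] + " / " + second[1]); // prints 64 / 48
    }
}
```

For tables smaller than 16 the shortcut is skipped and the threshold is recomputed from the load factor instead, for example next(8, 6, 0.75f) gives 16 and 12.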
Next, as you would expect, the contents of the old array have to be moved into the new one:
for (int j = 0; j < oldCap; ++j) {
    Node<K,V> e;
    if ((e = oldTab[j]) != null) {
        oldTab[j] = null;
        if (e.next == null)
            newTab[e.hash & (newCap - 1)] = e;
        else if (e instanceof TreeNode)
            ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
        else { // preserve order
            Node<K,V> loHead = null, loTail = null;
            Node<K,V> hiHead = null, hiTail = null;
            Node<K,V> next;
            do {
                next = e.next;
                if ((e.hash & oldCap) == 0) {
                    if (loTail == null)
                        loHead = e;
                    else
                        loTail.next = e;
                    loTail = e;
                }
                else {
                    if (hiTail == null)
                        hiHead = e;
                    else
                        hiTail.next = e;
                    hiTail = e;
                }
            } while ((e = next) != null);
            if (loTail != null) {
                loTail.next = null;
                newTab[j] = loHead;
            }
            if (hiTail != null) {
                hiTail.next = null;
                newTab[j + oldCap] = hiHead;
            }
        }
    }
}
There are three cases when moving the old data into the new array:
- A single Node that has not formed a linked list:
    newTab[e.hash & (newCap - 1)] = e;
- A bin that has formed a linked list and has been converted into a tree structure (covered later, skipped for now):
    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
- A linked list that is not a tree structure:
    Node<K,V> loHead = null, loTail = null;
    Node<K,V> hiHead = null, hiTail = null;
    Node<K,V> next;
    do {
        next = e.next;
        if ((e.hash & oldCap) == 0) {
            if (loTail == null)
                loHead = e;
            else
                loTail.next = e;
            loTail = e;
        }
        else {
            if (hiTail == null)
                hiHead = e;
            else
                hiTail.next = e;
            hiTail = e;
        }
    } while ((e = next) != null);
    if (loTail != null) {
        loTail.next = null;
        newTab[j] = loHead;
    }
    if (hiTail != null) {
        hiTail.next = null;
        newTab[j + oldCap] = hiHead;
    }
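At first glance this code is confusing, especially the e.hash & oldCap part. Intuitively, we would expect a loop over every element in the list that recomputes each element's new index as e.hash & (newCap - 1), yet no part of this code appears to compute an index.
Let's set the actual implementation aside for a moment and work through the logic ourselves.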
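Suppose we want to store four key-value pairs in the HashMap, with keys key1, key2, key3 and key4 (the values do not matter), and suppose the hash values of these four keys are 5, 23, 53 and 55 respectively.
When the map is resized, how should these entries be placed into the new array?
First, let's work out the index of each key before the resize (n = 16) and after it (n = 32), using hash & (n - 1):
key1: hash = 5 (binary 000101), old index 5 & 15 = 5, new index 5 & 31 = 5
key2: hash = 23 (binary 010111), old index 23 & 15 = 7, new index 23 & 31 = 23
key3: hash = 53 (binary 110101), old index 53 & 15 = 5, new index 53 & 31 = 21
key4: hash = 55 (binary 110111), old index 55 & 15 = 7, new index 55 & 31 = 23
These results show that the new index is either the old index unchanged or the old index plus 16, and 16 is exactly the length of the old array. The binary form makes it clear that this is no coincidence: the mask (n - 1) of the new array has a 1 in its fifth bit, whose value is oldCap, so every key whose hash() has a 1 in the fifth bit has its index increased by oldCap. The same reasoning carries over to the next resize: when n = 64, a key whose sixth bit (value oldCap = 32) is 1 gets a new index equal to its old index plus oldCap (32).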
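The observation is easy to check in code. The sketch below compares the index computed from scratch with the index derived from the old index plus a single bit test. IndexShiftSketch, indexFor and indexAfterDoubling are hypothetical helper names, and the four hash values are the ones assumed in the example above.

```java
// Sketch: compares the direct index computation with the "test one bit" shortcut.
// All names are hypothetical helpers for this post, not HashMap APIs.
class IndexShiftSketch {
    /** Index computed from scratch: hash & (capacity - 1). */
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    /** Index after doubling, derived only from the old index and one bit of the hash. */
    static int indexAfterDoubling(int hash, int oldCap) {
        int oldIndex = indexFor(hash, oldCap);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        for (int hash : new int[] {5, 23, 53, 55}) {
            System.out.println("hash=" + hash
                    + " old=" + indexFor(hash, oldCap)
                    + " new=" + indexFor(hash, oldCap << 1)
                    + " viaBit=" + indexAfterDoubling(hash, oldCap));
        }
    }
}
```

For every hash the new and viaBit columns agree, so the resize code never needs to recompute hash & (newCap - 1) for nodes in a list. Testing the one bit selected by oldCap is enough.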
Now, back to the source code:
if ((e.hash & oldCap) == 0) {
    if (loTail == null)
        loHead = e;
    else
        loTail.next = e;
    loTail = e;
}
else {
    if (hiTail == null)
        hiHead = e;
    else
        hiTail.next = e;
    hiTail = e;
}
if (loTail != null) {
    loTail.next = null;
    newTab[j] = loHead;
}
if (hiTail != null) {
    hiTail.next = null;
    newTab[j + oldCap] = hiHead;
}
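Now it is clear what this if check does: it tests whether the newly added bit is 0. If it is, the node is appended to the lo list (tracked by loHead and loTail) and its index stays the same. If the bit is 1, the node is appended to the hi list (tracked by hiHead and hiTail) and its new index is the old index plus oldCap.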
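To watch the split happen, here is a self-contained sketch that applies the same lo/hi logic to a hand-built list. The Node class is a stripped-down stand-in that keeps only hash and next, not HashMap's real Node<K,V>, and SplitSketch, chain, split and describe are hypothetical names. The hash values 5, 53, 21 and 37 are chosen here because they all map to index 5 when n = 16.

```java
// Sketch of the lo/hi split on a simplified node type (not HashMap's Node<K,V>).
class SplitSketch {
    static final class Node {
        final int hash;
        Node next;
        Node(int hash) { this.hash = hash; }
    }

    /** Builds a singly linked list holding the given hashes in order. */
    static Node chain(int... hashes) {
        Node head = null, tail = null;
        for (int h : hashes) {
            Node n = new Node(h);
            if (tail == null) head = n; else tail.next = n;
            tail = n;
        }
        return head;
    }

    /** Splits one bin; returns {loHead, hiHead}. Relative order is preserved in both lists. */
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if ((e.hash & oldCap) == 0) {   // new bit is 0: stays at index j
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                        // new bit is 1: moves to index j + oldCap
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        if (loTail != null) loTail.next = null; // cut stale links to nodes in the other list
        if (hiTail != null) hiTail.next = null;
        return new Node[] {loHead, hiHead};
    }

    /** Renders a list as "a -> b -> c" (empty string for an empty list). */
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] parts = split(chain(5, 53, 21, 37), 16);
        System.out.println("lo (index 5):  " + describe(parts[0])); // 5 -> 37
        System.out.println("hi (index 21): " + describe(parts[1])); // 53 -> 21
    }
}
```

Setting the tail's next to null at the end matters: without it, the last node of one list could still point at a node that now belongs to the other list.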
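That completes the resize.
Finally, thanks to the blog post that explained this. Only after reading it did I fully understand how linked-list indices change during a resize.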