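1. Data structure
Array + linked list / red-black tree: entries are stored in an array of buckets. When several keys collide on the same array slot, they are chained into a linked list at that slot. Once a list grows past the treeify threshold (and, in JDK 1.8, the table has at least 64 slots; otherwise the table is resized instead), the list is converted into a red-black tree.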
transient Node<K,V>[] table;
transient int size;
int threshold;
final float loadFactor;
// Default initial capacity: 16
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
// Maximum capacity: 2^30
static final int MAXIMUM_CAPACITY = 1 << 30;
// Default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// Bucket length at which a linked list is converted to a red-black tree
static final int TREEIFY_THRESHOLD = 8;
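Node[] table has an initial length (16 by default), loadFactor is the load factor (0.75 by default), and threshold is the maximum number of Nodes (key-value pairs) the HashMap holds before it resizes: threshold = length * loadFactor. So for a fixed array length, a larger load factor lets the map hold more key-value pairs before resizing, at the cost of more collisions per bucket on average.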
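The threshold formula can be checked with a few numbers. This is a minimal sketch: the class ThresholdDemo and the helper threshold() are hypothetical names made up for illustration and are not part of HashMap.

```java
final class ThresholdDemo {
    // Hypothetical helper: number of entries a table of the given length
    // holds before a resize is triggered (threshold = length * loadFactor).
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f)); // 12
        System.out.println(threshold(16, 1.0f));  // 16
        System.out.println(threshold(32, 0.75f)); // 24
    }
}
```

With the defaults, the 13th entry is the first one that pushes size past the threshold.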
2. Hash algorithm
The JDK 1.8 implementation simplifies the high-bit mixing step: it XORs the high 16 bits of hashCode() into the low 16 bits:
(h = k.hashCode()) ^ (h >>> 16)
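This is a trade-off between speed, utility, and quality of bit spreading: even when the table length is small, both the high and the low bits of the hash code take part in choosing the bucket, and the extra cost is a single shift and XOR.
/**
 * Computes the hash value
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}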
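To see why the high bits matter, here is a small sketch with two hash codes that differ only above bit 15. The class SpreadDemo and the method spread() are illustrative names (spread() repeats the mixing step of hash() on a raw int), and the two sample values are chosen by hand.

```java
final class SpreadDemo {
    // Same mixing step as hash(), applied directly to an int hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;                    // table length
        int a = 0x10000, b = 0x20000;  // differ only in bits above bit 15

        // Without mixing, both land in bucket 0.
        System.out.println((a & (n - 1)) + " " + (b & (n - 1)));                 // 0 0
        // With mixing, the high bits reach the index and the keys separate.
        System.out.println((spread(a) & (n - 1)) + " " + (spread(b) & (n - 1))); // 1 2
    }
}
```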
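/**
 * Computes the bucket index
 */
static int indexFor(int h, int length) { // JDK 1.7 source; JDK 1.8 uses the same expression inline
    return h & (length - 1);
}
The index is obtained with h & (table.length - 1). The length of HashMap's backing array is always a power of two (2^n), which is a speed optimization.
When length is 2^n, h & (length - 1) equals h mod length, taken as a non-negative remainder. For non-negative h this is the same as h % length; for negative h Java's % would return a negative value, while the & form always stays in range and is cheaper than %.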
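The equivalence between masking and modulo, including the negative-hash case, can be checked with a short sketch. IndexDemo is a hypothetical class name; indexFor() repeats the JDK 1.7 method quoted above, and Math.floorMod is the standard library method that returns a non-negative remainder for a positive divisor.

```java
final class IndexDemo {
    // Same expression as the JDK 1.7 indexFor(); length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(21, 16));       // 5, same as 21 % 16
        System.out.println(indexFor(-7, 16));       // 9
        System.out.println(Math.floorMod(-7, 16));  // 9, matches the mask
        System.out.println(-7 % 16);                // -7, not a valid index
    }
}
```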
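3. Resize mechanism
The capacity doubles on each resize:
newCap = oldCap << 1
After the table grows, every entry has to be rehashed. Instead of recomputing the index from scratch, the hash is ANDed with oldCap to test whether the newly significant bit is set:
if ((hash & oldCap) != 0) {
    newIndex = oldIndex + oldCap;
} else {
    newIndex = oldIndex;
}
The actual code is: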
do {
    next = e.next;
    if ((e.hash & oldCap) == 0) {
        if (loTail == null)
            loHead = e;
        else
            loTail.next = e;
        loTail = e; // lo list: (e.hash & oldCap) == 0
    }
    else {
        if (hiTail == null)
            hiHead = e;
        else
            hiTail.next = e;
        hiTail = e; // hi list: (e.hash & oldCap) != 0
    }
} while ((e = next) != null);
if (loTail != null) {
    loTail.next = null;
    newTab[j] = loHead; // newIndex = oldIndex
}
if (hiTail != null) {
    hiTail.next = null;
    newTab[j + oldCap] = hiHead; // newIndex = oldIndex + oldCap
}
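The lo/hi split can be tried outside of HashMap with a stripped-down node type. This is only a sketch under simplifying assumptions: SplitDemo, its Node class (hash and next only, no key or value), and the helpers bucket() and render() are all hypothetical names for illustration, not JDK code.

```java
final class SplitDemo {
    // Hypothetical minimal node: only the cached hash and the next pointer.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    // Splits one bucket's list into {loHead, hiHead}, keeping relative order.
    static Node[] split(Node e, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        while (e != null) {
            Node next = e.next;
            e.next = null;
            if ((e.hash & oldCap) == 0) {          // bit clear: index unchanged
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                               // bit set: index + oldCap
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
            e = next;
        }
        return new Node[] { loHead, hiHead };
    }

    // Builds a list whose nodes carry the given hashes, in the given order.
    static Node bucket(int... hashes) {
        Node head = null;
        for (int i = hashes.length - 1; i >= 0; i--) head = new Node(hashes[i], head);
        return head;
    }

    // Renders the hashes of a list separated by spaces.
    static String render(Node n) {
        StringBuilder sb = new StringBuilder();
        for (; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // All four hashes map to bucket 5 in a table of length 16.
        Node[] parts = split(bucket(5, 21, 37, 53), 16);
        System.out.println(render(parts[0])); // 5 37   (stays at index 5)
        System.out.println(render(parts[1])); // 21 53  (moves to index 21)
    }
}
```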
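4. Why double the capacity?
Example: key1 = 5, key2 = 21 (for such small Integer keys the hash equals the key)
Before resizing (capacity 16):
index(key1) = 5 mod 16 = 5
index(key2) = 21 mod 16 = 5
After resizing (capacity 32):
index(key1) = 5 mod 32 = 5
index(key2) = 21 mod 32 = 21 = 5 + 16
Bit operations make this cheap:
Old capacity oldCap = 16, i.e. 0001 0000; oldCap - 1 = 0000 1111
New capacity newCap = 32, i.e. 0010 0000; newCap - 1 = 0001 1111
Test whether the new bit is set: hash & oldCap
Compute the index: hash & (cap - 1)
After the resize, n is twice its old value, so the mask n - 1 gains exactly one more high bit. Each entry therefore either keeps its index (that bit of its hash is 0) or moves to index + oldCap (that bit is 1). For example, 5 = 0000 0101 stays at index 5, while 21 = 0001 0101 moves from index 5 to index 21.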
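The claim that a doubled table only ever needs "same index" or "index + oldCap" can be checked mechanically. In this sketch, DoublingDemo and newIndex() are hypothetical names; the loop compares the shortcut against the full mask hash & (newCap - 1) for a range of hashes, including negative ones.

```java
final class DoublingDemo {
    // Shortcut used during resize: look at the single bit selected by oldCap.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        System.out.println(newIndex(5, oldCap));  // 5
        System.out.println(newIndex(21, oldCap)); // 21

        // The shortcut agrees with recomputing the index against the new mask.
        for (int h = -1000; h <= 1000; h++) {
            if (newIndex(h, oldCap) != (h & (2 * oldCap - 1))) {
                throw new AssertionError("mismatch at hash " + h);
            }
        }
    }
}
```

This only works because the capacity is a power of two and grows by exactly one bit per resize; with an arbitrary growth factor every index would have to be recomputed.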
5. Source code
(1) put
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab;
    Node<K,V> p;
    int n, i;
    // If the table has not been allocated yet, call resize() to create it and store its length in n
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // If the target bucket is empty, wrap the key-value pair in a new Node and store it there; done.
    // (n - 1) & hash computes the index, i.e. JDK 7's indexFor()
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // Otherwise the bucket already holds at least one node
    else {
        Node<K,V> e; K k;
        // If the first node has the same key, remember it so that its value is replaced below
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // If the bucket holds a TreeNode, delegate to putTreeVal
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // Otherwise walk the list and append a new node at the tail
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // Convert the list to a red-black tree once it is long enough
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // Resize if the new size exceeds the threshold
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
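The return values of putVal are visible through the public API. A small usage sketch follows; PutDemo and demo() are hypothetical names, while put, putIfAbsent, and get are the standard java.util.Map methods (putIfAbsent corresponds to the onlyIfAbsent = true path).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class PutDemo {
    // Returns what each call hands back, in order.
    static Integer[] demo() {
        Map<String, Integer> m = new HashMap<>();
        Integer first = m.put("a", 1);          // no previous mapping: null
        Integer second = m.put("a", 2);         // existing key: old value returned, value replaced
        Integer third = m.putIfAbsent("a", 3);  // existing non-null value is kept and returned
        Integer current = m.get("a");           // still 2
        return new Integer[] { first, second, third, current };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [null, 1, 2, 2]
    }
}
```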
(2) resize
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // Already at the maximum capacity: stop growing and accept more collisions
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Below the maximum: double the capacity
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else { // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // Compute the new resize threshold
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // Move every bucket into the new table
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // linked list: optimized rehash that preserves relative order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // The lo list stays at the original index
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // The hi list goes to original index + oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
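The doubling of capacity and threshold in resize() determines how large the table ends up for a given number of entries. The sketch below follows only the threshold rule for a map created with the defaults (capacity 16, load factor 0.75). GrowthDemo and capacityAfter() are hypothetical names, and the sketch deliberately ignores MAXIMUM_CAPACITY and the extra resizes that treeifyBin can trigger on small tables, so treat it as an approximation for small inputs.

```java
final class GrowthDemo {
    // Hypothetical helper: table length implied by the threshold rule after
    // inserting `entries` distinct keys into a default-constructed map.
    // Assumes small inputs; MAXIMUM_CAPACITY handling is left out.
    static int capacityAfter(int entries) {
        int cap = 16;                         // DEFAULT_INITIAL_CAPACITY
        int threshold = (int) (cap * 0.75f);  // 12
        while (entries > threshold) {         // mirrors: if (++size > threshold) resize();
            cap <<= 1;                        // newCap = oldCap << 1
            threshold <<= 1;                  // newThr = oldThr << 1
        }
        return cap;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12));  // 16
        System.out.println(capacityAfter(13));  // 32
        System.out.println(capacityAfter(100)); // 256
    }
}
```

When the number of entries is known up front, passing a suitable initial capacity to the HashMap constructor avoids these intermediate resizes and the bucket moves that come with each one.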