In the same series
1. HashMap Source Code Analysis: Default Parameters
3. HashMap Source Code Analysis: Resizing
Related fields
/* ---------------- Fields -------------- */
transient Node<K,V>[] table;
transient int size;
int threshold;
final float loadFactor;
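/*
1. Node<K,V>[] table is the hash table itself. Node is the plain node type covered in the previous article; it stores the hash, the key, the value, and a reference to the next node.
2. size is the current number of key-value mappings.
3. threshold is the resize threshold: when size > threshold, the table is resized and its entries are rehashed.
4. loadFactor is the load factor of this map.
*/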
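To make the relationship between size, threshold, and loadFactor concrete, here is a minimal arithmetic sketch. It assumes the common case where the threshold is the capacity multiplied by the load factor, and it uses 16 and 0.75 as example values; ThresholdSketch and thresholdFor are hypothetical names used only for illustration, not part of HashMap.

```java
class ThresholdSketch {
    // Hypothetical helper: the threshold a table of the given capacity would
    // have, assuming threshold = capacity * loadFactor.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;        // example capacity (assumption)
        float loadFactor = 0.75f; // example load factor (assumption)
        int threshold = thresholdFor(capacity, loadFactor);
        System.out.println("threshold = " + threshold); // prints "threshold = 12"
        // The resize check is size > threshold, so the insertion that raises
        // size to threshold + 1 is the one that triggers it.
        System.out.println("size that triggers a resize = " + (threshold + 1));
    }
}
```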
The hash method
/**
* Computes key.hashCode() and spreads (XORs) higher bits of hash
* to lower. Because the table uses power-of-two masking, sets of
* hashes that vary only in bits above the current mask will
* always collide. (Among known examples are sets of Float keys
* holding consecutive whole numbers in small tables.) So we
* apply a transform that spreads the impact of higher bits
* downward. There is a tradeoff between speed, utility, and
* quality of bit-spreading. Because many common sets of hashes
* are already reasonably distributed (so don't benefit from
* spreading), and because we use trees to handle large sets of
* collisions in bins, we just XOR some shifted bits in the
* cheapest possible way to reduce systematic lossage, as well as
* to incorporate impact of the highest bits that would otherwise
* never be used in index calculations because of table bounds.
*/
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
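Why do the high 16 bits of the key's hash take part in the computation?
Gist of the comment, plus my own reading:
1. The bucket index uses only the low bits of the hash, so hashes that differ only in the high bits would always collide. Folding the high 16 bits into the low 16 lets them influence the index and reduces collisions.
2. A single shift and XOR is about the cheapest transform that achieves this. Bins with many collisions are handled by trees anyway, so a more expensive mixing function is not worth the cost.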
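The effect can be seen in a small sketch. The class name HashSpreadDemo, the helper names spread and indexFor, and the three sample hash codes are all assumptions chosen for illustration; only the h ^ (h >>> 16) transform and the hash & (n - 1) index come from the source.

```java
class HashSpreadDemo {
    // Same transform as HashMap.hash, applied to a raw hash code.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of capacity n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int n = 16;
        // Sample hash codes that differ only in bits above bit 15.
        int[] hashCodes = {0x10000, 0x20000, 0x30000};
        for (int h : hashCodes) {
            System.out.printf("hashCode=0x%x raw index=%d spread index=%d%n",
                    h, indexFor(h, n), indexFor(spread(h), n));
        }
    }
}
```

Without spreading, all three sample values land in bucket 0; after spreading they land in buckets 1, 2, and 3.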
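The tableSizeFor method
A method that guarantees the table capacity is a power of two. In the code below, the shift-and-OR steps copy the highest set bit into every lower position, doubling the run of 1s each time, so the result is a power of two minus 1. Adding 1 then gives a power of two.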
static final int tableSizeFor(int cap) {
    // In the diagrams, the leftmost 1 stands for the highest set bit of n.
    int n = cap - 1;   // 1xxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx (cap - 1 so an exact power of two maps to itself)
    n |= n >>> 1;      // 11xx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
    n |= n >>> 2;      // 1111 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
    n |= n >>> 4;      // 1111 1111 xxxx xxxx xxxx xxxx xxxx xxxx
    n |= n >>> 8;      // 1111 1111 1111 1111 xxxx xxxx xxxx xxxx
    n |= n >>> 16;     // 1111 1111 1111 1111 1111 1111 1111 1111 = (2^k) - 1
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1; // +1 gives 2^k
}
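As a quick check of the rounding behavior, the sketch below copies the method into a standalone class and prints the result for a few requested capacities. The class name TableSizeForDemo and the sample inputs are assumptions for illustration; MAXIMUM_CAPACITY is set to 1 << 30, the value used in java.util.HashMap.

```java
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Standalone copy of the rounding logic: smallest power of two >= cap.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] requested = {0, 1, 10, 16, 17, 1000};
        for (int cap : requested) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

The inputs 10, 16, and 17 map to 16, 16, and 32: an exact power of two is returned unchanged because of the initial cap - 1.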
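In addition, an element is mapped to its bucket with hash & (n - 1), where n is the table capacity. This mask only works when n is a power of two, which is why a resize must double the capacity (multiply it by a power of two).
In binary terms, with n = 2^k, hash & (n - 1) is simply the low k bits of hash.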
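The masking step can be tried out in isolation. In the sketch below, IndexMaskDemo, indexFor, and the sample hash values are hypothetical names and inputs chosen for illustration. It compares the mask with Math.floorMod and shows what happens to the index when the capacity doubles.

```java
class IndexMaskDemo {
    // Bucket index for a table of capacity n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return hash & (n - 1);
    }

    public static void main(String[] args) {
        int n = 16; // 2^4, so the index is the low 4 bits of the hash
        int[] hashes = {5, 21, 37, -1};
        for (int h : hashes) {
            int masked = indexFor(h, n);
            int modded = Math.floorMod(h, n);     // same value when n is a power of two
            int doubled = indexFor(h, 2 * n);     // index after the capacity doubles
            System.out.printf("hash=%d mask=%d floorMod=%d afterDoubling=%d%n",
                    h, masked, modded, doubled);
        }
    }
}
```

For these inputs, 5, 21, and 37 all map to bucket 5 at capacity 16. After doubling to 32, each entry either stays at its old index (5 and 37) or moves by exactly the old capacity (21 moves to 5 + 16 = 21), because doubling adds just one more bit to the mask.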
The put method
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length; // lazily initialize the table
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null); // empty bucket: place the node directly
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p; // the first node in the bucket already holds this key
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value); // tree bin
        else { // walk the linked list in the bucket
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null); // append to the end of the list
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash); // convert to a red-black tree (or resize if the table is smaller than MIN_TREEIFY_CAPACITY)
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // a mapping for this key already exists
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value; // overwrite
            afterNodeAccess(e);
            return oldValue; // return the previous value
        }
    }
    ++modCount;
    if (++size > threshold)
        resize(); // grow the table
    afterNodeInsertion(evict);
    return null;
}
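The return value and the onlyIfAbsent flag are easiest to see from the caller's side. The following is a small usage sketch against the public java.util.HashMap API; the class name PutReturnDemo and the sample keys are assumptions, and putIfAbsent is used here as the public entry point that exercises the onlyIfAbsent branch.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();

        System.out.println(map.put("a", 1));         // null: no previous mapping
        System.out.println(map.put("a", 2));         // 1: old value returned, then overwritten
        System.out.println(map.putIfAbsent("a", 3)); // 2: existing non-null value is kept
        System.out.println(map.get("a"));            // 2

        // An existing mapping to null is still replaced by putIfAbsent,
        // matching the "oldValue == null" check in putVal.
        map.put("n", null);
        System.out.println(map.putIfAbsent("n", 5)); // null
        System.out.println(map.get("n"));            // 5
    }
}
```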
The get method
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check the first node in the bucket
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key); // tree bin
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null); // walk the rest of the bucket
        }
    }
    return null;
}
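One consequence of get returning null when getNode finds nothing is that a null result is ambiguous: the key may be absent, or it may be mapped to null. The usage sketch below illustrates this with the public API; GetNullDemo and the sample keys are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class GetNullDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("present", null);           // key exists, value is null
        map.put(null, "value for null key"); // the null key hashes to 0

        System.out.println(map.get("present"));         // null
        System.out.println(map.get("absent"));          // null
        System.out.println(map.containsKey("present")); // true
        System.out.println(map.containsKey("absent"));  // false
        System.out.println(map.get(null));              // value for null key
    }
}
```

When the distinction matters, containsKey is the call that tells the two cases apart.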