HashMap的实现原理面试简单解答
补充面试题:为什么hashcode和equals一般同时重写。
hashmap如何判断出现了碰撞,然后存储在链表中。
hash算法。
这里不剖析源码只简单讲解:
1原理简单解析
数组结构
HashMap内部结构为数组加链表方式,这里可以知道HashMap解决冲突的方法是链地址法,
Node<K,V>[] table,Node节点里包括hash值,key,value,nextNode,HashMap中key值得Hash值得计算
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16)(1.8jdk);
上面公式用到了key的hashCode值,每种类型的key都会重写
hashCode的计算方法,例如String:(重写了hashCode一般需要重写equals)
public int hashCode() {
int h = hash;
final int len = length();
if (h == 0 && len > 0) {
for (int i = 0; i < len; i++) {
h = 31 * h + charAt(i);
}
hash = h;
}
return h;
}
(key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16)计算key的hash值最终会比较小,右移16位之后亦或使最终结果包含了高位和低位特性,碰撞的几率会更小。
最终会利用hash值和长度进行求余,确定key的位置。
V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
//n 是长度,然后& hash值得到key在数组中的位置
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
}
链表结构
在putVal函数中如果判断数组中已经存在node,此时调用putTreeVal函数把node放在node的链表中,如果链表过长会转换成红黑树存储
2 HashMap扩容
综合load factor和长度,如果需要扩容会调用resize函数,将原来的长度扩大一倍,根据扩容后的长度建立新的数组,然后把旧的数据放入到新数组中。默认大小为16,负载因子和默认大小都可以设置。
final Node<K,V>[] resize() {
//旧数组
Node<K,V>[] oldTab = table;
int oldCap = (oldTab == null) ? 0 : oldTab.length;
//旧长度
int oldThr = threshold;
int newCap, newThr = 0;
if (oldCap > 0) {
//一般情况扩容走这里
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
//得到新的长度
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
else { // zero initial threshold signifies using defaults
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
//赋值
threshold = newThr;
@SuppressWarnings({"rawtypes","unchecked"})
//建立新数组
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
//拷贝旧数组到新数组
if (oldTab != null) {
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
if ((e = oldTab[j]) != null) {
oldTab[j] = null;
if (e.next == null)
newTab[e.hash & (newCap - 1)] = e;
else if (e instanceof TreeNode)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
else { // preserve order
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
}
} while ((e = next) != null);
if (loTail != null) {
loTail.next = null;
newTab[j] = loHead;
}
if (hiTail != null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
}
}
}
}
return newTab;
}
hash算法中解决冲突的办法有很多,开放地址法,链地址发,再hash法等。
3 优化
/** * The bin count threshold for using a tree rather than list for a * bin. Bins are converted to trees when adding an element to a * bin with at least this many nodes. The value must be greater * than 2 and should be at least 8 to mesh with assumptions in * tree removal about conversion back to plain bins upon * shrinkage. */ static final int TREEIFY_THRESHOLD = 8;
大致意思就是当list链表长度超过8时就用tree树进行存储。
/**
* Replaces all linked nodes in bin at index for given hash unless
* table is too small, in which case resizes instead.
*/
final void treeifyBin(Node<K,V>[] tab, int hash) {
int n, index; Node<K,V> e;
if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
resize();
else if ((e = tab[index = (n - 1) & hash]) != null) {
TreeNode<K,V> hd = null, tl = null;
do {
TreeNode<K,V> p = replacementTreeNode(e, null);
if (tl == null)
hd = p;
else {
p.prev = tl;
tl.next = p;
}
tl = p;
} while ((e = e.next) != null);
if ((tab[index] = hd) != null)
hd.treeify(tab);
}
}
/**
* Forms tree of the nodes linked from this node.
* @return root of tree
*/
final void treeify(Node<K,V>[] tab) {
TreeNode<K,V> root = null;
for (TreeNode<K,V> x = this, next; x != null; x = next) {
next = (TreeNode<K,V>)x.next;
x.left = x.right = null;
if (root == null) {
x.parent = null;
x.red = false;
root = x;
}
else {
K k = x.key;
int h = x.hash;
Class<?> kc = null;
for (TreeNode<K,V> p = root;;) {
int dir, ph;
K pk = p.key;
if ((ph = p.hash) > h)
dir = -1;
else if (ph < h)
dir = 1;
else if ((kc == null &&
(kc = comparableClassFor(k)) == null) ||
(dir = compareComparables(kc, k, pk)) == 0)
dir = tieBreakOrder(k, pk);
TreeNode<K,V> xp = p;
if ((p = (dir <= 0) ? p.left : p.right) == null) {
x.parent = xp;
if (dir <= 0)
xp.left = x;
else
xp.right = x;
root = balanceInsertion(root, x);
break;
}
}
}
}
moveRootToFront(tab, root);
}
上面是优化的具体代码,简单说就是把list转成了红黑树,查找效率更高。