HashMap: Hashing and Resizing
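Unlike the List<> collection classes, a HashMap is a collection of key-value pairs.
Suppose you had to write a HashMap class yourself: how would you go about it?
Here is one written the plain way a newcomer might write it, for everyone to look over.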
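package 键值对集合;

public class HashMap<k, v> {
    private Node<k, v>[] nodes;       // backing array
    private int capacity = 10;        // current array length
    private float factory = 0.8f;     // load factor
    private int threshold = (int) (capacity * factory);  // how many slots may fill before the array grows
    private int size;                 // number of stored entries

    @SuppressWarnings("unchecked")
    public HashMap() {
        nodes = new Node[capacity];
    }

    // A single key-value entry.
    static class Node<k, v> {
        k key;
        v val;

        public Node(k k, v v) {
            key = k;
            val = v;
        }
    }

    // Overwrites the value of an existing entry with the value of the new one.
    public void updateNode(Node<k, v> oldNode, Node<k, v> newNode) {
        oldNode.val = newNode.val;
    }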
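    // Inserts the pair, or overwrites the value if the key is already present.
    // Keys are assumed to be non-null.
    public void put(k key, v val) {
        Node<k, v> node = new Node<k, v>(key, val);
        // Grow first if slots 0..threshold-1 are all occupied.
        if (size >= threshold) {
            resizeCapacity();
        }
        // Linear scan from index 0: update on a matching key, otherwise fill the first free slot.
        for (int i = 0; i < threshold; i++) {
            Node<k, v> table = nodes[i];
            if (table == null) {
                putEntity(i, node);
                break;
            }
            if (table.key.equals(key)) {
                updateNode(table, node);
                break;
            }
        }
    }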
    // Doubles the array, copies the existing entries across, and recomputes the threshold.
    @SuppressWarnings("unchecked")
    private void resizeCapacity() {
        Node<k, v>[] ns = new Node[capacity = capacity * 2];
        System.arraycopy(nodes, 0, ns, 0, nodes.length);
        nodes = ns;
        threshold = (int) (capacity * factory);
    }

    // Stores the node in slot i and counts it.
    private void putEntity(int i, Node<k, v> node) {
        nodes[i] = node;
        size++;
    }
}
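This hand-written HashMap is mainly about resizing: when the capacity runs out, it allocates a larger array. put fills the array starting at index 0, and once the array holds threshold entries it resizes again.
A duplicate key is also detected by a for loop that starts from the beginning and checks whether the value for that key should be updated, so every put costs time proportional to the number of stored entries. That is the expensive part. Resizing is costly too, but it is a step that cannot be avoided.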
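To put rough numbers on the resizing cost, here is a minimal sketch of the doubling policy, assuming the same starting capacity (10) and load factor (0.8f) as the class above. GrowthDemo, thresholdFor and resizesNeeded are illustrative names that are not part of the original class, and the sketch assumes the array grows whenever the entry count exceeds the threshold.

```java
// Illustrative sketch only: models the capacity/threshold arithmetic, not a real map.
class GrowthDemo {
    // Same formula as the field initializer in the toy map: (int) (capacity * factory).
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Counts how many times the array must double before `entries` keys fit
    // under the threshold (assumption: grow whenever entries exceed it).
    static int resizesNeeded(int initialCapacity, float loadFactor, int entries) {
        int capacity = initialCapacity;
        int resizes = 0;
        while (entries > thresholdFor(capacity, loadFactor)) {
            capacity *= 2;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        int capacity = 10;
        for (int i = 0; i < 4; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + thresholdFor(capacity, 0.8f));
            capacity *= 2;
        }
        System.out.println("resizes for 1000 entries: "
                + resizesNeeded(10, 0.8f, 1000));
    }
}
```

Because the capacity doubles each time, the number of resizes grows only logarithmically with the number of entries, which is why the occasional copy is tolerable while the per-put linear scan is not.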
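The HashMap that ships with the JDK is certainly not written this naively. Let's take a look at the system's HashMap (the names Collections.secondaryHash and HashMapEntry in the listing suggest it is the Android libcore version):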
@Override public V put(K key, V value) {
    if (key == null) {
        return putValueForNullKey(value);
    }

    int hash = Collections.secondaryHash(key);
    HashMapEntry<K, V>[] tab = table;
    int index = hash & (tab.length - 1);
    for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
        if (e.hash == hash && key.equals(e.key)) {
            preModify(e);
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }

    // No entry for (non-null) key is present; create one
    modCount++;
    if (size++ > threshold) {
        tab = doubleCapacity();
        index = hash & (tab.length - 1);
    }
    addNewEntry(key, value, hash, index);
    return null;
}
Three lines of this code are the key ones:
int hash = Collections.secondaryHash(key);
HashMapEntry<K, V>[] tab = table;
int index = hash & (tab.length - 1);
First the key is used to compute its hash code, which is where the hashing algorithm comes in.
From Collections.java:
public static int secondaryHash(Object key) {
    return secondaryHash(key.hashCode());
}

private static int secondaryHash(int h) {
    // Spread bits to regularize both segment and index locations,
    // using variant of single-word Wang/Jenkins hash.
    h += (h << 15) ^ 0xffffcd7d;
    h ^= (h >>> 10);
    h += (h << 3);
    h ^= (h >>> 6);
    h += (h << 2) + (h << 14);
    return h ^ (h >>> 16);
}
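Why run hashCode() through a second hash at all? The bucket index is taken from the low bits only, so hash codes that differ only in their high bits would otherwise all land in the same bucket. The sketch below illustrates this under a few assumptions: a 16-slot table, sixteen made-up hash codes of the form i << 16, and a copy of the secondaryHash(int) listing above. SpreadDemo and distinctBuckets are hypothetical names used only for this example.

```java
import java.util.HashSet;
import java.util.Set;

class SpreadDemo {
    // Copied from the Collections.java listing above.
    static int secondaryHash(int h) {
        h += (h << 15) ^ 0xffffcd7d;
        h ^= (h >>> 10);
        h += (h << 3);
        h ^= (h >>> 6);
        h += (h << 2) + (h << 14);
        return h ^ (h >>> 16);
    }

    // Counts the distinct buckets hit by the hash codes 0<<16, 1<<16, ..., 15<<16
    // in a 16-slot table, with or without the secondary hash applied first.
    static int distinctBuckets(boolean spread) {
        int length = 16; // assumed table length (a power of two)
        Set<Integer> buckets = new HashSet<>();
        for (int i = 0; i < 16; i++) {
            int hashCode = i << 16; // differs only in bits 16..19
            int hash = spread ? secondaryHash(hashCode) : hashCode;
            buckets.add(hash & (length - 1));
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("buckets without secondaryHash: " + distinctBuckets(false));
        System.out.println("buckets with secondaryHash:    " + distinctBuckets(true));
    }
}
```

Without the secondary hash every one of these keys maps to bucket 0, because their low four bits are all zero. With it, the high bits are folded down into the low bits and the keys spread over several buckets.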
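secondaryHash returns a hash value, and hash & (tab.length - 1) then maps that value to a bucket. Because the table length is always a power of two, the mask keeps only the low bits of the hash, so the index can never fall outside the array.
Hashing straight to a bucket, instead of scanning from the start, is where this differs from the HashMap we imagined. The rest of the flow (update the value on a matching key, grow once the threshold is exceeded) is much as we imagined. For the details, read the source of java.util.HashMap.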
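The masking step can be checked in isolation. The following is a small sketch, not code from the platform: MaskIndexDemo, indexFor and reachable are made-up names. It shows that for a power-of-two length the mask behaves like a non-negative modulo even for negative hashes, and that for a length such as 10 (the starting capacity of the toy map earlier) the same mask can only ever reach a few slots.

```java
import java.util.TreeSet;

class MaskIndexDemo {
    // Same expression as in put(): keep only the bits selected by (length - 1).
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Collects every index produced for hashes in [-1000, 1000].
    static TreeSet<Integer> reachable(int length) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int hash = -1000; hash <= 1000; hash++) {
            seen.add(indexFor(hash, length));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Negative hash, power-of-two length: still a valid index.
        System.out.println(indexFor(-1, 16));        // 15
        System.out.println(Math.floorMod(-1, 16));   // 15, same as the mask
        // Length 16 reaches all 16 slots; length 10 (mask 1001 in binary) does not.
        System.out.println(reachable(16).size());    // 16
        System.out.println(reachable(10));           // [0, 1, 8, 9]
    }
}
```

This is why the mask trick only makes sense together with a capacity that starts as a power of two and grows by doubling: with any other length, some slots would never be used.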