简要说明
HashMap内部存储采用的是一个一维数组,数组长度为2的倍数。数组默认大小为16,如果指定初始化大小为N,则数组长度为小于等于N的最大偶数(2的倍数)。通过key将value映射到数据中的某个位置,发生位置冲突时,使用数组元素的next指针(引用)将发生冲突的元素连接起来,构成一个链表结构,最新存入的元素放在链表最前面。
HashMap存储结构
HashMap存储元素的数组定义如下:
/**
* The table, resized as necessary. Length MUST Always be a power of two.
*/
transient Entry[] table;
Entry包含四个字段:key,value,hash,next。
- key:存入value时使用的关键字对象;
- value:要存储的对象;
- hash:HashMap内存计算出来的一个hash值;
- next:指向下一个元素的指针(引用),发生位置冲突时候把映射位置相同的元素链接起来。
Entry类定义如下(省略掉一些普通方法):
static class Entry<K,V> implements Map.Entry<K,V> {
final K key;
V value;
Entry<K,V> next;
final int hash;
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
public final boolean equals(Object o) {
if (!(o instanceof Map.Entry))
return false;
Map.Entry e = (Map.Entry)o;
Object k1 = getKey();
Object k2 = e.getKey();
if (k1 == k2 || (k1 != null && k1.equals(k2))) {
Object v1 = getValue();
Object v2 = e.getValue();
if (v1 == v2 || (v1 != null && v1.equals(v2)))
return true;
}
return false;
}
public final int hashCode() {
return (key==null ? 0 : key.hashCode()) ^
(value==null ? 0 : value.hashCode());
}
}
HashMap初始化
数组默认大小为16,如果指定初始化大小为N,则数组长度为小于等于N的最大偶数(2的倍数)。数组大小会根据元素数量进行扩容,当元素个数大于某个阈值(threshold)时,就会触发扩容。当容量达到最大(MAXIMUM_CAPACITY)时,HashMap就不会再进行扩容。threshold的值与加载因子(loadFactor)有关:threshold = (int)(capacity * loadFactor),loadFactor默认值为0.75,capacity为数组长度。
默认值代码:
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 16;
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
初始化代码:
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
// Find a power of 2 >= initialCapacity
int capacity = 1;
while (capacity < initialCapacity)
capacity <<= 1;
this.loadFactor = loadFactor;
threshold = (int)(capacity * loadFactor);
table = new Entry[capacity];
init();
}
HashMap存储位置映射算法
HashMap关键的算法,以get方法为例:
1. 调用key.hashCode()方法获取key的hashCode,然后对hashCode做一些计算,最后的到一个hash值。
2. 使用hash值,计算出元素应该储存的位置index。
3. 取出index位置的元素链表,查找目标元素e。查找过程中比较方法:hash == e.hash && (e.key == key || key.equals(e.k)),找到后,返回e.value。
get方法代码如下:
/**
* Returns the value to which the specified key is mapped,
* or {@code null} if this map contains no mapping for the key.
*
* <p>More formally, if this map contains a mapping from a key
* {@code k} to a value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise
* it returns {@code null}. (There can be at most one such mapping.)
*
* <p>A return value of {@code null} does not <i>necessarily</i>
* indicate that the map contains no mapping for the key; it's also
* possible that the map explicitly maps the key to {@code null}.
* The {@link #containsKey containsKey} operation may be used to
* distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
计算key的hash值的hash方法如下:
/**
* Applies a supplemental hash function to a given hashCode, which
* defends against poor quality hash functions. This is critical
* because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
static int hash(int h) {
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
根据hash值计算存储位置的indexFor方法如下:
/**
* Returns index for hash code h.
*/
static int indexFor(int h, int length) {
return h & (length-1);
}
HashMap扩容处理
添加元素(键值对)时,会检测是否需要扩容:当HashMap中元素个数大于等于threshold时,就会触发扩容。
threshold = (int)(capacity * loadFactor)
扩容原理:重新分配一个数组table,然后把原数组中的元素逐个重新映射到新数组中,而非直接拷贝进去,因为数组长度发生了变化,映射位置也会发生变化。从上面的indexFor方法可以看出这一点:映射位置 = h & (length-1),h是根据key的hashCode计算出来的一个哈希值,length是数组长度。
addEntry方法代码如下:
/**
* Adds a new entry with the specified key, value and hash code to
* the specified bucket. It is the responsibility of this
* method to resize the table if appropriate.
*
* Subclass overrides this to alter the behavior of put method.
*/
void addEntry(int hash, K key, V value, int bucketIndex) {
Entry<K,V> e = table[bucketIndex];
table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
if (size++ >= threshold)
resize(2 * table.length);
}
负责扩容的resize方法代码如下:
/**
* Rehashes the contents of this map into a new array with a
* larger capacity. This method is called automatically when the
* number of keys in this map reaches its threshold.
*
* If current capacity is MAXIMUM_CAPACITY, this method does not
* resize the map, but sets threshold to Integer.MAX_VALUE.
* This has the effect of preventing future calls.
*
* @param newCapacity the new capacity, MUST be a power of two;
* must be greater than current capacity unless current
* capacity is MAXIMUM_CAPACITY (in which case value
* is irrelevant).
*/
void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}
Entry[] newTable = new Entry[newCapacity];
transfer(newTable);
table = newTable;
threshold = (int)(newCapacity * loadFactor);
}
/**
* Transfers all entries from current table to newTable.
*/
void transfer(Entry[] newTable) {
Entry[] src = table;
int newCapacity = newTable.length;
for (int j = 0; j < src.length; j++) {
Entry<K,V> e = src[j];
if (e != null) {
src[j] = null;
do {
Entry<K,V> next = e.next;
int i = indexFor(e.hash, newCapacity);
e.next = newTable[i];
newTable[i] = e;
e = next;
} while (e != null);
}
}
}