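Reference: http://zhangshixi.iteye.com/blog/672697
HashMap is an unsynchronized implementation of Map, which is one of the differences between HashMap and Hashtable. Another is that HashMap accepts a null key and null values, whereas Hashtable allows neither.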
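A minimal sketch of the null-key difference, using only java.util. The class and method names (NullKeyDemo, acceptsNullKey) are made up for this illustration:

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

final class NullKeyDemo {
    /** Returns true if the map accepts a null key, false if put throws NullPointerException. */
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "v");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap:   " + acceptsNullKey(new HashMap<String, String>()));
        System.out.println("Hashtable: " + acceptsNullKey(new Hashtable<String, String>()));
    }
}
```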
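I. Data structure
HashMap is implemented as an array plus linked lists. Its core structure is the inner class Entry:
static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;        // the key
    V value;            // the value
    Entry<K,V> next;    // next node in the same bucket's linked list
    final int hash;     // cached hash of the key, used to locate the bucket in table
    // ... constructor and methods omitted ...
}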
equals is one of Entry's key methods. It decides whether two list nodes are equal:
public final boolean equals(Object o) {
    if (!(o instanceof Map.Entry))
        return false;
    Map.Entry e = (Map.Entry)o;
    Object k1 = getKey();
    Object k2 = e.getKey();
    if (k1 == k2 || (k1 != null && k1.equals(k2))) { // compare the keys first
        Object v1 = getValue();
        Object v2 = e.getValue();
        if (v1 == v2 || (v1 != null && v1.equals(v2))) // keys are equal, so compare the values
            return true;
    }
    return false;
}
hashCode is another of Entry's key methods:
public final int hashCode() { // an entry's hashCode is the XOR of the key's and the value's hashCode
    return (key==null   ? 0 : key.hashCode()) ^
           (value==null ? 0 : value.hashCode());
}
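The XOR rule above can be checked against a standard entry implementation. This is a small sketch; EntryHashDemo and entryHash are hypothetical names, and java.util.AbstractMap.SimpleEntry stands in for HashMap's own Entry class:

```java
import java.util.AbstractMap;
import java.util.Map;

final class EntryHashDemo {
    /** Re-implements the XOR rule shown above (hypothetical helper, not part of HashMap). */
    static int entryHash(Object key, Object value) {
        return (key == null ? 0 : key.hashCode())
             ^ (value == null ? 0 : value.hashCode());
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> e = new AbstractMap.SimpleEntry<>("a", 1);
        // "a".hashCode() is 97 and Integer 1 hashes to 1, so both lines print 96.
        System.out.println(entryHash("a", 1));
        System.out.println(e.hashCode());
    }
}
```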
1. The default initial capacity of the hash table is 16
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 16;
2. The maximum capacity the hash table can grow to is 2^30
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
3. The default load factor of the hash table is 0.75
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
4. The hash table itself, Entry[] table
/**
* The table, resized as necessary. Length MUST Always be a power of two.
*/
transient Entry[] table;
5. The number of entries currently stored in the HashMap
/**
* The number of key-value mappings contained in this map.
*/
transient int size;
6. The threshold, i.e. table capacity * load factor
/**
* The next size value at which to resize (capacity * load factor).
* @serial
*/
int threshold;
7. The load factor. Once the number of entries exceeds capacity * loadFactor the table must be resized; the default is 0.75
/**
* The load factor for the hash table.
*
* @serial
*/
final float loadFactor;
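8. The number of structural modifications made to the HashMap. It is incremented when put adds a new mapping and when remove deletes one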
/**
* The number of times this HashMap has been structurally modified
* Structural modifications are those that change the number of mappings in
* the HashMap or otherwise modify its internal structure (e.g.,
* rehash). This field is used to make iterators on Collection-views of
* the HashMap fail-fast. (See ConcurrentModificationException).
*/
transient volatile int modCount;
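The relationship between capacity, load factor and threshold in the fields above can be sketched as follows. ThresholdDemo and its threshold method are hypothetical names; the arithmetic mirrors the (int)(capacity * loadFactor) expression that resize() uses later in this post:

```java
final class ThresholdDemo {
    /** Hypothetical helper: the size at which a table of the given capacity is resized. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // With the defaults (16, 0.75f) the threshold is 12; doubling the capacity doubles it.
        for (int capacity = 16; capacity <= 128; capacity *= 2) {
            System.out.println(capacity + " -> " + threshold(capacity, 0.75f));
        }
    }
}
```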
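III. Key methods
HashMap is not thread-safe because its methods are not synchronized. When several threads add or remove entries concurrently, they can operate on the same bucket at the same time and corrupt the map.
1. Adding an element: put()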
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { // the key already exists: overwrite its value
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++; // the key does not exist yet: add a new node
    addEntry(hash, key, value, i);
    return null;
}
1.1: If the key is null, putForNullKey inserts the mapping. The entry for the null key lives in the chain at table[0]; other keys can also map to table[0], so the null entry is just one member of that chain. As the source shows, the hash used for null is 0:
private V putForNullKey(V value) {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++; // a new entry is being added to table[0], so bump modCount
    addEntry(0, null, value, 0);
    return null;
}
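1.2: Before a bucket in table[] can be located, the key's hash code is rehashed with hash(key.hashCode()).
This function folds the high-order bits into the low-order bits, which prevents hash collisions between hashCodes whose low bits are identical and that differ only in their high bits.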
/**
* Applies a supplemental hash function to a given hashCode, which
* defends against poor quality hash functions. This is critical
* because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
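To see what the supplemental hash buys, the sketch below copies hash() and indexFor() from the source quoted in this post into a hypothetical class, SupplementalHashDemo, and feeds it two hashCodes that differ only in their high bits. The sample values 0x10000 and 0x20000 are chosen for illustration:

```java
final class SupplementalHashDemo {
    // Copied from the JDK 6 source quoted in this post.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000, b = 0x20000; // low 16 bits are identical (all zero)
        // Without rehashing, both land in bucket 0 of a 16-slot table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));             // 0 0
        // After rehashing, the high bits influence the index: buckets 1 and 2.
        System.out.println(indexFor(hash(a), 16) + " " + indexFor(hash(b), 16)); // 1 2
    }
}
```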
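1.3: Locating the bucket in table[] with indexFor(hash, table.length). A bitwise AND (&) maps the rehashed key onto a table index, which is cheaper than the modulo operator.
When length is a power of two, h & (length-1) gives the same result as h % length for every non-negative h, and & is more efficient than %. For a negative h, & still yields a valid non-negative index, whereas % would return a negative value.
The reason is that 2^n - 1 is n one-bits in binary, so ANDing it with h keeps exactly the low n bits of h, which is the remainder of dividing by 2^n.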
/**
* Returns index for hash code h.
*/
static int indexFor(int h, int length) {
    return h & (length-1);
}
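The equivalence with modulo, and where it stops holding, can be checked with a short sketch. IndexForDemo and matchesModulo are hypothetical names; Math.floorMod is the standard library method (Java 8 and later) that matches the mask for negative inputs too:

```java
final class IndexForDemo {
    // Copied from the source above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    /** Hypothetical helper: does the mask agree with the % operator for this h? */
    static boolean matchesModulo(int h, int length) {
        return indexFor(h, length) == h % length;
    }

    public static void main(String[] args) {
        System.out.println(matchesModulo(37, 16));      // true: 37 & 15 == 37 % 16 == 5
        System.out.println(matchesModulo(-7, 16));      // false: -7 % 16 is -7
        System.out.println(indexFor(-7, 16));           // 9, still a valid index
        System.out.println(Math.floorMod(-7, 16));      // 9, floorMod agrees with the mask
    }
}
```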
1.4: Adding a new node with addEntry(). Every new node is inserted at the head of the chain at table[index]
void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}
/**
 * Creates new entry.
 */
Entry(int h, K k, V v, Entry<K,V> n) {
    value = v;
    next = n;
    key = k;
    hash = h;
}
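Head insertion means the most recently added node is the first one found when the chain is walked. A minimal sketch with a hypothetical Node class standing in for Entry (HeadInsertDemo, addAtHead and keys are illustrative names, not JDK code):

```java
final class HeadInsertDemo {
    /** Hypothetical stand-in for HashMap.Entry: just a key and a next pointer. */
    static final class Node {
        final String key;
        final Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    /** Mirrors table[i] = new Entry(hash, key, value, table[i]): the new node becomes the head. */
    static Node addAtHead(Node head, String key) {
        return new Node(key, head);
    }

    /** Joins the keys in traversal order, e.g. "c,b,a". */
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node bucket = null;
        for (String k : new String[] {"a", "b", "c"}) bucket = addAtHead(bucket, k);
        System.out.println(keys(bucket)); // c,b,a
    }
}
```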
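1.5: Resizing with resize(). If adding a new node makes the element count size exceed threshold, the table is resized to 2*table.length
Because table[] is a fixed-size array, resizing really means allocating a new array of twice the size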
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
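1.6: Moving the entries of the old table into the new table is an expensive operation
The transfer walks the old table and, for each entry, uses the entry's stored hash (the rehashed hashCode of its key) to locate its slot in the new table
One detail worth noting about indexFor(h, newCapacity): when the array length is a power of two, length-1 has all of its low bits set, so every one of those hash bits takes part in the index and every slot is reachable. With any other length, some bits of length-1 would be 0 and the slots that need those bits could never be used. Keys therefore spread more evenly over the array, collisions are less likely, and lookups need fewer comparisons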
/**
* Transfers all entries from current table to newTable.
*/
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity); // recompute the slot in the new table
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}
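A consequence of doubling a power-of-two table, not spelled out in the text above, is that an entry in old slot j can only end up in new slot j or j + oldCapacity, because the new mask exposes exactly one more hash bit. The sketch below checks this property; ResizeSplitDemo and staysOrMovesByOldCapacity are hypothetical names:

```java
final class ResizeSplitDemo {
    // Copied from the source above.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    /** Hypothetical check: after doubling, is the new index j or j + oldCapacity? */
    static boolean staysOrMovesByOldCapacity(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, 2 * oldCapacity);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        // Hashes 5 and 21 share slot 5 in a 16-slot table.
        System.out.println(indexFor(5, 16) + " " + indexFor(21, 16)); // 5 5
        // In a 32-slot table they separate: 5 stays, 21 moves to 5 + 16.
        System.out.println(indexFor(5, 32) + " " + indexFor(21, 32)); // 5 21
    }
}
```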
2. Getting an element: get()
2.1: If the key is null, getForNullKey() looks it up in table[0]
2.2: Otherwise, key.hashCode() is rehashed and mapped to an array index, and the chain there is traversed comparing hash and key
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
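Because get() returns null both when the key is absent and when the key is mapped to null, callers that store null values need containsKey() to tell the two apart. A small usage sketch; GetDemo and describe are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

final class GetDemo {
    /** Hypothetical helper that distinguishes the three outcomes of a lookup. */
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) return "present: " + value;
        return map.containsKey(key) ? "present with null value" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put(null, "x");   // the null key is allowed
        map.put("k", null);   // so is a null value
        System.out.println(describe(map, null));      // present: x
        System.out.println(describe(map, "k"));       // present with null value
        System.out.println(describe(map, "missing")); // absent
    }
}
```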
3. Removing an element: remove()
As noted above, modCount is incremented whenever put adds a new mapping or remove deletes one
final Entry<K,V> removeEntryForKey(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    int i = indexFor(hash, table.length);
    Entry<K,V> prev = table[i];
    Entry<K,V> e = prev;
    while (e != null) {
        Entry<K,V> next = e.next;
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k)))) {
            modCount++; // found the entry to delete, so bump the modification count
            size--;
            if (prev == e) // the entry is the head node: the next node takes its place
                table[i] = next;
            else
                prev.next = next;
            e.recordRemoval(this);
            return e;
        }
        prev = e;
        e = next;
    }
    return e;
}
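The prev/e pointer walk in removeEntryForKey can be isolated from the hashing logic. The following is a simplified sketch on a single chain, with a hypothetical Node class in place of Entry (UnlinkDemo, chain, remove and keys are illustrative names):

```java
final class UnlinkDemo {
    /** Hypothetical stand-in for HashMap.Entry: a key plus a next pointer. */
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    /** Builds a chain whose traversal order matches the argument order. */
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) head = new Node(keys[i], head);
        return head;
    }

    /** Unlinks the first node with the given key and returns the (possibly new) head. */
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e) return next; // e is the head: the chain now starts at next
                prev.next = next;           // otherwise bypass e
                return head;
            }
            prev = e;
            e = next;
        }
        return head;                        // key not found: chain unchanged
    }

    /** Joins the keys in traversal order, e.g. "a,b,c". */
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.key);
        }
        return sb.toString();
    }
}
```

For example, removing "b" from the chain a,b,c leaves a,c, while removing "a" makes "b" the new head.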
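4. The fail-fast mechanism
java.util.HashMap is not thread-safe. If the map is structurally modified while an iterator is in use, whether by another thread or by the same thread through anything other than the iterator's own remove(), the iterator throws ConcurrentModificationException. This is the fail-fast policy; it is a best-effort check, not a guarantee.
So that each thread sees the latest value of modCount promptly, the field is declared volatile in this version of the source.
The principle is as follows: when an iterator is created, the map's current modCount is copied into the iterator's expectedModCount. If a later check finds that modCount and expectedModCount differ, the HashMap was modified during iteration and ConcurrentModificationException is thrown.
This is done by an inner class of HashMap, whose source is shown below: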
private abstract class HashIterator<E> implements Iterator<E> {
    Entry<K,V> next;        // next entry to return
    int expectedModCount;   // For fast-fail: modCount as seen when the iterator was created
    int index;              // current slot
    Entry<K,V> current;     // current entry
    HashIterator() {
        expectedModCount = modCount; // remember the current modification count
        if (size > 0) { // advance to first entry
            Entry[] t = table; // scan the table for the first non-empty slot
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
    }
    public final boolean hasNext() {
        return next != null;
    }
    final Entry<K,V> nextEntry() {
        if (modCount != expectedModCount) // the map was modified behind the iterator's back
            throw new ConcurrentModificationException();
        Entry<K,V> e = next;
        if (e == null)
            throw new NoSuchElementException();
        if ((next = e.next) == null) {
            Entry[] t = table;
            while (index < t.length && (next = t[index++]) == null)
                ;
        }
        current = e;
        return e;
    }
    public void remove() {
        if (current == null)
            throw new IllegalStateException();
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
        Object k = current.key;
        current = null;
        HashMap.this.removeEntryForKey(k);
        expectedModCount = modCount;
    }
}
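The fail-fast check does not need a second thread to trigger. The sketch below uses the public java.util API only, with hypothetical names FailFastDemo, sample, directModificationThrows and removeEvenValues. It shows a structural modification during iteration being detected, and the iterator's own remove() being the safe alternative, since it resynchronizes expectedModCount:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    /** Adds a new key through the map while iterating; the next step of the iterator fails. */
    static boolean directModificationThrows() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.put("new-" + key, 0); // structural modification: modCount changes
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Removes entries through the iterator, which keeps expectedModCount in sync. */
    static Map<String, Integer> removeEvenValues() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 0) it.remove();
        }
        return map;
    }
}
```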