我尽量不打错别字,用词准确,不造成阅读障碍。
注:本文基于Android API 24 Platform 中 android.jar下的HashMap,跟Oracle的JDK 1.8 还是很多不一样的,Oracle使用的是红黑树,差别挺大,似乎和Oracle的JDK 1.6 很像。
看源码是很枯燥的,请静下心来。
一. 基本知识
1.HashMap是一个散列表,存储是基于键值对(key-value)的映射。
2.HashMap继承AbstractMap,实现了Map、Cloneable、Serializable接口。
3.HashMap是线程不安全的,不同步,且key、value可以为null,与之相反的是HashTable——安全、同步、不可为null;最后,它不是有序的,有序的是TreeMap。
HashMap是数组与链表的结合,也就是说数组中的元素是链表的结点。
HashMap有几个重要的参数:负载因子和初始容量
初始容量:表示HashMap的初始大小,也就是默认大小。
负载因子:当HashMap容量不够的时候需要扩容,负载因子就是限定扩容的条件,默认是0.75,即达到默认/初次指定容量的75%就扩容,每次扩容2倍,即<< 1。
HashCode:请百度,用来确定数组下标。
HashMap是通过“拉链法”实现的哈希表。
HashMap有几个重要的成员变量:
table:数组名;
size:HashMap的大小,保存的键值对数量;
threshold:阈值,用于判断是否扩容,threshold=容量*负载因子;
loadFactor:负载因子;
modCount:实现fail-fast机制,当表中数据发生变化时会加1,如put()和clear()时。
主要API
返回值 | 方法名 |
---|---|
void | clear() |
Object | clone() |
boolean | containsKey(Object key) |
boolean | containsValue(Object value) |
Set<Entry<K,V>> | entrySet() |
V | get(Object key) |
boolean | isEmpty() |
Set | keySet() |
V | put(K key,V value) |
void | putAll(Map< ? extends K,?extends V> map) |
V | remove(Object key) |
int | size() |
Collection | values() |
二.源码解释
数组
/**
* An empty table instance to share when the table is not inflated.
* Orcle的JDK中名字叫Node<K,V>
*/
static final HashMapEntry<?,?>[] EMPTY_TABLE = {};
/**
* The table, resized as necessary. Length MUST Always be a power of two.
* Orcle的JDK中名字叫Node<K,V>
*/
transient HashMapEntry<K,V>[] table = (HashMapEntry<K,V>[]) EMPTY_TABLE;
数组名叫table,初始化时为空。
HashMapEntry是HashMap的静态内部类,数据节点都保存在这里面:
static class HashMapEntry<K,V> implements Map.Entry<K,V> {
final K key;
V value;
HashMapEntry<K,V> next; //下一个结点
int hash;
HashMapEntry(int h, K k, V v, HashMapEntry<K,V> n) {
value = v;
next = n;
key = k;
hash = h;
}
public final K getKey() {
return key;
}
public final V getValue() {
return value;
}
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
//判断是不是相等,key和value都要相等才会返回true
//Orcle的JDK写法这里不一样,但是都是同时比较key和value才可以
public final boolean equals(Object o) {
if (!(o instanceof Map.Entry))
return false;
Map.Entry e = (Map.Entry)o;
Object k1 = getKey();
Object k2 = e.getKey();
if (k1 == k2 || (k1 != null && k1.equals(k2))) {
Object v1 = getValue();
Object v2 = e.getValue();
if (v1 == v2 || (v1 != null && v1.equals(v2)))
return true;
}
return false;
}
//获取hashCode
public final int hashCode() {
return Objects.hashCode(getKey()) ^ Objects.hashCode(getValue());
}
public final String toString() {
return getKey() + "=" + getValue();
}
/**
* This method is invoked whenever the value in an entry is
* overwritten by an invocation of put(k,v) for a key k that's already
* in the HashMap.
* 往hashMap里面添加元素时调用这个方法
*/
void recordAccess(HashMap<K,V> m) {
}
/**
* This method is invoked whenever the entry is
* removed from the table.
* 往hashMap里面删除元素时调用这个方法
*/
void recordRemoval(HashMap<K,V> m) {
}
}
构造方法
构造方法只有四个,官方注释我就删掉了,太长。
//2参构造方法,一般都会走到这,接受初始大小和负载因子
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
//不能超过最大容量,最大容量是 1<<30; 即左移30位
if (initialCapacity > MAXIMUM_CAPACITY) {
initialCapacity = MAXIMUM_CAPACITY;
} else if (initialCapacity < DEFAULT_INITIAL_CAPACITY) {
//不能小于默认大小,否则就改为默认大小
initialCapacity = DEFAULT_INITIAL_CAPACITY;
}
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
// Android-Note: We always use the default load factor of 0.75f.
// This might appear wrong but it's just awkward design. We always call
// inflateTable() when table == EMPTY_TABLE. That method will take "threshold"
// to mean "capacity" and then replace it with the real threshold (i.e, multiplied with
// the load factor).
threshold = initialCapacity; //将指定大小赋值给扩容标识,用于后来扩容的判断
init(); //这是个空方法,子类可重写
}
//1参构造方法,指定初始大小
public HashMap(int initialCapacity) {
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
//无参构造方法,调用2参构造方法
public HashMap() {
//参数分别为;默认大小,负载因子,Orcale的JDK为16,Android Open JDK为4(没错),必须是2的次方
this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
//这个方法先调用2参构造方法,然后扩容,直接扩容;创建一个包含“子map”的HashMap
public HashMap(Map<? extends K, ? extends V> m) {
this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1, DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
inflateTable(threshold);
putAllForCreate(m); //将map中元素逐个添加到HashMap中。
}
put方法
最常用的方法之一,所以先介绍一下;
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
//如果数组是空的,但是有数据要put,则直接扩容
inflateTable(threshold);
}
if (key == null)
//如果key是null的,则将数据放入下标为0的数组第一个元素中,即table[0]中
return putForNullKey(value);
//获取hash值
int hash = sun.misc.Hashing.singleWordWangJenkinsHash(key);
//确定数组下标
int i = indexFor(hash, table.length);
//从数组下标处开始遍历链表,
//如果key相等,就将新的value替换旧的value,将旧的Value返回
//注意这里是新值替换旧值
for (HashMapEntry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
//这是个空方法
e.recordAccess(this);
return oldValue;
}
}
modCount++; //记录数据改动了一次,只有新的元素添加会+1,重复元素直接在for循环里就返回了
addEntry(hash, key, value, i);//新的元素添加至Map中
return null;
}
总的来说就是,先判断是否为null,是null就放到table[0]中,table[0]也是一个链表,也是旧的替换新的,将旧的返回,新元素则将modCount++,并调用addEntry(0,null,value,0);注意我这里的参数是确定值。不是null就获取hash值,确定数组下标(return hash & table.length - 1;),从确定的数组下标位置开始遍历链表,key相等就新的value替换旧的value,返回旧的value,是新元素就modCount + 1,元素添加至Map中。 这就是解决hash冲突常用的方法之一——拉链法。Oracle的put方法与Android Open JDK有较大出入,Oracle用的是红黑树。
addEntry方法(注意扩容是2倍)
void addEntry(int hash, K key, V value, int bucketIndex) {
//若大小超过了阈值并且确认所设置的数组下标为null,则扩容2倍
if ((size >= threshold) && (null != table[bucketIndex])) {
resize(2 * table.length);
hash = (null != key) ? sun.misc.Hashing.singleWordWangJenkinsHash(key) : 0;
bucketIndex = indexFor(hash, table.length);
}
//元素添加至数组中
createEntry(hash, key, value, bucketIndex);
}
createEntry方法很简单:
void createEntry(int hash, K key, V value, int bucketIndex) {
HashMapEntry<K,V> e = table[bucketIndex];
table[bucketIndex] = new HashMapEntry<>(hash, key, value, e);
size++;
}
将元素放入数组下标处,并size++;
这里inflateTable和resize都是扩容,但是我感觉resize才是真正的扩容,inflateTable是创建了一个新的HashMapEntry对象,长度为新指定的长度,并赋值给table,感觉比较适合没有数据时的初始时扩容,事实上,大多数情况下,该方法都是在table为空时需要扩容时调用。resize则是会new一个新的数组,将旧数组的内容加入新数组中,并重新调整阈值。
resize方法:
void resize(int newCapacity) {
HashMapEntry[] oldTable = table;
int oldCapacity = oldTable.length;
//如果旧的数组已经是最大了,就没有扩容的必要了,扩不了
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}
HashMapEntry[] newTable = new HashMapEntry[newCapacity];
transfer(newTable);
table = newTable;
//调整阈值
threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
感觉有必要展示一下transfer方法:
void transfer(HashMapEntry[] newTable) {
int newCapacity = newTable.length;
//循环table,将旧数据放入新table中
for (HashMapEntry<K,V> e : table) {
while(null != e) {
HashMapEntry<K,V> next = e.next;
int i = indexFor(e.hash, newCapacity);
e.next = newTable[i];
newTable[i] = e;
e = next;
}
}
}
扩容原理还是很简单的,自己画两个图就好理解了,注意的是扩容后hash值可能就变了,在旧数组的位置到新数组上不一定还是原来的位置。画图很麻烦,建议画一下,面试官可能会问你的。
get()方法
public V get(Object key) {
if (key == null)
//如果table[0]的key为null,就返回table[0]的value,否则返回null
return getForNullKey();
//HashMapEntry继承Entry
Entry<K,V> entry = getEntry(key);
return null == entry ? null : entry.getValue();
}
getEntry()
final Entry<K,V> getEntry(Object key) {
if (size == 0) {
return null;
}
//获取hash值
int hash = (key == null) ? 0 : sun.misc.Hashing.singleWordWangJenkinsHash(key);
//循环table中的链表,若key相同就返回整个e
for (HashMapEntry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
}
return null;
}
最长用的方法介绍完毕,接下来是几个次要一些的方法:
putAll()方法:将一整个map添加入HashMap中
public void putAll(Map<? extends K, ? extends V> m) {
int numKeysToBeAdded = m.size();
if (numKeysToBeAdded == 0)
return;
if (table == EMPTY_TABLE) {
inflateTable((int) Math.max(numKeysToBeAdded * loadFactor, threshold));
}
/*
* Expand the map if the map if the number of mappings to be added
* is greater than or equal to threshold. This is conservative; the
* obvious condition is (m.size() + size) >= threshold, but this
* condition could result in a map with twice the appropriate capacity,
* if the keys to be added overlap with the keys already in this map.
* By using the conservative calculation, we subject ourself
* to at most one extra resize.
*/
if (numKeysToBeAdded > threshold) {
int targetCapacity = (int)(numKeysToBeAdded / loadFactor + 1);
if (targetCapacity > MAXIMUM_CAPACITY)
targetCapacity = MAXIMUM_CAPACITY;
int newCapacity = table.length;
//如果我的map比要添加的map小,我就扩容2倍
while (newCapacity < targetCapacity)
newCapacity <<= 1;
if (newCapacity > table.length)
//新的map大小比原map大,则置换
resize(newCapacity);
}
//循环put要添加的map到新扩容的map中
for (Map.Entry<? extends K, ? extends V> e : m.entrySet())
put(e.getKey(), e.getValue());
}
remove()方法
public V remove(Object key) {
Entry<K,V> e = removeEntryForKey(key);
//把要remove的value返还给你,疑惑为什么不写boolean
return (e == null ? null : e.getValue());
}
removeEntryForKey()方法:
final Entry<K,V> removeEntryForKey(Object key) {
if (size == 0) {
return null;
}
int hash = (key == null) ? 0 : sun.misc.Hashing.singleWordWangJenkinsHash(key);
int i = indexFor(hash, table.length);
HashMapEntry<K,V> prev = table[i];
HashMapEntry<K,V> e = prev;
while (e != null) {
HashMapEntry<K,V> next = e.next;
Object k;
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k)))) {
modCount++;
size--;
if (prev == e) {
//说明是数组中第一个节点,直接将下一个节点放入数组中
table[i] = next;
}else {
//指向下一个节点
prev.next = next;
}
//空方法
e.recordRemoval(this);
return e;
}
prev = e;
e = next;
}
return e;
}
自己画个图就很快理解了。其实是我不想画图。。。。。。
clear()方法
public void clear() {
modCount++;
Arrays.fill(table, null); //里面是一个for循环,将table中的元素置为null
size = 0; //大小记为0
}
containsKey()方法
public boolean containsKey(Object key) {
return getEntry(key) != null; //就是getEntry方法,前面有
}
containsValue()方法
public boolean containsValue(Object value) {
if (value == null)
return containsNullValue(); //两个for循环嵌套遍历table判断:e.value == null
HashMapEntry[] tab = table;
for (int i = 0; i < tab.length ; i++)
for (HashMapEntry e = tab[i] ; e != null ; e = e.next)
if (value.equals(e.value))
return true;
return false;
}
EntrySet()方法、valuse()方法、keySet()方法
这些方法一般是用来遍历数据的。此处以EntrySet()为例
public Set<Map.Entry<K,V>> entrySet() {
return entrySet0();
}
private Set<Map.Entry<K,V>> entrySet0() {
Set<Map.Entry<K,V>> es = entrySet;
return es != null ? es : (entrySet = new EntrySet());
}
private final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
//使用迭代器
public Iterator<Map.Entry<K,V>> iterator() {
return newEntryIterator();
}
public boolean contains(Object o) {
if (!(o instanceof Map.Entry))
return false;
Map.Entry<K,V> e = (Map.Entry<K,V>) o;
Entry<K,V> candidate = getEntry(e.getKey());
return candidate != null && candidate.equals(e);
}
public boolean remove(Object o) {
return removeMapping(o) != null;
}
public int size() {
return size;
}
public void clear() {
HashMap.this.clear();
}
public final Spliterator<Map.Entry<K,V>> spliterator() {
return new EntrySpliterator<>(HashMap.this, 0, -1, 0, 0);
}
public final void forEach(Consumer<? super Map.Entry<K,V>> action) {
HashMapEntry<K,V>[] tab;
if (action == null)
throw new NullPointerException();
if (size > 0 && (tab = table) != null) {
int mc = modCount;
for (int i = 0; i < tab.length; ++i) {
for (HashMapEntry<K,V> e = tab[i]; e != null; e = e.next) {
action.accept(e);
// Android-modified - this was outside of the loop, inconsistent with other
// collections
if (modCount != mc) {
throw new ConcurrentModificationException();
}
}
}
}
}
}
看一下HashMap的迭代器:
Iterator<Map.Entry<K,V>> newEntryIterator() {
return new EntryIterator();
}
private final class EntryIterator extends HashIterator<Map.Entry<K,V>> {
public Map.Entry<K,V> next() {
return nextEntry();
}
}
private abstract class HashIterator<E> implements Iterator<E> {
HashMapEntry<K,V> next; //下一个元素
int expectedModCount; // For fast-fail
int index; // 当前索引
HashMapEntry<K,V> current; // 当前元素
HashIterator() {
expectedModCount = modCount;
if (size > 0) { // advance to first entry
HashMapEntry[] t = table;
//将next指向第一个不为null的元素,index初始值为0,所以一直向后循环
while (index < t.length && (next = t[index++]) == null)
;
}
}
public final boolean hasNext() {
return next != null;
}
//获取下一个元素
final Entry<K,V> nextEntry() {
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
HashMapEntry<K,V> e = next;
if (e == null)
throw new NoSuchElementException();
//如果下一个节点不为空,就将next指向下一个节点,负责指向下一个entry
if ((next = e.next) == null) {
HashMapEntry[] t = table;
while (index < t.length && (next = t[index++]) == null)
;
}
current = e;
return e;
}
//删除当前元素
public void remove() {
if (current == null)
throw new IllegalStateException();
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
Object k = current.key;
current = null;
HashMap.this.removeEntryForKey(k);
expectedModCount = modCount;
}
}
实现Serializable接口:
HashMap实现了Serializable接口,所以是有序列化和反序列化操作的,主要是2个方法:
private void writeObject(java.io.ObjectOutputStream s)throws IOException {
// Write out the threshold, loadfactor, and any hidden stuff
s.defaultWriteObject();
// Write out number of buckets
if (table==EMPTY_TABLE) {
s.writeInt(roundUpToPowerOf2(threshold));
} else {
s.writeInt(table.length);
}
// Write out size (number of Mappings)
s.writeInt(size);
// Write out keys and values (alternating)
if (size > 0) {
for(Map.Entry<K,V> e : entrySet0()) {
s.writeObject(e.getKey());
s.writeObject(e.getValue());
}
}
}
private void readObject(java.io.ObjectInputStream s) throws IOException, ClassNotFoundException{
// Read in the threshold (ignored), loadfactor, and any hidden stuff
s.defaultReadObject();
if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
throw new InvalidObjectException("Illegal load factor: " + loadFactor);
}
// set other fields that need values
table = (HashMapEntry<K,V>[]) EMPTY_TABLE;
// Read in number of buckets
s.readInt(); // ignored.
// Read number of mappings
int mappings = s.readInt();
if (mappings < 0)
throw new InvalidObjectException("Illegal mappings count: " + mappings);
// capacity chosen by number of mappings and desired load (if >= 0.25)
int capacity = (int) Math.min(mappings * Math.min(1 / loadFactor, 4.0f),
// we have limits...
HashMap.MAXIMUM_CAPACITY);
// allocate the bucket array;
if (mappings > 0) {
inflateTable(capacity);
} else {
threshold = capacity;
}
init(); // Give subclass a chance to do its thing.
// Read the keys and values, and put the mappings in the HashMap
for (int i = 0; i < mappings; i++) {
K key = (K) s.readObject();
V value = (V) s.readObject();
putForCreate(key, value);
}
}
三. HashMap的常用使用方法:
遍历键值对:
Iterator iter = map.entrySet().iterator();
while(iter.hasNext()){
Map.Entry entry = (Map.Entry)iter.next();
key = (String)entry.getKey();
value = (Object)entry.getValue();
}
遍历键:
Iterator iter = map.keySet().iterator();
while(iter.hasNext()){
key = (String)iter.next();
value = (Object)map.get(key);
}
遍历值:
Collection c = map.values();
Iterator iter = c.iterator();
while(iter.hashNext()){
key = (String)iter.next();
}
JDK 1.8引入了红黑树,比较难理解,我研究一下,后续更新。