First, here is a snippet of HashMap's storage layout (JDK 7 source), as groundwork for the discussion that follows.
public class HashMap<K,V>
extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable
{
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
/**
* An empty table instance to share when the table is not inflated.
*/
static final Entry<?,?>[] EMPTY_TABLE = {};
/**
* The table, resized as necessary. Length MUST Always be a power of two.
*/
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
/**
* The number of key-value mappings contained in this map.
*/
transient int size;
/**
* The next size value at which to resize (capacity * load factor).
* @serial
*/
// If table == EMPTY_TABLE then this is the initial capacity at which the
// table will be created when inflated.
int threshold;
/**
* The load factor for the hash table.
*
* @serial
*/
final float loadFactor;
/**
* The number of times this HashMap has been structurally modified
* Structural modifications are those that change the number of mappings in
* the HashMap or otherwise modify its internal structure (e.g.,
* rehash). This field is used to make iterators on Collection-views of
* the HashMap fail-fast. (See ConcurrentModificationException).
*/
transient int modCount;
}
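A quick note on the repeated "MUST be a power of two": JDK 7 picks a bucket with a bit-mask rather than a modulo, and the mask trick is only equivalent to h % length when length is a power of two. The index computation is essentially:

static int indexFor(int h, int length) {
    // Equivalent to h % length only when length is a power of two.
    return h & (length - 1);
}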
1. When an object is used as a HashMap key, its hashCode and equals methods matter: two objects that are equal under equals MUST have equal hashCodes. Why?

To see why, just read HashMap's put method. An insert first uses hash and indexFor to locate a slot in the table, and only then walks table[i] looking for an equal entry. (Note that Object's == operator compares references by default.) If equal keys could hash differently, put and get would search the wrong bucket and never find the matching entry, which is exactly why equals-equal keys must have equal hashCodes. (A broken-key sketch follows the put() source below.)
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
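To make the failure mode concrete, here is a minimal sketch (BadKey is a made-up class for illustration) of a key that overrides equals but not hashCode; lookups that "should" hit almost always miss, because put and get pick the bucket from hashCode before equals is ever consulted:

import java.util.HashMap;
import java.util.Map;

public class BadKeyDemo {
    static class BadKey {
        final String name;
        BadKey(String name) { this.name = name; }
        // equals compares the name field...
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
        // ...but hashCode is inherited from Object, so two "equal" keys
        // usually hash to different buckets.
    }

    public static void main(String[] args) {
        Map<BadKey, String> map = new HashMap<BadKey, String>();
        map.put(new BadKey("a"), "value");
        // Almost always prints null: the second BadKey("a") lands in a
        // different bucket, so the equal entry is never scanned.
        System.out.println(map.get(new BadKey("a")));
    }
}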
2. When does HashMap grow? Does it expand the table, or the entry linked lists?

The table is doubled once size reaches the threshold and the target bucket is already occupied; an ordinary hash collision just extends the bucket's Entry list. Continuing from the code above:
/**
* Adds a new entry with the specified key, value and hash code to
* the specified bucket. It is the responsibility of this
* method to resize the table if appropriate.
*
* Subclass overrides this to alter the behavior of put method.
*/
void addEntry(int hash, K key, V value, int bucketIndex) {
if ((size >= threshold) && (null != table[bucketIndex])) {
resize(2 * table.length);
hash = (null != key) ? hash(key) : 0;
bucketIndex = indexFor(hash, table.length);
}
createEntry(hash, key, value, bucketIndex);
}
/**
* Like addEntry except that this version is used when creating entries
* as part of Map construction or "pseudo-construction" (cloning,
* deserialization). This version needn't worry about resizing the table.
*
* Subclass overrides this to alter the behavior of HashMap(Map),
* clone, and readObject.
*/
void createEntry(int hash, K key, V value, int bucketIndex) {
Entry<K,V> e = table[bucketIndex];
table[bucketIndex] = new Entry<>(hash, key, value, e);
size++;
}
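A small sketch to watch the table grow (an illustration, not JDK code; it reflects on HashMap's private table field, which works on JDK 7/8 but needs --add-opens java.base/java.util=ALL-UNNAMED on newer JDKs):

import java.lang.reflect.Field;
import java.util.HashMap;

public class ResizeWatch {
    public static void main(String[] args) throws Exception {
        HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
        Field tableField = HashMap.class.getDeclaredField("table");
        tableField.setAccessible(true);
        int lastLength = 0;
        for (int i = 0; i < 100; i++) {
            map.put(i, i);
            Object[] table = (Object[]) tableField.get(map);
            if (table != null && table.length != lastLength) {
                // Prints the initial inflation to 16, then each doubling.
                System.out.println("size=" + map.size()
                        + " -> table.length=" + table.length);
                lastLength = table.length;
            }
        }
    }
}

The exact size at which each doubling fires depends on the keys: resize needs both size >= threshold and a collision on the target bucket.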
3. Where does threshold get initialized?

Somewhat surprisingly, for the plain HashMap<XX, XX> x = new HashMap<XX, XX>() construction it happens on the first put, which calls inflateTable to allocate the table and compute the real threshold.
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
threshold = initialCapacity;
init();
}
public HashMap() {
this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
If the map is constructed with an explicit initial capacity, threshold temporarily holds that raw value (16 by default) until the table is inflated. After inflation, and again after every automatic resize, it becomes:
threshold = (int) Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
/*
* Inflates the table.
*/
private void inflateTable(int toSize) {
// Find a power of 2 >= toSize
int capacity = roundUpToPowerOf2(toSize);
threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
table = new Entry[capacity];
initHashSeedAsNeeded(capacity);
}
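roundUpToPowerOf2 rounds the requested size up to the next power of two; in JDK 7 it is implemented along these lines (a close paraphrase of the source, clamped at MAXIMUM_CAPACITY):

private static int roundUpToPowerOf2(int number) {
    // Smallest power of two >= number, clamped at MAXIMUM_CAPACITY.
    return number >= MAXIMUM_CAPACITY
            ? MAXIMUM_CAPACITY
            : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
}

So new HashMap<K, V>(1000) inflates to a 1024-slot table on the first put, and with the default load factor its threshold becomes 1024 * 0.75 = 768.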
4. Finally, when does put run into a performance bottleneck?

Here is a benchmark driving put; note that the elapsed time is not simply proportional to the number of puts.
import java.util.HashMap;

public class HashMapTest {

    static void put(int count) {
        long begin = System.currentTimeMillis();
        HashMap<Integer, Integer> hashMap = new HashMap<Integer, Integer>();
        for (int i = 0; i < count; i++) {
            hashMap.put(i, i);
        }
        long end = System.currentTimeMillis();
        System.out.println("count:" + count + " elapsed(ms):" + (end - begin));
    }

    public static void main(String[] args) {
        put(1000);
        put(10000);
        put(100000);
        put(1000000);
        put(10000000);
    }
}
Output:
count:1000 elapsed(ms):2
count:10000 elapsed(ms):19
count:100000 elapsed(ms):19
count:1000000 elapsed(ms):138
count:10000000 elapsed(ms):5994
When the map resizes, it allocates a new array of twice the original capacity and copies every entry across in order; once the map is already large, each resize becomes genuinely expensive. The remedy: estimate the map's final size up front and pre-set the capacity (and thus the threshold). It saves a little time...
import java.util.HashMap;

public class HashMapInitVolumeTest {

    static void noInitVolumeTest(int count) {
        long begin = System.currentTimeMillis();
        HashMap<Integer, Integer> hashMap = new HashMap<Integer, Integer>();
        for (int i = 0; i < count; i++) {
            hashMap.put(i, i);
        }
        long end = System.currentTimeMillis();
        System.out.println("noInitVolumeTest count:" + count + " elapsed(ms):" + (end - begin));
    }

    // Smallest power of two >= count, so that with load factor 1.0 the
    // table never needs to resize while inserting 'count' entries.
    static int getNearTwoPower(int count) {
        int result = 1;
        while (result < count) {
            result *= 2;
        }
        return result;
    }

    static void initVolumeTest3(int count) {
        long begin = System.currentTimeMillis();
        HashMap<Integer, Integer> hashMap =
                new HashMap<Integer, Integer>(getNearTwoPower(count), 1.0f);
        for (int i = 0; i < count; i++) {
            hashMap.put(i, i);
        }
        long end = System.currentTimeMillis();
        System.out.println("initVolumeTest3 count:" + count + " elapsed(ms):" + (end - begin));
    }

    public static void main(String[] args) {
        int count = 10;
        for (int i = 0; i < 8; i++) {
            noInitVolumeTest(count);
            initVolumeTest3(count);
            count *= 10;
        }
    }
}
Output:
noInitVolumeTest count:10 elapsed(ms):0
initVolumeTest3 count:10 elapsed(ms):0
noInitVolumeTest count:100 elapsed(ms):0
initVolumeTest3 count:100 elapsed(ms):0
noInitVolumeTest count:1000 elapsed(ms):1
initVolumeTest3 count:1000 elapsed(ms):2
noInitVolumeTest count:10000 elapsed(ms):15
initVolumeTest3 count:10000 elapsed(ms):6
noInitVolumeTest count:100000 elapsed(ms):23
initVolumeTest3 count:100000 elapsed(ms):18
noInitVolumeTest count:1000000 elapsed(ms):95
initVolumeTest3 count:1000000 elapsed(ms):20
noInitVolumeTest count:10000000 elapsed(ms):5360
initVolumeTest3 count:10000000 elapsed(ms):2928
The little time saved is not really the point; what matters is shaving the peak memory. Look at what a capacity reallocation actually does: addEntry (shown earlier) calls resize, which allocates a new array of twice the original capacity and then relinks every existing entry into it. That copy is the expensive part.
void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}
Entry[] newTable = new Entry[newCapacity];
transfer(newTable, initHashSeedAsNeeded(newCapacity));
table = newTable;
threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
/**
* Transfers all entries from current table to newTable.
*/
void transfer(Entry[] newTable, boolean rehash) {
int newCapacity = newTable.length;
for (Entry<K,V> e : table) {
while(null != e) {
Entry<K,V> next = e.next;
if (rehash) {
e.hash = null == e.key ? 0 : hash(e.key);
}
int i = indexFor(e.hash, newCapacity);
e.next = newTable[i];
newTable[i] = e;
e = next;
}
}
}
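As a practical takeaway, a pre-sizing helper can be written once and reused. The sketch below is hypothetical (the class and method names are made up; Guava ships the same idea as Maps.newHashMapWithExpectedSize): it picks an initial capacity whose eventual threshold is at least expectedSize, so inserting expectedSize entries never triggers a resize under the default load factor.

import java.util.HashMap;

public final class PreSizedMaps {

    private PreSizedMaps() {}

    // With the default load factor 0.75, capacity is rounded up to a
    // power of two on inflation and threshold = capacity * 0.75, so this
    // initial capacity guarantees threshold >= expectedSize.
    public static <K, V> HashMap<K, V> withExpectedSize(int expectedSize) {
        int initialCapacity = (int) (expectedSize / 0.75f) + 1;
        return new HashMap<K, V>(initialCapacity);
    }
}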