HashMap Source Code Analysis Based on JDK 1.8
What is HashMap?
Mapping one object to another is a good way to solve many problems. A Map, which consists of keys and values, provides this ability. Map is an interface, and HashMap is one of its implementation classes.
Data Structure of HashMap
In JDK 1.7, the data structure of HashMap consists of an array and linked lists. In JDK 1.8, however, it is composed of an array, linked lists, and red-black trees.
Diagram of HashMap
How to use HashMap?
HashMap has four constructors.
First:
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}
Second:
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
Third:
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
Fourth:
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
For example:
package com.smart.java.foundation.map;

import java.util.HashMap;

public class HashMapTest {
    public static void main(String[] args) {
        // No-argument constructor: default capacity and load factor.
        HashMap<String, String> firstMap = new HashMap<String, String>();
        firstMap.put("1", "2");
        // put returns the previous value for the key, so this prints 2.
        System.out.println(firstMap.put("1", "3"));

        // Constructor with an initial capacity.
        HashMap<String, String> secondMap = new HashMap<String, String>(11);
        secondMap.put("1", "2");
        System.out.println(secondMap.put("1", "3"));

        // Constructor with an initial capacity and a load factor.
        HashMap<String, String> thirdMap = new HashMap<String, String>(11, 0.8f);
        thirdMap.put("1", "2");
        System.out.println(thirdMap.put("1", "3"));

        // Copy constructor: fourthMap contains the mappings of firstMap, so this prints 3.
        HashMap<String, String> fourthMap = new HashMap<String, String>(firstMap);
        System.out.println(fourthMap.get("1"));
    }
}
The constructors take two parameters: initialCapacity and loadFactor. Together they determine when the HashMap is resized. The table capacity is always a power of two; an initialCapacity that is not a power of two (such as 11 above) is rounded up to the next power of two by tableSizeFor. The reason is discussed below.
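To make the role of tableSizeFor in the first constructor concrete, here is a minimal sketch of rounding a requested capacity up to a power of two. It is modelled on the bit-smearing approach used by the JDK 8 method; the class name TableSizeDemo is only an illustration, and the JDK source remains the authority.

```java
class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Rounds cap up to the nearest power of two that is >= cap.
    static int tableSizeFor(int cap) {
        int n = cap - 1;   // subtract 1 so that an exact power of two maps to itself
        n |= n >>> 1;      // copy the highest set bit into every lower position
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(11)); // 16
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

So a map created with new HashMap<>(11) ends up with a table of 16 buckets once the table is allocated.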
HashMap Source Code
The HashMap source code is discussed below. It may be dry, but it is useful for interviews.
As discussed above, the data structure in JDK 1.8 consists of an array, linked lists, and red-black trees.
The following are the constants and fields.
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
* The maximum capacity, used if a higher value is implicitly specified
* by either of the constructors with arguments.
* MUST be a power of two <= 1<<30.
*/
static final int MAXIMUM_CAPACITY = 1 << 30;
/**
* The load factor used when none specified in constructor.
*/
static final float DEFAULT_LOAD_FACTOR = 0.75f;
/**
* The bin count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2 and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8;
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6;
/**
* The smallest table capacity for which bins may be treeified.
* (Otherwise the table is resized if too many nodes in a bin.)
* Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
* between resizing and treeification thresholds.
*/
static final int MIN_TREEIFY_CAPACITY = 64;
/* ---------------- Fields -------------- */
/**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;
/**
* Holds cached entrySet(). Note that AbstractMap fields are used
* for keySet() and values().
*/
transient Set<Map.Entry<K,V>> entrySet;
/**
* The number of key-value mappings contained in this map.
*/
transient int size;
/**
* The number of times this HashMap has been structurally modified
* Structural modifications are those that change the number of mappings in
* the HashMap or otherwise modify its internal structure (e.g.,
* rehash). This field is used to make iterators on Collection-views of
* the HashMap fail-fast. (See ConcurrentModificationException).
*/
transient int modCount;
/**
* The next size value at which to resize (capacity * load factor).
*
* @serial
*/
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;
/**
* The load factor for the hash table.
*
* @serial
*/
final float loadFactor;
Characteristics of a red-black tree
- Every node is either red or black;
- The root node must be black, and every leaf (null) node counts as black;
- If a node is red, its children must be black;
- Every path from a node to any of its descendant leaves must contain the same number of black nodes;
- If there are n nodes in the red-black tree, its height is not greater than 2log2(n+1).
The red-black tree keeps search, insert, and delete operations at O(log n), whereas a long linked list degrades to O(n).
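The colouring rules can be checked mechanically. The following is a minimal sketch, not the TreeNode code inside HashMap: the Node class and the method names are hypothetical, null children are assumed to count as black leaves, and key ordering is deliberately not checked.

```java
class RedBlackCheckDemo {
    // Hypothetical immutable tree node used only for this illustration.
    static final class Node {
        final int key;
        final boolean red;
        final Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red; // null leaves count as black
    }

    // Returns the black height of the subtree, or -1 if a colouring rule is broken.
    static int blackHeight(Node n) {
        if (n == null) return 1;
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1; // red node with a red child
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l == -1 || r == -1 || l != r) return -1;               // unequal black counts
        return l + (n.red ? 0 : 1);
    }

    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) != -1;            // root must be black
    }

    public static void main(String[] args) {
        Node ok = new Node(2, false,
                new Node(1, true, null, null),
                new Node(3, true, null, null));
        Node redRed = new Node(2, false,
                new Node(1, true, new Node(0, true, null, null), null),
                null);
        System.out.println(isValid(ok));     // true
        System.out.println(isValid(redRed)); // false
    }
}
```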
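When to change the linked list to a red-black tree?
When an element is added to a bucket that already holds at least TREEIFY_THRESHOLD (8) nodes, the linked list is treeified, provided the table capacity is at least MIN_TREEIFY_CAPACITY (64). If the table is smaller than that, it is resized instead.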
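The decision can be summarised as a small function. This is a sketch built from the javadoc of TREEIFY_THRESHOLD and MIN_TREEIFY_CAPACITY quoted earlier, not the actual putVal/treeifyBin code; the Action enum and the afterAppend method are hypothetical names.

```java
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE, TREEIFY }

    // nodesBeforeInsert: how many nodes the bucket held before the new node was appended.
    static Action afterAppend(int nodesBeforeInsert, int tableCapacity) {
        if (nodesBeforeInsert < TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;           // bucket is still short enough for a list
        }
        return tableCapacity < MIN_TREEIFY_CAPACITY
                ? Action.RESIZE                // small table: spread entries out instead
                : Action.TREEIFY;              // large table: convert the bucket to a tree
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(7, 64)); // KEEP_LIST
        System.out.println(afterAppend(8, 16)); // RESIZE
        System.out.println(afterAppend(8, 64)); // TREEIFY
    }
}
```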
When to resize?
When the number of key-value mappings (size) becomes greater than the threshold, that is, the capacity multiplied by the load factor, the table is resized and its capacity doubles.
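As a worked example of the rule, the sketch below computes the threshold for the default capacity and load factor and checks when an insertion would trigger a resize. The class and method names are illustrative only.

```java
class ResizeThresholdDemo {
    // threshold = capacity * loadFactor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // A resize is triggered when the size after an insertion exceeds the threshold.
    static boolean needsResize(int sizeAfterInsert, int threshold) {
        return sizeAfterInsert > threshold;
    }

    public static void main(String[] args) {
        int thr = threshold(16, 0.75f);
        System.out.println(thr);                  // 12
        System.out.println(needsResize(12, thr)); // false: the 12th mapping still fits
        System.out.println(needsResize(13, thr)); // true: the 13th mapping triggers a resize
        System.out.println(threshold(32, 0.75f)); // 24, the threshold after doubling
    }
}
```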
Bucket Position
The bucket position depends on the table length and the hash of the key. The hash is derived from the key's hashCode by XOR-ing its high 16 bits into its low 16 bits:
(key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
The position is then calculated as follows:
(capacity - 1) & hash
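The two expressions can be combined into a small runnable sketch. The method names hash and indexFor and the class name are chosen for illustration; the expressions themselves are the ones shown above.

```java
class BucketIndexDemo {
    // Spreads the high 16 bits of hashCode into the low 16 bits.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Maps a hash to a bucket index; capacity is assumed to be a power of two.
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(hash("1"), 16));   // 1, because "1".hashCode() is 49
        System.out.println(indexFor(hash(null), 16));  // 0, null keys go to bucket 0

        Integer big = 65536;                           // hashCode is 0x10000: only a high bit is set
        System.out.println(indexFor(big.hashCode(), 16)); // 0 without spreading
        System.out.println(indexFor(hash(big), 16));      // 1 with spreading
    }
}
```

The last two lines show why the high bits are mixed in: without the XOR step, keys that differ only in their high bits would all land in the same bucket of a small table.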
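Why the capacity is a power of two
The goal is to reduce hash conflicts and keep the position calculation cheap. When the capacity is a power of two, capacity - 1 is a run of 1 bits, so (capacity - 1) & hash keeps the low bits of the hash unchanged and every bucket can be reached. For example, when the capacity is sixteen and a resize is triggered, the capacity becomes thirty-two. The binary values of the table length minus one in the two cases are 0000 1111 and 0001 1111 respectively. They differ in a single bit, so after the resize each entry either stays at its old position or moves to the old position plus the old capacity. With a capacity that is not a power of two, some bits of capacity - 1 would be 0, and the buckets whose index needs those bits would never be used, which increases conflicts. Readers are encouraged to try it.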
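One way to try it is the sketch below. It checks two properties over a range of sample hashes: that masking with capacity - 1 gives the same result as a modulo when the capacity is a power of two, and that doubling the capacity leaves each entry either in place or shifted by the old capacity. The class and method names are hypothetical.

```java
class PowerOfTwoDemo {
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    // True if masking agrees with floorMod for all sampled hashes.
    static boolean maskEqualsMod(int capacity) {
        for (int hash = -1000; hash <= 1000; hash++) {
            if (indexFor(hash, capacity) != Math.floorMod(hash, capacity)) {
                return false;
            }
        }
        return true;
    }

    // True if, after doubling, every sampled entry stays put or moves by oldCap.
    static boolean splitsCleanly(int oldCap) {
        for (int hash = -1000; hash <= 1000; hash++) {
            int oldIdx = indexFor(hash, oldCap);
            int newIdx = indexFor(hash, oldCap * 2);
            int expected = ((hash & oldCap) == 0) ? oldIdx : oldIdx + oldCap;
            if (newIdx != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(maskEqualsMod(16)); // true
        System.out.println(maskEqualsMod(12)); // false: 12 is not a power of two
        System.out.println(splitsCleanly(16)); // true
    }
}
```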
How to insert the object into the linked list?
When objects map to the same bucket position, they are placed in the same bucket. JDK 1.8 uses tail insertion to add the new object to the linked list. Generally speaking, there are two ways to put an object into a linked list: tail insertion and head insertion. If you are interested, you can refer to the implementation of LinkedList.
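The difference between the two ways is easiest to see on a tiny singly linked list. This is a simplified sketch with a hypothetical Node class that stores only a key; it is not the HashMap.Node implementation.

```java
import java.util.ArrayList;
import java.util.List;

class InsertionDemo {
    // Hypothetical minimal list node for illustration.
    static final class Node {
        final String key;
        Node next;
        Node(String key) { this.key = key; }
    }

    // Tail insertion: walk to the last node and append; returns the (unchanged) head.
    static Node insertAtTail(Node head, String key) {
        Node node = new Node(key);
        if (head == null) return node;
        Node p = head;
        while (p.next != null) p = p.next;
        p.next = node;
        return head;
    }

    // Head insertion: the new node becomes the head of the list.
    static Node insertAtHead(Node head, String key) {
        Node node = new Node(key);
        node.next = head;
        return node;
    }

    // Collects the keys from head to tail.
    static List<String> keys(Node head) {
        List<String> out = new ArrayList<>();
        for (Node p = head; p != null; p = p.next) out.add(p.key);
        return out;
    }

    public static void main(String[] args) {
        Node tail = null, head = null;
        for (String k : new String[] {"a", "b", "c"}) {
            tail = insertAtTail(tail, k);
            head = insertAtHead(head, k);
        }
        System.out.println(keys(tail)); // [a, b, c]: insertion order is preserved
        System.out.println(keys(head)); // [c, b, a]: insertion order is reversed
    }
}
```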
End
Good luck to everybody.