HashMap Source Code Analysis Based on JDK 8

Thanks to Neal Gafter, Arthur van Hoff, Josh Bloch, and Doug Lea for giving us the HashMap utility class, which has made everyday work easier for so many Java developers.




Hash table based implementation of the Map interface.  This implementation provides all of the optional map
operations, and permits null values and the null key.  (The HashMap class is roughly equivalent to Hashtable, 
except that it is unsynchronized and permits nulls.)  This class makes no guarantees as to the order of the 
map; in particular, it does not guarantee that the order will remain constant over time.
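The null-handling rules above are easy to check directly. The following is a minimal sketch; the class name NullKeyDemo and the helper build() are made up for illustration and are not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

final class NullKeyDemo {
    // Builds a small map that uses the null key and a null value.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put(null, 1);    // a single null key is permitted
        map.put("a", null);  // null values are permitted as well
        map.put(null, 2);    // replaces the earlier mapping for the null key
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = build();
        System.out.println(map.get(null));        // 2
        System.out.println(map.containsKey("a")); // true, even though get("a") returns null
        System.out.println(map.size());           // 2
    }
}
```

Because get() returns null both for a missing key and for a key mapped to null, containsKey() is the way to tell the two cases apart.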


This implementation provides constant-time performance for the basic operations (get and put), assuming the 
hash function disperses the elements properly among the buckets.  Iteration over collection views requires 
time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number
of key-value mappings).  Thus, it's very important not to set the initial capacity too high (or the load factor too 
low) if iteration performance is important.


An instance of HashMap has two parameters that affect its performance: initial capacity and load factor.  The 
capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time 
the hash table is created.  The load factor is a measure of how full the hash table is allowed to get before its 
capacity is automatically increased.  When the number of entries in the hash table exceeds the product of the 
load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so 
that the hash table has approximately twice the number of buckets.
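The resize rule in the paragraph above can be modelled in a few lines. This is a simplified sketch rather than the JDK's actual resize code: ResizeSchedule, threshold() and capacityFor() are hypothetical names, and the model only follows the "entries exceed load factor times capacity, so double the table" rule.

```java
final class ResizeSchedule {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Number of entries a table of the given capacity holds before it is doubled.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Capacity reached after inserting `entries` mappings, starting from initialCapacity.
    static int capacityFor(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        while (entries > threshold(capacity, loadFactor) && capacity < MAXIMUM_CAPACITY) {
            capacity *= 2; // roughly twice the number of buckets after each rehash
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));        // 12
        System.out.println(capacityFor(12, 16, 0.75f));  // 16: 12 entries do not exceed the threshold
        System.out.println(capacityFor(13, 16, 0.75f));  // 32: the 13th entry triggers one doubling
        System.out.println(capacityFor(100, 16, 0.75f)); // 256
    }
}
```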


As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs.  Higher 
values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the 
HashMap class, including get and put).  The expected number of entries in the map and its load factor should 
be taken into account when setting its initial capacity, so as to minimize the number of rehash operations.  If 
the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash 
operations will ever occur.
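To apply the last sentence in practice, the initial capacity can be derived from the expected number of entries. The helper below is a hypothetical convenience method, not a JDK API; it simply picks a capacity strictly greater than expectedEntries / loadFactor.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizeDemo {
    // Hypothetical helper: smallest int strictly greater than expectedEntries / loadFactor.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    // Creates a map sized so that `expectedEntries` insertions should not trigger a rehash.
    static Map<Integer, String> presized(int expectedEntries) {
        float loadFactor = 0.75f;
        return new HashMap<>(initialCapacityFor(expectedEntries, loadFactor), loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(initialCapacityFor(1000, 0.75f)); // 1334
        Map<Integer, String> map = presized(1000);
        for (int i = 0; i < 1000; i++) {
            map.put(i, "v" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```

HashMap rounds the requested capacity up to a power of two, so the table actually allocated may be larger than the number passed to the constructor.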


If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will 
allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to 
grow the table.  Note that using many keys with the same hashCode() is a sure way to slow down 
performance of any hash table. To ameliorate impact, when keys are Comparable, this class may use 
comparison order among keys to help break ties.


Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at 
least one of the threads modifies the map structurally, it must be synchronized externally.  (A structural 
modification is any operation that adds or deletes one or more mappings; merely changing the value 
associated with a key that an instance already contains is not a structural modification.)  This is typically 
accomplished by synchronizing on some object that naturally encapsulates the map.


If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method.  This 
is best done at creation time, to prevent accidental unsynchronized access to the map:
Map m = Collections.synchronizedMap(new HashMap(...));
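A slightly fuller sketch of the same idea, with generics and two writer threads, might look as follows. The class and method names are illustrative. The synchronized block around the iteration follows the Collections.synchronizedMap documentation, which requires callers to lock the wrapper manually while traversing its collection views.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class SyncMapDemo {
    // Fills one wrapped map from two threads and returns the number of keys seen.
    static int fillFromTwoThreads() {
        // Wrap at creation time so no unsynchronized reference to the HashMap escapes.
        Map<Integer, Integer> map = Collections.synchronizedMap(new HashMap<>());
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) map.put(i, i); });
        Thread t2 = new Thread(() -> { for (int i = 1000; i < 2000; i++) map.put(i, i); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int count = 0;
        synchronized (map) { // iteration over a view must hold the wrapper's lock
            for (Integer ignored : map.keySet()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads()); // 2000
    }
}
```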


The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally 
modified at any time after the iterator is created, in any way except through the iterator's own remove method,  
the iterator will throw a ConcurrentModificationException.  Thus, in the face of concurrent modification, the 
iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined 
time in the future.


Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to 
make any hard guarantees in the presence of unsynchronized concurrent modification.  Fail-fast iterators 
throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a 
program that depended on this exception for its correctness: the fail-fast behavior of iterators should be 
used only to detect bugs.
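The fail-fast behavior can also be observed in a single thread. The sketch below (class and method names are made up for this article) modifies the map structurally during iteration, and then performs a removal through the iterator's own remove method. As the text above stresses, the exception is a best-effort bug detector and should not be relied on for program logic.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Adds a new key while iterating; returns true if the iterator noticed.
    static boolean modifyDuringIteration() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.put(key + "x", 0); // structural modification behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes even values through the iterator itself, which is allowed.
    static int removeViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 0) {
                it.remove();
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(modifyDuringIteration()); // true
        System.out.println(removeViaIterator());     // 2
    }
}
```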


This map usually acts as a binned (bucketed) hash table, but when bins get too large, they are transformed 
into bins of TreeNodes, each structured similarly to those in java.util.TreeMap. Most methods try to use 
normal bins, but relay to TreeNode methods when applicable (simply by checking instanceof a node).  Bins of 
TreeNodes may be traversed and used like any others, but additionally support faster lookup when 
overpopulated. However, since the vast majority of bins in normal use are not overpopulated, checking for 
existence of tree bins may be delayed in the course of table methods.


Tree bins (i.e., bins whose elements are all TreeNodes) are ordered primarily by hashCode, but in the case of 
ties, if two elements are of the same "class C implements Comparable<C>" type, then their compareTo 
method is used for ordering. (We conservatively check generic types via reflection to validate this -- see 
method comparableClassFor).  The added complexity of tree bins is worthwhile in providing worst-case O(log 
n) operations when keys either have distinct hashes or are orderable. Thus, performance degrades gracefully 
under accidental or malicious usages in which hashCode() methods return values that are poorly distributed, 
as well as those in which many keys share a hashCode, so long as they are also Comparable. (If neither of 
these apply, we may waste about a factor of two in time and space compared to taking no precautions. But 
the only known cases stem from poor user programming practices that are already so slow that this makes 
little difference.)
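To make the "many keys share a hashCode, so long as they are also Comparable" case concrete, here is a small sketch with a deliberately bad key type. BadKey is a hypothetical class written for this article. The example only checks that lookups stay correct when every key collides; it does not measure the speed-up that tree bins provide.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKeysDemo {
    // Hypothetical key type: every instance reports the same hashCode.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // all keys land in one bin

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // Because the class is Comparable<BadKey>, a tree bin can order keys by id.
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Inserts n colliding keys and counts how many can be found again.
    static int countFound(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        int found = 0;
        for (int i = 0; i < n; i++) {
            Integer value = map.get(new BadKey(i));
            if (value != null && value == i) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(countFound(10_000)); // 10000
    }
}
```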


Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough 
nodes to warrant use (see TREEIFY_THRESHOLD). And when they become too small (due to removal or 
resizing) they are converted back to plain bins.  In usages with well-distributed user hashCodes, tree bins are 
rarely used.  Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution 
(http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default 
resizing threshold of 0.75, although with a large variance because of resizing granularity. Ignoring variance, 
the expected occurrences of list size k are (exp(-0.5) * pow(0.5, k) / factorial(k)). The first values are:
0:    0.60653066
1:    0.30326533
2:    0.07581633
3:    0.01263606
4:    0.00157952
5:    0.00015795
6:    0.00001316
7:    0.00000094
8:    0.00000006
more: less than 1 in ten million
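The table can be reproduced from the formula quoted above, exp(-0.5) * pow(0.5, k) / factorial(k). The class below is only a small calculator written for this article; its names are not taken from the JDK.

```java
final class PoissonBins {
    // Expected frequency of bins holding exactly k nodes, for a Poisson parameter of 0.5.
    static double probability(int k) {
        double lambda = 0.5;
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) {
            factorial *= i;
        }
        return Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
    }

    public static void main(String[] args) {
        // Prints one line per bin size, in the same "k: value" layout as the table above.
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d:    %.8f%n", k, probability(k));
        }
    }
}
```

At k = 8 the probability is already below one in ten million, which is the reasoning behind choosing 8 as TREEIFY_THRESHOLD.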


The root of a tree bin is normally its first node.  However, sometimes (currently only upon Iterator.remove), 
the root might be elsewhere, but can be recovered following parent links (method TreeNode.root()).


All applicable internal methods accept a hash code as an argument (as normally supplied from a public 
method), allowing them to call each other without recomputing user hashCodes. Most internal methods also 
accept a "tab" argument, that is normally the current table, but may be a new or old one when resizing or 
converting.


When bin lists are treeified, split, or untreeified, we keep them in the same relative access/traversal order (i.e., 
field Node.next) to better preserve locality, and to slightly simplify handling of splits and traversals that invoke 
iterator.remove. When using comparators on insertion, to keep a total ordering (or as close as is required here) 
across rebalancings, we compare classes and identityHashCodes as tie-breakers.


The use and transitions among plain vs tree modes is complicated by the existence of subclass 
LinkedHashMap. See below for hook methods defined to be invoked upon insertion, removal and access that 
allow LinkedHashMap internals to otherwise remain independent of these mechanics. (This also requires that 
a map instance be passed to some utility methods that may create new nodes.)


The concurrent-programming-like SSA-based coding style helps avoid aliasing errors amid all of the twisty 
pointer operations.

1 Thread safety
2 Key and value characteristics
3 Initial capacity and load factor
4 Fail-fast mechanism
5 Node conversion



private static final long serialVersionUID = 362498820763181265L;
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
static final int MAXIMUM_CAPACITY = 1 << 30;
static final float DEFAULT_LOAD_FACTOR = 0.75f;
static final int TREEIFY_THRESHOLD = 8;
static final int UNTREEIFY_THRESHOLD = 6;
static final int MIN_TREEIFY_CAPACITY = 64;
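Both capacity constants are powers of two, and 1 << 30 is the largest power of two that fits in a positive int. A power-of-two table length is what lets the bucket index be computed with a mask, as in the expression (n - 1) & root.hash that appears in moveRootToFront later in this article. The sketch below illustrates both points. roundUpToPowerOfTwo and indexFor are hypothetical helpers written for this article, not the JDK's own methods.

```java
final class PowerOfTwoDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Hypothetical helper: rounds a requested capacity up to the next power of two.
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 1) {
            return 1;
        }
        if (requested >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        int highest = Integer.highestOneBit(requested);
        return (highest == requested) ? requested : highest << 1;
    }

    // Bucket index for a hash in a table whose length n is a power of two.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        System.out.println(1 << 4);                  // 16
        System.out.println(1 << 30);                 // 1073741824
        System.out.println(1 << 31);                 // -2147483648, so 1 << 30 is the ceiling
        System.out.println(roundUpToPowerOfTwo(17)); // 32
        // For a power-of-two n, the mask agrees with a non-negative modulo, even for negative hashes.
        System.out.println(indexFor(-7, 16) == Math.floorMod(-7, 16)); // true
    }
}
```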


static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;     // hash of the key, cached so it is not recomputed
    final K key;
    V value;
    Node<K,V> next;     // next node in the same bin

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
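Node's hashCode, equals and toString follow the general Map.Entry contract, so an entry taken from a HashMap compares equal to any other Map.Entry with the same key and value. A minimal check, using AbstractMap.SimpleEntry as the other implementation (the class name EntryContractDemo is illustrative):

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class EntryContractDemo {
    // Returns the only entry of a one-element HashMap.
    static Map.Entry<String, Integer> firstEntry() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        return map.entrySet().iterator().next();
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> entry = firstEntry();
        Map.Entry<String, Integer> other = new AbstractMap.SimpleEntry<>("a", 1);

        System.out.println(entry.equals(other)); // true: same key and same value
        // The hash code is the XOR of the key's and the value's hash codes.
        System.out.println(entry.hashCode() == (Objects.hashCode("a") ^ Objects.hashCode(1))); // true
        System.out.println(entry); // a=1
    }
}
```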


    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;           // node color: true if this node is red in the red-black tree
        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }

        // Returns the root of the tree containing this node by following parent links
        final TreeNode<K,V> root() {
            for (TreeNode<K,V> r = this, p;;) {
                if ((p = r.parent) == null)
                    return r;
                r = p;
            }
        }

        // Ensures that the given root is the first node of its bin
        static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {
            int n;
            if (root != null && tab != null && (n = tab.length) > 0) {
                int index = (n - 1) & root.hash;    // locate the bin
                TreeNode<K,V> first = (TreeNode<K,V>)tab[index];
                if (root != first) {
                    Node<K,V> rn;
                    tab[index] = root;              // put the root at the head of the bin
                    TreeNode<K,V> rp = root.prev;   // node that preceded the root in the list
                    if ((rn = root.next) != null)
                        ((TreeNode<K,V>)rn).prev = rp;  // unlink the root from its old position
                    if (rp != null)
                        rp.next = rn;
                    if (first != null)
                        first.prev = root;          // the old head now follows the root
                    root.next = first;
                    root.prev = null;
                }
                assert checkInvariants(root);
            }
        }

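        // Finds the node with hash h and key k, starting the search at this node
        final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
            TreeNode<K,V> p = this;
            do {
                int ph, dir; K pk;
                TreeNode<K,V> pl = p.left, pr = p.right, q; // left and right children of the current node
                if ((ph = p.hash) > h)  // target hash is smaller: continue in the left subtree
                    p = pl;
                else if (ph < h)        // target hash is larger: continue in the right subtree
                    p = pr;
                // hashes are equal and the key is the same reference (including both null) or equals() matches: found
                else if ((pk = p.key) == k || (k != null && k.equals(pk)))
                    return p;
                // hashes are equal but the keys differ, and there is no left subtree: continue to the right
                else if (pl == null)
                    p = pr;
                // hashes are equal but the keys differ, and there is no right subtree: continue to the left
                else if (pr == null)
                    p = pl;
                // both subtrees exist: try to order the two keys through Comparable
                else if ((kc != null ||
                          (kc = comparableClassFor(k)) != null) &&
                         (dir = compareComparables(kc, k, pk)) != 0)
                    p = (dir < 0) ? pl : pr;
                // keys are not comparable or compare as equal: search the right subtree recursively first
                else if ((q = pr.find(h, k, kc)) != null)
                    return q;
                // nothing found in the right subtree: continue in the left subtree
                else
                    p = pl;
            } while (p != null);
            return null;
        }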

        // Finds the node with hash h and key k, starting the search from the root
        final TreeNode<K,V> getTreeNode(int h, Object k) {
            return ((parent != null) ? root() : this).find(h, k, null);
        }

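        // Forces an ordering between two objects; the result is never 0
        static int tieBreakOrder(Object a, Object b) {
            int d;
            // when both a and b are non-null, compare their class names first
            if (a == null || b == null ||
                (d = a.getClass().getName().
                 compareTo(b.getClass().getName())) == 0)
                // a is null, b is null, or the class names are equal: fall back to the identity hash codes
                d = (System.identityHashCode(a) <= System.identityHashCode(b) ?
                     -1 : 1);
            return d;
        }

        // ... (the remaining TreeNode methods are omitted from this excerpt)
    }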
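To see how tieBreakOrder behaves, the method can be copied into a standalone class and called directly. The wrapper class TieBreakDemo exists only for this article, and the method body is the one shown above. The result is -1 or 1 only when the identity hash codes decide; when the class names differ, the value is whatever String.compareTo returns, which can be any non-zero int.

```java
final class TieBreakDemo {
    // Copy of the tie-breaking logic shown above, lifted out of HashMap.TreeNode.
    static int tieBreakOrder(Object a, Object b) {
        int d;
        if (a == null || b == null ||
            (d = a.getClass().getName().
             compareTo(b.getClass().getName())) == 0)
            d = (System.identityHashCode(a) <= System.identityHashCode(b) ?
                 -1 : 1);
        return d;
    }

    public static void main(String[] args) {
        // Different classes: ordered by class name ("java.lang.String" sorts after "java.lang.Integer").
        System.out.println(tieBreakOrder("a", 1) > 0); // true
        System.out.println(tieBreakOrder(1, "a") < 0); // true

        // Same class: falls back to identity hash codes, so the result is -1 or 1.
        Object x = new Object();
        Object y = new Object();
        System.out.println(tieBreakOrder(x, y) != 0);  // true
        System.out.println(tieBreakOrder(x, x));       // -1, because equal hashes satisfy <=
    }
}
```

The ordering among objects of the same class depends on identity hash codes, so it is consistent within one run but can differ from run to run.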
