HashMap Source Code Analysis (1)
JDK 1.8
First, a basic usage of HashMap:
```java
HashMap hashMap = new HashMap();
hashMap.put("1", "1");
```
-
Initialization
Step one is to create a HashMap, so start with its no-arg constructor:
```java
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
```
This constructor does nothing but initialize the loadFactor field. Here is that field's declaration and Javadoc:
```java
/**
 * The load factor for the hash table.
 *
 * @serial
 */
final float loadFactor;
```
So what is loadFactor for?
The load factor measures how full the hash table is allowed to get. The larger the load factor, the more elements fit before a resize: space utilization goes up, but so does the chance of collisions. The smaller the load factor, the fewer elements fit: collisions become less likely, but more space is wasted.
In plain terms, loadFactor decides when the HashMap grows: once the number of entries exceeds loadFactor * current capacity, a resize is triggered.
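As a quick standalone check (this is just the arithmetic, not HashMap internals), the resize trigger for the default settings works out like this:

```java
public class LoadFactorDemo {
    public static void main(String[] args) {
        int capacity = 16;        // HashMap's DEFAULT_INITIAL_CAPACITY
        float loadFactor = 0.75f; // HashMap's DEFAULT_LOAD_FACTOR
        // once size exceeds this threshold, the table doubles
        int threshold = (int) (capacity * loadFactor);
        System.out.println(threshold); // 12
    }
}
```

So a default HashMap resizes after the 13th insertion, not the 17th.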
Now look at the value the constructor assigns: DEFAULT_LOAD_FACTOR.
```java
/**
 * The load factor used when none specified in constructor.
 */
static final float DEFAULT_LOAD_FACTOR = 0.75f;
```
As for why it is 0.75 specifically, I honestly can't say.
For a deeper discussion, see this post: https://www.jianshu.com/p/64f6de3ffcc1
-
The PUT operation
```java
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
```
Start with hash():
```java
/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading.  Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
```
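To get a feel for what this transform does, here is a standalone copy of the same one-liner (only the hash method is taken from the source; the sample keys around it are my own):

```java
public class HashSpreadDemo {
    // same transform as HashMap.hash(Object): XOR the high 16 bits into the low 16
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        // null keys always map to 0 (bucket 0)
        System.out.println(hash(null)); // 0
        // for small hash codes the high 16 bits are zero, so the XOR changes nothing
        System.out.println(hash("a") == "a".hashCode()); // true ("a".hashCode() is 97)
        // for large hash codes the high bits now influence the low bits used for indexing
        Float f = 1.0f; // Float.hashCode(1.0f) has most of its entropy in the high bits
        System.out.println(hash(f) != f.hashCode()); // true
    }
}
```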
Don't worry about the details for now; just treat it as producing an int. On to the next method:
```java
/**
 * Implements Map.put and related methods
 *
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
```
First look at the parameters:
- hash: the value computed by hash() above
- key: the key passed to HashMap.put
- value: the value passed to HashMap.put
- onlyIfAbsent: if true, don't overwrite an existing value; the value is only written when the key has no mapping yet
- evict: ignore this for now
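The onlyIfAbsent flag is observable from the public API: put passes false, while putIfAbsent passes true. A small demo:

```java
import java.util.HashMap;

public class OnlyIfAbsentDemo {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put("k", "v1");
        // put (onlyIfAbsent = false): overwrites and returns the previous value
        String old = map.put("k", "v2");
        System.out.println(old);          // v1
        System.out.println(map.get("k")); // v2
        // putIfAbsent (onlyIfAbsent = true): the existing mapping is kept
        map.putIfAbsent("k", "v3");
        System.out.println(map.get("k")); // v2
    }
}
```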
Now the first line of the method body:
```java
Node<K,V>[] tab; Node<K,V> p; int n, i;
```
This declares a Node<K,V> array tab, a Node<K,V> reference p, and two int variables n and i whose roles aren't obvious yet.
```java
if ((tab = table) == null || (n = tab.length) == 0)
    n = (tab = resize()).length;
```
Start with the table field:
```java
/**
 * The table, initialized on first use, and resized as
 * necessary. When allocated, length is always a power of two.
 * (We also tolerate length zero in some operations to allow
 * bootstrapping mechanics that are currently not needed.)
 */
transient Node<K,V>[] table;
```
table defaults to null, so on the first put the check above fails and **resize()** gets called.
```java
/**
 * Initializes or doubles table size.  If null, allocates in
 * accord with initial capacity target held in field threshold.
 * Otherwise, because we are using power-of-two expansion, the
 * elements from each bin must either stay at same index, or move
 * with a power of two offset in the new table.
 *
 * @return the table
 */
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
```
The Javadoc already tells the story: resize() either initializes the table or doubles its size. Let's look at the concrete steps.
On the first call, oldCap and oldThr are both 0, so execution falls into the final else branch:
```java
newCap = DEFAULT_INITIAL_CAPACITY;                              // 1 << 4 = 16
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY); // (int)(0.75f * 16) = 12
```
Then newThr is stored into the threshold field:
```java
/**
 * The next size value at which to resize (capacity * load factor).
 *
 * @serial
 */
// (The javadoc description is true upon serialization.
// Additionally, if the table array has not been allocated, this
// field holds the initial array capacity, or zero signifying
// DEFAULT_INITIAL_CAPACITY.)
int threshold;
```
The threshold Javadoc makes loadFactor's role concrete: once size exceeds table length * loadFactor, the table is resized.
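Putting the two constants together: on each expansion both the capacity and the threshold double (newThr = oldThr << 1 in the source above). A standalone sketch of the first few resize steps:

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        int cap = 16;             // DEFAULT_INITIAL_CAPACITY
        int thr = (int) (cap * loadFactor);
        for (int i = 0; i < 3; i++) {
            System.out.println("capacity=" + cap + " threshold=" + thr);
            cap <<= 1; // newCap = oldCap << 1
            thr <<= 1; // newThr = oldThr << 1
        }
        // prints capacity=16 threshold=12, then 32/24, then 64/48
    }
}
```

Note that because both double together, the threshold stays at exactly 75% of the capacity.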
```java
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
```
This allocates the table array: 16 slots by default, and, as the Javadoc above notes, the length is always a power of two.
At this point the first call to resize() is complete.
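On later calls, resize() also redistributes existing nodes: the (e.hash & oldCap) == 0 test in the source above decides whether a node keeps its index j or moves to j + oldCap. A standalone check (with arbitrary sample hash values) that this shortcut agrees with simply re-masking against the new length:

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        int[] hashes = {3, 19, 35, 97}; // arbitrary sample hash values
        for (int h : hashes) {
            int oldIdx = (oldCap - 1) & h;
            int newIdx = (newCap - 1) & h;
            // the new mask adds exactly one bit: the oldCap bit.
            // if that bit of the hash is 0, the index is unchanged;
            // if it is 1, the index shifts up by exactly oldCap.
            if ((h & oldCap) == 0)
                System.out.println(newIdx == oldIdx);          // true
            else
                System.out.println(newIdx == oldIdx + oldCap); // true
        }
    }
}
```

This is why the source splits each chain into a "lo" list (stays at j) and a "hi" list (goes to j + oldCap) without recomputing any hashes.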
```java
if ((tab = table) == null || (n = tab.length) == 0)
    n = (tab = resize()).length;
```
We also now know that the variable n is simply the table's length.
On to the next line:
```java
if ((p = tab[i = (n - 1) & hash]) == null)
    tab[i] = newNode(hash, key, value, null);
```
This line shows how HashMap maps a hash to an array index: i = (n - 1) & hash. Since this is our very first put, tab[(n - 1) & hash] is necessarily null, so a new node is placed there via newNode(hash, key, value, null).
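Since n is always a power of two, (n - 1) is a mask of all-ones in the low bits, so (n - 1) & hash selects the same bucket as the slower hash % n for non-negative hashes. A standalone check with some arbitrary hash values:

```java
public class IndexDemo {
    public static void main(String[] args) {
        int n = 16; // table length, always a power of two
        int[] hashes = {0, 5, 17, 97}; // arbitrary non-negative sample hashes
        for (int h : hashes) {
            int masked = (n - 1) & h;
            // for power-of-two n, the mask keeps exactly the low log2(n) bits,
            // which is the remainder of division by n
            System.out.println(masked == h % n); // true
        }
    }
}
```

This is also why hash() bothers to XOR the high bits downward: only the low bits survive the mask.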
```java
// Create a regular (non-tree) node
Node<K,V> newNode(int hash, K key, V value, Node<K,V> next) {
    return new Node<>(hash, key, value, next);
}
```
So HashMap stores its entries internally in an array of Node objects.
Here is the Node class itself:
```java
/**
 * Basic hash bin node, used for most entries.  (See below for
 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
 */
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>)o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
```
Looking at this class, the next field means Node also forms a singly linked list. So we can sketch a first picture of how HashMap is organized: an array of buckets, each holding a chain of nodes.
The figure above shows the internal data structure of HashMap, as far as the source reveals after our first put.
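To summarize that structure in code, here is a deliberately simplified, hypothetical sketch of the "array of linked-list buckets" layout. It borrows HashMap's field names but skips resizing, hash spreading, duplicate-key updates, and treeification, so it is an illustration, not the real implementation:

```java
// Hypothetical simplified sketch (NOT the real HashMap): an array of buckets,
// each bucket a singly linked chain of nodes, indexed by (length - 1) & hash.
public class MiniMap<K, V> {
    static class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next; // next node in the same bucket's chain
        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16]; // fixed size: no resize() here

    public void put(K key, V value) {
        int h = key.hashCode(); // the real HashMap also spreads the high bits; skipped here
        int i = (table.length - 1) & h;
        // prepend to the chain; the real HashMap appends and updates existing keys instead
        table[i] = new Node<>(h, key, value, table[i]);
    }

    public V get(K key) {
        int i = (table.length - 1) & key.hashCode();
        for (Node<K, V> e = table[i]; e != null; e = e.next)
            if (key.equals(e.key))
                return e.value;
        return null; // key not present
    }

    public static void main(String[] args) {
        MiniMap<String, String> map = new MiniMap<>();
        map.put("1", "1");
        System.out.println(map.get("1")); // 1
        System.out.println(map.get("2")); // null
    }
}
```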