HashMap

Reading the source's doc comments is more authoritative than any blog post or video explanation!

(Translation assisted by Google Translate.)

Doc comments:

Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.

Hash table based implementation of the Map interface. It provides all of the optional map operations and permits null as both key and value. (HashMap is roughly equivalent to Hashtable, except that HashMap is unsynchronized, i.e. not thread-safe, and permits nulls, while Hashtable allows neither null keys nor null values.) HashMap makes no guarantee about the order of the map's contents; in particular, the order may change over time.
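For instance, null keys and null values are accepted where Hashtable would throw. A minimal sketch (the class name is just for illustration):

import java.util.HashMap;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put(null, 1);                  // null key is allowed (hash(null) is 0, so it lands in bucket 0)
        m.put("a", null);                // null values are allowed too
        System.out.println(m.get(null)); // 1
        System.out.println(m.get("a"));  // null
        // new java.util.Hashtable<String, Integer>().put(null, 1); // would throw NullPointerException
    }
}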

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the “capacity” of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

Assuming the hash function disperses the elements well among the buckets, get and put run in constant time. Iterating over a collection view of a HashMap takes time proportional to its "capacity" (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance matters, do not set the capacity too high (or the load factor too low).

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

HashMap's performance is governed by two parameters: the initial capacity and the load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the table is created. The load factor measures how full the table is allowed to get before it is automatically grown. When the number of entries exceeds the product of the load factor and the current capacity, the table is rehashed (i.e., its internal data structures are rebuilt) so that it has roughly twice as many buckets.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

As a rule, the default load factor (0.75) offers a good tradeoff between time and space costs. Higher values reduce the space overhead but increase the lookup cost (reflected in most HashMap operations, including get and put). When setting the initial capacity, take the expected number of entries and the load factor into account, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur.
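A minimal sketch of that sizing rule (the helper name newSizedHashMap and the entry count are mine, for illustration):

import java.util.HashMap;
import java.util.Map;

public class Sizing {
    // Pick an initial capacity such that expectedEntries never exceeds
    // capacity * loadFactor, so the table never rehashes while filling up.
    static <K, V> Map<K, V> newSizedHashMap(int expectedEntries) {
        int capacity = (int) (expectedEntries / 0.75f) + 1; // 0.75f = default load factor
        return new HashMap<>(capacity);
    }

    public static void main(String[] args) {
        Map<String, Integer> m = newSizedHashMap(1000); // requests capacity 1334
        for (int i = 0; i < 1000; i++)
            m.put("key" + i, i); // no resize happens during this loop
        System.out.println(m.size()); // 1000
    }
}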

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same hashCode() is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are Comparable, this class may use comparison order among keys to help break ties.

If many mappings are to be stored in a HashMap, creating it with a sufficiently large capacity lets the mappings be stored more efficiently than letting the map rehash as the table grows. Note that using many keys with the same hashCode() is a sure way to slow any hash table down. To soften the impact, when keys are Comparable, HashMap may use the keys' comparison order to break ties; see the sketch below.
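A hedged sketch of that tie-breaking behavior (the class BadKey is hypothetical): every key hashes into the same bucket, but because the keys are Comparable, the tree bin can still use compareTo to keep lookups at O(log n) instead of degrading to a linear scan.

import java.util.HashMap;
import java.util.Map;

// Hypothetical key whose hashCode always collides.
class BadKey implements Comparable<BadKey> {
    final int id;
    BadKey(int id) { this.id = id; }

    @Override public int hashCode() { return 42; } // every key lands in one bin
    @Override public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).id == id;
    }
    @Override public int compareTo(BadKey other) { // lets tree bins break ties
        return Integer.compare(id, other.id);
    }
}

public class TieBreakDemo {
    public static void main(String[] args) {
        Map<BadKey, Integer> m = new HashMap<>();
        for (int i = 0; i < 10_000; i++)
            m.put(new BadKey(i), i); // the bin treeifies; lookups stay O(log n)
        System.out.println(m.get(new BadKey(1234))); // 1234
    }
}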

Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be “wrapped” using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map:

Map m = Collections.synchronizedMap(new HashMap(…));

The iterators returned by all of this class’s “collection view methods” are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

Note: HashMap is not thread-safe. If multiple threads access a HashMap concurrently and at least one of them modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key the map already contains is not a structural modification.) External synchronization is typically achieved by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using Collections.synchronizedMap, preferably at creation time, to prevent accidental unsynchronized access:

Map m = Collections.synchronizedMap(new HashMap(…));

The iterators returned by this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way other than through the iterator's own remove() method, the iterator throws ConcurrentModificationException. Faced with concurrent modification, the iterator thus fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at some undetermined time in the future.
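A minimal sketch that triggers the fail-fast behavior even within a single thread (the class name is mine):

import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("b", 2);

        Iterator<String> it = m.keySet().iterator();
        m.put("c", 3); // structural modification after the iterator was created
        try {
            it.next();
        } catch (ConcurrentModificationException e) {
            System.out.println("fail-fast: " + e); // thrown here
        }

        // Removing through the iterator's own remove() is the permitted way:
        Iterator<String> it2 = m.keySet().iterator();
        while (it2.hasNext())
            if (it2.next().equals("a"))
                it2.remove(); // safe: no exception
    }
}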

Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
This class is a member of the Java Collections Framework.

Note: the fail-fast behavior of an iterator cannot be guaranteed; generally speaking, it is impossible to make hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis, so it would be wrong to write a program that depends on this exception for its correctness: the fail-fast behavior should be used only to detect bugs.
HashMap is a member of the Java Collections Framework.


This map usually acts as a binned (bucketed) hash table, but when bins get too large, they are transformed into bins of TreeNodes, each structured similarly to those in java.util.TreeMap. Most methods try to use normal bins, but relay to TreeNode methods when applicable (simply by checking instanceof a node). Bins of TreeNodes may be traversed and used like any others, but additionally support faster lookup when overpopulated. However, since the vast majority of bins in normal use are not overpopulated, checking for existence of tree bins may be delayed in the course of table methods.

HashMap normally acts as a binned (bucketed) hash table, but when bins grow too large they are transformed into bins of TreeNodes, each structured much like those in java.util.TreeMap. Most methods try to use normal bins, but relay to TreeNode methods when applicable (simply by checking instanceof on a node).
Bins of TreeNodes can be traversed and used like any others, but additionally support faster lookup when overpopulated. Since the vast majority of bins in normal use are not overpopulated, however, checking for the existence of tree bins may be deferred in the course of table methods.

Tree bins (i.e., bins whose elements are all TreeNodes) are ordered primarily by hashCode, but in the case of ties, if two elements are of the same “class C implements Comparable<C>” type, then their compareTo method is used for ordering. (We conservatively check generic types via reflection to validate this – see method comparableClassFor). The added complexity of tree bins is worthwhile in providing worst-case O(log n) operations when keys either have distinct hashes or are orderable. Thus, performance degrades gracefully under accidental or malicious usages in which hashCode() methods return values that are poorly distributed, as well as those in which many keys share a hashCode, so long as they are also Comparable. (If neither of these apply, we may waste about a factor of two in time and space compared to taking no precautions. But the only known cases stem from poor user programming practices that are already so slow that this makes little difference.)

Tree bins (bins whose elements are all TreeNodes) are ordered primarily by hashCode, but in the case of ties, if two elements are of the same "class C implements Comparable<C>" type, their compareTo method is used for ordering. (Generic types are conservatively checked via reflection to validate this; see the comparableClassFor method.) The added complexity of tree bins is worthwhile because it provides worst-case O(log n) operations when keys either have distinct hashes or are orderable. Performance therefore degrades gracefully under accidental or malicious usage in which hashCode() returns poorly distributed values, as well as when many keys share a hashCode, so long as they are also Comparable. (If neither applies, we may waste about a factor of two in time and space compared to taking no precautions, but the only known cases stem from poor user programming practices that are already so slow that this makes little difference.)

Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough nodes to warrant use (see TREEIFY_THRESHOLD). And when they become too small (due to removal or resizing) they are converted back to plain bins. In usages with well-distributed user hashCodes, tree bins are rarely used. Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution (http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default resizing threshold of 0.75, although with a large variance because of resizing granularity. Ignoring variance, the expected occurrences of list size k are (exp(-0.5) * pow(0.5, k) / factorial(k)). The first values are:
0: 0.60653066
1: 0.30326533
2: 0.07581633
3: 0.01263606
4: 0.00157952
5: 0.00015795
6: 0.00001316
7: 0.00000094
8: 0.00000006
more: less than 1 in ten million
The root of a tree bin is normally its first node. However, sometimes (currently only upon Iterator.remove), the root might be elsewhere, but can be recovered following parent links (method TreeNode.root()).

Because TreeNodes are about twice the size of regular (linked-list) nodes, they are used only when a bin contains enough nodes to warrant it (see TREEIFY_THRESHOLD). When bins become too small (due to removal or resizing), they are converted back to plain (linked-list) bins. With well-distributed user hashCodes, tree bins are rarely used. Ideally, under random hashCodes, the frequency of nodes in a bin follows a Poisson distribution (http://en.wikipedia.org/wiki/Poisson_distribution) with a parameter of about 0.5 on average for the default resizing threshold of 0.75, although with a large variance because of resizing granularity.
Ignoring variance, the expected occurrences of list size k are exp(-0.5) * pow(0.5, k) / factorial(k). The first values are:
0: 0.60653066
1: 0.30326533
2: 0.07581633
3: 0.01263606
4: 0.00157952
5: 0.00015795
6: 0.00001316
7: 0.00000094
8: 0.00000006
more: less than 1 in ten million
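The numbers above can be reproduced directly from that formula; a standalone sketch:

public class PoissonCheck {
    public static void main(String[] args) {
        double lambda = 0.5; // average bin occupancy at the default load factor
        double factorial = 1.0;
        for (int k = 0; k <= 8; k++) {
            if (k > 0) factorial *= k;
            double p = Math.exp(-lambda) * Math.pow(lambda, k) / factorial;
            System.out.printf("%d: %.8f%n", k, p); // matches the table above
        }
    }
}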
The root of a tree bin is normally its first node. However, sometimes (currently only upon Iterator.remove) the root may be elsewhere, but it can be recovered by following parent links (the TreeNode.root() method).

All applicable internal methods accept a hash code as an argument (as normally supplied from a public method), allowing them to call each other without recomputing user hashCodes. Most internal methods also accept a “tab” argument, that is normally the current table, but may be a new or old one when resizing or converting.

All applicable internal methods accept a hash code as an argument (normally supplied by a public method), which lets them call each other without recomputing user hashCodes. Most internal methods also accept a "tab" argument, which is normally the current table, but may be a new or old one when resizing or converting.

When bin lists are treeified, split, or untreeified, we keep them in the same relative access/traversal order (i.e., field Node.next) to better preserve locality, and to slightly simplify handling of splits and traversals that invoke iterator.remove. When using comparators on insertion, to keep a total ordering (or as close as is required here) across rebalancings, we compare classes and identityHashCodes as tie-breakers.

When bin lists are treeified, split, or untreeified, they are kept in the same relative access/traversal order (i.e., the Node.next field) to better preserve locality and to slightly simplify the handling of splits and traversals that invoke iterator.remove. When comparators are used on insertion, classes and identityHashCodes are compared as tie-breakers to keep a total ordering (or as close as is required here) across rebalancings.

The use and transitions among plain vs tree modes is complicated by the existence of subclass LinkedHashMap. See below for hook methods defined to be invoked upon insertion, removal and access that allow LinkedHashMap internals to otherwise remain independent of these mechanics. (This also requires that a map instance be passed to some utility methods that may create new nodes.)

The existence of the subclass LinkedHashMap complicates the use of, and the transitions between, plain and tree modes. Hook methods defined below are invoked upon insertion, removal, and access, allowing LinkedHashMap internals to otherwise remain independent of these mechanics. (This also requires that a map instance be passed to some utility methods that may create new nodes.)


Constants:
  1. DEFAULT_INITIAL_CAPACITY
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

The default initial capacity - MUST be a power of two.

The default initial capacity; must be a power of two.

  2. MAXIMUM_CAPACITY
static final int MAXIMUM_CAPACITY = 1 << 30;

The maximum capacity, used if a higher value is implicitly specified by either of the constructors with arguments. MUST be a power of two <= 1<<30.

The maximum capacity, used if a higher value is implicitly specified by either of the constructors with arguments. Must be a power of two and <= 1<<30.

  3. DEFAULT_LOAD_FACTOR
static final float DEFAULT_LOAD_FACTOR = 0.75f;

The load factor used when none specified in constructor.

The load factor used when none is specified in the constructor.

  4. TREEIFY_THRESHOLD
static final int TREEIFY_THRESHOLD = 8;

The bin count threshold for using a tree rather than list for a bin. Bins are converted to trees when adding an element to a bin with at least this many nodes. The value must be greater than 2 and should be at least 8 to mesh with assumptions in tree removal about conversion back to plain bins upon shrinkage.

The bin count threshold for using a tree rather than a list for a bin. A bin is converted to a tree when an element is added to a bin that already holds at least this many nodes. The value must be greater than 2 and should be at least 8 to mesh with the assumptions in tree removal about converting back to plain bins upon shrinkage.

  5. UNTREEIFY_THRESHOLD
static final int UNTREEIFY_THRESHOLD = 6;

The bin count threshold for untreeifying a (split) bin during a resize operation. Should be less than TREEIFY_THRESHOLD, and at most 6 to mesh with shrinkage detection under removal.

The bin count threshold for untreeifying a (split) bin during a resize. Should be less than TREEIFY_THRESHOLD, and at most 6 to mesh with shrinkage detection under removal.

  6. MIN_TREEIFY_CAPACITY
static final int MIN_TREEIFY_CAPACITY = 64;

The smallest table capacity for which bins may be treeified. (Otherwise the table is resized if too many nodes in a bin.) Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts between resizing and treeification thresholds.

Bins are treeified only when the table capacity is at least 64; otherwise, if a bin holds too many nodes, the table is resized instead. Should be at least 4 * TREEIFY_THRESHOLD (TREEIFY_THRESHOLD defaults to 8) to avoid conflicts between the resizing and treeification thresholds.


Fields:
  1. table
transient Node<K,V>[] table;	

The table, initialized on first use, and resized as necessary. When allocated, length is always a power of two. (We also tolerate length zero in some operations to allow bootstrapping mechanics that are currently not needed.)

The table, initialized on first use and resized as necessary. When allocated, its length is always a power of two. (Length zero is also tolerated in some operations, to allow bootstrapping mechanics that are currently not needed.)

Note: this means the table is initialized only when put() is first called; the corresponding evidence is in the HashMap(int initialCapacity, float loadFactor) constructor:

public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        // tableSizeFor() returns a capacity (a power of two), which is only
        // temporarily stashed in threshold; the real threshold is defined as
        // threshold = capacity * load factor.
        // Note that the constructor does NOT initialize the table field: table
        // initialization is deferred to put(), where threshold is recomputed.
        this.threshold = tableSizeFor(initialCapacity); 
    }

transient: a Java keyword and field modifier. If an instance field is declared transient, its value is not persisted when the object is stored; in other words, fields marked transient are excluded from the serialization process.
For background, see: https://blog.csdn.net/u013207877/article/details/52572975

  2. entrySet
transient Set<Map.Entry<K,V>> entrySet;

Holds cached entrySet(). Note that AbstractMap fields are used for keySet() and values().

Holds the cached entrySet(). Note that the AbstractMap fields are used for keySet() and values().

  3. size
transient int size;

The number of key-value mappings contained in this map.

The number of key-value mappings contained in this map.

  4. modCount
transient int modCount;

The number of times this HashMap has been structurally modified. Structural modifications are those that change the number of mappings in the HashMap or otherwise modify its internal structure (e.g., rehash). This field is used to make iterators on Collection-views of the HashMap fail-fast. (See ConcurrentModificationException).

The number of times this HashMap has been structurally modified, e.g. by changing the number of key-value mappings or by modifying its internal structure (such as a rehash()). The field is used to make iterators over the HashMap's collection views fail fast. (See ConcurrentModificationException.)

  5. threshold
int threshold;

The next size value at which to resize (capacity * load factor).

The next size value at which to resize (capacity * load factor).

  6. loadFactor
final float loadFactor;

The load factor for the hash table.

The load factor for the hash table.


Static nested class
  • Node<K,V>
static class Node<K,V> implements Map.Entry<K,V> {
....
}

Basic hash bin node, used for most entries. (See below for TreeNode subclass, and in LinkedHashMap for its Entry subclass.)

The basic hash bin node, used for most entries. (See the TreeNode subclass below, and the Entry subclass in LinkedHashMap.)

public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }

Whenever equals() is overridden, hashCode() must be overridden as well:

 public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

Note that equals() here requires the argument to be a Map.Entry. Two entries compare equal when both their keys and their values match under Objects.equals, i.e., they are == or key1.equals(key2) and value1.equals(value2). If the key and value are Strings, String overrides equals(), so equal content suffices; for a class that does not override equals(), the two references must point to the same object.
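A small sketch of those equality rules (uses Map.entry, which requires Java 9+; the class name is mine):

import java.util.Map;

public class EntryEqualsDemo {
    public static void main(String[] args) {
        // String overrides equals(), so entries with equal content are equal:
        Map.Entry<String, Integer> a = Map.entry("k", 1);
        Map.Entry<String, Integer> b = Map.entry("k", 1);
        System.out.println(a.equals(b)); // true

        // Object does not override equals(), so distinct instances are not equal:
        Map.Entry<Object, Integer> c = Map.entry(new Object(), 1);
        Map.Entry<Object, Integer> d = Map.entry(new Object(), 1);
        System.out.println(c.equals(d)); // false
    }
}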


Methods
  1. hash()
static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

Computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.) So we apply a transform that spreads the impact of higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (so don’t benefit from spreading), and because we use trees to handle large sets of collisions in bins, we just XOR some shifted bits in the cheapest possible way to reduce systematic lossage, as well as to incorporate impact of the highest bits that would otherwise never be used in index calculations because of table bounds.

Computes key.hashCode() and spreads (XORs) the higher bits of the hash down into the lower bits. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask would always collide. (Among the known examples are sets of Float keys holding consecutive whole numbers in small tables.) So a transform is applied that spreads the impact of the higher bits downward. There is a tradeoff between speed, utility, and quality of bit-spreading. Because many common sets of hashes are already reasonably distributed (and so do not benefit from spreading), and because trees are used to handle large sets of collisions in bins, some shifted bits are simply XORed in the cheapest possible way, to reduce systematic loss and to incorporate the influence of the highest bits, which would otherwise never be used in index calculations because of the table bounds.

  2. tableSizeFor()
static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

Returns a power of two size for the given target capacity.

Returns a power-of-two size for the given target capacity.

tableSizeFor() guarantees that whatever capacity argument is passed to the HashMap constructor, the value returned by tableSizeFor(cap) is a power of two.

Suppose cap = 65 is passed in:

n = cap - 1 = 64; // n = 0000 0000 0000 0000 0000 0000 0100 0000 (binary)
n |= n >>> 1;
// 0000 0000 0000 0000 0000 0000 0100 0000
// | 0000 0000 0000 0000 0000 0000 0010 0000
// = 0000 0000 0000 0000 0000 0000 0110 0000
n |= n >>> 2;
// 0000 0000 0000 0000 0000 0000 0110 0000
// | 0000 0000 0000 0000 0000 0000 0001 1000
// = 0000 0000 0000 0000 0000 0000 0111 1000
n |= n >>> 4;
// 0000 0000 0000 0000 0000 0000 0111 1000
// | 0000 0000 0000 0000 0000 0000 0000 0111
// = 0000 0000 0000 0000 0000 0000 0111 1111
// (the >>> 8 and >>> 16 steps below produce no further change)
n |= n >>> 8;
// 0000 0000 0000 0000 0000 0000 0111 1111
// | 0000 0000 0000 0000 0000 0000 0000 0000
// = 0000 0000 0000 0000 0000 0000 0111 1111
n |= n >>> 16;
// 0000 0000 0000 0000 0000 0000 0111 1111
// | 0000 0000 0000 0000 0000 0000 0000 0000
// = 0000 0000 0000 0000 0000 0000 0111 1111
At this point n = 127, and the method returns n + 1 = 128.

Work through a few more examples and the pattern becomes clear: every bit below the highest set bit of the input (for cap = 65 above, the highest 1 sits in the 7th position) gets filled with 1s, and the final + 1 then always yields a power of two.
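A standalone copy of the routine plus a quick check (MAXIMUM_CAPACITY inlined as 1 << 30; the class name is mine):

public class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same bit-smearing routine as HashMap.tableSizeFor
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        for (int cap : new int[] {1, 2, 3, 16, 17, 65, 1000})
            System.out.println(cap + " -> " + tableSizeFor(cap));
        // prints: 1 -> 1, 2 -> 2, 3 -> 4, 16 -> 16, 17 -> 32, 65 -> 128, 1000 -> 1024
    }
}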

  3. put()
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.

Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.
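A quick usage note: put() returns the previous value for the key, or null if there was none.

import java.util.HashMap;
import java.util.Map;

public class PutDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        System.out.println(m.put("a", 1)); // null: no previous mapping
        System.out.println(m.put("a", 2)); // 1: the old value is returned and replaced
        System.out.println(m.get("a"));    // 2
    }
}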

put() delegates to putVal() with five arguments, the first of which is hash(key), so look at that method first:

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

The comment on hash() was quoted above; concretely, it XORs the high 16 bits of key.hashCode() into the low 16 bits.
For example:
original hashCode : 1111 1111 1111 1111 0100 1100 0000 1010
after >>> 16      : 0000 0000 0000 0000 1111 1111 1111 1111
XOR result        : 1111 1111 1111 1111 1011 0011 1111 0101
The benefit is that the high and low halves of the hashCode are mixed by the XOR: the low bits now carry information from the high bits, so the high bits are indirectly preserved instead of being discarded by the index mask. The more bits that take part in the mix, the more randomly the resulting hash values are spread.
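To see why this matters for indexing: the bucket index is later computed as (n - 1) & hash, so with the default table length n = 16 only the low 4 bits pick the bucket, and without the XOR, hashCodes that differ only in their high bits would always collide. A small sketch (the class name and key values are mine):

public class HashSpreadDemo {
    // Same spreading step as HashMap.hash()
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;              // default table length; n - 1 = 0b1111
        Integer k1 = 0x00010001; // hashCodes that differ only in the high bits
        Integer k2 = 0x00020001;

        // Without spreading, masking keeps only the low 4 bits: both collide.
        System.out.println((n - 1) & k1.hashCode()); // 1
        System.out.println((n - 1) & k2.hashCode()); // 1

        // With spreading, the high bits influence the bucket index.
        System.out.println((n - 1) & hash(k1)); // 0
        System.out.println((n - 1) & hash(k2)); // 3
    }
}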

Now look at what putVal() does:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        // As noted above, the table is initialized only on the first put;
        // the actual work happens inside resize()
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
        	// Taken on the first put:
        	// resize() initializes the table, and n becomes the table length
            n = (tab = resize()).length;
            
        // i = (n - 1) & hash is guaranteed to be less than the table length
        // (see the index sketch under hash() above)
        // Check whether tab[i] already holds a value
        if ((p = tab[i = (n - 1) & hash]) == null)
        	// tab[i] is null, so just place a new node there
            tab[i] = newNode(hash, key, value, null);
        else {
        	// The table is initialized and tab[i] is occupied: a hash collision!
        	// Skipped for now
            ...
        }
        // modCount counts structural modifications of the HashMap; increment it
        ++modCount;
        // Increment size; at this point threshold = 16 (capacity) * 0.75 (load factor) = 12
        if (++size > threshold)
        	// If size exceeds the threshold (12), resize() runs again; this time it grows the table.
            resize();
        afterNodeInsertion(evict);
        return null;
    }

Following the comments above, look at resize() first. It is fairly long, because resize() covers both the table initialization on the first put and the growth path when there are too many elements; for now, only the initialization on the first put is examined:

/**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     */
    final Node<K,V>[] resize() {
    	// table is still null at this point, so oldTab is null
        Node<K,V>[] oldTab = table;
        // oldCap is 0
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // threshold still holds the capacity value set by the constructor (= 16 here)
        int oldThr = threshold;
        int newCap, newThr = 0;
        // oldCap = 0, so skip straight past this branch
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        // oldThr = 16 at this point
        else if (oldThr > 0) // initial capacity was placed in threshold
        	// newCap = 16
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
        	// newThr is 0 here, so the threshold is recomputed below
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        // Up to this point threshold = tableSizeFor(initialCapacity),
        // so initializing the table recomputes threshold = capacity * loadFactor
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        // With this assignment the table is initialized
        table = newTab;
        if (oldTab != null) {
        	// oldTab is still null here; the code below handles the actual
        	// rehashing during growth and is quite involved!
        	...
        }
        return newTab;
    }

Next, consider what putVal() does when a hash collision occurs:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
      Node<K,V>[] tab; Node<K,V> p; int n, i;
      ...
      // Above: the normal, collision-free insertion path
      // tab = table; i is the key's table index; p is the node at table[i];
      // n is the table length
      else {
            Node<K,V> e; K k;
            // If p.hash (the hash() of the key stored at table[i]) equals hash
            // (the hash() of the incoming key), and the stored key equals the
            // incoming key...
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
            	// Reached by e.g. map.put("a", 1);
            	// followed by map.put("a", 2);
            	// In this case e points at p (the node under table[i]);
            	// further down, the new value replaces the old value
                e = p;
            else if (p instanceof TreeNode)
            	// true here means this bin has already been turned into a tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
            	// Hash collision with a different key, and the bin has not
            	// been treeified yet
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                    	// p.next == null: the last node of the bin's list was reached
                        p.next = newNode(hash, key, value, null);
                        // TREEIFY_THRESHOLD defaults to 8; binCount >= 7 means
                        // that with this new element the list length is >= 8,
                        // so the bin must be treeified.
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        	// treeify this bin
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        // The same key was found in the bin's list; further
                        // down, the new value replaces the old value
                        break;
                    // Advance p to its next node (e was already set to p.next above)
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
            	// Below, the new value replaces the old value
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // ...
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
}

That leaves the two most intricate parts of putVal(): treeifying a bin (treeifyBin) and inserting an element into a tree-structured bin (putTreeVal).
