Source Code
Constants & Fields
public class ConcurrentHashMap<K,V> extends AbstractMap<K,V>
implements ConcurrentMap<K,V>, Serializable {
/* ---------------- Constants -------------- */
/**
* The largest possible table capacity. This value must be
* exactly 1<<30 to stay within Java array allocation and indexing
* bounds for power of two table sizes, and is further required
* because the top two bits of 32bit hash fields are used for
* control purposes.
*/
private static final int MAXIMUM_CAPACITY = 1 << 30; // maximum capacity of the hash table array
/**
* The default initial table capacity. Must be a power of 2
* (i.e., at least 1) and at most MAXIMUM_CAPACITY.
*/
private static final int DEFAULT_CAPACITY = 16; // default table capacity: 16
/**
* The largest possible (non-power of two) array size.
* Needed by toArray and related methods.
*/
static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8; // largest possible (non-power-of-two) array size
/**
* The default concurrency level for this table. Unused but
* defined for compatibility with previous versions of this class.
*/
private static final int DEFAULT_CONCURRENCY_LEVEL = 16; // default concurrency level; kept only for compatibility with pre-JDK 1.8 versions
/**
* The load factor for this table. Overrides of this value in
* constructors affect only the initial table capacity. The
* actual floating point value isn't normally used -- it is
* simpler to use expressions such as {@code n - (n >>> 2)} for
* the associated resizing threshold.
*/
private static final float LOAD_FACTOR = 0.75f; // load factor; constructor overrides affect only the initial capacity; the threshold is computed as n - (n >>> 2)
/**
* The bin (bucket) count threshold for using a tree rather than list for a
* bin. Bins are converted to trees when adding an element to a
* bin with at least this many nodes. The value must be greater
* than 2, and should be at least 8 to mesh with assumptions in
* tree removal about conversion back to plain bins upon
* shrinkage.
*/
static final int TREEIFY_THRESHOLD = 8; // treeify threshold: convert a bin's list to a tree at >= 8 nodes
/**
* The bin count threshold for untreeifying a (split) bin during a
* resize operation. Should be less than TREEIFY_THRESHOLD, and at
* most 6 to mesh with shrinkage detection under removal.
*/
static final int UNTREEIFY_THRESHOLD = 6; // untreeify threshold: convert a tree back to a list at <= 6 nodes
/**
* The smallest table capacity for which bins may be treeified.
* (Otherwise the table is resized if too many nodes in a bin.)
* The value should be at least 4 * TREEIFY_THRESHOLD to avoid
* conflicts between resizing and treeification thresholds.
*/
static final int MIN_TREEIFY_CAPACITY = 64; // minimum table capacity before bins may be treeified
/**
* Minimum number of rebinnings per transfer step. Ranges are
* subdivided to allow multiple resizer threads. This value
* serves as a lower bound to avoid resizers encountering
* excessive memory contention. The value should be at least
* DEFAULT_CAPACITY.
*/
private static final int MIN_TRANSFER_STRIDE = 16; // minimum number of bins claimed per transfer stride during a resize
/**
* The number of bits used for generation stamp in sizeCtl.
* Must be at least 6 for 32bit arrays.
*/
private static int RESIZE_STAMP_BITS = 16; // bits used to generate the resize stamp stored in sizeCtl
/**
* The maximum number of threads that can help resize.
* Must fit in 32 - RESIZE_STAMP_BITS bits.
*/
// (1 << 16) - 1 = 65535: maximum number of threads that can join a concurrent resize
private static final int MAX_RESIZERS = (1 << (32 - RESIZE_STAMP_BITS)) - 1;
/**
* The bit shift for recording size stamp in sizeCtl.
*/
// shift that places the resize stamp in the high 16 bits of sizeCtl (the low 16 bits count the resizing threads)
private static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;
/*
* Encodings for Node hash fields. See above for explanation.
*/
// the node is a ForwardingNode (FWD): its bin has already been migrated
static final int MOVED = -1; // hash for forwarding nodes
// the bin has been treeified; this node is the TreeBin holding the tree root
static final int TREEBIN = -2; // hash for roots of trees
// placeholder node reserved for an in-flight atomic computation (e.g. computeIfAbsent)
static final int RESERVED = -3; // hash for transient reservations
// ANDing a hash with 0x7fffffff (thirty-one 1 bits) clears the sign bit, guaranteeing the spread hash of a normal node is non-negative
static final int HASH_BITS = 0x7fffffff; // usable bits of normal node hash
/** Number of CPUS, to place bounds on some sizings */
static final int NCPU = Runtime.getRuntime().availableProcessors();
/** For serialization compatibility. */
// fields kept so JDK 1.8 serialization remains compatible with the JDK 1.7 (Segment-based) ConcurrentHashMap
private static final ObjectStreamField[] serialPersistentFields = {
new ObjectStreamField("segments", Segment[].class),
new ObjectStreamField("segmentMask", Integer.TYPE),
new ObjectStreamField("segmentShift", Integer.TYPE)
};
/* ---------------- Fields -------------- */
/**
* The array of bins. Lazily initialized upon first insertion.
* Size is always a power of two. Accessed directly by iterators.
*/
transient volatile Node<K,V>[] table;
/**
* The next table to use; non-null only while resizing.
*/
// reference to the new table while resizing; set back to null once the resize finishes
private transient volatile Node<K,V>[] nextTable;
/**
* Base counter value, used mainly when there is no contention,
* but also as a fallback during table initialization
* races. Updated via CAS.
*/
private transient volatile long baseCount;
/**
* Table initialization and resizing control. When negative, the
* table is being initialized or resized: -1 for initialization,
* else -(1 + the number of active resizing threads). Otherwise,
* when table is null, holds the initial table size to use upon
* creation, or 0 for default. After initialization, holds the
* next element count value upon which to resize the table.
*/
// encodes the table's initialization/resize state; explained in detail below
private transient volatile int sizeCtl;
/**
* The next table index (plus one) to split while resizing.
*/
// resize progress marker: each thread claims its own range of bins to transfer by CASing transferIndex downward
private transient volatile int transferIndex;
/**
* Spinlock (locked via CAS) used when resizing and/or creating CounterCells.
*/
private transient volatile int cellsBusy;
/**
* Table of counter cells. When non-null, size is a power of 2.
*/
private transient volatile CounterCell[] counterCells;
// views
private transient KeySetView<K,V> keySet;
private transient ValuesView<K,V> values;
private transient EntrySetView<K,V> entrySet;
}
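As a standalone sketch (re-implementing the logic outside the JDK class for illustration), the spread() function referenced later in putVal combines HASH_BITS with a high-bits XOR, so every normal node's hash is non-negative and never collides with the control hashes MOVED (-1), TREEBIN (-2), or RESERVED (-3):

```java
public class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // usable bits of a normal node hash

    // same expression as ConcurrentHashMap.spread(): mix the high 16 bits
    // into the low 16, then clear the sign bit
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        System.out.println(spread("key".hashCode()));
        System.out.println(spread(-1) >= 0);                 // true even for negative input
        System.out.println(spread(Integer.MIN_VALUE) >= 0);  // true
    }
}
```

Since spread never returns a negative value, a negative hash in a bin head reliably signals a control node.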
Meaning of sizeCtl
Negative
- -1: the table is currently being initialized
- -N: the table is currently resizing; the high 16 bits hold the resize stamp (resizeStamp) tied to the current table size, and the low 16 bits hold the number of resizing threads + 1 (explained in detail below)
0
The table has not been initialized yet; the default initial capacity DEFAULT_CAPACITY = 16 will be used.
Positive
- If the table is uninitialized: the initial size to allocate
- If the table is initialized: the threshold (0.75 * capacity) at which the next resize is triggered
-N
Example: this excerpt from tryPresize(int size) shows how -N is first assigned.
// First int rs = resizeStamp(n);
// then sizeCtl is CASed to (rs << RESIZE_STAMP_SHIFT) + 2.
else if (tab == table) {
int rs = resizeStamp(n);
if (sc < 0) {
Node<K,V>[] nt;
if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
transferIndex <= 0)
break;
if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
transfer(tab, nt);
}
else if (U.compareAndSwapInt(this, SIZECTL, sc,
(rs << RESIZE_STAMP_SHIFT) + 2))
transfer(tab, null);
}
/**
 * e.g. n = 16:
 * 0000 0000 0000 0000 0000 0000 0001 0000 -> Integer.numberOfLeadingZeros(n) = 27 = 11011
 * then OR with 2^15 to get
 * 0000 0000 0000 0000 1000 0000 0001 1011
 *
 * @param n the table size
 * @return the resize stamp
 */
static final int resizeStamp(int n) {
// number of leading zeros in n's two's-complement representation, OR'ed with 2^15 (sets bit 15 of the stamp)
return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
}
// Shifting left 16 bits and adding 2 yields 1000 0000 0001 1011 0000 0000 0000 0010 = -2145714174; bit 15 of the stamp becomes the sign bit, so the value is negative.
So the high 16 bits carry the resize stamp and the low 16 bits carry the number of resizing threads + 1.
The first thread sets the low bits to 2 rather than 1 because every exiting thread decrements the thread count by 1 via CAS. If the initial value were the conventional 1, it would drop to 0 after that thread exited, and other threads could not tell whether no thread had ever started transferring or whether all transferring threads had finished and exited — so they could not decide whether to join the transfer.
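The bit layout described above can be reproduced in a minimal standalone sketch (the constants are re-declared here for illustration; they mirror the source quoted earlier):

```java
public class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // same expression as ConcurrentHashMap.resizeStamp()
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    public static void main(String[] args) {
        int n = 16;
        int rs = resizeStamp(n);                      // 27 | 0x8000 = 0x801B
        int sizeCtl = (rs << RESIZE_STAMP_SHIFT) + 2; // first resizing thread
        System.out.println(Integer.toBinaryString(rs));     // 1000000000011011
        System.out.println(sizeCtl);                        // -2145714174: negative => resizing
        System.out.println(sizeCtl >>> RESIZE_STAMP_SHIFT); // 32795 == rs: stamp check
        System.out.println(sizeCtl & 0xFFFF);               // 2 => one active resizing thread
    }
}
```

Because bit 15 of the stamp is always set, shifting it into the high half forces sizeCtl negative for the whole duration of the resize.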
Table element access
/* ---------------- Table element access -------------- */
@SuppressWarnings("unchecked")
static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
}
static final <K,V> boolean casTabAt(Node<K,V>[] tab, int i,
Node<K,V> c, Node<K,V> v) {
return U.compareAndSwapObject(tab, ((long)i << ASHIFT) + ABASE, c, v);
}
// In principle these writes require only release ordering, not full volatile semantics, but they are coded as volatile writes to be conservative.
static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
}
public V put(K key, V value)
/**
* Maps the specified key to the specified value in this table.
* Neither the key nor the value can be null.
*
* <p>The value can be retrieved by calling the {@code get} method
* with a key that is equal to the original key.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with {@code key}, or
* {@code null} if there was no mapping for {@code key}
* @throws NullPointerException if the specified key or value is null
*/
public V put(K key, V value) {
return putVal(key, value, false);
}
/** Implementation for put and putIfAbsent */
final V putVal(K key, V value, boolean onlyIfAbsent) {
if (key == null || value == null) throw new NullPointerException();
int hash = spread(key.hashCode()); // spread: (h ^ (h >>> 16)) & HASH_BITS — XOR the high 16 bits into the low 16, then mask with 0x7fffffff
int binCount = 0;
for (Node<K,V>[] tab = table;;) {
Node<K,V> f; int n, i, fh;
if (tab == null || (n = tab.length) == 0)
tab = initTable(); // init
else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) { // volatile read of the bin's head node
if (casTabAt(tab, i, null,
new Node<K,V>(hash, key, value, null)))
break; // no lock when adding to empty bin
}
else if ((fh = f.hash) == MOVED)
tab = helpTransfer(tab, f); // bin already moved: help with the ongoing resize
else {
V oldVal = null;
synchronized (f) { // lock the bin's head node
if (tabAt(tab, i) == f) {
if (fh >= 0) { // hash >= 0: ordinary linked list; a treeified bin's head has hash == TREEBIN (-2)
binCount = 1; // counts the nodes traversed in this bin
for (Node<K,V> e = f;; ++binCount) {
K ek;
if (e.hash == hash &&
((ek = e.key) == key ||
(ek != null && key.equals(ek)))) {
oldVal = e.val;
if (!onlyIfAbsent)
e.val = value;
break;
}
Node<K,V> pred = e;
if ((e = e.next) == null) {
pred.next = new Node<K,V>(hash, key,
value, null);
break;
}
}
}
else if (f instanceof TreeBin) { // double-check the node type
Node<K,V> p;
binCount = 2;
if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key, // insert into the red-black tree
value)) != null) {
oldVal = p.val;
if (!onlyIfAbsent)
p.val = value;
}
}
}
}
if (binCount != 0) {
if (binCount >= TREEIFY_THRESHOLD) // >= 8: treeifyBin either resizes the table or treeifies the bin
treeifyBin(tab, i);
if (oldVal != null)
return oldVal;
break;
}
}
}
addCount(1L, binCount);
return null;
}
The important methods involved here are initTable, helpTransfer, putTreeVal, treeifyBin, and addCount.
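As a small usage sketch of the put contract above (null rejection, return of the previous value, and the onlyIfAbsent behavior behind putIfAbsent):

```java
import java.util.concurrent.ConcurrentHashMap;

public class PutDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        System.out.println(map.put("a", 1));         // null: no previous mapping
        System.out.println(map.put("a", 2));         // 1: previous value is returned
        System.out.println(map.putIfAbsent("a", 9)); // 2: onlyIfAbsent leaves the value alone
        System.out.println(map.get("a"));            // 2
        try {
            map.put(null, 1); // neither key nor value may be null
        } catch (NullPointerException e) {
            System.out.println("NPE on null key");
        }
    }
}
```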
initTable()
/**
* Initializes table, using the size recorded in sizeCtl.
*/
private final Node<K,V>[] initTable() {
Node<K,V>[] tab; int sc;
while ((tab = table) == null || tab.length == 0) {
if ((sc = sizeCtl) < 0) // -1: initializing; -N: resizing (N = 1 + number of resizing threads)
Thread.yield(); // lost initialization race; just spin
else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
try {
if ((tab = table) == null || tab.length == 0) {
int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
@SuppressWarnings("unchecked")
Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
table = tab = nt;
sc = n - (n >>> 2); // resize threshold: 0.75 * n
}
} finally {
sizeCtl = sc;
}
break;
}
}
return tab;
}
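The integer expression n - (n >>> 2) used above equals 0.75 * n without any floating-point arithmetic, which is why LOAD_FACTOR itself is rarely consulted. A quick standalone sketch:

```java
public class ThresholdDemo {
    // same expression initTable uses to derive the resize threshold
    static int threshold(int n) {
        return n - (n >>> 2); // n - n/4 = 0.75 * n, integer-only
    }

    public static void main(String[] args) {
        System.out.println(threshold(16)); // 12
        System.out.println(threshold(32)); // 24
        System.out.println(threshold(64)); // 48
    }
}
```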
helpTransfer
Skip the transfer if the stamp in the high 16 bits of sizeCtl does not match the current table size, the number of transfer threads has reached the maximum, all transfer threads have already exited (a thread exits only after every bin range has been claimed; otherwise it keeps looping), or nextTable has already been cleared.
/**
* Helps transfer if a resize is in progress.
*/
final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
Node<K,V>[] nextTab; int sc;
if (tab != null && (f instanceof ForwardingNode) &&
(nextTab = ((ForwardingNode<K,V>)f).nextTable) != null) {
int rs = resizeStamp(tab.length);
while (nextTab == nextTable && table == tab &&
(sc = sizeCtl) < 0) {
// Check whether the resize has finished. Known JDK 8 bug here: the checks
// (sc == rs + 1 || sc == rs + MAX_RESIZERS) are missing the 16-bit left
// shift of rs; fixed in JDK 12.
// (sc >>> RESIZE_STAMP_SHIFT) != rs: verify the stamp against the table size
// sc == rs + 1 (intended: against the shifted stamp): all transfer threads have exited, since the count starts at +2; see tryPresize
// sc == rs + MAX_RESIZERS: the maximum of 65535 resize threads has been reached
// transferIndex <= 0: every bin range has been claimed; nothing left to help with
if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
sc == rs + MAX_RESIZERS || transferIndex <= 0)
break;
// resize still in progress: CAS the thread count up by 1 and help transfer
if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
transfer(tab, nextTab);
break;
}
}
return nextTab;
}
return table;
}
Concurrent transfer
Three places in ConcurrentHashMap trigger a call to transfer: addCount, tryPresize, and helpTransfer.
- addCount checks the element count after a write completes; if it reaches the resize threshold (sizeCtl), it triggers a resize and the transfer from the old table to the new one.
- tryPresize is called from putAll when many entries arrive, and from treeifyBin when the table is still below the minimum treeify capacity (64); it triggers a resize and transfer in advance.
- helpTransfer is called when a write finds that a bin's head node is a ForwardingNode; the writing thread then joins the ongoing transfer.
/**
* Moves and/or copies the nodes in each bin to new table. See
* above for explanation.
*/
private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
int n = tab.length, stride; // suppose n = 32
// number of bins to transfer per stride; at least 16
if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
stride = MIN_TRANSFER_STRIDE; // subdivide range
if (nextTab == null) { // initiating
try {
@SuppressWarnings("unchecked")
Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];
nextTab = nt;
} catch (Throwable ex) { // try to cope with OOME
sizeCtl = Integer.MAX_VALUE;
return;
}
nextTable = nextTab;
transferIndex = n; // initialize the transfer index
}
int nextn = nextTab.length;
ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab); // marker installed in bins that are empty or fully moved
boolean advance = true;
boolean finishing = false; // to ensure sweep before committing nextTab: every old bin must be visited first
for (int i = 0, bound = 0;;) {
Node<K,V> f; int fh;
while (advance) {
int nextIndex, nextBound;
if (--i >= bound || finishing)
advance = false;
else if ((nextIndex = transferIndex) <= 0) {
i = -1;
advance = false;
}
else if (U.compareAndSwapInt
(this, TRANSFERINDEX, nextIndex,
nextBound = (nextIndex > stride ?
nextIndex - stride : 0))) { // claim a stride and move the index down: nextBound = 32 - 16 = 16
bound = nextBound; // 16
i = nextIndex - 1; // 31
advance = false;
}
}
if (i < 0 || i >= n || i + n >= nextn) {
int sc;
// the last thread commits: table = nextTab
if (finishing) {
nextTable = null;
table = nextTab;
sizeCtl = (n << 1) - (n >>> 1);
return;
}
if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
// if this is not the last thread, just return
if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
return;
finishing = advance = true;
i = n; // recheck before commit: one final sweep over the whole old table
}
}
else if ((f = tabAt(tab, i)) == null)
advance = casTabAt(tab, i, null, fwd);
else if ((fh = f.hash) == MOVED)
advance = true; // already processed
else {
synchronized (f) {
if (tabAt(tab, i) == f) {
Node<K,V> ln, hn;
if (fh >= 0) {
// fh & n is either 0 (node stays at index i) or n (node moves to index i + n)
int runBit = fh & n;
Node<K,V> lastRun = f;
for (Node<K,V> p = f.next; p != null; p = p.next) {
int b = p.hash & n;
// lastRun ends at the final node where (hash & n) changes; runBit holds that node's bit
if (b != runBit) {
runBit = b;
lastRun = p;
}
}
// every node after lastRun has the same (hash & n) as lastRun,
// so that tail can be reused as-is when building the low/high lists
if (runBit == 0) {
ln = lastRun;
hn = null;
}
else {
hn = lastRun;
ln = null;
}
// re-traverse the nodes before lastRun, prepending each onto ln or hn
for (Node<K,V> p = f; p != lastRun; p = p.next) {
int ph = p.hash; K pk = p.key; V pv = p.val;
if ((ph & n) == 0)
// prepend: the new node's next is the current ln, then ln points to it
ln = new Node<K,V>(ph, pk, pv, ln);
else
// prepend: the new node's next is the current hn, then hn points to it
hn = new Node<K,V>(ph, pk, pv, hn);
}
setTabAt(nextTab, i, ln); // ln goes to slot i of the new table (same index as in the old table)
setTabAt(nextTab, i + n, hn); // hn goes to slot i + n of the new table
setTabAt(tab, i, fwd); // mark the old bin as moved
advance = true;
}
else if (f instanceof TreeBin) {
// split the red-black tree into two halves for slots i and i + n of nextTable, untreeifying a half that shrinks below the threshold
TreeBin<K,V> t = (TreeBin<K,V>)f;
TreeNode<K,V> lo = null, loTail = null;
TreeNode<K,V> hi = null, hiTail = null;
int lc = 0, hc = 0;
for (Node<K,V> e = t.first; e != null; e = e.next) {
int h = e.hash;
TreeNode<K,V> p = new TreeNode<K,V>
(h, e.key, e.val, null, null);
if ((h & n) == 0) {
if ((p.prev = loTail) == null)
lo = p;
else
loTail.next = p;
loTail = p;
++lc;
}
else {
if ((p.prev = hiTail) == null)
hi = p;
else
hiTail.next = p;
hiTail = p;
++hc;
}
}
// <= 6 nodes: untreeify back to a linked list
ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
(hc != 0) ? new TreeBin<K,V>(lo) : t;
hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
(lc != 0) ? new TreeBin<K,V>(hi) : t;
setTabAt(nextTab, i, ln);
setTabAt(nextTab, i + n, hn);
setTabAt(tab, i, fwd);
advance = true;
}
}
}
}
}
}
When multiple threads transfer concurrently, the first thread to start is responsible for initializing nextTable. Threads then scan the current table from slot n-1 down toward slot 0, each claiming one range at a time (16 slots by default) by CASing the shared transferIndex. Once transferIndex reaches 0, no new ranges can be claimed and threads begin to exit, each decrementing the total thread count in sizeCtl. The last thread to exit resets the scan index i back to the top and is forced to make one final full sweep before committing the new table.
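Because capacities are powers of two, a node at old index i = hash & (n-1) lands at either i or i + n in the doubled table, decided solely by the bit hash & n — that is why transfer only needs the two lists ln and hn. A small sketch (the helper index() is illustrative, not a JDK method):

```java
public class SplitDemo {
    // index of hash h in a table of capacity cap (cap must be a power of two)
    static int index(int h, int cap) {
        return h & (cap - 1);
    }

    public static void main(String[] args) {
        int n = 16; // old capacity; new capacity is 2n = 32
        // 5, 21, 37, 53 all share old bin 5; the bit (h & 16) splits them
        for (int h : new int[]{5, 21, 37, 53}) {
            int oldIdx = index(h, n);
            int newIdx = index(h, 2 * n);
            // (h & n) == 0 keeps the node at oldIdx; otherwise it moves to oldIdx + n
            System.out.println(h + ": old=" + oldIdx + ", new=" + newIdx
                    + ", moved=" + ((h & n) != 0));
        }
    }
}
```

Nodes with (h & n) == 0 stay at index 5; the others move to 5 + 16 = 21.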
Summary
The source runs to over 6,000 lines; a full treatment would take far more than this.
ConcurrentHashMap is a model concurrent container: it minimizes lock granularity to maximize throughput.
The ideas and techniques it draws on are many: CAS, volatile, synchronized, red-black trees, and a Fork/Join-style division of the transfer work among threads…
It is thread-safe, but under high concurrency reads and writes only reflect the state at some point in time, and its iterators offer only weak consistency (when the map is updated while being iterated). Concurrency performance and data consistency are usually a trade-off; Hashtable, by contrast, provides strong consistency.
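The weak consistency mentioned above can be observed directly: a ConcurrentHashMap iterator tolerates modification during iteration, while HashMap's fail-fast iterator throws (a small sketch):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeakConsistencyDemo {
    public static void main(String[] args) {
        Map<Integer, Integer> chm = new ConcurrentHashMap<>();
        for (int i = 0; i < 8; i++) chm.put(i, i);
        // Weakly consistent iterator: mutating the map mid-iteration never throws
        for (Integer k : chm.keySet()) {
            chm.remove(k);
        }
        System.out.println("CHM size after removal during iteration: " + chm.size()); // 0

        Map<Integer, Integer> hm = new HashMap<>();
        for (int i = 0; i < 8; i++) hm.put(i, i);
        try {
            for (Integer k : hm.keySet()) {
                hm.remove(k); // fail-fast iterator detects the structural change
            }
        } catch (ConcurrentModificationException e) {
            System.out.println("HashMap iterator is fail-fast");
        }
    }
}
```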
Finally, a look at the overview from the class-level source comment:
* The primary design goal of this hash table is to maintain
* concurrent readability (typically method get(), but also
* iterators and related methods) while minimizing update
* contention. Secondary goals are to keep space consumption about
* the same or better than java.util.HashMap, and to support high
* initial insertion rates on an empty table by many threads.
* In short: concurrent readability, minimal update contention, iterator support, comparable space consumption, and fast concurrent insertion into an empty table.
To be continued…
Keep going — the future is coming!
References:
https://blog.csdn.net/Unknownfuture/article/details/105350537