(segments, (j << SSHIFT) + SBASE)) == null) // in ensureSegment @3
s = ensureSegment(j); //@4
return s.put(key, hash, value, false); //@5
}
Code @1 shows that ConcurrentHashMap does not support null values: a put with a null value throws NullPointerException.
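This restriction is easy to observe with the public API; a minimal demo (the class name is mine):

```java
import java.util.concurrent.ConcurrentHashMap;

// Demonstrates that ConcurrentHashMap rejects null values at put time.
public class NullValueDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put("key", null); // the null check in code @1 fires here
            System.out.println("unexpected: null value accepted");
        } catch (NullPointerException expected) {
            System.out.println("null value rejected");
        }
    }
}
```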
Code @2 computes the position (array index) of the Segment this key maps to. The java.util.concurrent package reads array elements through UNSAFE, operating on memory directly, rather than the plain indexing you would write as Segment[] a = new Segment[16]; a[j]. For a detailed look at how UNSAFE addresses array elements, see my other post analyzing the AtomicIntegerArray source code.
For example, in an int[] every element is 32 bits, i.e. 4 bytes, so the third element (index 2) starts at byte offset 2 << 2 = 8 from the array base; in other words, the SHIFT value is the base-2 logarithm of the element size. The per-element size in bytes is obtained from UNSAFE.arrayIndexScale,
while UNSAFE.arrayBaseOffset returns the offset of the first element relative to the start of the array object. For a detailed explanation of this part, see my post at http://blog.csdn.net/prestigeding/article/details/52980801.
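The offset arithmetic can be sketched without touching Unsafe at all: the shift is just log2 of the element size. In this sketch the class and method names are mine, and the base offset of 16 is a typical 64-bit HotSpot value used purely for illustration, not a guarantee:

```java
// Sketch of how an array element's byte offset is computed:
// offset = arrayBaseOffset + (index << shift), where shift = log2(arrayIndexScale).
public class ArrayOffsetDemo {
    static long elementOffset(long baseOffset, int scale, int index) {
        int shift = 31 - Integer.numberOfLeadingZeros(scale); // log2 of element size
        return baseOffset + ((long) index << shift);
    }

    public static void main(String[] args) {
        long base = 16; // hypothetical arrayBaseOffset for int[] on 64-bit HotSpot
        int scale = 4;  // bytes per int
        // element at index 2 (the third element) starts 8 bytes past the base
        System.out.println(elementOffset(base, scale, 2)); // 24
    }
}
```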
Code @3 fetches the Segment at index j; it is equivalent to if ((s = segments[j]) == null).
Code @4: let's turn to the ensureSegment method:
/**
 * Returns the segment for the given index, creating it and
 * recording in segment table (via CAS) if not already present.
 *
 * @param k the index
 * @return the segment
 */
@SuppressWarnings("unchecked")
private Segment<K,V> ensureSegment(int k) {
final Segment<K,V>[] ss = this.segments;
long u = (k << SSHIFT) + SBASE; // raw offset
Segment<K,V> seg;
if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
Segment<K,V> proto = ss[0]; // use segment 0 as prototype
int cap = proto.table.length;
float lf = proto.loadFactor;
int threshold = (int)(cap * lf);
HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
== null) { // recheck
Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
== null) {
if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
break;
}
}
}
return seg;
}
This method ensures that the Segment at slot k is non-null; if it is null, it creates one (using segment 0 as a prototype for capacity and load factor) and installs it with CAS.
Code @5: once code @4 has initialized the segment at slot k, the key-value pair is inserted into that segment. Next, let's look closely at Segment's put method:
final V put(K key, int hash, V value, boolean onlyIfAbsent) {
HashEntry<K,V> node = tryLock() ? null :
scanAndLockForPut(key, hash, value); // @1
V oldValue;
try {
HashEntry<K,V>[] tab = table;
int index = (tab.length - 1) & hash;
HashEntry<K,V> first = entryAt(tab, index); // @2
for (HashEntry<K,V> e = first;;) { // @3
if (e != null) { // @4
K k;
if ((k = e.key) == key ||
(e.hash == hash && key.equals(k))) { //@5
oldValue = e.value;
if (!onlyIfAbsent) {
e.value = value;
++modCount;
}
break;
}
e = e.next;
}
else { //@6
if (node != null)
node.setNext(first);
else
node = new HashEntry<K,V>(hash, key, value, first);
int c = count + 1;
if (c > threshold && tab.length < MAXIMUM_CAPACITY)
rehash(node);
else
setEntryAt(tab, index, node);
++modCount;
count = c;
oldValue = null;
break;
}
}
} finally {
unlock();
}
return oldValue;
}
The idea of this method is the same as HashMap's: insert a new node at the computed index of the Segment's HashEntry[] table. If the slot at that index is not empty, a hash collision has occurred, so first traverse the whole chain looking for an equal key; if one is found, replace its value; if not, point the new node's next at table[index] and install the node at that index. However, because ConcurrentHashMap supports concurrent access, any operation on a single Segment must hold that segment's lock.
Code @1: first try to acquire the lock; if it succeeds, continue with the insertion; the failure path is analyzed in detail below.
Code @2: compute the index in table[] for this key and read the first node there. Depending on whether that slot is empty, there are two cases: if it is empty there is no collision and execution takes the @6 branch, which installs the newly created node at table[index], first checking whether a rehash is needed (in ConcurrentHashMap, this is decided by comparing the new count against the threshold).
Code @4: loop over the chain at table[index] checking whether any node's key equals the key being inserted; if so, simply replace the value. Note that everything from @3 onward, i.e. essentially the entire put body, runs under the lock.
The above should be easy to follow, so let's now focus on two methods: scanAndLockForPut and rehash (one may wonder whether the latter is the same as HashMap's; it should be).
/**
 * Scans for a node containing given key while trying to
 * acquire lock, creating and returning one if not found. Upon
 * return, guarantees that lock is held. Unlike in most
 * methods, calls to method equals are not screened: Since
 * traversal speed doesn't matter, we might as well help warm
 * up the associated code and accesses as well.
 *
 * @return a new node if key not found, else null
 */
private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
HashEntry<K,V> first = entryForHash(this, hash);
HashEntry<K,V> e = first;
HashEntry<K,V> node = null;
int retries = -1; // negative while locating node
while (!tryLock()) {
HashEntry<K,V> f; // to recheck first below
if (retries < 0) {
if (e == null) {
if (node == null) // speculatively create node
node = new HashEntry<K,V>(hash, key, value, null);
retries = 0;
}
else if (key.equals(e.key))
retries = 0;
else
e = e.next;
}
else if (++retries > MAX_SCAN_RETRIES) {
lock();
break;
}
else if ((retries & 1) == 0 &&
(f = entryForHash(this, hash)) != first) {
e = first = f; // re-traverse if entry changed
retries = -1;
}
}
return node;
}
When the lock cannot be acquired immediately, the thread does not block right away. It optimistically assumes that whoever holds the lock is working on an unrelated key, and keeps doing useful work while spinning on tryLock (scanning the chain for the key, and speculatively creating the node if the key is absent). If the number of retries exceeds MAX_SCAN_RETRIES, the thread gives up spinning and blocks in lock() for performance reasons.
Also note: every other iteration, the method checks whether the first node at this hash's slot in the Segment's HashEntry[] table has changed; if it has, the retry counter is reset to -1 and the scan restarts while still trying to acquire the lock. If the thread does end up blocked in lock(), then once the lock is acquired it returns to final V put(K key, int hash, V value, boolean onlyIfAbsent) and performs the normal insertion (similar to HashMap).
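Stripped of the scanning work, the locking discipline reduces to a spin-then-block pattern. This is an illustrative sketch, not JDK code; the class and method names are mine, and the retry constant is merely representative (the JDK picks MAX_SCAN_RETRIES based on the CPU count):

```java
import java.util.concurrent.locks.ReentrantLock;

// Spin on tryLock a bounded number of times, then fall back to a blocking lock().
// On every exit path the lock is held, mirroring scanAndLockForPut's guarantee.
public class SpinThenBlock {
    static final int MAX_SCAN_RETRIES = 64; // illustrative bound

    static void acquire(ReentrantLock lock) {
        int retries = 0;
        while (!lock.tryLock()) {
            if (++retries > MAX_SCAN_RETRIES) {
                lock.lock(); // stop spinning and park until the lock is free
                break;
            }
        }
        // the lock is held here on every path
    }
}
```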
Next, let's focus on the rehash branch of put: when the segment's element count exceeds the threshold and the table is still below the maximum allowed capacity, a rehash is performed. Here is the rehash source:
/**
 * Doubles size of table and repacks entries, also adding the
 * given node to new table
 */
@SuppressWarnings("unchecked")
private void rehash(HashEntry<K,V> node) {
/*
 * Reclassify nodes in each list to new table. Because we
 * are using power-of-two expansion, the elements from
 * each bin must either stay at same index, or move with a
 * power of two offset. We eliminate unnecessary node
 * creation by catching cases where old nodes can be
 * reused because their next fields won't change.
 * Statistically, at the default threshold, only about
 * one-sixth of them need cloning when a table
 * doubles. The nodes they replace will be garbage
 * collectable as soon as they are no longer referenced by
 * any reader thread that may be in the midst of
 * concurrently traversing table. Entry accesses use plain
 * array indexing because they are followed by volatile
 * table write.
 */
HashEntry<K,V>[] oldTable = table;
int oldCapacity = oldTable.length;
int newCapacity = oldCapacity << 1;
threshold = (int)(newCapacity * loadFactor);
HashEntry<K,V>[] newTable =
(HashEntry<K,V>[]) new HashEntry[newCapacity];
int sizeMask = newCapacity - 1;
for (int i = 0; i < oldCapacity ; i++) {
HashEntry<K,V> e = oldTable[i];
if (e != null) {
HashEntry<K,V> next = e.next;
int idx = e.hash & sizeMask;
if (next == null) // Single node on list
newTable[idx] = e;
else { // Reuse consecutive sequence at same slot
HashEntry<K,V> lastRun = e;
int lastIdx = idx;
for (HashEntry<K,V> last = next;
last != null;
last = last.next) {
int k = last.hash & sizeMask;
if (k != lastIdx) {
lastIdx = k;
lastRun = last;
}
}
newTable[lastIdx] = lastRun;
// Clone remaining nodes
for (HashEntry<K,V> p = e; p != lastRun; p = p.next) {
V v = p.value;
int h = p.hash;
int k = h & sizeMask;
HashEntry<K,V> n = newTable[k];
newTable[k] = new HashEntry<K,V>(h, p.key, v, n);
}
}
}
}
int nodeIndex = node.hash & sizeMask; // add the new node
node.setNext(newTable[nodeIndex]);
newTable[nodeIndex] = node;
table = newTable;
}
Once HashMap's rehash is understood, this method should be easy to follow, so I won't repeat the explanation.
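The one part worth isolating is the "lastRun" reuse trick: because the capacity doubles, each node either keeps its index or moves by exactly oldCapacity, so the longest tail of a chain whose nodes all map to one new index can be relinked as a whole, and only the nodes in front of it need cloning. A standalone sketch of just that scan (the types and names here are mine, not the JDK's):

```java
// Finds the start of the longest tail suffix of a bucket chain whose nodes
// all land in the same slot of the doubled table; rehash moves that suffix
// with one pointer assignment and clones only the nodes before it.
public class LastRunDemo {
    static class Node {
        final int hash;
        Node next;
        Node(int hash, Node next) { this.hash = hash; this.next = next; }
    }

    static Node lastRun(Node head, int sizeMask) {
        Node lastRun = head;
        int lastIdx = head.hash & sizeMask;
        for (Node p = head.next; p != null; p = p.next) {
            int k = p.hash & sizeMask;
            if (k != lastIdx) { // index changed: the reusable suffix starts later
                lastIdx = k;
                lastRun = p;
            }
        }
        return lastRun;
    }

    public static void main(String[] args) {
        // chain 1 -> 9 -> 5 -> 13; with sizeMask 7 the new indices are 1, 1, 5, 5,
        // so the suffix starting at the node with hash 5 can be reused as-is
        Node chain = new Node(1, new Node(9, new Node(5, new Node(13, null))));
        System.out.println(lastRun(chain, 7).hash); // 5
    }
}
```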
2.2.2 public V putIfAbsent(K key, V value)
The semantics of this method: if the key already exists, return the value currently associated with it without modifying the map; if the key does not exist, insert the pair and return null. The whole check-and-insert step is atomic.
public V putIfAbsent(K key, V value) {
Segment<K,V> s;
if (value == null)
throw new NullPointerException();
int hash = hash(key);
int j = (hash >>> segmentShift) & segmentMask;
if ((s = (Segment<K,V>)UNSAFE.getObject
(segments, (j << SSHIFT) + SBASE)) == null)
s = ensureSegment(j);
return s.put(key, hash, value, true);
}
This method is nearly identical to put; the only difference is the final onlyIfAbsent argument. When the key already exists, put overwrites the old value (and returns it), whereas putIfAbsent leaves the existing value untouched and simply returns it.
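The behavioral difference is easy to see with the real API:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// put replaces an existing mapping; putIfAbsent leaves it alone.
// Both return the previous value, or null if the key was absent.
public class PutIfAbsentDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        System.out.println(map.putIfAbsent("a", 1)); // null: absent, so 1 is inserted
        System.out.println(map.putIfAbsent("a", 2)); // 1: present, value NOT replaced
        System.out.println(map.put("a", 3));         // 1: old value returned, replaced by 3
        System.out.println(map.get("a"));            // 3
    }
}
```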
2.2.3 public void putAll(Map m)
public void putAll(Map<? extends K, ? extends V> m) {
for (Map.Entry<? extends K, ? extends V> e : m.entrySet())
put(e.getKey(), e.getValue());
}
This simply iterates over the entries of the Map passed in and calls put for each one. Note that while each individual put is thread-safe, putAll as a whole is not atomic.
Having looked at the put family, let's turn to get and see how the read path is implemented:
2.2.4 public V get(Object key) source code analysis
/**
 * Returns the value to which the specified key is mapped,
 * or {@code null} if this map contains no mapping for the key.
 *
 * <p>More formally, if this map contains a mapping from a key
 * {@code k} to a value {@code v} such that {@code key.equals(k)},
 * then this method returns {@code v}; otherwise it returns
 * {@code null}.  (There can be at most one such mapping.)
 *
 * @throws NullPointerException if the specified key is null
 */
public V get(Object key) {
Segment<K,V> s; // manually integrate access methods to reduce overhead
HashEntry<K,V>[] tab;
int h = hash(key);
long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
(tab = s.table) != null) {
for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
(tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
e != null; e = e.next) {
K k;
if ((k = e.key) == key || (e.hash == h && key.equals(k)))
return e.value;
}
}
return null;
}
As shown above, get takes no lock. It computes the Segment slot from the key's hash, but it does not read the Segment, nor the element in the Segment's HashEntry[] table, with plain array indexing; instead it uses UNSAFE.getObjectVolatile to read memory directly with volatile semantics, giving the strongest visibility guarantee available without locking. One might ask: why doesn't get take a read lock to block concurrent writers? There would be little point. ConcurrentHashMap is a data container offering basic put and get operations, and get does not change the map's internal structure. If the current thread reads a key's value and another thread then removes that key, that is a perfectly normal occurrence at the business level. So get needs no lock.
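A quick illustration of the lock-free read contract: get simply reflects whatever mapping is visible at the moment of the call, and never throws because of a concurrent modification:

```java
import java.util.concurrent.ConcurrentHashMap;

// get never blocks writers: it returns the current value, or null once the
// key is removed, without throwing even while the map is being modified.
public class GetDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("k", 1);
        System.out.println(map.get("k")); // 1
        map.remove("k");                  // another thread could do this concurrently
        System.out.println(map.get("k")); // null
    }
}
```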
2.3 Browsing the source, methods such as replace and remove are internally similar to their HashMap counterparts, because a Segment is essentially a HashMap with a lock. So the next question is: put, replace, and remove scale well under concurrency because they lock only one segment at a time, but what about methods that compute a global property of the map, such as keys() and size()? How do they perform? Let's turn to size, keys, and the other traversal methods.
2.3.1 public int size() method
/**
 * Returns the number of key-value mappings in this map.  If the
 * map contains more than Integer.MAX_VALUE elements, returns
 * Integer.MAX_VALUE.
 *
 * @return the number of key-value mappings in this map
 */
public int size() {
// Try a few times to get accurate count. On failure due to
// continuous async changes in table, resort to locking.
final Segment<K,V>[] segments = this.segments;
int size;
boolean overflow; // true if size overflows 32 bits
long sum; // sum of modCounts
long last = 0L; // previous sum
int retries = -1; // first iteration isn't retry
try {
for (;;) {
if (retries++ == RETRIES_BEFORE_LOCK) {
for (int j = 0; j < segments.length; ++j)
ensureSegment(j).lock(); // force creation
}
sum = 0L;
size = 0;
overflow = false;
for (int j = 0; j < segments.length; ++j) {
Segment<K,V> seg = segmentAt(segments, j);
if (seg != null) {
sum += seg.modCount;
int c = seg.count;
if (c < 0 || (size += c) < 0)
overflow = true;
}
}
if (sum == last)
break;
last = sum;
}
} finally {
if (retries > RETRIES_BEFORE_LOCK) {
for (int j = 0; j < segments.length; ++j)
segmentAt(segments, j).unlock();
}
}
return overflow ? Integer.MAX_VALUE : size;
}
The core idea: as explained above, each Segment is effectively a HashMap, which maintains two fields. modCount records how many times the structure has changed: putting a key not already present, remove, clear, and so on each increment it; in other words, every operation that can affect size also bumps modCount. count records the number of key-value pairs in the segment. If the structure never changed, ConcurrentHashMap's size would simply be the sum of every Segment's count. In reality, other threads may modify segments while the sums are being taken, making the result inaccurate. So the method first tries the summation optimistically, making at most three unlocked passes (and at least two): if two consecutive passes produce the same total modCount, no other thread changed the structure during the computation, and the size can be returned directly. Otherwise it acquires the lock of every Segment in turn, sums the counts while holding all the locks, releases them, and returns the size.
Code @1: once the retry counter reaches RETRIES_BEFORE_LOCK (default 2, i.e. on the third pass), locking is required to compute the size.
Code @2: sum the counts across the Segments.
Code @3: the check that two consecutive passes computed the same modCount sum; if they did, the size value is correct; otherwise keep retrying, eventually falling back to acquiring the locks.
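The retry-then-lock strategy can be isolated into a small sketch. The Seg class and its field layout here are mine; only RETRIES_BEFORE_LOCK matches the JDK's name and default value of 2:

```java
import java.util.concurrent.locks.ReentrantLock;

// Optimistically sum per-segment counts; if the modCount total is identical
// across two consecutive passes, no structural change happened in between and
// the sum is trusted. After RETRIES_BEFORE_LOCK failed passes, lock everything.
public class SizeSketch {
    static class Seg extends ReentrantLock {
        int count;    // entries in this segment
        int modCount; // structural modifications so far
    }

    static final int RETRIES_BEFORE_LOCK = 2;

    static int size(Seg[] segments) {
        int size = 0;
        long last = 0L;
        int retries = -1;
        try {
            for (;;) {
                if (retries++ == RETRIES_BEFORE_LOCK)
                    for (Seg s : segments) s.lock(); // give up on optimism
                long sum = 0L;
                size = 0;
                for (Seg s : segments) {
                    sum += s.modCount;
                    size += s.count;
                }
                if (sum == last) // stable across two passes (or computed under locks)
                    break;
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK)
                for (Seg s : segments) s.unlock();
        }
        return size;
    }
}
```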
2.3.2 public boolean isEmpty() method source code analysis
/**
 * Returns true if this map contains no key-value mappings.
 *
 * @return true if this map contains no key-value mappings
 */
public boolean isEmpty() {
/*
 * Sum per-segment modCounts to avoid mis-reporting when
 * elements are concurrently added and removed in one segment
 * while checking another, in which case the table was never
 * actually empty at any point. (The sum ensures accuracy up
 * through at least 1<<31 per-segment modifications before
 * recheck.) Methods size() and containsValue() use similar
 * constructions for stability checks.
 */
long sum = 0L;
final Segment<K,V>[] segments = this.segments;
for (int j = 0; j < segments.length; ++j) {
Segment<K,V> seg = segmentAt(segments, j);
if (seg != null) {
if (seg.count != 0)
return false;
sum += seg.modCount;
}
}
if (sum != 0L) { // recheck unless no modifications
for (int j = 0; j < segments.length; ++j) {
Segment<K,V> seg = segmentAt(segments, j);
if (seg != null) {
if (seg.count != 0)
return false;
sum -= seg.modCount;
}
}
if (sum != 0L)
return false;
}
return true;
}
The core idea: traverse all segments; as soon as one with a non-zero count is found, return false. If every segment's count is 0, traverse a second time; if the modCount sums of the two passes match, return true, otherwise return false.
Now take a look at the following methods:
2.3.3 public boolean containsKey(Object key)
/**
 * Tests if the specified object is a key in this table.
 *
 * @param key possible key
 * @return true if and only if the specified object
 *         is a key in this table, as determined by the
 *         <tt>equals</tt> method; <tt>false</tt> otherwise.
 * @throws NullPointerException if the specified key is null
 */
@SuppressWarnings("unchecked")
public boolean containsKey(Object key) {
Segment<K,V> s; // same as get() except no need for volatile value read
HashEntry<K,V>[] tab;
int h = hash(key);
long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
(tab = s.table) != null) {
for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
(tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
e != null; e = e.next) {
K k;
if ((k = e.key) == key || (e.hash == h && key.equals(k)))
return true;
}
}
return false;
}
2.3.4 public boolean containsValue(Object value)
/**
-
Returns true if this map maps one or more keys to the
-
specified value. Note: This method requires a full internal
-
traversal of the hash table, and so is much slower than
-
method containsKey.
-
@param value value whose presence in this map is to be tested
-
@return true if this map maps one or more keys to the
-
specified value
-
@throws NullPointerException if the specified value is null
*/
public boolean containsValue(Object value) {
// Same idea as size()
if (value == null)
throw new NullPointerException();
final Segment<K,V>[] segments = this.segments;
boolean found = false;
long last = 0;
int retries = -1;
try {
outer: for (;;) {
if (retries++ == RETRIES_BEFORE_LOCK) {
for (int j = 0; j < segments.length; ++j)
ensureSegment(j).lock(); // force creation
}
long hashSum = 0L;
int sum = 0;
for (int j = 0; j < segments.length; ++j) {
HashEntry<K,V>[] tab;
Segment<K,V> seg = segmentAt(segments, j);
if (seg != null && (tab = seg.table) != null) {
for (int i = 0 ; i < tab.length; i++) {
HashEntry<K,V> e;
for (e = entryAt(tab, i); e != null; e = e.next) {
V v = e.value;
if (v != null && value.equals(v)) {
found = true;
break outer;
}
}
}
sum += seg.modCount;
}
}
if (retries > 0 && sum == last)
break;
last = sum;
}
} finally {
if (retries > RETRIES_BEFORE_LOCK) {
for (int j = 0; j < segments.length; ++j)
segmentAt(segments, j).unlock();
}
}
return found;
}
2.3.5 public Set<Map.Entry<K,V>> entrySet(): the entry traversal method.
public Set<Map.Entry<K,V>> entrySet() {
Set<Map.Entry<K,V>> es = entrySet;
return (es != null) ? es : (entrySet = new EntrySet());
}
final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
public Iterator<Map.Entry<K,V>> iterator() {
return new EntryIterator();
}
public boolean contains(Object o) {
if (!(o instanceof Map.Entry))
return false;
Map.Entry<?,?> e = (Map.Entry<?,?>)o;
V v = ConcurrentHashMap.this.get(e.getKey());
return v != null && v.equals(e.getValue());
}
public boolean remove(Object o) {
if (!(o instanceof Map.Entry))
return false;
Map.Entry<?,?> e = (Map.Entry<?,?>)o;
return ConcurrentHashMap.this.remove(e.getKey(), e.getValue());
}
public int size() {
return ConcurrentHashMap.this.size();
}
public boolean isEmpty() {
return ConcurrentHashMap.this.isEmpty();
}
public void clear() {
ConcurrentHashMap.this.clear();
}
}
final class EntryIterator
extends HashIterator
implements Iterator<Entry<K,V>>
{
public Map.Entry<K,V> next() {
HashEntry<K,V> e = super.nextEntry();
return new WriteThroughEntry(e.key, e.value);
}
}
abstract class HashIterator {
int nextSegmentIndex;
int nextTableIndex;
HashEntry<K,V>[] currentTable;
HashEntry<K, V> nextEntry;
HashEntry<K, V> lastReturned;
HashIterator() {
nextSegmentIndex = segments.length - 1;
nextTableIndex = -1;
advance();
}
// ... remaining HashIterator methods (advance, nextEntry, hasNext, remove) omitted
}