CLH Locks, MCS Locks, and Spin Locks: Analysis and Examples

探索未知的自己

Categories: java, performance optimization

Article link: https://blog.csdn.net/u013380694/article/details/82911774

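I.

1. SMP (Symmetric Multi-Processor)

SMP (Symmetric Multi-Processing) is a symmetric multiprocessor architecture in which the CPUs in a server work as peers and every CPU takes the same time to access any memory address. Its defining characteristic is sharing: the CPUs, memory, I/O, and other resources are all shared.

SMP guarantees memory consistency, but the shared resources can easily become a performance bottleneck. As the number of CPUs grows, every CPU contends for the same memory, which can cause memory access conflicts and waste CPU cycles. Ordinary PCs use this architecture.

2. NUMA (Non-Uniform Memory Access)

In NUMA (non-uniform memory access), the CPUs are grouped into modules. Each module consists of several CPUs and has its own local memory, I/O slots, and so on. Modules reach one another through an interconnect, and accessing local memory is far faster than accessing remote memory (memory on another node in the system), which is where the name comes from. NUMA handles SMP's scaling problem reasonably well, but because remote memory latency far exceeds local memory latency, system performance still does not increase linearly as CPUs are added.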

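II.

1. CLH lock

The CLH (Craig, Landin, and Hagersten) lock is a spin lock that guarantees freedom from starvation and provides first-come, first-served fairness.

The CLH lock is a scalable, high-performance, fair spin lock based on a linked list. A thread requesting the lock spins only on a variable it holds a local reference to: it repeatedly polls its predecessor's state and stops spinning as soon as it sees that the predecessor has released the lock.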
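For contrast, a plain spin lock without a queue can be sketched as below. This test-and-test-and-set lock is not part of the original article; the class name SimpleSpinLock and its methods are illustrative assumptions. Every waiter spins on the same shared flag and there is no ordering among waiters, which are the two weaknesses that queue-based locks such as CLH address.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative baseline (not from the article): a test-and-test-and-set spin lock.
class SimpleSpinLock {
    // true while some thread holds the lock
    private final AtomicBoolean held = new AtomicBoolean(false);

    // Single attempt: succeeds only if the flag flips from false to true.
    boolean tryLock() {
        return held.compareAndSet(false, true);
    }

    void lock() {
        while (true) {
            // Spin on a plain read first, so waiters mostly read their cached copy.
            while (held.get()) {
                Thread.yield(); // assumption: yield so the sketch behaves on few cores
            }
            // The flag looked free; try to take it atomically.
            if (tryLock()) {
                return;
            }
        }
    }

    void unlock() {
        held.set(false);
    }
}
```

Nothing here decides which waiter wins once the flag is cleared, so a thread can be overtaken indefinitely; the queue in CLH is what removes that possibility.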

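To acquire the lock, a thread:

a. Creates a QNode and sets its locked field to true, indicating that it wants the lock.

b. Calls getAndSet on the tail field, which makes its node the tail of the queue and returns a reference to its predecessor node, myPred.

c. Spins on the predecessor node's locked field until the predecessor releases the lock.

To release the lock, a thread:

d. Sets the locked field of its own node to false and recycles the predecessor node for its next acquisition.

For example, thread A wants the lock: the locked field of its myNode is true and tail points to A's node. Thread B then joins the queue behind A, and tail points to B's node. A and B each spin on the locked field of their own myPred node, and a thread acquires the lock as soon as that field becomes false. Here the locked field of A's myPred node is already false, so A acquires the lock.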
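The queue state in this example can be replayed in a single thread. The sketch below is only a trace of the field values, not a working lock; ClhTrace, Node, and the variable names are hypothetical, and the two "threads" are simulated by sequential calls.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative single-threaded trace of the CLH queue state (names are hypothetical).
class ClhTrace {
    static class Node {
        volatile boolean locked;
    }

    static String trace() {
        // The queue starts with a dummy node whose locked flag is false.
        Node dummy = new Node();
        AtomicReference<Node> tail = new AtomicReference<>(dummy);

        // "Thread A" enqueues: its predecessor is the dummy node.
        Node a = new Node();
        a.locked = true;
        Node predOfA = tail.getAndSet(a);

        // "Thread B" enqueues behind A: its predecessor is A's node.
        Node b = new Node();
        b.locked = true;
        Node predOfB = tail.getAndSet(b);

        StringBuilder out = new StringBuilder();
        out.append("A may enter: ").append(!predOfA.locked).append('\n');
        out.append("B may enter: ").append(!predOfB.locked).append('\n');

        // A releases the lock by clearing its own flag, which is the flag B watches.
        a.locked = false;
        out.append("B may enter after A unlocks: ").append(!predOfB.locked).append('\n');
        return out.toString();
    }
}
```

Calling ClhTrace.trace() reports that A may enter immediately, B may not, and B may enter once A clears its flag.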

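2. CLH code example

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;

public class CLHLock implements Lock {
    // Queue node; locked is volatile so that spinning threads observe the release.
    static class QNode {
        volatile boolean locked = false;
    }

    // Tail of the implicit queue; starts as a dummy node whose locked flag is false.
    private final AtomicReference<QNode> tail = new AtomicReference<QNode>(new QNode());
    // Each thread's predecessor node (null until the first lock() call).
    private final ThreadLocal<QNode> myPred = new ThreadLocal<QNode>();
    // Each thread's own node.
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

    @Override
    public void lock() {
        QNode qnode = myNode.get();
        qnode.locked = true;                 // announce that we want the lock
        QNode pred = tail.getAndSet(qnode);  // become the tail, learn our predecessor
        myPred.set(pred);
        while (pred.locked) {
            // spin until the predecessor releases the lock
        }
    }

    @Override
    public void unlock() {
        QNode qnode = myNode.get();
        qnode.locked = false;                // hand the lock to our successor
        // Reuse the predecessor's node next time; a successor may still be reading ours.
        myNode.set(myPred.get());
    }

    // The remaining Lock methods are outside the scope of this example.
    @Override
    public void lockInterruptibly() throws InterruptedException {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean tryLock() {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean tryLock(long time, TimeUnit unit) throws InterruptedException {
        throw new UnsupportedOperationException();
    }

    @Override
    public Condition newCondition() {
        throw new UnsupportedOperationException();
    }
}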

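3. CLH analysis

The advantage of the CLH queue lock is its low space complexity: with n threads and L locks, where each thread acquires at most one lock at a time, the required storage is O(L+n), because the n threads have n myNode nodes and the L locks have L tail fields. A variant of CLH is used in the Java concurrency framework.

CLH is very effective on SMP architectures. On NUMA architectures, however, each processor has its own local memory, and if the predecessor node lives in remote memory, spinning on the predecessor's locked field performs much worse. The MCS queue lock is one approach to solving this problem on NUMA architectures.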

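III.

1. MCS lock

The biggest difference between MCS and CLH is not whether the linked list is explicit or implicit, but where a thread spins: a CLH thread spins on its predecessor's locked field, whereas an MCS thread spins on the locked field of its own node. This is how MCS avoids the problem CLH has on NUMA architectures, where the locked field being polled may sit in distant memory.

The MCS queue lock works as follows:

a. When the queue is initialized it has no nodes, and tail=null.

b. Thread A wants the lock and places itself at the tail of the queue. Because it is the first node, its locked field is false.

c. Threads B and C join the queue in turn, so a->next=b and b->next=c. B and C have not acquired the lock and are waiting, so their locked fields are true, and the tail pointer points to C's node.

d. When thread A releases the lock, it follows its next pointer to thread B and sets B's locked field to false, which lets B acquire the lock.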
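Steps a through d can be replayed in a single thread to check the field values. This is a trace only, with the spin loops left out; McsTrace, Node, and enqueue are hypothetical names, and the three "threads" are simulated by sequential calls.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative single-threaded trace of the MCS queue state (names are hypothetical).
class McsTrace {
    static class Node {
        volatile boolean locked;
        volatile Node next;
    }

    // The enqueue part of lock(), without the spin on node.locked.
    static void enqueue(AtomicReference<Node> tail, Node node) {
        Node pred = tail.getAndSet(node);
        if (pred != null) {
            node.locked = true; // someone is ahead of us, so we must wait
            pred.next = node;   // link ourselves behind the predecessor
        }
    }

    static String trace() {
        AtomicReference<Node> tail = new AtomicReference<>(null); // step a: empty queue
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();

        enqueue(tail, a); // step b: A is first, so its locked flag stays false
        enqueue(tail, b); // step c: B and C queue up behind A
        enqueue(tail, c);

        StringBuilder out = new StringBuilder();
        out.append("locked A/B/C: ")
           .append(a.locked).append('/').append(b.locked).append('/').append(c.locked)
           .append('\n');
        out.append("links ok: ")
           .append(a.next == b && b.next == c && tail.get() == c)
           .append('\n');

        // step d: A releases by clearing the flag of its successor B.
        a.next.locked = false;
        out.append("B locked after A unlocks: ").append(b.locked).append('\n');
        return out.toString();
    }
}
```

Unlike the CLH case, the flag that changes on release belongs to the waiting thread's own node, which is the property that matters on NUMA hardware.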

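2. MCS code example

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;

public class MCSLock implements Lock {
    // Tail of the explicit queue; null means the lock is free and nobody is waiting.
    private final AtomicReference<QNode> tail = new AtomicReference<QNode>(null);
    // Each thread's own node, on which it spins.
    private final ThreadLocal<QNode> myNode = ThreadLocal.withInitial(QNode::new);

    @Override
    public void lock() {
        QNode qnode = myNode.get();
        QNode pred = tail.getAndSet(qnode);
        if (pred != null) {
            qnode.locked = true;
            pred.next = qnode;
            // wait until predecessor gives up the lock
            while (qnode.locked) {
            }
        }
    }

    @Override
    public void unlock() {
        QNode qnode = myNode.get();
        if (qnode.next == null) {
            // No visible successor: if we are still the tail, the queue becomes empty.
            if (tail.compareAndSet(qnode, null))
                return;
            // wait until the successor fills in our next field
            while (qnode.next == null) {
            }
        }
        qnode.next.locked = false; // pass the lock to the successor
        qnode.next = null;         // reset the node for reuse
    }

    // The remaining Lock methods are outside the scope of this example.
    @Override
    public void lockInterruptibly() throws InterruptedException {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean tryLock() {
        throw new UnsupportedOperationException();
    }

    @Override
    public boolean tryLock(long time, TimeUnit unit) throws InterruptedException {
        throw new UnsupportedOperationException();
    }

    @Override
    public Condition newCondition() {
        throw new UnsupportedOperationException();
    }
}

// Both fields are volatile because they are written by one thread and polled by another.
class QNode {
    volatile boolean locked = false;
    volatile QNode next = null;
}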
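One way to exercise a lock like the ones above is a small counter harness, sketched here under some assumptions. LockHarness and countWith are hypothetical names, and ReentrantLock is used only as a stand-in so that the sketch compiles on its own; any class that fully implements java.util.concurrent.locks.Lock could be passed instead.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical harness: several threads increment a shared counter under a Lock.
class LockHarness {
    static int countWith(Lock lock, int threads, int iterations) {
        final int[] counter = {0}; // deliberately unsynchronized; the lock protects it
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    lock.lock();
                    try {
                        counter[0]++; // critical section
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter[0];
    }

    // Stand-in run with ReentrantLock so the sketch is self-contained.
    static int demo() {
        return countWith(new ReentrantLock(), 4, 1000);
    }
}
```

If the lock provides mutual exclusion, the result equals threads times iterations. When a busy-waiting lock is passed in, it is probably sensible to keep the thread count at or below the number of cores, since a descheduled thread at the head of a FIFO queue stalls everyone behind it.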

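IV. Summary

The AbstractQueuedSynchronizer (AQS) class in the JDK 1.5 concurrency package implements a variant of this CLH lock. Its source comments describe the design as follows: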

To enqueue into a CLH lock, you atomically splice it in as new tail. To dequeue, you just set the head field.

     +------+  prev +-----+       +-----+
head |      | <---- |     | <---- |     |  tail
     +------+       +-----+       +-----+

Insertion into a CLH queue requires only a single atomic operation on "tail", so there is a simple atomic point of demarcation from unqueued to queued. Similarly, dequeuing involves only updating the "head". However, it takes a bit more work for nodes to determine who their successors are, in part to deal with possible cancellation due to timeouts and interrupts.

The "prev" links (not used in original CLH locks), are mainly needed to handle cancellation. If a node is cancelled, its successor is (normally) relinked to a non-cancelled predecessor. For explanation of similar mechanics in the case of spin locks, see the papers by Scott and Scherer at http://www.cs.rochester.edu/u/scott/synchronization/

We also use "next" links to implement blocking mechanics. The thread id for each node is kept in its own node, so a predecessor signals the next node to wake up by traversing next link to determine which thread it is. Determination of successor must avoid races with newly queued nodes to set the "next" fields of their predecessors. This is solved when necessary by checking backwards from the atomically updated "tail" when a node's successor appears to be null. (Or, said differently, the next-links are an optimization so that we don't usually need a backward scan.)

Cancellation introduces some conservatism to the basic algorithms. Since we must poll for cancellation of other nodes, we can miss noticing whether a cancelled node is ahead or behind us. This is dealt with by always unparking successors upon cancellation, allowing them to stabilize on a new predecessor, unless we can identify an uncancelled predecessor who will carry this responsibility.

CLH queues need a dummy header node to get started. But we don't create them on construction, because it would be wasted effort if there is never contention. Instead, the node is constructed and head and tail pointers are set upon first contention.

Threads waiting on Conditions use the same nodes, but use an additional link. Conditions only need to link nodes in simple (non-concurrent) linked queues because they are only accessed when exclusively held. Upon await, a node is inserted into a condition queue. Upon signal, the node is transferred to the main queue. A special value of status field is used to mark which queue a node is on.

Thanks go to Dave Dice, Mark Moir, Victor Luchangco, Bill Scherer and Michael Scott, along with members of JSR-166 expert group, for helpful ideas, discussions, and critiques on the design of this class.
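In application code the CLH-style queue is used indirectly, by subclassing AbstractQueuedSynchronizer. The sketch below is a minimal non-reentrant mutex in the spirit of the example in the AQS documentation; AqsMutex and Sync are illustrative names, and the encoding of state (0 for free, 1 for held) is an assumption of this sketch.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

// Illustrative non-reentrant mutex built on AQS; state 0 = free, 1 = held.
class AqsMutex {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int ignored) {
            // Fast path: one CAS on the state, no queue node is created.
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        @Override
        protected boolean tryRelease(int ignored) {
            if (getState() == 0) {
                throw new IllegalMonitorStateException();
            }
            setExclusiveOwnerThread(null);
            setState(0);
            return true; // tells AQS to wake the next queued thread, if any
        }

        @Override
        protected boolean isHeldExclusively() {
            return getState() == 1 && getExclusiveOwnerThread() == Thread.currentThread();
        }
    }

    private final Sync sync = new Sync();

    void lock() {
        sync.acquire(1); // enqueues and parks the thread if tryAcquire fails
    }

    boolean tryLock() {
        return sync.tryAcquire(1);
    }

    void unlock() {
        sync.release(1);
    }

    boolean hasQueuedThreads() {
        return sync.hasQueuedThreads();
    }
}
```

With a single thread, hasQueuedThreads() stays false, which matches the comment above that the queue is only set up on first contention. Waiting threads are parked rather than left spinning, which is the main practical difference from the CLH lock shown earlier.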

Reference: https://www.jianshu.com/p/4682a6b0802d
