Improving spinlock performance with the PAUSE instruction

Latest recommended article published on 2024-02-01 15:06:24

lotluck

Categories: Experience notes, C programming on Linux. Tags: spinlock, pause

Article link: https://blog.csdn.net/lotluck/article/details/78329162

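While reading some source code I picked up something new: the pause instruction can improve spinlock performance. The code I came across looks like this: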

     /* x86 PAUSE instruction, emitted as raw bytes F3 90 */
     #define cpu_pause()         __asm__ (".byte 0xf3, 0x90")
     /* execute PAUSE n times */
     #define NOP_CPU3(n)     {int i = 0; while(i++ < (n)) cpu_pause();} 
     // calling code
     NOP_CPU3(20);

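After some searching online, it turns out that `__asm__ (".byte 0xf3, 0x90")` is Intel's pause instruction. When lock() on a spinlock fails to acquire the lock, it enters a busy loop, repeatedly checking the lock state and trying again. This has a drawback: the frequent checks fill the pipeline with read operations. When another thread then writes the lock variable, the pipeline has to be flushed and reordered, because the CPU must guarantee that every read sees the correct value. Reordering the pipeline is very expensive and hurts the performance of lock().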
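The busy loop described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the source that was being read: the type `demo_spinlock_t` and the `demo_spin_*` functions are hypothetical names, the lock is built on C11 `atomic_flag`, and the no-op fallback for non-x86 targets is an assumption.

```c
#include <stdatomic.h>

/* PAUSE exists only on x86; elsewhere this sketch falls back to a no-op
 * (an assumption made for portability of the example). */
#if defined(__x86_64__) || defined(__i386__)
#define cpu_pause() __asm__ __volatile__ (".byte 0xf3, 0x90")
#else
#define cpu_pause() ((void)0)
#endif

/* Hypothetical minimal spinlock, for illustration only. */
typedef struct {
    atomic_flag locked;
} demo_spinlock_t;

static void demo_spin_init(demo_spinlock_t *l)
{
    atomic_flag_clear(&l->locked);
}

static void demo_spin_lock(demo_spinlock_t *l)
{
    /* test_and_set returns the previous state: true means the lock was
     * already held, so keep spinning and hint the CPU with PAUSE. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        cpu_pause();
}

static void demo_spin_unlock(demo_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A caller would wrap its critical section between `demo_spin_lock(&l)` and `demo_spin_unlock(&l)`; the only difference from a naive spinlock is the `cpu_pause()` inside the wait loop.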

Reference article:
自旋锁spinlock剖析与改进 (Spinlock analysis and improvement)

Explanation of the PAUSE instruction (from Intel):

Description
Improves the performance of spin-wait loops. When executing a “spin-wait loop,” a Pentium 4 or Intel Xeon processor suffers a severe performance penalty when exiting the loop because it detects a possible memory order violation. The PAUSE instruction provides a hint to the processor that the code sequence is a spin-wait loop. The processor uses this hint to avoid the memory order violation in most situations, which greatly improves processor performance. For this reason, it is recommended that a PAUSE instruction be placed in all spin-wait loops.

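In short: on Pentium 4 and Xeon processors, a spin-wait loop pays a heavy penalty when it exits, because the processor detects a possible memory order violation at that point. PAUSE tells the processor that it is currently inside a spin-wait loop, so in most cases it can avoid the violation, which is why Intel recommends placing a PAUSE in every spin-wait loop.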

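Terminology (what follows is my own guess): memory order violation, literally a conflict in memory access ordering. When a processor with an out-of-order pipeline loads the value at some memory address (here, the lock) and finds that this value is being stored to, and that the store comes before the load, the processor treats this as a hazard and the pipeline stalls.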

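In this article, it refers specifically to the following situation. A worker thread W that holds the lock leaves the critical section and calls unlock to release it, while several waiting threads S are spinning to check whether the lock is available. W issues a store instruction, and the S threads issue many load instructions. Loads that come after the store must wait until the store has finished executing in the pipeline. Because the processor executes out of order, before the store appears it is free to execute the independent loads in any order; once the store appears, they have to be reordered, which severely hurts processor performance. According to Intel, this can cost a 25x performance loss. The PAUSE instruction reduces the number of loads in flight in parallel, and so reduces the time spent on reordering.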
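To make the "fewer loads in flight" idea concrete, here is a hedged sketch of a test-and-test-and-set lock whose wait loop issues one read per PAUSE. It is not taken from Intel's documentation or from the source being read: `ttas_lock_t` and the `ttas_*` functions are hypothetical names, and the no-op fallback on non-x86 targets is an assumption.

```c
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#define cpu_pause() __asm__ __volatile__ (".byte 0xf3, 0x90")
#else
#define cpu_pause() ((void)0)   /* assumption: no-op on non-x86 targets */
#endif

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef struct {
    atomic_int held;
} ttas_lock_t;

static void ttas_init(ttas_lock_t *l)
{
    atomic_init(&l->held, 0);
}

/* Single attempt; returns 1 if the lock was acquired, 0 otherwise. */
static int ttas_trylock(ttas_lock_t *l)
{
    /* exchange returns the previous value: 0 means the lock was free. */
    return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

static void ttas_lock(ttas_lock_t *l)
{
    for (;;) {
        if (ttas_trylock(l))
            return;
        /* Lock is held by W: a waiter S probes with plain loads only,
         * separated by PAUSE, until the lock looks free again. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed) != 0)
            cpu_pause();
    }
}

static void ttas_unlock(ttas_lock_t *l)
{
    /* This is the store that the spinning loads eventually observe. */
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

Spinning on a plain load instead of the atomic exchange is a separate design choice from PAUSE; it is included here only because it keeps the wait loop down to the read-then-pause pattern the paragraph above describes.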