Implementing Spin Locks in Java

test_touch

Tags: java

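Java initially relied on mutex locks. A mutex suspends a thread that has to wait, so when the work done under the lock is short, the threads competing for it keep going through park and unpark, and that repeated suspend/resume through the operating system performs very poorly. To improve lock performance, Java 6 turned spinning on by default.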

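Linux already ships a spin lock of its own. It is exposed through glibc as pthread_spin_lock, which is a user-space library function rather than a system call, and glibc-2.9 contains a fairly simple implementation of it: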

    int
    pthread_spin_lock (lock)
         pthread_spinlock_t *lock;
    {
      asm ("\n"
           "1:\t" LOCK_PREFIX "decl %0\n\t"
           "jne 2f\n\t"
           ".subsection 1\n\t"
           ".align 16\n"
           "2:\trep; nop\n\t"
           "cmpl $0, %0\n\t"
           "jg 1b\n\t"
           "jmp 2b\n\t"
           ".previous"
           : "=m" (*lock)
           : "m" (*lock));

      return 0;
    }

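The lock prefix (a bus lock) makes the decrement of the lock value atomic. If the value after the decrement is 0, the caller has acquired the lock. Every other thread spins, only reading the value, until it returns to its initial value (1), and then competes for the lock again until it wins.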
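The same decrement-based protocol can be sketched in Java. This is only an illustration under stated assumptions, not code from glibc or the JVM: the class name DecrementSpinLock and the runDemo helper are hypothetical, AtomicInteger.decrementAndGet stands in for the locked decl, and Thread.onSpinWait (Java 9 or later) stands in for rep; nop.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class: a Java rendering of the decrement-based protocol above.
class DecrementSpinLock {
    // 1 means free; 0 or a negative value means held.
    private final AtomicInteger state = new AtomicInteger(1);

    void lock() {
        for (;;) {
            // Atomic decrement, the counterpart of "lock; decl".
            if (state.decrementAndGet() == 0) {
                return; // 1 -> 0: this thread now owns the lock
            }
            // Read-only wait until the value is positive again, then retry.
            while (state.get() <= 0) {
                Thread.onSpinWait(); // spin-wait hint, Java 9+
            }
        }
    }

    void unlock() {
        state.set(1); // restore the initial value
    }

    private static int shared; // plain field, protected only by the lock

    // Demo helper (hypothetical): several threads increment a shared counter.
    static int runDemo(int threads, int perThread) {
        DecrementSpinLock lock = new DecrementSpinLock();
        shared = 0;
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    lock.lock();
                    try {
                        shared++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared;
    }
}
```

If the lock provides mutual exclusion, runDemo(4, 2000) returns exactly 8000, because no increment of the unsynchronized counter is lost.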

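Java does not use the spin lock that the system provides. It implements its own spinning logic and adds control over the number of spins. For details, see -XX:+UseSpinning and -XX:PreBlockSpin=xx.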

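Let's look at how this is implemented. Note that this is the lock implemented inside the JVM's mutex (Monitor), not the spin lock behind synchronized (for that, see the method TrySpin_VaryDuration in synchronizer.cpp).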

    int Monitor::TrySpin (Thread * const Self) {
      if (TryLock())    return 1 ;
      if (!os::is_MP()) return 0 ;

      int Probes  = 0 ;
      int Delay   = 0 ;
      int Steps   = 0 ;
      int SpinMax = NativeMonitorSpinLimit ;
      int flgs    = NativeMonitorFlags ;
      for (;;) {
        intptr_t v = _LockWord.FullWord;
        if ((v & _LBIT) == 0) {
          if (CASPTR (&_LockWord, v, v|_LBIT) == v) {
            return 1 ;
          }
          continue ;
        }

        if ((flgs & 8) == 0) {
          SpinPause () ;
        }

        // Periodically increase Delay -- variable Delay form
        // conceptually: delay *= 1 + 1/Exponent
        ++ Probes;
        if (Probes > SpinMax) return 0 ;

        if ((Probes & 0x7) == 0) {
          Delay = ((Delay << 1)|1) & 0x7FF ;
          // CONSIDER: Delay += 1 + (Delay/4); Delay &= 0x7FF ;
        }

        if (flgs & 2) continue ;

        // Consider checking _owner's schedctl state, if OFFPROC abort spin.
        // If the owner is OFFPROC then it's unlike that the lock will be dropped
        // in a timely fashion, which suggests that spinning would not be fruitful
        // or profitable.

        // Stall for "Delay" time units - iterations in the current implementation.
        // Avoid generating coherency traffic while stalled.
        // Possible ways to delay:
        //   PAUSE, SLEEP, MEMBAR #sync, MEMBAR #halt,
        //   wr %g0,%asi, gethrtime, rdstick, rdtick, rdtsc, etc. ...
        // Note that on Niagara-class systems we want to minimize STs in the
        // spin loop.  N1 and brethren write-around the L1$ over the xbar into the L2$.
        // Furthermore, they don't have a W$ like traditional SPARC processors.
        // We currently use a Marsaglia Shift-Xor RNG loop.
        Steps += Delay ;
        if (Self != NULL) {
          jint rv = Self->rng[0] ;
          for (int k = Delay ; --k >= 0; ) {
            rv = MarsagliaXORV (rv) ;
            if ((flgs & 4) == 0 && SafepointSynchronize::do_call_back()) return 0 ;
          }
          Self->rng[0] = rv ;
        } else {
          Stall (Delay) ;
        }
      }
    }

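a. os::is_MP() checks whether the machine has more than one processor. On a single core, spinning is pointless, because the lock holder cannot run and release the lock while the waiter occupies the CPU.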

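b. CASPTR uses Atomic::cmpxchg_ptr, which has the atomic semantics of cmpxchg (compare and exchange): if the value in memory equals the expected value, it is replaced with the new value and the expected value is returned; if they differ, nothing is written and the current contents of memory are returned.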

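Here that means: compare the current contents of _LockWord with v, the value read from _LockWord.FullWord a moment earlier. If they are still equal, _LockWord is replaced with v|_LBIT (the value of _LBIT is 1).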

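The spin lock logic: check whether bit 0 of _LockWord.FullWord is 0. If it is, nobody holds the lock, so try to take it by atomically setting bit 0 to 1. If the exchange succeeds, the thread owns the lock; if not, it goes around the spin loop again.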
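The bit-0 protocol described above can be sketched in Java as follows. This is a simplified illustration, not the HotSpot code: the class LockWordSketch and its methods are hypothetical names, an AtomicLong plays the role of _LockWord, and the pause and delay steps are left out. One difference to keep in mind is that Java's compareAndSet returns a boolean, whereas cmpxchg returns the value it found in memory.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a lock word whose bit 0 marks "held".
class LockWordSketch {
    static final long LBIT = 1L; // bit 0, as in the text above

    private final AtomicLong lockWord = new AtomicLong(0L);

    // Spin up to spinMax failed probes; true means the lock was acquired.
    boolean trySpin(int spinMax) {
        int probes = 0;
        for (;;) {
            long v = lockWord.get();
            if ((v & LBIT) == 0) {
                // Lock looks free: try to set bit 0 atomically.
                if (lockWord.compareAndSet(v, v | LBIT)) {
                    return true;
                }
                continue; // another thread changed the word first; re-read it
            }
            if (++probes > spinMax) {
                return false; // give up; the caller would fall back to blocking
            }
        }
    }

    // Clear bit 0 while preserving any other bits in the word.
    void unlock() {
        for (;;) {
            long v = lockWord.get();
            if (lockWord.compareAndSet(v, v & ~LBIT)) {
                return;
            }
        }
    }

    long word() {
        return lockWord.get();
    }
}
```

A second trySpin call on a lock that is already held runs out of probes and returns false, which is the point at which the real code stops spinning and blocks instead.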

The SpinPause () function
On 64-bit linux_x86 machines it is defined as:

            .globl SpinPause
            .align 16
            .type  SpinPause,@function
    SpinPause:
            rep
            nop
            movq   $1, %rax
            ret

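The key is the rep; nop pair: the assembler encodes it as the pause instruction (bytes F3 90), which exists to improve CPU behavior in spin loops. The official documentation describes pause as a way to avoid a memory order violation. One explanation is that the CPU processes instructions in a pipeline: while an atomic store is in flight and another thread is loading the same value, the load has to wait until the store completes, so the pipeline cannot proceed normally. Personally, I think of it more as an enhanced nop that inserts a few idle cycles: it saves power, and a spin lock needs the CPU to idle anyway without accessing memory.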
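At the Java level, the closest counterpart to this hint is Thread.onSpinWait(), available since Java 9. The sketch below shows a plain busy-wait loop that uses it; the class SpinWaitSketch and its methods are hypothetical names chosen for illustration, and whether the JIT actually emits pause for the call depends on the platform and JVM.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: busy-wait on a flag with a spin-wait hint.
class SpinWaitSketch {
    // Spins until the flag becomes true; returns how many times it looped.
    static long spinUntilSet(AtomicBoolean flag) {
        long spins = 0;
        while (!flag.get()) {
            Thread.onSpinWait(); // hint that the caller is in a spin loop
            spins++;
        }
        return spins;
    }

    // Demo helper: another thread sets the flag shortly after starting.
    static boolean demo() {
        AtomicBoolean flag = new AtomicBoolean(false);
        Thread setter = new Thread(() -> {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            flag.set(true);
        });
        setter.start();
        spinUntilSet(flag);
        return flag.get();
    }
}
```

The loop only reads the flag, which matches the idea above that a spinning thread should idle without writing to memory.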

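c. SafepointSynchronize::do_call_back() is a safepoint check that gives the VM a point at which to cut the spin short. For example, when the VM thread performs a thread dump or a memory dump, the spinning has to stop early.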

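d. if (Probes > SpinMax) return 0 ; once the number of probes exceeds the spin limit, the spin gives up on its own. In this function the limit comes from NativeMonitorSpinLimit, which serves the same purpose as the -XX:PreBlockSpin parameter mentioned earlier.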
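The way Delay grows between probes in TrySpin is easy to miss in the source, so here is a small sketch that reproduces only that arithmetic. The class BackoffSketch and the method delayAfter are hypothetical names; the update rule and the 0x7FF mask are copied from the code above.

```java
// Hypothetical sketch: reproduces the Delay schedule from TrySpin.
class BackoffSketch {
    // Returns the value of Delay after the given number of failed probes.
    static int delayAfter(int probes) {
        int delay = 0;
        for (int p = 1; p <= probes; p++) {
            // Every 8th probe, shift in one more low bit, capped at 0x7FF.
            if ((p & 0x7) == 0) {
                delay = ((delay << 1) | 1) & 0x7FF;
            }
        }
        return delay;
    }
}
```

Delay stays at 0 for the first seven probes, then runs through 1, 3, 7, 15 and so on, roughly doubling every eight probes until it saturates at 2047 (0x7FF) after 88 probes.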

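Finally, there is one more interesting method here, MarsagliaXORV (rv), which computes pseudo-random numbers. It is not obvious why the JVM has the CPU compute random numbers while spinning. Is it just to keep the CPU from idling? The comments in the code give a hint: the goal is to stall for Delay iterations without generating coherency traffic or stores, and a shift-xor loop can run entirely in registers. Even so, SpinPause feels like the more reasonable choice to me.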
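To make the idea of a shift-xor delay loop concrete, here is a minimal Java sketch. It is not the HotSpot routine: the class XorShiftDelaySketch, the shift amounts 6, 21 and 7, and the seeding of a zero input with 1 are all assumptions chosen for illustration. What it does share with TrySpin is the shape of the loop: Delay iterations of pure register arithmetic, with the result carried over to the next call.

```java
// Hypothetical sketch of a shift-xor pseudo-random step used as a delay loop.
class XorShiftDelaySketch {
    // One shift-xor step; the shift amounts are illustrative assumptions.
    static int step(int x) {
        if (x == 0) {
            x = 1; // zero is a fixed point of shift-xor, so nudge it (assumption)
        }
        x ^= x << 6;
        x ^= x >>> 21;
        x ^= x << 7;
        return x & 0x7FFFFFFF; // keep the result non-negative
    }

    // Burns `delay` iterations and returns the new state for the next call.
    static int stall(int seed, int delay) {
        int rv = seed;
        for (int k = delay; --k >= 0; ) {
            rv = step(rv); // touches no shared memory
        }
        return rv;
    }
}
```

Because each step depends on the previous result and the final value is stored back for the next call, the loop performs real work that cannot simply be skipped, yet it never reads or writes shared memory while it waits.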