Go Language Design and Implementation -- Mutex Source Code Analysis

胡桃姓胡，蝴蝶也姓胡

Published 2022-12-28 15:21:07

Article link: https://blog.csdn.net/qq_61039408/article/details/128468986

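[Figure: golang-basic-sync-primitives]

The figure above is from 面向信仰编程.

In the figure, the first column lists the common synchronization primitives, the second column lists containers, and the third column lists mutexes.

Let's go through them one by one: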

Mutex

Let's start with the sync.Mutex struct:

type Mutex struct {
   // Current state of the mutex
   state int32
   // Semaphore used to block and wake waiting goroutines
   sema  uint32
}
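
Because both fields are plain integers, the zero value of sync.Mutex is an unlocked mutex that is ready to use. The sketch below shows typical usage; the Counter type and the run helper are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type used only for this example.
type Counter struct {
	mu sync.Mutex // the zero value is an unlocked mutex
	n  int
}

// Inc increments the counter while holding the mutex.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

// Value returns the current count while holding the mutex.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// run starts the given number of goroutines, each incrementing once,
// waits for all of them, and returns the final count.
func run(workers int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(run(1000)) // 1000
}
```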

State

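The lowest three bits represent mutexLocked (the mutex is locked), mutexWoken (a goroutine has already been woken up or is spinning, so Unlock does not need to wake another one), and mutexStarving (the mutex is in starvation mode). The remaining bits count the goroutines waiting for the mutex to be released.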
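
The bit layout can be made concrete with a small decoder. The constants mirror the ones declared in sync/mutex.go; the stateInfo type and the decode function are hypothetical helpers that exist only in this sketch.

```go
package main

import "fmt"

// These constants mirror the ones in sync/mutex.go.
const (
	mutexLocked      = 1 << iota // 1: the mutex is locked
	mutexWoken                   // 2: a waiter has been woken or is spinning
	mutexStarving                // 4: the mutex is in starvation mode
	mutexWaiterShift = iota      // 3: the waiter count starts at bit 3
)

// stateInfo is a hypothetical, human-readable view of the state word.
type stateInfo struct {
	Locked   bool
	Woken    bool
	Starving bool
	Waiters  int32
}

// decode splits a state word into its three flags and the waiter count.
func decode(state int32) stateInfo {
	return stateInfo{
		Locked:   state&mutexLocked != 0,
		Woken:    state&mutexWoken != 0,
		Starving: state&mutexStarving != 0,
		Waiters:  state >> mutexWaiterShift,
	}
}

func main() {
	// locked + starving + 2 waiters: 1 | 4 | 2<<3 = 21
	fmt.Printf("%+v\n", decode(21))
	// {Locked:true Woken:false Starving:true Waiters:2}
}
```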

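The description above introduces two concepts: normal mode and starvation mode.

Normal mode and starvation mode

In normal mode the mutex behaves as an unfair lock; in starvation mode it behaves as a fair lock.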

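A mutex starts out in normal mode: while goroutine G1 holds the lock, G2 spins in an attempt to acquire it.

If G2 still has not acquired the lock after 4 spins, it is added to the lock's wait queue and blocks until it is woken up.

In normal mode, all goroutines waiting for the lock queue in FIFO order. A woken goroutine does not own the lock directly; it competes with newly arriving goroutines that are also requesting the lock. The new arrivals have an advantage: they are already running on a CPU, and there may be several of them, so the goroutine that was just woken is likely to lose the race. If it goes without the lock for long enough, the mutex enters starvation mode.

Specifically, once a goroutine has failed to acquire the lock for more than 1ms, it switches the mutex to starvation mode so that waiting goroutines are not starved.

In starvation mode, the unlocking goroutine hands the mutex directly to the goroutine at the front of the wait queue. Newly arriving goroutines cannot acquire the lock in this mode and do not spin; they simply wait at the tail of the queue. If a goroutine acquires the mutex and it is the last waiter in the queue, or it waited for less than 1ms, the mutex switches back to normal mode.

A mutex in normal mode offers better performance, while starvation mode prevents the high tail latency caused by goroutines that keep waiting without ever acquiring the lock.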

Locking and unlocking

Let's look at the source code for locking:

func (m *Mutex) Lock() {
   // Fast path: grab unlocked mutex.
   if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
      if race.Enabled {
         race.Acquire(unsafe.Pointer(m))
      }
      return
   }
   // Slow path (outlined so that the fast path can be inlined)
   m.lockSlow()
}

One line of this code is especially important:

atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked)

This call is a CAS (compare-and-swap). CAS is atomic because it is carried out by a hardware instruction.

// CompareAndSwapInt32 executes the compare-and-swap operation for an int32 value.
func CompareAndSwapInt32(addr *int32, old, new int32) (swapped bool)

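Suppose addr points to the value in memory, old is the value we expect to find there, and new is the value we want to write. A CAS roughly works as follows:

Compare the value at addr with old to see whether they are equal
If they are equal, write new to addr, replacing old, and return true
Otherwise return false and do nothing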
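
The three steps above can be sketched as follows. The casModel function is a hypothetical, non-atomic model that only spells out the logic; real code should call atomic.CompareAndSwapInt32, which performs the same steps as one indivisible operation. The tryLock helper is also an assumption made for this example.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// casModel spells out the logic of a compare-and-swap.
// It is NOT atomic and is for illustration only.
func casModel(addr *int32, old, new int32) bool {
	if *addr == old { // step 1: compare the value at addr with old
		*addr = new // step 2: equal, so write new and report success
		return true
	}
	return false // step 3: not equal, leave memory untouched
}

// tryLock is a hypothetical helper that mimics the Lock fast path:
// it succeeds only when the state word is exactly 0.
func tryLock(state *int32) bool {
	return atomic.CompareAndSwapInt32(state, 0, 1)
}

func main() {
	var state int32

	first := tryLock(&state)
	fmt.Println(first, state) // true 1

	second := tryLock(&state)
	fmt.Println(second, state) // false 1
}
```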

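In essence, this detects whether another goroutine modified the value between our read and our write: if one did, the operation fails and has no effect; if none did, the update goes ahead.

Many kinds of locks are built on top of CAS.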
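
As an illustration of a lock built on CAS, here is a minimal spin lock. SpinLock and countWithSpinLock are hypothetical names, and this is not how sync.Mutex works: it has no wait queue, no semaphore, and no starvation mode.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// SpinLock is a hypothetical minimal lock: 0 means unlocked, 1 means locked.
type SpinLock struct {
	state int32
}

// Lock retries the CAS until it flips the state from 0 to 1.
func (l *SpinLock) Lock() {
	for !atomic.CompareAndSwapInt32(&l.state, 0, 1) {
		runtime.Gosched() // yield so that the holder gets a chance to run
	}
}

// Unlock releases the lock by storing 0.
func (l *SpinLock) Unlock() {
	atomic.StoreInt32(&l.state, 0)
}

// countWithSpinLock increments a shared counter from several goroutines,
// using the spin lock to protect it, and returns the final value.
func countWithSpinLock(workers, perWorker int) int {
	var (
		l     SpinLock
		wg    sync.WaitGroup
		total int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				l.Lock()
				total++
				l.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countWithSpinLock(8, 1000)) // 8000
}
```

A lock like this wastes CPU under contention, which is one reason sync.Mutex limits spinning and otherwise parks waiters on a semaphore.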

Now back to the source code above to continue the analysis:

If the mutex state is not 0, the CAS returns false and lockSlow() is executed.

Let's analyze this function:

func (m *Mutex) lockSlow() {
   var waitStartTime int64
   starving := false
   awoke := false
   iter := 0
   old := m.state
   for {
      // Don't spin in starvation mode, ownership is handed off to waiters
      // so we won't be able to acquire the mutex anyway.
      if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
         // Active spinning makes sense.
         // Try to set mutexWoken flag to inform Unlock
         // to not wake other blocked goroutines.
         if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
            atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
            awoke = true
         }
         runtime_doSpin()
         iter++
         old = m.state
         continue
      }
      new := old
      // Don't try to acquire starving mutex, new arriving goroutines must queue.
      if old&mutexStarving == 0 {
         new |= mutexLocked
      }
      if old&(mutexLocked|mutexStarving) != 0 {
         new += 1 << mutexWaiterShift
      }
      // The current goroutine switches mutex to starvation mode.
      // But if the mutex is currently unlocked, don't do the switch.
      // Unlock expects that starving mutex has waiters, which will not
      // be true in this case.
      if starving && old&mutexLocked != 0 {
         new |= mutexStarving
      }
      if awoke {
         // The goroutine has been woken from sleep,
         // so we need to reset the flag in either case.
         if new&mutexWoken == 0 {
            throw("sync: inconsistent mutex state")
         }
         new &^= mutexWoken
      }
      if atomic.CompareAndSwapInt32(&m.state, old, new) {
         if old&(mutexLocked|mutexStarving) == 0 {
            break // locked the mutex with CAS
         }
         // If we were already waiting before, queue at the front of the queue.
         queueLifo := waitStartTime != 0
         if waitStartTime == 0 {
            waitStartTime = runtime_nanotime()
         }
         runtime_SemacquireMutex(&m.sema, queueLifo, 1)
         starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
         old = m.state
         if old&mutexStarving != 0 {
            // If this goroutine was woken and mutex is in starvation mode,
            // ownership was handed off to us but mutex is in somewhat
            // inconsistent state: mutexLocked is not set and we are still
            // accounted as waiter. Fix that.
            if old&(mutexLocked|mutexWoken) != 0 || old>>mutexWaiterShift == 0 {
               throw("sync: inconsistent mutex state")
            }
            delta := int32(mutexLocked - 1<<mutexWaiterShift)
            if !starving || old>>mutexWaiterShift == 1 {
               // Exit starvation mode.
               // Critical to do it here and consider wait time.
               // Starvation mode is so inefficient, that two goroutines
               // can go lock-step infinitely once they switch mutex
               // to starvation mode.
               delta -= mutexStarving
            }
            atomic.AddInt32(&m.state, delta)
            break
         }
         awoke = true
         iter = 0
      } else {
         old = m.state
      }
   }

   if race.Enabled {
      race.Acquire(unsafe.Pointer(m))
   }
}

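This function does the following:

Determines whether the current goroutine can spin
Spins while waiting for the mutex to be released
Computes the new state of the mutex
Updates the mutex state and acquires the lock

Let's look at the first part: how the mutex decides whether the current goroutine can spin while waiting for the mutex to be released: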

var waitStartTime int64
starving := false
awoke := false
iter := 0
old := m.state
for {
   // Don't spin in starvation mode, ownership is handed off to waiters
   // so we won't be able to acquire the mutex anyway.
   if old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter) {
      // Active spinning makes sense.
      // Try to set mutexWoken flag to inform Unlock
      // to not wake other blocked goroutines.
      if !awoke && old&mutexWoken == 0 && old>>mutexWaiterShift != 0 &&
         atomic.CompareAndSwapInt32(&m.state, old, old|mutexWoken) {
         awoke = true
      }
      runtime_doSpin()
      iter++
      old = m.state
      continue
   }

The condition for spinning is:

old := m.state
old&(mutexLocked|mutexStarving) == mutexLocked && runtime_canSpin(iter)

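Put in plain terms, this means:

A goroutine can spin only when the mutex is locked and in normal mode
runtime_canSpin must return true

Next, let's look at the conditions under which runtime_canSpin returns true:

The program is running on a machine with multiple CPUs
The current goroutine has spun fewer than 4 times trying to acquire this lock
At least one other processor P is running, and the current P's local run queue is empty

These conditions are strict, and for good reason: spinning keeps the CPU busy while it repeatedly checks whether a condition holds. Used carelessly, it slows the program down.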
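
The conditions above can be modelled as a pure function. This is only a sketch: the real check lives inside the runtime and reads scheduler state directly, so the schedSnapshot type, its fields, and the canSpin function here are hypothetical stand-ins for that internal state.

```go
package main

import "fmt"

// activeSpin is the maximum number of spins per Lock attempt.
const activeSpin = 4

// schedSnapshot is a hypothetical stand-in for scheduler state
// that the runtime inspects internally.
type schedSnapshot struct {
	NumCPU         int  // number of CPUs on the machine
	GOMAXPROCS     int  // number of Ps
	IdlePs         int  // Ps that are currently idle
	SpinningMs     int  // Ms that are spinning looking for work
	LocalRunqEmpty bool // whether the current P's run queue is empty
}

// canSpin models the three conditions: few spins so far, a multi-CPU
// machine with at least one other running P, and an empty local run queue.
func canSpin(iter int, s schedSnapshot) bool {
	if iter >= activeSpin || s.NumCPU <= 1 ||
		s.GOMAXPROCS <= s.IdlePs+s.SpinningMs+1 {
		return false
	}
	return s.LocalRunqEmpty
}

func main() {
	busy := schedSnapshot{NumCPU: 4, GOMAXPROCS: 4, LocalRunqEmpty: true}
	fmt.Println(canSpin(0, busy)) // true
	fmt.Println(canSpin(4, busy)) // false: already spun 4 times

	single := schedSnapshot{NumCPU: 1, GOMAXPROCS: 1, LocalRunqEmpty: true}
	fmt.Println(canSpin(0, single)) // false: only one CPU
}
```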

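After the spinning logic, the mutex computes its new state from the current context (this only computes the value; nothing has been written yet), working out the new values for the different pieces of information stored in the state field.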

new := old
// Don't try to acquire starving mutex, new arriving goroutines must queue.
if old&mutexStarving == 0 {
   new |= mutexLocked
}
if old&(mutexLocked|mutexStarving) != 0 {
   new += 1 << mutexWaiterShift
}
// The current goroutine switches mutex to starvation mode.
// But if the mutex is currently unlocked, don't do the switch.
// Unlock expects that starving mutex has waiters, which will not
// be true in this case.
if starving && old&mutexLocked != 0 {
   new |= mutexStarving
}
if awoke {
   // The goroutine has been woken from sleep,
   // so we need to reset the flag in either case.
   if new&mutexWoken == 0 {
      throw("sync: inconsistent mutex state")
   }
   new &^= mutexWoken
}
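
To make the arithmetic concrete, the computation above can be pulled out into a pure function and run on a few sample states. The nextState name is hypothetical, and the consistency check that calls throw is omitted in this sketch.

```go
package main

import "fmt"

// These constants mirror the ones in sync/mutex.go.
const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// nextState reproduces the state computation from lockSlow as a pure
// function (hypothetical helper; the runtime's sanity check is omitted).
func nextState(old int32, starving, awoke bool) int32 {
	next := old
	if old&mutexStarving == 0 {
		next |= mutexLocked // normal mode: try to take the lock
	}
	if old&(mutexLocked|mutexStarving) != 0 {
		next += 1 << mutexWaiterShift // we will have to wait: one more waiter
	}
	if starving && old&mutexLocked != 0 {
		next |= mutexStarving // ask for a switch to starvation mode
	}
	if awoke {
		next &^= mutexWoken // we were the woken goroutine: clear the flag
	}
	return next
}

func main() {
	fmt.Println(nextState(0, false, false))  // 1: unlocked -> locked
	fmt.Println(nextState(1, false, false))  // 9: locked -> locked, 1 waiter
	fmt.Println(nextState(11, true, true))   // 21: locked, starving, 2 waiters
	fmt.Println(nextState(12, false, false)) // 20: starving hand-off, 2 waiters
}
```

In the last case the mutex is in starvation mode, so the new arrival does not set mutexLocked and only adds itself to the waiter count.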

Once the new state has been computed, it is written with a CAS:

if atomic.CompareAndSwapInt32(&m.state, old, new) {
   if old&(mutexLocked|mutexStarving) == 0 {
      break // locked the mutex with CAS
   }
   // If we were already waiting before, queue at the front of the queue.
   queueLifo := waitStartTime != 0
   if waitStartTime == 0 {
      waitStartTime = runtime_nanotime()
   }
   runtime_SemacquireMutex(&m.sema, queueLifo, 1)
   starving = starving || runtime_nanotime()-waitStartTime > starvationThresholdNs
   old = m.state
   if old&mutexStarving != 0 {
      // If this goroutine was woken and mutex is in starvation mode,
      // ownership was handed off to us but mutex is in somewhat
      // inconsistent state: mutexLocked is not set and we are still
      // accounted as waiter. Fix that.
      if old&(mutexLocked|mutexWoken) != 0 || old>>mutexWaiterShift == 0 {
         throw("sync: inconsistent mutex state")
      }
      delta := int32(mutexLocked - 1<<mutexWaiterShift)
      if !starving || old>>mutexWaiterShift == 1 {
         // Exit starvation mode.
         // Critical to do it here and consider wait time.
         // Starvation mode is so inefficient, that two goroutines
         // can go lock-step infinitely once they switch mutex
         // to starvation mode.
         delta -= mutexStarving
      }
      atomic.AddInt32(&m.state, delta)
      break
   }
   awoke = true
   iter = 0
} else {
   old = m.state
}
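
The delta applied after a starvation-mode hand-off is easier to follow with numbers. The sketch below isolates that step; handoffDelta is a hypothetical helper name, while the constants mirror sync/mutex.go.

```go
package main

import "fmt"

// These constants mirror the ones in sync/mutex.go.
const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// handoffDelta computes the value added to the state when a waiter receives
// the mutex in starvation mode: set mutexLocked, remove one waiter, and
// optionally leave starvation mode.
func handoffDelta(old int32, starving bool) int32 {
	delta := int32(mutexLocked - 1<<mutexWaiterShift) // 1 - 8 = -7
	if !starving || old>>mutexWaiterShift == 1 {
		// This waiter waited less than the threshold, or it is the last
		// waiter: switch back to normal mode.
		delta -= mutexStarving // -7 - 4 = -11
	}
	return delta
}

func main() {
	// starving, 2 waiters (20); this waiter is still starving
	fmt.Println(20 + handoffDelta(20, true)) // 13: locked, starving, 1 waiter

	// starving, 1 waiter (12); last waiter, so starvation mode ends
	fmt.Println(12 + handoffDelta(12, true)) // 1: locked, normal mode

	// starving, 2 waiters (20); this waiter waited less than 1ms
	fmt.Println(20 + handoffDelta(20, false)) // 9: locked, normal, 1 waiter
}
```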

Now let's look at unlocking, which is somewhat simpler than locking:

func (m *Mutex) Unlock() {
   if race.Enabled {
      _ = m.state
      race.Release(unsafe.Pointer(m))
   }

   // Fast path: drop lock bit.
   new := atomic.AddInt32(&m.state, -mutexLocked)
   if new != 0 {
      // Outlined slow path to allow inlining the fast path.
      // To hide unlockSlow during tracing we skip one extra frame when tracing GoUnblock.
      m.unlockSlow(new)
   }
}

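Unlock first calls atomic.AddInt32(&m.state, -mutexLocked) as the fast path.

If the return value is 0, no flag is set and nobody is waiting, so the fast-path unlock has succeeded.

If it is not 0, m.unlockSlow(new) is called to take the slow path.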
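
The fast-path decision can be tried out on a bare state word. The unlockFast helper below is a hypothetical function that operates on an int32 rather than on a real sync.Mutex.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// These constants mirror the ones in sync/mutex.go.
const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// unlockFast drops the lock bit and reports whether the slow path is needed
// (hypothetical helper working on a raw state word).
func unlockFast(state *int32) (newState int32, needSlow bool) {
	newState = atomic.AddInt32(state, -mutexLocked)
	return newState, newState != 0
}

func main() {
	// Locked, no waiters: the fast path is enough.
	s1 := int32(mutexLocked)
	n1, slow1 := unlockFast(&s1)
	fmt.Println(n1, slow1) // 0 false

	// Locked with one waiter: the slow path must wake it.
	s2 := int32(mutexLocked | 1<<mutexWaiterShift)
	n2, slow2 := unlockFast(&s2)
	fmt.Println(n2, slow2) // 8 true
}
```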

func (m *Mutex) unlockSlow(new int32) {
   if (new+mutexLocked)&mutexLocked == 0 {
      throw("sync: unlock of unlocked mutex")
   }
   if new&mutexStarving == 0 {
      old := new
      for {
         // If there are no waiters or a goroutine has already
         // been woken or grabbed the lock, no need to wake anyone.
         // In starvation mode ownership is directly handed off from unlocking
         // goroutine to the next waiter. We are not part of this chain,
         // since we did not observe mutexStarving when we unlocked the mutex above.
         // So get off the way.
         if old>>mutexWaiterShift == 0 || old&(mutexLocked|mutexWoken|mutexStarving) != 0 {
            return
         }
         // Grab the right to wake someone.
         new = (old - 1<<mutexWaiterShift) | mutexWoken
         if atomic.CompareAndSwapInt32(&m.state, old, new) {
            runtime_Semrelease(&m.sema, false, 1)
            return
         }
         old = m.state
      }
   } else {
      // Starving mode: handoff mutex ownership to the next waiter, and yield
      // our time slice so that the next waiter can start to run immediately.
      // Note: mutexLocked is not set, the waiter will set it after wakeup.
      // But mutex is still considered locked if mutexStarving is set,
      // so new coming goroutines won't acquire it.
      runtime_Semrelease(&m.sema, true, 1)
   }
}

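Let's walk through this function:

It first validates the lock state: if the mutex is already unlocked, it calls throw, which terminates the program with a fatal error that cannot be recovered
It then branches: normal mode is handled one way and starvation mode another.
In starvation mode, ownership of the lock is handed to the next waiter in the queue, and that waiter is responsible for setting the mutexLocked bit;
In normal mode, it returns immediately if no goroutine is waiting, or if the mutexLocked, mutexWoken, or mutexStarving bit is already set (another goroutine has grabbed the lock or has already been woken); otherwise it wakes a waiting goroutine through sync.runtime_Semrelease.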
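
The branches described above can be summarized as a single-pass decision function. This is a simplified sketch: unlockSlowAction is a hypothetical name, it returns a label instead of performing the action, and it leaves out the CAS retry loop used in normal mode.

```go
package main

import "fmt"

// These constants mirror the ones in sync/mutex.go.
const (
	mutexLocked      = 1 << iota // 1
	mutexWoken                   // 2
	mutexStarving                // 4
	mutexWaiterShift = iota      // 3
)

// unlockSlowAction reports which branch unlockSlow would take for a given
// post-unlock state (hypothetical model, no CAS retry loop).
func unlockSlowAction(new int32) string {
	if (new+mutexLocked)&mutexLocked == 0 {
		return "throw" // the mutex was not locked before Unlock
	}
	if new&mutexStarving != 0 {
		return "handoff" // starvation mode: give the lock to the first waiter
	}
	if new>>mutexWaiterShift == 0 ||
		new&(mutexLocked|mutexWoken|mutexStarving) != 0 {
		return "return" // nobody to wake, or someone is already on it
	}
	return "wake" // normal mode: wake one waiter
}

func main() {
	fmt.Println(unlockSlowAction(-1)) // throw: unlock of unlocked mutex
	fmt.Println(unlockSlowAction(8))  // wake: one waiter, no flags
	fmt.Println(unlockSlowAction(10)) // return: a waiter is already woken
	fmt.Println(unlockSlowAction(12)) // handoff: starvation mode
}
```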