[Components: Pooling] Thread Pool 2: Thread Mutual Exclusion

Latest recommended article published on 2024-07-25 13:33:06

好好学习++


Category columns: Course Notes # C/C++ Server. Article tags: linux, C, C++, performance optimization

Article link: https://blog.csdn.net/learnandimprove/article/details/136407425


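Disclaimer: these are personal study notes only; please read them critically. Differing views are welcome.

Abstract

This article summarizes the thread mutual exclusion techniques provided by POSIX and C++, including mutexes, read-write locks, semaphores, spinlocks, and atomic variables, and covers the related concepts, interfaces, and usage examples. It also includes several suggestions for preventing deadlock.

In the ideal case, every thread works independently, without constraints, and runs at full speed.
In practice, threads usually have to divide the work and cooperate, which involves:
- Thread mutual exclusion: shared resources are protected, and threads access them in an orderly way according to agreed rules;
- Thread synchronization: one thread may continue only after another thread has finished a specific piece of work.

This chapter summarizes the techniques related to thread mutual exclusion.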

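1 Basic Concepts

1.1 The Resource Sharing Problem

When multiple threads concurrently access the same shared resource, and one thread is modifying it while other threads are reading or modifying it, the accesses conflict and cause a data race.

Data race example: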

// Build: g++ -pthread test.cpp
#include <pthread.h>
#include <stdio.h>

#define LOOPS 100000000  // number of loop iterations
int gcn = 0;             // global counter, initially 0

void* thread_proc1(void* arg) {
  for (int i = 0; i < LOOPS; i++) {
    gcn += 1;  // this thread adds 1 to gcn, LOOPS times
  }
  return NULL;
}

void* thread_proc2(void* arg) {
  for (int i = 0; i < LOOPS; i++) {
    gcn -= 1;  // this thread subtracts 1 from gcn, LOOPS times
  }
  return NULL;
}

int main() {
  pthread_t pid1, pid2;
  for (int i = 0; i < 10; i++) {  // run the test 10 times
    pthread_create(&pid1, NULL, thread_proc1, NULL);
    pthread_create(&pid2, NULL, thread_proc2, NULL);
    pthread_join(pid1, NULL);
    pthread_join(pid2, NULL);
    printf("gcn %d: %d\n", i, gcn);  // check whether gcn is 0
    gcn = 0;
  }
}

Sample result: some of the printed gcn values are not 0.

gcn 0: 5650432
gcn 1: 0
gcn 2: 3087238
gcn 3: 0
gcn 4: 563284
gcn 5: -2728073
gcn 6: 0
gcn 7: 0
gcn 8: 0
gcn 9: -6779036

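Analysis: the statement gcn += 1; or gcn -= 1; is typically compiled into 3 machine instructions:

Read: load the value of gcn from memory into a register reg;
Modify: add 1 to, or subtract 1 from, the value in reg;
Write: store the value in reg back to the memory location of gcn.

The two threads in the example may interleave their concurrent +1 and -1 operations as follows:

Step	Thread 1	Thread 2	`gcn` value	`reg` value
1	Load the value of `gcn` from memory into register `reg`		assume `0`	`0`
2	Add `1` to the value in `reg` (thread is suspended, `reg` is saved)		`0`	`1`
3		Load the value of `gcn` from memory into register `reg`	`0`	`0`
4		Subtract `1` from the value in `reg`	`0`	`-1`
5		Store the value in `reg` back to `gcn` in memory	`-1`	`-1`
6	Restore `reg` and store its value back to `gcn` in memory		`1`	`1`

The example above shows a single statement being interrupted by another thread. A task made up of several statements can be interrupted by other threads in the same way.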

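1.2 Mutually Exclusive Access to Resources

Data races caused by threads sharing data can be avoided with a mutual exclusion mechanism that makes the threads execute in an orderly way.

Basic concepts:

Critical resource: a shared resource that only one thread may use at a time;
Critical section: the piece of code in each thread that accesses the critical resource.

Typical mutually exclusive access:

If several threads ask to enter an idle critical section, only one of them is let in at a time;
If a thread is already in the critical section, every other thread trying to enter must wait (block and yield the CPU);
A thread that has entered the critical section must leave within a bounded time so that other threads can enter promptly.

Mutual exclusion can be implemented in several ways, including:

Mutex: only one thread at a time may enter the critical section (the typical form of mutual exclusion);
Read-write lock: reader threads may enter the critical section at the same time;
Semaphore: a limited number of threads may enter the critical section at the same time;
Spinlock: only one thread at a time may enter the critical section (waiting threads do not block);
Atomic variable: only one thread at a time may access the variable (low-level, lock-free).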

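2 Thread Mutual Exclusion with POSIX

2.1 Mutexes

Mutex (mutual exclusion lock): at any moment, only one thread is allowed to access the critical section.

Usage flow:

Initialize a mutex (shared by multiple threads);
Before entering the critical section, a thread locks the mutex (keeping other threads out of the critical section);
After leaving the critical section, the thread unlocks the mutex (letting other threads into the critical section);
Destroy the mutex.

2.1.1 Common API Functions

The name mutex is short for "mutual exclusion".

No.	Function	Description
1	`int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);`	Initializes the mutex with the given attributes (`NULL` means default attributes).
2	`int pthread_mutex_destroy(pthread_mutex_t *mutex);`	Destroys the mutex (it must not be in use, e.g. locked, being locked by another thread, or in use by functions such as `pthread_cond_wait()`).
3	`pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;`	Initializes the mutex statically (with default attributes, by assigning `PTHREAD_MUTEX_INITIALIZER` directly to the mutex variable); no destruction is required.
4	`int pthread_mutex_lock(pthread_mutex_t *mutex);`	Locks the mutex. If the mutex is already locked by another thread, the calling thread blocks (yields the CPU).
5	`int pthread_mutex_trylock(pthread_mutex_t *mutex);`	Same as `pthread_mutex_lock`, except that if the mutex is already locked by any thread, the function returns `EBUSY` immediately (for default attributes).
6	`int pthread_mutex_timedlock(pthread_mutex_t* restrict mutex, const struct timespec* restrict abstime);`	Same as `pthread_mutex_lock`, except that if the mutex is already locked by any thread, the function waits until the timeout expires and then returns `ETIMEDOUT` (for default attributes).
7	`int pthread_mutex_unlock(pthread_mutex_t *mutex);`	Unlocks the mutex. `pthread_mutex_unlock` must be paired with `pthread_mutex_*lock` to avoid deadlock.

About return values:

On success, all functions return 0;
As described in section 2.1.3 "Setting the Type Attribute", locking and unlocking lead to different states and return values depending on the mutex type;
An error code is returned when resources such as memory are insufficient, permissions are insufficient, an argument is invalid, the maximum number of locks is exceeded, and so on.

More related functions and more detailed descriptions can be found by searching for pthread_mutex at man7.org/linux/man-pages.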
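
As a small illustration of the static initializer and of checking return values, here is a minimal sketch. The names `g_lock`, `g_counter`, `try_increment`, `counter_value`, and `trylock_while_held` are hypothetical helpers invented for this example and are not part of the POSIX API. It assumes a Linux/glibc environment and default mutex attributes.

```cpp
#include <cerrno>
#include <pthread.h>

// Hypothetical example state: a statically initialized mutex guarding a counter.
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static int g_counter = 0;

// Tries to increment the counter without blocking.
// Returns 0 on success, or the error code from pthread_mutex_trylock()
// (EBUSY when the mutex is currently locked).
int try_increment() {
  int rc = pthread_mutex_trylock(&g_lock);
  if (rc != 0) return rc;  // not acquired: do not touch g_counter, do not unlock
  ++g_counter;             // critical section
  return pthread_mutex_unlock(&g_lock);
}

// Reads the counter under the lock.
int counter_value() {
  pthread_mutex_lock(&g_lock);
  int v = g_counter;
  pthread_mutex_unlock(&g_lock);
  return v;
}

// Runs on a second thread: records what trylock returns there.
static void* probe(void* out) {
  int rc = pthread_mutex_trylock(&g_lock);
  if (rc == 0) pthread_mutex_unlock(&g_lock);
  *static_cast<int*>(out) = rc;
  return nullptr;
}

// Returns the trylock result seen by another thread while this thread
// holds the mutex (expected to be EBUSY).
int trylock_while_held() {
  int result = -1;
  pthread_t tid;
  pthread_mutex_lock(&g_lock);
  if (pthread_create(&tid, nullptr, probe, &result) == 0) {
    pthread_join(tid, nullptr);
  }
  pthread_mutex_unlock(&g_lock);
  return result;
}
```

The point of the sketch is that a failed `pthread_mutex_trylock()` must not be followed by `pthread_mutex_unlock()`: the caller only unlocks what it actually locked.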

2.1.2 Usage Example

#include <pthread.h>
#include <stdio.h>

#define LOOPS 10000000  // number of loop iterations
int gcn = 0;            // global counter (protected by the mutex)
pthread_mutex_t mutex;  // global mutex

void* thread_proc1(void* arg) {
  for (int i = 0; i < LOOPS; i++) {
    pthread_mutex_lock(&mutex);    // lock
    gcn += 1;                      // critical section containing the critical resource gcn
    pthread_mutex_unlock(&mutex);  // unlock
  }
  return NULL;
}

void* thread_proc2(void* arg) {
  for (int i = 0; i < LOOPS; i++) {
    pthread_mutex_lock(&mutex);
    gcn -= 1;
    pthread_mutex_unlock(&mutex);
  }
  return NULL;
}

int main() {
  pthread_t pid1, pid2;
  pthread_mutex_init(&mutex, NULL);  // initialize the mutex
  for (int i = 0; i < 10; i++) {     // run the test 10 times
    pthread_create(&pid1, NULL, thread_proc1, NULL);
    pthread_create(&pid2, NULL, thread_proc2, NULL);
    pthread_join(pid1, NULL);
    pthread_join(pid2, NULL);
    printf("gcn %d: %d\n", i, gcn);  // gcn is always 0
    gcn = 0;
  }
  pthread_mutex_destroy(&mutex);  // destroy the mutex
}

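2.1.3 Setting the Type Attribute

For threads under time-sharing scheduling (the default scheduling policy), mutex attributes include the process-shared attribute (pshared), the robustness attribute (robust, which covers a thread exiting without unlocking), and the type attribute (type).
The API functions related to the type attribute are:

No.	Function	Description
1	`int pthread_mutexattr_init(pthread_mutexattr_t *attr);`	Initializes a mutex attribute object.
2	`int pthread_mutexattr_destroy(pthread_mutexattr_t *attr);`	Destroys a mutex attribute object. (It can be destroyed as soon as the mutex has been initialized.)
3	`int pthread_mutexattr_settype(pthread_mutexattr_t *attr, int type);`	Sets the mutex type attribute: `PTHREAD_MUTEX_NORMAL` (the default on Linux), `PTHREAD_MUTEX_ERRORCHECK`, or `PTHREAD_MUTEX_RECURSIVE`.
4	`int pthread_mutexattr_gettype(const pthread_mutexattr_t *restrict attr, int *restrict type);`	Gets the mutex type attribute.

For the different mutex types, relocking and unlocking when not the owner behave as follows:

Lock type	Type identifier	Lock function	Lock again while locked (other thread)	Lock again while locked (same thread)	Unlock while locked (same thread)	Unlock while not locked (same thread)
Normal	`PTHREAD_MUTEX_NORMAL`	`pthread_mutex_lock()`	Blocks	Deadlock	Other threads can lock	Undefined behavior
		`pthread_mutex_trylock()`	Returns `EBUSY`	Returns `EBUSY`
		`pthread_mutex_timedlock()`	Returns `ETIMEDOUT`	Returns `ETIMEDOUT`
Error-checking	`PTHREAD_MUTEX_ERRORCHECK`	`pthread_mutex_lock()`	Same as `NORMAL`	Returns `EDEADLK`		Returns `EPERM`
		`pthread_mutex_trylock()`	Same as `NORMAL`	Returns `EBUSY`
		`pthread_mutex_timedlock()`	Same as `NORMAL`	Returns `EDEADLK`
Recursive	`PTHREAD_MUTEX_RECURSIVE`	`pthread_mutex_*lock()`	Same as `NORMAL`	Increments the lock count	Decrements the lock count; when the count reaches `0`, other threads can lock	Returns `EPERM`
Default	`PTHREAD_MUTEX_DEFAULT` (same as `NORMAL` on Linux)	/	/	/	/	/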
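
Two rows of the table can be checked in a few lines. The following is a minimal sketch, assuming Linux/glibc; the helper names `init_typed_mutex`, `relock_result`, and `unlock_unowned_result` are hypothetical and exist only for this illustration. `relock_result` must not be called with `PTHREAD_MUTEX_NORMAL`, because relocking a normal mutex from the owning thread deadlocks.

```cpp
#include <cerrno>
#include <pthread.h>

// Initializes *m with the given type attribute; returns 0 on success.
int init_typed_mutex(pthread_mutex_t* m, int type) {
  pthread_mutexattr_t attr;
  int rc = pthread_mutexattr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_mutexattr_settype(&attr, type);
  if (rc == 0) rc = pthread_mutex_init(m, &attr);
  pthread_mutexattr_destroy(&attr);  // no longer needed once the mutex exists
  return rc;
}

// Locks a mutex of the given type twice from the same thread and returns
// the result of the second lock call.
// Only meaningful for ERRORCHECK and RECURSIVE (NORMAL would deadlock).
int relock_result(int type) {
  pthread_mutex_t m;
  if (init_typed_mutex(&m, type) != 0) return -1;
  pthread_mutex_lock(&m);
  int rc = pthread_mutex_lock(&m);        // second lock in the same thread
  if (rc == 0) pthread_mutex_unlock(&m);  // recursive: undo the extra count
  pthread_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return rc;
}

// Unlocks a mutex that was never locked and returns the result.
// Not meaningful for NORMAL, where this is undefined behavior.
int unlock_unowned_result(int type) {
  pthread_mutex_t m;
  if (init_typed_mutex(&m, type) != 0) return -1;
  int rc = pthread_mutex_unlock(&m);
  pthread_mutex_destroy(&m);
  return rc;
}
```

Under these assumptions, `relock_result(PTHREAD_MUTEX_RECURSIVE)` yields `0`, `relock_result(PTHREAD_MUTEX_ERRORCHECK)` yields `EDEADLK`, and `unlock_unowned_result()` yields `EPERM` for both types, matching the table.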

A test of the return values of the lock functions under different type attributes:

#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>  // clock_gettime()
#include <unistd.h>

#define handle_error_en(en, msg) \
  do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

pthread_mutex_t mutex;  // global mutex

void* thread_proc(void* arg) {
  for (int i = 3; i > 0; i--) {
    pthread_mutex_lock(&mutex);
    printf("thread_proc [%d]\n", i);
    sleep(3);
    pthread_mutex_unlock(&mutex);
  }
  return NULL;
}

int main() {
  int ret;
  pthread_t pid;

  pthread_mutexattr_t attr;
  pthread_mutexattr_init(&attr);                                                   // initialize the attribute object
  if ((ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK)) != 0) {  // set the type attribute
    handle_error_en(ret, "pthread_mutexattr_settype");
  }
  pthread_mutex_init(&mutex, &attr);  // apply the attributes to mutex

  int type;
  pthread_mutexattr_gettype(&attr, &type);  // read back the current type attribute
  printf("Attr Type = %s\n", (type == PTHREAD_MUTEX_NORMAL)       ? "MUTEX_NORMAL"
                             : (type == PTHREAD_MUTEX_ERRORCHECK) ? "MUTEX_ERRORCHECK"
                             : (type == PTHREAD_MUTEX_RECURSIVE)  ? "MUTEX_RECURSIVE"
                                                                  : "???");
  pthread_mutexattr_destroy(&attr);  // destroy the attribute object

  if ((ret = pthread_create(&pid, NULL, thread_proc, NULL)) != 0) handle_error_en(ret, "pthread_create");
  sleep(1);  // let the child thread run first

  int locked = 0;  // number of successful locks
  if ((ret = pthread_mutex_trylock(&mutex)) == 0) locked++;  // trylock()
  printf("trylock ret(%d) vs EBUSY(%d)\n", ret, EBUSY);

  struct timespec time_out;
  clock_gettime(CLOCK_REALTIME, &time_out);
  time_out.tv_sec += 1;
  if ((ret = pthread_mutex_timedlock(&mutex, &time_out)) == 0) locked++;  // timedlock()
  printf("timedlock ret(%d) vs ETIMEDOUT(%d)\n", ret, ETIMEDOUT);

  if ((ret = pthread_mutex_lock(&mutex)) == 0) locked++;  // lock()
  printf("lock ret(%d) vs EDEADLK(%d)\n", ret, EDEADLK);

  if (type != PTHREAD_MUTEX_NORMAL) {  // relocking a NORMAL mutex would deadlock, so skip it
    if ((ret = pthread_mutex_lock(&mutex)) == 0) locked++;  // lock() again in the same thread
    printf("relock ret(%d) vs EDEADLK(%d)\n", ret, EDEADLK);
  }

  sleep(1);

  for (; locked > 0; locked--) {
    ret = pthread_mutex_unlock(&mutex);  // unlock()
    printf("unlock [%d] ret(%d)\n", locked, ret);
  }

  pthread_join(pid, NULL);
  pthread_mutex_destroy(&mutex);
}

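2.2 Read-Write Locks

Accesses to shared data can be divided into "read operations" and "write operations". Read operations do not cause data races the way write operations do, even when several threads read concurrently.
Because of this, in scenarios where reads far outnumber writes, a "read-write lock" can give better concurrency than a "mutex":

If one thread has locked the critical section with a read lock, other threads can also enter the critical section with a read lock, so reads proceed concurrently;
- Once a read lock is held, another thread requesting the write lock blocks;
- After a write lock request has blocked, whether further read lock requests from other threads block depends on the lock's attributes. The situation where read locks hold the resource for a long time and the write lock starves should be avoided as far as possible.
If one thread has locked the critical section with the write lock, any other thread requesting either a read lock or the write lock blocks.

2.2.1 Common API Functions

No.	Function	Description
1	`int pthread_rwlock_init(pthread_rwlock_t *restrict rwlock, const pthread_rwlockattr_t *restrict attr);`	Initializes the read-write lock with the given attributes (`NULL` means default attributes).
2	`int pthread_rwlock_destroy(pthread_rwlock_t *rwlock);`	Destroys the read-write lock (it must not be in use, e.g. locked or being locked by another thread).
3	`pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;`	Initializes the read-write lock statically (with default attributes, by assigning `PTHREAD_RWLOCK_INITIALIZER` directly to the variable); no destruction is required.
4	`int pthread_rwlock_rdlock(pthread_rwlock_t *rwlock);`	Locks for reading (takes a read lock). If the lock is write-locked by another thread, the calling thread blocks. Otherwise, if some thread is waiting for the write lock (read locks are held), whether the call blocks is implementation-defined (under time-sharing scheduling). Within one thread, the read lock may be taken several times (under default attributes) and must be released the same number of times. If the current thread already holds the write lock, taking a read lock either returns `EDEADLK` or deadlocks.
5	`int pthread_rwlock_wrlock(pthread_rwlock_t *rwlock);`	Locks for writing (takes the write lock). If the lock is already held by another thread (read or write), the calling thread blocks. If a deadlock condition is detected, or if the current thread already holds a read lock or the write lock, requesting the write lock either returns `EDEADLK` or deadlocks.
6	`int pthread_rwlock_tryrdlock(pthread_rwlock_t *rwlock);`	Same as `pthread_rwlock_rdlock()`, except that in the cases where `pthread_rwlock_rdlock()` would block (another thread holds the write lock or is waiting for it), or when the current thread already holds the write lock, the function returns `EBUSY` immediately.
7	`int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock);`	Same as `pthread_rwlock_wrlock()`, except that if any thread holds a read lock or the write lock, the function returns `EBUSY` immediately.
8	`int pthread_rwlock_timedrdlock(pthread_rwlock_t *restrict rwlock, const struct timespec *restrict abstime);`	Same as `pthread_rwlock_rdlock()`, except that in the cases where `pthread_rwlock_rdlock()` would block (another thread holds the write lock or is waiting for it), the function waits until the timeout expires and then returns `ETIMEDOUT`. If the current thread already holds the write lock, or a deadlock condition is detected, it returns `EDEADLK`.
9	`int pthread_rwlock_timedwrlock(pthread_rwlock_t *restrict rwlock, const struct timespec *restrict abstime);`	Same as `pthread_rwlock_wrlock()`, except that if another thread holds a read lock or the write lock, the function waits until the timeout expires and then returns `ETIMEDOUT`. If the current thread already holds a read lock or the write lock, or a deadlock condition is detected, it returns `EDEADLK` or `ETIMEDOUT` (possibly when a read lock is already held; not confirmed).
10	`int pthread_rwlock_unlock(pthread_rwlock_t *rwlock);`	Unlocks the read-write lock. It must be paired with the lock functions (read and write) to avoid deadlock.

On success, all functions return 0; otherwise they return an error code (TODO: to be sorted out further).
More related functions and more detailed descriptions can be found by searching for pthread_rwlock at man7.org/linux/man-pages.

For threads under time-sharing scheduling, pthread_rwlock_rdlock() and pthread_rwlock_wrlock() behave as follows under default attributes:

Current lock state	Read lock requested (other thread)	Read lock requested (same thread)	Write lock requested (other thread)	Write lock requested (same thread)
Unlocked	Lock succeeds	Lock succeeds	Lock succeeds	Lock succeeds
Read-locked (no writer blocked)	Lock succeeds	Lock succeeds	Blocks	Deadlock (tested)
Read-locked (a writer is blocked)	Implementation-defined (succeeds when tested)	Implementation-defined (succeeds when tested)	Blocks	Deadlock (tested)
Write-locked	Blocks	Returns `EDEADLK` (tested)	Blocks	Returns `EDEADLK` (tested)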
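
The "other thread" columns of the table can be probed without blocking by using the try variants. The following is a minimal sketch, assuming Linux/glibc and default attributes; `ProbeResult`, `ProbeArg`, `probe`, and `probe_while_held` are hypothetical names introduced only for this example.

```cpp
#include <cerrno>
#include <pthread.h>

// Results of the non-blocking lock attempts made by the probing thread.
struct ProbeResult {
  int rd;  // return value of pthread_rwlock_tryrdlock()
  int wr;  // return value of pthread_rwlock_trywrlock()
};

struct ProbeArg {
  pthread_rwlock_t* lock;
  ProbeResult res;
};

// Runs on a second thread: tries a read lock, then a write lock.
static void* probe(void* p) {
  ProbeArg* arg = static_cast<ProbeArg*>(p);
  arg->res.rd = pthread_rwlock_tryrdlock(arg->lock);
  if (arg->res.rd == 0) pthread_rwlock_unlock(arg->lock);
  arg->res.wr = pthread_rwlock_trywrlock(arg->lock);
  if (arg->res.wr == 0) pthread_rwlock_unlock(arg->lock);
  return nullptr;
}

// The calling thread holds either the write lock or a read lock while
// another thread probes the lock with the try functions.
ProbeResult probe_while_held(bool hold_write) {
  pthread_rwlock_t lock;
  ProbeArg arg{&lock, {-1, -1}};
  if (pthread_rwlock_init(&lock, nullptr) != 0) return arg.res;
  if (hold_write) {
    pthread_rwlock_wrlock(&lock);
  } else {
    pthread_rwlock_rdlock(&lock);
  }
  pthread_t tid;
  if (pthread_create(&tid, nullptr, probe, &arg) == 0) {
    pthread_join(tid, nullptr);
  }
  pthread_rwlock_unlock(&lock);
  pthread_rwlock_destroy(&lock);
  return arg.res;
}
```

While a read lock is held, the other thread's read attempt succeeds and its write attempt reports `EBUSY`; while the write lock is held, both attempts report `EBUSY`.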

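2.2.2 Setting the Type Attribute

Read-write lock attributes include the process-shared attribute (pshared) and the type attribute (kind).
The API functions related to the type attribute are:

No.	Function	Description
1	`int pthread_rwlockattr_init(pthread_rwlockattr_t *attr);`	Initializes a read-write lock attribute object.
2	`int pthread_rwlockattr_destroy(pthread_rwlockattr_t *attr);`	Destroys a read-write lock attribute object. (It can be destroyed as soon as the read-write lock has been initialized.)
3	`int pthread_rwlockattr_setkind_np(pthread_rwlockattr_t *attr, int pref);`	Sets the read-write lock type attribute: `PTHREAD_RWLOCK_PREFER_READER_NP` (default) or `PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP`.
4	`int pthread_rwlockattr_getkind_np(const pthread_rwlockattr_t *restrict attr, int *restrict pref);`	Gets the read-write lock type attribute.

The suffix "_np" in the names stands for "nonportable".

Type descriptions:

No.	Type	Description
1	`PTHREAD_RWLOCK_PREFER_READER_NP`	Reader preference: if a read lock is held, further read lock requests succeed even when another thread is blocked waiting for the write lock. In other words, as long as read locks keep being held, the writer starves (it cannot acquire the lock). A thread may hold several read locks, i.e. read locks are recursive within a thread.
2	`PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP`	Writer preference: if a read lock is held and another thread is blocked waiting for the write lock, further read lock requests block, which avoids writer starvation. "NONRECURSIVE" indicates that a thread must not request the read lock recursively: if it already holds a read lock and another thread is blocked waiting for the write lock, requesting the read lock again deadlocks (tested).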

2.2.3 Read-Write Lock Priority Test

#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define handle_error(msg) \
  do { perror(msg); exit(EXIT_FAILURE); } while (0)

#define handle_error_en(en, msg) \
  do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

#define READER_COUNT 5  // number of reader threads (vs. 1 writer thread)
#define LOOPS 5         // number of lock/unlock rounds per thread

int gcn = 0;              // global variable, read often and written rarely, protected by the read-write lock
pthread_rwlock_t rwlock;  // global read-write lock

/**
 * Locking timeline (R = read lock held, W = write lock requested)
 * reader 1: R--|--R--|--R--|--R--|--R
 * reader 2: --R--|--R--|--R--|--R--|--R
 * reader 3: --|--R--|--R--|--R--|--R--|--R
 * reader 4: --|--|--R--|--R--|--R--|--R--|--R
 * reader 5: --|--|--|--R--|--R--|--R--|--R--|--R
 * writer 1: --|--W?
 */

void *thread_read(void *arg) {
  int idx = (int)(long)(arg);  // reader thread index, starting from 0
  sleep(idx);  // start one read lock request per second so that the write lock request falls in the middle and its priority can be observed

  printf("reader %d: begin\n", idx);
  for (int i = LOOPS; i > 0; i--) {
    pthread_rwlock_rdlock(&rwlock);  // take the read lock
    printf("reader %d: loop [%d] reading (%d)...\n", idx, i, gcn);
    sleep(2);  // the reader sleeps 2 seconds so that read lock requests keep coming while the writer waits
    pthread_rwlock_unlock(&rwlock);  // release the read lock
  }
  printf("reader %d: end\n", idx);
  return NULL;
}

void *thread_write(void *arg) {
  sleep(1);  // make sure the write lock request comes after the first read lock request

  printf("writer    begin\n");
  for (int i = LOOPS; i > 0; i--) {
    pthread_rwlock_wrlock(&rwlock);  // take the write lock
    printf("writer    loop [%d] writing (%d)...\n", i, ++gcn);
    sleep(1);  // the writer sleeps 1 second, which raises the frequency of write lock requests
    pthread_rwlock_unlock(&rwlock);  // release the write lock
  }
  printf("writer    end\n");
  return NULL;
}

int main() {
  int ret;
  pthread_t pids[READER_COUNT];
  pthread_t wid;

#if 0 // default attributes
  if (ret = pthread_rwlock_init(&rwlock, NULL)) handle_error_en(ret, "pthread_rwlock_init");
#else
  pthread_rwlockattr_t attr;
  if (ret = pthread_rwlockattr_init(&attr)) handle_error_en(ret, "pthread_rwlockattr_init");
  if (ret = pthread_rwlockattr_setkind_np(&attr, PTHREAD_RWLOCK_PREFER_READER_NP)) handle_error_en(ret, "pthread_rwlockattr_setkind_np");

  if (ret = pthread_rwlock_init(&rwlock, &attr)) handle_error_en(ret, "pthread_rwlock_init");

  int kind;
  pthread_rwlockattr_getkind_np(&attr, &kind);
  printf("Attr Kind = %s\n", (kind == PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP) ? "PREFER_WRITER_NONRECURSIVE_NP"
                             : (kind == PTHREAD_RWLOCK_PREFER_READER_NP)            ? "PREFER_READER_NP(default)"
                             : (kind == PTHREAD_RWLOCK_PREFER_WRITER_NP)            ? "PREFER_WRITER_NP(ignored by glibc)"
                                                                                    : "???");

  if (ret = pthread_rwlockattr_destroy(&attr)) handle_error_en(ret, "pthread_rwlockattr_destroy");
#endif

  if (ret = pthread_create(&wid, NULL, thread_write, NULL)) handle_error_en(ret, "pthread_create");
  for (int i = 0; i < READER_COUNT; i++) {
    if (ret = pthread_create(&pids[i], NULL, thread_read, (void *)(long)i)) handle_error_en(ret, "pthread_create");
  }

  if (ret = pthread_join(wid, NULL)) handle_error_en(ret, "pthread_join");
  for (int i = 0; i < READER_COUNT; i++) {
    if (ret = pthread_join(pids[i], NULL)) handle_error_en(ret, "pthread_join");
  }

  if (ret = pthread_rwlock_destroy(&rwlock)) handle_error_en(ret, "pthread_rwlock_destroy");
}

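2.3 Semaphores

A semaphore is a special kind of "variable":

It is initialized to a non-negative integer (the number of resources that can be shared) and can be incremented (produce/unlock) or decremented (consume/lock);
Access to a semaphore is atomic: if two (or more) threads try to change the value of a semaphore, the system guarantees that all the operations take place one after another (serially).

By the values they take, semaphores fall into two kinds:

Binary semaphore: takes only the values 0 and 1 (one available resource), so only one thread at a time may enter the critical section;
- When the critical section is available, the semaphore is 1;
- Before entering the critical section, a thread decrements the semaphore from 1 to 0 (lock), marking the critical section as in use; other threads that need to enter can only wait;
- After leaving the critical section, the thread increments the semaphore back to 1 (unlock), making the critical section available again; one of the waiting threads is then woken up.
Counting semaphore: takes a larger range of values (several available resources), so a limited number of threads may be in the critical section at the same time.
- Before entering the critical section, a thread decrements the semaphore; if the value is not greater than 0 before the decrement (no resource available), the thread blocks and waits;
- After leaving the critical section, the thread increments the semaphore; if threads are blocked waiting after the increment, one of them is woken up according to the scheduling policy.

This is similar to a mutex, with the following differences:

A semaphore allows several threads into the critical section at once (there are several resources that can be used at the same time, such as several printers);
a mutex allows only one thread into the critical section at a time.
A semaphore can be incremented in one thread (produce/unlock) and decremented in another (consume/lock), so it can also be used for synchronization between threads ("thread synchronization" is covered in the next chapter);
a mutex requires locking and unlocking to happen in the same thread.

2.3.1 Common API Functions

No.	Function	Description
1	`int sem_init(sem_t *sem, int pshared, unsigned int value);`	Initializes the semaphore object pointed to by `sem` and gives it the initial integer value `value`. If the sharing option `pshared` is `0`, the semaphore is local to the current process; otherwise it can be shared between processes.
2	`int sem_destroy(sem_t *sem);`	Destroys the semaphore object pointed to by `sem` (it must not be in use).
3	`int sem_post(sem_t *sem);`	Atomically adds `1` to the semaphore. If threads are blocked in `sem_wait()` after the increment, one of them is woken up according to the scheduling policy.
4	`int sem_wait(sem_t *sem);`	Atomically subtracts `1` from the semaphore. If the value is greater than `0` beforehand, the function decrements it and returns immediately; otherwise it blocks until another thread increments the semaphore so that the decrement can proceed, or until the call is interrupted by a signal (`errno` is set to `EINTR`).
5	`int sem_trywait(sem_t *sem);`	Same as `sem_wait`, except that if the decrement cannot be performed immediately, the function returns `-1` (`errno` is set to `EAGAIN`) instead of blocking.
6	`int sem_timedwait(sem_t *restrict sem, const struct timespec *restrict abs_timeout);`	Same as `sem_wait`, except that if the decrement cannot be performed immediately, the function waits until the timeout expires and then returns `-1` (`errno` is set to `ETIMEDOUT`).

On success, all functions return 0; otherwise they return -1 and set errno.

More related functions and more detailed descriptions can be found by searching for sem_ at man7.org/linux/man-pages.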
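
To make the `-1` plus `errno` convention concrete, here is a minimal sketch that drains a counting semaphore with `sem_trywait()`. It assumes Linux (unnamed semaphores via `sem_init()`); `take_units` is a hypothetical helper written only for this illustration.

```cpp
#include <cerrno>
#include <semaphore.h>

// Creates a semaphore with `available` units and tries to take `want` units
// without blocking. Returns how many units were obtained, or -1 on an
// unexpected error.
int take_units(unsigned int available, int want) {
  sem_t sem;
  if (sem_init(&sem, 0, available) != 0) return -1;

  int taken = 0;
  for (int i = 0; i < want; i++) {
    if (sem_trywait(&sem) == 0) {
      ++taken;  // one unit acquired
    } else if (errno == EAGAIN) {
      break;    // the value is 0: nothing left to take
    } else {
      taken = -1;  // some other failure
      break;
    }
  }

  for (int i = 0; i < taken; i++) sem_post(&sem);  // give the units back
  sem_destroy(&sem);
  return taken;
}
```

For example, asking for 5 units from a semaphore initialized to 3 obtains 3: the fourth `sem_trywait()` fails with `EAGAIN` instead of blocking.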

2.3.2 Mutual Exclusion Example

#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define handle_error(msg) \
  do { perror(msg); exit(EXIT_FAILURE); } while (0)

#define handle_error_en(en, msg) \
  do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

#define PID_COUNT 5  // number of consumer threads
#define RES_COUNT 3  // number of resources

sem_t sem;  // global semaphore

void *thread_proc(void *arg) {
  int idx = (int)(long)(arg);
  printf("sub thread %d begin\n", idx);

  while (sem_wait(&sem)) {                        // subtract 1 from the semaphore (lock)
    if (errno == EINTR) continue;                 // Restart if interrupted by a signal handler.
    handle_error("sem_wait");
  }

  printf("sub thread %d processing...\n", idx);
  sleep(idx % 2 + 1);
  if (sem_post(&sem)) handle_error("sem_post");  // add 1 to the semaphore (unlock)

  printf("sub thread %d end\n", idx);
  return NULL;
}

int main() {
  int ret;
  pthread_t pids[PID_COUNT];

  // Initialize the semaphore to RES_COUNT (the number of resources that can be accessed at the same time)
  if (sem_init(&sem, 0, RES_COUNT)) handle_error("sem_init");

  for (int i = 0; i < PID_COUNT; i++) {
    if (ret = pthread_create(&pids[i], NULL, thread_proc, (void *)(long)i)) handle_error_en(ret, "pthread_create");
  }

  for (int i = 0; i < PID_COUNT; i++) {
    if (ret = pthread_join(pids[i], NULL)) handle_error_en(ret, "pthread_join");
  }

  if (sem_destroy(&sem)) handle_error("sem_destroy");
}

2.3.3 Synchronization Example

#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define handle_error(msg) \
  do { perror(msg); exit(EXIT_FAILURE); } while (0)

#define handle_error_en(en, msg) \
  do { errno = en; perror(msg); exit(EXIT_FAILURE); } while (0)

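sem_t sem;  // global semaphore

#define BUF_SIZE 1024
char buf[BUF_SIZE];  // the producer thread writes data into buf; the consumer thread reads data from buf

void *thread_proc(void *arg) {
  if (sem_wait(&sem)) handle_error("sem_wait");  // consumer thread: subtract 1 from the semaphore, i.e. wait for the producer to write data into buf
  while (strcmp(buf, "q\n")) {                   // read the data in buf
    printf("Input %zu characters\n", strlen(buf) - 1);  // strlen() returns size_t, so print it with %zu
    if (sem_wait(&sem)) handle_error("sem_wait");  // keep waiting for new data in buf
  }
  return NULL;
}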

int main() {
  int ret;
  pthread_t pid;

  // Initialize the semaphore to 0 (buf holds no data to consume yet, so the consumer thread blocks in sem_wait())
  if (sem_init(&sem, 0, 0)) handle_error("sem_init");
  if (ret = pthread_create(&pid, NULL, thread_proc, NULL)) handle_error_en(ret, "pthread_create");

  printf("Input some text. Enter 'q' to quit\n");
  while (strcmp(buf, "q\n")) {
    fgets(buf, BUF_SIZE, stdin);
    // Producer thread: after writing data into buf, add 1 to the semaphore to wake the blocked consumer thread (if any)
    if (sem_post(&sem)) handle_error("sem_post");
  }

  if (ret = pthread_join(pid, NULL)) handle_error_en(ret, "pthread_join");
  if (sem_destroy(&sem)) handle_error("sem_destroy");
}

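2.4 Spinlocks

A spinlock is a low-level mutual exclusion mechanism, mainly suited to sharing resources between multiple processors.
When a thread requests a lock that is already held by another thread, it repeatedly checks in a loop whether the lock has become available ("spins") instead of blocking.
After acquiring the lock, a thread should release it as quickly as possible, to reduce the processor time that other requesting threads waste spinning.

Points to note when using spinlocks:

Spinlocks should be used under a real-time scheduling policy (such as SCHED_FIFO, or possibly SCHED_RR).
Under a nondeterministic scheduling policy (such as SCHED_OTHER), if a thread is scheduled off the CPU while holding a spinlock, the other threads requesting the lock keep burning CPU time spinning on it until the holder is scheduled again and releases the lock.
Under a real-time scheduling policy, to keep the thread holding the spinlock from being preempted, a high-priority thread may end up waiting for a low-priority thread to run, which leads to priority inversion.
When using spinlocks, take particular care to avoid deadlock.
If threads deadlock while holding a spinlock, the other threads requesting the lock spin forever, consuming CPU time.
In user space, spinlocks are not suitable as a general locking solution.
By nature they are prone to priority inversion and unbounded spinning.
Programs that use spinlocks must be written with particular care, not only in the code but also in system configuration, thread placement, and priority assignment.

2.4.1 Common API Functions

No.	Function	Description
1	`int pthread_spin_init(pthread_spinlock_t *lock, int pshared);`	Initializes the spinlock according to the `pshared` attribute (whether it is shared between processes).
2	`int pthread_spin_destroy(pthread_spinlock_t *lock);`	Destroys the spinlock (it must not be in use, e.g. locked).
3	`int pthread_spin_lock(pthread_spinlock_t *lock);`	Requests the lock. If another thread holds it, the calling thread spins (the call does not return) until the lock becomes available. If the calling thread already holds the lock at the time of the call, the result is undefined: it may deadlock or return `EDEADLK`.
4	`int pthread_spin_trylock(pthread_spinlock_t *lock);`	Same as `pthread_spin_lock`, except that if the spinlock is already locked, it returns `EBUSY`.
5	`int pthread_spin_unlock(pthread_spinlock_t *lock);`	Unlocks the spinlock. It must be paired with a lock call.

On success, all functions return 0; otherwise they return an error code.
More detailed descriptions can be found by searching for pthread_spin at man7.org/linux/man-pages.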
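
One way to limit the unbounded spinning mentioned in the notes above is to build on `pthread_spin_trylock()` and give up after a fixed number of attempts. The following is only a sketch under that assumption, for Linux/glibc; `lock_with_bounded_spin` and `spin_demo` are hypothetical names, and the retry limit of 100 is an arbitrary illustrative value.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: tries to take the spinlock at most max_spins times,
// yielding the CPU between attempts instead of spinning without bound.
// Returns true if the lock was acquired.
bool lock_with_bounded_spin(pthread_spinlock_t* lock, int max_spins) {
  for (int i = 0; i < max_spins; i++) {
    if (pthread_spin_trylock(lock) == 0) return true;  // acquired
    sched_yield();  // give the current holder a chance to run
  }
  return false;  // still busy: the caller decides what to do next
}

// Returns 1 if the first attempt succeeded and the second (made while the
// lock is still held) gave up, which is the expected outcome.
int spin_demo() {
  pthread_spinlock_t lock;
  if (pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE) != 0) return -1;
  bool first = lock_with_bounded_spin(&lock, 100);   // lock is free: succeeds
  bool second = lock_with_bounded_spin(&lock, 100);  // lock is held: trylock keeps returning EBUSY
  if (first) pthread_spin_unlock(&lock);
  pthread_spin_destroy(&lock);
  return (first ? 1 : 0) + (second ? 2 : 0);
}
```

A caller that gets `false` back can fall back to a blocking mutex or retry later, rather than consuming CPU time indefinitely.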

2.4.2 Usage Example

#include <pthread.h>
#include <stdio.h>

#define LOOPS 10000000    // number of loop iterations
int gcn = 0;              // global variable, protected by the spinlock
pthread_spinlock_t lock;  // global spinlock

void* thread_proc1(void* arg) {
  for (int i = 0; i < LOOPS; i++) {
    pthread_spin_lock(&lock);
    gcn += 1;
    pthread_spin_unlock(&lock);
  }
  return NULL;
}

void* thread_proc2(void* arg) {
  for (int i = 0; i < LOOPS; i++) {
    pthread_spin_lock(&lock);
    gcn -= 1;
    pthread_spin_unlock(&lock);
  }
  return NULL;
}

int main() {
  pthread_t pid1, pid2;
  pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);  // initialize the spinlock
  for (int i = 0; i < 10; i++) {                      // run the test 10 times
    pthread_create(&pid1, NULL, thread_proc1, NULL);
    pthread_create(&pid2, NULL, thread_proc2, NULL);
    pthread_join(pid1, NULL);
    pthread_join(pid2, NULL);
    printf("gcn %d: %d\n", i, gcn);  // gcn is always 0
    gcn = 0;
  }
  pthread_spin_destroy(&lock);  // destroy the spinlock
}

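3 Thread Mutual Exclusion with C++

The APIs provided by the operating system impose the fewest restrictions, but they are not very convenient to use and are hard to keep portable across systems.
Starting with the C++11 standard, the C++ language provides a set of facilities for thread mutual exclusion and synchronization.

3.1 Mutexes (C++11)

Review of the "mutex" concept (see section 2.1)

3.1.1 Class Definitions

The classes and functions related to mutexes are declared in the header <mutex>:

No.	Class	Description
1	`mutex`	Basic mutex (exclusive, non-recursive) (compare `PTHREAD_MUTEX_NORMAL`)
2	`timed_mutex`	Timed mutex; a timeout can be set for locking
3	`recursive_mutex`	Recursive mutex; can be locked recursively within a thread (compare `PTHREAD_MUTEX_RECURSIVE`)
4	`recursive_timed_mutex`	Timed recursive mutex; combines the features of `timed_mutex` and `recursive_mutex`

Member functions of each class:

No.	Member function	`mutex`	`timed_mutex`	`recursive_mutex`	`recursive_timed_mutex`	Description
1	Constructor	√	√	√	√	A default constructor is provided; not copyable, not movable.
2	Destructor	√	√	√	√
3	`void lock();`	√	√	√	√	Blocking lock; throws an exception on failure
4	`bool try_lock();`	√	√	√	√	Returns `false` if the lock cannot be taken immediately
5	`void unlock();`	√	√	√	√	Unlocks; must be paired with a lock call to avoid deadlock
6	`native_handle_type native_handle();`	√	√	√	√	Corresponds to `pthread_mutex_t*` on Linux
7	`template<class Rep, class Period>` `bool try_lock_for(const chrono::duration<Rep, Period>& timeout_duration);`		√		√	Returns `false` if the wait exceeds the given duration
8	`template<class Clock, class Duration>` `bool try_lock_until(const chrono::time_point<Clock, Duration>& timeout_time);`		√		√	Returns `false` if the wait reaches the given time point

The member functions lock() and unlock() can be called directly.
However, to make sure unlock() is always paired with lock(), it is recommended to use the RAII class templates provided by the standard library, such as std::lock_guard<>: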

// Header: include/c++/10/bits/std_mutex.h

// Tag type and constant that select a "locking policy": the calling thread has already locked the mutex
struct adopt_lock_t { explicit adopt_lock_t() = default; };
_GLIBCXX17_INLINE constexpr adopt_lock_t adopt_lock{};

template <typename _Mutex>
class lock_guard {  // scope-based mutex wrapper
 public:
  typedef _Mutex mutex_type;  // mutex type

  // The constructor locks the mutex through its lock() member function
  explicit lock_guard(mutex_type& __m) : _M_device(__m) { _M_device.lock(); }

  // The mutex is already locked, so the constructor does not lock it again
  lock_guard(mutex_type& __m, adopt_lock_t) noexcept : _M_device(__m) {}  // calling thread owns mutex

  // The destructor unlocks the mutex through its unlock() member function
  ~lock_guard() { _M_device.unlock(); }

  // ...

 private:
  mutex_type& _M_device;  // reference to the mutex object
};

3.1.2 Usage Example

// Build: g++ -std=c++11 -pthread test.cpp
#include <iostream>
#include <mutex>
#include <thread>

#define LOOPS 10000000
int gcn = 0;         // global variable, protected by the mutex
std::mutex g_mutex;  // global mutex

void thread_proc1() {
  for (int i = 0; i < LOOPS; i++) {
    std::lock_guard<std::mutex> guard(g_mutex);  // locks on construction, unlocks on destruction
    gcn += 1;                                    // critical section containing the critical resource gcn
  }
}

void thread_proc2() {
  for (int i = 0; i < LOOPS; i++) {
    std::lock_guard<std::mutex> guard(g_mutex);
    gcn -= 1;
  }
}

int main() {
  for (int i = 0; i < 10; i++) {  // run the test 10 times
    std::thread t1(thread_proc1);
    std::thread t2(thread_proc2);
    t1.join();
    t2.join();
    std::cout << "gcn " << i << ": " << gcn << '\n';  // gcn is always 0
    gcn = 0;
  }
}

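Compared with std::lock_guard<>, the class template std::unique_lock<> is more flexible:

It lets you choose whether to lock on construction (several locking strategies are supported) and whether to unlock on destruction;
Locking and unlocking can be done through the unique_lock object;
Ownership of the associated mutex can be transferred by moving the unique_lock object.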
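
The three points above can be seen in a minimal sketch. The names `g_m`, `g_value`, `acquire`, and `unique_lock_demo` are hypothetical and exist only for this illustration; the `std::unique_lock` members used (`lock()`, `unlock()`, `owns_lock()`) and the `std::defer_lock` tag are standard C++11.

```cpp
#include <mutex>
#include <utility>

std::mutex g_m;   // hypothetical mutex for this example
int g_value = 0;  // hypothetical data protected by g_m

// Returns a lock that already owns g_m; ownership moves to the caller.
std::unique_lock<std::mutex> acquire() {
  std::unique_lock<std::mutex> lk(g_m, std::defer_lock);  // constructed without locking
  lk.lock();                                              // lock later, through the wrapper
  return lk;                                              // ownership is moved out
}

// Returns the final value of g_value, or a negative code if a check fails.
int unique_lock_demo() {
  std::unique_lock<std::mutex> lk = acquire();
  if (!lk.owns_lock()) return -1;
  ++g_value;

  lk.unlock();  // release early, e.g. around work that does not need the lock
  lk.lock();    // and re-acquire through the same object
  ++g_value;

  std::unique_lock<std::mutex> other(std::move(lk));  // transfer ownership
  if (lk.owns_lock() || !other.owns_lock()) return -2;
  return g_value;  // the destructor of `other` unlocks g_m
}
```

Called once on a fresh program, `unique_lock_demo()` returns 2. None of these steps is possible with `std::lock_guard<>`, which locks in its constructor and can only unlock in its destructor.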

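3.1.3 One-Time Mutual Exclusion

In some cases, shared data needs mutual exclusion only once.
For example, for shared data with lazy initialization (usually because initialization is expensive), mutual exclusion is needed only during initialization. Once initialization is complete, the operations involved cause no data races (for instance, they are read-only) and no longer need protection.

Protecting the data with a mutex hurts the concurrency of multiple threads:

std::shared_ptr<some_resource> resource_ptr;
std::mutex resource_mutex;
void foo() {
  std::unique_lock<std::mutex> lk(resource_mutex);  // on every call, threads have to wait for each other here
  if (!resource_ptr) {
    resource_ptr.reset(new some_resource);
  }
  lk.unlock();
  resource_ptr->do_something();
}

C++11 provides the std::once_flag class and the std::call_once() function template for one-time mutual exclusion. They ensure that a function (callable object) is executed only once, even if multiple threads call it many times at the same time.

// The callable object f is executed only once
// flag records the execution state; if f throws an exception, a later call can try again
template <class Callable, class... Args>
void call_once(std::once_flag& flag, Callable&& f, Args&&... args);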

// Build: g++ -std=c++11 -pthread test.cpp
#include <chrono>
#include <iostream>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct some_resource {
  some_resource() {  // initializing the shared resource requires mutual exclusion
    std::cout << "some_resource constructor\n";
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }
  void do_something(int tid, int loop) {  // accessing the shared resource does not require mutual exclusion
    std::cout << "do_something: " + std::to_string(tid) + ", loop " + std::to_string(loop) + "\n";
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }
};

std::shared_ptr<some_resource> resource_ptr;  // pointer to the resource
std::once_flag resource_flag;                 // one once_flag instance per one-time initialization
void init_resource(int tid, int loop) {       // resource initialization function
  std::cout << "init_resource: " + std::to_string(tid) + ", loop " + std::to_string(loop) + "\n";
  resource_ptr.reset(new some_resource);
}

void foo(int tid, int loop) {                               // executed concurrently by multiple threads
  std::call_once(resource_flag, init_resource, tid, loop);  // ensures init_resource is executed only once
  resource_ptr->do_something(tid, loop);                    // later operations no longer need mutual exclusion
}

void thread_proc(int tid) {
  for (int i = 0; i < 3; i++) {
    foo(tid, i);
  }
}

int main() {
  std::vector<std::thread> threads;
  for (int i = 0; i < 3; i++) {
    // C++11 emplace_back constructs the std::thread in place at the end of the container
    // from the given arguments, avoiding a copy or move
    threads.emplace_back(thread_proc, i);
  }
  for (auto& t : threads) {
    t.join();
  }
}

The corresponding API function provided by POSIX is pthread_once().
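
For comparison, a minimal sketch of the same idea with `pthread_once()` follows. It assumes Linux/glibc; `g_once`, `g_init_calls`, `init_resource`, `worker`, and `count_init_calls` are hypothetical names used only in this example, and the atomic counter exists purely to observe how often the initializer ran.

```cpp
#include <atomic>
#include <pthread.h>
#include <vector>

static pthread_once_t g_once = PTHREAD_ONCE_INIT;  // one control object per one-time initialization
static std::atomic<int> g_init_calls{0};           // counts how often the initializer actually ran

// The initialization routine: takes no arguments and returns nothing.
static void init_resource() { g_init_calls.fetch_add(1); }

// Every worker asks for the initialization; only one call runs it.
static void* worker(void*) {
  pthread_once(&g_once, init_resource);
  return nullptr;
}

// Starts nthreads workers and returns how many times init_resource ran.
int count_init_calls(int nthreads) {
  std::vector<pthread_t> tids;
  for (int i = 0; i < nthreads; i++) {
    pthread_t tid;
    if (pthread_create(&tid, nullptr, worker, nullptr) == 0) tids.push_back(tid);
  }
  for (pthread_t tid : tids) pthread_join(tid, nullptr);
  return g_init_calls.load();
}
```

Unlike `std::call_once()`, the routine passed to `pthread_once()` takes no arguments, so any inputs have to be supplied through other means such as global state.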

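3.2 Read-Write Locks (C++14/17)

Review of the "read-write lock" concept (see section 2.2)

3.2.1 Class Definitions

The classes related to read-write locks (shared locks) are defined in <shared_mutex>.
Compared with the class std::shared_mutex (C++17), the class std::shared_timed_mutex (C++14) supports more operations.

Write lock member functions:

No.	Member function (write lock / exclusive lock operations)	`shared_timed_mutex`	`shared_mutex`
1	`void lock();`	√	√
2	`bool try_lock();`	√	√
3	`void unlock();`	√	√
4	`template <class Rep, class Period>` `bool try_lock_for(const std::chrono::duration<Rep, Period>& timeout_duration);`	√
5	`template <class Clock, class Duration>` `bool try_lock_until(const std::chrono::time_point<Clock, Duration>& timeout_time);`	√

Read lock member functions (compared with the write lock functions, the names gain a _shared suffix):

No.	Member function (read lock / shared lock operations)	`shared_timed_mutex`	`shared_mutex`
1	`void lock_shared();`	√	√
2	`bool try_lock_shared();`	√	√
3	`void unlock_shared();`	√	√
4	`template <class Rep, class Period>` `bool try_lock_shared_for(const std::chrono::duration<Rep, Period>& timeout_duration);`	√
5	`template <class Clock, class Duration>` `bool try_lock_shared_until(const std::chrono::time_point<Clock, Duration>& timeout_time);`	√

RAII wrappers for read-write locks:

std::lock_guard<std::shared_mutex> and std::unique_lock<std::shared_mutex> manage locking and unlocking of the "write lock" (the same class templates as for mutexes);
std::shared_lock<std::shared_mutex> manages locking and unlocking of the "read lock" (a class template specific to read locks).

The class template shared_lock<> calls the *_shared member functions of the read-write lock internally, and is used much like unique_lock<>:

It lets you choose whether to lock on construction (several locking strategies are supported) and whether to unlock on destruction;
Locking and unlocking can be done through the shared_lock object;
Ownership of the associated read-write lock can be transferred by moving the shared_lock object.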
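
These points can be seen in a minimal sketch that uses std::shared_timed_mutex so that C++14 is enough to build it. `g_rw`, `writer_can_enter`, and `shared_lock_demo` are hypothetical names for this illustration. The write attempt is made from a second thread because locking a shared mutex again from a thread that already owns it is undefined behavior.

```cpp
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <utility>

std::shared_timed_mutex g_rw;  // hypothetical read-write lock for this example

// Returns true if a different thread can take the exclusive (write) lock
// right now, without blocking.
bool writer_can_enter() {
  bool ok = false;
  std::thread t([&ok] {
    std::unique_lock<std::shared_timed_mutex> w(g_rw, std::try_to_lock);
    ok = w.owns_lock();  // the write lock, if taken, is released when w is destroyed
  });
  t.join();
  return ok;
}

// Returns 0 if the writer is kept out for as long as a read lock is held.
int shared_lock_demo() {
  std::shared_lock<std::shared_timed_mutex> r(g_rw, std::defer_lock);  // no lock taken yet
  if (r.owns_lock()) return -1;

  r.lock();  // calls g_rw.lock_shared()
  bool blocked = !writer_can_enter();

  std::shared_lock<std::shared_timed_mutex> r2(std::move(r));  // ownership moves to r2
  bool still_blocked = !writer_can_enter();

  r2.unlock();  // calls g_rw.unlock_shared()
  return (blocked && still_blocked) ? 0 : 1;
}
```

The read lock stays held across the move from `r` to `r2`, so the writer is refused both before and after the transfer.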

3.2.2 Usage Example

// Build: g++ -std=c++17 -pthread test.cpp
#include <iostream>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

class dns_entry { // DNS entry
 public:
  dns_entry() = default;
  dns_entry(std::string ip, std::string time) : ip_address(ip), timestamp(time) {}
  std::string ip_address;
  std::string timestamp;
};

// Cache table of DNS entries, used to resolve domain names to IP addresses [many reads]
// New entries are added from time to time, but a given DNS entry usually stays unchanged for a long time [few writes]
class dns_cache {
  std::map<std::string, dns_entry> entries;  // holds the cached data
  mutable std::shared_mutex entry_mutex;     // read-write protection for entries

 public:
  dns_entry find_entry(std::string const& domain) const {
    std::shared_lock<std::shared_mutex> lk(entry_mutex);  // protects shared, read-only access
    std::map<std::string, dns_entry>::const_iterator const it = entries.find(domain);
    return (it == entries.end()) ? dns_entry() : it->second;
  }

  void update_or_add_entry(std::string const& domain, dns_entry const& dns_details) {
    std::lock_guard<std::shared_mutex> lk(entry_mutex);  // protects exclusive, write access
    entries[domain] = dns_details;
  }
};

int main() {
  dns_cache cache;

  std::vector<std::thread> threads;
  for (int i = 0; i < 10; i++) {
    if (i == 5) {  // add a new entry
      threads.emplace_back(
          [&cache] { cache.update_or_add_entry("www.baidu.com", dns_entry("182.61.200.7", "2024-3-1")); });
    }

    threads.emplace_back([&cache, i] {  // read an entry
      auto entry = cache.find_entry("www.baidu.com");
      std::cout << std::to_string(i) + ": " + entry.ip_address + '\n';
    });
  }

  for (auto& t : threads) {
    t.join();
  }
}

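3.3 Semaphores (C++20)

Review of the "semaphore" concept (see section 2.3)

3.3.1 Class Definitions

The header <semaphore> contains the definitions of the "counting semaphore" and the "binary semaphore":

// Counting semaphore: allows at least LeastMaxValue concurrent accesses to the same resource.
// LeastMaxValue is the smallest possible maximum, not the actual maximum.
template <std::ptrdiff_t LeastMaxValue = /* implementation-defined */>
class counting_semaphore;

// Binary semaphore: LeastMaxValue is 1
using binary_semaphore = std::counting_semaphore<1>;

No.	Member function	Description
1	`constexpr explicit counting_semaphore(std::ptrdiff_t desired);`	Constructs a semaphore object with its internal counter initialized to `desired` (the value must be `>= 0` and `<= max()`).
2	`void release(std::ptrdiff_t update = 1);`	Increases the internal counter by `update`. If threads are blocked in `acquire()`, they are subsequently unblocked according to the scheduling policy.
3	`void acquire();`	If the internal counter is greater than `0`, decrements it by `1` and returns; otherwise the thread blocks until the counter is greater than `0` and can be decremented successfully.
4	`bool try_acquire() noexcept;`	If the internal counter is greater than `0`, tries to decrement it by `1` and returns `true` on success; otherwise returns `false` (spurious failures are possible: the counter is greater than `0` but the decrement does not happen).
5	`template <class Rep, class Period>` `bool try_acquire_for(const std::chrono::duration<Rep, Period>& rel_time);`	If the internal counter is greater than `0` and can be decremented by `1`, returns `true`; otherwise blocks until the counter is greater than `0` and can be decremented, or until the duration `rel_time` has elapsed.
6	`template <class Clock, class Duration>` `bool try_acquire_until(const std::chrono::time_point<Clock, Duration>& abs_time);`	If the internal counter is greater than `0` and can be decremented by `1`, returns `true`; otherwise blocks until the counter is greater than `0` and can be decremented, or until the time point `abs_time` is reached.
7	`static constexpr std::ptrdiff_t max() noexcept;`	Returns the maximum possible value of the internal counter, which is greater than or equal to `LeastMaxValue`.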
3.3.2 Mutual Exclusion Example

// Build (requires GCC 11 or later): g++ -std=c++20 -pthread test.cpp
#include <chrono>
#include <iostream>
#include <semaphore>
#include <string>
#include <thread>
#include <vector>

#define PID_COUNT 5  // number of consumer threads
#define RES_COUNT 3  // number of resources

// Global counting semaphore; the initial count (number of accessible resources) is RES_COUNT
std::counting_semaphore<RES_COUNT> sem{RES_COUNT};

void thread_proc(int idx) {
  std::cout << "sub thread " + std::to_string(idx) + " begin\n";
  sem.acquire();  // subtract 1 from the semaphore (lock)

  std::cout << "sub thread " + std::to_string(idx) + " processing...\n";
  std::this_thread::sleep_for(std::chrono::seconds(idx % 2 + 1));  // portable replacement for sleep()

  sem.release();  // add 1 to the semaphore (unlock)
  std::cout << "sub thread " + std::to_string(idx) + " end\n";
}

int main() {
  std::vector<std::thread> threads;

  for (int i = 0; i < PID_COUNT; i++) {
    threads.emplace_back(thread_proc, i);
  }

  for (auto& thread : threads) {
    thread.join();
  }
}

3.3.3 Synchronization Example

// Build (requires GCC 11 or later): g++ -std=c++20 -pthread test.cpp
#include <chrono>
#include <iostream>
#include <semaphore>
#include <thread>

// Global binary semaphores, initial count 0 (non-signaled state)
std::binary_semaphore smphSignalMainToThread{0}, smphSignalThreadToMain{0};

void ThreadProc() {
  // Wait for the main thread's "start working" notification by decrementing the semaphore count
  smphSignalMainToThread.acquire();
  std::cout << "[thread] Got the signal\n";

  std::this_thread::sleep_for(std::chrono::seconds(3));

  // Notify the main thread that the work is done by incrementing the semaphore count
  std::cout << "[thread] Send the signal\n";
  smphSignalThreadToMain.release();
}

int main() {
  std::thread thrWorker(ThreadProc);

  // Notify the worker thread to start working by incrementing the semaphore count
  std::cout << "[main] Send the signal\n";
  smphSignalMainToThread.release();

  // Wait for the worker thread's "work done" notification by decrementing the semaphore count
  smphSignalThreadToMain.acquire();
  std::cout << "[main] Got the signal\n";

  thrWorker.join();
}

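3.4 Deadlock prevention

Deadlock is one of the thorniest problems in multithreaded code: a program can run correctly almost all of the time and still deadlock unexpectedly.
A few relatively simple rules, applied while writing the code, help prevent it.

3.4.1 Causes of deadlock and how to prevent them

3.4.1.1 Lock and unlock calls are not paired

Cause: a thread acquires a lock and exits without releasing it, so other threads block indefinitely.
Prevention: wrap locking and unlocking in an RAII object so that the two always occur as a pair.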
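
As a minimal sketch of the RAII suggestion (the names g_mtx, g_value, and update are made up for illustration), std::lock_guard releases the mutex on every exit path, including when an exception leaves the critical section:

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

std::mutex g_mtx;  // hypothetical global mutex for this sketch
int g_value = 0;   // shared data protected by g_mtx

// Stores v in g_value; throws on negative input while the lock is held.
void update(int v) {
  std::lock_guard<std::mutex> guard(g_mtx);  // the constructor locks
  if (v < 0) throw std::invalid_argument("negative value");
  g_value = v;
}  // the destructor unlocks on every exit path, including the throw

// Returns true if the mutex is still usable after an exception escaped update().
bool mutex_released_after_exception() {
  bool thrown = false;
  try {
    update(-1);
  } catch (const std::invalid_argument&) {
    thrown = true;  // guard has already been destroyed, so g_mtx is unlocked here
  }
  assert(thrown);
  update(42);  // would never complete normally if the lock had leaked
  return g_value == 42;
}
```

With a manual g_mtx.lock() / g_mtx.unlock() pair, the throw would skip the unlock() call and the second update() would hang.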

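3.4.1.2 Repeated locking within one thread

Cause: a thread requests a lock that it already holds, which can deadlock.
Prevention: choose a lock type that matches the requirement: one that increments a lock count (recursive lock), one that shares read access (read-write lock), or one that guards several copies of a resource (semaphore).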
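
The recursive-lock option can be sketched as follows. The Counter class is hypothetical; the point is only that std::recursive_mutex lets the owning thread lock again, whereas relocking a std::mutex from the same thread is undefined behavior (in practice usually a deadlock):

```cpp
#include <cassert>
#include <mutex>

class Counter {  // hypothetical class, for illustration only
  mutable std::recursive_mutex m;  // may be locked repeatedly by the thread that owns it
  int value = 0;

 public:
  void add(int n) {
    std::lock_guard<std::recursive_mutex> guard(m);  // nested lock when called from add_twice()
    value += n;
  }

  void add_twice(int n) {
    std::lock_guard<std::recursive_mutex> guard(m);  // first lock by this thread
    add(n);  // the same thread locks m again: allowed for a recursive mutex
    add(n);
  }

  int get() const {
    std::lock_guard<std::recursive_mutex> guard(m);
    return value;
  }
};

int demo_recursive() {
  Counter c;
  c.add_twice(5);
  assert(c.get() == 10);
  return c.get();
}
```

A recursive mutex is fully released only after unlock() has been called as many times as lock(); here the lock_guard objects take care of that.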

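3.4.1.3 Circular lock waits between threads

Cause:
When several threads each request several locks and the request order differs between threads, deadlock becomes likely.
For example, thread 1 holds lock A and requests lock B (order AB) while thread 2 holds lock B and requests lock A (order BA); neither can proceed.

Prevention:

The usual approach is to always request locks in the same order, for example always A first and then B.
A fixed order inside a function is not always enough, as the case below shows; there, consider acquiring the locks simultaneously so that order no longer matters.

void swap(type& lhs, type& rhs) {
  // lock lhs
  // lock rhs
  // swap the internal data of lhs and rhs
}

// type A, B;
// thread 1 calls swap(A, B);
// thread 2 calls swap(B, A);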
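
One possible way to make the swap above safe is sketched here, assuming a made-up Box type in place of `type`. It defers locking with std::unique_lock and std::defer_lock, then lets std::lock() acquire both mutexes in one call, so the argument order no longer matters:

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

struct Box {  // hypothetical type standing in for `type`
  std::mutex m;
  int data;
  explicit Box(int d) : data(d) {}
};

void swap_boxes(Box& lhs, Box& rhs) {
  if (&lhs == &rhs) return;  // locking the same std::mutex twice would be undefined behavior
  std::unique_lock<std::mutex> lock_l(lhs.m, std::defer_lock);  // associated, not locked yet
  std::unique_lock<std::mutex> lock_r(rhs.m, std::defer_lock);
  std::lock(lock_l, lock_r);  // acquires both using a deadlock-avoidance algorithm
  std::swap(lhs.data, rhs.data);
}  // both unique_lock destructors unlock

// Two threads swap in opposite argument order, the same number of times each.
bool demo_swap() {
  Box a(1), b(2);
  const int rounds = 10000;
  std::thread t1([&a, &b, rounds] { for (int i = 0; i < rounds; i++) swap_boxes(a, b); });
  std::thread t2([&a, &b, rounds] { for (int i = 0; i < rounds; i++) swap_boxes(b, a); });
  t1.join();
  t2.join();
  return a.data == 1 && b.data == 2;  // an even total number of swaps restores the initial state
}
```

Replacing the three locking lines with two plain lock_guard objects (lhs first, then rhs) would reintroduce the AB/BA pattern described above.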

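The core idea of deadlock avoidance: if another thread might be waiting (directly or indirectly) for the current thread, the current thread must never wait for it in return.

Simple rules for avoiding deadlock:

Avoid nested locks: if you already hold a lock, do not try to acquire a second one;
While holding a lock, avoid calling user-supplied code as far as possible: if that code tries to acquire a lock, the "avoid nested locks" rule is violated;
While holding a lock, avoid join()-ing another thread as far as possible: if the thread being waited for tries to acquire that lock, the result is a deadlock;
If several locks are needed, prefer std::lock() to acquire all of them in one call;
If several locks are needed and they cannot be acquired in one call, acquire them in the same fixed order in every thread, for example by hierarchy (ordered locking).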
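
The rule about user-supplied code can be illustrated with a small sketch. EventLog and its members are hypothetical; the technique shown is to copy the data while holding the lock and to invoke the callback only after the lock has been released:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

class EventLog {  // hypothetical class, for illustration only
  std::mutex m;
  std::vector<int> events;

 public:
  void add(int e) {
    std::lock_guard<std::mutex> guard(m);
    events.push_back(e);
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(m);
    return events.size();
  }

  // Calls the user-supplied fn once per stored event, without holding m during the calls.
  void for_each(const std::function<void(int)>& fn) {
    std::vector<int> snapshot;
    {
      std::lock_guard<std::mutex> guard(m);  // hold the lock only while copying
      snapshot = events;
    }
    for (int e : snapshot) fn(e);  // fn may call add() or size() without self-deadlock
  }
};

std::size_t demo_callback() {
  EventLog log;
  log.add(1);
  log.add(2);
  log.for_each([&log](int e) { log.add(e * 10); });  // the callback re-enters the class
  assert(log.size() == 4);
  return log.size();  // the two original events plus two added by the callback
}
```

The cost of this design is the copy, and the callback sees a snapshot that may already be out of date; whether that is acceptable depends on the application.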

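3.4.1.4 Circular join waits between threads

Cause:
Threads can deadlock without any lock being involved, simply by waiting for each other.
For example, if two threads each have an associated std::thread instance and each calls join() on the other's instance, both wait forever.

Prevention:

Have one function start all the threads and join() all of them;
As with hierarchical locking, assign levels to threads so that each thread waits only for threads at a lower level.

3.4.2 Locking simultaneously with lock()

The C++11 standard library provides the std::lock() function template:

template <class Lockable1, class Lockable2, class... LockableN>
void lock(Lockable1& lock1, Lockable2& lock2, LockableN&... lockn);

It locks several mutexes in one call, without the deadlock risk that comes from differing lock orders;
The result is all-or-nothing: either every mutex is locked, or an exception is thrown and none of them remains locked.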

// Build: g++ -std=c++11 -pthread test.cpp
#include <chrono>
#include <iostream>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

class Employee {
  std::string id;                 // ID of this Employee instance
  std::vector<std::string> gifts; // all gifts received
  std::mutex m;                   // every instance has its own mutex

 public:
  Employee(std::string id) : id(id) {}
  std::string output() const {
    std::string ret = "Employee " + id + " has gifts from: ";
    auto n{gifts.size()};
    for (const auto& gift : gifts) ret += gift + (--n ? ", " : "");
    return ret;
  }

  friend void exchange_gifts(Employee& e1, Employee& e2);
};

void exchange_gifts(Employee& e1, Employee& e2) {
  if (&e1 == &e2) return;  // the arguments must refer to different instances

  std::cout << e1.id + " and " + e2.id + " are waiting for locks\n";

  {
    std::lock(e1.m, e2.m);                                      // lock both instances' mutexes in one call
    std::lock_guard<std::mutex> lock_a(e1.m, std::adopt_lock);  // adopt_lock: the mutex is already locked
    std::lock_guard<std::mutex> lock_b(e2.m, std::adopt_lock);  // the constructor does not lock; the destructor unlocks

    std::cout << e1.id + " and " + e2.id + " got locks\n";

    e1.gifts.push_back(e2.id);
    e2.gifts.push_back(e1.id);
  }

  std::this_thread::sleep_for(std::chrono::milliseconds(696));  // simulate a time-consuming operation
}

int main() {
  Employee alice("Alice"), bob("Bob"), christina("Christina"), dave("Dave");

  std::vector<std::thread> threads;
  threads.emplace_back(exchange_gifts, std::ref(alice), std::ref(bob));
  threads.emplace_back(exchange_gifts, std::ref(christina), std::ref(bob));
  threads.emplace_back(exchange_gifts, std::ref(christina), std::ref(alice));
  threads.emplace_back(exchange_gifts, std::ref(dave), std::ref(bob));

  for (auto& thread : threads) thread.join();

  std::cout << alice.output() << '\n' 
            << bob.output() << '\n' 
            << christina.output() << '\n' 
            << dave.output() << '\n';
}

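To complement the variadic std::lock() function template, C++17 adds the RAII class template std::scoped_lock<>. Apart from accepting a variable number of mutexes (and acquiring them the way std::lock() does), it serves the same purpose as std::lock_guard<>.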

// Header: include/c++/10/mutex
template <typename... _MutexTypes> // variadic template
class scoped_lock {                // the constructor accepts any number of mutex objects
 public:

  // The constructor locks all the mutexes through std::lock()
  explicit scoped_lock(_MutexTypes&... __m) : _M_devices(std::tie(__m...)) { std::lock(__m...); }

  // The mutexes are already locked, so the constructor does not lock them again
  explicit scoped_lock(adopt_lock_t, _MutexTypes&... __m) noexcept : _M_devices(std::tie(__m...)) {}

  // The destructor unlocks each mutex through its unlock() member function
  ~scoped_lock() {
    std::apply([](auto&... __m) { (__m.unlock(), ...); }, _M_devices);
  }

  //...

 private:
  tuple<_MutexTypes&...> _M_devices;  // references to the mutex objects
};
};

Rewriting exchange_gifts() with std::scoped_lock<>:

// Build: g++ -std=c++17 -pthread test.cpp
void exchange_gifts(Employee& e1, Employee& e2) {
  if (&e1 == &e2) return;

  {
    std::scoped_lock<std::mutex, std::mutex> guard(e1.m, e2.m); // locks on construction, unlocks on destruction
    // std::scoped_lock guard(e1.m, e2.m); // equivalent, using C++17 class template argument deduction
    e1.gifts.push_back(e2.id);
    e2.gifts.push_back(e1.id);
  }

  std::this_thread::sleep_for(std::chrono::milliseconds(696));
}

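3.4.3 Hierarchical (ordered) locking

Divide the application into layers and assign every mutex to a layer.
Calls flow from the higher layers to the lower ones, so once a thread has locked a mutex at a higher level, it may subsequently lock only mutexes at lower levels.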

// Build: g++ -std=c++11 -pthread test.cpp
#include <chrono>
#include <climits>  // ULONG_MAX
#include <iostream>
#include <mutex>
#include <stdexcept>  // logic_error
#include <thread>

/**
 * Custom mutex type that stores a hierarchy level and ensures that, within a thread,
 * mutexes are locked and unlocked in the prescribed level order (otherwise it throws)
 */
class hierarchical_mutex {
  std::mutex internal_mutex;               // underlying mutex
  unsigned long const hierarchy_value;     // level of this mutex
  unsigned long previous_hierarchy_value;  // level of the mutex locked before this one (must be higher)
  static thread_local unsigned long this_thread_hierarchy_value;  // thread-local: level of the mutex this thread locked most recently

  void check_for_hierarchy_violation() {                   // check before locking
    if (this_thread_hierarchy_value <= hierarchy_value) {  // this mutex's level must be lower than the thread's most recent level
      throw std::logic_error("mutex hierarchy violated");
    }
  }

  void update_hierarchy_value() {
    previous_hierarchy_value = this_thread_hierarchy_value;  // save the previous level so that unlock() can restore it
    this_thread_hierarchy_value = hierarchy_value;           // record this mutex's level as the thread's most recent
  }

 public:
  explicit hierarchical_mutex(unsigned long value) : hierarchy_value(value), previous_hierarchy_value(0) {}

  void lock() {
    check_for_hierarchy_violation();  // verify that locking at this level is allowed
    internal_mutex.lock();            // acquire the lock
    update_hierarchy_value();         // update the level bookkeeping
  }

  void unlock() {                                          // unlock in the reverse order of locking
    if (this_thread_hierarchy_value != hierarchy_value) {  // this must be the mutex the thread locked most recently
      throw std::logic_error("mutex hierarchy violated");
    }
    this_thread_hierarchy_value = previous_hierarchy_value;  // restore the previous level
    internal_mutex.unlock();                                 // release the lock
  }

  bool try_lock() {
    check_for_hierarchy_violation();
    if (!internal_mutex.try_lock()) return false;
    update_hierarchy_value();
    return true;
  }
};

thread_local unsigned long hierarchical_mutex::this_thread_hierarchy_value(ULONG_MAX);

/**
 * Example use of hierarchical_mutex
 */
hierarchical_mutex high_level_mutex(100);  // mutexes in different layers of the program get different level numbers
hierarchical_mutex low_level_mutex(10);
hierarchical_mutex middle_level_mutex(60);

void low_level_func() {
  std::lock_guard<hierarchical_mutex> lk(low_level_mutex);  // lock low_level
  std::this_thread::sleep_for(std::chrono::seconds(1));     // simulate low-level work
}

void high_level_func() {
  std::lock_guard<hierarchical_mutex> lk(high_level_mutex);  // lock high_level
  std::this_thread::sleep_for(std::chrono::seconds(1));      // simulate high-level work
  low_level_func();                                          // call into the low level
}

void thread_proc1() { high_level_func(); }

void other_stuff() {
  high_level_func();                                     // call into the high level
  std::this_thread::sleep_for(std::chrono::seconds(1));  // simulate other_stuff work
}

void thread_proc2() {
  try {
    std::lock_guard<hierarchical_mutex> lk(middle_level_mutex);  // lock middle_level
    other_stuff();                                               // indirectly locks high_level, which violates the hierarchy
  } catch (const std::exception& ex) {
    std::cout << "Exception: " << ex.what() << '\n';
  }
}

int main() {
  std::thread t1(thread_proc1);
  std::thread t2(thread_proc2);
  t1.join();
  t2.join();
}

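4 Atomic variables

Atomic types and atomic operations were introduced to support lock-free programming, which lowers system overhead and improves execution efficiency.

This section gives only a brief introduction to atomic variables; for memory ordering and other details, see the "C Atomic operations" and "C++ Atomic operations" pages on cppreference.

4.1 Atomic variables (C11)

The C11 standard supports atomic types through the new keyword _Atomic and atomic operations through the new header <stdatomic.h>.

4.1.1 Atomic types

An atomic type can be declared directly with the keyword _Atomic, as in _Atomic (int) (_Atomic as a type specifier) or _Atomic int (_Atomic as a type qualifier);
- _Atomic cannot be applied to array or function types; in the specifier form _Atomic (T), T also cannot be an atomic or otherwise qualified type.
The atomic type aliases defined by the standard library, such as atomic_int, can be used as well.

4.1.2 Initializing atomic variables

An atomic variable with static or thread storage duration that has no explicit initializer is zero-initialized automatically;
An atomic variable with automatic storage duration that has no initializer is in an indeterminate state.

/**
 * Initialize an atomic variable
 * Generic function: A stands for an atomic type and C for the corresponding non-atomic type;
 * initializes the atomic object that obj points to with value, and also sets any implementation-defined additional state for that object.
 * Note: any concurrent access to the object during initialization is a data race, even if the access is an atomic operation.
 */
void atomic_init(volatile A * obj, C value);

atomic_int atom_i;
atomic_init(&atom_i, 77); // initialize the atomic variable atom_i to 77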

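4.1.3 Atomic operations

An atomic operation is an operation applied to an atomic variable (object):

It is an indivisible operation, such as a single read, a single write, or a complete read-modify-write, including:
- the compound assignment operators +=, -=, &=, |=, ^=;
- the prefix and postfix ++ and -- operators;
- the many library functions for atomic operations defined in <stdatomic.h>.
While one thread performs an atomic operation, no other thread can observe the object in a half-modified state or interleave its own operation on the same object.
- It may be implemented with hardware atomic instructions of the processor;
- It may be implemented with internal locks in the compiler and standard library (the fallback when the hardware offers no suitable instruction);
- Of all atomic types, only the atomic flag type (atomic_flag) is guaranteed to be lock-free; for any other atomic object, atomic_is_lock_free() reports whether its operations are lock-free.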

// Build: gcc -pthread test.c
#ifdef __STDC_NO_ATOMICS__
#error "Atomic facilities are not supported."
#endif

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define LOOPS 10000000  // number of iterations
_Atomic int gcn = 0;    // global counter (atomic type)

void* thread_proc1(void* arg) {
  for (int i = 0; i < LOOPS; i++) {  // Method 1:
    gcn += 1;                        // atomic compound assignment
    // gcn++;                        // atomic increment
    // gcn = gcn + 1;                // NOT atomic as a whole: an atomic load followed by a separate atomic store
    // atomic_fetch_add(&gcn, 1);    // atomic add
  }
  return NULL;
}

void* thread_proc2(void* arg) {
  for (int i = 0; i < LOOPS; i++) {    // Method 2:
    int new, old = atomic_load(&gcn);  // step 1: atomically read the value into the temporary old
    do {
      new = old - 1;  // step 2: compute the new value (the computation may be arbitrarily complex)

      // If old equals gcn, store new into gcn and return true;
      // otherwise another thread has modified gcn: copy gcn into old and return false.
      // With atomic operations the other threads never block, but this thread may spend extra time retrying
    } while (!atomic_compare_exchange_weak(&gcn, &old, new));  // step 3: atomic compare-and-exchange (reloads old on failure)
  }
  return NULL;
}

int main() {
  pthread_t pid1, pid2;
  for (int i = 0; i < 10; i++) {  // run the test 10 times
    pthread_create(&pid1, NULL, thread_proc1, NULL);
    pthread_create(&pid2, NULL, thread_proc2, NULL);
    pthread_join(pid1, NULL);
    pthread_join(pid2, NULL);
    printf("gcn %d: %d\n", i, gcn);  // gcn is 0 every time
    gcn = 0;
  }
}

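4.2 Atomic variables (C++11)

4.2.1 Class definition

The header <atomic> defines the atomic class template std::atomic<>, which can be specialized for various types, such as std::atomic<int>, std::atomic<int*>, and std::atomic<user-defined type> (the user-defined type must be trivially copyable).
For bool and all integer types, the corresponding type aliases, such as std::atomic_bool and std::atomic_int, can be used instead.

Main member functions (atomic operations):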

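No.	`std::atomic<T>` member function	Description
1	`bool is_lock_free() const noexcept;`	Checks whether the atomic operations on this atomic object are lock-free.
2	`T operator=( T desired ) noexcept;`	Assigns `desired` to the atomic variable; equivalent to `store(desired)`. Returns a copy of `desired`.
3	`void store(T desired, memory_order order = memory_order_seq_cst) noexcept;`	Assigns `desired` to the atomic variable.
4	`T load(memory_order order = memory_order_seq_cst) const noexcept;`	Returns the current value of the atomic variable.
5	`operator T() const noexcept;`	Returns the current value of the atomic variable; equivalent to `load()`.
6	`T exchange(T desired, memory_order order = memory_order_seq_cst) noexcept;`	Assigns `desired` to the atomic variable and returns the value it held before the assignment (read-modify-write operation).
7	`bool compare_exchange_strong(T& expected, T desired, memory_order order = memory_order_seq_cst) noexcept;`	Compares the value of `*this` with `expected`. If they are bitwise equal, replaces the former with `desired` and returns `true` (read-modify-write operation); otherwise loads the value of `*this` into `expected` and returns `false` (load operation).
8	`bool compare_exchange_weak(T& expected, T desired, memory_order order = memory_order_seq_cst) noexcept;`	Same as above, but it may fail spuriously: even when `*this` equals `expected`, the update can fail (for example, when the thread is switched out at the wrong moment on hardware that implements the operation with load-linked/store-conditional instructions) and is then handled as if the values differed. Use it in a loop.
9	`void wait(T old, memory_order order = memory_order::seq_cst) const noexcept;`	C++20. Atomic wait operation: if the value equals `old`, blocks the current thread until it is notified through `notify_one()` or `notify_all()` (or unblocked spuriously) and the value differs from `old`; returns immediately if the value already differs. This form of change detection is usually more efficient than simple polling or a pure spinlock.
10	`void notify_one() noexcept;`	C++20. Atomic notify operation: if threads are blocked on `*this` in an atomic wait operation (such as `wait()`), unblocks at least one of them.
11	`void notify_all() noexcept;`	C++20. Atomic notify operation: if threads are blocked on `*this` in an atomic wait operation (such as `wait()`), unblocks all of them.
12	`T fetch_add(T arg, memory_order order = memory_order_seq_cst) noexcept;`	Adds `arg` to the atomic variable and returns the value it held before the operation (read-modify-write operation). Available for integral and pointer specializations.
13	`T fetch_sub(T arg, memory_order order = memory_order_seq_cst) noexcept;`	Subtracts `arg` from the atomic variable; same return value and availability as above.
14	`T fetch_and(T arg, memory_order order = memory_order_seq_cst) noexcept;`	Replaces the atomic variable with the bitwise AND of its value and `arg`; same return value as above. Integral specializations only.
15	`T fetch_or(T arg, memory_order order = memory_order_seq_cst) noexcept;`	Replaces the atomic variable with the bitwise OR of its value and `arg`; same return value as above. Integral specializations only.
16	`T fetch_xor(T arg, memory_order order = memory_order_seq_cst) noexcept;`	Replaces the atomic variable with the bitwise XOR of its value and `arg`; same return value as above. Integral specializations only.
17	`++`,`--` (postfix increment/decrement)	Equivalent to: `return fetch_add(1);`, `return fetch_sub(1);`
18	`++`,`--` (prefix increment/decrement)	Equivalent to: `return fetch_add(1) + 1;`, `return fetch_sub(1) - 1;`
19	`+=`,`-=`,`&=`,`|=`,`^=` (compound assignment)	Equivalent to: `return fetch_add(arg) + arg;`, `return fetch_sub(arg) - arg;`, `return fetch_and(arg) & arg;`, `return fetch_or(arg) | arg;`, `return fetch_xor(arg) ^ arg;`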
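
To make the return-value conventions in the table concrete, here is a small single-threaded sketch (the function name demo_atomic_ops is made up). It shows that fetch_add() and exchange() return the old value, that += returns the new value, and that a failed compare_exchange_strong() writes the current value back into expected:

```cpp
#include <atomic>
#include <cassert>

bool demo_atomic_ops() {
  std::atomic<int> a{10};

  int before = a.fetch_add(5);  // returns the old value 10; a becomes 15
  int after = (a += 5);         // returns the new value 20
  int old = a.exchange(1);      // returns the old value 20; a becomes 1

  int expected = 7;                                     // deliberately wrong guess
  bool ok1 = a.compare_exchange_strong(expected, 100);  // fails; expected is overwritten with 1
  bool ok2 = a.compare_exchange_strong(expected, 100);  // succeeds; a becomes 100

  assert(a.is_lock_free() || !a.is_lock_free());  // the answer is platform-dependent; the call is valid either way

  return before == 10 && after == 20 && old == 20 &&
         !ok1 && ok2 && expected == 1 && a.load() == 100;
}
```

Because the failed compare-exchange refreshes expected, a retry loop does not need a separate load() on each iteration.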

4.2.2 Usage example

// Build: g++ -std=c++11 -pthread test.cpp
#include <atomic>
#include <iostream>
#include <thread>

#define LOOPS 10000000    // number of iterations
std::atomic<int> gcn{0};  // global counter (atomic type)

void thread_proc1() {
  for (int i = 0; i < LOOPS; i++) {     // Method 1:
    gcn += 1;                           // atomic compound assignment
    // gcn++;                           // atomic increment
    // gcn = gcn + 1;                   // NOT atomic as a whole: an atomic load followed by a separate atomic store
    // std::atomic_fetch_add(&gcn, 1);  // atomic add (free function)
    // gcn.fetch_add(1);                // atomic add (member function)
  }
}

void thread_proc2() {
  for (int i = 0; i < LOOPS; i++) {  // Method 2:
    int newv, old = gcn.load();      // step 1: atomically read the value into the temporary old
    do {
      newv = old - 1;  // step 2: compute the new value newv (the computation may be arbitrarily complex)

      // If old equals gcn, store newv into gcn and return true;
      // otherwise another thread has modified gcn: copy gcn into old and return false.
    } while (!gcn.compare_exchange_weak(old, newv));  // step 3: atomic compare-and-exchange (reloads old on failure)
  }
}

int main() {
  for (int i = 0; i < 10; i++) {  // run the test 10 times
    std::thread t1(thread_proc1);
    std::thread t2(thread_proc2);
    t1.join();
    t2.join();
    std::cout << "gcn " << i << ": " << gcn << '\n';  // gcn is 0 every time
    gcn = 0;
  }
}
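
The read / compute / compare-exchange loop of Method 2 also covers updates that have no dedicated fetch_xxx() member. As a sketch under that assumption, the hypothetical helper atomic_store_max() below keeps a running maximum across threads:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: atomically raises target to at least candidate.
void atomic_store_max(std::atomic<int>& target, int candidate) {
  int cur = target.load();
  // Stop when the stored value is already >= candidate, or when the exchange succeeds.
  while (cur < candidate && !target.compare_exchange_weak(cur, candidate)) {
    // On failure, cur has been refreshed with the current value; the condition is re-evaluated.
  }
}

int demo_max() {
  std::atomic<int> max_seen{0};
  std::vector<std::thread> threads;
  for (int t = 0; t < 4; t++) {
    threads.emplace_back([&max_seen, t] {
      for (int i = 0; i < 1000; i++) atomic_store_max(max_seen, t * 1000 + i);
    });
  }
  for (auto& th : threads) th.join();
  assert(max_seen.load() == 3999);
  return max_seen.load();  // largest candidate offered: 3 * 1000 + 999
}
```

The stored value only ever increases, so the final result does not depend on how the threads interleave.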

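4.2.3 A spinlock built on an atomic type

For a recap, see the earlier discussion of the spinlock concept.

The header <atomic> provides the atomic flag type std::atomic_flag, which has only two states, clear (false) and set (true), and is the only atomic type guaranteed to be lock-free.

No.	Member function	Description
1	`atomic_flag()`	Constructs an atomic flag object. Before C++20 the flag must be initialized to the clear state with `ATOMIC_FLAG_INIT`; since C++20 the default constructor initializes it to the clear state.
2	`void clear(memory_order order = memory_order_seq_cst) noexcept;`	Puts the atomic flag into the clear state (`false`).
3	`bool test_and_set(memory_order order = memory_order_seq_cst) noexcept;`	Puts the atomic flag into the set state (`true`) and returns the value it held before the operation.
4	`bool test(memory_order order = memory_order_seq_cst) const noexcept;`	C++20. Returns the current value of the atomic flag.
5	`void wait(bool old, memory_order order = memory_order::seq_cst) const noexcept;`	C++20. Atomic wait operation: if the flag equals `old`, blocks the current thread until it is notified through `notify_one()` or `notify_all()` (or unblocked spuriously) and the flag differs from `old`. This form of change detection is usually more efficient than simple polling or a pure spinlock.
6	`void notify_one() noexcept;`	C++20. Atomic notify operation: if threads are blocked on `*this` in an atomic wait operation (such as `wait()`), unblocks at least one of them.
7	`void notify_all() noexcept;`	C++20. Atomic notify operation: if threads are blocked on `*this` in an atomic wait operation (such as `wait()`), unblocks all of them.

The test-and-set operation (test_and_set()) of std::atomic_flag can be used to implement a spinlock: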

// Build: g++ -std=c++11 -pthread test.cpp
#include <atomic>
#include <iostream>
#include <mutex>
#include <thread>

class spinlock {
  // Starts clear (false): not held by any thread. Before C++20, "= ATOMIC_FLAG_INIT" is the
  // only initialization form the standard guarantees for this macro.
  std::atomic_flag flag = ATOMIC_FLAG_INIT;

 public:
  void lock() {
    // If test_and_set() returns true, the flag was already true: another thread holds it;
    // if it returns false, the flag was false: no thread held it before the call, and the current thread holds it now.
    while (flag.test_and_set(std::memory_order_acquire))
      ;  // spin until another thread clears the flag and this thread sets it (no blocking or scheduling overhead, but the CPU stays busy)
  }

  void unlock() {
    flag.clear(std::memory_order_release);  // clear the flag (false) so that another thread can set it (true)
  }
};

#define LOOPS 10000000  // number of iterations
int gcn = 0;            // global counter (protected by the spinlock)
spinlock glock;         // global spinlock

void thread_proc1() {
  for (int i = 0; i < LOOPS; i++) {
    std::lock_guard<spinlock> guard(glock);
    gcn = gcn + 1;
  }
}

void thread_proc2() {
  for (int i = 0; i < LOOPS; i++) {
    std::lock_guard<spinlock> guard(glock);
    gcn = gcn - 1;
  }
}

int main() {
  for (int i = 0; i < 10; i++) {  // run the test 10 times
    std::thread t1(thread_proc1);
    std::thread t2(thread_proc2);
    t1.join();
    t2.join();
    std::cout << "gcn " << i << ": " << gcn << '\n';  // gcn is 0 every time
    gcn = 0;
  }
}
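
A possible extension, sketched here under the made-up name spinlock2: adding try_lock() makes the class usable with std::lock() and std::unique_lock, and yielding inside the loop gives the scheduler a chance to run the thread that holds the lock. Both are design choices for this sketch and are not part of the spinlock shown above:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

class spinlock2 {  // hypothetical variant, for illustration only
  std::atomic_flag flag = ATOMIC_FLAG_INIT;  // starts clear: not held

 public:
  void lock() {
    while (flag.test_and_set(std::memory_order_acquire))
      std::this_thread::yield();  // still a busy wait, but lets other threads run in between
  }

  // Single attempt: returns true if the lock was acquired, false if it was already held.
  bool try_lock() { return !flag.test_and_set(std::memory_order_acquire); }

  void unlock() { flag.clear(std::memory_order_release); }
};

bool demo_try_lock() {
  spinlock2 a, b;

  a.lock();
  bool busy = !a.try_lock();  // a is already held, so try_lock() fails
  a.unlock();

  {
    std::unique_lock<spinlock2> la(a, std::defer_lock), lb(b, std::defer_lock);
    std::lock(la, lb);  // std::lock() relies on try_lock()
    assert(la.owns_lock() && lb.owns_lock());
  }  // both locks are released here

  bool free_again = a.try_lock();  // succeeds: a was released above
  if (free_again) a.unlock();
  return busy && free_again;
}
```

Whether yielding helps depends on the workload; for very short critical sections on otherwise idle cores, a plain spin may be faster.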

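5 Comparison of mutual exclusion techniques

Technique	Characteristics	Typical use
Mutex	Only one thread can hold the lock at a time; other threads that try to lock it are blocked; lock and unlock must be paired within the same thread.	General mutual exclusion.
Read-write lock	Multiple reader threads can hold the lock at the same time; a writer thread must hold it exclusively.	Workloads where reads far outnumber writes.
Semaphore	Allows up to a given number of threads to acquire it at the same time; acquire and release need not be paired within the same thread.	Mutually exclusive access to a group of resources; also usable for other thread synchronization problems.
Spinlock	Only one thread can hold the lock at a time; other threads that try to lock it spin instead of blocking, which reduces scheduling overhead at the cost of CPU time spent spinning.	Very short critical sections where the waiting thread should not be suspended.
Atomic variables	With hardware support, allow efficient lock-free operations.	Simple data types and operations.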