So, Gul'dan, what is the cost of a read-write lock?

Kira Skyler

Last modified 2024-07-23 17:47:42


Tags: linux c++

First published 2024-07-23 11:04:19


Article link: https://blog.csdn.net/weixin_42544902/article/details/140630240


Gul'dan, what is the cost?

Introduction

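The book "Linux Multithreaded Server Programming: Using the muduo C++ Network Library" (《Linux多线程服务端编程：使用muduo C++网络库》, published by Broadview) says that a read-write lock costs more than a mutex and is more likely to cause problems.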

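Problems a read-write lock can cause:

The read lock is reentrant, and any reentrant lock is best avoided (a coding convention can forbid re-acquiring the read lock).
No write may be performed while holding the read lock, which only works if everyone else knows the code well enough to follow the rule.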

But, Gul'dan, what is the cost of a read-write lock?

A performance test case for the read-write lock

Test code, built with: g++ rwlock.cpp -O0 -g -Wall -lpthread

#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_THREAD_CNT 20
#define WRITE_THREAD_CNT 4

#define READ_TIMEWAIT_US  (50)
#define WRITE_TIMEWAIT_US (100)

#define READ_LOOP_CNT           50000
#define WRITE_LOOP_CNT          1000

// #define LOCK_WITH_RWLOCK        1
#define LOCK_WITH_SPINLOCK      1
// #define CRITICAL_SECTION_MINI   1

pthread_rwlock_t rwlock;
pthread_mutex_t mutex;
pthread_spinlock_t spinlock;

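/**
 * @brief Reader thread: takes the lock selected at compile time
 *        (rwlock read lock, spinlock, or mutex) READ_LOOP_CNT times.
 *
 * @param args pointer to the read delay in microseconds
 * @return void* always nullptr
 */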
void* thread_read(void* args) {
    __useconds_t rdwait_us = *(__useconds_t*)args;

    for (int i = 0; i < READ_LOOP_CNT; i++) {
        #if LOCK_WITH_RWLOCK
        if (pthread_rwlock_rdlock(&rwlock) == 0) {
        #elif LOCK_WITH_SPINLOCK
        if (pthread_spin_lock(&spinlock) == 0) {
        #else
        if (pthread_mutex_lock(&mutex) == 0) {
        #endif

            /* With CRITICAL_SECTION_MINI defined the critical section is tiny: no delay here */
            #ifndef CRITICAL_SECTION_MINI
            usleep(rdwait_us);
            #endif

            #if LOCK_WITH_RWLOCK
            pthread_rwlock_unlock(&rwlock);
            #elif LOCK_WITH_SPINLOCK
            pthread_spin_unlock(&spinlock);
            #else
            pthread_mutex_unlock(&mutex);
            #endif
        }

        /** The actual read, done outside the lock when CRITICAL_SECTION_MINI is defined */
        #ifdef CRITICAL_SECTION_MINI
        usleep(rdwait_us);
        #endif
    }

    return nullptr;
}

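/* Writer thread. Note: it takes the lock once per thread; WRITE_LOOP_CNT is not used in this listing. */
void* thread_write(void* args) {
    /* args points at the write delay in microseconds (the variable keeps the reader's name) */
    __useconds_t rdwait_us = *(__useconds_t*)args;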

    #if LOCK_WITH_RWLOCK
    if (pthread_rwlock_wrlock(&rwlock) == 0) {
    #elif LOCK_WITH_SPINLOCK
    if (pthread_spin_lock(&spinlock) == 0) {
    #else
    if (pthread_mutex_lock(&mutex) == 0) {
    #endif

        /* With CRITICAL_SECTION_MINI defined the critical section is tiny: no delay here */
        #ifndef CRITICAL_SECTION_MINI
        usleep(rdwait_us);
        #endif

        #if LOCK_WITH_RWLOCK
        pthread_rwlock_unlock(&rwlock);
        #elif LOCK_WITH_SPINLOCK
        pthread_spin_unlock(&spinlock);
        #else
        pthread_mutex_unlock(&mutex);
        #endif
    }

    /** The actual write, done outside the lock when CRITICAL_SECTION_MINI is defined */
    #ifdef CRITICAL_SECTION_MINI
    usleep(rdwait_us);
    #endif

    return nullptr;
}

int main(int argc, char* argv[]) {
    #if LOCK_WITH_RWLOCK
    pthread_rwlock_init(&rwlock, nullptr);
    #elif LOCK_WITH_SPINLOCK
    pthread_spin_init(&spinlock, 0);
    #else
    pthread_mutex_init(&mutex, nullptr);
    #endif

    __useconds_t rdwait_us = READ_TIMEWAIT_US;
    __useconds_t wrwait_us = WRITE_TIMEWAIT_US;

    pthread_t threads_r[READ_THREAD_CNT];
    for (int i = 0; i < READ_THREAD_CNT; i++) {
        pthread_create(&threads_r[i], nullptr, thread_read, &rdwait_us);
    }

    pthread_t threads_w[WRITE_THREAD_CNT];
    for (int i = 0; i < WRITE_THREAD_CNT; i++) {
        pthread_create(&threads_w[i], nullptr, thread_write, &wrwait_us);
    }

    for (int i = 0; i < READ_THREAD_CNT; i++) {
        pthread_join(threads_r[i], nullptr);
    }

    // for (int i = 0; i < WRITE_THREAD_CNT; i++) {
    //     pthread_join(threads_w[i], nullptr);
    // }
}

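20 reader threads and 4 writer threads; the test machine has 32 cores.
The critical section is tested in two ways. In the first, the delay is inside the critical section, which simulates a fairly large operation done under the lock. In the second, there is no delay inside the critical section and the delay is outside it, which simulates reading and writing an in-memory object, for example reading the object, or writing by swapping the object pointer.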

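Note:
File I/O does not count as a large operation inside the critical section, because file reads are not thread-safe, and reading from multiple threads also has to deal with the file offset.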

Test results

Read-write lock, tiny critical section (CRITICAL_SECTION_MINI defined)

./a.out  1.70s user 4.49s system 114% cpu 5.430 total

Read-write lock, delay inside the critical section

./a.out  1.81s user 4.37s system 113% cpu 5.423 total

Mutex, tiny critical section (CRITICAL_SECTION_MINI defined)

./a.out  1.49s user 4.70s system 114% cpu 5.420 total

Mutex, delay inside the critical section

./a.out  1.04s user 15.44s system 14% cpu 1:51.35 total

Spinlock, tiny critical section (CRITICAL_SECTION_MINI defined)

./a.out  1.33s user 4.88s system 114% cpu 5.415 total

Spinlock, delay inside the critical section

./a.out  1197.30s user 0.44s system 198% cpu 10:02.12 total

Summary

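When real work is done inside the critical section (the delay is inside the lock), the read-write lock does help.
However, such a critical section is not good program design. A design should reduce mutual exclusion on resources and keep the critical section as small as possible.
The likely explanation is that, with work inside the critical section, letting multiple reader threads in at the same time pays off: the cost of the read lock ≈ (the cost of the mutex + the scheduling cost the mutex causes).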

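In a good multithreaded design, where the critical section is small enough, the read-write lock really has no advantage, and it is not as simple and trouble-free as a mutex.
Some lock wrapper libraries and newer languages provide a mutex but no read-write lock, perhaps for this very reason.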

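For example, the pseudocode below reads and writes the shared resource through a very small critical section, which is a good design.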

read() {
    {                       # critical section
        pthread_lock()
        ptr = g_ptr
        pthread_unlock()
    }
    
    read ptr something...
}

write() {
    if is_lock
    {
        ptr = g_ptr.clone   # modify a new copy
        ptr.insert

        pthread_lock()
        g_ptr = ptr         # replace the global pointer, so the critical section is tiny
        pthread_unlock()
    }
    else
    {
        /** Although it may have been locked again between the is_lock check and here... */
        pthread_lock()
        g_ptr.insert
        pthread_unlock()
    }
}

Additional verification

Adjust the thread counts in the code to make the effect easier to observe

#define READ_THREAD_CNT 200
#define WRITE_THREAD_CNT 40

Comment out every usleep(rdwait_us);

With the sleeps in the threads removed, check the number of context switches and the total time

Read-write lock

./a.out  15.96s user 0.01s system 199% cpu 8.018 total

Mutex

./a.out  0.91s user 13.87s system 201% cpu 7.350 total

Spinlock

./a.out  133.39s user 0.03s system 198% cpu 1:07.15 total

Under normal load this machine sees roughly 25,000 context switches per second

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 3874004 101660 54298452    0    0     0    28 15759 23820  0  0 99  0  0
 0  0      0 3875000 101660 54298544    0    0     0     0 15112 23020  0  0 100  0  0
 0  0      0 3874900 101668 54298544    0    0     0    12 16210 24982  0  0 99  0  0

With the read-write lock and the spinlock the figures are about the same as above; with the mutex they are different

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
192  0      0 3865540 101636 54298324    0    0     0    36 20042 37480  0  3 96  0  0
199  0      0 3865392 101636 54298324    0    0     0     0 21989 46440  0  4 95  0  0
 4  0      0 3864968 101636 54298424    0    0     0     0 22588 46159  1  5 95  0  0
193  0      0 3865240 101644 54298424    0    0     0    84 20732 42499  1  5 95  0  0
188  0      0 3865868 101644 54298428    0    0     0    68 21320 44422  1  5 95  0  0
175  0      0 3865960 101652 54298452    0    0     0   164 18943 39816  0  4 95  0  0
126  0      0 3865184 101652 54298452    0    0     0     0 20536 43004  0  4 95  0  0

Only when reads are increased and writes are reduced does the read-write lock show a slight advantage

#define READ_THREAD_CNT 200
#define WRITE_THREAD_CNT 1

Read-write lock ./a.out  13.66s user 0.02s system 199% cpu 6.863 total
Mutex ./a.out  0.90s user 14.06s system 199% cpu 7.489 total

When reads and writes are balanced

#define READ_THREAD_CNT 100
#define WRITE_THREAD_CNT 100

#define READ_LOOP_CNT           50000
#define WRITE_LOOP_CNT          50000

Read-write lock ./a.out  6.61s user 0.02s system 203% cpu 3.254 total
Mutex ./a.out  6.40s user 0.00s system 200% cpu 3.193 total

When reads outnumber writes by a factor of a million

#define READ_THREAD_CNT 10
#define WRITE_THREAD_CNT 10

#define READ_LOOP_CNT           50000000
#define WRITE_LOOP_CNT          50

Read-write lock ./a.out  301.76s user 0.01s system 198% cpu 2:31.68 total
Mutex ./a.out  41.19s user 196.56s system 199% cpu 1:59.46 total