FreeBSD Locks: A Kernel Source Walkthrough

Preface

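This post walks through the kernel implementation of FreeBSD locks. It is only a set of informal notes I took a few years ago, but I hope it helps. The lock implementation is tied to process scheduling, so anyone interested should study the two together. Once you connect the pieces, an operating system turns out to be a system of philosophy rather than an insurmountable chasm. That China has no mature operating system of its own comes down to a lack of accumulated experience from 0 to N, and to a lack of soil in which a domestic operating system could grow.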

Main text

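propagate_priority is called from turnstile_wait. (In practice, whenever propagate_priority is reached, the thread holding the lock is usually blocked as well.) Internally it looks up who holds the blocking lock in question (mutex, rw, rm). If the holder turns out to be asleep on a sleep lock (sx, lockmgr, sleep()), the kernel panics outright, complaining that a sleeping thread owns a non-sleepable lock (the blocking lock above).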

Implementation of propagate_priority(td) (excerpt):

ts = td->td_blocked;    /* the turnstile the current thread is blocked on */
pri = td->td_priority;

for (;;) {

        td = ts->ts_owner;      /* the thread that currently owns this turnstile's lock */

        /*
         * If this thread already has higher priority than the
         * thread that is being blocked, we are finished.
         */
        if (td->td_priority <= pri) {
                thread_unlock(td);
                return;
        }

        /*
         * Bump this thread's priority.
         */
        sched_lend_prio(td, pri);   /* raise the priority of the thread holding the lock */

        /*
         * If lock holder is actually running or on the run queue
         * then we are done.
         */
        if (TD_IS_RUNNING(td) || TD_ON_RUNQ(td)) {   /* holder is running or runnable: leave the function */
                MPASS(td->td_blocked == NULL);
                thread_unlock(td);
                return;
        }

        /*
         * Pick up the lock that td is blocked on.
         *
         * Note: this is the turnstile that the current owner is itself
         * blocked on. It acts like the "next" pointer of a linked list:
         * the caller's priority is passed as far up as possible along
         * each thread's td_blocked (which points to a turnstile). If the
         * caller has a very high priority, every thread standing in its
         * way gets its priority raised.
         */
        ts = td->td_blocked;

        /*
         * Resort td on the list if needed.
         *
         * Note: sched_lend_prio above has just raised this thread's
         * priority, and the thread is itself blocked, so it has to be
         * re-queued on the turnstile it is blocked on.
         */
        if (!turnstile_adjust_thread(ts, td)) {
                mtx_unlock_spin(&ts->ts_lock);
                return;
        }
}

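The idea behind propagate_priority: start from the turnstile of the lock being waited on. If the thread holding that lock (ts_owner) has a lower priority, lend it ours. If the holder's priority is already high enough, there is no point going further and the function returns.

After the priority has been lent, if the holder is running or sitting on a run queue, lending is all that is needed. If it is not on a run queue, the holder must itself be blocked on some other turnstile (it cannot be on a sleep queue, because a thread that requests a sleep lock may not hold a blocking mutex).
In that case we find the turnstile the holder is blocked on and go around the for loop again from the top. In short, a thread lends its priority to every thread standing in its way.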
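To make the chain walk concrete, here is a minimal user-space sketch of the same idea. It is not the kernel code: the types `sim_thread` and `sim_turnstile`, the `runnable` field, and the function `sim_propagate_priority` are hypothetical stand-ins, and all locking is left out. As in the kernel, a smaller number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures. */
struct sim_turnstile;

struct sim_thread {
	int prio;                       /* lower value = higher priority */
	int runnable;                   /* nonzero: running or on a run queue */
	struct sim_turnstile *blocked;  /* turnstile we wait on, like td_blocked */
};

struct sim_turnstile {
	struct sim_thread *owner;       /* lock holder, like ts_owner */
};

/*
 * Lend the priority of td to every thread standing in its way.
 * Returns the number of threads whose priority was raised.
 */
static int
sim_propagate_priority(struct sim_thread *td)
{
	struct sim_turnstile *ts;
	int pri, bumped = 0;

	assert(td != NULL);
	ts = td->blocked;
	pri = td->prio;

	while (ts != NULL) {
		td = ts->owner;
		if (td == NULL)         /* no recorded owner: nothing to lend to */
			break;
		if (td->prio <= pri)    /* owner is already at least as urgent */
			break;
		td->prio = pri;         /* plays the role of sched_lend_prio() */
		bumped++;
		if (td->runnable)       /* owner can make progress: done */
			break;
		ts = td->blocked;       /* follow the chain to the next lock */
	}
	return (bumped);
}
```

With a chain "high waits for mid, mid waits for low, low is runnable", one call on the high-priority thread raises both mid and low to its priority and then stops.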

The sleepable flag is set only on sleep locks (those listed above).

sched_bind() calls sched_pin(), which does curthread->td_pinned++.

At scheduling time the scheduler checks whether a thread may be migrated:
#define THREAD_CAN_MIGRATE(td) ((td)->td_pinned == 0)
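The pin count nests, which a small sketch makes easy to see. The names `pin_thread`, `pin_get`, `pin_put` and `PIN_CAN_MIGRATE` below are hypothetical; only the counter logic mirrors what the notes describe.

```c
#include <assert.h>

/* Hypothetical stand-in for the td_pinned field of struct thread. */
struct pin_thread {
	int td_pinned;
};

/* Same test as THREAD_CAN_MIGRATE: migratable only when nobody has pinned us. */
#define PIN_CAN_MIGRATE(td) ((td)->td_pinned == 0)

/* Pin: every call must later be balanced by one pin_put(). */
static void
pin_get(struct pin_thread *td)
{
	td->td_pinned++;
}

/* Unpin: the thread becomes migratable again only when the count hits zero. */
static void
pin_put(struct pin_thread *td)
{
	assert(td->td_pinned > 0);
	td->td_pinned--;
}
```

Two nested pins need two unpins before the thread may move to another CPU again.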

Turnstile:

The ts_pending list is a staging list: turnstile_signal or turnstile_broadcast moves threads onto it before they are put on a run queue.

td_blocked records which turnstile the thread is blocked on. It is set in turnstile_wait and cleared in turnstile_unpend.

turnstile_setowner:
ts->ts_owner = owner;
LIST_INSERT_HEAD(&owner->td_contested, ts, ts_link);    /* so td_contested heads the owner's list of contested turnstiles */

Three kinds of turnstile lists (the entries are turnstiles):

/*
 * There are three different lists of turnstiles as follows.  The list
 * connected by ts_link entries is a per-thread list of all the turnstiles
 * attached to locks that we (we are the owner thread) own.  This is used
 * to fixup our priority when a lock is released.  The other two lists use
 * the ts_hash entries.  The first of these two is the turnstile chain list
 * that a turnstile is on when it is attached to a lock.  The second list
 * to use ts_hash is the free list hung off of a turnstile that is attached
 * to a lock.
 */

The lists inside a turnstile (the entries are threads):

/*
 * Each turnstile contains three lists of threads.  The two ts_blocked lists
 * are linked list of threads blocked on the turnstile's lock.  One list is
 * for exclusive waiters, and the other is for shared waiters.  The
 * ts_pending list is a linked list of threads previously awakened by
 * turnstile_signal() or turnstile_wait() that are waiting to be put on
 * the run queue.
 */

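turnstile_trywait:
Hash the lock to find its turnstile chain tc, then walk the list on tc (linked through ts_hash) looking for a turnstile that already belongs to this lock. If there is one, return it.
If there is none, the chain holds no turnstile for this lock, so return the current thread's own turnstile, recording ts->ts_lockobj = lock before returning.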
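The lookup can be sketched in user space as follows. Everything here is an assumption made for illustration: the table size `SIM_TC_SIZE`, the hash function, the plain `hash_next` pointer (the kernel uses list macros and spin locks), and the helper `sim_publish`, which stands in for the step where the first waiter puts its turnstile on the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_TC_SIZE 8                      /* assumed hash table size */

struct sim_turnstile {
	const void *lockobj;               /* lock this turnstile serves */
	struct sim_turnstile *hash_next;   /* chain link, like ts_hash */
};

struct sim_thread {
	struct sim_turnstile *td_turnstile; /* the thread's own spare turnstile */
};

static struct sim_turnstile *sim_chains[SIM_TC_SIZE];

/* Assumed hash: any function of the lock address works for the sketch. */
static size_t
sim_tc_hash(const void *lock)
{
	return (((uintptr_t)lock >> 4) % SIM_TC_SIZE);
}

/* Return the lock's turnstile if it has one, else the caller's own. */
static struct sim_turnstile *
sim_trywait(struct sim_thread *td, const void *lock)
{
	struct sim_turnstile *ts;

	for (ts = sim_chains[sim_tc_hash(lock)]; ts != NULL; ts = ts->hash_next)
		if (ts->lockobj == lock)
			return (ts);

	ts = td->td_turnstile;
	assert(ts != NULL && ts->lockobj == NULL);
	ts->lockobj = lock;                /* remember which lock we serve */
	return (ts);
}

/* First waiter only: put the turnstile at the head of its chain. */
static void
sim_publish(struct sim_turnstile *ts)
{
	size_t i = sim_tc_hash(ts->lockobj);

	ts->hash_next = sim_chains[i];
	sim_chains[i] = ts;
}
```

The first thread to block on a lock gets its own turnstile back and publishes it; any later thread asking about the same lock is handed that published turnstile instead.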

turnstile_wait:
This takes the turnstile chosen by the function above as the target, puts the given thread on its ts_blocked list (kept in priority order), and points the thread's td_blocked at this ts.

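ts_hash entries are added and removed in these places:
1. turnstile_wait: if the lock has no turnstile yet, the current thread's turnstile becomes the lock's turnstile and is inserted at the head of the turnstile chain. If the lock already has a turnstile, the current thread is queued on that turnstile's ts_blocked list (through the thread's td_lockq), and the thread's own turnstile, now spare, is put on the free list of the lock's turnstile (turnstile_broadcast later takes turnstile structures off this free list and hands them back to threads).
2. turnstile_signal and turnstile_broadcast behave alike: both call LIST_REMOVE to take a turnstile off the list it is currently on (n-1 of them come off ts->ts_free; the last one should be the lock's own turnstile, removed from the turnstile chain).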
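The donate-and-reclaim bookkeeping in the two points above can be sketched like this. The names `don_thread`, `don_turnstile`, `don_block` and `don_wake` are hypothetical, the free list is a plain singly linked stack, and the blocked queue itself is reduced to a counter, so this shows only how turnstiles change hands.

```c
#include <assert.h>
#include <stddef.h>

struct don_turnstile {
	struct don_turnstile *free_head;  /* spare turnstiles, like ts_free */
	struct don_turnstile *free_next;  /* link while on a free list */
	int nwaiters;                     /* threads blocked on this turnstile */
};

struct don_thread {
	struct don_turnstile *td_turnstile; /* NULL while the thread is blocked */
};

/* td blocks on the lock whose active turnstile is ts (possibly td's own). */
static void
don_block(struct don_thread *td, struct don_turnstile *ts)
{
	struct don_turnstile *mine = td->td_turnstile;

	assert(mine != NULL);
	if (mine != ts) {                 /* lock already has one: donate ours */
		mine->free_next = ts->free_head;
		ts->free_head = mine;
	}
	td->td_turnstile = NULL;
	ts->nwaiters++;
}

/* Wake td: give it a spare turnstile, or the active one if it is the last. */
static void
don_wake(struct don_thread *td, struct don_turnstile *ts)
{
	struct don_turnstile *give = ts->free_head;

	if (give != NULL) {
		ts->free_head = give->free_next;
	} else {
		assert(ts->nwaiters == 1); /* last waiter takes ts itself */
		give = ts;
	}
	td->td_turnstile = give;
	ts->nwaiters--;
}
```

Every thread goes to sleep without a turnstile and wakes up holding one, though not necessarily the one it brought.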

td_lockq is essentially what links together all the threads blocked on the same turnstile.

ts_link:
ts_link puts the turnstile on the list headed by td_contested in the thread that holds the lock.

   +<---------------- ts_owner ----------------+
   |                                           |
thread -> td_contested ----------> turnstile for lock1 --(ts_link)--> turnstile for lock2 --(ts_link)--> turnstile for lock3 --> ...
                                               |
                                               |   ts_blocked list: links every thread blocked on this lock
                                               v
                                     thread X (waiting for lock1)
                                               |
                                               |   linked through td_lockq
                                               v
                                     thread Y (waiting for lock1)

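/*
 * Adjust the thread's position on a turnstile after its priority has been
 * changed.
 */
A <===> turnstile_adjust_thread(ts, td):
This function is called when a thread's priority changes. Suppose the priority of thread Y in the diagram above changes (possible paths: sched_prio ----> turnstile_adjust ----> A;
turnstile_wait ----> propagate_priority ----> A).

The function walks td_lockq to find the right position. If Y's priority is now higher than thread X's, Y has to be moved in front of X.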
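A small sketch of the re-sorting step, under simplifying assumptions: the blocked queue is a singly linked list ordered by priority (smaller number first), the names `adj_thread`, `adj_turnstile`, `adj_insert_sorted` and `adj_adjust_thread` are hypothetical, and the return value (nonzero when the thread ends up at the head) is my own simplification rather than the kernel's exact contract.

```c
#include <assert.h>
#include <stddef.h>

struct adj_thread {
	int prio;                  /* lower value = higher priority */
	struct adj_thread *next;   /* plays the role of td_lockq */
};

struct adj_turnstile {
	struct adj_thread *blocked; /* priority-ordered list of waiters */
};

/* Insert td after all waiters of equal or higher priority. */
static void
adj_insert_sorted(struct adj_turnstile *ts, struct adj_thread *td)
{
	struct adj_thread **pp = &ts->blocked;

	while (*pp != NULL && (*pp)->prio <= td->prio)
		pp = &(*pp)->next;
	td->next = *pp;
	*pp = td;
}

/* Re-queue td after its priority changed; nonzero if it is now first. */
static int
adj_adjust_thread(struct adj_turnstile *ts, struct adj_thread *td)
{
	struct adj_thread **pp = &ts->blocked;

	while (*pp != NULL && *pp != td)   /* find the link pointing at td */
		pp = &(*pp)->next;
	assert(*pp == td);                 /* td must be queued on ts */
	*pp = td->next;                    /* unlink */
	adj_insert_sorted(ts, td);         /* put it back in order */
	return (ts->blocked == td);
}
```

With X (priority 100) queued ahead of Y (priority 120), raising Y to 80 and calling the adjust function moves Y in front of X.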

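On read/write locks:

Releasing a read lock: read locks are shared, so no read request ever blocks behind another read lock, and multiple readers show up only in the reader count. If more than one reader holds the lock at release time, things are simple: just decrement the count (even if a writer is waiting, it cannot proceed until every read lock has been released). When the last read lock is released and a writer is waiting, the lock must have a turnstile, so the next step is turnstile_broadcast. If no writer is waiting, even better: set the lock straight to RW_UNLOCKED.

Releasing a write lock: the first step is to check for waiters. If there are none, set the lock straight to RW_UNLOCKED. If there are waiters, a turnstile must exist (whether they are waiting to read or to write, something is blocked, so there is a turnstile), so the code goes directly to turnstile_broadcast.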

_rw_init_flags:
The read/write lock is initialized to:
rw->rw_lock = RW_UNLOCKED;

#define     RW_UNLOCKED          RW_READERS_LOCK(0)
#define     RW_READERS_LOCK(x)     ((x) << RW_READERS_SHIFT | RW_LOCK_READ)

/*
 * The rw_lock field consists of several fields.  The low bit (bit 0) indicates
 * if the lock is locked with a read (shared) or write (exclusive) lock.
 * A value of 0 indicates a write lock, and a value of 1 indicates a read
 * lock.  Bit 1 is a boolean indicating if there are any threads waiting
 * for a read lock.  Bit 2 is a boolean indicating if there are any threads
 * waiting for a write lock.  The rest of the variable's definition is
 * dependent on the value of the first bit.  For a write lock, it is a
 * pointer to the thread holding the lock, similar to the mtx_lock field of
 * mutexes.  For read locks, it is a count of read locks that are held.
 *
 * When the lock is not locked by any thread, it is encoded as a read lock
 * with zero waiters.
 */

#define     RW_LOCK_READ          0x01
#define     RW_LOCK_READ_WAITERS     0x02
#define     RW_LOCK_WRITE_WAITERS     0x04
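To see how one word carries all of this, here is a small decoding sketch. The three flag values come from the definitions above; the shift `ENC_RW_READERS_SHIFT` is set to 4 purely as an assumption for the example (the real value depends on the FreeBSD version), and the `enc_` helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define ENC_RW_LOCK_READ           ((uintptr_t)0x01)
#define ENC_RW_LOCK_READ_WAITERS   ((uintptr_t)0x02)
#define ENC_RW_LOCK_WRITE_WAITERS  ((uintptr_t)0x04)
#define ENC_RW_READERS_SHIFT       4    /* assumed for this sketch */
#define ENC_RW_FLAGMASK \
	((((uintptr_t)1) << ENC_RW_READERS_SHIFT) - 1)
#define ENC_RW_READERS_LOCK(x) \
	(((uintptr_t)(x) << ENC_RW_READERS_SHIFT) | ENC_RW_LOCK_READ)
#define ENC_RW_UNLOCKED            ENC_RW_READERS_LOCK(0)

/* Bit 0 set: the word is in "read" form (this includes the unlocked state). */
static int
enc_is_read_form(uintptr_t v)
{
	return ((v & ENC_RW_LOCK_READ) != 0);
}

/* Reader count; only meaningful when the word is in read form. */
static uintptr_t
enc_readers(uintptr_t v)
{
	assert(enc_is_read_form(v));
	return (v >> ENC_RW_READERS_SHIFT);
}

/* Owner pointer bits; only meaningful when the word is in write form. */
static uintptr_t
enc_owner(uintptr_t v)
{
	assert(!enc_is_read_form(v));
	return (v & ~ENC_RW_FLAGMASK);
}

/* Any thread waiting, for either kind of access? */
static int
enc_has_waiters(uintptr_t v)
{
	return ((v & (ENC_RW_LOCK_READ_WAITERS | ENC_RW_LOCK_WRITE_WAITERS)) != 0);
}
```

The owner helper relies on thread pointers being aligned so that their low bits are free for flags, which is why the example uses an aligned made-up address such as 0x1000 rather than a real pointer.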


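/*
 * Try to obtain a write lock once.  The swap succeeds only while the word
 * is RW_UNLOCKED, that is, a read lock with zero readers, which means nobody
 * holds the lock (see the initialization above).
 */
#define     _rw_write_lock(rw, tid)                              \
     atomic_cmpset_acq_ptr(&(rw)->rw_lock, RW_UNLOCKED, (tid))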

/* Release a write lock quickly if there are no waiters. */
#define     _rw_write_unlock(rw, tid)                         \
     atomic_cmpset_rel_ptr(&(rw)->rw_lock, (tid), RW_UNLOCKED)
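The same two fast paths can be tried out in user space with C11 atomics. This is a sketch, not the kernel macros: `wl_rwlock`, `wl_write_lock` and `wl_write_unlock` are hypothetical names, `atomic_compare_exchange_strong_explicit` stands in for the kernel's atomic_cmpset_*_ptr, and the thread ids are made-up aligned numbers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define WL_RW_UNLOCKED        ((uintptr_t)0x01) /* read form, zero readers */
#define WL_RW_WRITE_WAITERS   ((uintptr_t)0x04)

struct wl_rwlock {
	_Atomic uintptr_t rw_lock;
};

/* One attempt: succeeds only if nobody holds the lock and nobody waits. */
static int
wl_write_lock(struct wl_rwlock *rw, uintptr_t tid)
{
	uintptr_t expected = WL_RW_UNLOCKED;

	return (atomic_compare_exchange_strong_explicit(&rw->rw_lock,
	    &expected, tid, memory_order_acquire, memory_order_relaxed));
}

/* Fast release: succeeds only if the word is exactly our tid (no waiters). */
static int
wl_write_unlock(struct wl_rwlock *rw, uintptr_t tid)
{
	uintptr_t expected = tid;

	return (atomic_compare_exchange_strong_explicit(&rw->rw_lock,
	    &expected, WL_RW_UNLOCKED, memory_order_release,
	    memory_order_relaxed));
}
```

If a waiter flag has been or-ed into the word, the fast release fails because the word no longer equals the bare tid; in the kernel that is the point where the slow unlock path takes over.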


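__rw_rlock:
The statement "if (RW_CAN_READ(v)) {" in this function tells us the following:
if the lock is read-locked, then a writer starts waiting, and then another would-be reader arrives, the new reader has to step aside. It may not take the read lock, because someone wants to write and has to be let through; otherwise writers could easily starve. Seen this way, write locks take precedence over read locks.
.....
          /* The lock is held by a writer or has writers waiting: flag that readers are waiting too. */
          if (!(v & RW_LOCK_READ_WAITERS)) {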
               if (!atomic_cmpset_ptr(&rw->rw_lock, v,
                   v | RW_LOCK_READ_WAITERS)) {
                    turnstile_cancel(ts);
                    continue;
               }
               if (LOCK_LOG_TEST(&rw->lock_object, 0))
                    CTR2(KTR_LOCK, "%s: %p set read waiters flag",
                        __func__, rw);
          }
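The writer-preference rule and the waiter flag can be sketched together in user space. This is a simplified model under stated assumptions: `rl_can_read` only checks the read bit and the write-waiters bit (the kernel's RW_CAN_READ may test more), the reader shift of 4 is assumed, the `rl_` names are hypothetical, and instead of blocking on a turnstile the function just returns 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RL_LOCK_READ          ((uintptr_t)0x01)
#define RL_LOCK_READ_WAITERS  ((uintptr_t)0x02)
#define RL_LOCK_WRITE_WAITERS ((uintptr_t)0x04)
#define RL_READERS_SHIFT      4                  /* assumed for this sketch */
#define RL_ONE_READER         (((uintptr_t)1) << RL_READERS_SHIFT)
#define RL_READERS_LOCK(x) \
	(((uintptr_t)(x) << RL_READERS_SHIFT) | RL_LOCK_READ)
#define RL_UNLOCKED           RL_READERS_LOCK(0)

struct rl_rwlock {
	_Atomic uintptr_t rw_lock;
};

/* Simplified test: read form and no writer waiting. */
static int
rl_can_read(uintptr_t v)
{
	return ((v & (RL_LOCK_READ | RL_LOCK_WRITE_WAITERS)) == RL_LOCK_READ);
}

/* Returns 1 if a read lock was taken, 0 if the caller would have to block. */
static int
rl_try_rlock(struct rl_rwlock *rw)
{
	uintptr_t v;

	for (;;) {
		v = atomic_load_explicit(&rw->rw_lock, memory_order_relaxed);
		if (rl_can_read(v)) {
			/* Bump the reader count; retry if the word changed. */
			if (atomic_compare_exchange_weak_explicit(&rw->rw_lock,
			    &v, v + RL_ONE_READER, memory_order_acquire,
			    memory_order_relaxed))
				return (1);
			continue;
		}
		/* Cannot read: make sure the read-waiters flag is set. */
		if (!(v & RL_LOCK_READ_WAITERS) &&
		    !atomic_compare_exchange_weak_explicit(&rw->rw_lock, &v,
		    v | RL_LOCK_READ_WAITERS, memory_order_relaxed,
		    memory_order_relaxed))
			continue;
		return (0);    /* the kernel would now sleep on the turnstile */
	}
}
```

Once a writer is waiting, a new reader is refused even though the lock is only read-locked, and the only trace it leaves is the read-waiters bit.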


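_rw_runlock_cookie: when a read lock is released, (1) check whether other readers still hold the lock; if not, (2) check whether anyone is waiting, and if nobody is, release the lock right away by setting it to RW_UNLOCKED; if someone is waiting, (3) check whether the waiters want to read or to write:

          x = RW_UNLOCKED;
          /* Choose which queue's thread(s) to wake; this also shows that waiting writers take precedence over waiting readers. */
          if (v & RW_LOCK_WRITE_WAITERS) {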
               queue = TS_EXCLUSIVE_QUEUE;
               x |= (v & RW_LOCK_READ_WAITERS);
          } else
               queue = TS_SHARED_QUEUE;
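The queue choice above is a pure function of the old lock word, so it is easy to isolate. In this sketch the enum `rq_queue`, the function `rq_pick_queue` and the `RQ_` constants are hypothetical names; the logic follows the excerpt line by line.

```c
#include <assert.h>
#include <stdint.h>

#define RQ_LOCK_READ          ((uintptr_t)0x01)
#define RQ_LOCK_READ_WAITERS  ((uintptr_t)0x02)
#define RQ_LOCK_WRITE_WAITERS ((uintptr_t)0x04)
#define RQ_UNLOCKED           RQ_LOCK_READ     /* read form, zero readers */

enum rq_queue { RQ_EXCLUSIVE_QUEUE, RQ_SHARED_QUEUE };

/*
 * Given the old lock word v, pick the queue to wake and compute the
 * new lock word, stored through newval.
 */
static enum rq_queue
rq_pick_queue(uintptr_t v, uintptr_t *newval)
{
	uintptr_t x = RQ_UNLOCKED;
	enum rq_queue queue;

	if (v & RQ_LOCK_WRITE_WAITERS) {
		/* Writers first; keep the flag for readers still waiting. */
		queue = RQ_EXCLUSIVE_QUEUE;
		x |= (v & RQ_LOCK_READ_WAITERS);
	} else {
		queue = RQ_SHARED_QUEUE;
	}
	*newval = x;
	return (queue);
}
```

When both kinds of waiters exist, the writers are woken and the new word stays "unlocked with read waiters", so the readers left behind are not forgotten.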


__rw_wunlock_hard:
Reaching this function means someone must be waiting, because with no waiters the direct call to _rw_write_unlock would already have succeeded.