Notes on RCU-sched
While studying how synchronize_rcu() works, I found the following comment in the current kernel source:
* When synchronize_rcu() is invoked on one CPU while other CPUs
* are within RCU read-side critical sections, then the
* synchronize_rcu() is guaranteed to block until after all the other
* CPUs exit their critical sections. Similarly, if call_rcu() is invoked
* on one CPU while other CPUs are within RCU read-side critical
* sections, invocation of the corresponding RCU callback is deferred
* until after the all the other CPUs exit their critical sections.
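In other words, a synchronize_rcu() call on one CPU has to wait until every other CPU has left the RCU read-side critical section it was in when the call was made.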
* In non-preemptible RCU implementations (pure TREE_RCU and TINY_RCU),
* it is illegal to block while in an RCU read-side critical section.
* In preemptible RCU implementations (PREEMPT_RCU) in CONFIG_PREEMPTION
* kernel builds, RCU read-side critical sections may be preempted,
* but explicit blocking is illegal. Finally, in preemptible RCU
* implementations in real-time (with -rt patchset) kernel builds, RCU
* read-side critical sections may be preempted and they may also block, but
* only when acquiring spinlocks that are subject to priority inheritance.
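So there really is a preemptible variant of RCU.
If the config file sets CONFIG_TREE_PREEMPT_RCU=y, synchronize_rcu() is backed by rcu_preempt_state by default. What distinguishes this flavor is that other tasks may preempt a reader between rcu_read_lock() and rcu_read_unlock(), so it needs a different way to decide that a grace period has elapsed.
The definitions of rcu_read_lock() and rcu_read_unlock() show that TREE_PREEMPT_RCU does not simply treat a context switch as proof that a CPU has passed through the grace period (GP). Instead, it keeps an rcu_read_lock_nesting counter for each task.
An LWN article describes in detail how the kernel configuration determines which RCU implementation is used: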
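The role of the nesting counter can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_task` and every `toy_*` function are hypothetical names invented for illustration, and the real kernel does considerably more bookkeeping. The only thing taken from the notes above is the idea that a counter named rcu_read_lock_nesting, rather than the mere fact of a context switch, decides whether a task is still a reader.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task; only the nesting counter is modelled. */
struct toy_task {
    int rcu_read_lock_nesting;  /* depth of nested read-side critical sections */
};

/* Entering a read-side critical section only bumps the counter. */
void toy_rcu_read_lock(struct toy_task *t)
{
    t->rcu_read_lock_nesting++;
}

/* Leaving a read-side critical section drops the counter again. */
void toy_rcu_read_unlock(struct toy_task *t)
{
    assert(t->rcu_read_lock_nesting > 0);  /* unbalanced unlock is a bug */
    t->rcu_read_lock_nesting--;
}

/*
 * In this model a context switch counts as a quiescent state only when the
 * outgoing task is outside every read-side critical section. A task that is
 * switched out with nesting > 0 is still a reader and keeps blocking the
 * grace period.
 */
bool toy_switch_counts_as_quiescent(const struct toy_task *t)
{
    return t->rcu_read_lock_nesting == 0;
}
```

With a nested pair of toy_rcu_read_lock() calls, the model keeps reporting "not quiescent" until the outermost toy_rcu_read_unlock() has run, which is the behavior the counter exists to provide.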
CONFIG_PREEMPT=n and CONFIG_SMP=y implies CONFIG_TREE_RCU, selecting the non-preemptible tree-based RCU implementation that is appropriate for server-class SMP builds. It can accommodate a very large number of CPUs, but scales down sufficiently well for all but the most memory-constrained systems.

CONFIG_PREEMPT=y implies CONFIG_TREE_PREEMPT_RCU, selecting the preemptible tree-based RCU implementation that is appropriate for real-time and low-latency SMP builds. It can also accommodate a very large number of CPUs, and also scales down sufficiently well for all but the most memory-constrained systems. The boot parameters for CONFIG_TREE_RCU also apply to CONFIG_TREE_PREEMPT_RCU.

CONFIG_PREEMPT=n and CONFIG_SMP=n implies CONFIG_TINY_RCU, selecting the non-preemptible uniprocessor RCU implementation that is appropriate for non-real-time UP builds. It has the smallest memory footprint of any of the current in-kernel RCU implementations. In fact, its memory footprint is so small that it doesn’t even have any kernel boot parameters.
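The three rules quoted above can be written down as a small decision function. This is a sketch that encodes only what the quote says; the enum, its labels, and the `toy_*` function names are hypothetical and are not kernel identifiers. In particular, it assumes that CONFIG_PREEMPT=y selects the preemptible tree implementation regardless of CONFIG_SMP, because the quote names no other case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three implementations named in the quote. */
enum toy_rcu_impl { TOY_TINY_RCU, TOY_TREE_RCU, TOY_TREE_PREEMPT_RCU };

/* Map the two config switches to an implementation, following the quote. */
enum toy_rcu_impl toy_select_rcu(bool config_preempt, bool config_smp)
{
    if (config_preempt)
        return TOY_TREE_PREEMPT_RCU;   /* CONFIG_PREEMPT=y */
    return config_smp ? TOY_TREE_RCU   /* CONFIG_PREEMPT=n, CONFIG_SMP=y */
                      : TOY_TINY_RCU;  /* CONFIG_PREEMPT=n, CONFIG_SMP=n */
}

/* Return the config symbol that corresponds to a label. */
const char *toy_rcu_impl_name(enum toy_rcu_impl impl)
{
    switch (impl) {
    case TOY_TINY_RCU:         return "CONFIG_TINY_RCU";
    case TOY_TREE_RCU:         return "CONFIG_TREE_RCU";
    case TOY_TREE_PREEMPT_RCU: return "CONFIG_TREE_PREEMPT_RCU";
    }
    assert(!"unknown toy_rcu_impl value");
    return "";
}
```

For example, toy_select_rcu(false, true) yields TOY_TREE_RCU, the server-class SMP case from the first rule.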
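So a CONFIG_PREEMPT=n kernel uses non-preemptible RCU, while CONFIG_PREEMPT=y implies CONFIG_TREE_PREEMPT_RCU, so a CONFIG_PREEMPT=y kernel uses preemptible RCU. I checked my own machine: it has CONFIG_PREEMPT_VOLUNTARY=y, so it runs a non-preemptible kernel.
With non-preemptible RCU, RCU and RCU-Sched are the same thing, because neither kind of reader can be preempted anyway. With preemptible RCU, RCU readers can be preempted but RCU-Sched readers cannot. The official documentation describes the sched flavor of RCU: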
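“Note well that in CONFIG_PREEMPT=y kernels, rcu_read_lock_sched() and rcu_read_unlock_sched() disable and re-enable preemption, respectively.”
In other words, with CONFIG_PREEMPT=y, rcu_read_lock_sched() disables preemption for the duration of the RCU-sched read-side critical section.
So the _sched suffix in the RCU API only makes a difference when CONFIG_PREEMPT=y, that is, when the kernel is configured to be preemptible. In that case RCU read-side critical sections are preemptible, while RCU-sched read-side critical sections are not.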
The LWN article also discusses what happens when RCU and RCU-Sched are mixed:
Quick Quiz 3: What happens if you mix and match RCU and RCU-Sched?
Put differently: what goes wrong if the read side uses one flavor and the update side waits with the other?
Answer: In a CONFIG_TREE_RCU or a CONFIG_TINY_RCU kernel, mixing these two works “by accident” because in those kernel builds, RCU and RCU-Sched map to the same implementation.
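That is, on a kernel built with CONFIG_TREE_RCU=y or CONFIG_TINY_RCU=y, mixing the two happens to be harmless, because in such a kernel RCU and RCU-Sched share the same implementation.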
However, this mixture is fatal in CONFIG_TREE_PREEMPT_RCU builds, due to the fact that RCU’s read-side critical sections can then be preempted, which would permit synchronize_sched() to return before the RCU read-side critical section reached its rcu_read_unlock() call.
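On a CONFIG_TREE_PREEMPT_RCU=y kernel, however, RCU read-side critical sections can be preempted. If the reader uses rcu_read_lock()/rcu_read_unlock() while the writer uses synchronize_sched(), then preempting the reader inside its critical section lets synchronize_sched() conclude that the grace period is over, even though the reader has not actually finished.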
This could, in turn, result in a data structure being freed before the read-side critical section was finished with it, which could, in turn, greatly increase the actuarial risk experienced by your kernel.
So the writer may free the memory while the reader is still reading it, which is a use-after-free.
Even in CONFIG_TREE_RCU and CONFIG_TINY_RCU builds, such mixing and matching is of course very strongly discouraged. Mixing and matching other flavors of RCU is even worse: it can result in hard-to-debug bad-pointer bugs.
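The failure described in the quiz answer can be reproduced in a toy userspace model. This is a sketch under loud assumptions, not kernel code: the structs and all `toy_*` functions are hypothetical, the RCU-sched grace period is simplified to "every CPU has context-switched since the grace period began", and the preemptible-RCU grace period is simplified to "that, plus no reader is still inside its critical section".

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Hypothetical reader task that entered via rcu_read_lock(). */
struct toy_reader {
    int nesting;     /* read-side nesting depth; > 0 means still reading */
    bool preempted;  /* was switched out while inside the critical section */
};

/* Hypothetical per-CPU record for the current grace period. */
struct toy_cpu {
    bool switched_since_gp_start;
};

/* Simplified RCU-sched rule: done once every CPU has context-switched. */
bool toy_sched_gp_done(const struct toy_cpu cpus[TOY_NR_CPUS])
{
    for (int i = 0; i < TOY_NR_CPUS; i++)
        if (!cpus[i].switched_since_gp_start)
            return false;
    return true;
}

/* Simplified preemptible-RCU rule: also wait for the reader itself. */
bool toy_preempt_gp_done(const struct toy_cpu cpus[TOY_NR_CPUS],
                         const struct toy_reader *r)
{
    return toy_sched_gp_done(cpus) && r->nesting == 0;
}

/* Preempt the reader: its CPU context-switches, the reader stays a reader. */
void toy_preempt_reader(struct toy_cpu *cpu, struct toy_reader *r)
{
    assert(r->nesting >= 0);
    cpu->switched_since_gp_start = true;
    if (r->nesting > 0)
        r->preempted = true;
}
```

If a reader with nesting 1 is preempted on the last CPU that had not yet switched, toy_sched_gp_done() becomes true while toy_preempt_gp_done() stays false. A writer relying on the sched-style check would free the data at that point, which is the early return of synchronize_sched() that the answer warns about.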