在系统中,有时可能会看到如下警告:
INFO: rcu_sched self-detected stall on CPU
0-...: (9999 ticks this GP) idle=cee/140000000000001/0 softirq=64858/64858 fqs=4891
(t=10000 jiffies g=13416 c=13415 q=68)
Task dump for CPU 0:
my_debug_1 R running task
这是rcu的cpu停顿警告,也只是一个警告诉你。就是告诉我们,某个cpu长时间没有进行进程切换等其他操作。
几个概念:
静止状态QS(Quiescent State): CPU发生了上下文切换称为经历一个quiescent state;
当所有CPU都经历过静止状态时,才标志着一个宽限期的结束。
静止状态的标志:
- 在时钟tick中检测CPU处于
用户模式
或者idle模式
,则表明CPU离开了临界区; - 在不支持抢占的RCU实现中,检测到CPU有context切换,就能表明CPU离开了临界区;
- 另外对于 rcu_bh,如果现在没有正在处理软中断或是禁止了软中断,对于 rcu_bh 类型的 RCU 来说,也经历过了一个Quiescent State。
导致RCU警告的情况
导致的原因有很多,下文为内核文档 RCU stallwarn.txt
的部分原文以及对应的翻译:
So your kernel printed an RCU CPU stall warning. The next question is “What caused it?” The following problems can result in RCU CPU stall warnings:
所以你的内核打印了一个 RCU CPU 停顿警告。 下一个问题是“是什么引起的?” 以下问题可能导致 RCU CPU 停顿警告:
-
A CPU looping in an RCU read-side critical section.
在 RCU 读取端临界区中循环的 CPU。
-
A CPU looping with interrupts disabled.
禁用中断的 CPU 循环
-
A CPU looping with preemption disabled. This condition can result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh stalls.
禁用抢占的 CPU 循环。 这种情况会导致 RCU-sched 停顿,如果 ksoftirqd 正在使用,RCU-bh 停顿。
-
A CPU looping with bottom halves disabled. This condition can result in RCU-sched and RCU-bh stalls.
禁用下半部分的 CPU 循环。 这种情况会导致 RCU-schedule 和 RCU-bh 停顿。
-
For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel without invoking schedule(). Note that cond_resched() does not necessarily prevent RCU CPU stall warnings. Therefore, if the looping in the kernel is really expected and desirable behavior, you might need to replace some of the cond_resched() calls with calls to cond_resched_rcu_qs().
对于 !CONFIG_PREEMPT 内核,CPU 在内核中的任何位置循环,而无需调用 schedule()。 请注意, cond_resched() 不一定能阻止 RCU CPU 停顿警告。 因此,如果内核中的循环确实是预期和可取的行为,您可能需要将一些 cond_resched() 调用替换为对 cond_resched_rcu_qs() 的调用
-
Booting Linux using a console connection that is too slow to keep up with the boot-time console-message rate. For example, a 115Kbaud serial console can be -way- too slow to keep up with boot-time message rates, and will frequently result in RCU CPU stall warning messages. Especially if you have added debug printk()s.
使用控制台连接引导 Linux,该连接太慢而无法跟上引导时控制台消息速率。 例如,115Kbaud 串行控制台可能太慢而无法跟上启动时的消息速率,并且会经常导致 RCU CPU 停顿警告消息。 特别是如果您添加了调试 printk()s。
-
Anything that prevents RCU’s grace-period kthreads from running. This can result in the “All QSes seen” console-log message. This message will include information on when the kthread last ran and how often it should be expected to run.
任何阻止 RCU 的宽限期 kthread 运行的东西。 这可能会导致“所有 QSes 可见”控制台日志消息。 此消息将包含有关 kthread 上次运行的时间以及应该多久运行一次的信息。
-
A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might happen to preempt a low-priority task in the middle of an RCU read-side critical section. This is especially damaging if that low-priority task is not permitted to run on any other CPU, in which case the next RCU grace period can never complete, which will eventually cause the system to run out of memory and hang. While the system is in the process of running itself out of memory, you might see stall-warning messages.
CONFIG_PREEMPT 内核中的 CPU 密集型实时任务,它可能会在 RCU 读取端临界区的中间抢占低优先级任务。 如果不允许该低优先级任务在任何其他 CPU 上运行,则这尤其具有破坏性,在这种情况下,下一个 RCU 宽限期将永远无法完成,最终将导致系统内存不足并挂起。 当系统正在耗尽内存时,您可能会看到停顿警告消息。
-
A CPU-bound real-time task in a CONFIG_PREEMPT_RT kernel that is running at a higher priority than the RCU softirq threads. This will prevent RCU callbacks from ever being invoked, and in a CONFIG_PREEMPT_RCU kernel will further prevent RCU grace periods from ever completing. Either way, the system will eventually run out of memory and hang. In the CONFIG_PREEMPT_RCU case, you might see stall-warning messages.
CONFIG_PREEMPT_RT 内核中的 CPU 密集型实时任务,它以比 RCU 软中断线程更高的优先级运行。 这将阻止 RCU 回调被调用,并且在 CONFIG_PREEMPT_RCU 内核中将进一步阻止 RCU 宽限期完成。 无论哪种方式,系统最终都会耗尽内存并挂起。 在 CONFIG_PREEMPT_RCU 情况下,您可能会看到停顿警告消息。
-
A hardware or software issue shuts off the scheduler-clock interrupt on a CPU that is not in dyntick-idle mode. This problem really has happened, and seems to be most likely to result in RCU CPU stall warnings for CONFIG_NO_HZ_COMMON=n kernels.
硬件或软件问题会关闭未处于 dyntick-idle 模式的 CPU 上的调度程序时钟中断。 这个问题确实发生了,并且似乎最有可能导致 CONFIG_NO_HZ_COMMON=n 内核的 RCU CPU 停顿警告。
-
A bug in the RCU implementation.
RCU 实现中的错误。
-
A hardware failure. This is quite unlikely, but has occurred at least once in real life. A CPU failed in a running system, becoming unresponsive, but not causing an immediate crash. This resulted in a series of RCU CPU stall warnings, eventually leading the realization that the CPU had failed.
硬件故障。 这不太可能,但在现实生活中至少发生过一次。 CPU 在运行的系统中出现故障,变得无响应,但不会导致立即崩溃。 这导致了一系列 RCU CPU 停顿警告,最终导致人们意识到 CPU 已发生故障。
导致RCU警告的原因
上述情况都会因为阻止进程切换或CPU的IDLE状态。这样,在其他核心使用RCU的时候,RCU宽限期就会被延长,甚至无限延长。当宽限期超过 /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout
的值的时候,就会打印此消息。
验证
方式1
将多核处理器的某个核心通过自旋锁所住,然后在其他核心使用rcu,看是 call_rcu() 函数注册的回调处理函数是否会被执行。
结果:
回调函数不会被执行,该回调函数会在宽限期结束时执行,但是由于死锁的存在,宽限期被无无限延长,所以,回调函数永远不会执行。
方式2
对于双核设备,将创建两个线程,分别绑定两个核心,然后将两个线程互相锁住。
volatile int a = 0;
volatile int b = 0;
int debug_1(void *arg)
{
printk("11111 begin\n");
msleep(1000);
printk("11111 after sleep\n");
spin_lock(&sp_lock_1);
while(a == 0);
b = 1;
printk("11111 get lock2\n");
spin_lock(&sp_lock_2);
printk("11111 over\n");
return 0;
}
int debug_2(void *arg)
{
printk("22222 begin\n");
msleep(1000);
printk("22222 after sleep\n");
spin_lock(&sp_lock_2);
a = 1;
while(b == 0);
printk("22222 get lock1\n");
spin_lock(&sp_lock_1);
printk("22222 over\n");
return 0;
}
结果:
不会打印RCU失速警告,因为死锁过程中没有使用RCU,所以不会打印RCU警告。