Lockdep Deadlock Detection in the Linux Kernel

为了维护世界和平_

Last modified 2022-10-24 14:51:18

Categories: Linux kernel analysis, Linux kernel debugging and tracing. Tags: lockdep, deadlock detection, 1024 Programmer's Day

First published 2022-09-27 10:35:07

Article link: https://blog.csdn.net/WANGYONGZIXUE/article/details/127030715

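I. Deadlock

A deadlock occurs when two or more processes or threads compete for resources and end up waiting on each other indefinitely.

Example: process A needs resource X and process B needs resource Y, but X is held by B and Y is held by A. Neither releases what it holds, so both wait forever.

Common kinds of deadlock:

1. Recursive (self) deadlock

2. AB-BA deadlock

Detection technique: Lockdep

How it works: Lockdep tracks the state of each lock and the dependencies between locks, then validates them against a set of rules to make sure the dependencies are correct.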
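The dependency-tracking idea can be illustrated with a toy model in userspace C. This is only a sketch and not how the kernel implements lockdep: the names `dep`, `reachable`, `toy_acquire` and the limit `MAX_CLASSES` are invented for illustration. It records an edge "held -> wanted" each time a lock is taken while another is held, and rejects an acquisition that is recursive or that would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CLASSES 8   /* hypothetical limit for this toy model */

/* dep[a][b] is true when lock class b was acquired while class a was held. */
static bool dep[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: can 'to' be reached from 'from' over recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_CLASSES; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/*
 * Validate and record the dependency "held -> wanted".
 * Returns false, without recording anything, when the acquisition is
 * recursive (same class) or would create a circular dependency.
 */
static bool toy_acquire(int held, int wanted)
{
    bool seen[MAX_CLASSES] = { false };

    assert(held >= 0 && held < MAX_CLASSES);
    assert(wanted >= 0 && wanted < MAX_CLASSES);

    if (held == wanted)
        return false;                   /* recursive locking */
    if (reachable(wanted, held, seen))
        return false;                   /* wanted -> ... -> held already exists */
    dep[held][wanted] = true;
    return true;
}
```

With classes 0 = A and 1 = B, `toy_acquire(0, 1)` succeeds and records A -> B. A later `toy_acquire(1, 0)` is rejected even if the two code paths never actually run at the same time, which is the reason a dependency-based checker can flag a potential deadlock before it happens.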

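II. Lockdep Kernel Configuration

Spinlocks and mutexes

The options are described in detail in the kernel file lib/Kconfig.debug.

CONFIG_DEBUG_LOCKDEP: enables additional self-consistency checks of the lockdep engine itself

CONFIG_PROVE_LOCKING=y: proves locking correctness; the kernel reports a potential deadlock when it detects one

CONFIG_LOCK_STAT: tracks lock contention points and provides more detailed statistics

CONFIG_DEBUG_RT_MUTEXES: deadlocks related to real-time mutex (rt_mutex) semantics

CONFIG_DEBUG_LOCK_ALLOC: detects incorrect freeing of live locks, that is, locks that are still held (this is unrelated to "livelock")

CONFIG_DEBUG_ATOMIC_SLEEP: detects sleeping inside atomic sections

CONFIG_DEBUG_LOCKING_API_SELFTESTS: boot-time self-tests of the locking API

CONFIG_LOCK_TORTURE_TEST: torture tests for locks

Menu path: Kernel hacking -> Lock Debugging

Enabling all of these is fine on a debug kernel, but it is best to leave them off in production because they consume a lot of memory and slow the kernel down.

Reported output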

WARN*()
deadlocks/lock inversion scenarios,
circular lock dependencies,
and hard IRQ/soft IRQ safe/unsafe locking bugs

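III. Deadlock Detection Examples

1. Experiment 1: a hidden lock acquisition

1) Simplified version of the program

do_each_thread(g, t) { /* 'g' : process ptr; 't': thread ptr */
 task_lock(t);                /* takes t->alloc_lock */
 [ ... ]
 get_task_comm(tasknm, t);    /* called with t->alloc_lock still held */
 task_unlock(t);
} while_each_thread(g, t);

The loop iterates over every thread's task structure: take the lock, read the task information, release the lock. At first glance nothing looks wrong.

2) Kernel output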

#insmod thrd_showall_buggy.ko 
[ 1404.479012] thrd_showall_buggy: loading out-of-tree module taints kernel.
[ 1404.484444] thrd_showall_buggy: module verification failed: signature and/or required key missing - tainting kernel
[ 1404.510962] thrd_showall_buggy: inserted
[ 1404.516417] ============================================
[ 1404.517142] WARNING: possible recursive locking detected
[ 1404.517826] 5.0.0+ #2 Tainted: G           OE    
[ 1404.518250] --------------------------------------------
[ 1404.518432] insmod/1348 is trying to acquire lock:
[ 1404.519375] 000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: __get_task_comm+0x38/0x88
[ 1404.521885] 
[ 1404.521885] but task is already holding lock:
[ 1404.522282] 000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: showthrds_buggy+0x9c/0x504 [thrd_showall_buggy]
[ 1404.523738] 
[ 1404.523738] other info that might help us debug this:
[ 1404.524108]  Possible unsafe locking scenario:
[ 1404.524108] 
[ 1404.524359]        CPU0
[ 1404.524451]        ----
[ 1404.524588]   lock(&(&p->alloc_lock)->rlock);
[ 1404.524774]   lock(&(&p->alloc_lock)->rlock);
[ 1404.525658] 
[ 1404.525658]  *** DEADLOCK ***
[ 1404.525658] 
[ 1404.526054]  May be due to missing lock nesting notation
[ 1404.526054] 
[ 1404.526665] 1 lock held by insmod/1348:
[ 1404.527124]  #0: 000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: showthrds_buggy+0x9c/0x504 [thrd_showall_buggy]
[ 1404.528286] 
[ 1404.528286] stack backtrace:
[ 1404.529195] CPU: 1 PID: 1348 Comm: insmod Kdump: loaded Tainted: G           OE     5.0.0+ #2
[ 1404.530369] Hardware name: linux,dummy-virt (DT)
[ 1404.531230] Call trace:
[ 1404.531459]  dump_backtrace+0x0/0x52c

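3) Analyzing the output

WARNING: possible recursive locking detected // recursive locking was detected
[ 1404.518432] insmod/1348 is trying to acquire lock: // the lock being acquired

        000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: __get_task_comm+0x38/0x88
        The location is printed as function name + offset / function size, which makes it easy to pinpoint the call site.

        [ 1404.521885] but task is already holding lock: // the lock is already held
        at: showthrds_buggy+0x9c/0x504 [thrd_showall_buggy] // where the task took it

Meaning of the {+.+.} notation

'+' means the lock was acquired with IRQs enabled

'.' means the lock was acquired with IRQs disabled and not in IRQ context

For the full details, see https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt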
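As a reading aid, the usage string can be decoded mechanically. The sketch below is a userspace helper with invented names (`usage_char_meaning`, `usage_position_name`, `print_usage`). The meanings of '-' and '?' and the position names are taken from the lockdep design document linked above rather than from this article, and they assume the four-character form seen in the 5.0 log.

```c
#include <stddef.h>
#include <stdio.h>

/* Describe one character of a lockdep usage string such as "+.+.". */
static const char *usage_char_meaning(char c)
{
    switch (c) {
    case '.': return "acquired with IRQs disabled, not in IRQ context";
    case '-': return "acquired in IRQ context";
    case '+': return "acquired with IRQs enabled";
    case '?': return "acquired in IRQ context with IRQs enabled";
    default:  return "unknown";
    }
}

/* Name of the state each position refers to (four-character form assumed). */
static const char *usage_position_name(size_t i)
{
    static const char *const names[] = {
        "hardirq", "hardirq-read", "softirq", "softirq-read",
    };
    return i < sizeof(names) / sizeof(names[0]) ? names[i] : "unknown";
}

/* Print one line per character of the usage string. */
static void print_usage(const char *usage)
{
    for (size_t i = 0; usage[i] != '\0'; i++)
        printf("%-12s '%c': %s\n", usage_position_name(i), usage[i],
               usage_char_meaning(usage[i]));
}
```

Calling `print_usage("+.+.")` prints four lines, one per position, pairing each state name with the meaning of its character.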

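The analysis above shows that get_task_comm tries to acquire the same lock a second time, which causes the deadlock. A look at this kernel function confirms that it does take the lock.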

#define get_task_comm(buf, tsk) ({			\
	BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);	\
	__get_task_comm(buf, sizeof(buf), tsk);		\
})


char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
{
	task_lock(tsk);
	strncpy(buf, tsk->comm, buf_size);
	task_unlock(tsk);
	return buf;
}
EXPORT_SYMBOL_GPL(__get_task_comm);

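4) Fix: simplified version

do_each_thread(g, t) {
       task_lock(t);
       ...
       task_unlock(t);             /* drop alloc_lock before get_task_comm() */
       get_task_comm(tasknm, t);   /* takes and releases alloc_lock internally */
       task_lock(t);
       ...
       task_unlock(t);
} while_each_thread(g, t);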

2. Experiment 2: AB-BA locking

1) Model

2) Lock ordering in the program

Thread 1:
         spin_lock(&lockA);           
         spin_lock(&lockB);

         spin_unlock(&lockB);
         spin_unlock(&lockA);

Thread 2:
         spin_lock(&lockB);
         spin_lock(&lockA);

         spin_unlock(&lockA);
         spin_unlock(&lockB);
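The usual remedy for an AB-BA pattern is to make every path take the two locks in the same order. The article does not show its fix, so the following is only a userspace analogy using pthread mutexes instead of kernel spinlocks. The helpers `lock_pair`, `unlock_pair`, `worker` and `run_two_threads` are hypothetical names, and ordering by address is just one possible convention.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static pthread_mutex_t lockA = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lockB = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;     /* protected by lockA and lockB together */

/* Take two distinct mutexes in one global order: lowest address first. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

struct order { pthread_mutex_t *first, *second; };

static void *worker(void *arg)
{
    struct order *o = arg;

    for (int i = 0; i < 100000; i++) {
        lock_pair(o->first, o->second);   /* caller's order is normalized */
        shared_counter++;
        unlock_pair(o->first, o->second);
    }
    return NULL;
}

/* One thread asks for A,B and the other for B,A; both end up locking in the same order. */
static long run_two_threads(void)
{
    struct order ab = { &lockA, &lockB }, ba = { &lockB, &lockA };
    pthread_t t1, t2;

    shared_counter = 0;
    pthread_create(&t1, NULL, worker, &ab);
    pthread_create(&t2, NULL, worker, &ba);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Because both threads always acquire the lower-addressed mutex first, neither can hold one lock while waiting for the other in the opposite order, so `run_two_threads()` completes instead of hanging.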

3) Kernel detection output

insmod deadlock_eg_AB-BA.ko lock_ooo=1
key missing - tainting kernel
[  190.895374] deadlock_eg_AB-BA: inserted (param: lock_ooo=1)
[  190.924925] thrd_work():115: *** thread PID 1616 on cpu 0 now ***
[  190.936420] thrd_work():115: *** thread PID 1617 on cpu 1 now ***
[  190.937541]  iteration #0 on cpu #1
[  190.938060]  Thread #0: locking: we do: lockA --> lockB
[  190.939223]  Thread #1: locking: we do: lockB --> lockA
[  190.941822]  iteration #0 on cpu #0
[  190.946014] B
[  190.946185] A
[  190.946231] B
[  190.949057] A
[  190.949818] A
[  190.950828] irq event stamp: 12493
[  190.950846] 
[  190.952232] hardirqs last  enabled at (12493): [<ffff0000106f0c28>] kmem_cache_free+0x6b0/0x1178
[  190.953328] hardirqs last disabled at (12492): [<ffff0000106f0bd8>] kmem_cache_free+0x660/0x1178
[  190.953951] ======================================================
[  190.953983] WARNING: possible circular locking dependency detected
[  190.955155] softirqs last  enabled at (12436): [<ffff000010091b34>] fpsimd_restore_current_state+0x4fc/0x53c
[  190.957546] 5.0.0+ #2 Tainted: G           OE    
[  190.957646] ------------------------------------------------------
[  190.957880] softirqs last disabled at (12434): [<ffff000010091960>] fpsimd_restore_current_state+0x328/0x53c
[  190.960741] thrd_0/0/1616 is trying to acquire lock:
[  190.964268] (____ptrval____) (lockB){+.+.}, at: thrd_work+0x1e8/0x6c0 [deadlock_eg_AB_BA]
[  190.973906] 
[  190.973906] but task is already holding lock:
[  190.975638] (____ptrval____) (lockA){+.+.}, at: thrd_work+0x130/0x6c0 [deadlock_eg_AB_BA]
[  190.979383] 
[  190.979383] which lock already depends on the new lock.
[  190.979383] 
[  190.984836] 
[  190.984836] the existing dependency chain (in reverse order) is:
[  190.989808] 
[  190.989808] -> #1 (lockA){+.+.}:
[  190.991925]        validate_chain+0x1250/0x14a0
[  190.992364]        __lock_acquire+0xae4/0xc08
[  190.993684]        lock_acquire+0x664/0x6b8
[  190.998583]        _raw_spin_lock+0x54/0xb0
[  190.999824]        thrd_work+0x3f0/0x6c0 [deadlock_eg_AB_BA]
[  191.001019]        kthread+0x3c0/0x3cc
[  191.002301] 
[  191.002301] -> #0 (lockB){+.+.}:
[  191.006355]        check_prevs_add+0x148/0x2cc
[  191.007502]        validate_chain+0x1250/0x14a0
[  191.010230]        __lock_acquire+0xae4/0xc08
[  191.011146]        lock_acquire+0x664/0x6b8
[  191.012896]        _raw_spin_lock+0x54/0xb0
[  191.016753]        thrd_work+0x1e8/0x6c0 [deadlock_eg_AB_BA]
[  191.020368]        kthread+0x3c0/0x3cc
[  191.022408] 
[  191.022408] other info that might help us debug this:
[  191.022408] 
[  191.025625]  Possible unsafe locking scenario:
[  191.025625] 
[  191.030342]        CPU0                    CPU1
[  191.034011]        ----                    ----
[  191.035514]   lock(lockA);
[  191.037973]                                lock(lockB);
[  191.042529]                                lock(lockA);
[  191.045536]   lock(lockB);
[  191.047178] 
[  191.047178]  *** DEADLOCK ***
[  191.047178] 
[  191.051286] 1 lock held by thrd_0/0/1616:
[  191.053763]  #0: (____ptrval____) (lockA){+.+.}, at: thrd_work+0x130/0x6c0 [deadlock_eg_AB_BA]
[  191.058936] 
[  191.058936] stack backtrace:
[  191.060266] CPU: 0 PID: 1616 Comm: thrd_0/0 Kdump: loaded Tainted: G           OE     5.0.0+ #2
[  191.061426] Hardware name: linux,dummy-virt (DT)
[  191.062226] Call trace:
[  191.062582]  dump_backtrace+0x0/0x52c
[  191.063168]  show_stack+0x24/0x30

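4) Analysis

WARNING: possible circular locking dependency detected
Possible unsafe locking scenario:

[  191.030342]        CPU0                    CPU1
[  191.034011]        ----                    ----
[  191.035514]   lock(lockA);
[  191.037973]                                lock(lockB);
[  191.042529]                                lock(lockA);
[  191.045536]   lock(lockB);

The thread on CPU0 holds lockA and wants lockB, while the thread on CPU1 holds lockB and wants lockA, so each waits for the lock the other one holds.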

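IV. Lock Statistics

The kernel provides lock statistics so that heavily contended locks are easy to identify.

A lock is contended when a context tries to acquire it while it is already held, so that context has to wait until the lock is released. Heavy contention can become a serious performance bottleneck.

Kernel configuration: CONFIG_LOCK_STAT

Command line

Clear the lock statistics: echo 0 > /proc/lock_stat

Enable statistics collection: echo 1 > /proc/sys/kernel/lock_stat

Disable statistics collection: echo 0 > /proc/sys/kernel/lock_stat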

V. Suggestions for Programming with lockdep

Use the lockdep_assert_held macros. Source location: include/linux/lockdep.h

#define lockdep_assert_held(l)	do {				\
		WARN_ON(debug_locks && !lockdep_is_held(l));	\
	} while (0)

#define lockdep_assert_held_write(l)	do {			\
		WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));	\
	} while (0)

#define lockdep_assert_held_read(l)	do {				\
		WARN_ON(debug_locks && !lockdep_is_held_type(l, 1));	\
	} while (0)

#define lockdep_assert_held_once(l)	do {				\
		WARN_ON_ONCE(debug_locks && !lockdep_is_held(l));	\
	} while (0)

If the assertion fails, a warning is emitted through WARN_ON (or WARN_ON_ONCE for the _once variant).
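The typical use of these macros is at the top of a helper that must only be called with a particular lock held. Kernel code cannot be run outside the kernel, so the sketch below is a userspace analogy rather than the kernel API: `toy_lock`, `toy_unlock`, `toy_is_held`, `toy_assert_held` and `struct counter` are all invented for illustration. It keeps a per-thread list of held locks and warns when a "_locked" helper is called without its lock.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_HELD 8   /* hypothetical limit; locks beyond it are not tracked */

/* Per-thread record of held locks, so the check needs no extra locking. */
static _Thread_local const void *held_locks[MAX_HELD];
static _Thread_local int held_depth;

static bool toy_is_held(const void *lock)
{
    for (int i = 0; i < held_depth; i++)
        if (held_locks[i] == lock)
            return true;
    return false;
}

static void toy_lock(pthread_mutex_t *m)
{
    pthread_mutex_lock(m);
    if (held_depth < MAX_HELD)
        held_locks[held_depth++] = m;
}

static void toy_unlock(pthread_mutex_t *m)
{
    /* Forget the most recent record of this lock, then release it. */
    for (int i = held_depth - 1; i >= 0; i--) {
        if (held_locks[i] == m) {
            held_locks[i] = held_locks[--held_depth];
            break;
        }
    }
    pthread_mutex_unlock(m);
}

/* Warn and return false when the calling thread does not hold 'lock'. */
static bool toy_assert_held(const void *lock, const char *func)
{
    if (toy_is_held(lock))
        return true;
    fprintf(stderr, "WARNING: %s called without holding lock %p\n", func, lock);
    return false;
}

struct counter {
    pthread_mutex_t lock;
    long value;
};

/* Caller must hold c->lock; unlike the kernel macro, this toy also refuses to proceed. */
static bool counter_add_locked(struct counter *c, long n)
{
    if (!toy_assert_held(&c->lock, __func__))
        return false;
    c->value += n;
    return true;
}
```

Calling `counter_add_locked()` without first calling `toy_lock(&c->lock)` prints the warning and leaves the counter untouched. In kernel code the equivalent check would be a `lockdep_assert_held(&c->lock)` line at the start of the helper, which only warns.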

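VI. Possible Problems When Using lockdep

Known problems

Repeatedly loading and unloading modules can exceed lockdep's internal limit on lock classes. In practice, either avoid repeated load/unload cycles or reboot the system.
When a data structure is large and needs a very large number of locks, failing to initialize every one of them properly can overflow lockdep's lock classes.
The message "*WARNING* lock debugging disabled!! - possibly due to a lockdep warning." may appear because lockdep already emitted a warning earlier.

Solution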