1 现象
出现大概率反复开关机卡死,大概测试到几百次就能复现。
2 分析
幸好卡死的时候有adb端口。用ps命令执行发现会卡死,但是能打印出部分进程状态,已经打印出来的进程里面看到systemmanger处于D状态,ps –t 查看是它的主线程处于D状态,但是用cat /proc/***/stack又没有发现该线程有啥异常,
尝试直接用echo w > /proc/sysrq-trigger打印D状态内核D状态调用栈,另外用echo l >/proc/sysrq-trigger打印activeCPU的栈。Log如下:
从log中能看到很多D状态线程都卡在ashmem_ioctl函数中,
<6>[2015:07:24 21:36:21] [ 399.118434] [2652.0 sh]main D c098eb5c 0 2640 1 0x00000000
<4>[2015:07:24 21:36:21] [ 399.118514] [2652.0 sh]Backtrace:
<4>[2015:07:24 21:36:21] [ 399.118596] [2652.0 sh][<c098e750>](__schedule+0x0/0x95c) from [<c098f20c>] (schedule+0x50/0x68)
<4>[2015:07:24 21:36:21] [ 399.118692] [2652.0 sh][<c098f1bc>](schedule+0x0/0x68) from [<c099028c>] (__mutex_lock_slowpath+0x130/0x1c8)
<4>[2015:07:24 21:36:21] [ 399.118789] [2652.0 sh][<c099015c>](__mutex_lock_slowpath+0x0/0x1c8) from [<c0990364>](mutex_lock+0x40/0x44)
<4>[2015:07:24 21:36:21] [ 399.118887] [2652.0 sh][<c0990324>](mutex_lock+0x0/0x44) from [<c013efd0>] (ashmem_ioctl+0x310/0x74c)
<4>[2015:07:24 21:36:21] [ 399.118951] [2652.0 sh] r4:42f66a0cr3:41007701
<4>[2015:07:24 21:36:21] [ 399.119056] [2652.0 sh][<c013ecc0>](ashmem_ioctl+0x0/0x74c) from [<c015a044>] (do_vfs_ioctl+0x8c/0x5bc)
<4>[2015:07:24 21:36:21] [ 399.119153] [2652.0 sh][<c0159fb8>](do_vfs_ioctl+0x0/0x5bc) from [<c015a5e8>] (sys_ioctl+0x74/0x84)
<4>[2015:07:24 21:36:21] [ 399.119249] [2652.0 sh][<c015a574>](sys_ioctl+0x0/0x84) from [<c0057fc0>] (ret_fast_syscall+0x0/0x30)
<4>[2015:07:24 21:36:21] [ 399.119313] [2652.0 sh] r8:c0058168r7:00000036 r6:47427020 r5:42f66a0c r4:00000018
对照C代码是如下位置,说明该锁被人占了,这个ashmem_mutex锁是在kernel/mm/ashmem.c中定义的,只有这个文件内部使用,可以挨个检查下。
static intset_name(struct ashmem_area *asma, void __user *name)
{
int ret = 0;
mutex_lock(&ashmem_mutex);
/* cannot change an existing mapping'sname */
if (unlikely(asma->file)) {
ret = -EINVAL;
goto out;
}
if(unlikely(copy_from_user(asma->name + ASHMEM_NAME_PREFIX_LEN,
name,ASHMEM_NAME_LEN)))
ret = -EFAULT;
asma->name[ASHMEM_FULL_NAME_LEN-1] ='\0';
out:
mutex_unlock(&ashmem_mutex);
return ret;
}
同时log中发现有如下信息,
<6>[2015:07:2421:36:21] [ 399.134198] [2652.0sh]sysEvents handl D c098eb5c 0 2601 161 0x00000000
<4>[2015:07:2421:36:21] [ 399.134222] [2652.0sh]Backtrace:
<4>[2015:07:2421:36:21] [ 399.134245] [2652.0sh][<c098e750>] (__schedule+0x0/0x95c) from [<c098f20c>](schedule+0x50/0x68)
<4>[2015:07:2421:36:21] [ 399.134269] [2652.0sh][<c098f1bc>] (schedule+0x0/0x68) from [<c0991498>](__down_read+0xb4/0xec)
<4>[2015:07:2421:36:21] [ 399.134294] [2652.0sh][<c09913e4>] (__down_read+0x0/0xec) from [<c0990640>](down_read+0x10/0x14)
<4>[2015:07:2421:36:21] [ 399.134311] [2652.0 sh]r7:42f6400b r6:d3318440 r5:00000817 r4:d26e21a0
<4>[2015:07:2421:36:21] [ 399.134350] [2652.0sh][<c0990630>] (down_read+0x0/0x14) from [<c0060798>](do_page_fault+0x6c/0x1ec)
<4>[2015:07:2421:36:21] [ 399.134385] [2652.0sh][<c006072c>] (do_page_fault+0x0/0x1ec) from [<c0051b70>] (do_DataAbort+0x40/0x110)这里发生了缺页中断
<4>[2015:07:2421:36:21] [ 399.134413] [2652.0sh][<c0051b30>] (do_DataAbort+0x0/0x110) from [<c0057a2c>](__dabt_svc+0x4c/0x60)
<4>[2015:07:2421:36:21] [ 399.134432] [2652.0 sh]Exceptionstack(0xceb57d70 to 0xceb57db8)
<4>[2015:07:2421:36:21] [ 399.134451] [2652.0sh]7d60: ceb57e28 42f6400b 00000000 00000000
<4>[2015:07:2421:36:21] [ 399.134474] [2652.0 sh]7d80:ceb57e28 00001000 c1a7a240 0001f2e8 00000000 00000000 cffe3b84 ceb57de4
<4>[2015:07:2421:36:21] [ 399.134494] [2652.0 sh]7da0:00007000 ceb57db8 c0123090 c0108cd8 20000013 ffffffff
<4>[2015:07:2421:36:21] [ 399.134533] [2652.0sh][<c0108c74>] (file_read_actor+0x0/0xfc)from [<c0123090>] (shmem_file_aio_read+0x13c/0x3a0)
<4>[2015:07:2421:36:21] [ 399.134562] [2652.0sh][<c0122f54>] (shmem_file_aio_read+0x0/0x3a0) from [<c0148580>](do_sync_read+0xac/0xdc)
<4>[2015:07:2421:36:21] [ 399.134593] [2652.0sh][<c01484d4>] (do_sync_read+0x0/0xdc) from [<c013e3ac>](ashmem_read+0x60/0x98)
<4>[2015:07:2421:36:21] [ 399.134618] [2652.0sh][<c013e34c>] (ashmem_read+0x0/0x98)from [<c0148d28>] (vfs_read+0xa4/0x148) 这里正在读数据
<4>[2015:07:2421:36:21] [ 399.134636] [2652.0 sh]r7:ceb57f78 r6:42f6300c r5:0001f2e8 r4:ce859640
<4>[2015:07:2421:36:21] [ 399.134667] [2652.0sh][<c0148c84>] (vfs_read+0x0/0x148) from [<c0148e0c>](sys_read+0x40/0x78)
<4>[2015:07:2421:36:21] [ 399.134684] [2652.0 sh]r8:0001f2e8 r7:42f6300c r6:ce859640 r5:00000000 r4:00000000
<4>[2015:07:2421:36:21] [ 399.134719] [2652.0sh][<c0148dcc>] (sys_read+0x0/0x78) from [<c0057fc0>](ret_fast_syscall+0x0/0x30)
<4>[2015:07:2421:36:21] [ 399.134736] [2652.0 sh]r8:c0058168 r7:00000003 r6:7151c400 r5:537025cc r4:47428940
难道这里占着锁,对照代码查看
static ssize_tashmem_read(struct file *file, char __user *buf,
size_t len, loff_t*pos)
{
struct ashmem_area *asma =file->private_data;
int ret = 0;
mutex_lock(&ashmem_mutex);
/* If size is not set, or set to 0,always return EOF. */
if (asma->size == 0) {
goto out;
}
if (!asma->file) {
ret = -EBADF;
goto out;
}
ret = asma->file->f_op->read(asma->file, buf, len, pos);
if (ret < 0) {
goto out;
}
/** Update backing file pos, sincef_ops->read() doesn't */
asma->file->f_pos = *pos;
out:
mutex_unlock(&ashmem_mutex);
return ret;
}
果然,read函数调用是在持有锁且并没有释放的时候执行的,因为read函数有可能导致触发缺页,那么就存在长时间卡在这里的风险了。问题是4.4版本为什么没有这个问题呢,我理解是升级到L版本后,内存需求变大了,发生缺页的概率提高,导致因为这把锁出现D状态的概率变大了。
修改:
这个问题应该是3.0.8原生内核的bug,对照3.10内核发现已经进行了修正,因此可以仿照新内核版本进行修改,所有可能发生缺页的地方都修改下。这里主要有几个地方
1. ashmem_read函数,这里的read函数要放在锁之后进行
2. set_name和get_name函数因为调用copy_from_user和copy_to_user也可能会导致缺页,也需要用先用局部变量缓冲下,再上锁操作局部变量。3. 除了上述的修改外,也作了些空指针的保护