Analyzing a kernel crash on the Qualcomm SXR2130 platform (aarch64)

Parsing the ramdump with QCAP gives the following result (see QCAP 3.0 Report.html in the attachments):
61.586598: <6> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000146
61.595874: <6> Mem abort info:
61.598975: <6> ESR = 0x96000045
61.602215: <6> Exception class = DABT (current EL), IL = 32 bits
61.608384: <6> SET = 0, FnV = 0
61.608553: <2> type=1400 audit(1619681318.106:2527): avc: denied { read } for comm="pxrstreamingser" name="plugin_status" dev="sysfs" ino=70479 scontext=u:r:pxrstreamingservice:s0 tcontext=u:object_r:sysfs:s0 tclass=file permissive=1
61.611553: <6> EA = 0, S1PTW = 0
61.611855: <2> type=1400 audit(1619681319.058:2533): avc: denied { read } for comm="pxrhmdservice" name="brightness" dev="sysfs" ino=83427 scontext=u:r:pxrhmdservice:s0 tcontext=u:object_r:sysfs_graphics:s0 tclass=file permissive=1
61.614814: <6> Data abort info:
61.614817: <6> ISV = 0, ISS = 0x00000045
61.614819: <6> CM = 0, WnR = 1
61.614823: <6> user pgtable: 4k pages, 39-bit VAs, pgdp = ffffffe822bd0000
61.614834: <6> [0000000000000146] pgd=0000000000000000, pud=0000000000000000
61.614842: <6> Internal error: Oops: 96000045 [#1] PREEMPT SMP
61.614844: <6> Modules linked in: kernel_file(+) wlan(O) machine_dlkm(O) wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O) tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O) swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O) stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) pinctrl_wcd_dlkm(O) usf_dlkm(O) native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O) apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O) msm_11ad_proxy
61.614872: <6> Process insmod (pid: 4610, stack limit = 0xffffff8020e78000)
61.614875: <6> CPU: 5 PID: 4610 Comm: insmod Tainted: G S W O 4.19.81+ #43
61.614876: <6> Hardware name: Qualcomm Technologies, Inc. kona-iot MTP (DT)
61.614877: <2> pstate: 60400005 (nZCv daif +PAN -UAO)
61.614882: <2> pc : init_module+0xd8/0x1000 [kernel_file]
61.614883: <2> lr : init_module+0xd4/0x1000 [kernel_file]
61.614884: <2> sp : ffffff8020e7b9f0
61.614885: <2> x29: ffffff8020e7bab0 x28: 000000000000001c
61.614887: <2> x27: ffffffac7222a050 x26: 0000000000000700
61.614889: <2> x25: 0000000000000002 x24: 0000000000000002
61.614890: <2> x23: ffffffe8193d5c80 x22: 0000000000000000
61.614892: <2> x21: ffffffac72229000 x20: ffffffe7a20a4fc8
61.614894: <2> x19: fffffffffffffffe x18: 00000000fffa5364
61.614896: <2> x17: 00000000000a5364 x16: 0000000000000000
61.614897: <2> x15: 00000000000000c2 x14: 0000000000000024
61.614899: <2> x13: 0000000000000004 x12: 0000000046670ad0
61.614901: <2> x11: 0000000046670ad0 x10: 0000000000000015
61.614903: <2> x9 : 21fae061be9b1600 x8 : 21fae061be9b1600
61.614904: <2> x7 : ffffffacae1713ec x6 : 0000000000000000
61.614906: <2> x5 : 0000000000000080 x4 : 0000000000000001
61.614908: <2> x3 : ffffff8020e7b608 x2 : ffffffe8b9e17798
61.614910: <2> x1 : ffffffacae17153c x0 : 0000000000000011
61.614912: <2> Call trace:
61.614914: <2> init_module+0xd8/0x1000 [kernel_file]
61.614919: <2> do_one_initcall+0x1fc/0x410
61.614923: <2> do_init_module+0x60/0x228
61.614924: <2> load_module+0x34d4/0x3cf0
61.614927: <2> __arm64_sys_finit_module+0xfc/0x130
61.614930: <2> el0_svc_common+0xac/0x188
61.614932: <2> el0_svc_handler+0x7c/0x98
61.614933: <2> el0_svc+0x8/0xc
61.614935: <6> Code: f94416e8 f9005d14 94000064 d0000dd5 (f900a67f)
61.614936: <6> ---[ end trace 0bde494be0c54cfa ]---
61.614943: <6> Kernel panic - not syncing: Fatal exception
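
The ESR value in the oops can be decoded by hand to cross-check the fields the kernel prints. The sketch below is a user-space illustration, not kernel code. The function names are made up for this example, and the field layout (EC in bits 31:26, IL in bit 25, and for a data abort WnR in ISS bit 6 and DFSC in ISS bits 5:0) follows the Arm architecture's ESR_ELx definition.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative helpers; the names are hypothetical, not kernel APIs. */

/* Exception Class: ESR_ELx bits [31:26]. 0x25 = data abort taken at the current EL. */
static unsigned esr_ec(uint32_t esr)   { return (esr >> 26) & 0x3f; }

/* Instruction Length: bit 25. 1 = 32-bit instruction. */
static unsigned esr_il(uint32_t esr)   { return (esr >> 25) & 0x1; }

/* Instruction Specific Syndrome: bits [24:0]. */
static uint32_t esr_iss(uint32_t esr)  { return esr & 0x1ffffff; }

/* Data aborts only: Write-not-Read, ISS bit 6. 1 = the faulting access was a write. */
static unsigned esr_wnr(uint32_t esr)  { return (esr >> 6) & 0x1; }

/* Data aborts only: Data Fault Status Code, ISS bits [5:0].
 * 0x05 = translation fault, level 1. */
static unsigned esr_dfsc(uint32_t esr) { return esr & 0x3f; }

/* Print all decoded fields on one line. */
static void print_esr(uint32_t esr)
{
    printf("ESR=0x%08x EC=0x%02x IL=%u ISS=0x%08x WnR=%u DFSC=0x%02x\n",
           (unsigned)esr, esr_ec(esr), esr_il(esr), (unsigned)esr_iss(esr),
           esr_wnr(esr), esr_dfsc(esr));
}
```

For the oops above, print_esr(0x96000045) gives EC=0x25, ISS=0x45, WnR=1 and DFSC=0x05, which agrees with the kernel's own "DABT (current EL)", "ISS = 0x00000045" and "WnR = 1" lines: the fault is a write to an address with no mapping.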

Search the out directory for "kernel_file" with the following command:
grep "kernel_file" out/target/product/kona/ -rn
It reports:
Binary file out/target/product/kona/dlkm/lib/modules/kernel-file.ko matches
Binary file out/target/product/kona/obj/kernel/msm-4.19/drivers/misc/kernel-file.ko matches

The QCAP output above also contains this line:
61.614872: <6> Process insmod (pid: 4610, stack limit = 0xffffff8020e78000)
From this we can conclude that the panic was most likely triggered by running insmod /vendor/lib/modules/kernel-file.ko. Next, resolve the faulting address with addr2line:
aarch64-linux-android-addr2line -f -e out/target/product/kona/obj/kernel/msm-4.19/drivers/misc/kernel-file.ko d8
The result is:
hello_init
/mnt/3rd-hdd/pico/code/sxr2130-androidq-master/ap/kernel/msm-4.19/drivers/misc/kernel-file.c:172
The d8 argument comes from the line pc : init_module+0xd8/0x1000 [kernel_file]: the exception occurred at offset 0xd8 from the start of init_module.
Note: the file pushed with adb push is out/target/product/kona/dlkm/lib/modules/kernel-file.ko, but the file to pass to aarch64-linux-android-addr2line is out/target/product/kona/obj/kernel/msm-4.19/drivers/misc/kernel-file.ko. Line 172 is the lut_fp->f_pos = 0; statement in the code below.

static int __init hello_init(void)
{
    char cbuf[128] = {0};

    struct file *lut_fp = openFile("/data/mac.log", 0, 0, 0, 0);
    if (NULL == lut_fp) {
        printk("module init,open file error\n");
        return 0;
    }
    printk("Hello module init\n");
    lut_fp->f_pos = 0;
    printk("%s line%d\n", __func__, __LINE__);
    readFile(lut_fp, cbuf, sizeof(cbuf), &lut_fp->f_pos);

    printk("%s \n", cbuf);
    closeFile(lut_fp);

    return 0;
}

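The f_pos member of lut_fp is of type loff_t, which is simply long long, so in theory the assignment itself is harmless. We commented that line out and tried again, but the system still crashed. The ramdump from this second crash (see QCAP 3.0 Report-2.html in the attachments kernel-file-ramdump_2.zip.001, kernel-file-ramdump_2.zip.002 and kernel-file-ramdump_2.zip.003) shows: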
85.593457: <2> hello_init line173
85.593459: <2> readFile line146
85.593462: <2> readFile line148
85.593465: <2> readFile line150
85.593473: <6> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000092
85.602551: <6> Mem abort info:
85.605470: <6> ESR = 0x96000005
85.608726: <6> Exception class = DABT (current EL), IL = 32 bits
85.614990: <6> SET = 0, FnV = 0
85.618350: <6> EA = 0, S1PTW = 0
85.621745: <6> Data abort info:
85.624873: <6> ISV = 0, ISS = 0x00000005
85.629039: <6> CM = 0, WnR = 0
85.632301: <6> user pgtable: 4k pages, 39-bit VAs, pgdp = fffffffa95610000
85.639290: <6> [0000000000000092] pgd=0000000000000000, pud=0000000000000000
85.646519: <6> Internal error: Oops: 96000005 [#1] PREEMPT SMP
85.652257: <6> Modules linked in: kernel_file(+) wlan(O) machine_dlkm(O) wcd938x_slave_dlkm(O) wcd938x_dlkm(O) wcd9xxx_dlkm(O) mbhc_dlkm(O) tx_macro_dlkm(O) rx_macro_dlkm(O) va_macro_dlkm(O) wsa_macro_dlkm(O) swr_ctrl_dlkm(O) bolero_cdc_dlkm(O) wsa881x_dlkm(O) wcd_core_dlkm(O) stub_dlkm(O) hdmi_dlkm(O) swr_dlkm(O) pinctrl_lpi_dlkm(O) pinctrl_wcd_dlkm(O) usf_dlkm(O) native_dlkm(O) platform_dlkm(O) q6_dlkm(O) adsp_loader_dlkm(O) apr_dlkm(O) snd_event_dlkm(O) q6_notifier_dlkm(O) q6_pdr_dlkm(O) msm_11ad_proxy
85.697878: <6> Process insmod (pid: 4650, stack limit = 0xffffff8021c08000)
85.704767: <6> CPU: 7 PID: 4650 Comm: insmod Tainted: G S W O 4.19.81+ #43
85.712540: <6> Hardware name: Qualcomm Technologies, Inc. kona-iot MTP (DT)
85.719430: <2> pstate: 60c00005 (nZCv daif +PAN +UAO)
85.724367: <2> pc : vfs_read+0x28/0x140
85.728049: <2> lr : init_module+0x16c/0x1000 [kernel_file]
Then resolve the lr offset with the following command:
aarch64-linux-android-addr2line -f -e out/target/product/kona/obj/kernel/msm-4.19/drivers/misc/kernel-file.ko 0x16c
The result is:
readFile
/mnt/3rd-hdd/pico/code/sxr2130-androidq-master/ap/kernel/msm-4.19/drivers/misc/kernel-file.c:151
The source code is:

static int32_t readFile(struct file *file, char *addr, size_t len, loff_t *pos)
{
    int32_t ret;
    do {
        mm_segment_t old_fs;
        printk("%s line%d\n", __func__, __LINE__);
        old_fs = get_fs();
        printk("%s line%d\n", __func__, __LINE__);
        set_fs(KERNEL_DS); /* Enable to read in kernel memory */
        printk("%s line%d\n", __func__, __LINE__);
        ret = vfs_read(file, addr, len, pos);
        printk("%s line%d\n", __func__, __LINE__);
        set_fs(old_fs); /* Disable to read in kernel memory */
        printk("%s line%d\n", __func__, __LINE__);
    } while (0);
    printk("%s line%d,ret=%d\n", __func__, __LINE__, ret);
    return ret;
}

Line 151 is ret = vfs_read(file, addr, len, pos). The pc line says the fault is at vfs_read+0x28, so look up the address of vfs_read in System.map and add 0x28, which gives FFFFFF80083270D0. Then run:
aarch64-linux-android-addr2line -f -e out/target/product/kona/obj/kernel/msm-4.19/vmlinux FFFFFF80083270D0
The result is:
vfs_read
/mnt/3rd-hdd/pico/code/sxr2130-androidq-master/ap/kernel/msm-4.19/fs/read_write.c:441
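
The address given to addr2line here is just the System.map address of vfs_read plus the 0x28 offset from the pc line. The user-space sketch below shows that arithmetic on a single System.map line. The helper name is invented for this example, and the sample line "ffffff80083270a8 T vfs_read" is not quoted from the real System.map: its address is back-computed from FFFFFF80083270D0 minus 0x28, and the T symbol type is an assumption.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse one System.map line of the form "<hex address> <type> <symbol>".
 * If it names `symbol`, store address + offset in *out and return 1;
 * otherwise return 0 and leave *out untouched.
 * (Hypothetical helper for illustration only.) */
static int resolve_symbol_offset(const char *map_line, const char *symbol,
                                 uint64_t offset, uint64_t *out)
{
    unsigned long long addr;
    char type;
    char name[128];

    if (sscanf(map_line, "%llx %c %127s", &addr, &type, name) != 3)
        return 0;               /* not a well-formed System.map line */
    if (strcmp(name, symbol) != 0)
        return 0;               /* a different symbol */
    *out = (uint64_t)addr + offset;
    return 1;
}
```

Calling resolve_symbol_offset("ffffff80083270a8 T vfs_read", "vfs_read", 0x28, &addr) stores 0xffffff80083270d0 in addr, the value passed to addr2line.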
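Line 441 of read_write.c is if (!(file->f_mode & FMODE_READ)). Here file is a pointer, so could the pointer itself be invalid? Tracing it back, it comes from struct file *lut_fp = openFile("/data/mac.log", 0, 0, 0, 0). Note that the NULL check in hello_init did not catch anything, so if the open failed, openFile must have returned an invalid pointer that is not NULL. Could the cause simply be that /data/mac.log does not exist? After I created /data/mac.log by hand, the crash no longer occurred.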
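
One reading that fits both oopses is offered here as an inference, since the source of openFile is not shown in this article: if openFile wraps filp_open(), a failed open returns an error pointer such as ERR_PTR(-ENOENT), not NULL. In the first oops x19 holds fffffffffffffffe, which is -2, and ENOENT is 2. The user-space sketch below re-creates the kernel's IS_ERR range test (MAX_ERRNO is 4095 in include/linux/err.h) to show that such a value passes a NULL check. All names are local stand-ins for illustration, and 64-bit pointers are assumed.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_ERRNO_SKETCH 4095UL  /* same limit as the kernel's MAX_ERRNO */

/* Stand-in for the kernel's ERR_PTR(): a negative errno stored as a pointer value. */
static uintptr_t err_ptr_value(long err) { return (uintptr_t)err; }

/* Stand-in for IS_ERR(): true for the top 4095 values of the address space. */
static int is_err_value(uintptr_t p) { return p >= (uintptr_t)-MAX_ERRNO_SKETCH; }

/* The only check hello_init performs on lut_fp. */
static int passes_null_check(uintptr_t p) { return p != 0; }

/* Address touched when a member at `offset` is accessed through `base`
 * (unsigned arithmetic wraps, as the hardware address computation does). */
static uintptr_t field_address(uintptr_t base, uintptr_t offset) { return base + offset; }
```

With base = err_ptr_value(-2), passes_null_check() is true while is_err_value() is also true, so the driver goes on to dereference it. Under the assumption that the pointer was -2 in both runs, the fault addresses imply member offsets of 0x148 for f_pos (0x146 + 2) and 0x94 for f_mode (0x92 + 2) in this build. In the driver itself, the corresponding guard would be IS_ERR(lut_fp) or IS_ERR_OR_NULL(lut_fp) from <linux/err.h> before lut_fp is used.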

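In addition, the module can be disassembled into a listing that interleaves the C source with the assembly:
aarch64-linux-androidkernel-objdump -S -g out/target/product/kona/obj/kernel/msm-4.19/drivers/misc/kernel-file.ko > …/backup/ap/kernel-file/kernel-file-2.dis
(This file is also included in the attachments.)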
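
The faulting instruction can also be checked without the listing: the word in parentheses on the Code: line of the first oops, f900a67f, is the instruction at pc. The sketch below decodes it under the assumption that it is an A64 STR (immediate, unsigned offset) of a 64-bit register, encoded as 1111100100 | imm12 | Rn | Rt with the offset scaled by 8. The helper names are invented for this example.

```c
#include <stdint.h>

/* True if insn is A64 "STR Xt, [Xn|SP, #imm]" (64-bit, unsigned offset):
 * bits [31:22] must be 1111100100. */
static int is_str64_uimm(uint32_t insn) { return (insn >> 22) == 0x3e4; }

/* Rt, bits [4:0]: the register being stored (31 means xzr, i.e. zero). */
static unsigned str_rt(uint32_t insn) { return insn & 0x1f; }

/* Rn, bits [9:5]: the base register. */
static unsigned str_rn(uint32_t insn) { return (insn >> 5) & 0x1f; }

/* imm12, bits [21:10], scaled by the 8-byte access size. */
static uint64_t str_offset(uint32_t insn) { return (uint64_t)((insn >> 10) & 0xfff) * 8; }

/* Address the store writes to, given the value of the base register. */
static uint64_t effective_address(uint64_t base, uint32_t insn)
{
    return base + str_offset(insn);
}
```

For f900a67f this yields Rt = 31, Rn = 19 and an offset of 0x148, i.e. str xzr, [x19, #328]. With x19 = fffffffffffffffe from the register dump, effective_address() returns 0x146, the faulting address in the oops, and a store of zero is what lut_fp->f_pos = 0 would compile to.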

References:

https://www.cnblogs.com/lifexy/p/8006748.html
Linux driver debugging: determining the function call chain from the oops stack information
https://www.cnblogs.com/lifexy/p/8011966.html

https://blog.csdn.net/lickylin/article/details/19172725
https://www.shangmayuan.com/a/f6054b0feea34da481a76110.html
https://blog.csdn.net/forever_2015/article/details/70185313
