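System crashes are common during kernel module development. When one happens the machine hangs, so the problem cannot be located or analyzed on the spot.
A common way to track such problems down is to install kdump-tools, which saves the kernel log from just before the crash so that it can be analyzed after the next boot.
This article does not cover installing or configuring kdump-tools; it covers how to analyze the crash log and find the place in the code where the error occurred.
kdump-tools usually saves the crash log as /var/crash/<crash time>/dmesg.<crash time>, for example /var/crash/201706131703/dmesg.201706131703. Opening this file shows the following: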
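The analysis hinges on one line of the Oops report, which has the form "IP: [<address>] function+0xoffset/0xsize [module]". The Python sketch below pulls those fields out of a dmesg file. The names parse_oops_ip and find_oops_ip and the regular expression are illustrative assumptions, not part of kdump-tools; the pattern is written against the line format shown in this article's log, and other kernel versions may format this line differently.

```python
import re

# Matches lines such as:
#   IP: [<ffffffffc02ef10a>] search_fq_to_insert+0x1d2/0x239 [HNRcore]
# The trailing "[module]" part is optional (absent for built-in kernel code).
IP_LINE = re.compile(
    r"IP: \[<(?P<addr>[0-9a-f]+)>\] "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_oops_ip(line):
    """Return the fields of an Oops 'IP:' line as a dict, or None if no match."""
    m = IP_LINE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),      # faulting instruction address
        "func": m.group("func"),               # function containing the fault
        "offset": int(m.group("offset"), 16),  # byte offset into the function
        "size": int(m.group("size"), 16),      # total size of the function
        "module": m.group("module"),           # module name, or None
    }


def find_oops_ip(path):
    """Scan a saved dmesg file and return the first parsed 'IP:' line, if any."""
    with open(path, errors="replace") as f:
        for line in f:
            info = parse_oops_ip(line)
            if info is not None:
                return info
    return None
```

Calling find_oops_ip on the saved dmesg file returns the function name, the byte offset of the faulting instruction inside that function, the function's total size, and the module it belongs to. These are the values needed to map the fault back to source code.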
[1493201.293587] buflen=2097152,gwid=223344,addr=33554671
[1493258.160173] fq=300 full,will be change fq
[1493258.160179] max_gw_buf_len0=81984,max_gw_buf_len1=0
[1493258.160199] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[1493258.160204] IP: [<ffffffffc02ef10a>] search_fq_to_insert+0x1d2/0x239 [HNRcore]
[1493258.160216] PGD 0
[1493258.160219] Oops: 0000 [#1] SMP
[1493258.160222] Modules linked in: binfmt_misc fou(OE) HNRcore(OE) iptable_filter xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables ipip tunnel4 ip_tunnel ip6_udp_tunnel udp_tunnel bonding joydev input_leds intel_powerclamp coretemp kvm ipmi_ssif ipmi_devintf irqbypass gpio_ich crct10dif_pclmul crc32_pclmul 8250_fintek dcdbas shpchp aesni_intel serio_raw aes_x86_64 lrw gf128mul glue_helper lpc_ich ablk_helper i7core_edac cryptd edac_core ipmi_si ipmi_msghandler acpi_power_meter mac_hid parport_pc ppdev lp parport autofs4 hid_generic psmouse usbhid hid pata_acpi m