Debugging driver code with the crash information in dmesg

I was recently adding a feature to a driver: given a process name, find the pid of the matching process. The driver crashed, so let's track down the problem together.

First, the crash information from dmesg:

[ 4534.975026] BUG: unable to handle kernel NULL pointer dereference at 0000000000000430
[ 4534.976059] IP: [<ffffffffc0747e78>] bts_write+0x1b8/0x830 [bts]
[ 4534.977065] PGD 2195a2067 PUD 219c6f067 PMD 0 
[ 4534.978066] Oops: 0000 [#3] SMP 
[ 4534.979027] Modules linked in: bts(OE) chr(OE) hid_generic usbhid hid rfcomm bnep bluetooth intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm arc4 ath9k amdkfd ath9k_common ath9k_hw amd_iommu_v2 ath radeon snd_hda_codec_idt snd_hda_codec_generic snd_hda_codec_hdmi crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec mac80211 snd_hda_core aesni_intel aes_x86_64 joydev snd_hwdep hp_wmi snd_pcm sparse_keymap input_leds lrw serio_raw gf128mul glue_helper ppdev ablk_helper lp parport_pc snd_seq_midi cfg80211 snd_seq_midi_event snd_rawmidi snd_seq ttm cryptd snd_seq_device snd_timer mei_me drm_kms_helper mei drm snd i2c_algo_bit soundcore hp_accel lpc_ich lis3lv02d tpm_infineon input_polldev parport video 8250_fintek hp_wireless mac_hid wmi psmouse ahci libahci firewire_ohci sdhci_pci firewire_core e1000e sdhci crc_itu_t ptp pps_core [last unloaded: bts]
[ 4534.985521] CPU: 0 PID: 3462 Comm: ops_main Tainted: G      D W  OE   4.2.0-42-generic #49~14.04.1-Ubuntu
[ 4534.986561] Hardware name: Hewlett-Packard HP ProBook 6470b/179C, BIOS 68ICE Ver. F.45 10/07/2013
[ 4534.987607] task: ffff8802203a5280 ti: ffff880220298000 task.ti: ffff880220298000
[ 4534.988636] RIP: 0010:[<ffffffffc0747e78>]  [<ffffffffc0747e78>] bts_write+0x1b8/0x830 [bts]
[ 4534.989674] RSP: 0018:ffff88022029bd38  EFLAGS: 00010246
[ 4534.990663] RAX: ffffffff81c15840 RBX: 0000000000000006 RCX: 0000000000000002
[ 4534.991635] RDX: 0000000000000002 RSI: ffff88022029bd51 RDI: ffff8802203a5859
[ 4534.992587] RBP: ffff88022029be98 R08: ffffffffc074b060 R09: 315f6e65706f5f34
[ 4534.993573] R10: 00007fd6ff1ba6a0 R11: 0000000000000246 R12: 0000000000000000
[ 4534.994497] R13: ffffffff81c15840 R14: ffff8802203a5858 R15: ffff8800b8e7b000
[ 4534.995411] FS:  00007fd6ff3cb740(0000) GS:ffff88023ec00000(0000) knlGS:0000000000000000
[ 4534.996324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4534.997232] CR2: 0000000000000430 CR3: 000000022b479000 CR4: 00000000001406f0
[ 4534.998334] Stack:
[ 4534.999528]  ffff88022029bd68 ffffffff811f833e ffff88022029bd68 ffffffffc074a201
[ 4535.000466]  ffffffff81c15840 7700007472617473 6174732065746972 6563617274207472
[ 4535.001395]  646e616d6d6f6320 253a726f72726520 7320737462000a73 6f72726520706f74
[ 4535.002401] Call Trace:
[ 4535.003364]  [<ffffffff811f833e>] ? terminate_walk+0x6e/0xe0
[ 4535.004328]  [<ffffffff811ede38>] __vfs_write+0x18/0x40
[ 4535.005283]  [<ffffffff811ee479>] vfs_write+0xa9/0x190
[ 4535.006244]  [<ffffffff810dbefd>] ? call_rcu_sched+0x1d/0x20
[ 4535.007182]  [<ffffffff811ef1e6>] SyS_write+0x46/0xa0
[ 4535.008111]  [<ffffffff817c36f2>] entry_SYSCALL_64_fastpath+0x16/0x75
[ 4535.009038] Code: 00 00 49 8b 84 24 40 03 00 00 48 89 85 c0 fe ff ff 4c 8b ad c0 fe ff ff 4d 8d a5 c0 fc ff ff 49 81 fc 00 55 c1 81 75 bc 45 31 e4 <45> 8b a4 24 30 04 00 00 48 c7 c7 1d a2 74 c0 31 c0 44 89 e6 e8 
[ 4535.011028] RIP  [<ffffffffc0747e78>] bts_write+0x1b8/0x830 [bts]
[ 4535.011968]  RSP <ffff88022029bd38>
[ 4535.012902] CR2: 0000000000000430
[ 4535.013850] ---[ end trace bd7d268405d6447e ]---

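The dmesg log reports BUG: unable to handle kernel NULL pointer dereference at 0000000000000430, which means the kernel dereferenced a NULL pointer. The faulting address 0x430 (also shown in CR2) is small but nonzero, the usual signature of reading a structure member through a NULL base pointer. The second key piece of information is bts_write+0x1b8/0x830 [bts], which names the faulting function and the offset into it: the fault is in bts_write in the bts module, 0x1b8 bytes from the start of the function, and the function is 0x830 bytes long.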
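The symbol+offset/size notation can also be pulled apart programmatically, which is handy when scanning many oops reports. What follows is only a userspace C sketch: struct oops_loc and parse_oops_loc are hypothetical names made up for this illustration, not kernel or binutils APIs.

```c
#include <stdio.h>
#include <string.h>

/* One "symbol+offset/size [module]" location taken from an oops line. */
struct oops_loc {
    char symbol[128];
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
    char module[64];        /* empty when there is no "[module]" suffix */
};

/*
 * Parse text such as "bts_write+0x1b8/0x830 [bts]".
 * Returns 0 on success, -1 if the text does not have that shape
 * or if the offset is not inside the function.
 */
static int parse_oops_loc(const char *text, struct oops_loc *out)
{
    int n;

    memset(out, 0, sizeof(*out));
    /* %lx accepts the optional 0x prefix; the module part is optional. */
    n = sscanf(text, " %127[^+]+%lx/%lx [%63[^]]]",
               out->symbol, &out->offset, &out->size, out->module);
    if (n < 3)
        return -1;
    if (out->offset >= out->size)
        return -1;
    return 0;
}
```

Called with the string from this oops, the sketch fills in symbol bts_write, offset 0x1b8, size 0x830 and module bts. Built-in kernel functions such as __vfs_write+0x18/0x40 have no module suffix, so module stays empty.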

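With this information, the first step is to disassemble xxx.o, the intermediate object file produced when the driver is built. The objdump tool that ships with Linux is enough for this.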

// If you are unsure of the exact options, objdump --help lists them (note that objdump -h prints section headers, not help)
curtis@curtis-virtual-machine:/mnt/hgfs/share/write_code/runqueue$ objdump --help
Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
  -S, --source             Intermix source code with disassembly
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRt] or
  --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
          =frames-interp,=str,=loc,=Ranges,=pubtypes,
          =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
          =addr,=cu_index]
                           Display DWARF info in the file
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information

 The following switches are optional:
  -b, --target=BFDNAME           Specify the target object format as BFDNAME
  -m, --architecture=MACHINE     Specify the target architecture as MACHINE
  -j, --section=NAME             Only display information for section NAME
  -M, --disassembler-options=OPT Pass text OPT on to the disassembler
  -EB --endian=big               Assume big endian format when disassembling
  -EL --endian=little            Assume little endian format when disassembling
      --file-start-context       Include context from start of file (with -S)
  -I, --include=DIR              Add DIR to search list for source files
  -l, --line-numbers             Include line numbers and filenames in output
  -F, --file-offsets             Include file offsets when displaying information
  -C, --demangle[=STYLE]         Decode mangled/processed symbol names
                                  The STYLE, if specified, can be `auto', `gnu',
                                  `lucid', `arm', `hp', `edg', `gnu-v3', `java'
                                  or `gnat'
  -w, --wide                     Format output for more than 80 columns
  -z, --disassemble-zeroes       Do not skip blocks of zeroes when disassembling
      --start-address=ADDR       Only process data whose address is >= ADDR
      --stop-address=ADDR        Only process data whose address is <= ADDR
      --prefix-addresses         Print complete address alongside disassembly
      --[no-]show-raw-insn       Display hex alongside symbolic disassembly
      --insn-width=WIDTH         Display WIDTH bytes on a single line for -d
      --adjust-vma=OFFSET        Add OFFSET to all displayed section addresses
      --special-syms             Include special symbols in symbol dumps
      --prefix=PREFIX            Add PREFIX to absolute paths for -S
      --prefix-strip=LEVEL       Strip initial directory names for -S
      --dwarf-depth=N        Do not display DIEs at depth N or greater
      --dwarf-start=N        Display DIEs starting with N, at the same depth
                             or deeper
      --dwarf-check          Make additional dwarf internal consistency checks.      

objdump: supported targets: elf64-x86-64 elf32-i386 elf32-x86-64 a.out-i386-linux pei-i386 pei-x86-64 elf64-l1om elf64-k1om elf64-little elf64-big elf32-little elf32-big pe-x86-64 pe-i386 plugin srec symbolsrec verilog tekhex binary ihex
objdump: supported architectures: i386 i386:x86-64 i386:x64-32 i8086 i386:intel i386:x86-64:intel i386:x64-32:intel i386:nacl i386:x86-64:nacl i386:x64-32:nacl l1om l1om:intel k1om k1om:intel plugin

The following i386/x86-64 specific disassembler options are supported for use
with the -M switch (multiple options should be separated by commas):
  x86-64      Disassemble in 64bit mode
  i386        Disassemble in 32bit mode
  i8086       Disassemble in 16bit mode
  att         Display instruction in AT&T syntax
  intel       Display instruction in Intel syntax
  att-mnemonic
              Display instruction in AT&T mnemonic
  intel-mnemonic
              Display instruction in Intel mnemonic
  addr64      Assume 64bit address size
  addr32      Assume 32bit address size
  addr16      Assume 16bit address size
  data32      Assume 32bit data size
  data16      Assume 16bit data size
  suffix      Always display instruction suffix in AT&T syntax
Report bugs to <http://www.sourceware.org/bugzilla/>.

// Here -D disassembles all sections, and the output is redirected to a file so it can be inspected later
curtis@curtis-HP-ProBook-6470b:~/Desktop/per_bts/drv$ objdump bts.o -D > err.txt

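By default objdump prints AT&T assembly syntax. If you are not used to it, add -M intel to get Intel syntax instead. The next step is to find the start address of the faulting function: open err.txt in vim and search for bts_write.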

0000000000000cc0 <bts_write>:
     cc0:       e8 00 00 00 00          callq  cc5 <bts_write+0x5>
     cc5:       55                      push   %rbp
     cc6:       b9 20 00 00 00          mov    $0x20,%ecx
     ccb:       48 89 e5                mov    %rsp,%rbp
     cce:       41 57                   push   %r15
     cd0:       41 56                   push   %r14
     cd2:       45 31 f6                xor    %r14d,%r14d
     cd5:       41 55                   push   %r13
     cd7:       4c 8d ad c8 fe ff ff    lea    -0x138(%rbp),%r13
     cde:       41 54                   push   %r12
     ce0:       53                      push   %rbx
     ce1:       48 89 d3                mov    %rdx,%rbx
     ce4:       ba fe 00 00 00          mov    $0xfe,%edx
     ce9:       48 81 ec 38 01 00 00    sub    $0x138,%rsp
     cf0:       4c 8b bf d0 00 00 00    mov    0xd0(%rdi),%r15
     cf7:       48 8d bd c8 fe ff ff    lea    -0x138(%rbp),%rdi
     cfe:       65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
     d05:       00 00
     d07:       48 89 45 c8             mov    %rax,-0x38(%rbp)

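The listing shows that bts_write starts at 0xcc0. To reach the faulting instruction, add the offset 0x1b8 from the oops: 0xcc0 + 0x1b8 = 0xe78.
The next step is to map that address to a line of source code, which is the job of another tool, addr2line.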
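The arithmetic is simple, but it is easy to slip when doing hex addition by hand. As a minimal sketch, assuming the function start comes from the objdump listing and the offset and size come from the oops line, it could be written as below. The names oops_to_object_addr and format_addr are hypothetical helpers introduced here, not part of any tool.

```c
#include <stdio.h>

/*
 * Turn "function start + oops offset" into the address to look up in the
 * objdump listing or to pass to addr2line. Returns 0 on success and -1
 * when the offset lies outside the function (offset >= size).
 */
static int oops_to_object_addr(unsigned long func_start, unsigned long offset,
                               unsigned long func_size, unsigned long *addr)
{
    if (offset >= func_size)
        return -1;
    *addr = func_start + offset;
    return 0;
}

/* Format the address as bare lowercase hex, the form used with addr2line in this post. */
static void format_addr(unsigned long addr, char *buf, size_t len)
{
    snprintf(buf, len, "%lx", addr);
}
```

With func_start 0xcc0, offset 0x1b8 and func_size 0x830, the helper yields 0xe78, which format_addr renders as e78. The size check is a cheap sanity test: an offset at or beyond 0x830 would mean the oops line and the object file do not belong to the same build.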

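Note: if the driver's Makefile does not enable debug information, for example with "KBUILD_CFLAGS+= -g" (ccflags-y += -g is the per-module form that kbuild documents), addr2line cannot map the oops address to a source line and prints ??:0 instead!

A sample Makefile: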

obj-m += good.o

KBUILD_CFLAGS+= -g

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

clean:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

curtis@curtis-HP-ProBook-6470b:~/Desktop/per_bts/drv$ addr2line -h
Usage: addr2line [option(s)] [addr(s)]
 Convert addresses into line number/file name pairs.
 If no addresses are specified on the command line, they will be read from stdin
 The options are:
  @<file>                Read options from <file>
  -a --addresses         Show addresses
  -b --target=<bfdname>  Set the binary file format
  -e --exe=<executable>  Set the input file name (default is a.out)
  -i --inlines           Unwind inlined functions
  -j --section=<name>    Read section-relative offsets instead of addresses
  -p --pretty-print      Make the output easier to read for humans
  -s --basenames         Strip directory names
  -f --functions         Show function names
  -C --demangle[=style]  Demangle function names
  -h --help              Display this information
  -v --version           Display the program's version

curtis@curtis-HP-ProBook-6470b:~/Desktop/per_bts/drv$ addr2line -C -f -e bts.o e78
find_pid
/home/curtis/Desktop/per_bts/drv/bts_driver.c:108

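addr2line finds both the function and the line: the fault is in find_pid, at line 108 of bts_driver.c. The oops names bts_write rather than find_pid, presumably because find_pid is a static helper that the compiler inlined into bts_write. Here is that function in the source: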

static int find_pid(char *string_name)
{
        unsigned int pid;
        /* Bug: this used to be "char *find_name = &string_name;", the address of the parameter itself */
        char *find_name = string_name;
        struct task_struct *task;

        task = find_task(find_name);
        pid = task->pid;        /* line 108: faults when find_task() returns NULL */
        printk("Have find pid is %u\n", pid);
        return pid;
}

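Closer inspection shows that find_task did not return the process's task_struct, so task was NULL and reading task->pid faulted. The root cause is that, after substantial changes to the surrounding code, the initialization of find_name was wrong. The parameter string_name is already a pointer to the string, so &string_name hands find_task the address of that pointer rather than the name, and no process matches. GCC normally flags this assignment with an incompatible pointer type warning, which is worth watching for. With the initialization corrected, the problem is solved.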
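Correcting find_name removes this particular crash, but find_pid would still dereference NULL whenever no process matches the given name. A defensive variant could check the lookup result first. The sketch below is a userspace mock, not the driver's code: struct mock_task, the mock_tasks table and this find_task body are stand-ins invented for illustration, the assumption that find_task returns NULL on a miss is taken from the analysis above, and returning -ESRCH is one possible convention among several.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct task_struct (only the fields used here). */
struct mock_task {
    const char *comm;
    int pid;
};

/* Hypothetical process table; in the driver this would be the kernel's task list. */
static struct mock_task mock_tasks[] = {
    { "init", 1 },
    { "ops_main", 3462 },
};

/* Stand-in for find_task(): returns NULL when no name matches (assumed behaviour). */
static struct mock_task *find_task(const char *name)
{
    size_t i;

    if (!name)
        return NULL;
    for (i = 0; i < sizeof(mock_tasks) / sizeof(mock_tasks[0]); i++) {
        if (strcmp(mock_tasks[i].comm, name) == 0)
            return &mock_tasks[i];
    }
    return NULL;
}

/* Same shape as find_pid in the post, but the lookup result is checked before use. */
static int find_pid(const char *string_name)
{
    struct mock_task *task = find_task(string_name);

    if (!task)
        return -ESRCH;  /* report "no such process" instead of dereferencing NULL */
    return task->pid;
}
```

In this mock, looking up ops_main returns 3462, while an unknown name returns -ESRCH. In the real driver the same check would sit between the find_task call and the read of task->pid, so a failed lookup becomes an error return from the write handler rather than an oops.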

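Adding the driver's debug symbols in crash

Inside the crash utility, mod -s loads the symbols of a single module from its object file (mod -S takes a directory and loads symbols for all the modules found there):

mod -s bts /path/to/driver.o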

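To get debug output from a driver in the Linux kernel, you can print messages to the kernel log with printk. Some common approaches:

1. Add printk statements to the driver. Put printk calls in the driver code where you want debug output. For example, inside some function:

```
printk(KERN_DEBUG "mydriver: myfunction called\n");
```

This emits a debug message that includes the driver name and the function name.

2. View the kernel log with dmesg. Open a terminal and run:

```
$ dmesg -wH
```

This prints the kernel log to the terminal and keeps following it. Then exercise the driver and watch the debug messages appear.

3. Use a syslog tool to capture the kernel log into a file:

- Install syslog-ng. In a terminal, run:

```
$ sudo apt-get install syslog-ng
```

- Configure syslog-ng. Open the syslog-ng configuration file and add the following. The source must be one that delivers kernel messages; /proc/kmsg is used here on the assumption that no other source in your configuration already reads it:

```
source s_mydriver {file("/proc/kmsg");};
filter f_mydriver {facility(kern) and match("mydriver:" value("MESSAGE"));};
destination d_mydriver {file("/var/log/mydriver.log");};
log {source(s_mydriver); filter(f_mydriver); destination(d_mydriver);};
```

This tells syslog-ng to pick out kernel log messages containing the keyword "mydriver:" and save them to /var/log/mydriver.log.

- Add printk statements to the driver, as in step 1.

- Reload syslog-ng so it picks up the new configuration:

```
$ sudo service syslog-ng reload
```

Now exercise the driver and check /var/log/mydriver.log for the captured debug messages.

Note that enabling debug output in a driver can affect system performance and calls for some debugging experience. It is best done in a test environment.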