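Debugging with J-Link + OpenOCD + gdb, and analyzing a Linux exception backtrace

This debug setup supports breakpoints and backtrace analysis of an exception scene: it shows the source location of the faulting pc directly and lets you inspect variable values live.

It also walks through the analysis of a typical exception backtrace.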

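1. Driver installation

Restoring the J-Link driver

To roll back after installing the WinUSB driver for OpenOCD, uninstall the device and delete its driver.

Default J-Link driver version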

1.1.1.

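First use Zadig to replace the J-Link driver with the WinUSB driver that OpenOCD uses.

A. Open the "Options" menu and select List All Devices

B. Select J-Link

C. Click Install Driver to change the J-Link driver to WinUSB

2. Starting OpenOCD from the shell

Run OpenOCD with administrator privileges. The first -f gives the J-Link configuration and the second -f gives the target board configuration.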

 .\openocd.exe -f ..\scripts\interface\jlink.cfg -f x6.cfg

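In the x6.cfg configuration file, pay attention to the following points:

1. The CPU debug base address. ARM normally recommends a value for this; for the exact value, ask the designer of the CPU.

2. The remote telnet service and the debug ports, which are used for remote debugging.

If the DBG and CTI addresses are wrong, halt will not take effect and the target cannot be debugged or paused normally.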

set _DBGBASE {0x81010000 0x81110000 0x81210000 0x81310000}
set _CTIBASE {0x81020000 0x81120000 0x81220000 0x81320000}

bindto 10.50.33.54
gdb_port 11111
tcl_port 11117
telnet_port 11116

The addresses can be checked with the following commands.

2.1.1. > cpu.dap info

AP ID register 0x24770002
        Type is MEM-AP APB
MEM-AP BASE 0x80000003
        Valid ROM table present
                Component base address 0x80000000
                Peripheral ID 0x0000080000
                Designer is 0x080, <invalid>
                Part is 0x0, Unrecognized
                Component class is 0x1, ROM table
                MEMTYPE system memory not present: dedicated debug bus
        ROMTABLE[0x0] = 0x81000003
                Component base address 0x01000000
                Peripheral ID 0x04006bb4e4
                Designer is 0x4bb, ARM Ltd
                Part is 0x4e4, Unrecognized
                Component class is 0x9, CoreSight component
                Type is 0x00, Miscellaneous, other
        ROMTABLE[0x4] = 0x0
                End of ROM table
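
The ROMTABLE lines above can be cross-checked by hand. The following is a minimal sketch, assuming the standard ADIv5 class 0x1 ROM table entry layout (bit 0 = present, bit 1 = 32-bit format, bits [31:12] = two's-complement offset from the ROM table base). The function name decode_romtable_entry is only illustrative and is not part of OpenOCD.

```python
def decode_romtable_entry(rom_base, entry):
    """Decode one 32-bit CoreSight ROM table entry (assumed class 0x1 layout)."""
    if entry == 0:
        # An all-zero entry marks the end of the ROM table.
        return {"end_of_table": True}
    offset = entry & 0xFFFFF000  # bits [31:12]: offset relative to the ROM table base
    return {
        "end_of_table": False,
        "present": bool(entry & 0x1),       # bit 0: component is present
        "format_32bit": bool(entry & 0x2),  # bit 1: 32-bit entry format
        # The offset is two's complement, so the sum wraps around at 32 bits.
        "component_base": (rom_base + offset) & 0xFFFFFFFF,
    }


first = decode_romtable_entry(0x80000000, 0x81000003)
print(hex(first["component_base"]))  # 0x1000000
last = decode_romtable_entry(0x80000000, 0x0)
print(last["end_of_table"])  # True
```

With the values from the dump (base 0x80000000, entry 0x81000003) this gives 0x01000000, the same component base address that cpu.dap info reports.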

2.1.2. > cpu.cti0 dump

    CTR (0x0000) 0x00000001
    GATE (0x0140) 0x00000000
   INEN0 (0x0020) 0x00000000
   INEN1 (0x0024) 0x00000000
   INEN2 (0x0028) 0x00000000
   INEN3 (0x002c) 0x00000000
   INEN4 (0x0030) 0x00000000
   INEN5 (0x0034) 0x00000000
   INEN6 (0x0038) 0x00000000
   INEN7 (0x003c) 0x00000000
   INEN8 (0x0040) 0x00000000
  OUTEN0 (0x00a0) 0x00000001
  OUTEN1 (0x00a4) 0x00000002
  OUTEN2 (0x00a8) 0x00000000
  OUTEN3 (0x00ac) 0x00000000
  OUTEN4 (0x00b0) 0x00000000
  OUTEN5 (0x00b4) 0x00000000
  OUTEN6 (0x00b8) 0x00000000
  OUTEN7 (0x00bc) 0x00000000
  OUTEN8 (0x00c0) 0x00000000
    TRIN (0x0130) 0x00000000
   TROUT (0x0134) 0x00000000
    CHIN (0x0138) 0x00000000
   CHOUT (0x013c) 0x00000000
  APPSET (0x0014) 0x00000000
  APPCLR (0x0018) 0x00000000
APPPULSE (0x001c) 0x00000000
   INACK (0x0010) 0x00000000

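2.1.3. Introduction to ARM CoreSight (DAP and CTI)

3. telnet + OpenOCD commands

On the server, open a telnet session and connect to the target device.

Enter halt to stop the CPU first, change to the directory that holds the image, load the bin file into DDR memory, and then resume the CPU.

3.1.1. Halt the CPU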

halt

3.1.2. Load a file into DDR

load_image u-boot-dtb.bin 0x200000000

3.1.3. Dump a memory range to a file

dump_image filename address size

3.1.4. Resume the CPU

resume or resume address
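
The halt, load_image, resume sequence can also be scripted instead of typed into telnet. The sketch below is one possible way to do it, assuming OpenOCD's Tcl RPC server (the tcl_port setting) rather than the telnet port; that interface ends every command and every reply with the byte 0x1a. The helper names, the host and the port are placeholders, and the timeout may need to be raised for large images.

```python
import socket

TERMINATOR = b"\x1a"  # the Tcl RPC interface ends each command and reply with 0x1a


def frame(command):
    """Encode one command for the Tcl RPC port."""
    return command.encode("utf-8") + TERMINATOR


def unframe(data):
    """Strip the terminator from one complete reply."""
    if not data.endswith(TERMINATOR):
        raise ValueError("incomplete reply")
    return data[:-1].decode("utf-8")


def load_sequence(image, address):
    """Commands for: halt the CPU, load the image into DDR, resume."""
    return ["halt", f"load_image {image} {address:#x}", "resume"]


def run(host, port, commands, timeout=60.0):
    """Send each command and collect its reply (needs a running OpenOCD)."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        for command in commands:
            sock.sendall(frame(command))
            buf = b""
            while not buf.endswith(TERMINATOR):
                chunk = sock.recv(4096)
                if not chunk:
                    raise ConnectionError("OpenOCD closed the connection")
                buf += chunk
            replies.append(unframe(buf))
    return replies


commands = load_sequence("u-boot-dtb.bin", 0x200000000)
print(commands)
# Hypothetical usage, with the bindto address and tcl_port from x6.cfg:
# run("10.50.33.54", tcl_port, commands)
```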

3.1.5. Memory read and write

mdb ['phys'] address [count]
      display memory bytes
mdd ['phys'] address [count]
      display memory double-words
mdh ['phys'] address [count]
      display memory half-words
mdw ['phys'] address [count]
      display memory words
measure_clk
      Runs a test to measure the JTAG clk. Useful with RCLK / RTCK.
      (command valid any time)
mem2array arrayname bitwidth address count
      read 8/16/32 bit memory and return as a TCL array for script
      processing
ms
      Returns ever increasing milliseconds. Used to calculate differences
      in time. (command valid any time)
mwb ['phys'] address value [count]
      write memory byte
mwd ['phys'] address value [count]
      write memory double-word
mwh ['phys'] address value [count]
      write memory half-word
mww ['phys'] address value [count]
      write memory word

3.1.6. System registers: reg

> reg 

3.1.7. Show the J-Link status

> jlink hwstatus
VTarget = 3.296 V
TCK = 1 TDI = 1 TDO = 1 TMS = 0 SRST = 1 TRST = 1

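4. Debugging the kernel with gdb

4.1.1. Debugging applications

4.1.2. Build the kernel with the CONFIG_DEBUG_INFO option enabled

4.2. Starting gdb on the host

4.2.1. Method 1

vmlinux is the ELF file that carries the symbol table; it can also be loaded after gdb starts with the file command.

./aarch64-linux-gnu-gdb ../../vmlinux

Connect to the remote gdb server (the gdb_port opened by OpenOCD):

(gdb) target remote 10.50.33.54:11111

Normally the CPU halts automatically once the remote connection is established; use continue to let it run again.

4.2.2. Method 2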

x6:    ../a55_kernel_tools/gcc-linaro-6.3.1-2017.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-gdb -ex "target remote 10.50.33.54:11111" vmlinux

gp2 :       ./riscv64-elf-gdb -ex "tar remote $1" -x $GDBINITC -ex "c" -ex "q"

-ex gives a single gdb command to execute at startup

-x gives a gdb script file whose commands are executed at startup, for example:


set $vmlinux_addr = 0x830000000

restore ../../kernel/build/Image binary $vmlinux_addr


 

set $pc = $spl_addr

5. Common gdb commands

5.1. list: map a function to its address, or an address to its function

5.1.1. list *(driver_probe_device+516) 

0xffff0000083c464c is in driver_probe_device (drivers/base/dd.c:413).
408              * should always go first
409              */
410             devices_kset_move_last(dev);
411
412             if (dev->bus->probe) {
413                     ret = dev->bus->probe(dev);
414                     if (ret)
415                             goto probe_failed;
416             } else if (drv->probe) {
417                     ret = drv->probe(dev);

5.1.2. (gdb) list *(0xffff0000084e7e20)

0xffff0000084e7e20 is in isp_vin_probe (drivers/media/platform/innosilicon/isp/inno_isp_vin.c:636).
631             }
632
633             // isp_vin->isp_buf_vaddr = dma_alloc_coherent(isp_vin->dev, 128*1024*1024, &isp_vin->isp_buf, GFP_KERNEL);
634             memset(isp_vin->isp_buf_vaddr, 0x0, 128*1024*1024);
635             debug_info("dma isp buf %llx\n",isp_vin->isp_buf);
636             isp_vin->isp_statistics_vaddr = dma_alloc_coherent(isp_vin->dev, 1*1024*1024, &isp_vin->isp_statistics, GFP_KERNEL);
637             debug_info("dma isp statistics %llx\n",isp_vin->isp_statistics);
638             memset(isp_vin->isp_statistics_vaddr, 0xff, 1024*1024);
639
640             if(isp_vin->isp_mode == ISP_SINGLE){

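5.1.3. Method 2

5.1.3.1. info symbol  0xfffff

Prints the function that contains the address

5.1.3.2. info address test

Prints the address of the function test

5.2. print: print information about a function

p complete_signal

5.2.1.  p: view a struct variable at a given address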

 p/x *(struct pt_regs*)0xffff80002d493ac0

5.3. disassemble: view the assembly code at an address

(gdb) disassemble 0xffff0000083c5f40
Dump of assembler code for function platform_drv_probe:
   0xffff0000083c5ee8 <+0>:     stp     x29, x30, [sp,#-48]!
   0xffff0000083c5eec <+4>:     mov     w1, #0x0                        // #0
   0xffff0000083c5ef0 <+8>:     mov     x29, sp
   0xffff0000083c5ef4 <+12>:    stp     x19, x20, [sp,#16]
   0xffff0000083c5ef8 <+16>:    mov     x20, x0
   0xffff0000083c5efc <+20>:    str     x21, [sp,#32]
   0xffff0000083c5f00 <+24>:    ldr     x0, [x0,#584]
   0xffff0000083c5f04 <+28>:    ldr     x21, [x20,#136]

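5.4. b and hb: breakpoints and hardware breakpoints

5.4.1. Break at a file and line number:

b a/file.c:122

5.4.2. Break at an address

b *0xfff0000xxxx

5.4.3. List all breakpoints

info breakpoints

5.4.4. Delete a breakpoint

delete 1

5.5. x: examine memory (registers are shown with info registers)

5.6. layout: window (TUI) mode

5.7. bt

backtrace prints the function call stack

5.8. info stack and info frame: view stack information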

(gdb) frame 12

(gdb) info frame
Stack level 12, frame at 0xffff80002d493ac0:
 pc = 0xffff000008082654 in el1_sync (arch/arm64/kernel/entry.S:555); saved pc = <not saved>
 Outermost frame: previous frame identical to this frame (corrupt stack?)
 caller of frame at 0xffff80002d493ac0
 source language asm.
 Arglist at 0xffff80002d493ac0, args: 
 Locals at 0xffff80002d493ac0, Previous frame's sp is 0xffff80002d493ac0
 Saved registers:
  x0 at 0xffff80002d493ac0, x1 at 0xffff80002d493ac8, x2 at 0xffff80002d493ad0, x3 at 0xffff80002d493ad8, x4 at 0xffff80002d493ae0, x5 at 0xffff80002d493ae8, x6 at 0xffff80002d493af0,
  x7 at 0xffff80002d493af8, x8 at 0xffff80002d493b00, x9 at 0xffff80002d493b08, x10 at 0xffff80002d493b10, x11 at 0xffff80002d493b18, x12 at 0xffff80002d493b20, x13 at 0xffff80002d493b28,
  x14 at 0xffff80002d493b30, x15 at 0xffff80002d493b38, x16 at 0xffff80002d493b40, x17 at 0xffff80002d493b48, x18 at 0xffff80002d493b50, x19 at 0xffff80002d493b58, x20 at 0xffff80002d493b60,
  x21 at 0xffff80002d493b68, x22 at 0xffff80002d493b70, x23 at 0xffff80002d493b78, x24 at 0xffff80002d493b80, x25 at 0xffff80002d493b88, x26 at 0xffff80002d493b90, x27 at 0xffff80002d493b98,
  x28 at 0xffff80002d493ba0, x29 at 0xffff80002d493ba8
(gdb) frame 11
#11 0xffff000008080ab4 in do_mem_abort (addr=0, esr=2516582468, regs=0xffff80002d493ac0) at arch/arm64/mm/fault.c:739
739             if (!inf->fn(addr, esr, regs))
(gdb) info frame
Stack level 11, frame at 0xffff80002d493ac0:
 pc = 0xffff000008080ab4 in do_mem_abort (arch/arm64/mm/fault.c:739); saved pc = 0xffff000008082654
 called by frame at 0xffff80002d493ac0, caller of frame at 0xffff80002d493a10
 source language c.
 Arglist (the function's argument list) at 0xffff80002d493a10, args: addr=0, esr=2516582468, regs=0xffff80002d493ac0
 Locals (the function's local variables) at 0xffff80002d493a10, Previous frame's sp is 0xffff80002d493ac0
 Saved registers:
  x19 at 0xffff80002d493a20, x20 at 0xffff80002d493a28, x21 at 0xffff80002d493a30, x22 at 0xffff80002d493a38, x29 at 0xffff80002d493a10, x30 at 0xffff80002d493a18

asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
                     struct pt_regs *regs)
{
    const struct fault_info *inf = esr_to_fault_info(esr);
    struct siginfo info;



    if (!inf->fn(addr, esr, regs))
        return;

    pr_alert("Unhandled fault: %s (0x%08x) at 0x%016lx\n",
         inf->name, esr, addr);

    mem_abort_decode(esr);

    info.si_signo = inf->sig;
    info.si_errno = 0;
    info.si_code  = inf->code;
    info.si_addr  = (void __user *)addr;
    arm64_notify_die("", regs, &info, esr);
}

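6. Worked example: analyzing a kernel exception scene

This section shows how to use gdb to inspect the context that was saved before the exception handler was called. The same approach may also help with hangs that print no exception information at all.

6.1. Exception information printed by Linux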

[    3.551574] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    3.560615] Mem abort info:
[    3.563728]   Exception class = DABT (current EL), IL = 32 bits
[    3.570340]   SET = 0, FnV = 0
[    3.573743]   EA = 0, S1PTW = 0
[    3.577246] Data abort info:
[    3.580455]   ISV = 0, ISS = 0x00000044
[    3.584731]   CM = 0, WnR = 1
[    3.588044] user pgtable: 4k pages, 48-bit VAs, pgd = ffff800021828000
[    3.595330] [0000000000000000] *pgd=0000000000000000
[    3.600878] Internal error: Oops: 96000044 [#1] SMP
[    3.606321] Modules linked in:
[    3.609720] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G S              4.14.48 #414
[    3.617977] Hardware name: inno-x6 (DT)
[    3.622255] task: ffff80002d488000 task.stack: ffff80002d490000
[    3.628872] PC is at __memset+0x1ac/0x1c8
[    3.633349] LR is at isp_vin_probe+0x368/0xc00
[    3.638306] pc : [<ffff0000086a6b2c>] lr : [<ffff0000084e7e20>] pstate: 40000145
[    3.646563] sp : ffff80002d493c00
[    3.650256] x29: ffff80002d493c00 x28: 0000000000000000
[    3.656178] x27: ffff000008925060 x26: 0000000000000002
[    3.662108] x25: 0000000000000038 x24: ffff00000d0be138
[    3.668038] x23: ffff80002d7aa400 x22: ffff00000d078000
[    3.673967] x21: ffff80002d7aa410 x20: 0000000000000002
[    3.679896] x19: ffff00000d0be000 x18: 0000000000000010
[    3.685826] x17: 0000000000000001 x16: 0000000000000019
[    3.691747] x15: 0000000000000006 x14: 0140000000000000
[    3.697677] x13: ffff00000d07ed9d x12: 0000000000000000
[    3.703606] x11: 0000000000000006 x10: 0101010101010101
[    3.709535] x9 : 0000000000000000 x8 : 0000000000000000
[    3.715464] x7 : 0000000000000000 x6 : 000000000000003f
[    3.721393] x5 : 0000000000000040 x4 : 0000000000000000
[    3.727314] x3 : 0000000000000004 x2 : 0000000007ffffc0
[    3.733235] x1 : 0000000000000000 x0 : 0000000000000000
[    3.739156] Process swapper/0 (pid: 1, stack limit = 0xffff80002d490000)
[    3.746638] Call trace:
[    3.749362] Exception stack(0xffff80002d493ac0 to 0xffff80002d493c00)
[    3.756555] 3ac0: 0000000000000000 0000000000000000 0000000007ffffc0 0000000000000004
[    3.765300] 3ae0: 0000000000000000 0000000000000040 000000000000003f 0000000000000000
[    3.774046] 3b00: 0000000000000000 0000000000000000 0101010101010101 0000000000000006
[    3.782781] 3b20: 0000000000000000 ffff00000d07ed9d 0140000000000000 0000000000000006
[    3.791517] 3b40: 0000000000000019 0000000000000001 0000000000000010 ffff00000d0be000
[    3.800254] 3b60: 0000000000000002 ffff80002d7aa410 ffff00000d078000 ffff80002d7aa400
[    3.808999] 3b80: ffff00000d0be138 0000000000000038 0000000000000002 ffff000008925060
[    3.817744] 3ba0: 0000000000000000 ffff80002d493c00 ffff0000084e7e20 ffff80002d493c00
[    3.826489] 3bc0: ffff0000086a6b2c 0000000040000145 ffff80002d7aa410 ffff8000221a7b00
[    3.835225] 3be0: ffffffffffffffff ffff0000084e7d7c ffff80002d493c00 ffff0000086a6b2c
[    3.843970] [<ffff0000086a6b2c>] __memset+0x1ac/0x1c8
[    3.849612] [<ffff0000083c5f40>] platform_drv_probe+0x58/0xc0
[    3.856027] [<ffff0000083c464c>] driver_probe_device+0x204/0x2c0
[    3.862733] [<ffff0000083c47b4>] __driver_attach+0xac/0xb0
[    3.868857] [<ffff0000083c285c>] bus_for_each_dev+0x64/0xa0
[    3.875078] [<ffff0000083c4950>] driver_attach+0x20/0x28
[    3.880998] [<ffff0000083c31a8>] bus_add_driver+0x108/0x228
[    3.887219] [<ffff0000083c5220>] driver_register+0x60/0xf8
[    3.893342] [<ffff0000083c5e94>] __platform_driver_register+0x3c/0x48
[    3.900528] [<ffff0000088dbaa0>] isp_vin_driver_init+0x18/0x20
[    3.907042] [<ffff000008083148>] do_one_initcall+0x38/0x120
[    3.913265] [<ffff0000088c0ce8>] kernel_init_freeable+0x134/0x1d0
[    3.920069] [<ffff0000086b87b0>] kernel_init+0x10/0x100
[    3.925892] [<ffff000008084258>] ret_from_fork+0x10/0x18
[    3.931814] Code: 91010108 54ffff4a 8b040108 cb050042 (d50b7428)
[    3.938621] ---[ end trace b52b486df56bf3db ]---
[    3.943793] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    3.943793]
[    3.953997] SMP: stopping secondary CPUs
[    3.958373] Kernel Offset: disabled
[    3.962254] CPU features: 0x0802020
[    3.966143] Memory Limit: 2048 MB
[    3.969840] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

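6.2. Backtrace printed by gdb

The log shows that the CPU took a synchronous exception of the data abort type. In this case it is unrecoverable: the system prints the exception information and calls panic, which leaves the CPU stopped at the exception scene.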

(gdb) bt
#0  __delay (cycles=886429981) at arch/arm64/lib/delay.c:31
#1  0xffff0000086a63c0 in __const_udelay (xloops=<optimized out>) at arch/arm64/lib/delay.c:41
#2  0xffff000008098e98 in panic (fmt=<optimized out>) at kernel/panic.c:298
#3  0xffff00000809c89c in find_child_reaper (father=<optimized out>) at kernel/exit.c:578
#4  forget_original_parent (dead=<optimized out>, father=<optimized out>) at kernel/exit.c:670
#5  exit_notify (group_dead=<optimized out>, tsk=<optimized out>) at kernel/exit.c:706
#6  do_exit (code=<optimized out>) at kernel/exit.c:885
#7  0xffff000008088230 in die (str=<optimized out>, regs=0xffff80002d493ac0, err=-1778384828) at arch/arm64/kernel/traps.c:281
#8  0xffff000008093d1c in __do_kernel_fault (addr=0, esr=2516582468, regs=0xffff80002d493ac0) at arch/arm64/mm/fault.c:289
#9  0xffff000008093fa8 in do_page_fault (addr=0, esr=2516582468, regs=0xffff80002d493ac0) at arch/arm64/mm/fault.c:558
#10 0xffff00000809412c in do_translation_fault (addr=<optimized out>, esr=<optimized out>, regs=<optimized out>) at arch/arm64/mm/fault.c:584
#11 0xffff000008080ab4 in do_mem_abort (addr=0, esr=2516582468, regs=0xffff80002d493ac0) at arch/arm64/mm/fault.c:739
#12 0xffff000008082654 in el1_sync () at arch/arm64/kernel/entry.S:555
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

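6.3. Exception handling mechanism (background for reading the backtrace)

6.3.1. For details, see the Zhihu article <ARM64的异常处理(Exception)机制> (the ARM64 exception handling mechanism)

6.3.1.1. Exception entry

When an exception occurs, execution jumps to the entry of the exception vector table that matches the exception type. Each entry of the vector table holds a branch to an exception handling function, so control then passes to the appropriate handler, which processes the exception.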
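
As a rough illustration of how the entry is selected, the sketch below computes the offset of an entry inside the AArch64 vector table, which has 16 entries of 0x80 bytes each, grouped by where the exception came from. The names used for the sources and types are made up for this example. The oops in section 6 was taken in the kernel itself, so it arrives through the "current EL with SPx, synchronous" entry, which is where el1_sync is reached.

```python
VECTOR_ENTRY_SIZE = 0x80  # each of the 16 vector table entries is 128 bytes

# Order of the four groups and of the four entries within each group.
SOURCES = ["current_el_sp0", "current_el_spx", "lower_el_aarch64", "lower_el_aarch32"]
TYPES = ["sync", "irq", "fiq", "serror"]


def vector_offset(source, exc_type):
    """Offset of a vector table entry relative to the table base (VBAR_ELx)."""
    index = SOURCES.index(source) * len(TYPES) + TYPES.index(exc_type)
    return index * VECTOR_ENTRY_SIZE


print(hex(vector_offset("current_el_spx", "sync")))   # 0x200
print(hex(vector_offset("lower_el_aarch64", "irq")))  # 0x480
```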

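6.3.1.2. The exception context: struct pt_regs

6.3.2. Function stacks and the call-stack relationship

6.4. Analysis of do_mem_abort

do_mem_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs)

The regs parameter of do_mem_abort points to a struct pt_regs, which holds the context that was interrupted by the exception.

Look at the gdb backtrace again: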

#11 0xffff000008080ab4 in do_mem_abort (addr=0, esr=2516582468, regs=0xffff80002d493ac0) at arch/arm64/mm/fault.c:739
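
gdb prints esr in decimal, which hides the fields the kernel log reports. The following sketch decodes it, assuming the architectural ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0], and for data aborts WnR in ISS bit 6 and DFSC in ISS bits [5:0]). Only the exception classes relevant here are named, and the function name decode_esr is illustrative.

```python
EC_NAMES = {
    0x20: "IABT (lower EL)",
    0x21: "IABT (current EL)",
    0x24: "DABT (lower EL)",
    0x25: "DABT (current EL)",
}


def decode_esr(esr):
    """Split an ESR_ELx value into the fields shown in the kernel log."""
    esr &= 0xFFFFFFFF  # gdb may show the same register as a negative int
    ec = (esr >> 26) & 0x3F
    iss = esr & 0x1FFFFFF
    info = {
        "esr": esr,
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il_bits": 32 if (esr >> 25) & 1 else 16,
        "iss": iss,
    }
    if ec in (0x24, 0x25):
        info["wnr"] = (iss >> 6) & 1  # 1 = the faulting access was a write
        info["dfsc"] = iss & 0x3F     # data fault status code
    return info


info = decode_esr(2516582468)
print(hex(info["esr"]), info["ec_name"], hex(info["iss"]), info["wnr"], hex(info["dfsc"]))
# 0x96000044 DABT (current EL) 0x44 1 0x4
```

The result matches the log in 6.1: Oops 96000044, exception class DABT (current EL), ISS = 0x00000044 and WnR = 1. DFSC 0x04 is a level 0 translation fault, consistent with the empty pgd entry in the log. The value err=-1778384828 in frame #7 decodes to the same fields, because it is the same 32-bit pattern printed as a signed int.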

6.4.1. Print the contents of the regs argument

p/x *(struct pt_regs*)0xffff80002d493ac0

$3 = {{user_regs = {regs = {0x0, 0x0, 0x7ffffc0, 0x4, 0x0, 0x40, 0x3f, 0x0, 0x0, 0x0, 0x101010101010101, 0x6, 0x0, 0xffff00000d07ed9d, 0x140000000000000, 0x6, 0x19, 0x1, 0x10, 0xffff00000d0be000, 0x2, 
        0xffff80002d7aa410, 0xffff00000d078000, 0xffff80002d7aa400, 0xffff00000d0be138, 0x38, 0x2, 0xffff000008925060, 0x0, 0xffff80002d493c00, 0xffff0000084e7e20}, sp = 0xffff80002d493c00, 
      pc = 0xffff0000086a6b2c, pstate = 0x40000145}, {regs = {0x0, 0x0, 0x7ffffc0, 0x4, 0x0, 0x40, 0x3f, 0x0, 0x0, 0x0, 0x101010101010101, 0x6, 0x0, 0xffff00000d07ed9d, 0x140000000000000, 0x6, 0x19, 0x1, 
        0x10, 0xffff00000d0be000, 0x2, 0xffff80002d7aa410, 0xffff00000d078000, 0xffff80002d7aa400, 0xffff00000d0be138, 0x38, 0x2, 0xffff000008925060, 0x0, 0xffff80002d493c00, 0xffff0000084e7e20}, 
      sp = 0xffff80002d493c00, pc = 0xffff0000086a6b2c, pstate = 0x40000145}}, orig_x0 = 0xffff80002d7aa410, syscallno = 0x221a7b00, unused2 = 0xffff8000, orig_addr_limit = 0xffffffffffffffff, 
  unused = 0xffff0000084e7d7c, stackframe = {0xffff80002d493c00, 0xffff0000086a6b2c}}
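
When vmlinux has no usable type information for struct pt_regs, the same fields can be read straight from memory. The sketch below assumes the layout visible in the output above (regs[0..30] as 8-byte slots, followed by sp, pc and pstate) and prints the matching x/gx commands; the helper name is hypothetical.

```python
PT_REGS_BASE = 0xFFFF80002D493AC0  # the regs pointer from frame #11

# Offsets of the fields that follow the 31 general-purpose registers.
FIXED_OFFSETS = {"sp": 31 * 8, "pc": 32 * 8, "pstate": 33 * 8}


def pt_regs_field_address(base, field):
    """Address of one saved register inside struct pt_regs (assumed layout)."""
    if field.startswith("x") and field[1:].isdigit():
        index = int(field[1:])
        if not 0 <= index <= 30:
            raise ValueError("general-purpose registers are x0..x30")
        return base + 8 * index
    return base + FIXED_OFFSETS[field]


for name in ("x29", "x30", "sp", "pc", "pstate"):
    print(f"x/gx {pt_regs_field_address(PT_REGS_BASE, name):#x}   # {name}")
```

The address computed for pc is 0xffff80002d493bc0, which is the row labelled 3bc0 in the exception stack dump of 6.1 and holds ffff0000086a6b2c. The address for x29 is 0xffff80002d493ba8, the same one info frame reports for frame 12.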

6.4.2. Look up the pc, 0xffff0000086a6b2c

(gdb) list *(0xffff0000086a6b2c)
0xffff0000086a6b2c is at arch/arm64/lib/memset.S:211.

6.4.3. Look up the lr (x30), 0xffff0000084e7e20

(gdb) list *(0xffff0000084e7e20)
0xffff0000084e7e20 is in isp_vin_probe (drivers/media/platform/innosilicon/isp/inno_isp_vin.c:636).

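This matches the stack information that the kernel printed on the serial console, which completes the analysis.

7. Summary

gdb has many more features still to explore, such as inspecting the task information of the faulting context and watching variables. They are worth digging into in practice.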
