Notes on using GDB

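GDB is a powerful debugger, and I assumed everyone knew it by now, at least in embedded development. In day-to-day work that turns out not to be the case: many people still track down problems such as segmentation faults by adding log statements. There is nothing wrong with logging, but sometimes GDB is simply more efficient.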

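There is already plenty of material online about how to use it, so this is only a brief summary, together with the debugging methods I use most. I split it into three parts: common commands, hard-to-reproduce problems, and debugging files built without -g.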

Common commands

1. set args: sets the program's command-line arguments, for programs that need them.


#include <stdio.h>
int main(int argc, char* argv[])
{
   int i = 0;
   for(i = 1; i < argc; i++)
   {
      printf("%d %s\n",i,argv[i]);
   }
   return 0;
}

(gdb) set args how are you?
(gdb) r
Starting program: /home/luogf/20210109/a.out how are you?
1 how
2 are
3 you?

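2. break, abbreviated to its first letter b, sets a breakpoint. It takes a function name or a file line (line breakpoints depend on the -g compile option).

3. list, abbreviated l, prints the source code around the current position (depends on the -g compile option).

4. continue, abbreviated c, resumes execution after the program has stopped.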


(gdb) b check
Breakpoint 1 at 0x40047b: file gdb.c, line 4.
(gdb) r
Starting program: /home/luogf/20210109/a.out

Breakpoint 1, check (n=0) at gdb.c:4
4          int c = 10 / n;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.212.el6_10.3.x86_64
(gdb) l
1       #include<stdio.h>
2       int check(int n)
3       {
4          int c = 10 / n;
5          return 0;
6       }
7       int main()
8       {
9          int num = 0;
10         check(num);
(gdb) b 4
Note: breakpoint 1 also set at pc 0x40047b.
Breakpoint 2 at 0x40047b: file gdb.c, line 4.
(gdb) c
Continuing.

Program received signal SIGFPE, Arithmetic exception.
0x0000000000400485 in check (n=0) at gdb.c:4
4          int c = 10 / n;
(gdb)

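5. In addition, the p command prints a variable's value, and set var changes it. (Some people like to drop the var, which is a bad habit: set has other subcommands in gdb, and a variable name can collide with one of them.) For example, the program below crashes when run normally, but finishes cleanly once n is changed. This feature is typically used to verify a program's internal logic.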


(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/luogf/20210109/a.out

Breakpoint 1, check (n=0) at gdb.c:4
4          int c = 10 / n;
(gdb) p n
$2 = 0
(gdb) set var n = 2
(gdb) c
Continuing.

Program exited normally.

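If you know assembly, disassemble shows the assembly code of the current function, and i r (info registers) shows the current register state. These two commands are mostly needed when the program was built without -g or has been stripped.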


Breakpoint 1, check (n=0) at gdb.c:4
4          int c = 10 / n;
(gdb) disassemble
Dump of assembler code for function check:
   0x0000000000400474 <+0>:     push   %rbp
   0x0000000000400475 <+1>:     mov    %rsp,%rbp
   0x0000000000400478 <+4>:     mov    %edi,-0x14(%rbp)
=> 0x000000000040047b <+7>:     mov    $0xa,%eax
   0x0000000000400480 <+12>:    mov    %eax,%edx
   0x0000000000400482 <+14>:    sar    $0x1f,%edx
   0x0000000000400485 <+17>:    idivl  -0x14(%rbp)
   0x0000000000400488 <+20>:    mov    %eax,-0x4(%rbp)
   0x000000000040048b <+23>:    mov    $0x0,%eax
   0x0000000000400490 <+28>:    leaveq
   0x0000000000400491 <+29>:    retq
End of assembler dump.
(gdb) i r
rax            0x0      0
rbx            0x0      0
rcx            0x400492 4195474
rdx            0x7fffffffe668   140737488348776
rsi            0x7fffffffe658   140737488348760
rdi            0x0      0
rbp            0x7fffffffe550   0x7fffffffe550
rsp            0x7fffffffe550   0x7fffffffe550
r8             0x7ffff7dd7ba0   140737351875488
r9             0x7ffff7deae20   140737351953952
r10            0x7fffffffe3c0   140737488348096
r11            0x7ffff7a66c20   140737348267040
r12            0x400390 4195216
r13            0x7fffffffe650   140737488348752
r14            0x0      0
r15            0x0      0
rip            0x40047b 0x40047b <check+7>
eflags         0x206    [ PF IF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

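Hard-to-reproduce problems

Say a program crashes after running for some unknown length of time. For this kind of problem, have the program produce a core file and then analyze it with gdb. Prerequisite: you need an executable built with -g that matches the one that hit the problem.

First make sure core dumps can actually be generated. I won't go through the details here; the setup differs between environments and is well documented online.

Adding -q stops gdb from printing its lengthy startup banner.

Usage: gdb  <executable>  <coredump file>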
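
As one illustration only, on Linux the usual shell-side switch is ulimit -c unlimited, and a process can do the equivalent for itself with setrlimit. The sketch below assumes a POSIX system; the function name enable_core_dumps is my own, not something from the original setup. Where the core file ends up is still decided by the environment (on Linux, /proc/sys/kernel/core_pattern), so treat this as a starting point rather than a complete recipe.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise this process's soft core-file size limit
 * to its hard limit so the kernel is allowed to write a core file.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* An unprivileged process may raise the soft limit up to,
     * but not beyond, the hard limit. */
    lim.rlim_cur = lim.rlim_max;

    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}
```

Calling enable_core_dumps() early in main() is enough for the limit to apply to the rest of the run. If the hard limit itself is 0, the call succeeds but no core will be written, and the limit has to be raised from outside the process.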


[luogf@VM-0-2-centos 20210109]$ gdb -q ./a.out core.1097
Reading symbols from /home/luogf/20210109/a.out...done.
[New Thread 1097]
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `./a.out'.
Program terminated with signal 8, Arithmetic exception.
#0  0x0000000000400485 in check (n=0) at gdb.c:4
4          int c = 10 / n;

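Debugging files built without -g

Sometimes, for whatever reason, the executable in the target environment was built without -g. A coredump it produced can still be analyzed with the method above, using a local build of the same code compiled with -g.

But what if you need to debug live? And what if, for some reason, you cannot upload your -g build to the environment and replace the executable there?

In that case, disassemble the local -g build with objdump and match it against the assembly seen in the environment.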

[luogf@VM-0-2-centos 20210109]$ gdb -q ./a.out
Reading symbols from /home/luogf/20210109/a.out...(no debugging symbols found)...done.
(gdb) r
Starting program: /home/luogf/20210109/a.out

Program received signal SIGFPE, Arithmetic exception.
0x0000000000400485 in check ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.212.el6_10.3.x86_64
(gdb) bt
#0  0x0000000000400485 in check ()
#1  0x00000000004004ab in main ()
(gdb) l
No symbol table is loaded.  Use the "file" command.
(gdb) disassemble
Dump of assembler code for function check:
   0x0000000000400474 <+0>:     push   %rbp
   0x0000000000400475 <+1>:     mov    %rsp,%rbp
   0x0000000000400478 <+4>:     mov    %edi,-0x14(%rbp)
   0x000000000040047b <+7>:     mov    $0xa,%eax
   0x0000000000400480 <+12>:    mov    %eax,%edx
   0x0000000000400482 <+14>:    sar    $0x1f,%edx
=> 0x0000000000400485 <+17>:    idivl  -0x14(%rbp)
   0x0000000000400488 <+20>:    mov    %eax,-0x4(%rbp)
   0x000000000040048b <+23>:    mov    $0x0,%eax
   0x0000000000400490 <+28>:    leaveq
   0x0000000000400491 <+29>:    retq
End of assembler dump.

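The output above shows that the fault is at 0x0000000000400485 <+17>:    idivl  -0x14(%rbp). The => marks the instruction currently being executed.

Next, disassemble the executable built with -g.

objdump -d -S -l a.out

Here is the excerpt that is not system or runtime code: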

check():
/home/luogf/20210109/gdb.c:3
#include<stdio.h>
int check(int n)
{
  400474:       55                      push   %rbp
  400475:       48 89 e5                mov    %rsp,%rbp
  400478:       89 7d ec                mov    %edi,-0x14(%rbp)
/home/luogf/20210109/gdb.c:4
   int c = 10 / n;
  40047b:       b8 0a 00 00 00          mov    $0xa,%eax
  400480:       89 c2                   mov    %eax,%edx
  400482:       c1 fa 1f                sar    $0x1f,%edx
  400485:       f7 7d ec                idivl  -0x14(%rbp)
  400488:       89 45 fc                mov    %eax,-0x4(%rbp)
/home/luogf/20210109/gdb.c:5
   return 0;
  40048b:       b8 00 00 00 00          mov    $0x0,%eax
/home/luogf/20210109/gdb.c:6
}
  400490:       c9                      leaveq
  400491:       c3                      retq

0000000000400492 <main>:
main():
/home/luogf/20210109/gdb.c:8
int main()
{
  400492:       55                      push   %rbp
  400493:       48 89 e5                mov    %rsp,%rbp
  400496:       48 83 ec 10             sub    $0x10,%rsp
/home/luogf/20210109/gdb.c:9
   int num = 0;
  40049a:       c7 45 fc 00 00 00 00    movl   $0x0,-0x4(%rbp)
/home/luogf/20210109/gdb.c:10
   check(num);
  4004a1:       8b 45 fc                mov    -0x4(%rbp),%eax
  4004a4:       89 c7                   mov    %eax,%edi
  4004a6:       e8 c9 ff ff ff          callq  400474 <check>
/home/luogf/20210109/gdb.c:11
   return 0;
  4004ab:       b8 00 00 00 00          mov    $0x0,%eax
/home/luogf/20210109/gdb.c:12
}
  4004b0:       c9                      leaveq
  4004b1:       c3                      retq
  4004b2:       90                      nop
  4004b3:       90                      nop
  4004b4:       90                      nop
  4004b5:       90                      nop
  4004b6:       90                      nop
  4004b7:       90                      nop
  4004b8:       90                      nop
  4004b9:       90                      nop
  4004ba:       90                      nop
  4004bb:       90                      nop
  4004bc:       90                      nop
  4004bd:       90                      nop
  4004be:       90                      nop
  4004bf:       90                      nop

Comparing the two listings shows that the problem is on the line int c = 10 / n;

/home/luogf/20210109/gdb.c:4
   int c = 10 / n;
  40047b:       b8 0a 00 00 00          mov    $0xa,%eax
  400480:       89 c2                   mov    %eax,%edx
  400482:       c1 fa 1f                sar    $0x1f,%edx
  400485:       f7 7d ec                idivl  -0x14(%rbp)
  400488:       89 45 fc                mov    %eax,-0x4(%rbp)
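
To make those instructions a little more concrete, here is a rough C model of what they do. It is a sketch of the instruction semantics as I understand them, not compiler output, and model_idivl is a name made up for this article. mov loads 10 into eax, sar $0x1f fills edx with the sign of eax, and idivl divides the 64-bit value edx:eax by its memory operand, raising a divide error (delivered on Linux as SIGFPE) when the divisor is 0 or the quotient does not fit in 32 bits.

```c
#include <stdint.h>

/* Hypothetical model of:
 *   mov $0xa,%eax ; mov %eax,%edx ; sar $0x1f,%edx ; idivl <divisor>
 * Returns 0 and stores the quotient on success.
 * Returns -1 where the CPU would raise a divide error (#DE). */
int model_idivl(int32_t eax, int32_t divisor, int32_t *quotient)
{
    /* sar $0x1f,%edx: edx becomes 0 or -1, i.e. the sign of eax. */
    int32_t edx = (eax < 0) ? -1 : 0;

    /* idivl treats edx:eax as one signed 64-bit dividend. */
    uint64_t bits = ((uint64_t)(uint32_t)edx << 32) | (uint32_t)eax;
    int64_t dividend = (int64_t)bits;

    if (divisor == 0)
        return -1;                      /* #DE: divide by zero */

    int64_t q = dividend / divisor;     /* truncates toward zero, like idiv */
    if (q > INT32_MAX || q < INT32_MIN)
        return -1;                      /* #DE: quotient does not fit in eax */

    *quotient = (int32_t)q;
    return 0;
}
```

With eax = 10 and a divisor of 0, as in this crash, the model takes the first error branch. With the divisor changed to 2 (the set var n = 2 case earlier) it returns a quotient of 5.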

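Something extra

Since we have time on our hands, let's keep going through the assembly.

Print the register values at the point of failure: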


(gdb) i r
rax            0xa      10
rbx            0x0      0
rcx            0x400492 4195474
rdx            0x0      0
rsi            0x7fffffffe658   140737488348760
rdi            0x0      0
rbp            0x7fffffffe550   0x7fffffffe550
rsp            0x7fffffffe550   0x7fffffffe550
r8             0x7ffff7dd7ba0   140737351875488
r9             0x7ffff7deae20   140737351953952
r10            0x7fffffffe3c0   140737488348096
r11            0x7ffff7a66c20   140737348267040
r12            0x400390 4195216
r13            0x7fffffffe650   140737488348752
r14            0x0      0
r15            0x0      0
rip            0x400485 0x400485 <check+17>
eflags         0x10246  [ PF ZF IF RF ]
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0

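We know the fault is at 0x0000000000400485 <+17>:    idivl  -0x14(%rbp). idivl is the signed divide instruction, and -0x14(%rbp) is its divisor (the dividend is held in edx:eax). From the dump above, rbp is 0x7fffffffe550, so let's use gdb to see what value -0x14(%rbp) actually holds.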


(gdb) x (int*)0x7fffffffe550-0x14
0x7fffffffe500: 0x00000000

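The word printed is 0. Note, though, that the address shown is 0x7fffffffe500 rather than rbp-0x14 = 0x7fffffffe53c: because of the (int*) cast, gdb scales the subtraction by sizeof(int). x/dw $rbp-0x14 reads the operand's actual slot. That slot holds n, stored there from edi by the mov at <+4>, so it is 0 as well.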

As for the x command, it examines memory: it prints the value stored at a given address.
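
The address arithmetic in the x command above deserves a second look, since gdb evaluates expressions with C rules. The small sketch below (helper names are hypothetical) contrasts the two ways of subtracting 0x14 from the rbp value: as a count of int elements, which is what (int*)0x7fffffffe550-0x14 means, and as a plain byte offset, which is what the operand -0x14(%rbp) means. It uses integer arithmetic only, so no real memory is touched.

```c
#include <stddef.h>
#include <stdint.h>

/* Address reached by `(T*)base - count`: pointer arithmetic
 * moves in units of sizeof(T), passed here as elem_size. */
uint64_t scaled_sub(uint64_t base, size_t count, size_t elem_size)
{
    return base - (uint64_t)count * elem_size;
}

/* Address reached by plain integer arithmetic `base - offset`,
 * which is how the CPU interprets -0x14(%rbp). */
uint64_t byte_sub(uint64_t base, size_t offset)
{
    return base - offset;
}
```

With a 4-byte int, scaled_sub(0x7fffffffe550, 0x14, sizeof(int)) gives 0x7fffffffe500, the address gdb printed, while byte_sub(0x7fffffffe550, 0x14) gives 0x7fffffffe53c, the slot the idivl instruction actually reads.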
