I. Starting a GDB Debugging Session
GDB is generally started in one of three ways:
gdb filename
gdb attach pid (gdb -p pid is the equivalent option form)
gdb filename corename
1. Debugging the target program directly
2. Attaching to a running process
3. Debugging a core file
The format specifiers that can appear in the core file name are:
Specifier   Meaning
%p          insert pid into filename
%u          insert current uid into filename
%g          insert current gid into filename
%s          insert signal that caused the coredump into the filename
%t          insert UNIX time that the coredump occurred into filename
%h          insert hostname where the coredump happened into filename
%e          insert coredumping executable name into filename
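To make the specifier table concrete, here is a minimal sketch in C of how such a template could be expanded into a file name. On Linux the template itself lives in /proc/sys/kernel/core_pattern and the kernel does the expansion; the function expand_core_pattern, the struct crash_info, and every value used with them are hypothetical names introduced only for illustration, and the handling of unknown specifiers is a simplified model.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one crash; a real kernel fills these in itself. */
struct crash_info {
    long pid;          /* %p */
    long uid;          /* %u */
    long gid;          /* %g */
    int sig;           /* %s */
    long unix_time;    /* %t */
    const char *host;  /* %h */
    const char *exe;   /* %e */
};

/* Expand pattern into out (out_size bytes).
 * Returns 0 on success, -1 if the result does not fit. */
int expand_core_pattern(const char *pattern, const struct crash_info *ci,
                        char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (const char *p = pattern; *p != '\0'; p++) {
        char piece[256] = "";   /* text contributed by this step */

        if (*p != '%') {
            piece[0] = *p;      /* literal character, copied as-is */
        } else {
            p++;
            switch (*p) {
            case 'p': snprintf(piece, sizeof piece, "%ld", ci->pid); break;
            case 'u': snprintf(piece, sizeof piece, "%ld", ci->uid); break;
            case 'g': snprintf(piece, sizeof piece, "%ld", ci->gid); break;
            case 's': snprintf(piece, sizeof piece, "%d", ci->sig); break;
            case 't': snprintf(piece, sizeof piece, "%ld", ci->unix_time); break;
            case 'h': snprintf(piece, sizeof piece, "%s", ci->host); break;
            case 'e': snprintf(piece, sizeof piece, "%s", ci->exe); break;
            case '%': piece[0] = '%'; break;
            case '\0': p--; break;  /* lone '%' at the end: drop it, loop then stops */
            default: break;         /* specifier not in the table: dropped */
            }
        }

        size_t len = strlen(piece);
        if (used + len >= out_size)
            return -1;              /* would overflow the output buffer */
        memcpy(out + used, piece, len + 1);
        used += len;
    }
    return 0;
}
```

With a hypothetical crash of redis-server (pid 46455), the template core-%e-%p would expand to core-redis-server-46455, which makes it easy to tell at a glance which process produced a given core file.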
II. Overview of Common GDB Commands
Here is a list of commonly used commands; the usage of each one is explained later with concrete examples.
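Command       Abbreviation   Description
run           r              Run the program
continue      c              Resume a paused program
next          n              Run to the next line (step over)
step          s              If the line calls a function, enter that function (step into)
until         u              Run until the specified line is reached, then stop
finish        fin            Run until the current function returns to its caller
return        return         End the current function immediately, returning the specified value to its caller
jump          j              Move the execution flow to the specified line or address
print         p              Print the value of a variable or register
backtrace     bt             Show the call stack of the current thread
frame         f              Select a frame in the current thread's call stack by frame number
thread        thread         Switch to the specified thread
break         b              Add a breakpoint
tbreak        tb             Add a temporary breakpoint
delete        del            Delete breakpoints
enable        enable         Enable a breakpoint
disable       disable        Disable a breakpoint
watch         watch          Watch a variable or memory address for changes in value
list          l              Show source code
info          info           Show information about breakpoints, threads, and so on
ptype         ptype          Show the type of a variable
disassemble   disas          Show assembly code (dis alone abbreviates disable)
set args                     Set the command-line arguments used to start the program
show args                    Show the command-line arguments that have been set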
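III. Common GDB Commands in Detail
This lesson covers the following:
the run command
the continue command
the break command
the backtrace and frame commands
the info break, enable, disable, and delete commands
the list command
the print and ptype commands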
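To stay practical, each command is introduced by debugging the Redis source code. The basic usage of the common commands comes first; advanced usage of some commands is covered later.
Downloading the Redis source and building a debug version
The download address for the latest Redis source is available on the Redis website. Use wget to download the source archive: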
[root@localhost gdbtest]# wget http://download.redis.io/releases/redis-4.0.11.tar.gz
--2018-09-08 13:08:41--  http://download.redis.io/releases/redis-4.0.11.tar.gz
Resolving download.redis.io (download.redis.io)... 109.74.203.151
Connecting to download.redis.io (download.redis.io)|109.74.203.151|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1739656 (1.7M) [application/x-gzip]
Saving to: ‘redis-4.0.11.tar.gz’
54% [======================>                    ] 940,876     65.6KB/s  eta 9s
Extract the archive:
[root@localhost gdbtest]# tar zxvf redis-4.0.11.tar.gz
Enter the generated redis-4.0.11 directory and build with make. To make debugging easier, generate debug symbols and turn off compiler optimization:
[root@localhost gdbtest]# cd redis-4.0.11
[root@localhost redis-4.0.11]# make CFLAGS="-g -O0" -j 4
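Note: Redis is a pure C project compiled with gcc, so the compiler options are passed through CFLAGS here. For a C++ project the compiler is usually g++, and the corresponding variable is CXXFLAGS. Keep this distinction in mind.
In addition, make is run with -j 4, which lets it run up to 4 compile jobs in parallel to speed up the build.
After a successful build, several executables are generated in the src directory; redis-server and redis-cli are the ones to debug.
Enter the src directory and start redis-server under GDB: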
[root@localhost src]# gdb redis-server
Reading symbols from /root/gdbtest/redis-4.0.11/src/redis-server...done.
1. The run command
As mentioned in the previous lesson, gdb filename only loads the file to be debugged; it does not start the program. Enter the run command (abbreviated r) to start it:
(gdb) r
Starting program: /root/gdbtest/redis-4.0.11/src/redis-server
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
46455:C 08 Sep 13:43:43.957 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
46455:C 08 Sep 13:43:43.957 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=46455, just started
46455:C 08 Sep 13:43:43.957 # Warning: no config file specified, using the default config. In order to specify a config file use /root/gdbtest/redis-4.0.11/src/redis-server /path/to/redis.conf
46455:M 08 Sep 13:43:43.957 * Increased maximum number of open files to 10032 (it was originally set to 1024).
[New Thread 0x7ffff07ff700 (LWP 46459)]
[New Thread 0x7fffefffe700 (LWP 46460)]
[New Thread 0x7fffef7fd700 (LWP 46461)]
[Redis ASCII-art startup banner: Redis 4.0.11 (00000000/0) 64 bit, Running in standalone mode, Port: 6379, PID: 46455, http://redis.io]
46455:M 08 Sep 13:43:43.965 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
46455:M 08 Sep 13:43:43.965 # Server initialized
46455:M 08 Sep 13:43:43.965 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
46455:M 08 Sep 13:43:43.965 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
46455:M 08 Sep 13:43:43.965 * Ready to accept connections
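This is the redis-server startup screen. If the program is already running, entering run again restarts it. Press Ctrl + C in GDB to interrupt the program, then enter r again; GDB asks whether to restart the program, and answering yes confirms the restart.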
^C
Program received signal SIGINT, Interrupt.
0x00007ffff73ee923 in epoll_wait () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.el7_4.2.x86_64
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) yes
Starting program: /root/gdbtest/redis-4.0.11/src/redis-server
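2. The continue command
After GDB stops at a breakpoint or is interrupted with Ctrl + C, enter the continue command (abbreviated c) to let the program keep running. If the program hits another breakpoint after continue, GDB stops again.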
^C
Program received signal SIGINT, Interrupt.
0x00007ffff73ee923 in epoll_wait () from /lib64/libc.so.6
(gdb) c
Continuing.
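3. The break command
The break command (abbreviated b) adds a breakpoint. It can be used in the following forms:
break functionname: add a breakpoint at the entry of the function named functionname;
break LineNo: add a breakpoint at line LineNo of the current file;
break filename:LineNo: add a breakpoint at line LineNo of the file filename.
All three forms are in common use. For example, main() is the entry function of an ordinary Linux program, and redis-server is no exception. Knowing the function name, we can add a breakpoint at main() directly: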
(gdb) b main
Breakpoint 1 at 0x423450: file server.c, line 3709.
Once the breakpoint is added, restart the program with run to hit it; GDB stops at the breakpoint.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /root/gdbtest/redis-4.0.11/src/redis-server
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, main (argc=1, argv=0x7fffffffe648) at server.c:3709
3709 int main(int argc, char **argv) {
(gdb)
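The three location forms accepted by break (a function name, a line number, and filename:LineNo) can be told apart mechanically. The following is a minimal sketch in C of such a classifier; it is not GDB's actual parser, which accepts many more forms, and parse_location, struct location, and the LOC_* constants are hypothetical names introduced only for illustration.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum loc_kind { LOC_FUNCTION, LOC_LINE, LOC_FILE_LINE };

struct location {
    enum loc_kind kind;
    char name[128];   /* function name or file name; empty for LOC_LINE */
    long line;        /* line number; 0 for LOC_FUNCTION */
};

/* Return 1 if s is a non-empty string of decimal digits, else 0. */
static int all_digits(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Classify a breakpoint location string.
 * Returns 0 on success, -1 if the string is empty or a name is too long. */
int parse_location(const char *spec, struct location *out)
{
    const char *colon = strrchr(spec, ':');

    memset(out, 0, sizeof *out);

    if (all_digits(spec)) {                        /* break LineNo */
        out->kind = LOC_LINE;
        out->line = strtol(spec, NULL, 10);
        return 0;
    }
    if (colon != NULL && all_digits(colon + 1)) {  /* break filename:LineNo */
        size_t len = (size_t)(colon - spec);
        if (len == 0 || len >= sizeof out->name)
            return -1;
        memcpy(out->name, spec, len);              /* memset above keeps it terminated */
        out->kind = LOC_FILE_LINE;
        out->line = strtol(colon + 1, NULL, 10);
        return 0;
    }
    if (*spec == '\0' || strlen(spec) >= sizeof out->name)
        return -1;
    strcpy(out->name, spec);                       /* break functionname */
    out->kind = LOC_FUNCTION;
    return 0;
}
```

Under these assumptions, "main" is classified as a function, "441" as a line in the current file, and "anet.c:441" as a file and line pair, matching the three forms described in the text.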
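The default port of redis-server is 6379. This port must be bound through the operating system's socket API function bind(). Searching the source files for calls to this function finds one at line 441 of anet.c.
Use the break command to add a breakpoint there: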
(gdb) b anet.c:441
Breakpoint 3 at 0x426cf0: file anet.c, line 441.
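Because the port is bound while redis-server initializes at startup, restart the program with run to hit this breakpoint. GDB first stops at the breakpoint in main(); enter continue to keep running, and GDB then stops at the breakpoint at anet.c:441: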
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /root/gdbtest/redis-4.0.11/src/redis-server
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Breakpoint 1, main (argc=1, argv=0x7fffffffe648) at server.c:3709
3709 int main(int argc, char **argv) {
(gdb) c
Continuing.
46699:C 08 Sep 15:30:31.403 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
46699:C 08 Sep 15:30:31.403 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=46699, just started
46699:C 08 Sep 15:30:31.403 # Warning: no config file specified, using the default config. In order to specify a config file use /root/gdbtest/redis-4.0.11/src/redis-server /path/to/redis.conf
46699:M 08 Sep 15:30:31.404 * Increased maximum number of open files to 10032 (it was originally set to 1024).
Breakpoint 3, anetListen (err=0x746bb0 "", s=10, sa=0x75edb0, len=28, backlog=511) at anet.c:441
441 if (bind(s,sa,len) == -1) {
(gdb)
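The program is now stopped at line 441, so the current file is anet.c and breakpoints can be added with "break LineNo" directly. For example, to see which return statement the function exits through, add a breakpoint at each of lines 444, 450, and 452. The session below first lists the code around anet.c:441 and then sets the three breakpoints: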
440 static int anetListen(char *err, int s, struct sockaddr *sa, socklen_t len, int backlog) {
441     if (bind(s,sa,len) == -1) {
442         anetSetError(err, "bind: %s", strerror(errno));
443         close(s);
444         return ANET_ERR;
(gdb) l
445     }
446
447     if (listen(s, backlog) == -1) {
448         anetSetError(err, "listen: %s", strerror(errno));
449         close(s);
450         return ANET_ERR;
451     }
452     return ANET_OK;
453 }
454
(gdb) b 444
Breakpoint 3 at 0x426cf5: file anet.c, line 444.
(gdb) b 450
Breakpoint 4 at 0x426d06: file anet.c, line 450.
(gdb) b 452
Note: breakpoint 4 also set at pc 0x426d06.
Breakpoint 5 at 0x426d06: file anet.c, line 452.
(gdb)
With the three breakpoints in place, continue the program with continue. It stops at line 452 (Breakpoint 5 is hit):
(gdb) c
Continuing.
Breakpoint 5, anetListen (err=0x746bb0 "", s=10, sa=0x7e34e0, len=16, backlog=511) at anet.c:452
452 return ANET_OK;
This shows that redis-server bound the port and started listening successfully. To verify, open another SSH window: port 6379 is indeed in the LISTEN state:
[root@localhost src]# lsof -i -Pn | grep redis
redis-ser 46699 root 10u IPv6 245844 0t0 TCP *:6379 (LISTEN)
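To repeat the three-breakpoint experiment without building Redis, a small function with the same shape as anetListen() is enough. The following is a minimal sketch under that assumption: listen_on and open_listener are hypothetical names, the LISTEN_OK and LISTEN_ERR constants stand in for ANET_OK and ANET_ERR, and the code binds to the loopback address rather than reproducing what Redis does.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define LISTEN_OK   0
#define LISTEN_ERR -1

/* Same structure as anetListen(): one return per outcome, so a breakpoint
 * on each return statement shows which path was taken. */
static int listen_on(int s, struct sockaddr *sa, socklen_t len, int backlog)
{
    if (bind(s, sa, len) == -1) {
        fprintf(stderr, "bind: %s\n", strerror(errno));
        close(s);
        return LISTEN_ERR;      /* exit path 1: bind failed */
    }

    if (listen(s, backlog) == -1) {
        fprintf(stderr, "listen: %s\n", strerror(errno));
        close(s);
        return LISTEN_ERR;      /* exit path 2: listen failed */
    }
    return LISTEN_OK;           /* exit path 3: socket is listening */
}

/* Create a TCP socket listening on 127.0.0.1:port (port 0 lets the kernel
 * pick a free port). Returns the socket descriptor, or -1 on failure. */
int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (listen_on(s, (struct sockaddr *)&addr, sizeof addr, 511) == LISTEN_ERR)
        return -1;              /* listen_on() already closed the socket */
    return s;
}
```

Calling open_listener() from a main() function, compiling with -g -O0, and setting a breakpoint on each of the three return statements in listen_on() reproduces the experiment above. Asking for a port that is already in use should take the first exit path instead of the third.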
4. The backtrace and frame commands
The backtrace command (abbreviated bt) shows the current call stack. Continuing from above, redis-server is stopped at anet.c:452, and backtrace shows how execution got there:
(gdb) bt
#0  anetListen (err=0x746bb0 "", s=10, sa=0x7e34e0, len=16, backlog=511) at anet.c:452
#1  0x0000000000426e35 in _anetTcpServer (err=err@entry=0x746bb0 "", port=port@entry=6379, bindaddr=bindaddr@entry=0x0, af=af@entry=10, backlog=511) at anet.c:487
#2  0x000000000042793d in anetTcp6Server (err=err@entry=0x746bb0 "", port=port@entry=6379, bindaddr=bindaddr@entry=0x0, backlog=511) at anet.c:510
#3  0x000000000042b0bf in listenToPort (port=6379, fds=fds@entry=0x746ae4 , count=count@entry=0x746b24 ) at server.c:1728
#4  0x000000000042fa77 in initServer () at server.c:1852
#5  0x0000000000423803 in main (argc=1, argv=0x7fffffffe648) at server.c:3862
(gdb)
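There are six stack frames here, numbered #0 to #5. The outermost frame (#5) is main(), and the innermost frame (#0) is anetListen(), where the breakpoint is. To switch to another frame, use the frame command (abbreviated f) in the form "frame N", where N is the frame number without the #. As practice, step outward through the frames one by one up to main():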
(gdb) f 1
#1  0x0000000000426e35 in _anetTcpServer (err=err@entry=0x746bb0 "", port=port@entry=6379, bindaddr=bindaddr@entry=0x0, af=af@entry=10, backlog=511) at anet.c:487
487 if (anetListen(err,s,p->ai_addr,p->ai_addrlen,backlog) == ANET_ERR) s = ANET_ERR;
(gdb) f 2
#2  0x000000000042793d in anetTcp6Server (err=err@entry=0x746bb0 "", port=port@entry=6379, bindaddr=bindaddr@entry=0x0, backlog=511) at anet.c:510
510 return _anetTcpServer(err, port, bindaddr, AF_INET6, backlog);
(gdb) f 3
#3  0x000000000042b0bf in listenToPort (port=6379, fds=fds@entry=0x746ae4 , count=count@entry=0x746b24 ) at server.c:1728
1728 fds[*count] = anetTcp6Server(server.neterr,port,NULL,
(gdb) f 4
#4  0x000000000042fa77 in initServer () at server.c:1852
1852 listenToPort(server.port,server.ipfd,&server.ipfd_count) == C_ERR)
(gdb) f 5
#5  0x0000000000423803 in main (argc=1, argv=0x7fffffffe648) at server.c:3862
3862 initServer();
(gdb)
Looking at each frame above gives the call hierarchy:
main() calls initServer() at line 3862
initServer() calls listenToPort() at line 1852
listenToPort() calls anetTcp6Server() at line 1728
anetTcp6Server() calls _anetTcpServer() at line 510
_anetTcpServer() calls anetListen() at line 487
The current breakpoint is inside anetListen()
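The innermost-first ordering of the call chain can also be observed from inside a program. The following is a minimal sketch that assumes glibc, whose execinfo.h provides backtrace() and backtrace_symbols_fd(); it is not how GDB itself walks the stack, and outer, middle, and innermost are hypothetical functions standing in for the Redis call chain.

```c
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd(): glibc extensions */
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 32

/* Innermost function: plays the role of anetListen() (frame #0). */
__attribute__((noinline)) int innermost(void)
{
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);

    /* One line per frame on stderr, innermost first, the order bt uses. */
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
    return depth;
}

__attribute__((noinline)) int middle(void)
{
    int depth = innermost();
    fputs("back in middle()\n", stderr);   /* work after the call, so it is not a tail call */
    return depth;
}

__attribute__((noinline)) int outer(void)
{
    int depth = middle();
    fputs("back in outer()\n", stderr);    /* same reason as in middle() */
    return depth;
}
```

Calling outer() from main() prints the frames for innermost(), middle(), and outer() first, followed by main() and the C runtime's startup frames. Whether function names appear in the output depends on how the binary is linked, so the exact text is not shown here. Stopping the same program at a breakpoint in innermost() and typing bt in GDB gives the equivalent view with source file and line information.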
5. The info break, enable, disable, and delete commands
After adding many breakpoints to a program, use the info break command (abbreviated info b) to see which ones are set:
(gdb) info b
Num Type       Disp Enb Address            What
1   breakpoint keep y   0x0000000000423450 in main at server.c:3709
    breakpoint already hit 1 time
2   breakpoint keep y   0x000000000049c1f0 in _redisContextConnectTcp at net.c:267
3   breakpoint keep y   0x0000000000426cf0 in anetListen at anet.c:441
    breakpoint already hit 1 time
4   breakpoint keep y   0x0000000000426d05 in anetListen at anet.c:444
    breakpoint already hit 1 time
5   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:450
    breakpoint already hit 1 time
6   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:452
    breakpoint already hit 1 time
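The listing shows six breakpoints. Every breakpoint except breakpoint 2 has been hit once. The other details are also easy to read: the location of each breakpoint (file and line number), its memory address, and whether it is enabled or disabled. To disable a breakpoint, use "disable N", where N is the breakpoint number; a disabled breakpoint is no longer triggered. A disabled breakpoint can be turned back on with "enable N".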
(gdb) disable 1
(gdb) info b
Num Type       Disp Enb Address            What
1   breakpoint keep n   0x0000000000423450 in main at server.c:3709
    breakpoint already hit 1 time
2   breakpoint keep y   0x000000000049c1f0 in _redisContextConnectTcp at net.c:267
3   breakpoint keep y   0x0000000000426cf0 in anetListen at anet.c:441
    breakpoint already hit 1 time
4   breakpoint keep y   0x0000000000426d05 in anetListen at anet.c:444
    breakpoint already hit 1 time
5   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:450
    breakpoint already hit 1 time
6   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:452
    breakpoint already hit 1 time
After disable 1, the Enb column of the first breakpoint changes from y to n, and restarting the program no longer triggers it:
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /root/gdbtest/redis-4.0.11/src/redis-server
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
46795:C 08 Sep 16:15:55.681 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
46795:C 08 Sep 16:15:55.681 # Redis version=4.0.11, bits=64, commit=00000000, modified=0, pid=46795, just started
46795:C 08 Sep 16:15:55.681 # Warning: no config file specified, using the default config. In order to specify a config file use /root/gdbtest/redis-4.0.11/src/redis-server /path/to/redis.conf
46795:M 08 Sep 16:15:55.682 * Increased maximum number of open files to 10032 (it was originally set to 1024).
Breakpoint 3, anetListen (err=0x746bb0 "", s=10, sa=0x75edb0, len=28, backlog=511) at anet.c:441
441 if (bind(s,sa,len) == -1) {
Without a breakpoint number, disable and enable act on all breakpoints, disabling or enabling every one of them:
(gdb) disable
(gdb) info b
Num Type       Disp Enb Address            What
1   breakpoint keep n   0x0000000000423450 in main at server.c:3709
2   breakpoint keep n   0x000000000049c1f0 in _redisContextConnectTcp at net.c:267
3   breakpoint keep n   0x0000000000426cf0 in anetListen at anet.c:441
    breakpoint already hit 1 time
4   breakpoint keep n   0x0000000000426d05 in anetListen at anet.c:444
5   breakpoint keep n   0x0000000000426d16 in anetListen at anet.c:450
6   breakpoint keep n   0x0000000000426d16 in anetListen at anet.c:452
(gdb) enable
(gdb) info b
Num Type       Disp Enb Address            What
1   breakpoint keep y   0x0000000000423450 in main at server.c:3709
2   breakpoint keep y   0x000000000049c1f0 in _redisContextConnectTcp at net.c:267
3   breakpoint keep y   0x0000000000426cf0 in anetListen at anet.c:441
    breakpoint already hit 1 time
4   breakpoint keep y   0x0000000000426d05 in anetListen at anet.c:444
5   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:450
6   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:452
(gdb)
Use "delete N" to delete a breakpoint. Several numbers can be given at once: delete 2 3 deletes breakpoint 2 and breakpoint 3:
(gdb) delete 2 3
(gdb) info b
Num Type       Disp Enb Address            What
1   breakpoint keep y   0x0000000000423450 in main at server.c:3709
4   breakpoint keep y   0x0000000000426d05 in anetListen at anet.c:444
5   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:450
6   breakpoint keep y   0x0000000000426d16 in anetListen at anet.c:452
In the same way, delete without a breakpoint number deletes all breakpoints (GDB asks for confirmation first).