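The problem: normally, when a program being debugged under gdb calls fork to create a child process, gdb keeps debugging the parent and ignores the child's execution. If you had set a breakpoint in a routine that the child executes, then when the child reaches that breakpoint it receives a SIGTRAP signal and terminates, unless the child catches that signal.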
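So how do you debug a multi-process program with GDB?
1. Test code: in this test program, the child process hits a divide-by-zero exception in the function wib. We now want to debug that child process.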
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
int wib(int no1, int no2)
{
    int result, diff;
    diff = no1 - no2;
    result = no1 / diff; /* divides by zero when no1 == no2 */
    return result;
}
int main()
{
    pid_t pid;
    pid = fork();
    if (pid < 0) {
        printf("fork err\n");
        exit(-1);
    } else if (pid == 0) {
        /* in child process */
        sleep(2*60); //------------------ (!)
        int value = 10;
        int div = 6;
        int total = 0;
        int i = 0;
        int result = 0;
        for (i = 0; i < 10; i++) {
            result = wib(value, div);
            total += result;
            div++;
            value--;
        }
        printf("%d wibed by %d equals %d\n", value, div, total);
        exit(0);
    } else {
        /* in parent process */
        sleep(4);
        wait(NULL); /* reap the child; its exit status is not needed here */
        exit(0);
    }
}
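2. Debugging steps:
[How it works]
You may have noticed that at (!) in the test program, the child calls sleep for 2*60 seconds right after the parent forks. This is the key. The sleep does not belong in the child's code; it was added purely for debugging with GDB. Why make the child sleep as soon as it starts? Because while it sleeps, we can use shell commands to get its process ID and then attach gdb to that process ID, just as when debugging any external process.
The idea should be clear by now; what remains is how to carry it out. Let's try it!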
Window 1:
root@ubuntu:/mnt/hgfs/ubuntu-share# ./test &
[2] 12602
[1] Done ./test
root@ubuntu:/mnt/hgfs/ubuntu-share# ps -ef | grep test
root 12602 12506 0 22:06 pts/2 00:00:00 ./test
root 12603 12602 0 22:06 pts/2 00:00:00 ./test
root 12605 12506 0 22:06 pts/2 00:00:00 grep --color=auto test
Window 2:
root@ubuntu:/mnt/hgfs/ubuntu-share# gdb
GNU gdb (GDB) 7.5-ubuntu
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
(gdb)
(gdb) attach 12603
Attaching to process 12603
Reading symbols from /mnt/hgfs/ubuntu-share/test...done.
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00007f08939cc820 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) list
warning: Source file is more recent than executable.
6 {
7 int result, diff;
8 diff = no1 - no2;
9 result = no1 / diff;
10 return result;
11 }
12 int main()
13 {
14 pid_t pid;
15 pid = fork();
(gdb)
16 if (pid <0) {
17 printf("fork err/n");
18 exit(-1);
19 } else if (pid == 0) {
20 /* in child process */
21 sleep(2*60); //------------------ (!)
22 int value = 10;
23 int div = 6;
24 int total = 0;
25 int i = 0;
(gdb)
26 int result = 0;
27 for (i = 0; i < 10; i++) {
28 result = wib(value, div);
29 total += result;
30 div++;
31 value--;
32 }
33 printf("%d wibed by %d equals %d/n", value, div, total);
34 exit(0);
35 } else {
(gdb) break 28 // the program stops before line 28 is executed
Breakpoint 1 at 0x4006d7: file test.c, line 28.
(gdb) continue
Continuing.
Breakpoint 1, main () at test.c:28
28 result = wib(value, div);
(gdb) step // steps into the function
wib (no1=10, no2=6) at test.c:8
8 diff = no1 - no2;
(gdb) continue
Continuing.
Breakpoint 1, main () at test.c:28
28 result = wib(value, div);
(gdb) step
wib (no1=9, no2=7) at test.c:8
8 diff = no1 - no2;
(gdb) p diff
$1 = 4
(gdb) continue
Continuing.
Breakpoint 1, main () at test.c:28
28 result = wib(value, div);
(gdb) step
wib (no1=8, no2=8) at test.c:8
8 diff = no1 - no2;
(gdb) p diff
$2 = 2
(gdb) continue
Continuing.
Program received signal SIGFPE, Arithmetic exception.
0x000000000040065d in wib (no1=8, no2=8) at test.c:9
9 result = no1 / diff;
(gdb) p diff // the error is located: the divisor is 0
$3 = 0
(gdb) q
A debugging session is active.
Inferior 1 [process 12603] will be detached.
Quit anyway? (y or n) y
Detaching from program: /mnt/hgfs/ubuntu-share/test, process 12603