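1. Why enable core dump files
When a program crashes on Linux, the kernel can write a core file that captures the state of the process at the moment of the crash. This file can be used to locate the cause of the failure.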
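2. How to generate core dump files
2.1 Temporary change
2.1.1 Run the following command to turn on core file generation. The setting applies only to the current shell session and the processes started from it.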
ulimit -c unlimited
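Before relying on this switch, it can help to check which limits are actually in force. The following is a minimal sketch, not part of the original steps: the function names show_core_limits and raise_core_limit are made up for illustration, and it assumes a POSIX-style shell whose ulimit builtin accepts -S, -H and -c (bash and dash both do).

```shell
#!/bin/sh
# Print the soft and hard core-size limits of the current shell.
show_core_limits() {
    echo "soft=$(ulimit -S -c) hard=$(ulimit -H -c)"
}

# Raise the soft limit as far as the hard limit allows.
# An unprivileged user cannot go above the hard limit, so this keeps
# working in cases where "ulimit -c unlimited" is refused.
raise_core_limit() {
    ulimit -S -c "$(ulimit -H -c)"
}

raise_core_limit
show_core_limits
```

If the reported hard limit is 0, no core file will be written from this session until the hard limit itself is raised, for example through limits.conf.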
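2.1.2 Run the following command to customize the core file name. Here %e is the executable name, %p is the process ID, and %t is the dump time in seconds since the epoch. By default, the core file is written to the current working directory of the crashing process.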
sudo sh -c 'echo "%e-%p-%t" > /proc/sys/kernel/core_pattern'
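To see what file name a given pattern produces, the substitution can be imitated in the shell. This is only an illustrative sketch: the kernel performs the real expansion, expand_core_pattern is a hypothetical helper name, and only the three specifiers used above (%e, %p, %t) are handled. It also assumes the executable name contains no "/" or "&", which would confuse sed.

```shell
#!/bin/sh
# expand_core_pattern PATTERN EXE PID TIME
# Replace %e, %p and %t in PATTERN with the given values and print the result.
expand_core_pattern() {
    printf '%s\n' "$1" | sed -e "s/%e/$2/g" -e "s/%p/$3/g" -e "s/%t/$4/g"
}

# A program named "test" with PID 3470 crashing at epoch time 1650896215:
expand_core_pattern "%e-%p-%t" test 3470 1650896215
# prints: test-3470-1650896215
```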
2.2 Permanent change
2.2.1 Open limits.conf and add the following two lines:
sudo vi /etc/security/limits.conf
* soft core unlimited
* hard core unlimited
2.2.2 Setting the core file name (method 1)
Run cat /proc/sys/kernel/core_pattern. The output shows that the pattern points to apport.
cat /proc/sys/kernel/core_pattern
Open the apport init script and edit it:
sudo vi /etc/init.d/apport
Change this line:
echo "|$AGENT %p %s %c %d %P %E" > /proc/sys/kernel/core_pattern
to:
echo "core_%e-%p-%t" > /proc/sys/kernel/core_pattern
2.2.3 Setting the core file name (method 2)
Open this file:
sudo vi /etc/sysctl.conf
Append the following line at the end of the file:
kernel.core_pattern = core_%e-%p-%t
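Both settings above take effect after a reboot. The sysctl.conf entry can also be applied immediately with sudo sysctl -p.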
3. Testing whether the settings work
3.1 Write a faulty program that assigns through a null pointer:
#include <stdio.h>

int main()
{
    int *p=0;   /* p is a null pointer; the write on the next line crashes */
    *p=10;
    return 0;
}
3.2 Compile the program. Be sure to add the -g option; without it, the core file cannot be mapped back to the exact source location later.
gcc -o test test.c -g
3.3 Run the program. It prints Segmentation fault (core dumped), and a file named test-xxxx-xxxxx appears in the directory. This is the core file. Each run of the program produces a new core file.
book@100ask:~/workspace/test1$ ./test
Segmentation fault (core dumped)
book@100ask:~/workspace/test1$ ll
total 216
drwxrwxr-x 2 book book 4096 Apr 25 10:37 ./
drwxrwxrwx 11 book book 4096 Apr 25 09:44 ../
-rwxrwxr-x 1 book book 10640 Apr 25 10:19 test*
-rw------- 1 book book 241664 Apr 25 10:16 test-3470-1650896215
-rw------- 1 book book 241664 Apr 25 10:37 test-3639-1650897462
-rw-rw-r-- 1 book book 65 Apr 25 10:16 test.c
book@100ask:~/workspace/test1$
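Because every crash leaves another core file behind, it is handy to pick out the newest one before starting the debugger. The sketch below assumes the %e-%p-%t naming shown in the listing; latest_core is a hypothetical helper name, and parsing ls output is acceptable here only because these file names contain no spaces or newlines.

```shell
#!/bin/sh
# latest_core DIR PREFIX
# Print the most recently modified file in DIR whose name starts with "PREFIX-".
# Prints nothing if no such file exists.
latest_core() {
    ls -t "$1"/"$2"-* 2>/dev/null | head -n 1
}
```

For the directory listed above, latest_core . test would print ./test-3639-1650897462, the core written at 10:37.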
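4. Using the core file to locate the problem
Run gdb ./test test-3470-1650896215, where ./test is the executable and test-3470-1650896215 is the core file. The output below shows that line 6 is where the problem occurs. The warning "exec file is newer than core file" appears because the binary was rebuilt after this core file was written.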
book@100ask:~/workspace/test1$ gdb ./test test-3470-1650896215
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./test...done.
warning: exec file is newer than core file.
[New LWP 3470]
Core was generated by `./test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055e56acf960a in main () at test.c:6
6 *p=10;
(gdb) p p
$1 = (int *) 0x0
(gdb) bt
#0 0x000055e56acf960a in main () at test.c:6
(gdb) q
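The same backtrace can be obtained without an interactive session, which is convenient in scripts. This is a small sketch using gdb's -batch and -ex options; core_backtrace is a hypothetical wrapper name, and the argument check is an addition for illustration.

```shell
#!/bin/sh
# core_backtrace EXECUTABLE COREFILE
# Print the backtrace stored in COREFILE without entering the gdb prompt.
core_backtrace() {
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "usage: core_backtrace EXECUTABLE COREFILE" >&2
        return 2
    fi
    # -batch exits after running the commands given with -ex.
    gdb -batch -ex bt "$1" "$2"
}
```

With the files from this example, the call would be core_backtrace ./test test-3470-1650896215.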
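5. A bug found while debugging a real project
5.1 In frame #0, pevent is a pointer, and pevent=0x0 shows that it is null, so dereferencing it on line 35 (pevent->OSEventType) triggers the crash.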
#0 0x0000564e67279414 in OSQPend (perr=<synthetic pointer>, timeout=1000,
pevent=0x0) at os_q.c:35
35 msglen = message_queue_receive_settable(pevent->OSEventType, msg, MSQ_MAX_LEN, timeout);
6. Checking where core dump files are stored
book@100ask:/usr/share/apport$ cat /proc/sys/kernel/core_pattern
|/usr/share/apport/apport %p %s %c %d %P %E
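Note: if the first character is '|', the rest of the pattern is executed as a command. In that case the kernel does not create a core dump file; it pipes the dump to that program for processing instead.
For example, '|/usr/share/apport/apport %p %s %c %d %P' means that core dumps are handled by apport.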
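The rule can be turned into a quick check that tells whether a given pattern produces a file or hands the dump to a program. This is a minimal sketch; describe_core_pattern is a hypothetical helper name, and it looks only at the first character of the pattern.

```shell
#!/bin/sh
# describe_core_pattern PATTERN
# Say whether PATTERN pipes the dump to a program or names a file.
describe_core_pattern() {
    case "$1" in
        '|'*) printf '%s\n' "piped to: ${1#?}" ;;
        /*)   printf '%s\n' "file at absolute path: $1" ;;
        *)    printf '%s\n' "file in the working directory of the crashing process: $1" ;;
    esac
}
```

To inspect the live setting, pass the current value: describe_core_pattern "$(cat /proc/sys/kernel/core_pattern)".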