Tombstone File Analysis
/*
* The following information is added by DropBox
**/
isPrevious: true
Build: Rock/odin/odin:7.1.1/NMF26F/1500868195:user/dev-keys
Hardware: msm8953
Revision: 0
Bootloader: unknown
Radio: unknown
Kernel: Linux version 3.18.31-perf-g34cb3d1 (smartcm@hardcomp5) (gcc version 4.9 20150123 (prerelease) (GCC) ) #1 SMP PREEMPT Mon Jul 24 11:54:35 CST 2017
// From here on, the content is what tombstone.cpp writes into the tombstone_0* file
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Rock/odin/odin:7.1.1/NMF26F/1500868195:user/dev-keys'
Revision: '0'
ABI: 'arm'
pid: 13289, tid: 29467, name: CodecLooper >>> /system/bin/mediaserver <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
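...... // the rest is omitted for brevity
When a tombstone file is uploaded through DropBox, it is subject to a buffer size limit. You will often see the keyword [[TRUNCATED]] at the end of the file, which means the file has been cut off. The format of a complete file is shown in the analysis below.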
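Before spending time on a tombstone, it can help to check whether the copy you have was cut off. The following is a minimal sketch that only looks for the [[TRUNCATED]] keyword at the end of the text, as described above; the function name `is_truncated` is made up for illustration.

```python
TRUNCATION_MARKER = "[[TRUNCATED]]"


def is_truncated(tombstone_text: str) -> bool:
    """Return True if the text ends with the [[TRUNCATED]] keyword.

    Trailing whitespace and newlines are ignored before the check.
    """
    return tombstone_text.rstrip().endswith(TRUNCATION_MARKER)


if __name__ == "__main__":
    sample = "backtrace:\n    #00 pc 000000000001b984 /system/lib64/libc.so\n[[TRUNCATED]]\n"
    print(is_truncated(sample))
```

If this returns True, the backtrace or the sections after it may be incomplete, so it is worth pulling the original tombstone_0* file from the device when that is possible.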
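#Analysis
##Finding the matching symbols
When a tombstone file is generated, DropBox attaches basic information such as the build version to the file it uploads. In general, take the Linux version string (for example: Linux version 3.18.31-perf-g34cb3d1 (smartcm@hardcomp5) (gcc version 4.9 20150123 (prerelease) (GCC) ) #1 SMP PREEMPT Mon Jul 24 11:54:35 CST 2017) and use it at http://172.16.2.18/vmlinux.html to look up the location of the matching symbol files. Then copy the ELF files listed in the backtrace to your local machine so that the addresses can be resolved and analyzed.
##Tombstone file contents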
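The lookup key can be pulled out of the header automatically. The sketch below assumes the header consists of `Key: value` lines placed before the `*** ***` separator, as in the sample at the top of this article; the helper names `header_fields` and `kernel_release` are hypothetical.

```python
import re

SAMPLE_HEADER = """\
isPrevious: true
Build: Rock/odin/odin:7.1.1/NMF26F/1500868195:user/dev-keys
Hardware: msm8953
Kernel: Linux version 3.18.31-perf-g34cb3d1 (smartcm@hardcomp5) (gcc version 4.9 20150123 (prerelease) (GCC) ) #1 SMP PREEMPT Mon Jul 24 11:54:35 CST 2017
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Rock/odin/odin:7.1.1/NMF26F/1500868195:user/dev-keys'
"""


def header_fields(text: str) -> dict:
    """Collect 'Key: value' lines that appear before the '*** ***' separator."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("*** ***"):
            break  # everything after this line is written by tombstone.cpp
        match = re.match(r"^(\w+):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def kernel_release(fields: dict):
    """Return the token after 'Linux version', or None if it is missing."""
    match = re.search(r"Linux version (\S+)", fields.get("Kernel", ""))
    return match.group(1) if match else None


if __name__ == "__main__":
    fields = header_fields(SAMPLE_HEADER)
    print(fields["Build"])
    print(kernel_release(fields))
```

The full `Kernel` value, or the shorter release token, can then be used as the search string on the symbol lookup page.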
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'Rock/odin/odin:7.1.1/NMF26F/1492496606:user/dev-keys'
Revision: '0'
ABI: 'arm64'
pid: 4399, tid: 4399, name: netstat >>> netstat <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
x0 0000000000000000 x1 0000005555591002 x2 0000000000000007 x3 0000000000000004
x4 000000000000004c x5 0000000000000000 x6 0000007fd1fd44b8 x7 fefefefefefefefe
x8 0000005555591002 x9 0000007f7ae0c708 x10 000000000000004c x11 0101010101010101
x12 0000007f7ae0c708 x13 000000008000002f x14 0000007f7b5d7108 x15 0000007f7b5d6dac
x16 00000055555aa778 x17 0000007f7b54c89c x18 00000000ffffffff x19 0000007f7ae2a340
x20 0000007f7ae37400 x21 000000555558d9f5 x22 0000005555591002 x23 00000055555ac760
x24 000000555559100b x25 0000000000000000 x26 ffffffffffffffff x27 0000000000000000
x28 216a8f7c1fbcf681 x29 0000007fd1fd4540 x30 0000005555575b1c
sp 0000007fd1fd44a0 pc 0000007f7b54c984 pstate 0000000000000000
v0 0000007fd1fd44100000007fd1fd44a0 v1 0000007fd1fd44100000007fd1fd44a0
v2 ffffff80ffffffc80000007fd1fd43b0 v3 0000007fd1fd43f00000007fd1fd44a0
v4 00000000000000000000000000000000 v5 40100401401004014010040140100401
v6 00000000000000000000000000000000 v7 00000000000000000000000000000000
v8 00000000000000000000000000000000 v9 00000000000000000000000000000000
v10 00000000000000000000000000000000 v11 00000000000000000000000000000000
v12 00000000000000000000000000000000 v13 00000000000000000000000000000000
v14 00000000000000000000000000000000 v15 00000000000000000000000000000000
v16 00000000000000000000000000000000 v17 00000000000000000000000000000000
v18 00000000000000000000000000000000 v19 00000000000000000000000000000000
v20 00000000000000000000000000000000 v21 00000000000000000000000000000000
v22 00000000000000000000000000000000 v23 00000000000000000000000000000000
v24 00000000000000000000000000000000 v25 00000000000000000000000000000000
v26 00000000000000000000000000000000 v27 00000000000000000000000000000000
v28 00000000000000000000000000000000 v29 00000000000000000000000000000000
v30 00000000000000000000000000000000 v31 00000000000000000000000000000000
fpsr 00000000 fpcr 00000000
backtrace:
#00 pc 000000000001b984 /system/lib64/libc.so (strncmp+232)
#01 pc 0000000000020b18 /system/bin/toybox
#02 pc 000000000000c58c /system/bin/toybox
#03 pc 000000000000c678 /system/bin/toybox
#04 pc 000000000000c5e4 /system/bin/toybox
#05 pc 0000000000020108 /system/bin/toybox
#06 pc 0000000000011880 /system/bin/toybox
#07 pc 0000000000011400 /system/bin/toybox
#08 pc 00000000000118f4 /system/bin/toybox
#09 pc 000000000001a7d8 /system/lib64/libc.so (__libc_init+88)
#10 pc 000000000000b6f8 /system/bin/toybox
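-Identify the signal that caused the crash
① In the example above, the signal is 11 (SIGSEGV), the code is 1 (SEGV_MAPERR), and the fault addr is 0x0. In other words, the process accessed the unmapped address 0, which most likely means a NULL pointer was dereferenced.
Use the aarch64-linux-android-addr2line tool to convert the pc offsets in the backtrace into source file names and line numbers. One thing is worth mentioning here: I used to run the tool with only the -f and -e options. For frame #01 (pc 20b18 in /system/bin/toybox) that gives: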
$aarch64-linux-android-addr2line -fe system/bin/toybox 20b18
ss_inode
/proc/self/cwd/external/toybox/toys/pending/netstat.c:378
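However, when inlined function calls are involved, this output does not show the call relationship. I recently found that the tool has options that solve this: adding -i (here together with -C and -p) expands the inlined calls. The actual run looks like this: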
{ qiyunlong@qiyunlong-pc /home/qiyunlong/tmp/7/1 }
$aarch64-linux-android-addr2line -fCpie system/bin/toybox 20b18
ss_inode at /proc/self/cwd/external/toybox/toys/pending/netstat.c:378
(inlined by) scan_pid_inodes at /proc/self/cwd/external/toybox/toys/pending/netstat.c:428
(inlined by) scan_pid at /proc/self/cwd/external/toybox/toys/pending/netstat.c:447
(inlined by) scan_pids at /proc/self/cwd/external/toybox/toys/pending/netstat.c:456
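Running the tool by hand for every frame gets tedious, so the step can be scripted. The sketch below parses `#NN pc ADDR PATH` lines and builds one addr2line command per frame with the same `-fCpie` options as above. The function names and the `symbols_dir` parameter are assumptions for illustration; the commands are only built, not executed.

```python
import posixpath
import re

ADDR2LINE = "aarch64-linux-android-addr2line"
FRAME_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-fA-F]+)\s+(\S+)(?:\s+\((.*)\))?")

SAMPLE_BACKTRACE = """\
backtrace:
#00 pc 000000000001b984 /system/lib64/libc.so (strncmp+232)
#01 pc 0000000000020b18 /system/bin/toybox
#02 pc 000000000000c58c /system/bin/toybox
"""


def parse_backtrace(text: str) -> list:
    """Return one dict per frame: index, pc (int), ELF path, optional symbol."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            index, pc, path, symbol = match.groups()
            frames.append({"index": int(index), "pc": int(pc, 16),
                           "path": path, "symbol": symbol})
    return frames


def addr2line_command(frame: dict, symbols_dir: str = ".") -> list:
    """Build the argv list for one frame.

    symbols_dir is assumed to hold unstripped copies of the ELF files
    laid out like the device file system (system/bin/toybox, ...).
    """
    elf = posixpath.join(symbols_dir, frame["path"].lstrip("/"))
    return [ADDR2LINE, "-fCpie", elf, format(frame["pc"], "x")]


if __name__ == "__main__":
    for frame in parse_backtrace(SAMPLE_BACKTRACE):
        print(" ".join(addr2line_command(frame)))
```

Each argv list can be passed to `subprocess.run` once the toolchain is on the PATH. Building a list instead of a shell string avoids quoting problems with unusual paths.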
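② Another common case is signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
For this kind of problem, obtain the source line numbers in the same way and read the code. Here the process itself decided to abort and exit, so it is relatively easy to analyze, and the tombstone usually carries an abort message.
-Read the relevant code logic
-Try to reproduce the problem so that the fix can be verified afterwards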
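When sorting many tombstones, the signal line is a convenient first filter. This is a minimal sketch that parses the two signal lines quoted in this article. The regular expression is an assumption based on those two samples only, and `parse_signal_line` is a hypothetical helper name.

```python
import re

SIGNAL_RE = re.compile(
    r"signal (\d+) \((\w+)\), code (-?\d+) \((\w+)\), fault addr (\S+)"
)


def parse_signal_line(line: str):
    """Parse a tombstone signal line into a dict, or return None.

    fault_addr is an int, or None when the tombstone prints dashes
    (as it does for the SIGABRT sample).
    """
    match = SIGNAL_RE.search(line)
    if not match:
        return None
    number, name, code, code_name, addr = match.groups()
    fault_addr = None if set(addr) == {"-"} else int(addr, 16)
    return {
        "signal": int(number),
        "name": name,
        "code": int(code),
        "code_name": code_name,
        "fault_addr": fault_addr,
    }


if __name__ == "__main__":
    print(parse_signal_line("signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0"))
    print(parse_signal_line("signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------"))
```

A result with `name == "SIGSEGV"` and `fault_addr == 0` points to the NULL pointer case from ①, while `SIGABRT` suggests looking for the abort message first.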
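The above is how I work out what happened. Of course, this only scratches the surface of tombstone files. More advanced techniques include using objdump -SD … to get the disassembly, reconstructing the call logic from the registers recorded in the tombstone and the stack contents dumped below the backtrace, and using the memory dump and the maps information to determine which file a virtual address maps to.
If anything here is wrong, corrections are welcome.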
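As a small worked example of relating registers to the backtrace, the load address of each ELF file can be recovered from the dump above. The sketch rests on two assumptions: the pc printed in a backtrace frame is an offset from the start of the mapped file, and frame #01 corresponds to the return address in x30 minus one 4-byte instruction. The second assumption is consistent with the values in this dump (x30 ends in b1c, frame #01 ends in b18). `load_base` is a hypothetical helper name.

```python
PAGE_SIZE = 0x1000


def load_base(absolute_pc: int, relative_pc: int) -> int:
    """Return the address at which the ELF file was mapped."""
    return absolute_pc - relative_pc


# Frame #00: the pc register against the libc.so offset from the backtrace.
libc_base = load_base(0x0000007F7B54C984, 0x1B984)

# Frame #01: x30 (return address) minus one instruction against the toybox offset.
toybox_base = load_base(0x0000005555575B1C - 4, 0x20B18)

if __name__ == "__main__":
    print(hex(libc_base))
    print(hex(toybox_base))
    # Mappings start on page boundaries, so both values should be page aligned.
    print(libc_base % PAGE_SIZE == 0, toybox_base % PAGE_SIZE == 0)
```

Both results come out page aligned, which is a useful sanity check. With the bases known, any other address found in the registers or on the stack can be converted to a file offset and fed to addr2line or looked up in the objdump output.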