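Introduction
Crashes inside a .so (native shared library) are hard for most people to pinpoint. A colleague ran into one a while ago and I recorded the analysis at the time. This post tidies those notes into a case study.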
Locating the problem
First, here is the complete error log:
DEBUG: pid: 30636, tid: 30636, name: com.xxx.xxx >>> com.xxx.xxx <<<
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: uid: 10648
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xa
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: Cause: null pointer dereference
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x0 0000000000000002 x1 0000000000000000 x2 0000000000000002 x3 0000000000000000
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x4 0000007ff3739140 x5 000000773cb58b68 x6 28273d73686d683b x7 0000000000000000
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x8 c695b1dd923d50de x9 c695b1dd923d50de x10 0000000000430000 x11 ecae000000060106
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x12 0000000000000000 x13 0000000000651110 x14 0000000000526110 x15 0000000000000000
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x16 0000007a99f2f7d0 x17 0000007789e49b88 x18 0000007a9c8c0000 x19 0000000000000000
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x20 0000000000000000 x21 00000079159b0380 x22 0000007a9c24f000 x23 00000079159b0438
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x24 000000779c11f6f8 x25 0000007a9c24f000 x26 00000000000000e7 x27 0000000000000009
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: x28 0000000000000004 x29 0000007ff3738aa0
2022-03-21 14:58:07.484 5058-5058/? A/DEBUG: lr 0000007789e564fc sp 0000007ff3738a90 pc 0000007789e62448 pst 0000000040001000
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: backtrace:
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #00 pc 000000000001f448 /data/app/~~2xZag9J2zy9KNGoxlnOpiw==/com.xxx.xxx-TbIX56VZqQLnC73_TkryCQ==/lib/arm64/libmarsxlog.so (BuildId: 510be5e1f6d6cbd2f3ba8a437739e14791700304)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #01 pc 00000000000134f8 /data/app/~~2xZag9J2zy9KNGoxlnOpiw==/com.xxx.xxx-TbIX56VZqQLnC73_TkryCQ==/lib/arm64/libmarsxlog.so (BuildId: 510be5e1f6d6cbd2f3ba8a437739e14791700304)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #02 pc 000000000013740c /data/app/~~2xZag9J2zy9KNGoxlnOpiw==/com.xxx.xxx-TbIX56VZqQLnC73_TkryCQ==/oat/arm64/base.odex (art_jni_trampoline+140)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #03 pc 0000000000134564 /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+548) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #04 pc 0000000000198e94 /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+204) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #05 pc 000000000030c254 /apex/com.android.art/lib64/libart.so (art::interpreter::ArtInterpreterToCompiledCodeBridge(art::Thread*, art::ArtMethod*, art::ShadowFrame*, unsigned short, art::JValue*)+376) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #06 pc 000000000030736c /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+884) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #07 pc 0000000000641910 /apex/com.android.art/lib64/libart.so (MterpInvokeVirtualQuick+708) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #08 pc 0000000000132594 /apex/com.android.art/lib64/libart.so (mterp_op_invoke_virtual_quick+20) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #09 pc 00000000017df6a0 /data/app/~~2xZag9J2zy9KNGoxlnOpiw==/com.xxx.xxx-TbIX56VZqQLnC73_TkryCQ==/oat/arm64/base.vdex (k.v.i.a.a.d.<init>+472)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #10 pc 000000000063d838 /apex/com.android.art/lib64/libart.so (MterpInvokeDirect+1164) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #11 pc 000000000012e914 /apex/com.android.art/lib64/libart.so (mterp_op_invoke_direct+20) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #12 pc 00000000017df326 /data/app/~~2xZag9J2zy9KNGoxlnOpiw==/com.xxx.xxx-TbIX56VZqQLnC73_TkryCQ==/oat/arm64/base.vdex (k.v.i.a.a.d.a+26)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #13 pc 00000000002fed48 /apex/com.android.art/lib64/libart.so (art::interpreter::Execute(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame&, art::JValue, bool, bool) (.llvm.7983234147973590803)+268) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #14 pc 0000000000306a10 /apex/com.android.art/lib64/libart.so (art::interpreter::ArtInterpreterToInterpreterBridge(art::Thread*, art::CodeItemDataAccessor const&, art::ShadowFrame*, art::JValue*)+200) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #15 pc 0000000000307350 /apex/com.android.art/lib64/libart.so (bool art::interpreter::DoCall<false, false>(art::ArtMethod*, art::Thread*, art::ShadowFrame&, art::Instruction const*, unsigned short, art::JValue*)+856) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #16 pc 000000000063de5c /apex/com.android.art/lib64/libart.so (MterpInvokeStatic+548) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #17 pc 000000000012e994 /apex/com.android.art/lib64/libart.so (mterp_op_invoke_static+20) (BuildId: 590bc1a07a50a63c196efb048a7125f4)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG: #18 pc 00000000017df00a /data/app/~~2xZag9J2zy9KNGoxlnOpiw==/com.xxx.xxx-TbIX56VZqQLnC73_TkryCQ==/oat/arm64/base.vdex (k.v.i.a.a.c.<init>+10)
2022-03-21 14:58:07.631 5058-5058/? A/DEBUG:
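#00 pc 000000000001f448
is the instruction address of the topmost frame, given as an offset into libmarsxlog.so. It is the instruction that was executing when the crash happened.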
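The backtrace pc is a small offset, while the pc in the register dump is a full virtual address. The sketch below relates the two using values copied from the log. It assumes the backtrace pc is measured from the start of the mapped library, and the helper name load_base is only illustrative.

```python
# Values copied from the crash log above.
PC_REGISTER = 0x0000007789E62448  # absolute pc from the register dump
LR_REGISTER = 0x0000007789E564FC  # absolute lr (return address into the caller)
FRAME0_PC = 0x1F448               # "#00 pc" in the backtrace
FRAME1_PC = 0x134F8               # "#01 pc" in the backtrace


def load_base(absolute_pc: int, relative_pc: int) -> int:
    """Address at which the library appears to be mapped (illustrative helper)."""
    return absolute_pc - relative_pc


base = load_base(PC_REGISTER, FRAME0_PC)
print(hex(base))                # 0x7789e43000
print(hex(LR_REGISTER - base))  # 0x134fc
```

The lr offset, 0x134fc, sits one 4-byte instruction after frame #01's 0x134f8. That is consistent with the unwinder reporting the call instruction rather than the return address.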
There are two cases to consider: the library has debug symbols, or it does not.
With debug symbols
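Use addr2line to find the source location. -C demangles C++ function names (undoes name mangling), -f prints the function name, -e gives the path of the file that contains the symbol table, and 18120 is the address the pc register points to (addr2line reads it as hexadecimal). The example below comes from a different project whose build output still has its symbols.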
╭─kevin@dell ~/AndroidStudioProjects/AndroidFFmpegLabs ‹master●›
╰─$ addr2line -C -f -e ./ffmpeg/build/intermediates/cmake/debug/obj/armeabi-v7a/libkxffmpeg.so 18120
FFmpegDecoder::get_h264_nalu(char const*, unsigned int, unsigned char*, unsigned int*, unsigned char*, unsigned int*, bool*)
/home/kevin/AndroidStudioProjects/AndroidFFmpegLabs/ffmpeg/src/main/cpp/ffmpeg_decoder.cpp:44
The output points to line 44 of ffmpeg_decoder.cpp, in the function FFmpegDecoder::get_h264_nalu(char const*, unsigned int, unsigned char*, unsigned int*, unsigned char*, unsigned int*, bool*).
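When an unstripped copy of the crashing library is available, the same addr2line call can be built straight from a backtrace line. The following is a minimal sketch under that assumption. addr2line_command and symbols_dir are hypothetical names, and the frame line is a shortened stand-in for the real one.

```python
import re
from pathlib import PurePosixPath
from typing import List

# Matches native frames such as:
#   "#00 pc 000000000001f448 /data/app/<pkg>/lib/arm64/libmarsxlog.so (BuildId: ...)"
FRAME_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-fA-F]+)\s+(\S+)")


def addr2line_command(frame_line: str, symbols_dir: str) -> List[str]:
    """Build an addr2line argv for one backtrace line.

    symbols_dir is a hypothetical local directory that holds the
    unstripped copy of the library named in the frame.
    """
    match = FRAME_RE.search(frame_line)
    if match is None:
        raise ValueError("not a backtrace frame: " + frame_line)
    _, pc_hex, device_path = match.groups()
    # Keep only the file name; the on-device directory does not exist locally.
    local_so = PurePosixPath(symbols_dir) / PurePosixPath(device_path).name
    # addr2line interprets addresses as hexadecimal.
    return ["addr2line", "-C", "-f", "-e", str(local_so), hex(int(pc_hex, 16))]


# Shortened form of frame #00 from the log; the directory is an assumption.
line = "A/DEBUG: #00 pc 000000000001f448 /data/app/com.xxx.xxx/lib/arm64/libmarsxlog.so"
print(addr2line_command(line, "obj/arm64-v8a"))
# ['addr2line', '-C', '-f', '-e', 'obj/arm64-v8a/libmarsxlog.so', '0x1f448']
```

The returned list can be handed to subprocess.run. Building the argv separately keeps the parsing step easy to check without the toolchain installed.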
Without debug symbols
The error log above falls into this case, so the .so has to be disassembled and decompiled to locate the fault.
Decompiling the .so with IDA
Installing and using IDA is covered by plenty of articles online, so it is not repeated here.
IDA shows that the faulting location, #00 pc 000000000001f448, corresponds to the instruction LDR X0,[X0,#8], as shown in the figure below:
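According to the instruction reference, LDR X0,[X0,#8] dereferences a pointer: it loads 8 bytes from the address X0 + 8 into X0. The register dump shows x0 = 2, which is not a valid pointer, so the load faults. This also explains the fault address in the log: fault addr 0xa = 2 + 8.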
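The arithmetic can be checked against the log itself. This small sketch parses the x0 value and the fault address from two log lines (timestamps trimmed) and confirms they differ by the #8 offset of the LDR. The helper names are illustrative only.

```python
import re

# Two lines copied from the crash log above (timestamps trimmed).
FAULT_LINE = "signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xa"
REG_LINE = ("x0 0000000000000002 x1 0000000000000000 "
            "x2 0000000000000002 x3 0000000000000000")


def parse_registers(line: str) -> dict:
    """Map register names to integer values for one register-dump line."""
    pairs = re.findall(r"\b(x\d+|lr|sp|pc)\s+([0-9a-f]{16})\b", line)
    return {name: int(value, 16) for name, value in pairs}


def fault_address(line: str) -> int:
    """Extract the faulting address from the signal line."""
    return int(re.search(r"fault addr (0x[0-9a-f]+)", line).group(1), 16)


regs = parse_registers(REG_LINE)
# LDR X0, [X0, #8] reads 8 bytes at (X0 + 8).
effective = regs["x0"] + 8
print(hex(effective), effective == fault_address(FAULT_LINE))  # 0xa True
```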
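Next, look at the caller, #01 pc 00000000000134f8. Pressing F5 in IDA decompiles the function into more readable C-like pseudocode, which shows that the pointer is passed in as an argument of Java_com_tencent_mars_xlog_Xlog_setConsoleLogOpen.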
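Then trace further up to the caller on the Java side, which is the code shown below.
Following the call chain one step further reveals where things go wrong.
The 2 passed here is the source of the pointer value 2.
As the figure shows, the crash comes from misusing the function: the intent was to set the log level, but this function is not meant for setting the log level.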
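Summary
The most reliable way to locate a problem like this is to start from the error log and trace upward, one step at a time, until the root cause appears. That is exactly how the problem above was tracked down.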