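1. Problem description
After 5 to 6 hours of screen-rotation testing, system_server aborts and the phone reboots, which stops the test. The issue has been reported on several projects based on the Android 7.0 platform.
According to the test team, the failure is more likely when the screen is rotated while the Settings screen is open.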
2. Problem analysis
The tombstone shows that system_server aborted because of a global reference table overflow.
pid: 3749, tid: 8164, name: Binder:3749_E  >>> system_server <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
Abort message: 'vendor/intel/art-extension/runtime/indirect_reference_table.cc:125] JNI ERROR (app bug): global reference table overflow (max=51200)'
    rax 0000000000000000  rbx 00007f0567b034f8  rcx 00007f0590c28f37  rdx 0000000000000006
    rsi 0000000000001fe4  rdi 0000000000000ea5
    r8  ffffffffffffffd8  r9  00007f0567b01790  r10 0000000000000008  r11 0000000000000206
    r12 0000000000001fe4  r13 0000000000000006  r14 00007f0567b024b0  r15 00007f05902d0800
    cs  0000000000000033  ss  000000000000002b
    rip 00007f0590c28f37  rbp 000000000000000b  rsp 00007f0567b02358  eflags 0000000000000206
backtrace:
    #00 pc 000000000008ef37  /system/lib64/libc.so (tgkill+7)
    #01 pc 000000000008b952  /system/lib64/libc.so (pthread_kill+66)
    #02 pc 0000000000030371  /system/lib64/libc.so (raise+17)
    #03 pc 000000000002875e  /system/lib64/libc.so (abort+78)
    #04 pc 0000000000572f29  /system/lib64/libart.so (_ZN3art7Runtime5AbortEv+361)
    #05 pc 00000000001d2091  /system/lib64/libart.so (_ZN3art10LogMessageD1Ev+817)
    #06 pc 000000000036b5dd  /system/lib64/libart.so (_ZN3art22IndirectReferenceTable3AddEjPNS_6mirror6ObjectE+669)
    #07 pc 00000000004201e9  /system/lib64/libart.so (_ZN3art9JavaVMExt12AddGlobalRefEPNS_6ThreadEPNS_6mirror6ObjectE+57)
    #08 pc 000000000045cd3b  /system/lib64/libart.so
A leak of this kind can originate in either native code or Java code, so the log has to be examined in detail.
01-01 23:45:55.469 3749 8164 F art : vendor/intel/art-extension/runtime/indirect_reference_table.cc:125] 6202 of com.android.server.print.UserState$4 (6202 unique instances)
01-01 23:45:55.469 3749 8164 F art : vendor/intel/art-extension/runtime/indirect_reference_table.cc:125] 6198 of com.android.server.print.UserState$3 (6198 unique instances)
01-01 23:45:55.465 3749 8164 F art : vendor/intel/art-extension/runtime/indirect_reference_table.cc:125] 18626 of android.os.RemoteCallbackList$Callback (18626 unique instances)
01-01 23:45:55.463 3749 8164 F art : vendor/intel/art-extension/runtime/indirect_reference_table.cc:125] 19147 of java.lang.ref.WeakReference (19147 unique instances)
18626 + 19147 + 6202 + 6198 = 50173
The log shows that the overflow comes from an excessive number of instances of these four types. Together they hold 50173 global references. Adding the references held by all other objects pushes the total past the configured limit of 51200, at which point the runtime aborts deliberately.
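The sum above can be double-checked with a small script. The following is only a sketch: the function name `summarize` and the regular expression are illustrative assumptions, and the line shape it matches is inferred solely from the four summary lines quoted in this post (the log prefix is shortened here for readability).

```python
import re

# Summary lines taken from the log above; the file/line prefix is shortened.
DUMP = """\
indirect_reference_table.cc:125] 6202 of com.android.server.print.UserState$4 (6202 unique instances)
indirect_reference_table.cc:125] 6198 of com.android.server.print.UserState$3 (6198 unique instances)
indirect_reference_table.cc:125] 18626 of android.os.RemoteCallbackList$Callback (18626 unique instances)
indirect_reference_table.cc:125] 19147 of java.lang.ref.WeakReference (19147 unique instances)
"""

# Matches "<count> of <class> (<n> unique instances)".
# Assumption: every summary line of interest has this shape.
SUMMARY_RE = re.compile(r"(\d+) of (\S+?)\s*\(\d+ unique instances\)")


def summarize(dump_text, limit=51200):
    """Return (per-class counts, total, remaining slots below the limit)."""
    counts = {}
    for match in SUMMARY_RE.finditer(dump_text):
        class_name = match.group(2)
        counts[class_name] = counts.get(class_name, 0) + int(match.group(1))
    total = sum(counts.values())
    return counts, total, limit - total


if __name__ == "__main__":
    counts, total, headroom = summarize(DUMP)
    # Print the largest holders first.
    for class_name, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(f"{count:6d}  {class_name}")
    print(f"total={total}, headroom={headroom}")
```

For the four lines quoted here the total is 50173, which leaves only 1027 of the 51200 slots for every other global reference in system_server.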
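This log alone does not tell us which code path is at fault, because UserState, RemoteCallbackList and WeakReference are all widely used classes. We need the call stack at the time of the failure to locate the problem.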