Yesterday I ran into a Runtime abort caused by a Suspend All timeout during GC.
While I was at it, I looked into how suspend works and how the timeout check works.
Part 1: the suspend mechanism
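When a process receives signal 3 (SIGQUIT), when GC runs, or when a debugger tries to attach, its threads get suspended. So how is suspend actually implemented?
First, look at a thread dump: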
"android.fg" prio=5 tid=19 Native
| group="" sCount=1 dsCount=0 obj=0x12e5b740 self=0xb482f000
| sysTid=590 nice=0 cgrp=default sched=0/0 handle=0xb4926d80
| state=S schedstat=( 0 0 0 ) utm=59 stm=63 core=0 HZ=100
| stack=0xa39f2000-0xa39f4000 stackSize=1036KB
| held mutexes=
native: #00 pc 000133cc /system/lib/libc.so (syscall+28)
native: #01 pc 000a99eb /system/lib/libart.so (art::ConditionVariable::Wait(art::Thread*)+82)
native: #02 pc 0027c8a5 /system/lib/libart.so (art::GoToRunnable(art::Thread*)+756)
native: #03 pc 00087679 /system/lib/libart.so (art::JniMethodEnd(unsigned int, art::Thread*)+8)
native: #04 pc 000b39d5 /data/dalvik-cache/arm/system@framework@boot.oat (Java_android_os_MessageQueue_nativePollOnce__JI+112)
at android.os.MessageQueue.nativePollOnce(Native method)
at android.os.MessageQueue.next(MessageQueue.java:143)
at android.os.Looper.loop(Looper.java:122)
at android.os.HandlerThread.run(HandlerThread.java:61)
at com.android.server.ServiceThread.run(ServiceThread.java:46)
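This thread is returning from a JNI method call and wants to switch from the Native state to Runnable. It finds that its suspend flag is set, so it enters a condition wait and stays there until it is woken up.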
ART threads use two functions, TransitionFromSuspendedToRunnable and TransitionFromRunnableToSuspended, to switch between Runnable and Suspended (or other) states.
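The behavior seen in the dump (a thread leaving native code, finding a suspend request pending, and parking on a condition variable) can be illustrated with a toy model. This is only a minimal sketch under simplifying assumptions: ToyThread, ToyState, RequestSuspend, Resume and DemoSuspendBlocksTransition are hypothetical names made up for this example, and a single mutex plus a counter stand in for whatever ART really does internally. It is not the ART implementation.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy model only: the names echo the ART concepts, but this is not ART code.
enum class ToyState { kNative, kRunnable };

class ToyThread {
 public:
  // Called by the suspender (for example a GC thread): raise the suspend count.
  void RequestSuspend() {
    std::lock_guard<std::mutex> lock(mu_);
    ++suspend_count_;
  }

  // Called by the suspender when it is done: lower the count and wake waiters.
  void Resume() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      --suspend_count_;
    }
    resume_cond_.notify_all();
  }

  // Called by the thread itself when it leaves native code.
  // It waits on the condition variable while a suspend request is pending.
  void TransitionFromSuspendedToRunnable() {
    std::unique_lock<std::mutex> lock(mu_);
    resume_cond_.wait(lock, [this] { return suspend_count_ == 0; });
    state_ = ToyState::kRunnable;
  }

  // Called by the thread itself when it enters native code.
  void TransitionFromRunnableToSuspended() {
    std::lock_guard<std::mutex> lock(mu_);
    state_ = ToyState::kNative;
  }

  ToyState state() {
    std::lock_guard<std::mutex> lock(mu_);
    return state_;
  }

 private:
  std::mutex mu_;
  std::condition_variable resume_cond_;
  int suspend_count_ = 0;               // plays the role of sCount in the dump
  ToyState state_ = ToyState::kNative;  // starts out "in native code"
};

// Returns true if the worker stayed in kNative while a suspend request was
// pending and only became kRunnable after Resume() was called.
bool DemoSuspendBlocksTransition() {
  ToyThread t;
  t.RequestSuspend();  // the suspender sets the flag first
  std::thread worker([&t] { t.TransitionFromSuspendedToRunnable(); });
  // Give the worker time to reach the wait; correctness does not depend on it.
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  const bool blocked = (t.state() == ToyState::kNative);
  t.Resume();
  worker.join();
  return blocked && t.state() == ToyState::kRunnable;
}
```

Calling DemoSuspendBlocksTransition() from a test or from main shows the same shape as the "android.fg" dump: the worker sits in the wait with a nonzero suspend count until the suspender resumes it.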
A thread can be in any of the following common states:
enum ThreadState {
  //                               Thread.State   JDWP state
  kTerminated = 66,             // TERMINATED     TS_ZOMBIE    Thread.run has returned, but Thread* still around
  kRunnable,                    // RUNNABLE       TS_RUNNING   runnable
  kTimedWaiting,                // TIMED_WAITING  TS_WAIT      in Object.wait() with a timeout
  kSleeping,                    // TIMED_WAITING  TS_SLEEPING  in Thread.sleep()
  kBlocked,                     // BLOCKED        TS_MONITOR   blocked on a monitor
  kWaiting,                     // WAITING        TS_WAIT      in Object.wait()
  kWaitingForGcToComplete,      // WAITING        TS_WAIT      blocked waiting for GC
  kWaitingForCheckPointsToRun,  // WAITING        TS_WAIT      GC waiting for checkpoints to run
  kWaitingPerformingGc,         // WAITING        TS_WAIT      performing GC
  kWaitingForDebuggerSend,      // WAIT