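This article covers exception handling in ART: how ART intercepts and handles the SIGSEGV signal, how the implicit suspend check is implemented, and how ordinary Java exceptions are detected and thrown in ART. The detection and throwing of StackOverflowError and NullPointerException, and the implementation of throw-catch, are more involved. They were originally part of this article, but it grew too long, so these three topics were split out into separate articles:
ART Exception Handling (2) - StackOverflowError Implementation
ART Exception Handling (3) - NullPointerException Implementation
ART Exception Handling (4) - throw & catch & finally Implementation
In fact, the two main kinds of exception in ART are both implemented by raising a SIGSEGV signal and intercepting it. So we first work through how ART intercepts SIGSEGV.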
1. FaultManager initialization
ART handles Linux signals through FaultManager. A signal of interest first goes to ART's own handler function, art_fault_handler. When ART recognizes the situation, it converts the signal into a Java exception that a Java engineer can understand and handle. FaultManager is initialized while the virtual machine starts up, so Java exceptions can be handled as soon as startup completes. The relevant code is in Runtime::Init in runtime.cc:

    bool Runtime::Init(RuntimeArgumentMap&& runtime_options_in) {
      ...
      fault_manager.Init();
      if (implicit_suspend_checks_) {
        new SuspensionHandler(&fault_manager);
      }
      if (implicit_so_checks_) {
        new StackOverflowHandler(&fault_manager);
      }
      if (implicit_null_checks_) {
        new NullPointerHandler(&fault_manager);
      }
      if (kEnableJavaStackTraceHandler) {
        new JavaStackTraceHandler(&fault_manager);
      }
      ...
    }

fault_manager.Init() installs the signal handler that intercepts the specific signals ART cares about (SIGSEGV, in the code that follows). Giving these signals special treatment is how the Java exceptions are implemented. Each handler created after it adds itself, in its constructor, to fault_manager's collection of signal handlers, so that it can process specific signals later.
FaultManager::Init() is implemented as follows:

    void FaultManager::Init() {
      CHECK(!initialized_);
      sigset_t mask;
      sigfillset(&mask);
      sigdelset(&mask, SIGABRT);
      sigdelset(&mask, SIGBUS);
      sigdelset(&mask, SIGFPE);
      sigdelset(&mask, SIGILL);
      sigdelset(&mask, SIGSEGV);
      SigchainAction sa = {
        .sc_sigaction = art_fault_handler,
        .sc_mask = mask,
        .sc_flags = 0UL,
      };
      AddSpecialSignalHandlerFn(SIGSEGV, &sa);
      initialized_ = true;
    }

The resulting mask contains every signal except SIGABRT, SIGBUS, SIGFPE, SIGILL and SIGSEGV, so these five stay unblocked while art_fault_handler runs.
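The mask construction can be tried in isolation. The following is a minimal sketch, assuming a POSIX system; BuildFaultHandlerMask is a hypothetical helper name used only for illustration and is not part of ART.

```cpp
#include <cassert>
#include <initializer_list>
#include <signal.h>

// Builds a mask that blocks every signal except the synchronous fault
// signals, mirroring the sigfillset/sigdelset sequence in FaultManager::Init().
// BuildFaultHandlerMask is a hypothetical name, not an ART function.
inline sigset_t BuildFaultHandlerMask() {
  sigset_t mask;
  sigfillset(&mask);
  for (int signo : {SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGSEGV}) {
    sigdelset(&mask, signo);
  }
  return mask;
}
```

Calling sigismember on the result shows that SIGSEGV is not in the mask while an unrelated signal such as SIGUSR1 is.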
Only SIGSEGV is passed to AddSpecialSignalHandlerFn:

    extern "C" void AddSpecialSignalHandlerFn(int signal, SigchainAction* sa) {
      InitializeSignalChain();
      if (signal <= 0 || signal >= _NSIG) {
        fatal("Invalid signal %d", signal);
      }
      // Set the managed_handler.
      chains[signal].AddSpecialHandler(sa);
      chains[signal].Claim(signal);
    }
chains is an array with one entry per Linux signal number:

    static SignalChain chains[_NSIG];

InitializeSignalChain looks up the next definitions of sigaction and sigprocmask in the library search order, which are libc's, and saves them for later use:

    __attribute__((constructor)) static void InitializeSignalChain() {
      ...
      void* linked_sigaction = dlsym(RTLD_NEXT, "sigaction");
      void* linked_sigprocmask = dlsym(RTLD_NEXT, "sigprocmask");
      ...
    }

As I recall, earlier Android versions added libsigchain.so to the load path through LD_PRELOAD in init.rc. That seems to have changed since, and I have not studied the new mechanism.
The sigchain library implements its own sigaction and sigprocmask. libsigchain.so is placed ahead of libc in the load order by something like LD_PRELOAD, so a call to sigaction or sigprocmask reaches libsigchain's versions rather than libc's. dlsym(RTLD_NEXT, "***") retrieves pointers to libc's versions for later use. Put simply, this hooks the two libc functions: every caller ends up in the sigaction and sigprocmask implemented by libsigchain.

    extern "C" int sigaction(int signal, const struct sigaction* new_action,
                             struct sigaction* old_action) {
      InitializeSignalChain();
      if (signal < 0 || signal >= _NSIG) {
        errno = EINVAL;
        return -1;
      }
      if (chains[signal].IsClaimed()) {
        struct sigaction saved_action = chains[signal].GetAction();
        if (new_action != nullptr) {
          chains[signal].SetAction(new_action);
        }
        if (old_action != nullptr) {
          *old_action = saved_action;
        }
        return 0;
      }
      // Will only get here if the signal chain has not been claimed.  We want
      // to pass the sigaction on to the kernel via the real sigaction in libc.
      return linked_sigaction(signal, new_action, old_action);
    }

As the code shows, a call to sigaction enters libsigchain's sigaction. If the signal is one we care about (it has been claimed), the new action is not installed in the kernel. It is stored in the action_ member of that signal's SignalChain, and the previously saved action is returned through old_action. If the signal is not one we care about, the call goes on to libc's sigaction, which really installs the new action in the kernel.
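signal() is hooked as well, for the same purpose as sigaction. The only difference is that its default path calls libc's sigaction rather than libc's signal(). The sigprocmask in sigchainlib is similar: when a program calls sigprocmask with SIG_BLOCK to block a signal we care about, that signal is removed from the mask first, so that our handling is not affected.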
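The mask filtering described above can be sketched as a pure function. This is only an illustration: FilterClaimedSignals and its claimed parameter are hypothetical names, and applying the filter to SIG_SETMASK as well as SIG_BLOCK is an assumption here, since the text only mentions SIG_BLOCK.

```cpp
#include <cassert>
#include <signal.h>

// Returns a copy of `requested` with every signal in `claimed` removed when
// the request would block signals. Treating SIG_SETMASK like SIG_BLOCK is an
// assumption of this sketch. For SIG_UNBLOCK the set is returned unchanged.
inline sigset_t FilterClaimedSignals(int how, const sigset_t& requested,
                                     const sigset_t& claimed) {
  sigset_t filtered = requested;
  if (how == SIG_BLOCK || how == SIG_SETMASK) {
    for (int signo = 1; signo < _NSIG; ++signo) {
      if (sigismember(&claimed, signo) == 1) {
        sigdelset(&filtered, signo);
      }
    }
  }
  return filtered;
}
```

A hooked sigprocmask would pass the filtered set on to the real libc function, so a caller can never block a claimed signal such as SIGSEGV.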
Next come chains[signal].AddSpecialHandler(sa) and Claim(signal):

    SigchainAction special_handlers_[2];

    void AddSpecialHandler(SigchainAction* sa) {
      for (SigchainAction& slot : special_handlers_) {
        if (slot.sc_sigaction == nullptr) {
          slot = *sa;
          return;
        }
      }
      fatal("too many special signal handlers");
    }

This copies the SigchainAction initialized in FaultManager::Init, whose sc_sigaction is art_fault_handler, into the first free slot of special_handlers_. There are two slots in total. Then comes Claim(SIGSEGV):

    void Claim(int signo) {
      if (!claimed_) {
        Register(signo);
        claimed_ = true;
      }
    }

    void Register(int signo) {
      struct sigaction handler_action = {};
      handler_action.sa_sigaction = SignalChain::Handler;
      handler_action.sa_flags = SA_RESTART | SA_SIGINFO | SA_ONSTACK;
      sigfillset(&handler_action.sa_mask);
      linked_sigaction(signo, &handler_action, &action_);
    }

When Claim registers the signal, it goes through linked_sigaction, which is libc's sigaction(). This installs void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) as the handler for SIGSEGV.
In summary, fault_manager.Init() uses the sigchain functions to install SignalChain::Handler as the SIGSEGV handler. sigchain also hooks four libc functions, sigaction(), sigprocmask(), signal() and, on 32-bit, bsd_signal(), so that other code cannot undo this setup.
SignalChain::Handler() first calls art_fault_handler to try to handle SIGSEGV. If that fails, it falls back to the saved sigaction (the default one, or one set by the application). Back in Runtime::Init(), the statements after fault_manager.Init(), such as new SuspensionHandler(&fault_manager);, add each handler to FaultManager's handler collections from within its constructor (generated_code_handlers_ for the three implicit-check handlers). art_fault_handler later uses these handlers to try to handle SIGSEGV. For example:

    NullPointerHandler::NullPointerHandler(FaultManager* manager) : FaultHandler(manager) {
      manager_->AddHandler(this, true);
    }

The second argument passed is true:

    void FaultManager::AddHandler(FaultHandler* handler, bool generated_code) {
      DCHECK(initialized_);
      if (generated_code) {
        generated_code_handlers_.push_back(handler);
      } else {
        other_handlers_.push_back(handler);
      }
    }

At this point NullPointerHandler has been added to generated_code_handlers_. Now look at art_fault_handler:

    static bool art_fault_handler(int sig, siginfo_t* info, void* context) {
      return fault_manager.HandleFault(sig, info, context);
    }

    bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
      ...
      if (IsInGeneratedCode(info, context, true)) {
        for (const auto& handler : generated_code_handlers_) {
          VLOG(signals) << "invoking Action on handler " << handler;
          if (handler->Action(sig, info, context)) {
            return true;
          }
        }
      ...
    }
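Summary: FaultManager initialization accomplishes three things:
- SIGSEGV must go through ART first
- When ART handles SIGSEGV, art_fault_handler mainly works through generated_code_handlers_
- NullPointerHandler and the other handlers are added to generated_code_handlers_
So overall, ART's support for these implicitly checked Java exceptions rests on the SIGSEGV signal. Exceptions that are detected by explicit checks in the compiled code, such as ArrayIndexOutOfBoundsException, are covered in section 7.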
2. SIGSEGV handling in ART
As established above, SIGSEGV is handled by SignalChain::Handler:

    void SignalChain::Handler(int signo, siginfo_t* siginfo, void* ucontext_raw) {
      if (!GetHandlingSignal()) {
        for (const auto& handler : chains[signo].special_handlers_) {
          if (handler.sc_sigaction == nullptr) {
            break;
          }
          bool handler_noreturn = (handler.sc_flags & SIGCHAIN_ALLOW_NORETURN);
          sigset_t previous_mask;
          linked_sigprocmask(SIG_SETMASK, &handler.sc_mask, &previous_mask);
          ScopedHandlingSignal restorer;
          if (!handler_noreturn) {
            SetHandlingSignal(true);
          }
          if (handler.sc_sigaction(signo, siginfo, ucontext_raw)) {
            return;
          }
          linked_sigprocmask(SIG_SETMASK, &previous_mask, nullptr);
        }
      }
      // Forward to the user's signal handler.
      int handler_flags = chains[signo].action_.sa_flags;
      ucontext_t* ucontext = static_cast<ucontext_t*>(ucontext_raw);
      sigset_t mask;
      sigorset(&mask, &ucontext->uc_sigmask, &chains[signo].action_.sa_mask);
      if (!(handler_flags & SA_NODEFER)) {
        sigaddset(&mask, signo);
      }
      linked_sigprocmask(SIG_SETMASK, &mask, nullptr);
      if ((handler_flags & SA_SIGINFO)) {
        chains[signo].action_.sa_sigaction(signo, siginfo, ucontext_raw);
      } else {
        auto handler = chains[signo].action_.sa_handler;
        if (handler == SIG_IGN) {
          return;
        } else if (handler == SIG_DFL) {
          fatal("exiting due to SIG_DFL handler for signal %d", signo);
        } else {
          handler(signo);
        }
      }
    }
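What this function does:
- If the current thread is not already handling a signal, it tries the sc_sigaction function of each special handler, that is, it tries art_fault_handler on the SIGSEGV
- If art_fault_handler can handle the signal, the function returns once handling is done
- If it cannot, the saved action (action_) stored in the SignalChain for SIGSEGV is called. If the application installed a handler for this signal, that handler is called. If not, the sigaction set up by the linker should be used, and the signal ends up being handled by debuggerd.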
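The dispatch order in the list above can be modeled without real signals. The sketch below is a toy model only: ToySignalChain, its members, and the returned strings are hypothetical, and the per-thread "handling signal" flag is reduced to a plain bool.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy model of the dispatch order: special handlers first, then the saved
// action. All names here are hypothetical, not ART or libsigchain APIs.
struct ToySignalChain {
  std::vector<std::function<bool(int)>> special_handlers;  // e.g. art_fault_handler
  std::function<void(int)> saved_action;                   // app or default handler
  bool handling_signal = false;                            // per-thread in real code

  // Returns a label naming whoever consumed the signal.
  std::string Dispatch(int signo) {
    if (!handling_signal) {
      handling_signal = true;
      for (auto& handler : special_handlers) {
        if (handler(signo)) {
          handling_signal = false;
          return "special";
        }
      }
      handling_signal = false;
    }
    if (saved_action) {
      saved_action(signo);
      return "saved";
    }
    return "none";
  }
};
```

A special handler that returns true ends the dispatch. One that returns false lets the signal fall through to the saved action, which is how an unrecognized SIGSEGV still reaches the application or debuggerd.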
So in the usual case, a SIGSEGV first reaches this function and then art_fault_handler:

    // Signal handler called on SIGSEGV.
    static bool art_fault_handler(int sig, siginfo_t* info, void* context) {
      return fault_manager.HandleFault(sig, info, context);
    }

    bool FaultManager::HandleFault(int sig, siginfo_t* info, void* context) {
      VLOG(signals) << "Handling fault";
    #ifdef TEST_NESTED_SIGNAL
      // Simulate a crash in a handler.
      raise(SIGSEGV);
    #endif
      if (IsInGeneratedCode(info, context, true)) {
        VLOG(signals) << "in generated code, looking for handler";
        for (const auto& handler : generated_code_handlers_) {
          VLOG(signals) << "invoking Action on handler " << handler;
          if (handler->Action(sig, info, context)) {
            // We have handled a signal so it's time to return from the
            // signal handler to the appropriate place.
            return true;
          }
        }
        // We hit a signal we didn't handle.  This might be something for which
        // we can give more information about so call all registered handlers to
        // see if it is.
        if (HandleFaultByOtherHandlers(sig, info, context)) {
          return true;
        }
      }
      // Set a breakpoint in this function to catch unhandled signals.
      art_sigsegv_fault();
      return false;
    }

HandleFault first calls IsInGeneratedCode() to decide whether the SIGSEGV occurred in generated code, that is, in native code compiled from Java code. Only if it did are generated_code_handlers_ and then the other handlers tried in turn.
Here is the key code of this function, which decides whether we are in generated code:

    bool FaultManager::IsInGeneratedCode(siginfo_t* siginfo, void* context, bool check_dex_pc) {
      ...
      ThreadState state = thread->GetState();
      if (state != kRunnable) {
        return false;
      }
      if (!Locks::mutator_lock_->IsSharedHeld(thread)) {
        return false;
      }
      GetMethodAndReturnPcAndSp(siginfo, context, &method_obj, &return_pc, &sp);
      const OatQuickMethodHeader* method_header = method_obj->GetOatQuickMethodHeader(return_pc);
      uint32_t dexpc = method_header->ToDexPc(method_obj, return_pc, false);
      return !check_dex_pc || dexpc != DexFile::kDexNoIndex;
    }

- If the SIGSEGV occurred in generated code, the current thread must be in the kRunnable state and hold mutator_lock_
- The ArtMethod is looked up from the current context; for a fault in generated code this lookup must succeed
- The dex_pc of the fault location is looked up from the ArtMethod; for a fault in generated code this should succeed as well
Suppose the SIGSEGV being handled did occur in generated code. The Action functions of generated_code_handlers_ and then of the other handlers are tried in turn. generated_code_handlers_ contains these handlers, in this order:

    if (implicit_suspend_checks_) {
      new SuspensionHandler(&fault_manager);
    }
    if (implicit_so_checks_) {
      new StackOverflowHandler(&fault_manager);
    }
    if (implicit_null_checks_) {
      new NullPointerHandler(&fault_manager);
    }

other_handlers_ contains only one handler:

    if (kEnableJavaStackTraceHandler) {
      new JavaStackTraceHandler(&fault_manager);
    }

So when a SIGSEGV arrives, these four handlers are tried in priority order, which is the order above. Each handler's Action function extracts some information from the current context and checks whether it matches what that handler expects. On a match the handler can deal with this SIGSEGV and returns true, and the remaining handlers are skipped. Otherwise the next handler gets its turn. If none of them can handle it, the signal goes to the default handling function and finally to debuggerd. Note also that each handler is only added under a condition. On a phone running Android 7.0, the switches had these values:

    (gdb) p 'art::Runtime'::instance_->implicit_suspend_checks_
    $2 = false
    (gdb) p 'art::Runtime'::instance_->implicit_so_checks_
    $3 = true
    (gdb) p 'art::Runtime'::instance_->implicit_null_checks_
    $4 = true

    static constexpr bool kEnableJavaStackTraceHandler = false;

So in a real runtime environment, a SIGSEGV only needs to be tried by StackOverflowHandler and NullPointerHandler. If neither can handle it, it goes to the handler installed by the linker.
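3. SuspensionHandler implementation
Each handler's logic lives mainly in its Action() function. Because Action needs to read the current context, these functions are platform specific: x86 handles the context differently from arm, and on arm the 32-bit and 64-bit versions differ too. Here we look at the 32-bit arm implementation.
These comment lines describe the principle behind SuspensionHandler:

    // A suspend check is done using the following instruction sequence:
    // 0xf723c0b2: f8d902c0  ldr.w   r0, [r9, #704]  ; suspend_trigger_
    // .. some intervening instruction
    // 0xf723c0b6: 6800      ldr     r0, [r0, #0]

To make a thread that is running generated code perform a suspend check, the thread's suspend_trigger_ is set to nullptr. Generated code first loads the suspend_trigger_ member with ldr.w r0, [r9, #704] (r9 holds the Thread pointer and 704 is the offset of suspend_trigger_ within Thread), then executes ldr r0, [r0, #0] to load from the address in r0. When the trigger is armed, suspend_trigger_ is 0, so this load raises SIGSEGV. SuspensionHandler::Action gets the first attempt at handling it, finds a match, and redirects execution to the suspend check.
Conversely, when no suspend check is needed, suspend_trigger_ is set to its own address, and the load does not raise SIGSEGV.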
Arming and disarming the suspend trigger:

    void TriggerSuspend() {
      tlsPtr_.suspend_trigger = nullptr;
    }

    void RemoveSuspendTrigger() {
      tlsPtr_.suspend_trigger = reinterpret_cast<uintptr_t*>(&tlsPtr_.suspend_trigger);
    }

The trigger is armed in three cases:
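- At the end of bool Thread::ModifySuspendCountInternal(), if the updated suspend_count is greater than 0, a suspend has been requested for this thread, and the sooner it happens the better. TriggerSuspend() is called so that the thread performs a suspend check while executing generated code and enters the suspended state
- After bool Thread::RequestCheckpoint(Closure* function) has successfully set a checkpoint function on a thread, it calls TriggerSuspend(), because a checkpoint function should also run as soon as possible. Once triggered, the suspend check first looks for a checkpoint function and runs it immediately if there is one
- bool Thread::RequestEmptyCheckpoint() also calls TriggerSuspend() on success; I have not looked into what EmptyCheckpoint is for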
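The self-pointer trick behind the trigger can be modeled in a few lines. This is a sketch under stated assumptions: ToyThread and SecondLoadWouldFault are hypothetical names, and the "fault" is only predicted by a null test rather than produced by a real load.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the thread-local slot; ToyThread is hypothetical and is
// not art::Thread.
struct ToyThread {
  uintptr_t* suspend_trigger = nullptr;

  ToyThread() { RemoveSuspendTrigger(); }
  // Copying would leave the trigger pointing into the source object.
  ToyThread(const ToyThread&) = delete;
  ToyThread& operator=(const ToyThread&) = delete;

  // Armed: the slot holds null, so loading through it would fault.
  void TriggerSuspend() { suspend_trigger = nullptr; }

  // Disarmed: the slot points at itself, so loading through it is harmless.
  void RemoveSuspendTrigger() {
    suspend_trigger = reinterpret_cast<uintptr_t*>(&suspend_trigger);
  }

  // Models the second load, ldr r0, [r0, #0]: it faults only on a null trigger.
  bool SecondLoadWouldFault() const { return suspend_trigger == nullptr; }
};
```

The benefit of this design is that the fast path in generated code is two loads and no branch; the cost of detecting a request is moved entirely into the signal handler.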
With this background, SuspensionHandler::Action is easy to follow:

    bool SuspensionHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* info ATTRIBUTE_UNUSED,
                                   void* context) {
      // These are the instructions to check for.  The first one is the ldr r0,[r9,#xxx]
      // where xxx is the offset of the suspend trigger.
      uint32_t checkinst1 = 0xf8d90000
          + Thread::ThreadSuspendTriggerOffset<PointerSize::k32>().Int32Value();
      uint16_t checkinst2 = 0x6800;

      struct ucontext* uc = reinterpret_cast<struct ucontext*>(context);
      struct sigcontext *sc = reinterpret_cast<struct sigcontext*>(&uc->uc_mcontext);
      uint8_t* ptr2 = reinterpret_cast<uint8_t*>(sc->arm_pc);
      uint8_t* ptr1 = ptr2 - 4;
      VLOG(signals) << "checking suspend";

      uint16_t inst2 = ptr2[0] | ptr2[1] << 8;
      VLOG(signals) << "inst2: " << std::hex << inst2 << " checkinst2: " << checkinst2;
      if (inst2 != checkinst2) {
        // Second instruction is not good, not ours.
        return false;
      }

      uint8_t* limit = ptr1 - 40;   // Compiler will hoist to a max of 20 instructions.
      bool found = false;
      while (ptr1 > limit) {
        uint32_t inst1 = ((ptr1[0] | ptr1[1] << 8) << 16) | (ptr1[2] | ptr1[3] << 8);
        VLOG(signals) << "inst1: " << std::hex << inst1 << " checkinst1: " << checkinst1;
        if (inst1 == checkinst1) {
          found = true;
          break;
        }
        ptr1 -= 2;      // Min instruction size is 2 bytes.
      }
      if (found) {
        sc->arm_lr = sc->arm_pc + 3;      // +2 + 1 (for thumb)
        sc->arm_pc = reinterpret_cast<uintptr_t>(art_quick_implicit_suspend);

        // Now remove the suspend trigger that caused this fault.
        Thread::Current()->RemoveSuspendTrigger();
        VLOG(signals) << "removed suspend trigger invoking test suspend";
        return true;
      }
      return false;
    }

Action first checks whether the instruction that triggered the SIGSEGV is 0x6800 (ldr r0, [r0, #0]); only then can it be a suspend check. It then searches the 40 bytes before that instruction for 0xf8d902c0 (ldr.w r0, [r9, #704]). As for why 40 bytes: the comments in the code say that the compiler hoists the first load by at most 20 instructions and that the minimum instruction size is 2 bytes, which gives 40. I have not studied the code generator side of this.
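The instruction matching done by SuspensionHandler::Action can be exercised on a plain byte buffer. The sketch below is an illustration, not ART code: LooksLikeSuspendCheck, code and fault_offset are hypothetical stand-ins for the pc taken from the signal context, and the scan is additionally bounded by the start of the buffer. The caller must supply at least fault_offset + 2 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Searches backwards from the faulting instruction for the first load of the
// suspend check, using the same little-endian Thumb-2 halfword decoding as
// the handler. Hypothetical helper for illustration only.
inline bool LooksLikeSuspendCheck(const uint8_t* code, size_t fault_offset,
                                  uint32_t checkinst1) {
  constexpr uint16_t kCheckInst2 = 0x6800;  // ldr r0, [r0, #0]
  constexpr size_t kMaxScanBytes = 40;      // 20 instructions * 2-byte minimum
  if (fault_offset < 4) {
    return false;  // No room for a 4-byte instruction before the fault.
  }
  const uint8_t* ptr2 = code + fault_offset;
  uint16_t inst2 = static_cast<uint16_t>(ptr2[0] | (ptr2[1] << 8));
  if (inst2 != kCheckInst2) {
    return false;
  }
  // Start 4 bytes before the fault and step back 2 bytes at a time,
  // never reading before the start of the buffer.
  size_t start = fault_offset - 4;
  for (size_t scanned = 0; scanned < kMaxScanBytes && scanned <= start; scanned += 2) {
    const uint8_t* ptr1 = code + (start - scanned);
    uint32_t hi = static_cast<uint32_t>(ptr1[0] | (ptr1[1] << 8));
    uint32_t lo = static_cast<uint32_t>(ptr1[2] | (ptr1[3] << 8));
    if (((hi << 16) | lo) == checkinst1) {
      return true;
    }
  }
  return false;
}
```

For example, a buffer holding the bytes d9 f8 c0 02 (ldr.w r0, [r9, #704]), 00 bf (a Thumb nop) and 00 68 (ldr r0, [r0, #0]) matches when fault_offset is 6 and checkinst1 is 0xf8d902c0.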
Assume the check found a match, so this SIGSEGV really was caused by TriggerSuspend(). We now have to handle it. This is done by setting arm_pc so that execution jumps to the implicit suspend check function. Before the jump, lr is set to pc + 2 + 1 (+2 because the instruction at the current pc is 2 bytes long, +1 because execution must continue in Thumb mode after returning from the suspend check):

    sc->arm_lr = sc->arm_pc + 3;      // +2 + 1 (for thumb)
    sc->arm_pc = reinterpret_cast<uintptr_t>(art_quick_implicit_suspend);

Next comes the suspend check function, which is again platform specific:

    ENTRY art_quick_implicit_suspend
        mov    r0, rSELF
        SETUP_SAVE_REFS_ONLY_FRAME r1             @ save callee saves for stack crawl
        bl     artTestSuspendFromCode             @ (Thread*)
        RESTORE_SAVE_REFS_ONLY_FRAME_AND_RETURN
    END art_quick_implicit_suspend

As we can see, this calls artTestSuspendFromCode:

    extern "C" void artTestSuspendFromCode(Thread* self) REQUIRES_SHARED(Locks::mutator_lock_) {
      // Called when suspend count check value is 0 and thread->suspend_count_ != 0
      ScopedQuickEntrypointChecks sqec(self);
      self->CheckSuspend();
    }

That in turn leads to the thread's CheckSuspend() function:

    inline void Thread::CheckSuspend() {
      DCHECK_EQ(Thread::Current(), this);
      for (;;) {
        if (ReadFlag(kCheckpointRequest)) {
          RunCheckpointFunction();
        } else if (ReadFlag(kSuspendRequest)) {
          FullSuspendCheck();
        } else if (ReadFlag(kEmptyCheckpointRequest)) {
          RunEmptyCheckpoint();
        } else {
          break;
        }
      }
    }

This function serves pending requests in the priority order CheckpointFunction, SuspendCheck, EmptyCheckpoint, which corresponds to the three TriggerSuspend() cases described above.
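That is the overall flow of SuspensionHandler, but one question remains:
Where in the generated code does this implicit suspend check sit, and when is it executed? Answering that requires studying the design requirements of the implicit suspend check and how the code generator emits it. Since implicit suspend checks are not enabled, I have not pursued this.
Because they are disabled, the generated code contains no implicit suspend check sequences, so the suspend checks actually in use must be explicit ones. Here is a brief note on when suspend checks happen:
Suspend checks are needed when a Java method returns, when a thread's state changes to kRunnable, and when a kRunnable thread executes loops (goto), comparisons (if-ge) and switches (packed-switch). In short: 1. a thread switching from another state to kRunnable must check; 2. a kRunnable thread must check when it executes a jump.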
1. Suspend check in interpreter mode:
At each suspend check point, MterpSuspendCheck is called to see whether the thread needs to enter the suspended state.

    extern "C" size_t MterpSuspendCheck(Thread* self)
        REQUIRES_SHARED(Locks::mutator_lock_) {
      self->AllowThreadSuspension();
      return MterpShouldSwitchInterpreters();
    }

    inline void Thread::AllowThreadSuspension() {
      DCHECK_EQ(Thread::Current(), this);
      if (UNLIKELY(TestAllFlags())) {
        CheckSuspend();
      }
      // Invalidate the current thread's object pointers (ObjPtr) to catch possible moving GC bugs due
      // to missing handles.
      PoisonObjectPointers();
    }

CheckSuspend() was covered above.
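The explicit check and the priority loop in CheckSuspend() can be modeled with a plain flag word. This is a toy sketch: the flag names and bit values are hypothetical (the real ones are defined by ART), and clearing a bit in Serve merely stands in for the request having been completed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical flag bits, chosen only for this sketch.
enum ToyFlag : uint32_t {
  kToySuspendRequest = 1u << 0,
  kToyCheckpointRequest = 1u << 1,
  kToyEmptyCheckpointRequest = 1u << 2,
};

struct ToyThreadFlags {
  uint32_t flags = 0;
  std::vector<ToyFlag> handled;  // Records the order in which requests were served.

  // Fast path: one test of the flag word; slow path only when a bit is set.
  void AllowThreadSuspension() {
    if (flags != 0) {
      CheckSuspend();
    }
  }

  // Same priority order as the loop shown earlier:
  // checkpoint, then suspend, then empty checkpoint.
  void CheckSuspend() {
    for (;;) {
      if (flags & kToyCheckpointRequest) {
        Serve(kToyCheckpointRequest);
      } else if (flags & kToySuspendRequest) {
        Serve(kToySuspendRequest);
      } else if (flags & kToyEmptyCheckpointRequest) {
        Serve(kToyEmptyCheckpointRequest);
      } else {
        break;
      }
    }
  }

  void Serve(ToyFlag flag) {
    handled.push_back(flag);
    flags &= ~static_cast<uint32_t>(flag);  // Stands in for completing the request.
  }
};
```

With all three bits set, the requests are served in the order checkpoint, suspend, empty checkpoint, and the loop exits once the flag word is zero.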
2. Explicit suspend check in generated code:
The insertion points serve the same purpose but differ slightly. The checking code in generated code is inserted by the compiler when it compiles a Java method:

    4: void java.lang.ThreadLocal$ThreadLocalMap.<init>(java.lang.ThreadLocal$ThreadLocalMap, java.lang.ThreadLocal$ThreadLocalMap) (dex_method_idx=3662)
      DEX CODE:
        0x0000: 7020 4d0e 1000 | invoke-direct {v0, v1}, void java.lang.ThreadLocal$ThreadLocalMap.<init>(java.lang.ThreadLocal$ThreadLocalMap) // method@3661
        0x0003: 0e00 | return-void
      CODE: (code_offset=0x0060aae4 size_offset=0x0060aae0 size=100)...
        0x0060aae4: d1400bf0 sub x16, sp, #0x2000 (8192)
        0x0060aae8: b940021f ldr wzr, [x16]
          StackMap [native_pc=0x60aaec] (dex_pc=0x0, native_pc_offset=0x8, dex_register_map_offset=0xffffffff, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b00000000000000)
        0x0060aaec: f81b0fe0 str x0, [sp, #-80]!
        0x0060aaf0: a90357f4 stp x20, x21, [sp, #48]
        0x0060aaf4: a9047bf6 stp x22, lr, [sp, #64]
        0x0060aaf8: 79400270 ldrh w16, [tr] ; state_and_flags
        0x0060aafc: 35000190 cbnz w16, #+0x30 (addr 0x60ab2c)
        0x0060ab00: aa0303f4 mov x20, x3
        0x0060ab04: aa0103f5 mov x21, x1
        0x0060ab08: aa0203f6 mov x22, x2
        0x0060ab0c: d0ff6ac0 adrp x0, #-0x12a6000 (addr -0xc9c000)
        0x0060ab10: f9428c00 ldr x0, [x0, #1304]
        0x0060ab14: f940181e ldr lr, [x0, #48]
        0x0060ab18: d63f03c0 blr lr
          StackMap [native_pc=0x60ab1c] (dex_pc=0x0, native_pc_offset=0x38, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x700000, stack_mask=0b00000000000000)
            v0: in register (21) [entry 0] v1: in register (22) [entry 1] v2: in register (20) [entry 2]
        0x0060ab1c: a94357f4 ldp x20, x21, [sp, #48]
        0x0060ab20: a9447bf6 ldp x22, lr, [sp, #64]
        0x0060ab24: 910143ff add sp, sp, #0x50 (80)
        0x0060ab28: d65f03c0 ret
        0x0060ab2c: a9010be1 stp x1, x2, [sp, #16]
        0x0060ab30: f90013e3 str x3, [sp, #32]
        0x0060ab34: f9426a7e ldr lr, [tr, #1232] ; pTestSuspend
        0x0060ab38: d63f03c0 blr lr
          StackMap [native_pc=0x60ab3c] (dex_pc=0x0, native_pc_offset=0x58, dex_register_map_offset=0x3, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b00000101010000)
            v0: in stack (16) [entry 3] v1: in stack (24) [entry 4] v2: in stack (32) [entry 5]
        0x0060ab3c: a9410be1 ldp x1, x2, [sp, #16]
        0x0060ab40: f94013e3 ldr x3, [sp, #32]
        0x0060ab44: 17ffffef b #-0x44 (addr 0x60ab00)

In the generated code of this method, the thread-private state_and_flags is checked at 0x0060aaf8, near the function entry. If it is nonzero, execution branches to 0x0060ab2c to perform the suspend check, which calls through the pointer stored at [tr, #1232].
gdb shows what is stored at that offset:

    (gdb) p (('art::Thread'*)0x7f87e4ea00)->tlsPtr_.quick_entrypoints->pTestSuspend
    $2 = (void (*)(void)) 0x7f878bab10 <art_quick_test_suspend>
    (gdb) p &(('art::Thread'*)0x7f87e4ea00)->tlsPtr_.quick_entrypoints->pTestSuspend
    $3 = (void (**)(void)) 0x7f87e4eed0
    (gdb) p 0x7f87e4eed0-0x7f87e4ea00
    $4 = 1232

So the branch really goes to the thread's tlsPtr_.quick_entrypoints->pTestSuspend, whose value points to the entry of art_quick_test_suspend:

    ENTRY art_quick_test_suspend
    #ifdef ARM_R4_SUSPEND_FLAG
        ldrh   rSUSPEND, [rSELF, #THREAD_FLAGS_OFFSET]
        cbnz   rSUSPEND, 1f                         @ check Thread::Current()->suspend_count_ == 0
        mov    rSUSPEND, #SUSPEND_CHECK_INTERVAL    @ reset rSUSPEND to SUSPEND_CHECK_INTERVAL
        bx     lr                                   @ return if suspend_count_ == 0
    1:
        mov    rSUSPEND, #SUSPEND_CHECK_INTERVAL    @ reset rSUSPEND to SUSPEND_CHECK_INTERVAL
    #endif
        SETUP_SAVE_EVERYTHING_FRAME r0              @ save everything for GC stack crawl
        mov    r0, rSELF
        bl     artTestSuspendFromCode               @ (Thread*)
        RESTORE_SAVE_EVERYTHING_FRAME
        bx     lr
    END art_quick_test_suspend

This finally calls artTestSuspendFromCode, and from there on it is essentially the same as art_quick_implicit_suspend.
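4. StackOverflowHandler implementation
The existence of this handler tells us that detection of Java stack overflow on Android is also implemented through SIGSEGV.
5. NullPointerHandler implementation
If StackOverflowHandler cannot handle the SIGSEGV, NullPointerHandler tries next.
For the detailed analysis see: ART Exception Handling (3) - NullPointerException Implementation
6. JavaStackTraceHandler implementation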
Here is its code:

    bool JavaStackTraceHandler::Action(int sig ATTRIBUTE_UNUSED, siginfo_t* siginfo, void* context) {
      // Make sure that we are in the generated code, but we may not have a dex pc.
      bool in_generated_code = manager_->IsInGeneratedCode(siginfo, context, false);
      if (in_generated_code) {
        LOG(ERROR) << "Dumping java stack trace for crash in generated code";
        ArtMethod* method = nullptr;
        uintptr_t return_pc = 0;
        uintptr_t sp = 0;
        Thread* self = Thread::Current();

        manager_->GetMethodAndReturnPcAndSp(siginfo, context, &method, &return_pc, &sp);
        // Inside of generated code, sp[0] is the method, so sp is the frame.
        self->SetTopOfStack(reinterpret_cast<ArtMethod**>(sp));
        self->DumpJavaStack(LOG_STREAM(ERROR));
      }
      return false;  // Return false since we want to propagate the fault to the main signal handler.
    }

From the implementation we can see:
- Whenever the SIGSEGV occurred in generated code, it calls DumpJavaStack to make analysis easier
- Whether or not the Java stack was dumped, it returns false, which means it does not consume the SIGSEGV; the signal is still passed on to the main signal handler
This handler is fairly simple. Its purpose: when a SIGSEGV occurs in generated code and none of the preceding handlers could handle it, print the Java stack trace to provide more direct information.
7. Implementation of other Java exceptions
7.1 ArrayIndexOutOfBoundsException
Here is a piece of code that checks for IndexOutOfBounds when accessing array data, which we use to analyze how this exception is implemented.
Java code:

    public void setPropertyName(@NonNull String propertyName) {
        // mValues could be null if this is being constructed piecemeal. Just record the
        // propertyName to be used later when setValues() is called if so.
        if (mValues != null) {
            PropertyValuesHolder valuesHolder = mValues[0];
            String oldName = valuesHolder.getPropertyName();
            valuesHolder.setPropertyName(propertyName);
            mValuesMap.remove(oldName);
            mValuesMap.put(propertyName, valuesHolder);
        }
        mPropertyName = propertyName;
        // New property/values/target should cause re-initialization prior to starting
        mInitialized = false;
    }

DEX CODE:

    40: void android.animation.ObjectAnimator.setPropertyName(java.lang.String) (dex_method_idx=1461)
      DEX CODE:
        0x0000: 1203 | const/4 v3, #+0
        0x0001: 5442 5316 | iget-object v2, v4, [Landroid/animation/PropertyValuesHolder; android.animation.ObjectAnimator.mValues // field@5715
        0x0003: 3802 1700 | if-eqz v2, +23
        0x0005: 5442 5316 | iget-object v2, v4, [Landroid/animation/PropertyValuesHolder; android.animation.ObjectAnimator.mValues // field@5715
        0x0007: 4601 0203 | aget-object v1, v2, v3
        0x0009: 6e10 4406 0100 | invoke-virtual {v1}, java.lang.String android.animation.PropertyValuesHolder.getPropertyName() // method@1604
        0x000c: 0c00 | move-result-object v0
        0x000d: 6e20 7106 5100 | invoke-virtual {v1, v5}, void android.animation.PropertyValuesHolder.setPropertyName(java.lang.String) // method@1649
        0x0010: 5442 5416 | iget-object v2, v4, Ljava/util/HashMap; android.animation.ObjectAnimator.mValuesMap // field@5716
        0x0012: 6e20 82fc 0200 | invoke-virtual {v2, v0}, java.lang.Object java.util.HashMap.remove(java.lang.Object) // method@64642
        0x0015: 5442 5416 | iget-object v2, v4, Ljava/util/HashMap; android.animation.ObjectAnimator.mValuesMap // field@5716
        0x0017: 6e30 80fc 5201 | invoke-virtual {v2, v5, v1}, java.lang.Object java.util.HashMap.put(java.lang.Object, java.lang.Object) // method@64640
        0x001a: 5b45 5116 | iput-object v5, v4, Ljava/lang/String; android.animation.ObjectAnimator.mPropertyName // field@5713
        0x001c: 5c43 4f16 | iput-boolean v3, v4, Z android.animation.ObjectAnimator.mInitialized // field@5711
        0x001e: 0e00 | return-void
QUICK CODE:

    CODE: (code_offset=0x01aef425 size_offset=0x01aef420 size=176)...
      0x01aef424: f5ad5c00 sub r12, sp, #8192
      0x01aef428: f8dcc000 ldr.w r12, [r12, #0]
        StackMap [native_pc=0x1aef42d] (dex_pc=0x0, native_pc_offset=0x8, dex_register_map_offset=0xffffffff, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
      0x01aef42c: e92d4de0 push {r5, r6, r7, r8, r10, r11, lr}
      0x01aef430: b089 sub sp, sp, #36
      0x01aef432: 9000 str r0, [sp, #0]
      0x01aef434: f8b9c000 ldrh.w r12, [r9, #0] ; state_and_flags
      0x01aef438: f1bc0f00 cmp.w r12, #0
      0x01aef43c: d13d bne +122 (0x01aef4ba)
      0x01aef43e: 6a4d ldr r5, [r1, #36] ; r1 is this; r5 is mValues
      0x01aef440: 2d00 cmp r5, #0
      0x01aef442: d02a beq +84 (0x01aef49a)
      0x01aef444: 460f mov r7, r1
      0x01aef446: 4690 mov r8, r2
      0x01aef448: 2600 movs r6, #0 ; this 0 is the index in mValues[0]
      0x01aef44a: 68a8 ldr r0, [r5, #8] ; this should load the size of mValues
        StackMap [native_pc=0x1aef44d] (dex_pc=0x7, native_pc_offset=0x28, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef44c: 4286 cmp r6, r0 ; compare index (0) with the size of mValues
      0x01aef44e: d23c bcs +120 (0x01aef4ca) ; if index (0) >= size of mValues (unsigned), branch to 0x01aef4ca to throw
      0x01aef450: 68e9 ldr r1, [r5, #12]
      0x01aef452: 468a mov r10, r1
      0x01aef454: 6808 ldr r0, [r1, #0]
        StackMap [native_pc=0x1aef457] (dex_pc=0x9, native_pc_offset=0x32, dex_register_map_offset=0x3, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v1: in register (1) [entry 4] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef456: f8d000c0 ldr.w r0, [r0, #192]
      0x01aef45a: f8d0e020 ldr.w lr, [r0, #32]
      0x01aef45e: 47f0 blx lr
        StackMap [native_pc=0x1aef461] (dex_pc=0x9, native_pc_offset=0x3c, dex_register_map_offset=0x7, inline_info_offset=0xffffffff, register_mask=0x5a0, stack_mask=0b0000000000)
          v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef460: 4642 mov r2, r8
      0x01aef462: 4651 mov r1, r10
      0x01aef464: 4683 mov r11, r0
      0x01aef466: 6808 ldr r0, [r1, #0]
      0x01aef468: f8d000f0 ldr.w r0, [r0, #240]
      0x01aef46c: f8d0e020 ldr.w lr, [r0, #32]
      0x01aef470: 47f0 blx lr
        StackMap [native_pc=0x1aef473] (dex_pc=0xd, native_pc_offset=0x4e, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000)
          v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef472: 6ab9 ldr r1, [r7, #40]
      0x01aef474: 465a mov r2, r11
      0x01aef476: 460d mov r5, r1
      0x01aef478: 6808 ldr r0, [r1, #0]
        StackMap [native_pc=0x1aef47b] (dex_pc=0x12, native_pc_offset=0x56, dex_register_map_offset=0xf, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (1) [entry 4] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef47a: f8d000d8 ldr.w r0, [r0, #216]
      0x01aef47e: f8d0e020 ldr.w lr, [r0, #32]
      0x01aef482: 47f0 blx lr
        StackMap [native_pc=0x1aef485] (dex_pc=0x12, native_pc_offset=0x60, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000)
          v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef484: 6ab9 ldr r1, [r7, #40]
      0x01aef486: 4642 mov r2, r8
      0x01aef488: 4653 mov r3, r10
      0x01aef48a: 460d mov r5, r1
      0x01aef48c: 6808 ldr r0, [r1, #0]
        StackMap [native_pc=0x1aef48f] (dex_pc=0x17, native_pc_offset=0x6a, dex_register_map_offset=0xf, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (1) [entry 4] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef48e: f8d000d0 ldr.w r0, [r0, #208]
      0x01aef492: f8d0e020 ldr.w lr, [r0, #32]
      0x01aef496: 47f0 blx lr
        StackMap [native_pc=0x1aef499] (dex_pc=0x17, native_pc_offset=0x74, dex_register_map_offset=0xb, inline_info_offset=0xffffffff, register_mask=0xda0, stack_mask=0b0000000000)
          v0: in register (11) [entry 6] v1: in register (10) [entry 5] v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]
      0x01aef498: e002 b +4 (0x01aef4a0)
      0x01aef49a: 460f mov r7, r1
      0x01aef49c: 4690 mov r8, r2
      0x01aef49e: 2600 movs r6, #0
      0x01aef4a0: f8c78074 str.w r8, [r7, #116]
      0x01aef4a4: f1b80f00 cmp.w r8, #0
      0x01aef4a8: d003 beq +6 (0x01aef4b2)
      0x01aef4aa: f8d90080 ldr.w r0, [r9, #128] ; card_table
      0x01aef4ae: 09f9 lsrs r1, r7, #7
      0x01aef4b0: 5440 strb r0, [r0, r1]
      0x01aef4b2: 767e strb r6, [r7, #25]
      0x01aef4b4: b009 add sp, sp, #36
      0x01aef4b6: e8bd8de0 pop {r5, r6, r7, r8, r10, r11, pc}
      0x01aef4ba: 9104 str r1, [sp, #16]
      0x01aef4bc: 9205 str r2, [sp, #20]
      0x01aef4be: f8d9e2a8 ldr.w lr, [r9, #680] ; pTestSuspend
      0x01aef4c2: 47f0 blx lr
        StackMap [native_pc=0x1aef4c5] (dex_pc=0x0, native_pc_offset=0xa0, dex_register_map_offset=0x13, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000110000)
          v4: in stack (16) [entry 7] v5: in stack (20) [entry 8]
      0x01aef4c4: 9904 ldr r1, [sp, #16]
      0x01aef4c6: 9a05 ldr r2, [sp, #20]
      0x01aef4c8: e7b9 b -142 (0x01aef43e)
      0x01aef4ca: 4601 mov r1, r0 ; pass the size of mValues as the second argument
      0x01aef4cc: 4630 mov r0, r6 ; pass index (0) as the first argument
      0x01aef4ce: f8d9e2b0 ldr.w lr, [r9, #688] ; pThrowArrayBounds
      0x01aef4d2: 47f0 blx lr ; call pThrowArrayBounds (artThrowArrayBoundsFromCode) to throw ArrayIndexOutOfBoundsException
        StackMap [native_pc=0x1aef4d5] (dex_pc=0x7, native_pc_offset=0xb0, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b0000000000)
          v2: in register (5) [entry 0] v3: in register (6) [entry 1] v4: in register (7) [entry 2] v5: in register (8) [entry 3]

Before the call to pThrowArrayBounds, two arguments are prepared: r0 (index) and r1 (array size).
The entrypoint is set up as follows:

    qpoints->pThrowArrayBounds = art_quick_throw_array_bounds;

    /*
     * Called by managed code to create and deliver an ArrayIndexOutOfBoundsException. Arg1 holds
     * index, arg2 holds limit.
     */
    TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_array_bounds, artThrowArrayBoundsFromCode

Here is the macro:

    .macro TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING c_name, cxx_name
        .extern \cxx_name
    ENTRY \c_name
        SETUP_SAVE_EVERYTHING_FRAME r2  @ save all registers as basis for long jump context
        mov r2, r9                      @ pass Thread::Current
        bl  \cxx_name                   @ \cxx_name(Thread*)
    END \c_name
    .endm

It adds a third argument, r2, which is Thread* self, to the two existing ones and then calls artThrowArrayBoundsFromCode:

    // Called by generated code to throw an array index out of bounds exception.
    extern "C" NO_RETURN void artThrowArrayBoundsFromCode(int index, int length, Thread* self)
        REQUIRES_SHARED(Locks::mutator_lock_) {
      ScopedQuickEntrypointChecks sqec(self);
      ThrowArrayIndexOutOfBoundsException(index, length);
      self->QuickDeliverException();
    }
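The explicit bounds check in the compiled code (cmp followed by bcs, an unsigned higher-or-same branch) can be written in C++ as a single unsigned comparison. This is a minimal sketch: IndexOutOfBounds and CheckedGet are hypothetical names, and std::out_of_range merely stands in for the call through pThrowArrayBounds.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// One unsigned comparison covers both index < 0 and index >= length,
// because a negative index becomes a very large unsigned value.
inline bool IndexOutOfBounds(int32_t index, int32_t length) {
  return static_cast<uint32_t>(index) >= static_cast<uint32_t>(length);
}

// Stand-in for the slow path: index and length are the two values that the
// compiled code places in r0 and r1 before the call.
template <typename T>
T& CheckedGet(T* data, int32_t length, int32_t index) {
  if (IndexOutOfBounds(index, length)) {
    throw std::out_of_range("length=" + std::to_string(length) +
                            "; index=" + std::to_string(index));
  }
  return data[index];
}
```

The fast path costs one compare and one branch per access, and the throwing code sits out of line at the end of the method, as in the listing.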
7.2 ArithmeticException
See the comment:

    /*
     * Called by managed code to create and deliver an ArithmeticException.
     */
    NO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_div_zero, artThrowDivZeroFromCode

An example:
Java CODE:

    public static int floorDiv(int x, int y) {
        int r = x / y;
        // if the signs are different and modulo not zero, round down
        if ((x ^ y) < 0 && (r * y != x)) {
            r--;
        }
        return r;
    }

DEX CODE:

    24: int java.lang.Math.floorDiv(int, int) (dex_method_idx=2574)
      DEX CODE:
        0x0000: 9300 0203 | div-int v0, v2, v3
        0x0002: 9701 0203 | xor-int v1, v2, v3
        0x0004: 3b01 0800 | if-gez v1, +8
        0x0006: 9201 0003 | mul-int v1, v0, v3
        0x0008: 3221 0400 | if-eq v1, v2, +4
        0x000a: d800 00ff | add-int/lit8 v0, v0, #-1
        0x000c: 0f00 | return v0

QUICK CODE:

    CODE: (code_offset=0x005d0024 size_offset=0x005d0020 size=72)...
      0x005d0024: f81f0fe0 str x0, [sp, #-16]!
      0x005d0028: f90007fe str lr, [sp, #8]
      0x005d002c: 340001c2 cbz w2, #+0x38 (addr 0x5d0064)
      0x005d0030: 1ac20c20 sdiv w0, w1, w2
      0x005d0034: 4a020023 eor w3, w1, w2
      0x005d0038: 36f80103 tbz w3, #31, #+0x20 (addr 0x5d0058)
      0x005d003c: 1b007c42 mul w2, w2, w0
      0x005d0040: 6b02003f cmp w1, w2
      0x005d0044: 1a9f17e1 cset w1, eq
      0x005d0048: 51000402 sub w2, w0, #0x1 (1)
      0x005d004c: 7100003f cmp w1, #0x0 (0)
      0x005d0050: 1a821003 csel w3, w0, w2, ne
      0x005d0054: aa0303e0 mov x0, x3
      0x005d0058: f94007fe ldr lr, [sp, #8]
      0x005d005c: 910043ff add sp, sp, #0x10 (16)
      0x005d0060: d65f03c0 ret
      0x005d0064: f942767e ldr lr, [tr, #1256] ; pThrowDivZero
      0x005d0068: d63f03c0 blr lr
        StackMap [native_pc=0x5d006c] (dex_pc=0x0, native_pc_offset=0x48, dex_register_map_offset=0x0, inline_info_offset=0xffffffff, register_mask=0x0, stack_mask=0b000000)
          v2: in register (1) [entry 0] v3: in register (2) [entry 1]

A check for a zero divisor is inserted before the division: the cbz w2 at 0x005d002c branches to the pThrowDivZero call at 0x005d0064. Detection in interpreter mode is not covered here.
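The shape of the compiled floorDiv, with its zero test ahead of the division, can be mirrored in C++. This is a sketch for illustration: FloorDiv is a hypothetical function, std::domain_error stands in for the call through pThrowDivZero, and the guard for INT32_MIN / -1 is added because that division is undefined in C++ while Java defines it to wrap.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Mirrors the compiled shape: test the divisor first, branch to a throwing
// slow path, otherwise divide and apply the rounding fix-up.
inline int32_t FloorDiv(int32_t x, int32_t y) {
  if (y == 0) {
    throw std::domain_error("divide by zero");  // stands in for pThrowDivZero
  }
  if (x == std::numeric_limits<int32_t>::min() && y == -1) {
    return x;  // Java wraps here; plain C++ division would be undefined.
  }
  int32_t r = x / y;
  // If the signs differ and the division was inexact, round down.
  if ((x ^ y) < 0 && r * y != x) {
    --r;
  }
  return r;
}
```

As in the generated code, the exceptional case costs a single compare-and-branch on the fast path, and no signal is involved.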
7.3 StringIndexOutOfBoundsException
This works like ArrayIndexOutOfBoundsException above:

    /*
     * Called by managed code to create and deliver a StringIndexOutOfBoundsException
     * as if thrown from a call to String.charAt(). Arg1 holds index, arg2 holds limit.
     */
    TWO_ARG_RUNTIME_EXCEPTION_SAVE_EVERYTHING art_quick_throw_string_bounds, artThrowStringBoundsFromCode
8. Throw & Catch implementation
throw-catch-finally: ART Exception Handling (4) - throw & catch & finally Implementation