Android ANR Trace Explained

5. ThreadList DumpForSigQuit

Let's look at how ThreadList performs its dump:

void ThreadList::DumpForSigQuit(std::ostream& os) {

{

ScopedObjectAccess soa(Thread::Current());

// Only print if we have samples.

if (suspend_all_historam_.SampleSize() > 0) { // This histogram records how long each SuspendAll took; it is dumped only if it contains samples.

Histogram<uint64_t>::CumulativeData data;

suspend_all_historam_.CreateHistogram(&data);

suspend_all_historam_.PrintConfidenceIntervals(os, 0.99, data);  // Dump time to suspend.

}

}

Dump(os); // Dump thread list

DumpUnattachedThreads(os); // Dump the threads in this process that are not attached to the runtime

}

void ThreadList::Dump(std::ostream& os) {

{

MutexLock mu(Thread::Current(), *Locks::thread_list_lock_);

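os << ...; // the rest of this statement is missing from the source

}

// ... the remainder of ThreadList::Dump and the beginning of the DumpCheckpoint class are missing from the source ...

local_os << ...;

// ...

... backtrace_map_;

};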

In other words, the thread list is dumped by having every thread run DumpCheckpoint, which dumps that thread's state and backtrace.

Now let's see how each thread gets to run DumpCheckpoint:
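
To make the idea of a per-thread dump closure concrete, here is a minimal sketch. It is not ART's DumpCheckpoint: ToyThread, ToyClosure, ToyDumpCheckpoint and DumpAll are hypothetical names invented for illustration, and the "dump" is a single line of text rather than a real state and backtrace. The only detail borrowed from the article is the use of a local stream (local_os) per thread, which is assumed here to exist so that records from different threads do not interleave.

```cpp
#include <mutex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a runtime thread; not ART's Thread class.
struct ToyThread {
  int tid;
  std::string state;
};

// Minimal closure interface, in the spirit of a checkpoint function.
struct ToyClosure {
  virtual ~ToyClosure() = default;
  virtual void Run(ToyThread* thread) = 0;
};

// Dumps one thread per Run() call into a shared output stream.
class ToyDumpCheckpoint final : public ToyClosure {
 public:
  explicit ToyDumpCheckpoint(std::ostream* os) : os_(os) {}

  void Run(ToyThread* thread) override {
    // Format into a private buffer first, then append it to the shared
    // stream under a lock, so each thread's record stays in one piece.
    std::ostringstream local_os;
    local_os << "tid=" << thread->tid << " state=" << thread->state << "\n";
    std::lock_guard<std::mutex> lock(mu_);
    *os_ << local_os.str();
  }

 private:
  std::ostream* os_;
  std::mutex mu_;
};

// Example helper (assumption): run the closure once for every thread.
inline std::string DumpAll(std::vector<ToyThread>& threads) {
  std::ostringstream out;
  ToyDumpCheckpoint checkpoint(&out);
  for (ToyThread& t : threads) checkpoint.Run(&t);
  return out.str();
}
```

In the real runtime the closure is not invoked in a simple loop like DumpAll; who calls Run() for which thread is exactly what RunCheckpoint decides.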

size_t ThreadList::RunCheckpoint(Closure* checkpoint_function) {

Thread* self = Thread::Current();

Locks::mutator_lock_->AssertNotExclusiveHeld(self);

Locks::thread_list_lock_->AssertNotHeld(self);

Locks::thread_suspend_count_lock_->AssertNotHeld(self);

if (kDebugLocking && gAborting == 0) {

CHECK_NE(self->GetState(), kRunnable);

}

std::vector<Thread*> suspended_count_modified_threads;

size_t count = 0;

{

// Step 1: treat Runnable threads and suspended threads differently

// Call a checkpoint function for each thread, threads which are suspend get their checkpoint

// manually called. In other words, every thread runs the checkpoint function, and for suspended threads we call it manually on their behalf.

MutexLock mu(self, *Locks::thread_list_lock_);

MutexLock mu2(self, *Locks::thread_suspend_count_lock_);

count = list_.size();

for (const auto& thread : list_) {

if (thread != self) {

while (true) {

// For a Runnable thread, register checkpoint_function in that thread's list of checkpoint functions; the thread runs it when it reaches a checkpoint.

if (thread->RequestCheckpoint(checkpoint_function)) {

// This thread will run its checkpoint some time in the near future.

break;

} else {

// We are probably suspended, try to make sure that we stay suspended.

// The thread switched back to runnable.

if (thread->GetState() == kRunnable) {

// Spurious fail, try again.

continue;

}

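// A suspended thread is put into a collection and handled separately later. To keep its state from changing while we work on it, increment its suspend count here,

// so that even if the thread's original suspend request ends, its suspend count is still non-zero and it cannot become Runnable.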

thread->ModifySuspendCount(self, +1, false);

suspended_count_modified_threads.push_back(thread);

break;

}

}

}

}

}

// Run the checkpoint on ourself while we wait for threads to suspend.

checkpoint_function->Run(self); // For the Signal Catcher thread itself, call the checkpoint function's Run() here to dump this thread

// Run the checkpoint on the suspended threads.

for (const auto& thread : suspended_count_modified_threads) {

if (!thread->IsSuspended()) {

if (ATRACE_ENABLED()) {

std::ostringstream oss;

thread->ShortDump(oss);

ATRACE_BEGIN((std::string("Waiting for suspension of thread ") + oss.str()).c_str());

}

// Busy wait until the thread is suspended.

const uint64_t start_time = NanoTime();

do {

ThreadSuspendSleep(kThreadSuspendInitialSleepUs);

} while (!thread->IsSuspended());

const uint64_t total_delay = NanoTime() - start_time;

// Shouldn't need to wait for longer than 1000 microseconds.

constexpr uint64_t kLongWaitThreshold = MsToNs(1);

ATRACE_END();

if (UNLIKELY(total_delay > kLongWaitThreshold)) {

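LOG(WARNING) << ...; // the log message is missing from the source

}

}

// The thread is now suspended, so run the checkpoint on its behalf.

checkpoint_function->Run(thread);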

{

MutexLock mu2(self, *Locks::thread_suspend_count_lock_);

thread->ModifySuspendCount(self, -1, false); // This thread's dump is done, so decrement its suspend count; it no longer needs to stay suspended.

}

}

{

// Imitate ResumeAll, threads may be waiting on Thread::resume_cond_ since we raised their

// suspend count. Now the suspend_count_ is lowered so we must do the broadcast.

MutexLock mu2(self, *Locks::thread_suspend_count_lock_);

Thread::resume_cond_->Broadcast(self); // Notify the suspended threads that they may resume.

}

return count;

}
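
The control flow of RunCheckpoint can be condensed into a small single-threaded model. This is only a sketch under simplifying assumptions: SimThread, SimState and SimRunCheckpoint are hypothetical names, there is no locking, the "spurious fail, try again" loop and the busy wait for suspension are left out, and a thread reaching a safepoint is simulated by calling ReachSafepoint() by hand.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

enum class SimState { kRunnable, kSuspended };

// Hypothetical model of a thread; not ART's Thread class.
struct SimThread {
  int tid = 0;
  SimState state = SimState::kSuspended;
  int suspend_count = 0;
  std::function<void(SimThread&)> pending_checkpoint;

  // Succeeds only for a Runnable thread, which will run fn later by itself.
  bool RequestCheckpoint(const std::function<void(SimThread&)>& fn) {
    if (state != SimState::kRunnable) return false;
    pending_checkpoint = fn;
    return true;
  }

  // Stands in for the thread reaching a safepoint on its own.
  void ReachSafepoint() {
    if (!pending_checkpoint) return;
    auto fn = std::move(pending_checkpoint);
    pending_checkpoint = nullptr;
    fn(*this);
  }
};

// Returns the number of threads in the list, like the function above.
inline std::size_t SimRunCheckpoint(std::vector<SimThread>& threads,
                                    SimThread& self,
                                    const std::function<void(SimThread&)>& fn) {
  std::vector<SimThread*> suspended_count_modified_threads;
  for (SimThread& t : threads) {
    if (&t == &self) continue;
    if (t.RequestCheckpoint(fn)) continue;  // Runnable: runs fn later.
    ++t.suspend_count;                      // Suspended: pin it in place.
    suspended_count_modified_threads.push_back(&t);
  }
  fn(self);  // The requesting thread runs the checkpoint on itself.
  for (SimThread* t : suspended_count_modified_threads) {
    fn(*t);              // Run the checkpoint on the thread's behalf.
    --t->suspend_count;  // Release the extra suspend count.
  }
  return threads.size();
}
```

With one requesting thread, one Runnable thread and one suspended thread, the model runs the function immediately for the requester and the suspended thread, while the Runnable thread runs it only once ReachSafepoint() is called.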

Two points here need explaining:

1. A thread's kRunnable and suspended states:

enum ThreadState {

//                                   Thread.State   JDWP state

kTerminated = 66,                 // TERMINATED     TS_ZOMBIE    Thread.run has returned, but Thread* still around

kRunnable,                        // RUNNABLE       TS_RUNNING   runnable

kTimedWaiting,                    // TIMED_WAITING  TS_WAIT      in Object.wait() with a timeout

kSleeping,                        // TIMED_WAITING  TS_SLEEPING  in Thread.sleep()

kBlocked,                         // BLOCKED        TS_MONITOR   blocked on a monitor

kWaiting,                         // WAITING        TS_WAIT      in Object.wait()

kWaitingForGcToComplete,          // WAITING        TS_WAIT      blocked waiting for GC

kWaitingForCheckPointsToRun,      // WAITING        TS_WAIT      GC waiting for checkpoints to run

kWaitingPerformingGc,             // WAITING        TS_WAIT      performing GC

kWaitingForDebuggerSend,          // WAITING        TS_WAIT      blocked waiting for events to be sent

kWaitingForDebuggerToAttach,      // WAITING        TS_WAIT      blocked waiting for debugger to attach

kWaitingInMainDebuggerLoop,       // WAITING        TS_WAIT      blocking/reading/processing debugger events

kWaitingForDebuggerSuspension,    // WAITING        TS_WAIT      waiting for debugger suspend all

kWaitingForJniOnLoad,             // WAITING        TS_WAIT      waiting for execution of dlopen and JNI on load code

kWaitingForSignalCatcherOutput,   // WAITING        TS_WAIT      waiting for signal catcher IO to complete

kWaitingInMainSignalCatcherLoop,  // WAITING        TS_WAIT      blocking/reading/processing signals

kWaitingForDeoptimization,        // WAITING        TS_WAIT      waiting for deoptimization suspend all

kWaitingForMethodTracingStart,    // WAITING        TS_WAIT      waiting for method tracing to start

kWaitingForVisitObjects,          // WAITING        TS_WAIT      waiting for visiting objects

kWaitingForGetObjectsAllocated,   // WAITING        TS_WAIT      waiting for getting the number of allocated objects

kStarting,                        // NEW            TS_WAIT      native thread started, not yet ready to run managed code

kNative,                          // RUNNABLE       TS_RUNNING   running in a JNI native method

kSuspended,                       // RUNNABLE       TS_RUNNING   suspended by GC or debugger

};

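Of these, three states describe a thread that is running:

kRunnable, // Running; may allocate on the heap and jump between Java methods

kNative,  // Executing a JNI native method; does not affect Java heap allocation or GC, and makes no Java method jumps

kSuspended, // The thread is actually waiting on its way to Runnable, waiting on the resume condition

kRunnable means the thread is currently running.

kSuspended arises when a thread is about to switch from another state to kRunnable and checks whether it has a kSuspendRequest pending. If it does, the thread waits and executes no further code; it becomes kSuspended and switches to Runnable only after its suspend count has dropped to 0.

This is also why GC needs to SuspendAll threads: once they are suspended, the heap is effectively locked, nothing operates on the Java heap, and the GC thread can work on it.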
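
The suspend-count rule can be sketched with a mutex and a condition variable. SuspendGate and its methods are hypothetical names, not ART classes. As a simplification, this sketch notifies waiters inside ModifySuspendCount, whereas the RunCheckpoint code above lowers the count first and broadcasts on Thread::resume_cond_ in a separate step.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical model of one thread's suspend count and resume condition.
class SuspendGate {
 public:
  // +1 when someone needs the thread to stay suspended, -1 when done.
  void ModifySuspendCount(int delta) {
    std::lock_guard<std::mutex> lock(mu_);
    suspend_count_ += delta;
    if (suspend_count_ == 0) resume_cond_.notify_all();
  }

  // Non-blocking check: may the thread become Runnable right now?
  bool TryTransitionToRunnable() {
    std::lock_guard<std::mutex> lock(mu_);
    return suspend_count_ == 0;
  }

  // Blocking form: wait on the resume condition until the count is 0.
  void TransitionToRunnable() {
    std::unique_lock<std::mutex> lock(mu_);
    resume_cond_.wait(lock, [this] { return suspend_count_ == 0; });
  }

  int suspend_count() {
    std::lock_guard<std::mutex> lock(mu_);
    return suspend_count_;
  }

 private:
  std::mutex mu_;
  std::condition_variable resume_cond_;
  int suspend_count_ = 0;
};
```

If two requesters each raise the count, the thread stays suspended after the first one releases it and can become Runnable only after the second release. This is the property RunCheckpoint relies on when it adds +1 to the suspend count of an already suspended thread.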

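2. Checkpoint

Any discussion of checkpoints has to mention safepoints.

Safepoint: code compiled by ART periodically polls the runtime to see whether it needs to run some specific code; the points where this polling happens can be regarded as safepoints.

Safepoints can be used to pause a Java thread, and also to implement the checkpoint mechanism.

For example, when thread A, which is executing Java code, reaches a safepoint, it calls CheckSuspend. If it finds a checkpoint request pending on the thread, it runs the thread's checkpoint function at that point; if it finds a suspend request pending, it performs a SuspendCheck, which puts the thread into the suspended (paused) state.

So an ART checkpoint is best seen as one feature built on top of safepoints.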
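
The dispatch performed at a safepoint can be sketched as a check of two request flags. Everything here is a hypothetical model: PollingThread, the kToy* flag constants and the times_suspended counter are invented for illustration, the flags are plain integers rather than atomics, and "suspending" is only counted instead of actually blocking the thread.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical request flags; the names are not ART's.
constexpr std::uint32_t kToySuspendRequest = 1u;
constexpr std::uint32_t kToyCheckpointRequest = 2u;

struct PollingThread {
  std::uint32_t flags = 0;
  std::vector<std::function<void()>> checkpoints;
  int times_suspended = 0;

  void RequestCheckpoint(std::function<void()> fn) {
    checkpoints.push_back(std::move(fn));
    flags |= kToyCheckpointRequest;
  }

  void RequestSuspend() { flags |= kToySuspendRequest; }

  // Called at every safepoint poll.
  void CheckSuspend() {
    if (flags == 0) return;  // Fast path: nothing was requested.
    if (flags & kToyCheckpointRequest) {
      // Run every pending checkpoint function on this thread.
      std::vector<std::function<void()>> pending;
      pending.swap(checkpoints);
      flags &= ~kToyCheckpointRequest;
      for (auto& fn : pending) fn();
    }
    if (flags & kToySuspendRequest) {
      // A real runtime would block here until resumed.
      ++times_suspended;
      flags &= ~kToySuspendRequest;
    }
  }
};
```

A poll with no pending request does nothing, which is what keeps the polling cheap enough to place on hot paths.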

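The following is quoted from an online answer:

Author: RednaxelaFX

Link: https://www.zhihu.com/question/48996839/answer/113801448

Source: Zhihu

Copyright belongs to the author. For commercial reuse, contact the author for authorization; for non-commercial reuse, credit the source.

From the point of view of the compiler and the interpreter, ART has two kinds of safepoint:

Active safepoint: the compiled code or the interpreter explicitly checks for a safepoint and, when it finds that one is needed, jumps to the corresponding handler.

The ART interpreter places active safepoints at loop back-edges (specifically at the source of the jump, before it is taken) and at method exits (return / throw exception).

The ART Optimizing Compiler places active safepoints at loop back-edges (again at the source of the jump) and at method entry.

Passive safepoint: every non-inlined method call site is a passive safepoint. No code has to be actively executed here; it is just an ordinary method call.

It must count as a safepoint because once execution reaches the call site, control passes to the callee; the callee may enter a safepoint, and a safepoint may need to walk the stack frames, so the caller must be at a safepoint too.

The idea behind placing safepoints is that, after the runtime requests a safepoint, the program must be able to reach the nearest safepoint promptly and hand control to the runtime.

What counts as "promptly"? It is enough for the execution time to be bounded; the real-time requirement is not very strict.

So a further assumption is made: code that only runs forward (straight-line code, including conditional branches) finishes in finite time and can be ignored, while code that may run for a long time is either a loop or a method call, so inserting safepoints in those two places is enough to guarantee promptness.

Whether the safepoint goes at method entry or exit, or at the source or the target of a loop back-edge, is an implementation detail; picking one side is enough.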
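
As a rough illustration of the placement rule quoted above (this sketch is not part of the quoted answer), the function below polls once at method entry and once at every loop back-edge. PollSafepoint, SafepointStats and SumWithSafepoints are hypothetical names, and a request is modeled as a single atomic flag that the poll clears when it handles it.

```cpp
#include <atomic>

// Counters so the example can show how often polling happens.
struct SafepointStats {
  long polls = 0;
  long handled = 0;
};

// One safepoint poll: cheap when nothing is requested.
inline void PollSafepoint(std::atomic<bool>& requested, SafepointStats& stats) {
  ++stats.polls;
  if (requested.exchange(false)) ++stats.handled;  // Hand control to the runtime.
}

// Sums 0..n-1, polling at method entry and at each loop back-edge.
inline long SumWithSafepoints(long n, std::atomic<bool>& requested,
                              SafepointStats& stats) {
  PollSafepoint(requested, stats);  // Method entry.
  long sum = 0;
  for (long i = 0; i < n; ++i) {
    sum += i;                         // Straight-line body: bounded time.
    PollSafepoint(requested, stats);  // Back-edge, before jumping back.
  }
  return sum;
}
```

Because the loop body is straight-line code, the time between two polls is bounded no matter how large n is, which is the sense of "promptly" used in the quoted text.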

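So, coming back to this line from earlier:

// For a Runnable thread, register checkpoint_function in that thread's list of checkpoint functions; the thread runs it when it reaches a checkpoint.

if (thread->RequestCheckpoint(checkpoint_function)) {

For a Runnable thread we have set checkpoint_function and a checkpoint request, so the thread is bound to reach a checkpoint eventually and run checkpoint_function.

As noted above, safepoints have loose real-time requirements. To give a rough sense of time: a checkpoint should be reached within the running time of a single function.

Other factors also play a part, though, such as thread scheduling: if thread A is Runnable and about to reach a safepoint but is no longer scheduled, it never gets there.

In this example the normal flow is: a Runnable thread reaches a safepoint, finds a checkpoint request, and runs the checkpoint function, which has been set to DumpCheckpoint's Run(), so the thread dump is performed.

This covers where the dump is invoked for threads in both the suspended and the Runnable state.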
