How the Android WatchDog Works

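1. Overview

Android has a hardware WatchDog that periodically checks whether key hardware is working properly. Similarly, the framework layer has a software WatchDog that periodically checks whether key system services have deadlocked. Its main job is to determine whether core system services and important threads are stuck in the Blocked state. It does two things:

Monitors the reboot broadcast;

Monitors the key system services registered in mMonitors for deadlocks.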

2. WatchDog Initialization

2.1 startOtherServices

[-> SystemServer.java]

private void startOtherServices() {
    ...
    // Create the watchdog [see section 2.2]
    final Watchdog watchdog = Watchdog.getInstance();
    // Register the reboot broadcast receiver [see the init section]
    watchdog.init(context, mActivityManagerService);
    ...
    mSystemServiceManager.startBootPhase(SystemService.PHASE_LOCK_SETTINGS_READY); // 480
    ...
    mActivityManagerService.systemReady(new Runnable() {
        public void run() {
            mSystemServiceManager.startBootPhase(
                    SystemService.PHASE_ACTIVITY_MANAGER_READY);
            ...
            // Start the watchdog [see section 3.1]
            Watchdog.getInstance().start();
            mSystemServiceManager.startBootPhase(
                    SystemService.PHASE_THIRD_PARTY_APPS_CAN_START);
        }
    });
}

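The WatchDog is initialized while the system_server process starts up. The main steps are:

Create the watchdog object, which itself extends Thread;

Register the reboot broadcast receiver;

Call start() to begin work.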

2.2 getInstance

[-> Watchdog.java]

public static Watchdog getInstance() {
    if (sWatchdog == null) {
        // Singleton pattern: create the instance [see section 2.3]
        sWatchdog = new Watchdog();
    }
    return sWatchdog;
}

2.3 Creating the Watchdog

[-> Watchdog.java]

public class Watchdog extends Thread {
    // List of all HandlerChecker objects; for the HandlerChecker type see section 2.3.1
    final ArrayList<HandlerChecker> mHandlerCheckers = new ArrayList<>();
    ...
    private Watchdog() {
        super("watchdog");
        // Add the foreground thread to the list
        mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
                "foreground thread", DEFAULT_TIMEOUT);
        mHandlerCheckers.add(mMonitorChecker);
        // Add the main thread to the list
        mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
                "main thread", DEFAULT_TIMEOUT));
        // Add the ui thread to the list
        mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
                "ui thread", DEFAULT_TIMEOUT));
        // Add the i/o thread to the list
        mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
                "i/o thread", DEFAULT_TIMEOUT));
        // Add the display thread to the list
        mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
                "display thread", DEFAULT_TIMEOUT));
        // [see section 2.3.2]
        addMonitor(new BinderThreadMonitor());
    }
}

Watchdog extends Thread, and the thread it creates is named "watchdog". The mHandlerCheckers list holds the HandlerChecker objects for the main, fg, ui, io and display threads.

2.3.1 HandlerChecker

[-> Watchdog.java]

public final class HandlerChecker implements Runnable {
    private final Handler mHandler;   // Handler of the monitored thread
    private final String mName;       // descriptive name of the thread
    private final long mWaitMax;      // maximum time to wait
    // The services being monitored
    private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>();
    private boolean mCompleted;       // set to false when a check starts
    private Monitor mCurrentMonitor;
    private long mStartTime;          // time at which the check was scheduled

    HandlerChecker(Handler handler, String name, long waitMaxMillis) {
        mHandler = handler;
        mName = name;
        mWaitMax = waitMaxMillis;
        mCompleted = true;
    }
}

2.3.2 addMonitor

public class Watchdog extends Thread {
    public void addMonitor(Monitor monitor) {
        synchronized (this) {
            ...
            // mMonitorChecker is a HandlerChecker
            mMonitorChecker.addMonitor(monitor);
        }
    }

    public final class HandlerChecker implements Runnable {
        private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>();

        public void addMonitor(Monitor monitor) {
            // Add the BinderThreadMonitor from above to the mMonitors list
            mMonitors.add(monitor);
        }
        ...
    }
}

To monitor the Binder threads, the monitor is added to the mMonitors list, a member of HandlerChecker.

Here, the object added to that checker is a BinderThreadMonitor.

private static final class BinderThreadMonitor implements Watchdog.Monitor {
    public void monitor() {
        Binder.blockUntilThreadAvailable();
    }
}

blockUntilThreadAvailable ultimately calls into IPCThreadState and waits until a binder thread is free:

void IPCThreadState::blockUntilThreadAvailable()
{
    pthread_mutex_lock(&mProcess->mThreadCountLock);
    while (mProcess->mExecutingThreadsCount >= mProcess->mMaxThreads) {
        // Wait until the number of executing binder threads drops below
        // the per-process binder thread limit (16)
        pthread_cond_wait(&mProcess->mThreadCountDecrement, &mProcess->mThreadCountLock);
    }
    pthread_mutex_unlock(&mProcess->mThreadCountLock);
}

So addMonitor(new BinderThreadMonitor()) attaches the Binder thread check to the handler of the android.fg thread (mMonitorChecker), which then verifies that the Binder threads are working properly.
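The waiting pattern in blockUntilThreadAvailable can be sketched in plain Java. This is only an illustration under assumptions: ThreadPoolGate and its methods are hypothetical names, not part of the Android framework, and a ReentrantLock with a Condition stands in for the pthread mutex and condition variable.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the per-process binder thread bookkeeping.
class ThreadPoolGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition countDecremented = lock.newCondition();
    private final int maxThreads;
    private int executingThreads;

    ThreadPoolGate(int maxThreads) {
        this.maxThreads = maxThreads;
    }

    // A pool thread starts handling a transaction.
    void threadStarted() {
        lock.lock();
        try {
            executingThreads++;
        } finally {
            lock.unlock();
        }
    }

    // A pool thread finishes; wake up anyone waiting for a free thread.
    void threadFinished() {
        lock.lock();
        try {
            executingThreads--;
            countDecremented.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Blocks while every thread is busy, like the while/pthread_cond_wait loop.
    void blockUntilThreadAvailable() {
        lock.lock();
        try {
            while (executingThreads >= maxThreads) {
                countDecremented.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    int executingThreads() {
        lock.lock();
        try {
            return executingThreads;
        } finally {
            lock.unlock();
        }
    }
}
```

If every thread stays busy, the call never returns, and that is exactly what the watchdog notices: the monitor() call on the android.fg checker does not complete in time.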

2.4 init

[-> Watchdog.java]

public void init(Context context, ActivityManagerService activity) {
    mResolver = context.getContentResolver();
    mActivity = activity;
    // Register the reboot broadcast receiver [see the RebootRequestReceiver section]
    context.registerReceiver(new RebootRequestReceiver(),
            new IntentFilter(Intent.ACTION_REBOOT),
            android.Manifest.permission.REBOOT, null);
}

2.4.1 RebootRequestReceiver

[-> Watchdog.java]

final class RebootRequestReceiver extends BroadcastReceiver {
    @Override
    public void onReceive(Context c, Intent intent) {
        if (intent.getIntExtra("nowait", 0) != 0) {
            // [see the rebootSystem section]
            rebootSystem("Received ACTION_REBOOT broadcast");
            return;
        }
        Slog.w(TAG, "Unsupported ACTION_REBOOT broadcast: " + intent);
    }
}

2.4.2 rebootSystem

[-> Watchdog.java]

void rebootSystem(String reason) {
    Slog.i(TAG, "Rebooting system because: " + reason);
    IPowerManager pms = (IPowerManager) ServiceManager.getService(Context.POWER_SERVICE);
    try {
        // Perform the reboot through PowerManager
        pms.reboot(false, reason, false);
    } catch (RemoteException ex) {
    }
}

The reboot itself is ultimately carried out by PowerManagerService; the detailed reboot flow will be covered separately.

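3. Watchdog Detection Mechanism

Calling Watchdog.getInstance().start() enters the run() method of the "watchdog" thread. The method has two parts:

The first half [section 3.1] checks whether a timeout has been triggered;

The second half [section 4.1] dumps various information once a timeout has been triggered.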

3.1 run

[-> Watchdog.java]

public void run() {
    boolean waitedHalf = false;
    while (true) {
        final ArrayList<HandlerChecker> blockedCheckers;
        final String subject;
        final boolean allowRestart;
        int debuggerWasConnected = 0;
        synchronized (this) {
            long timeout = CHECK_INTERVAL; // CHECK_INTERVAL = 30s
            for (int i=0; i<mHandlerCheckers.size(); i++) {
                HandlerChecker hc = mHandlerCheckers.get(i);
                // Schedule a check on every Checker; each one records its mStartTime [see section 3.2]
                hc.scheduleCheckLocked();
            }
            if (debuggerWasConnected > 0) {
                debuggerWasConnected--;
            }
            long start = SystemClock.uptimeMillis();
            // Loop so that a full 30s elapses before continuing
            while (timeout > 0) {
                if (Debug.isDebuggerConnected()) {
                    debuggerWasConnected = 2;
                }
                try {
                    wait(timeout); // if interrupted, catch the exception and keep waiting
                } catch (InterruptedException e) {
                    Log.wtf(TAG, e);
                }
                if (Debug.isDebuggerConnected()) {
                    debuggerWasConnected = 2;
                }
                timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
            }
            // Evaluate the Checker state [see section 3.3]
            final int waitState = evaluateCheckerCompletionLocked();
            if (waitState == COMPLETED) {
                waitedHalf = false;
                continue;
            } else if (waitState == WAITING) {
                continue;
            } else if (waitState == WAITED_HALF) {
                if (!waitedHalf) {
                    // First time more than half of the wait has elapsed
                    ArrayList<Integer> pids = new ArrayList<Integer>();
                    pids.add(Process.myPid());
                    // Dump traces of system_server and 3 native processes [see section 4.2]
                    ActivityManagerService.dumpStackTraces(true, pids, null, null,
                            NATIVE_STACKS_OF_INTEREST);
                    waitedHalf = true;
                }
                continue;
            }
            ... // Reaching this point means the Watchdog has timed out [see section 4.1]
        }
        ...
    }
}

public static final String[] NATIVE_STACKS_OF_INTEREST = new String[] {
    "/system/bin/mediaserver",
    "/system/bin/sdcard",
    "/system/bin/surfaceflinger"
};

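What this method does:

It calls scheduleCheckLocked() on every Checker. If a Checker has no monitors (true for every Checker except the android.fg one) and its Looper is polling, mCompleted is set to true directly. If the previous check has not finished yet, the call returns without posting again.

It waits 30s and then calls evaluateCheckerCompletionLocked() to evaluate the Checker state.

It acts on waitState. For COMPLETED or WAITING, nothing happens. For WAITED_HALF (blocked for at least 30s) seen for the first time, it dumps the traces of system_server and the 3 native processes. For OVERDUE, it dumps more information.

It follows that every Watchdog timeout involves two calls to AMS.dumpStackTraces: the traces of system_server and the 3 native processes are written twice, at least 30s apart.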
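The inner wait loop of run() can be looked at in isolation: it recomputes the remaining time after every wake-up, so an early notify or an interrupt does not shorten the interval. The sketch below assumes a hypothetical IntervalWaiter class and uses System.nanoTime() in place of SystemClock.uptimeMillis(), which only exists on Android.

```java
// Hypothetical helper that mirrors the "wait the full interval" loop in run().
class IntervalWaiter {
    private final Object lock = new Object();

    // Returns the elapsed milliseconds, which is never less than intervalMs.
    long waitFullInterval(long intervalMs) {
        final long start = System.nanoTime();
        synchronized (lock) {
            long timeout = intervalMs;
            while (timeout > 0) {
                try {
                    lock.wait(timeout);
                } catch (InterruptedException e) {
                    // Same policy as the loop in run(): swallow it and keep waiting.
                }
                long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
                // Recompute what is left; a positive value means we woke up early.
                timeout = intervalMs - elapsedMs;
            }
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Simulates an early wake-up, for example a stray notify on the lock.
    void wakeEarly() {
        synchronized (lock) {
            lock.notifyAll();
        }
    }
}
```

Calling wakeEarly() from another thread halfway through does not make waitFullInterval() return sooner; the loop simply goes back to waiting for the remainder.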

3.2 scheduleCheckLocked

public final class HandlerChecker implements Runnable {
    ...
    public void scheduleCheckLocked() {
        if (mMonitors.size() == 0 && mHandler.getLooper().getQueue().isPolling()) {
            mCompleted = true; // the target looper is idle (polling), so just return
            return;
        }
        if (!mCompleted) {
            return; // a check is already in progress, no need to post again
        }
        mCompleted = false;
        mCurrentMonitor = null;
        // Record the current time
        mStartTime = SystemClock.uptimeMillis();
        // Post this checker at the very front of the message queue; see run() below
        mHandler.postAtFrontOfQueue(this);
    }

    public void run() {
        final int size = mMonitors.size();
        for (int i = 0 ; i < size ; i++) {
            synchronized (Watchdog.this) {
                mCurrentMonitor = mMonitors.get(i);
            }
            // Call back into the monitor() method of the concrete service
            mCurrentMonitor.monitor();
        }
        synchronized (Watchdog.this) {
            mCompleted = true;
            mCurrentMonitor = null;
        }
    }
}

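What this method does: it posts the HandlerChecker to the very front of the message queue of the monitored thread's Looper, so that HandlerChecker.run() executes on that thread. run() calls monitor() and sets mCompleted = true when it finishes. If the message currently being handled on that thread takes so long that monitor() does not get a chance to run, the watchdog is triggered.

postAtFrontOfQueue(this) takes a Runnable. Through the message mechanism, the run() method of HandlerChecker is eventually called back; it iterates over all Monitor objects, and each concrete service implements the monitor() method of that interface.

Possible problems: if other messages keep being posted with postAtFrontOfQueue(), the watchdog check may never get a chance to run; or each monitor may take some time, and the accumulated total exceeds 1 minute, which also triggers the watchdog. Both are atypical Watchdog cases.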
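The check can be modelled outside Android to see why a held service lock leaves mCompleted at false. This is a simplified sketch, not the AOSP class: CheckerSketch and awaitCompleted are hypothetical names, and a single-thread ExecutorService stands in for the monitored Looper thread. One difference is an assumption worth stating: an executor appends work at the tail of its queue, whereas the real code posts at the front.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;

// Hypothetical, simplified model of HandlerChecker; not the AOSP class.
class CheckerSketch implements Runnable {
    interface Monitor {
        void monitor();
    }

    private final ExecutorService worker; // stands in for the monitored thread
    private final List<Monitor> monitors = new ArrayList<>();
    private boolean completed = true;

    CheckerSketch(ExecutorService worker) {
        this.worker = worker;
    }

    synchronized void addMonitor(Monitor m) {
        monitors.add(m);
    }

    // Mirrors scheduleCheckLocked(): do nothing if a check is still pending.
    synchronized void scheduleCheck() {
        if (!completed) {
            return;
        }
        completed = false;
        worker.execute(this); // queued at the tail here, not at the front
    }

    @Override
    public void run() {
        List<Monitor> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(monitors);
        }
        for (Monitor m : snapshot) {
            m.monitor(); // blocks here if the service lock is held elsewhere
        }
        synchronized (this) {
            completed = true;
            notifyAll();
        }
    }

    // Waits up to timeoutMs and reports whether the check has completed.
    synchronized boolean awaitCompleted(long timeoutMs) {
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!completed) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                break;
            }
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return completed;
    }
}
```

A service would typically register a monitor whose body is just `synchronized (serviceLock) { }`. While another thread holds serviceLock, awaitCompleted() keeps returning false, which is the condition the watchdog turns into WAITED_HALF and then OVERDUE.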

3.3 evaluateCheckerCompletionLocked

private int evaluateCheckerCompletionLocked() {
    int state = COMPLETED;
    for (int i=0; i<mHandlerCheckers.size(); i++) {
        HandlerChecker hc = mHandlerCheckers.get(i);
        // [see section 3.4]
        state = Math.max(state, hc.getCompletionStateLocked());
    }
    return state;
}

This returns the largest (worst) wait state among all checkers in the mHandlerCheckers list.
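Taking the maximum only works because the states are ordered from best to worst. The sketch below assumes the values COMPLETED = 0, WAITING = 1, WAITED_HALF = 2, OVERDUE = 3 (the article does not list them) and derives each checker's state from its latency in the way section 3.4 describes; StateSketch and its method names are hypothetical.

```java
// Hypothetical standalone model of the state evaluation; constant values are assumed.
class StateSketch {
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;

    // State of one checker, given how long its pending check has been outstanding.
    static int completionState(boolean completed, long latencyMs, long waitMaxMs) {
        if (completed) {
            return COMPLETED;
        }
        if (latencyMs < waitMaxMs / 2) {
            return WAITING;
        }
        if (latencyMs < waitMaxMs) {
            return WAITED_HALF;
        }
        return OVERDUE;
    }

    // The overall state is the worst state of any single checker.
    static int evaluate(int[] states) {
        int state = COMPLETED;
        for (int s : states) {
            state = Math.max(state, s);
        }
        return state;
    }
}
```

With a 60s limit and a 30s check interval, a checker that is blocked from the moment it is scheduled is WAITED_HALF at the first evaluation (latency 30s) and OVERDUE at the second (latency 60s), regardless of how many other checkers have completed.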

3.4 getCompletionStateLocked

```java
public int getCompletionStateLocked() {
    if (mCompleted) {
        return COMPLETED;
    } else {
        long latency = SystemClock.uptimeMillis() - mStartTime;
        // mWaitMax defaults to 60s
        if (latency < mWaitMax/2) {
            return WAITING;
        } else if (latency < mWaitMax) {
            return WAITED_HALF;
        }
    }
    return OVERDUE;
}
```
