Android Watchdog Implementation Explained

This article is based on Android 4.4.

 

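I recently investigated a problem where the watchdog was printing error logs, and it gave me quite a headache. Along the way I read through the watchdog implementation in the Android framework. These are my notes, both for my own later review and to help newcomers get up to speed quickly.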

 

PowerManagerService is used as the example for a brief walkthrough of the flow.

 

What Watchdog does:

1.      Listens for the reboot broadcast

2.      Monitors whether the services added to its check list are deadlocked

 

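Details:

Function 1 is very simple: Watchdog registers a BroadcastReceiver, and when the shutdown Action arrives it runs the shutdown or reboot flow.

The rest of this article focuses on function 2.

Watchdog.java:

Start with the constructor: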

private Watchdog() {
    super("watchdog");
    // Initialize handler checkers for each common thread we want to check.  Note
    // that we are not currently checking the background thread, since it can
    // potentially hold longer running operations with no guarantees about the timeliness
    // of operations there.

    // The shared foreground thread is the main checker.  It is where we
    // will also dispatch monitor checks and do other work.
    mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
            "foreground thread", DEFAULT_TIMEOUT);
    mHandlerCheckers.add(mMonitorChecker);
    // Add checker for main thread. We only do a quick check since there
    // can be UI running on the thread.
    mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
            "main thread", DEFAULT_TIMEOUT));
    // Add checker for shared UI thread.
    mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
            "ui thread", DEFAULT_TIMEOUT));
    // And also check IO thread.
    mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
            "i/o thread", DEFAULT_TIMEOUT));
}

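The mHandlerCheckers.add(...) calls (highlighted in red in the original post) matter most: they register UiThread, FgThread, IoThread, and the main thread on which new Watchdog() runs, which is the system_server main thread.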

 

Next, init() in PowerManagerService.java:

public void init(Context context, LightsService ls,
        ActivityManagerService am, BatteryService bs, IBatteryStats bss,
        IAppOpsService appOps, DisplayManagerService dm) {
    ......
    mHandlerThread = new HandlerThread(TAG);
    mHandlerThread.start();
    mHandler = new PowerManagerHandler(mHandlerThread.getLooper());

    Watchdog.getInstance().addMonitor(this);  // add this object to the monitor list
    Watchdog.getInstance().addThread(mHandler, mHandlerThread.getName());
    ......
}

Watchdog.addMonitor():

public void addMonitor(Monitor monitor) {
    synchronized (this) {
        if (isAlive()) {
            throw new RuntimeException("Monitors can't be added once the Watchdog is running");
        }
        mMonitorChecker.addMonitor(monitor);
    }
}

 

 

In the Watchdog constructor, mMonitorChecker is created as:

mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
        "foreground thread", DEFAULT_TIMEOUT);

so the call lands in HandlerChecker.addMonitor():

public void addMonitor(Monitor monitor) {
    // mMonitors is an ArrayList: ArrayList<Monitor> mMonitors = new ArrayList<Monitor>();
    // The PowerManagerService object is added to this list
    mMonitors.add(monitor);
}

 

Next, Watchdog.getInstance().addThread(mHandler, mHandlerThread.getName()):

 

public void addThread(Handler thread, String name) {
    addThread(thread, name, DEFAULT_TIMEOUT);
}

public void addThread(Handler thread, String name, long timeoutMillis) {
    synchronized (this) {
        if (isAlive()) {
            throw new RuntimeException("Threads can't be added once the Watchdog is running");
        }

        // The PowerManagerHandler object is added to the mHandlerCheckers list
        mHandlerCheckers.add(new HandlerChecker(thread, name, timeoutMillis));
    }
}
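Both addMonitor() and addThread() refuse registrations once the watchdog thread is alive. The following is a minimal plain-Java sketch of that guard, not the framework code: the class name RegistrationSketch, the String-based checker list, and the latch that keeps the thread alive are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for Watchdog; only the "no registration after start" guard is modeled.
final class RegistrationSketch extends Thread {
    private final List<String> checkers = new ArrayList<>();
    private final CountDownLatch stop = new CountDownLatch(1);

    RegistrationSketch() {
        super("watchdog-sketch");
        setDaemon(true); // do not keep the JVM alive for the demo
    }

    void addThread(String name) {
        synchronized (this) {
            // isAlive() becomes true as soon as start() has been called
            if (isAlive()) {
                throw new RuntimeException("Threads can't be added once the Watchdog is running");
            }
            checkers.add(name);
        }
    }

    int checkerCount() {
        synchronized (this) {
            return checkers.size();
        }
    }

    @Override
    public void run() {
        try {
            stop.await(); // stay alive until shutdown() is called
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void shutdown() {
        stop.countDown();
    }
}
```

In this sketch, a call to addThread() before start() succeeds, and any call after start() throws, which is why services such as PowerManagerService register themselves during init().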

 

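That completes the setup. From here on, the watchdog continuously checks whether each monitored service is deadlocked.

The code is mainly in Watchdog.java.

system_server starts the watchdog, which then enters run().

Watchdog runs in its own thread, and its thread body run() looks like this: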

public void run() {
    boolean waitedHalf = false;
    while (true) {
        final ArrayList<HandlerChecker> blockedCheckers;
        final String subject;
        final boolean allowRestart;
        synchronized (this) {
            long timeout = CHECK_INTERVAL;
            // Send a message to each monitored thread
            for (int i = 0; i < mHandlerCheckers.size(); i++) {
                HandlerChecker hc = mHandlerCheckers.get(i);
                hc.scheduleCheckLocked();
            }
            // Sleep for a while
            long start = SystemClock.uptimeMillis();
            while (timeout > 0) {
                try {
                    wait(timeout);
                } catch (InterruptedException e) {
                    Log.wtf(TAG, e);
                }
                timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
            }
            // Check whether any thread or service is in trouble
            final int waitState = evaluateCheckerCompletionLocked();
            if (waitState == COMPLETED) {
                waitedHalf = false;
                continue;
            } else if (waitState == WAITING) {
                continue;
            } else if (waitState == WAITED_HALF) {
                if (!waitedHalf) {
                    ArrayList<Integer> pids = new ArrayList<Integer>();
                    pids.add(Process.myPid());
                    ActivityManagerService.dumpStackTraces(true, pids, null, null,
                            NATIVE_STACKS_OF_INTEREST);
                    waitedHalf = true;
                }
                continue;
            }
        ......
        {
            // Kill SystemServer
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
        waitedHalf = false;
    }
}
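The inner while loop in run() re-enters wait() with the remaining time after every wakeup, so an early notify or an interrupt cannot shorten the check interval. A minimal plain-Java sketch of the same pattern follows. System.nanoTime() stands in for SystemClock.uptimeMillis(), and the class and method names are made up for this example.

```java
// Hypothetical helper illustrating the "recompute the remaining timeout" wait loop.
final class TimedWaitSketch {
    private final Object lock = new Object();

    /** Blocks for at least intervalMillis and returns the elapsed time in milliseconds. */
    long sleepFullInterval(long intervalMillis) {
        synchronized (lock) {
            long start = System.nanoTime();
            long timeout = intervalMillis;
            while (timeout > 0) {
                try {
                    lock.wait(timeout);
                } catch (InterruptedException e) {
                    // The framework code logs here; this sketch simply keeps waiting.
                }
                long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
                // Whatever woke us up, only the time actually spent counts.
                timeout = intervalMillis - elapsedMillis;
            }
            return (System.nanoTime() - start) / 1_000_000L;
        }
    }
}
```

Calling new TimedWaitSketch().sleepFullInterval(50) returns only after at least 50 ms have passed, even if the thread is woken early.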

run() contains an infinite loop, and each iteration does three main things:

1.       Call scheduleCheckLocked() to send a message to every monitored thread. The code of scheduleCheckLocked() is:

public void scheduleCheckLocked() {
    if (mMonitors.size() == 0 && mHandler.getLooper().isIdling()) {
        mCompleted = true;
        return;
    }
    if (!mCompleted) {
        return;
    }
    mCompleted = false;
    mCurrentMonitor = null;
    mStartTime = SystemClock.uptimeMillis();
    mHandler.postAtFrontOfQueue(this);  // send a message to the monitored thread
}

 

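A HandlerChecker object monitors both services and a thread. That is why the code above first checks whether mMonitors is empty. If it is, this HandlerChecker monitors no services, and if the monitored thread's message queue is also idle (checked with isIdling()), the thread is healthy, so mCompleted is set to true and the method returns. If mCompleted is still false, the previous check has not finished yet, so the method returns without posting another one. Otherwise it sets mCompleted to false, records the send time in mStartTime, and calls postAtFrontOfQueue() to post a message to the monitored thread. When the thread picks up that message, Handler.java dispatches it as follows: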

public void dispatchMessage(Message msg) {
    if (msg.callback != null) {
        handleCallback(msg);
    } else {
        if (mCallback != null) {
            if (mCallback.handleMessage(msg)) {
                return;
            }
        }
        handleMessage(msg);
    }
}

private static void handleCallback(Message message) {
    message.callback.run();
}
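Because the checker is posted with postAtFrontOfQueue(), it runs ahead of work that is already queued on the monitored thread. The toy queue below is only a sketch of that ordering and of the callback dispatch: FrontOfQueueSketch is a hypothetical single-threaded stand-in, not Android's Handler or MessageQueue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical single-threaded model of a message queue that holds Runnable callbacks.
final class FrontOfQueueSketch {
    private final Deque<Runnable> queue = new ArrayDeque<>();

    void post(Runnable r) {
        queue.addLast(r);
    }

    void postAtFrontOfQueue(Runnable r) {
        queue.addFirst(r);
    }

    /** Drains the queue in order, the way a looper thread would dispatch callbacks. */
    void drain() {
        Runnable r;
        while ((r = queue.pollFirst()) != null) {
            r.run(); // corresponds to handleCallback(): message.callback.run()
        }
    }

    /** Queues two ordinary tasks, then a check at the front, and records the run order. */
    static List<String> demo() {
        List<String> order = new ArrayList<>();
        FrontOfQueueSketch q = new FrontOfQueueSketch();
        q.post(() -> order.add("work-1"));
        q.post(() -> order.add("work-2"));
        q.postAtFrontOfQueue(() -> order.add("watchdog-check"));
        q.drain();
        return order;
    }
}
```

In demo(), the check is recorded before work-1 and work-2 although it was posted last, so a long backlog alone does not delay the watchdog check.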

The message posted by scheduleCheckLocked() is handled by HandlerChecker.run(), whose code is:

public void run() {
    final int size = mMonitors.size();
    for (int i = 0; i < size; i++) {
        synchronized (Watchdog.this) {
            mCurrentMonitor = mMonitors.get(i);
        }
        mCurrentMonitor.monitor();
    }
    synchronized (Watchdog.this) {
        mCompleted = true;
        mCurrentMonitor = null;
    }
}

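If run() gets executed at all, the monitored thread itself is fine. The state of the monitored services still has to be checked, though, and that is done by calling the monitor() method each service implements. A typical monitor() implementation just acquires the service's lock. If the lock cannot be acquired, the calling thread blocks, so mCompleted never gets set to true.

mCompleted being true means the thread or services watched by this HandlerChecker are healthy. Otherwise there may be a problem; whether there really is one is decided by whether the wait has exceeded the allowed time.

monitor() is usually implemented like this: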

public void monitor() {
    synchronized (mLock) {
    }
}
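To see why an empty synchronized block is enough to detect a stuck service, here is a small plain-Java sketch. It is an illustration under assumptions, not framework code: FakeService, probe(), and the latch-based completion flag (standing in for mCompleted) are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: monitor() blocks for as long as another thread holds the service lock.
final class MonitorProbeSketch {
    interface Monitor {
        void monitor();
    }

    static final class FakeService implements Monitor {
        final Object mLock = new Object();

        @Override
        public void monitor() {
            synchronized (mLock) {
                // Nothing to do: getting the lock at all proves the service is not stuck.
            }
        }
    }

    /** Runs monitor() on a separate thread; returns true if it finished within timeoutMillis. */
    static boolean probe(Monitor m, long timeoutMillis) {
        CountDownLatch completed = new CountDownLatch(1); // plays the role of mCompleted
        Thread t = new Thread(() -> {
            m.monitor();
            completed.countDown();
        }, "probe");
        t.setDaemon(true);
        t.start();
        try {
            return completed.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With the lock free, probe() returns true almost immediately. If the caller holds service.mLock while calling probe(), the probe thread blocks inside monitor() and probe() returns false after the timeout, which mirrors a HandlerChecker whose mCompleted stays false.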

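2.       After the messages have been sent, call wait() to put the Watchdog thread to sleep for a while.

3.       Check each checker for a stuck thread or service; once a problem is confirmed, kill the process immediately.

evaluateCheckerCompletionLocked(), called in run() above, performs this check. Its code is: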

private int evaluateCheckerCompletionLocked() {
    int state = COMPLETED;
    for (int i = 0; i < mHandlerCheckers.size(); i++) {
        HandlerChecker hc = mHandlerCheckers.get(i);
        state = Math.max(state, hc.getCompletionStateLocked());
    }
    return state;
}
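The Math.max() call means the overall result is the worst state reported by any checker. A minimal sketch of that aggregation follows. The numeric values of the four constants are an assumption here; the code above only implies that a larger value means a worse state.

```java
// Hypothetical stand-alone version of the "worst state wins" aggregation.
final class WorstStateSketch {
    // Assumed values: only their relative order matters for Math.max().
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;

    /** Returns the largest (worst) state among all checkers, or COMPLETED if there are none. */
    static int evaluate(int... checkerStates) {
        int state = COMPLETED;
        for (int s : checkerStates) {
            state = Math.max(state, s);
        }
        return state;
    }
}
```

For example, evaluate(COMPLETED, WAITING, COMPLETED) yields WAITING, so a single slow checker is enough to keep the watchdog from resetting waitedHalf.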


getCompletionStateLocked() decides which state to return for a HandlerChecker based on how long it has been waiting:

public int getCompletionStateLocked() {
    if (mCompleted) {
        return COMPLETED;
    } else {
        long latency = SystemClock.uptimeMillis() - mStartTime;
        if (latency < mWaitMax / 2) {
            return WAITING;
        } else if (latency < mWaitMax) {
            return WAITED_HALF;
        }
    }
    return OVERDUE;
}
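The thresholds are easiest to follow with concrete numbers. The sketch below restates the same decision as a pure function, so it can be tried without a device. The enum, the method name classify(), and the 60000 ms limit used in the example are assumptions for illustration only.

```java
// Hypothetical pure-function restatement of the latency thresholds.
final class CompletionStateSketch {
    enum State { COMPLETED, WAITING, WAITED_HALF, OVERDUE }

    static State classify(boolean completed, long latencyMillis, long waitMaxMillis) {
        if (completed) {
            return State.COMPLETED;   // the check already ran to the end
        }
        if (latencyMillis < waitMaxMillis / 2) {
            return State.WAITING;     // first half of the allowed time
        }
        if (latencyMillis < waitMaxMillis) {
            return State.WAITED_HALF; // second half: time to dump stack traces
        }
        return State.OVERDUE;         // allowed time used up
    }
}
```

With an assumed limit of 60000 ms, a pending check is WAITING up to 29999 ms, WAITED_HALF from 30000 ms to 59999 ms, and OVERDUE from 60000 ms on. A completed check is COMPLETED regardless of latency.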

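That concludes the analysis. If the message-posting part is unclear, see my blog post on Handler and Looper. I drew on several blog posts found online while writing this. If anything is missing or wrong, corrections are welcome.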
