Android 7.0 Watchdog Mechanism


宇落无痕


Article link: https://blog.csdn.net/fu_kevin0606/article/details/64479489


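A phone has to answer calls and receive text messages, so it is expected to work 24 hours a day, 7 days a week. But as an operating system running on embedded devices, Android must cope with all kinds of software and hardware disturbances at runtime: from the simplest cases, such as code that deadlocks or blocks, to memory corruption caused by out-of-bounds accesses, bit flips caused by hardware faults, and even CPU electromigration and storage demagnetization under extreme operating conditions. Any of these problems can cause system services to crash or hang unpredictably.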
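There are two complementary ways to tackle this problem. The first is to improve the reliability of the software and hardware under extreme conditions, for example by verifying program termination or by using radiation-hardened components. For cost reasons, however, an ordinary phone can hardly be made completely fault-free. The second is to detect a system crash promptly and restart the system. Most faults on a phone disappear after a reboot and do not affect further use. So the simple approach is: when the system is detected to be unhealthy, restart the device so that the user can carry on. How, then, can we tell whether the system is healthy? Early phone platforms usually added a hardware watchdog to the device. The software had to write a value to the watchdog hardware periodically to show that it had not failed (commonly called "feeding the dog"); if it missed the deadline, the watchdog restarted the device.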
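The "feeding" contract described above can be modeled in a few lines. This is only a sketch: WatchdogTimer, feed, and expired are hypothetical names rather than any real driver or Android API, and time is passed in explicitly so the behavior is easy to follow.

```java
// Hypothetical model of a hardware watchdog timer; not an Android or kernel API.
final class WatchdogTimer {
    private final long timeoutMs;
    private long lastFeedMs;

    WatchdogTimer(long timeoutMs, long nowMs) {
        this.timeoutMs = timeoutMs;
        this.lastFeedMs = nowMs;
    }

    // "Feeding the dog": the software proves it is alive by resetting the countdown.
    void feed(long nowMs) {
        lastFeedMs = nowMs;
    }

    // True when no feed arrived within the timeout; real hardware would reset the device here.
    boolean expired(long nowMs) {
        return nowMs - lastFeedMs > timeoutMs;
    }
}
```

With a 1000 ms timeout, a feed at t=900 keeps the timer alive at t=1800, but it has expired by t=2000 if nothing feeds it again.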
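The problem with a hardware watchdog is that it is limited: it can only monitor the system as a whole. Early phone operating systems were mostly single-tasking, so a hardware watchdog was just about adequate. Android's SystemServer is a very complex process that runs more than fifty services, which makes it the process most likely to run into trouble, so the threads running inside SystemServer need to be monitored. But if every thread had to feed the dog at intervals, as with a hardware watchdog, that would waste CPU time and complicate the program design. Android therefore provides the Watchdog class as a software watchdog that monitors the threads in SystemServer. Once it detects a problem, Watchdog kills the SystemServer process.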
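One way to monitor threads without making each of them feed a timer is to invert the direction: the watchdog posts a trivial probe task to each monitored thread and checks whether it ran in time. The sketch below illustrates that idea in plain Java, with single-thread executors standing in for the monitored threads. ProbeWatchdog, monitor, and findBlocked are hypothetical names, and this is not the framework's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical software watchdog; single-thread executors play the role of monitored threads.
final class ProbeWatchdog {
    private final Map<String, ExecutorService> monitored = new LinkedHashMap<>();

    void monitor(String name, ExecutorService thread) {
        monitored.put(name, thread);
    }

    // Posts an empty probe task to every monitored thread and returns the names
    // of the threads that did not run it within timeoutMs.
    List<String> findBlocked(long timeoutMs) {
        Map<String, Future<?>> probes = new LinkedHashMap<>();
        for (Map.Entry<String, ExecutorService> e : monitored.entrySet()) {
            probes.put(e.getKey(), e.getValue().submit(() -> { }));
        }
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        List<String> blocked = new ArrayList<>();
        for (Map.Entry<String, Future<?>> e : probes.entrySet()) {
            long remaining = Math.max(0, deadline - System.nanoTime());
            try {
                e.getValue().get(remaining, TimeUnit.NANOSECONDS);
            } catch (TimeoutException ex) {
                blocked.add(e.getKey()); // probe still queued behind stuck work
            } catch (ExecutionException ex) {
                // The probe ran, so the thread is responsive.
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                break;
            }
        }
        return blocked;
    }
}
```

The monitored threads need no watchdog-specific code: a thread that is stuck simply never gets around to running the probe, and the watchdog notices the missed deadline.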
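When Zygote, the parent process of SystemServer, receives the signal that SystemServer has died, it kills itself. When the signal for Zygote's death reaches the init process, init kills all of Zygote's child processes and restarts Zygote. The effect is equivalent to restarting the whole phone. A SystemServer problem usually has nothing to do with the kernel, so this "soft restart" solves the problem most of the time. It is also faster than a full reboot and less disruptive to the user.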

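Watchdog is initialized and started inside the SystemServer process. SystemServer's run method registers and starts the various Android services, and this includes initializing and starting Watchdog. The code is as follows: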

            final Watchdog watchdog = Watchdog.getInstance();
            watchdog.init(context, mActivityManagerService);

In the second half of startOtherServices in SystemServer, the systemReady interface is used to announce that the system is ready. Watchdog is started in the callback passed to ActivityManagerService's systemReady:

                Watchdog.getInstance().start();

The code above is in frameworks/base/services/java/com/android/server/SystemServer.java.
As mentioned, SystemServer.java obtains the Watchdog through getInstance, which creates it on first use. It is implemented as follows:

    public static Watchdog getInstance() {
        if (sWatchdog == null) {
            sWatchdog = new Watchdog();    // create the singleton instance
        }

        return sWatchdog;
    }

    private Watchdog() {
        super("watchdog");
        // Initialize handler checkers for each common thread we want to check.  Note
        // that we are not currently checking the background thread, since it can
        // potentially hold longer running operations with no guarantees about the timeliness
        // of operations there.

        // The shared foreground thread is the main checker.  It is where we
        // will also dispatch monitor checks and do other work.
        mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
                "foreground thread", DEFAULT_TIMEOUT);
        mHandlerCheckers.add(mMonitorChecker);
        // Add checker for main thread.  We only do a quick check since there
        // can be UI running on the thread.
        mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
                "main thread", DEFAULT_TIMEOUT));
        // Add checker for shared UI thread.
        mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
                "ui thread", DEFAULT_TIMEOUT));
        // And also check IO thread.
        mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
                "i/o thread", DEFAULT_TIMEOUT));
        // And the display thread.
        mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
                "display thread", DEFAULT_TIMEOUT));

        // Initialize monitor for Binder threads.
        addMonitor(new BinderThreadMonitor());
    }

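The Watchdog constructor adds checkers for the foreground thread, main thread, UI thread, I/O thread, and display thread to the mHandlerCheckers list. The foreground thread's checker is also kept in mMonitorChecker, which, per the code comment, is where monitor checks are dispatched. Finally, the constructor registers a BinderThreadMonitor through addMonitor.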
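The bookkeeping done by the constructor can be summarized with a plain-Java model. This is a sketch, not the framework classes: Checker and CheckerRegistry are hypothetical names, the timeout value is an assumed placeholder, and attaching monitors to the foreground checker is an assumption drawn from the comment in the constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-thread checker: a name, a timeout, and attached monitors.
final class Checker {
    final String name;
    final long timeoutMs;
    final List<Runnable> monitors = new ArrayList<>();

    Checker(String name, long timeoutMs) {
        this.name = name;
        this.timeoutMs = timeoutMs;
    }
}

// Hypothetical model of the lists the constructor fills in.
final class CheckerRegistry {
    // Assumed value, for illustration only.
    static final long DEFAULT_TIMEOUT_MS = 60_000;

    final List<Checker> checkers = new ArrayList<>();
    final Checker monitorChecker;

    CheckerRegistry() {
        // The foreground thread's checker is registered first and doubles as the monitor checker.
        monitorChecker = new Checker("foreground thread", DEFAULT_TIMEOUT_MS);
        checkers.add(monitorChecker);
        for (String name : new String[] {"main thread", "ui thread", "i/o thread", "display thread"}) {
            checkers.add(new Checker(name, DEFAULT_TIMEOUT_MS));
        }
    }

    // Assumption: monitors are attached to the foreground checker, not kept in a separate list.
    void addMonitor(Runnable monitor) {
        monitorChecker.monitors.add(monitor);
    }
}
```

In this model a new registry holds five checkers, and a monitor registered afterwards ends up on the first one.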
