Android Watchdog Framework: Analysis, Application, and Modification (Part 2)

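Continuing from the previous article's introduction to the Watchdog (WTD), this part looks at how the WTD behaves in a real deadlock and how it can be modified.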

I recently ran into an Android device that stayed on the boot animation indefinitely. The trace file showed a deadlock; the key excerpt is:

 

"main" prio=5 tid=1 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x4c20f360 self=0x71e1ade0
  | sysTid=519 nice=-2 sched=0/0 cgrp=apps handle=1878216768
  | state=S schedstat=( 736667963 56924727 1529 ) utm=62 stm=11 core=0
  at com.android.server.am.ActivityManagerService.registerReceiver(ActivityManagerService.java:~13326)
  - waiting to lock <0x4c6b2630> (a com.android.server.am.ActivityManagerService) held by tid=27 (InputDispatcher)
  at android.app.ContextImpl.registerReceiverInternal(ContextImpl.java:1473)
  at android.app.ContextImpl.registerReceiver(ContextImpl.java:1441)
  at com.android.server.power.PowerManagerService.systemReady(PowerManagerService.java:494)
  at com.android.server.ServerThread.initAndLoop(SystemServer.java:1050)
  at com.android.server.SystemServer.main(SystemServer.java:1371)
  at java.lang.reflect.Method.invokeNative(Native Method)
  at java.lang.reflect.Method.invoke(Method.java:515)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:794)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:610)
  at dalvik.system.NativeStart.main(Native Method)

"InputDispatcher" prio=10 tid=27 MONITOR
  | group="main" sCount=1 dsCount=0 obj=0x4c9c7d60 self=0x72010e50
  | sysTid=554 nice=-8 sched=0/0 cgrp=apps handle=1912287104
  | state=S schedstat=( 1007065539 96683590 71214 ) utm=22 stm=78 core=0
  at com.android.server.power.PowerManagerService.setScreenBrightnessOverrideFromWindowManagerInternal(PowerManagerService.java:~2206)
  - waiting to lock <0x4c6a8af0> (a java.lang.Object) held by tid=1 (main)
  at com.android.server.power.PowerManagerService.setScreenBrightnessOverrideFromWindowManager(PowerManagerService.java:2199)
  at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLockedInner(WindowManagerService.java:9818)
  at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLockedLoop(WindowManagerService.java:8566)
  at com.android.server.wm.WindowManagerService.performLayoutAndPlaceSurfacesLocked(WindowManagerService.java:8508)
  at com.android.server.wm.WindowManagerService.setNewConfiguration(WindowManagerService.java:3847)
  at com.android.server.am.ActivityManagerService.updateConfigurationLocked(ActivityManagerService.java:14490)
  at com.android.server.am.ActivityManagerService.updateConfiguration(ActivityManagerService.java:14375)
  at com.android.server.wm.WindowManagerService.sendNewConfiguration(WindowManagerService.java:6725)
  at com.android.server.wm.InputMonitor.notifyConfigurationChanged(InputMonitor.java:325)
  at com.android.server.input.InputManagerService.notifyConfigurationChanged(InputManagerService.java:1275)
  at dalvik.system.NativeStart.run(Native Method)


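       The trace clearly shows that the main and InputDispatcher threads are deadlocked on each other: main holds a lock object inside PowerManagerService (<0x4c6a8af0>) and is waiting for the ActivityManagerService monitor (<0x4c6b2630>), while InputDispatcher holds the AMS monitor and is waiting for that PMS lock. The call stacks show that both threads are using the AMS and PMS services. According to the previous article, AMS and PMS are both registered with the WTD for monitoring, so why did the WTD not detect the deadlock?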
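
The lock-order inversion seen in the trace can be reproduced outside Android. The following is only a minimal sketch on a plain JVM: the class name `LockOrderDeadlockDemo`, the two stand-in lock objects, and the thread names are hypothetical, and `ThreadMXBean` is a desktop/server JVM facility, not something the Android WTD uses.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class LockOrderDeadlockDemo {

    // Takes `first`, waits until the other thread also holds its first lock,
    // then blocks forever trying to take `second`.
    static Thread locker(String name, Object first, Object second,
                         CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread owns `second`.
                }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit despite the deadlock
        return t;
    }

    // Returns true once the JVM reports the two threads as monitor-deadlocked.
    static boolean reproduceAndDetect() {
        Object amsLock = new Object(); // stand-in for the AMS monitor
        Object pmsLock = new Object(); // stand-in for the lock inside PMS
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        // Same acquisition order as in the trace: main takes PMS then AMS,
        // InputDispatcher takes AMS then PMS.
        locker("main-like", pmsLock, amsLock, bothHoldFirst).start();
        locker("InputDispatcher-like", amsLock, pmsLock, bothHoldFirst).start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 100; i++) {
                long[] ids = threads.findMonitorDeadlockedThreads();
                if (ids != null && ids.length >= 2) {
                    return true;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }
}
```

Calling `LockOrderDeadlockDemo.reproduceAndDetect()` starts both threads and polls until the JVM reports them as deadlocked. The point of the sketch is that neither thread is doing anything wrong on its own; the hang comes only from the opposite acquisition order.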

 

       Going back to the previous article, here is the startup sequence of AMS and PMS, together with the point at which the WTD is started:

 

    public void initAndLoop() {
        try {
            // Wait for installd to finished starting up so that it has a chance to
            // create critical directories such as /data/user with the appropriate
            // permissions.  We need this to complete before we initialize other services.
            Slog.i(TAG, "Waiting for installd to be ready.");
            installer = new Installer();
            installer.ping();

            Slog.i(TAG, "Power Manager");
            power = new PowerManagerService();
            ServiceManager.addService(Context.POWER_SERVICE, power);

            Slog.i(TAG, "Activity Manager");
            context = ActivityManagerService.main(factoryTest);
        } catch (RuntimeException e) {
            Slog.e("System", "******************************************");
            Slog.e("System", "************ Failure starting bootstrap service", e);
        }

            // only initialize the power service after we have started the
            // lights service, content providers and the battery service.
            power.init(context, lights, ActivityManagerService.self(), battery,
                    BatteryStatsService.getService(),
                    ActivityManagerService.self().getAppOpsService(), display);

            Slog.i(TAG, "Init Watchdog");
            Watchdog.getInstance().init(context, battery, power, alarm,
                    ActivityManagerService.self());
            Watchdog.getInstance().addThread(wmHandler, "WindowManager thread");

        try {
            power.systemReady(twilight, dreamy);  // the main thread is blocked inside this call in the trace
        } catch (Throwable e) {
            reportWtf("making Power Manager Service ready", e);
        }

        ActivityManagerService.self().systemReady(new Runnable() {
            public void run() {
                Watchdog.getInstance().start();  // the WTD thread is only started here

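As SystemServer.java shows, the WTD thread is only started after many services have been registered. If a deadlock occurs during service registration, the WTD is not yet running and cannot detect it. That explains why the deadlock in the trace went unnoticed. The next step is to find a fix. I see roughly three options: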

 

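1. Start the WTD earlier, i.e. run it right after it has been instantiated and initialized, so that when a deadlock like the one above occurs, the WTD can detect it and kill the hung system_server process so that it restarts

2. Add a ReentrantLock to AMS and PMS and guard the functions at the positions shown in the trace with it: while PMS systemReady() holds the lock, setScreenBrightnessOverrideFromWindowManager does not request it, which avoids the deadlock

3. Block InputManagerService.notifyConfigurationChanged while services are being registered. I consider this less appropriate than option 2: the deadlock arose because a USB input device was attached to the system, and since USB devices are hot-pluggable, the moment at which they register cannot be controlled, which is what led to the deadlock above.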
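
Option 2 can be illustrated with a small sketch. It is not the actual PMS code: the class `BrightnessOverrideGuard` and its methods are hypothetical stand-ins, and it assumes that the caller can tolerate a skipped update and retry later. The idea is only that the brightness path uses `tryLock()` and backs off instead of blocking while the ready path owns the lock.

```java
import java.util.concurrent.locks.ReentrantLock;

class BrightnessOverrideGuard {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile int brightnessOverride = -1;

    // Stand-in for the systemReady() path: runs `body` while holding the lock.
    void runLocked(Runnable body) {
        lock.lock();
        try {
            body.run();
        } finally {
            lock.unlock();
        }
    }

    // Stand-in for setScreenBrightnessOverrideFromWindowManager():
    // returns false immediately instead of blocking when another thread
    // currently holds the lock, so the caller never waits on it.
    boolean trySetBrightnessOverride(int value) {
        if (!lock.tryLock()) {
            return false;
        }
        try {
            brightnessOverride = value;
            return true;
        } finally {
            lock.unlock();
        }
    }

    int getBrightnessOverride() {
        return brightnessOverride;
    }
}
```

Because the lock is reentrant, `tryLock()` succeeds when the calling thread already owns it; the back-off only applies to a different thread, which matches the main versus InputDispatcher situation in the trace.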

 

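       This article concentrates on option 1: starting the WTD earlier. The patch below implements that idea. Based on the analysis of the WTD source, the first thing to consider is the effect on system stability: in particular, whether running the WTD earlier affects how services started later use it, and whether the WTD's access to resources is problematic during this phase.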

 

--- a/frameworks/base/services/java/com/android/server/SystemServer.java
+++ b/frameworks/base/services/java/com/android/server/SystemServer.java
@@ -351,7 +351,9 @@ class ServerThread {
             Watchdog.getInstance().init(context, battery, power, alarm,
                     ActivityManagerService.self());
             Watchdog.getInstance().addThread(wmHandler, "WindowManager thread");
-
+               Watchdog.getInstance().start();
+               
             Slog.i(TAG, "Input Manager");

@@ -1165,8 +1167,8 @@ class ServerThread {
                 } catch (Throwable e) {
                     reportWtf("making Recognition Service ready", e);
                 }
-                Watchdog.getInstance().start();
-
+                //Watchdog.getInstance().start();
                 // It is now okay to let the various system services start their
                 // third party code...


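       After analyzing these questions, I believe the problems can be avoided, but on top of the patch above, Watchdog.java needs some additional changes. They are simple to implement, so only a brief description is given here: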

 

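1. Remove the thread-state check from the addMonitor and addThread interfaces; otherwise no monitors can be added to the WTD once it has started

2. Once the WTD is running, run() competes with addMonitor and addThread for the same lock, and run() has a long cycle, so the run() cycle needs to be adjusted while the system is booting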
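
The two changes above can be sketched as follows. This is a simplified plain-Java illustration under stated assumptions, not the real Watchdog.java: `MiniWatchdog`, `setCheckIntervalMs` and `isHung` are hypothetical names, monitors are stored in a lock-free list rather than guarded by the watchdog's own lock, and a detected hang only sets a flag instead of killing the process.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

class MiniWatchdog extends Thread {
    interface Monitor {
        void monitor(); // expected to return quickly unless the service is stuck
    }

    // A lock-free list, so addMonitor() never contends with run().
    private final List<Monitor> monitors = new CopyOnWriteArrayList<>();
    private volatile long checkIntervalMs;
    private volatile boolean hung;

    MiniWatchdog(long bootCheckIntervalMs) {
        super("mini-watchdog");
        setDaemon(true);
        this.checkIntervalMs = bootCheckIntervalMs;
    }

    // Change 1: no isAlive() check, so monitors can be added after start().
    void addMonitor(Monitor monitor) {
        monitors.add(monitor);
    }

    // Change 2: the cycle is adjustable, e.g. short during boot, longer afterwards.
    void setCheckIntervalMs(long intervalMs) {
        this.checkIntervalMs = intervalMs;
    }

    boolean isHung() {
        return hung;
    }

    @Override
    public void run() {
        while (!hung) {
            CountDownLatch done = new CountDownLatch(1);
            Thread checker = new Thread(() -> {
                for (Monitor m : monitors) {
                    m.monitor(); // blocks here if a monitored lock is stuck
                }
                done.countDown();
            }, "mini-watchdog-checker");
            checker.setDaemon(true);
            checker.start();
            try {
                Thread.sleep(checkIntervalMs);
            } catch (InterruptedException e) {
                return;
            }
            if (done.getCount() != 0) {
                // The real WTD would dump stacks and kill the process here.
                hung = true;
            }
        }
    }
}
```

With this shape the watchdog can be started immediately after construction, and services that come up later still register themselves through `addMonitor()`. Whether a short boot-time interval is acceptable depends on how long legitimate boot work can hold the monitored locks, which the sketch does not model.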

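       After reworking the WTD startup sequence according to the notes above, the system runs normally and the WTD works as expected. I ran 1,000 reboot tests and have seen no adverse effects so far.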

 

 

 

 
