kernel空指针引起的system_server watchdog问题分析

最新推荐文章于 2024-07-13 15:49:46 发布

aa787282301

最新推荐文章于 2024-07-13 15:49:46 发布

阅读量4.5k

点赞数 2

分类专栏： kernel 安卓系统文章标签： linux android

本文链接：https://blog.csdn.net/aa787282301/article/details/79431196

版权

本文分析了手机kernel启动时因空指针导致的system_server watchdog问题。问题源于surfaceflinger主线程等待中断处理函数，但该函数因空指针未执行完毕，进而阻塞了system_server。通过分析system_server和surfaceflinger的trace，发现kernel中synaptics_rmi4_add_and_update_tp_data方法和kthread_data方法出现了空指针错误。深入研究发现，中断处理函数引发的空指针异常使进程陷入D状态，修复这个问题的关键在于解决最初的kernel空指针异常。

摘要由CSDN通过智能技术生成

一. 问题描述

1.1 现象

　　手机kernel启动时狂点屏幕会导致手机卡死重启．

1.3 结论

　　因为surfaceflinger主线程在等中断处理函数执行执行结束，而中断处理函数一直没有执行完，所以导致surfaceflinger的主线程卡住．这时候system_server通过binder跨进程去调用 surfaceflinger中的方法时，由于surfaceflinger卡住了，所以system_server也会被卡住，最终导致了system_server watchdog．

1.4修复链接：

　　XXX

二. 初步分析

2.1 查看system_server trace

　　一般对于watchdog问题，我们都会先看看system_server的那一个线程出了问题，出现问题的具体调用栈如下：

从这里我们能看到system_server的binder线程在向surfaceflinger发起跨进程调用时被block了，具体的调用方法是createSurface，那么接下来我们就去看看surfaceflinger的trace，看看它在干什么事情．

2.1 查看surfaceflinger trace

　　我下了好几份log，都没有发现watchdog文件中有surfaceflinger的trace，没有能继续往下分析，只能先去找没有打印出surfaceflinger trace的原因.
2.2 surfaceflinger打不出trace的怀疑点
①debuggerd出问题了？
不会出问题，要是它出了问题，其他native进程也不会打印出对应的trace．
②watchdog类里把中surfaceflinger这个要抓trace的进程给去掉了？然后查看了下c8对应的代码：

surfaceflinger这个进程还在默认的列表里面，这就比较奇怪了，然后再继续下载了几十份Log，然而没有一份log有surfaceflinger的trace.
继续看下相应的Log，debuggerd确实打印了相应的log

12-29 20:42:17.814 695 831 W SurfaceFlinger: setTransactionState timed out waiting for previous animation frame
12-29 20:42:19.695 2912 3259 W PushService: 2017-12-29 20:42:19,695 - [WARN::PushService] - [Thread:99] [Slim] Wrote {cmd=SECMSG;chid=5;len=341}
12-29 20:42:19.695 2912 3259 W PushService: 2017-12-29 20:42:19,695 - [WARN::PushService] - [Thread:99] [Slim] 08:42:19 ä¸å Sent Blob [SECMSG,5,5QbAV-13]
12-29 20:42:19.695 933 5418 I ExtendedUtils: printFileName fd(15) -> /system/media/audio/notifications/CrystalRing.ogg
12-29 20:42:19.695 2912 3259 W PushService: 2017-12-29 20:42:19,695 - [WARN::PushService] - [Thread:99] try send mi push message. packagename:com.miui.powerkeeper action:Notification
12-29 20:42:19.696 2912 3259 W PushService: 2017-12-29 20:42:19,695 - [WARN::PushService] - [Thread:99] [Slim] Wrote {cmd=SECMSG;chid=5;len=349}
12-29 20:42:22.955 695 831 W SurfaceFlinger: setTransactionState timed out waiting for previous animation frame
12-29 20:43:16.835 591 591 W : debuggerd: handling request: pid=695 uid=1000 gid=1003 tid=695
12-29 20:43:26.849 591 591 W : debuggerd: resuming target 695
12-29 20:43:26.854 591 591 W : debuggerd: handling request: pid=923 uid=1041 gid=1005 tid=923
12-29 20:43:26.857 590 590 W : debuggerd: handling request: pid=923 uid=1041 gid=1005 tid=923
12-29 20:43:26.999 591 591 W : debuggerd: handling request: pid=924 uid=1047 gid=1005 tid=924
12-29 20:43:27.000 590 590 W : debuggerd: handling request: pid=924 uid=1047 gid=1005 tid=924
12-29 20:43:27.106 591 591 W : debuggerd: handling request: pid=925 uid=1019 gid=1019 tid=925
12-29 20:43:27.107 590 590 W : debuggerd: handling request: pid=925 uid=1019 gid=1019 tid=925
12-29 20:43:27.120 591 591 W : debuggerd: handling request: pid=926 uid=0 gid=0 tid=926
12-29 20:43:57.857 5557 5557 W : debuggerd: handling request: pid=695 uid=1000 gid=1003 tid=695
12-29 20:44:07.870 5557 5557 W : debuggerd: resuming target 695
12-29 20:44:07.876 5557 5557 W : debuggerd: handling request: pid=923 uid=1041 gid=1005 tid=923
12-29 20:44:07.878 590 590 W : debuggerd: handling request: pid=923 uid=1041 gid=1005 tid=923
12-29 20:44:07.931 5557 5557 W : debuggerd: handling request: pid=924 uid=1047 gid=1005 tid=924
12-29 20:44:07.932 590 590 W : debuggerd: handling request: pid=924 uid=1047 gid=1005 tid=924
12-29 20:44:08.044 5557 5557 W : debuggerd: handling request: pid=925 uid=1019 gid=1019 tid=925
12-29 20:44:08.045 590 590 W : debuggerd: handling request: pid=925 uid=1019 gid=1019 tid=925
12-29 20:44:08.056 5557 5557 W : debuggerd: handling request: pid=926 uid=0 gid=0 tid=926