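Today I ran into a very strange and very serious problem. It is intermittent and could not be reproduced on other phones, so the affected phone had to be kept in its original state (no reflashing, no power cycle). After a GOTA upgrade completed, the phone froze on the lock screen and did not respond to any taps. Double-pressing the Power key still opened the Camera, but the Camera preview was black.
Analysis of the problem:
[1] First check whether touch points are reported correctly, using the command adb shell getevent.
Run adb shell getevent:
Output: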
/dev/input/event2: 0000 0000 00000000
/dev/input/event2: 0003 0039 00000000
/dev/input/event2: 0003 0030 00000009
/dev/input/event2: 0003 0035 000000d5
/dev/input/event2: 0003 0036 0000020f
/dev/input/event2: 0000 0002 00000000
/dev/input/event2: 0000 0000 00000000
/dev/input/event2: 0003 0039 00000000
/dev/input/event2: 0003 0030 00000009
/dev/input/event2: 0003 0035 000000d8
/dev/input/event2: 0003 0036 00000218
/dev/input/event2: 0000 0002 00000000
/dev/input/event2: 0000 0000 00000000
/dev/input/event2: 0003 0039 00000000
/dev/input/event2: 0003 0030 00000009
/dev/input/event2: 0003 0035 000000db
/dev/input/event2: 0003 0036 0000021e
The data above confirms that the driver layer is working normally.
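To make the raw getevent output easier to read, here is a minimal sketch that decodes one line into a named event. The class name GetEventDecoder is hypothetical, and the name table only covers the codes that appear in the output above; the numeric values follow the Linux input event code definitions (EV_SYN = 0x00, EV_ABS = 0x03, ABS_MT_POSITION_X = 0x35, and so on).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: decodes lines like "/dev/input/event2: 0003 0035 000000d5". */
final class GetEventDecoder {
    // Multi-touch axis codes seen in the capture above (Linux input event codes).
    private static final Map<Integer, String> ABS_NAMES = new HashMap<>();
    static {
        ABS_NAMES.put(0x30, "ABS_MT_TOUCH_MAJOR");
        ABS_NAMES.put(0x35, "ABS_MT_POSITION_X");
        ABS_NAMES.put(0x36, "ABS_MT_POSITION_Y");
        ABS_NAMES.put(0x39, "ABS_MT_TRACKING_ID");
    }

    static String decode(String line) {
        // Layout: "<device>: <type> <code> <value>", the last three fields in hex.
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("unexpected getevent line: " + line);
        }
        int type = Integer.parseInt(parts[1], 16);
        int code = Integer.parseInt(parts[2], 16);
        long value = Long.parseLong(parts[3], 16);

        if (type == 0x00) { // EV_SYN: frame separators
            if (code == 0x00) return "SYN_REPORT";
            if (code == 0x02) return "SYN_MT_REPORT";
            return String.format("EV_SYN code 0x%02x", code);
        }
        if (type == 0x03) { // EV_ABS: absolute axis values
            String name = ABS_NAMES.getOrDefault(code, String.format("ABS code 0x%02x", code));
            return name + " = " + value;
        }
        return String.format("type 0x%02x code 0x%02x value %d", type, code, value);
    }

    public static void main(String[] args) {
        System.out.println(decode("/dev/input/event2: 0003 0035 000000d5"));
        System.out.println(decode("/dev/input/event2: 0003 0036 0000020f"));
        System.out.println(decode("/dev/input/event2: 0000 0002 00000000"));
    }
}
```

Decoded this way, the capture shows X and Y coordinates changing between frames, which is what a working touch driver is expected to report.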
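[2] Turn on "Pointer location" from Developer options. The screen is frozen, so the Settings app cannot be opened. The workaround is to write the setting directly from the command line.
Command: adb shell settings put system pointer_location 1
The bar at the top of the screen showed no x, y position information at all, and the screen itself did not react either.
From this we can infer that the problem lies in how the input framework reports or dispatches events.
[3] Check what the input framework reports. The log contains: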
05:49:12.491080 1853 1945 I InputDispatcher: Dropped event because input dispatch is disabled.
Using this log message, locate the originating code:
frameworks/native/services/inputflinger/InputDispatcher.cpp
void InputDispatcher::dropInboundEventLocked(EventEntry* entry, DropReason dropReason) {
    const char* reason;
    switch (dropReason) {
    case DROP_REASON_POLICY:
#if DEBUG_INBOUND_EVENT_DETAILS
        ALOGD("Dropped event because policy consumed it.");
#endif
        reason = "inbound event was dropped because the policy consumed it";
        break;
    case DROP_REASON_DISABLED:
        if (mLastDropReason != DROP_REASON_DISABLED) {
            ALOGI("Dropped event because input dispatch is disabled.");
        }
        ...
}
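This shows that input dispatch has been disabled. The variable that controls dispatch is mDispatchEnabled.
The log says input dispatch was disabled at that moment, but what is the state right now?
Looking at the source, there is a dump interface, and it is implemented.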
frameworks/native/services/inputflinger/InputDispatcher.cpp
void InputDispatcher::dump(std::string& dump) {
    AutoMutex _l(mLock);
    dump += "Input Dispatcher State:\n";
    dumpDispatchStateLocked(dump);
    if (!mLastANRState.empty()) {
        dump += "\nInput Dispatcher State at time of last ANR:\n";
        dump += mLastANRState;
    }
    /// M: Switch log by command @{
    switchInputLog();
    /// @}
}
void InputDispatcher::dumpDispatchStateLocked(std::string& dump) {
    dump += StringPrintf(INDENT "DispatchEnabled: %d\n", mDispatchEnabled);
    dump += StringPrintf(INDENT "DispatchFrozen: %d\n", mDispatchFrozen);
    ...
}
So the current value of mDispatchEnabled can be dumped.
Command: adb shell dumpsys input
Find the DispatchEnabled entry in the output: DispatchEnabled: 0
This confirms that input dispatch really is disabled. So who disabled it?
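When the dump is long, it can help to pull a single field out of the captured text programmatically. The following is a minimal sketch, assuming the dump has already been saved as a string; the class name DumpsysFieldReader is hypothetical, and the sample text only mirrors the format strings shown in dumpDispatchStateLocked above.

```java
import java.util.Optional;

/** Hypothetical helper: finds "Key: value" entries in captured dumpsys text. */
final class DumpsysFieldReader {
    /** Returns the text after "key:" on the first line that starts with that key. */
    static Optional<String> find(String dump, String key) {
        String prefix = key + ":";
        for (String line : dump.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(prefix)) {
                return Optional.of(trimmed.substring(prefix.length()).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Sample text shaped like the dump() output; not a full capture.
        String sample = "Input Dispatcher State:\n  DispatchEnabled: 0\n";
        String value = find(sample, "DispatchEnabled").orElse("<missing>");
        System.out.println("DispatchEnabled = " + value);
        System.out.println("dispatch disabled: " + "0".equals(value));
    }
}
```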
[4] Find the methods that modify mDispatchEnabled.
frameworks/native/services/inputflinger/InputDispatcher.cpp
void InputDispatcher::setInputDispatchMode(bool enabled, bool frozen) {
    ...
    mDispatchEnabled = enabled;
    ...
}
This is the only place where the variable is set.
Keep tracing upward to see who calls setInputDispatchMode.
frameworks/base/services/core/jni/com_android_server_input_InputManagerService.cpp
void NativeInputManager::setInputDispatchMode(bool enabled, bool frozen) {
    mInputManager->getDispatcher()->setInputDispatchMode(enabled, frozen);
}
static void nativeSetInputDispatchMode(JNIEnv* /* env */,
        jclass /* clazz */, jlong ptr, jboolean enabled, jboolean frozen) {
    NativeInputManager* im = reinterpret_cast<NativeInputManager*>(ptr);
    im->setInputDispatchMode(enabled, frozen);
}
Having reached nativeSetInputDispatchMode, there must be a corresponding method in the Java layer.
frameworks/base/services/core/java/com/android/server/input/InputManagerService.java
public void setInputDispatchMode(boolean enabled, boolean frozen) {
    nativeSetInputDispatchMode(mPtr, enabled, frozen);
}
frameworks/base/services/core/java/com/android/server/wm/InputMonitor.java
private void updateInputDispatchModeLw() {
    mService.mInputManager.setInputDispatchMode(mInputDispatchEnabled, mInputDispatchFrozen);
}
public void setEventDispatchingLw(boolean enabled) {
    if (mInputDispatchEnabled != enabled) {
        /// M: Add more log at WMS
        if (DEBUG_INPUT || DEBUG_BOOT) {
            Slog.v(TAG_WM, "Setting event dispatching to " + enabled);
        }
        mInputDispatchEnabled = enabled;
        updateInputDispatchModeLw();
    }
}
frameworks/base/services/core/java/com/android/server/wm/WindowManagerService.java
public void setEventDispatching(boolean enabled) {
    if (!checkCallingPermission(MANAGE_APP_TOKENS, "setEventDispatching()")) {
        throw new SecurityException("Requires MANAGE_APP_TOKENS permission");
    }
    synchronized (mWindowMap) {
        mEventDispatchingEnabled = enabled;
        if (mDisplayEnabled) {
            mInputMonitor.setEventDispatchingLw(enabled);
        }
    }
}
frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
private void updateEventDispatchingLocked() {
    mWindowManager.setEventDispatching(mBooted && !mShuttingDown);
}
The trace ends in AMS: updateEventDispatchingLocked decides whether input is disabled, and the decision depends on the variables mBooted and mShuttingDown.
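The condition passed down the call chain is simply mBooted && !mShuttingDown. As a small illustration, the sketch below reproduces only that boolean expression and prints its truth table; the class and method names are hypothetical and nothing here calls real framework code.

```java
/** Hypothetical model of the expression in updateEventDispatchingLocked(). */
final class DispatchGateModel {
    /** Mirrors "mBooted && !mShuttingDown" from the AMS excerpt above. */
    static boolean eventDispatchingEnabled(boolean booted, boolean shuttingDown) {
        return booted && !shuttingDown;
    }

    public static void main(String[] args) {
        boolean[] states = {false, true};
        for (boolean booted : states) {
            for (boolean shuttingDown : states) {
                System.out.println("mBooted=" + booted
                        + " mShuttingDown=" + shuttingDown
                        + " -> dispatch enabled: "
                        + eventDispatchingEnabled(booted, shuttingDown));
            }
        }
    }
}
```

Dispatch is enabled in exactly one of the four combinations: booted and not shutting down. A device that has finished booting can therefore only lose input through this path if mShuttingDown is true.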
To learn the current values of mBooted and mShuttingDown, check whether a dump prints them.
ActivityManagerService.java does print both values.
void dumpProcessesLocked(FileDescriptor fd, PrintWriter pw, String[] args,
        int opti, boolean dumpAll, String dumpPackage, int dumpAppId) {
    ...
    if (dumpAll) {
        pw.println(" Total persistent processes: " + numPers);
        pw.println(" mProcessesReady=" + mProcessesReady
                + " mSystemReady=" + mSystemReady
                + " mBooted=" + mBooted
        ...
        pw.println(" mShuttingDown=" + mShuttingDown + " mTestPssMode=" + mTestPssMode);
        ...
}
Since they are printed, the current state of both values can be dumped.
Command: adb shell dumpsys
mShuttingDown=true
mBooted=true
So input is disabled because mShuttingDown is true.
Next, look at what can set mShuttingDown to true. In ActivityManagerService.java there is only one case:
frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
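@Override
public boolean shutdown(int timeout) {
    ...
    synchronized(this) {
        mShuttingDown = true;
        mStackSupervisor.prepareForShutdownLocked();
        updateEventDispatchingLocked();
        timedout = mStackSupervisor.shutdownLocked(timeout);
    }
    ...
}
In other words, mShuttingDown is set to true only when shutdown is called.
So the phone failed to shut down normally and has been stuck in the shutdown sequence ever since.
[5] To verify the guess that the phone has been shutting down the whole time, check whether the PID from the shutdown log still exists.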
07-20 05:04:48.445651 1853 1869 D ShutdownThread: Notifying thread to start shutdown longPressBehavior=1
...
07-20 22:48:34.353 1853 2822 D MtkShutdownThread: waitForPlayAnimation still wait service.bootanim.completed
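Nearly 18 hours later, PID 1853 still exists. That is enough to show the system has been in the shutdown sequence the whole time without ever powering off, so it is probably stuck in an infinite loop.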
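The gap between the two log lines can be computed rather than estimated by eye. Here is a minimal sketch, assuming both lines start with "MM-dd HH:mm:ss.fraction" and fall on the same calendar day (logcat timestamps carry no year); the class name LogcatElapsed is hypothetical.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Hypothetical helper: elapsed time between two same-day logcat lines. */
final class LogcatElapsed {
    static Duration between(String firstLine, String lastLine) {
        // Fields: date, time, then the rest of the log line.
        String[] a = firstLine.trim().split("\\s+", 3);
        String[] b = lastLine.trim().split("\\s+", 3);
        if (!a[0].equals(b[0])) {
            // Assumption of this sketch: no handling for lines on different days.
            throw new IllegalArgumentException("lines are from different days");
        }
        // LocalTime.parse accepts both 3-digit and 6-digit fractions.
        return Duration.between(LocalTime.parse(a[1]), LocalTime.parse(b[1]));
    }

    public static void main(String[] args) {
        Duration d = between(
                "07-20 05:04:48.445651 1853 1869 D ShutdownThread: Notifying thread to start shutdown longPressBehavior=1",
                "07-20 22:48:34.353 1853 2822 D MtkShutdownThread: waitForPlayAnimation still wait service.bootanim.completed");
        System.out.println(d.toHours() + " h " + (d.toMinutes() % 60) + " min");
    }
}
```

For the two lines quoted above this gives 17 hours and 43 minutes between the start of shutdown and the last wait message.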
[6] Analyze the infinite loop.
07-20 06:08:03.191201 1853 2822 D MtkShutdownThread: waitForPlayAnimation still wait service.bootanim.completed
07-20 06:08:04.191517 1853 2822 D MtkShutdownThread: waitForPlayAnimation still wait service.bootanim.completed
This message is printed at a fixed interval, once per second.
Find the corresponding code:
vendor/mediatek/proprietary/frameworks/base/services/core/java/com/mediatek/server/MtkShutdownThread.java
private static void waitForPlayAnimation() {
    if(JourneyCustomFeature.SHUTDOWN_ANIMATION_WAIT) {
        boolean bootanim_completed = false;
        Log.d(TAG, "waitForPlayAnimation wait service.bootanim.completed change to 1");
        while(!bootanim_completed) {
            try {
                // bootanimate will set this to true after all completed.
                bootanim_completed = SystemProperties.getBoolean("service.bootanim.completed", false);
                if(!bootanim_completed) {
                    Thread.currentThread().sleep(1000);
                    Log.d(TAG, "waitForPlayAnimation still wait service.bootanim.completed");
                } else {
                    Log.d(TAG, "waitForPlayAnimation wait bootanim_completed");
                    return;
                }
            } catch (InterruptedException e) {
                Log.e(TAG, "Shutdown stop bootanimation Thread.currentThread().sleep exception!");
            }
        }
    }
}
This establishes that the while loop, which never exits as long as service.bootanim.completed stays false, is tied to the failure to shut down.
[7] Use the code to work out how the while loop relates to the failed shutdown:
vendor/mediatek/proprietary/frameworks/base/services/core/java/com/mediatek/server/MtkShutdownThread.java
private void shutdownAnimationService() {
    ...
    waitForPlayAnimation();
    ...
}
@Override
protected void mShutdownSeqFinish(Context context) {
    ...
    shutdownAnimationService();
    ...
}
mShutdownSeqFinish() is called from only one place:
frameworks/base/services/core/java/com/android/server/power/ShutdownThread.java
public void run() {
    ...
    ///M: added for Shutdown Enhancement@{
    mShutdownSeqFinish(mContext);
    /// @}
    shutdownTimingLog.traceEnd(); // SystemServerShutdown
    metricEnded(METRIC_SYSTEM_SERVER);
    saveMetrics(mReboot, mReason);
    // Remaining work will be done by init, including vold shutdown
    rebootOrShutdown(mContext, mReboot, mReason);
    ...
}
At this point the code makes the cause clear: mShutdownSeqFinish() never returns because of the loop, so rebootOrShutdown() is never reached and the shutdown thread waits forever. That is what causes the whole chain of symptoms.
[8] Simulate in code and verify:
In vendor/mediatek/proprietary/frameworks/base/services/core/java/com/mediatek/server/MtkShutdownThread.java, change waitForPlayAnimation() into an unconditional infinite loop to simulate the scenario.
Running this on a normal phone does produce the same behavior as the problem device.
[9] Fix the infinite loop, and the bug is fixed.
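The post does not say how the loop was actually fixed. One common way to make such a wait safe is to bound it with a deadline, so the caller can proceed even if the condition never becomes true. The following is a minimal sketch of that idea in plain Java, not the vendor's fix: the class BoundedWait, its parameters, and the timeout values are all assumptions, and the BooleanSupplier stands in for the SystemProperties read.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: poll a condition, but give up after a deadline. */
final class BoundedWait {
    /**
     * Polls until the condition is true or timeoutMs has elapsed.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // timed out: let the caller continue anyway
            }
            try {
                Thread.sleep(Math.max(1, Math.min(pollMs, remainingMs)));
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop waiting instead of looping on.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A condition that never becomes true, like the stuck property.
        boolean done = waitUntil(() -> false, 200, 50);
        System.out.println("condition met: " + done + ", continuing with shutdown");
    }
}
```

With a bound like this, a property that is never set delays the shutdown by at most the timeout instead of blocking it indefinitely.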
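Summary: when you hit a frozen-screen problem, don't panic. Use the dump commands to capture the current state and analyze it step by step.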