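In part one of this Android ANR series, we said that ANRs fall into three categories: Input, Receiver, and Service. This post looks in detail at how an Input event timeout leads to an ANR. Only by understanding the mechanism behind Input timeouts can we properly analyze and fix the problems we run into in real development. The analysis is based on the Android 8.1 source code and is deliberately brief.
Before analyzing Input timeouts, here is a short introduction to the Android Input system. Broadly, there are two kinds of events: physical key events and touchscreen events. By event type they can be subdivided further into basic hardware keys (power, volume up/down, recents, back, home), physical keyboard keys, screen taps (single-touch and multi-touch), screen swipes, and so on. Three members of the Input system are especially important: EventHub, InputReader, and InputDispatcher, each with its own job. EventHub monitors /dev/input for Input events. InputReader reads events from EventHub and hands them to InputDispatcher. InputDispatcher then dispatches each event to the appropriate target window (for key events, the window that currently holds focus). The real work these three do is of course far more involved than this summary.
Now that we know what Input is, we can look at how an Input timeout causes an ANR. When we say "Input timeout" we always mean an Input event dispatch timeout, so both the timeout calculation and the ANR trigger live in the InputDispatcher class. Its source path is /frameworks/native/services/inputflinger/InputDispatcher.cpp. Dispatching works by running InputDispatcherThread's threadLoop over and over; each pass calls dispatchOnce to read and dispatch Input events. When there is no Input event, it calls mLooper->pollOnce and goes to sleep. This is the same as the Looper on an Android app's UI main thread: when the MessageQueue has no messages, the thread waits in nativePollOnce, which in the end also calls Looper->pollOnce.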
bool InputDispatcherThread::threadLoop() {
    mDispatcher->dispatchOnce();
    return true;
}
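dispatchOnce itself is not quoted in this article, so the following is only a rough model of the "sleep until something needs doing" step. It assumes the dispatcher turns the next wakeup time (the nextWakeupTime value that appears in the excerpts later on) into a millisecond timeout for pollOnce. The function name pollTimeoutMillis and the exact rounding are assumptions made for illustration, not the platform code.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

using nsecs_t = std::int64_t;  // nanoseconds, mirroring the type used in the excerpts

// Hypothetical helper: convert "wake me at nextWakeupTime" into a poll timeout.
// Returns -1 for "block until woken", 0 for "do not block at all".
int pollTimeoutMillis(nsecs_t now, nsecs_t nextWakeupTime) {
    if (nextWakeupTime <= now) {
        return 0;  // already due, e.g. nextWakeupTime == LLONG_MIN
    }
    // Subtract in unsigned arithmetic so extreme values cannot overflow.
    const std::uint64_t delayNs =
            static_cast<std::uint64_t>(nextWakeupTime) - static_cast<std::uint64_t>(now);
    const std::uint64_t maxNs = static_cast<std::uint64_t>(INT_MAX - 1) * 1000000ULL;
    if (delayNs > maxNs) {
        return -1;  // too far away to express in an int: wait indefinitely
    }
    // Round up so we never wake before the deadline.
    return static_cast<int>((delayNs + 999999ULL) / 1000000ULL);
}
```

In this model, a wakeup time of LLONG_MAX means "nothing scheduled, sleep until an event arrives", while LLONG_MIN means "run another pass immediately". Both values show up again in the timeout handling further down.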
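We know that InputDispatcher keeps reading and dispatching Input events through dispatchOnce, so let's go straight to InputDispatcher::dispatchOnceInnerLocked. The code below is not the complete method; only the key parts are listed, and the whole walkthrough uses key events as the main example.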
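void InputDispatcher::dispatchOnceInnerLocked(nsecs_t* nextWakeupTime) {
    // Timestamp taken at the start of this dispatch pass. Important: it is passed
    // along as a parameter to the methods that follow.
    nsecs_t currentTime = now();

    // Ready to start a new event.
    // If we don't already have a pending event, go grab one.
    if (! mPendingEvent) { // entered only when no event is currently being dispatched
        // ... (fetching the next event into mPendingEvent is omitted here) ...

        // As the source comment says, a new Input event has just been picked up,
        // so the ANR timeout bookkeeping is reset.
        // Get ready to dispatch the event.
        resetANRTimeoutsLocked();
    }

    // Declared in the full source before the switch; left out of the original excerpt.
    bool done = false;
    DropReason dropReason = DROP_REASON_NOT_DROPPED;

    switch (mPendingEvent->type) {
    case EventEntry::TYPE_KEY: {
        KeyEntry* typedEntry = static_cast<KeyEntry*>(mPendingEvent);
        // We have an Input event, so dispatch it.
        done = dispatchKeyLocked(currentTime, typedEntry, &dropReason, nextWakeupTime);
        break;
    }
    }

    if (done) {
        if (dropReason != DROP_REASON_NOT_DROPPED) {
            // Side note: the "drop ... event" log lines you see generally come from here.
            dropInboundEventLocked(mPendingEvent, dropReason);
        }
    }
}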
Now look at resetANRTimeoutsLocked:
void InputDispatcher::resetANRTimeoutsLocked() {
    // Reset input target wait timeout.
    mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_NONE;
    mInputTargetWaitApplicationHandle.clear();
}
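The interplay between the pending event and the reset can be sketched with a toy model. Everything here (MiniDispatcher, its fields, the targetReady flag) is hypothetical and only mirrors the shape of the two excerpts above: the wait state is cleared when a new event is picked up, and an event that cannot be delivered stays pending.

```cpp
#include <cassert>
#include <deque>
#include <optional>

enum class WaitCause { None, ApplicationNotReady };

// Toy stand-in for the dispatcher state discussed above (all names are assumptions).
struct MiniDispatcher {
    std::deque<int> inbound;      // queued event ids
    std::optional<int> pending;   // plays the role of mPendingEvent
    WaitCause waitCause = WaitCause::None;
    int resets = 0;               // how many times the wait state was reset

    // One dispatch pass. Returns true if an event was delivered.
    bool dispatchOnce(bool targetReady) {
        if (!pending) {
            if (inbound.empty()) {
                return false;  // nothing to do
            }
            pending = inbound.front();
            inbound.pop_front();
            waitCause = WaitCause::None;  // analogous to resetANRTimeoutsLocked()
            ++resets;
        }
        if (!targetReady) {
            // Keep the event pending; the next pass will not reset the wait state.
            waitCause = WaitCause::ApplicationNotReady;
            return false;
        }
        pending.reset();  // delivered, so the next pass may pick up a new event
        return true;
    }
};
```

Calling dispatchOnce(false) twice in a row on a non-empty queue leaves resets at 1, which is the property the later timeout check relies on: the wait that started on the first failed attempt keeps running until the event is delivered or the deadline passes.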
Dispatch then continues in dispatchKeyLocked:
bool InputDispatcher::dispatchKeyLocked(nsecs_t currentTime, KeyEntry* entry,
        DropReason* dropReason, nsecs_t* nextWakeupTime) {
    // Identify targets.
    Vector<InputTarget> inputTargets;
    // Find the window that currently has focus and, depending on its state,
    // possibly trigger an ANR.
    int32_t injectionResult = findFocusedWindowTargetsLocked(currentTime,
            entry, inputTargets, nextWakeupTime);

    addMonitoringTargetsLocked(inputTargets);

    // Dispatch the key.
    dispatchEventLocked(currentTime, entry, inputTargets); // continue the dispatch flow
    return true;
}
The method to focus on is findFocusedWindowTargetsLocked:
int32_t InputDispatcher::findFocusedWindowTargetsLocked(nsecs_t currentTime,
        const EventEntry* entry, Vector<InputTarget>& inputTargets, nsecs_t* nextWakeupTime) {
    int32_t injectionResult;
    String8 reason;

    // If there is no currently focused window and no focused application
    // then drop the event.
    if (mFocusedWindowHandle == NULL) {
        if (mFocusedApplicationHandle != NULL) {
            // Does this look familiar? ANRs with a log like this turn up often in
            // monkey tests. It is the typical "focused application but no window"
            // ANR. To understand it you need to know the Android app launch flow
            // (a detailed article on that is planned). These ANRs usually happen on
            // an app's first launch. On the app side, check whether the overridden
            // Application onCreate, or onCreate/onStart of the launch screen, does
            // any time-consuming work. A slow launch caused by the system itself
            // can of course also lead directly to this ANR.
            injectionResult = handleTargetsNotReadyLocked(currentTime, entry,
                    mFocusedApplicationHandle, NULL, nextWakeupTime,
                    "Waiting because no window has focus but there is a "
                    "focused application that may eventually add a window "
                    "when it finishes starting up.");
            goto Unresponsive;
        }

        ALOGI("Dropping event because there is no focused window or focused application.");
        injectionResult = INPUT_EVENT_INJECTION_FAILED;
        goto Failed;
    }

    // Check whether the window is ready for more input.
    // This is where the more detailed and more varied ANR conditions are checked.
    reason = checkWindowReadyForMoreInputLocked(currentTime,
            mFocusedWindowHandle, entry, "focused");
    if (!reason.isEmpty()) {
        // A non-empty reason means the window is not ready. If this state lasts
        // past the timeout, the application gets an ANR.
        injectionResult = handleTargetsNotReadyLocked(currentTime, entry,
                mFocusedApplicationHandle, mFocusedWindowHandle, nextWakeupTime, reason.string());
        goto Unresponsive;
    }

    // Success! Output targets.
    injectionResult = INPUT_EVENT_INJECTION_SUCCEEDED;
    addWindowTargetLocked(mFocusedWindowHandle,
            InputTarget::FLAG_FOREGROUND | InputTarget::FLAG_DISPATCH_AS_IS, BitSet32(0),
            inputTargets);

    // Done.
Failed:
Unresponsive:
    nsecs_t timeSpentWaitingForApplication = getTimeSpentWaitingForApplicationLocked(currentTime);
    updateDispatchStatisticsLocked(currentTime, entry,
            injectionResult, timeSpentWaitingForApplication);
#if DEBUG_FOCUS
    ALOGD("findFocusedWindow finished: injectionResult=%d, "
            "timeSpentWaitingForApplication=%0.1fms",
            injectionResult, timeSpentWaitingForApplication / 1000000.0);
#endif
    return injectionResult;
}
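Stripped of logging and bookkeeping, the branching in this method boils down to a small decision table. The sketch below is a simplification under stated assumptions: FocusState and classifyFocus are made-up names, windowReady stands for "checkWindowReadyForMoreInputLocked returned an empty reason", and the not-ready cases are all reported as Pending even though the real method can also report a timed-out result.

```cpp
#include <cassert>

enum class FocusResult { Succeeded, Failed, Pending };

// Hypothetical snapshot of what the dispatcher knows about focus.
struct FocusState {
    bool hasFocusedWindow;
    bool hasFocusedApplication;
    bool windowReady;  // only meaningful when hasFocusedWindow is true
};

FocusResult classifyFocus(const FocusState& s) {
    if (!s.hasFocusedWindow) {
        // App but no window: wait, the app may still be starting up.
        // Neither app nor window: nothing to wait for, drop the event.
        return s.hasFocusedApplication ? FocusResult::Pending : FocusResult::Failed;
    }
    // A focused window exists: deliver if it can take more input, otherwise wait.
    return s.windowReady ? FocusResult::Succeeded : FocusResult::Pending;
}
```

Only the two Pending cases start or continue an ANR wait; the Failed case drops the event without any timeout.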
checkWindowReadyForMoreInputLocked distinguishes the various not-ready conditions that can end in an ANR. I won't go through it in detail here, since the comments in the source are clear enough.
String8 InputDispatcher::checkWindowReadyForMoreInputLocked(nsecs_t currentTime,
        const sp<InputWindowHandle>& windowHandle, const EventEntry* eventEntry,
        const char* targetType) {
    // If the window is paused then keep waiting.
    if (windowHandle->getInfo()->paused) {
        return String8::format("Waiting because the %s window is paused.", targetType);
    }

    // If the window's connection is not registered then keep waiting.
    ssize_t connectionIndex = getConnectionIndexLocked(windowHandle->getInputChannel());
    if (connectionIndex < 0) {
        return String8::format("Waiting because the %s window's input channel is not "
                "registered with the input dispatcher. The window may be in the process "
                "of being removed.", targetType);
    }

    // If the connection is dead then keep waiting.
    sp<Connection> connection = mConnectionsByFd.valueAt(connectionIndex);
    if (connection->status != Connection::STATUS_NORMAL) {
        return String8::format("Waiting because the %s window's input connection is %s."
                "The window may be in the process of being removed.", targetType,
                connection->getStatusLabel());
    }

    // If the connection is backed up then keep waiting.
    if (connection->inputPublisherBlocked) {
        return String8::format("Waiting because the %s window's input channel is full. "
                "Outbound queue length: %d. Wait queue length: %d.",
                targetType, connection->outboundQueue.count(), connection->waitQueue.count());
    }

    // Ensure that the dispatch queues aren't too far backed up for this event.
    if (eventEntry->type == EventEntry::TYPE_KEY) {
        // If the event is a key event, then we must wait for all previous events to
        // complete before delivering it because previous events may have the
        // side-effect of transferring focus to a different window and we want to
        // ensure that the following keys are sent to the new window.
        //
        // Suppose the user touches a button in a window then immediately presses "A".
        // If the button causes a pop-up window to appear then we want to ensure that
        // the "A" key is delivered to the new pop-up window. This is because users
        // often anticipate pending UI changes when typing on a keyboard.
        // To obtain this behavior, we must serialize key events with respect to all
        // prior input events.
        if (!connection->outboundQueue.isEmpty() || !connection->waitQueue.isEmpty()) {
            return String8::format("Waiting to send key event because the %s window has not "
                    "finished processing all of the input events that were previously "
                    "delivered to it. Outbound queue length: %d. Wait queue length: %d.",
                    targetType, connection->outboundQueue.count(), connection->waitQueue.count());
        }
    } else {
        // Touch events can always be sent to a window immediately because the user intended
        // to touch whatever was visible at the time. Even if focus changes or a new
        // window appears moments later, the touch event was meant to be delivered to
        // whatever window happened to be on screen at the time.
        //
        // Generic motion events, such as trackball or joystick events are a little trickier.
        // Like key events, generic motion events are delivered to the focused window.
        // Unlike key events, generic motion events don't tend to transfer focus to other
        // windows and it is not important for them to be serialized. So we prefer to deliver
        // generic motion events as soon as possible to improve efficiency and reduce lag
        // through batching.
        //
        // The one case where we pause input event delivery is when the wait queue is piling
        // up with lots of events because the application is not responding.
        // This condition ensures that ANRs are detected reliably.
        if (!connection->waitQueue.isEmpty()
                && currentTime >= connection->waitQueue.head->deliveryTime
                        + STREAM_AHEAD_EVENT_TIMEOUT) {
            return String8::format("Waiting to send non-key event because the %s window has not "
                    "finished processing certain input events that were delivered to it over "
                    "%0.1fms ago. Wait queue length: %d. Wait queue head age: %0.1fms.",
                    targetType, STREAM_AHEAD_EVENT_TIMEOUT * 0.000001f,
                    connection->waitQueue.count(),
                    (currentTime - connection->waitQueue.head->deliveryTime) * 0.000001f);
        }
    }
    return String8::empty();
}
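The last two checks treat key events and non-key events differently, which is worth seeing side by side. The sketch below models only those two checks. ConnectionModel and both function names are invented for illustration, and the 0.5 s value for the stream-ahead timeout is an assumption, since the excerpt uses STREAM_AHEAD_EVENT_TIMEOUT without showing its definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using nsecs_t = std::int64_t;  // nanoseconds

// Assumed value for illustration; the excerpt does not show the real constant.
constexpr nsecs_t kStreamAheadTimeoutNs = 500 * 1000000LL;

// Hypothetical summary of one window's connection queues.
struct ConnectionModel {
    std::size_t outboundCount;        // events not yet published to the app
    std::size_t waitCount;            // events published, awaiting "finished"
    nsecs_t oldestWaitDeliveryTime;   // delivery time of the head of the wait queue
};

// Key events are serialized: every earlier event must be fully handled first.
bool readyForKey(const ConnectionModel& c) {
    return c.outboundCount == 0 && c.waitCount == 0;
}

// Non-key events may stream ahead unless the oldest unanswered event is too old.
bool readyForMotion(const ConnectionModel& c, nsecs_t now) {
    return c.waitCount == 0 || now < c.oldestWaitDeliveryTime + kStreamAheadTimeoutNs;
}
```

With one unanswered event in the wait queue, readyForKey is false right away, while readyForMotion stays true until that event has gone unanswered for the stream-ahead timeout. In practice this is why a stuck main thread blocks a key press on the very next event but lets a few touch events through first.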
Given one of these reasons, handleTargetsNotReadyLocked decides whether the wait has already timed out and, if so, triggers the ANR:
int32_t InputDispatcher::handleTargetsNotReadyLocked(nsecs_t currentTime,
        const EventEntry* entry,
        const sp<InputApplicationHandle>& applicationHandle,
        const sp<InputWindowHandle>& windowHandle,
        nsecs_t* nextWakeupTime, const char* reason) {
    // No application and no window: this branch is entered once, after which we
    // simply keep waiting for an application. No ANR is triggered.
    if (applicationHandle == NULL && windowHandle == NULL) {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
#if DEBUG_FOCUS
            ALOGD("Waiting for system to become ready for input. Reason: %s", reason);
#endif
            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY;
            mInputTargetWaitStartTime = currentTime;
            mInputTargetWaitTimeoutTime = LONG_LONG_MAX;
            mInputTargetWaitTimeoutExpired = false;
            mInputTargetWaitApplicationHandle.clear();
        }
    } else {
        // This branch covers "application exists (already created) but no window"
        // and "application and window both exist". For a given window it is
        // normally entered only once.
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
            ALOGD("Waiting for application to become ready for input: %s. Reason: %s",
                    getApplicationWindowLabelLocked(applicationHandle, windowHandle).string(),
                    reason);
#endif
            nsecs_t timeout;
            if (windowHandle != NULL) {
                // 5 s timeout
                timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            } else if (applicationHandle != NULL) {
                // 5 s timeout
                timeout = applicationHandle->getDispatchingTimeout(
                        DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            } else {
                timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT; // 5 s timeout
            }

            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY; // why we are waiting
            mInputTargetWaitStartTime = currentTime; // time of this first dispatch attempt
            mInputTargetWaitTimeoutTime = currentTime + timeout; // the deadline
            mInputTargetWaitTimeoutExpired = false; // the timeout has not expired yet
            mInputTargetWaitApplicationHandle.clear(); // forget any previously recorded application
            if (windowHandle != NULL) {
                // Record the application we are waiting for.
                mInputTargetWaitApplicationHandle = windowHandle->inputApplicationHandle;
            }
            if (mInputTargetWaitApplicationHandle == NULL && applicationHandle != NULL) {
                // Record the application we are waiting for (application but no window).
                mInputTargetWaitApplicationHandle = applicationHandle;
            }
        }
    }

    if (mInputTargetWaitTimeoutExpired) {
        return INPUT_EVENT_INJECTION_TIMED_OUT;
    }

    // The current time has reached the deadline: dispatching an event to the
    // application has timed out, so an ANR must be triggered.
    if (currentTime >= mInputTargetWaitTimeoutTime) {
        onANRLocked(currentTime, applicationHandle, windowHandle,
                entry->eventTime, mInputTargetWaitStartTime, reason);

        // Force poll loop to wake up immediately on next iteration once we get the
        // ANR response back from the policy.
        *nextWakeupTime = LONG_LONG_MIN;
        return INPUT_EVENT_INJECTION_PENDING;
    } else {
        // Force poll loop to wake up when timeout is due.
        if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
            *nextWakeupTime = mInputTargetWaitTimeoutTime;
        }
        return INPUT_EVENT_INJECTION_PENDING;
    }
}
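The core of this method is a small state machine: arm a deadline on the first failed attempt, then compare against it on every later pass. Here is a minimal sketch of just that part. WaitState, onTargetNotReady and the 5 s constant are names and values chosen for illustration (the 5 s figure follows the comments in the excerpt), and the real method's extra states, such as the already-expired flag and the per-window timeout lookup, are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using nsecs_t = std::int64_t;  // nanoseconds

// Assumed default, matching the "5 s timeout" comments in the excerpt.
constexpr nsecs_t kDefaultDispatchTimeoutNs = 5000 * 1000000LL;

enum class WaitResult { Pending, AnrRaised };

// Hypothetical condensed version of the mInputTargetWait* fields.
struct WaitState {
    bool waiting = false;
    nsecs_t startTime = 0;
    nsecs_t deadline = 0;
};

WaitResult onTargetNotReady(WaitState& state, nsecs_t now, nsecs_t timeout,
                            nsecs_t* nextWakeupTime) {
    assert(nextWakeupTime != nullptr);
    if (!state.waiting) {
        // First failed attempt for this event: arm the deadline exactly once.
        state.waiting = true;
        state.startTime = now;
        state.deadline = now + timeout;
    }
    if (now >= state.deadline) {
        // Deadline reached: report the ANR and ask the loop to run again at once.
        *nextWakeupTime = std::numeric_limits<nsecs_t>::min();
        return WaitResult::AnrRaised;
    }
    // Not yet due: make sure the loop wakes up no later than the deadline.
    if (state.deadline < *nextWakeupTime) {
        *nextWakeupTime = state.deadline;
    }
    return WaitResult::Pending;
}
```

Note that the deadline is measured from the first failed dispatch attempt, not from the time the user produced the event, and that repeated calls while waiting never push it back. Resetting the state (setting waiting to false in this sketch) corresponds to what resetANRTimeoutsLocked does when a new event is picked up.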
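Let's sort out what this method does:
- When an event is dispatched for the first time, pay attention to mFocusedWindowHandle and mFocusedApplicationHandle (leaving aside the no-application, no-window case). Both are set into InputDispatcher through JNI by WMS, either when an app starts and calls addWindow or when the window changes. On that first dispatch attempt, the dispatcher records the time and sets the timeout parameters for the event, and it does so only once.
- When InputDispatcher runs dispatchOnceInnerLocked again, it finds that mPendingEvent is not null, so it does not reset the ANR timeout parameters. It just keeps checking whether the current time has passed mInputTargetWaitTimeoutTime, and triggers the ANR once it has.
- When are the ANR timeout parameters reset? When a new Input event is picked up for dispatch (that is, mPendingEvent has been handled and a new Input event has arrived), when the focused application changes, and when InputDispatcher itself is reset.
- When an Input dispatch timeout causes an ANR, the moment the ANR really occurred is therefore the timestamp of the InputDispatcher log line. From onANRLocked, the call passes through several layers before appNotResponding finally writes the event log and the ActivityManager ANR log and records the trace. So when we say that the event log, the ActivityManager ANR log and the trace are useful references but not absolute, it is not without reason.
By now we should have a basic understanding of ANRs caused by Input events, and be able to locate the cause of an ANR faster and more accurately. Of course, reading this article will not instantly make anyone good at analyzing ANRs; that still takes a solid store of background knowledge. We are all still learning, so to close with one line: stay true to the craft and don't let the bustle sway you. Let's keep each other going.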