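Android has a separate detection mechanism for each kind of ANR (Application Not Responding). This post walks through how an ANR is triggered for input events. An input ANR shows up in the log like this: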
10-21 10:03:26.935 1322 3788 I ActivityManager: Done dumping
10-21 10:03:27.023 1322 3788 E ActivityManager: ANR in com.example.stability (com.example.stability/.anr.ANRActivity)
10-21 10:03:27.023 1322 3788 E ActivityManager: PID: 2786
10-21 10:03:27.023 1322 3788 E ActivityManager: Reason: Input dispatching timed out (com.example.stability/com.example.stability.anr.ANRActivity, e07be82 com.example.stability/com.example.stability.anr.ANRActivity (server) is not responding. Waited 8009ms for (eventTime=117830966476000, deviceId=4, source=0x00001002, displayId=0, action=DOWN, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (470.0, 712.0)]), policyFlags=0x62000000)
10-21 10:03:27.023 1322 3788 E ActivityManager: Parent: com.example.stability/.anr.ANRActivity
10-21 10:03:27.023 1322 3788 E ActivityManager: Load: 24.16 / 24.81 / 24.94
10-21 10:03:27.023 1322 3788 E ActivityManager: ----- Output from /proc/pressure/memory -----
10-21 10:03:27.023 1322 3788 E ActivityManager: some avg10=0.00 avg60=0.00 avg300=0.00 total=8238359
10-21 10:03:27.023 1322 3788 E ActivityManager: full avg10=0.00 avg60=0.00 avg300=0.00 total=4188694
10-21 10:03:27.023 1322 3788 E ActivityManager: ----- End output from /proc/pressure/memory -----
10-21 10:03:27.023 1322 3788 E ActivityManager:
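First, we know that the last method InputDispatcher calls when dispatching an event is startDispatchCycleLocked, so let's start there.
Ignoring the unrelated code, we only look at what happens after the event has been sent.
Once an event is dispatched, it is removed from connection->outboundQueue and appended to connection->waitQueue. mAnrTracker.insert then records the event's deadline (currentTime + timeout) together with the token that identifies the app's connection. The timeout is typically 5000 ms, which is how long the app has to respond to an input event.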
void InputDispatcher::startDispatchCycleLocked(nsecs_t currentTime,
                                               const sp<Connection>& connection) {
    ...
    while (...) {  // loop over connection->outboundQueue; condition elided
        ...
        const nsecs_t timeout =
                getDispatchingTimeoutLocked(connection->inputChannel->getConnectionToken());
        dispatchEntry->timeoutTime = currentTime + timeout;
        ...
        connection->outboundQueue.erase(std::remove(connection->outboundQueue.begin(),
                                                    connection->outboundQueue.end(),
                                                    dispatchEntry));
        connection->waitQueue.push_back(dispatchEntry);
        if (connection->responsive) {
            mAnrTracker.insert(dispatchEntry->timeoutTime,
                               connection->inputChannel->getConnectionToken());
        }
    }
}
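To make the role of mAnrTracker concrete, here is a minimal sketch of a tracker that keeps (timeout, token) pairs sorted by timeout. It is not the AOSP AnrTracker source: ToyAnrTracker, Nanos, and Token are hypothetical names, and a std::string stands in for the connection token. Only the method names mirror the calls seen in this post (insert, erase, eraseToken, firstTimeout, firstToken).

```cpp
#include <cstdint>
#include <limits>
#include <set>
#include <string>
#include <utility>

// Simplified stand-ins (assumptions): the code in this post uses nsecs_t and a binder token.
using Nanos = int64_t;
using Token = std::string;

// Toy model of a timeout tracker: entries stay sorted by timeout,
// so the earliest deadline is always the first element.
class ToyAnrTracker {
public:
    void insert(Nanos timeoutTime, const Token& token) {
        mEntries.emplace(timeoutTime, token);
    }

    // Remove one entry, e.g. for an event that was acknowledged in time.
    void erase(Nanos timeoutTime, const Token& token) {
        auto it = mEntries.find(std::make_pair(timeoutTime, token));
        if (it != mEntries.end()) {
            mEntries.erase(it);
        }
    }

    // Remove every entry that belongs to the given token.
    void eraseToken(const Token& token) {
        for (auto it = mEntries.begin(); it != mEntries.end();) {
            if (it->second == token) {
                it = mEntries.erase(it);
            } else {
                ++it;
            }
        }
    }

    bool empty() const { return mEntries.empty(); }

    // Earliest deadline, or "never" when nothing is tracked.
    Nanos firstTimeout() const {
        return mEntries.empty() ? std::numeric_limits<Nanos>::max() : mEntries.begin()->first;
    }

    // Token that owns the earliest deadline; empty when nothing is tracked.
    Token firstToken() const {
        return mEntries.empty() ? Token() : mEntries.begin()->second;
    }

private:
    std::multiset<std::pair<Nanos, Token>> mEntries;
};
```

A sorted container keeps "which deadline expires first" cheap to answer, which is the only question the periodic check needs to ask.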
Now let's go back to dispatchOnce, where dispatching starts. The comment there tells us that processAnrsLocked performs the ANR checks, so let's read on.
void InputDispatcher::dispatchOnce() {
    nsecs_t nextWakeupTime = LONG_LONG_MAX;
    {
        if (!haveCommandsLocked()) {
            dispatchOnceInnerLocked(&nextWakeupTime);
        }
        if (runCommandsLockedInterruptible()) {
            nextWakeupTime = LONG_LONG_MIN;
        }
        // If we are still waiting for ack on some events,
        // we might have to wake up earlier to check if an app is anr'ing.
        const nsecs_t nextAnrCheck = processAnrsLocked();
        nextWakeupTime = std::min(nextWakeupTime, nextAnrCheck);
    }
    // Wait for callback or timeout or wake. (make sure we round up, not down)
    nsecs_t currentTime = now();
    int timeoutMillis = toMillisecondTimeoutDelay(currentTime, nextWakeupTime);
    mLooper->pollOnce(timeoutMillis);
}
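The "round up, not down" comment in dispatchOnce matters: if the delay were rounded down, the loop could wake slightly before the deadline and find nothing expired. Below is a sketch of what such a conversion might look like. toPollTimeoutMillis and Nanos are hypothetical names, and this is not the AOSP toMillisecondTimeoutDelay source. It assumes the usual poll convention that 0 means "return immediately" and a negative value means "wait indefinitely".

```cpp
#include <climits>
#include <cstdint>

using Nanos = int64_t;  // stand-in for nsecs_t (assumption)

// Convert "wake up at wakeupTime" into a poll timeout in milliseconds.
// Returns 0 for deadlines that are already due, -1 for delays too large
// to represent (treated as "wait indefinitely"), and otherwise rounds up
// so the caller never wakes before the deadline.
int toPollTimeoutMillis(Nanos now, Nanos wakeupTime) {
    constexpr uint64_t kNanosPerMilli = 1000000;
    if (wakeupTime <= now) {
        return 0;
    }
    // Unsigned subtraction avoids signed overflow when the values are far apart.
    const uint64_t delay = static_cast<uint64_t>(wakeupTime) - static_cast<uint64_t>(now);
    const uint64_t maxDelay = static_cast<uint64_t>(INT_MAX - 1) * kNanosPerMilli;
    if (delay > maxDelay) {
        return -1;
    }
    // Ceiling division: 1 ns of remaining delay still becomes 1 ms.
    return static_cast<int>((delay + kNanosPerMilli - 1) / kNanosPerMilli);
}
```

With this convention, the LONG_LONG_MIN value that dispatchOnce uses for "wake immediately" maps to 0, and LONG_LONG_MAX maps to an indefinite wait.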
nsecs_t InputDispatcher::processAnrsLocked() {
    const nsecs_t currentTime = now();
    nsecs_t nextAnrCheck = LONG_LONG_MAX;
    // Check whether the earliest timeout has expired
    nextAnrCheck = std::min(nextAnrCheck, mAnrTracker.firstTimeout());
    if (currentTime < nextAnrCheck) { // most likely scenario
        return nextAnrCheck;          // everything is normal. Let's check again at nextAnrCheck
    }
    // If we reached here, we have an unresponsive connection.
    // Look up the connection by its token
    sp<Connection> connection = getConnectionLocked(mAnrTracker.firstToken());
    if (connection == nullptr) {
        ALOGE("Could not find connection for entry %" PRId64, mAnrTracker.firstTimeout());
        return nextAnrCheck;
    }
    connection->responsive = false;
    // Stop waking up for this unresponsive connection
    mAnrTracker.eraseToken(connection->inputChannel->getConnectionToken());
    // Trigger the ANR
    onAnrLocked(*connection);
    return LONG_LONG_MIN;
}
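Remember mAnrTracker from earlier? It is implemented in AnrTracker.cpp.
processAnrsLocked uses firstToken(), the token with the earliest timeout, to look up the corresponding connection, and then calls onAnrLocked(*connection) for further handling.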
void InputDispatcher::onAnrLocked(const Connection& connection) {
    // Since we are allowing the policy to extend the timeout, maybe the waitQueue
    // is already healthy again. Don't raise ANR in this situation
    if (connection.waitQueue.empty()) {
        ALOGI("Not raising ANR because the connection %s has recovered",
              connection.inputChannel->getName().c_str());
        return;
    }
    DispatchEntry* oldestEntry = *connection.waitQueue.begin();
    const nsecs_t currentWait = now() - oldestEntry->deliveryTime;
    // Build the reason string that appears in the log
    std::string reason =
            android::base::StringPrintf("%s is not responding. Waited %" PRId64 "ms for %s",
                                        connection.inputChannel->getName().c_str(),
                                        ns2ms(currentWait),
                                        oldestEntry->eventEntry->getDescription().c_str());
    updateLastAnrStateLocked(getWindowHandleLocked(connection.inputChannel->getConnectionToken()),
                             reason);
    // Create a CommandEntry that will trigger the ANR
    std::unique_ptr<CommandEntry> commandEntry =
            std::make_unique<CommandEntry>(&InputDispatcher::doNotifyAnrLockedInterruptible);
    commandEntry->inputApplicationHandle = nullptr;
    commandEntry->inputChannel = connection.inputChannel;
    commandEntry->reason = std::move(reason);
    postCommandLocked(std::move(commandEntry));
}
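As we can see, this still goes through the CommandEntry mechanism; the method to be executed is doNotifyAnrLockedInterruptible.
The command fires on the next dispatchOnce pass, where runCommandsLockedInterruptible invokes doNotifyAnrLockedInterruptible to raise the ANR in the upper layer.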
void InputDispatcher::doNotifyAnrLockedInterruptible(CommandEntry* commandEntry) {
    sp<IBinder> token =
            commandEntry->inputChannel ? commandEntry->inputChannel->getConnectionToken() : nullptr;
    mLock.unlock();
    const nsecs_t timeoutExtension =
            mPolicy->notifyAnr(commandEntry->inputApplicationHandle, token, commandEntry->reason);
    ...
}
Here we meet mPolicy again. As analyzed previously, this call eventually reaches the upper layer, which collects and dumps the ANR information.
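The deferred-command pattern used on this path (record the work while holding the lock, run it on a later loop pass, and release the lock around the callout) can be sketched as follows. ToyCommandQueue is a hypothetical class, not the InputDispatcher implementation. As a simplification, the queue itself drops the lock around each command, whereas in the excerpt above the command calls mLock.unlock() on its own.

```cpp
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Toy model of "post a command under the lock, run it later".
class ToyCommandQueue {
public:
    // Called by code that detected a problem; this only records the work.
    void post(std::function<void()> command) {
        std::lock_guard<std::mutex> lock(mLock);
        mCommands.push_back(std::move(command));
    }

    // Called once per loop iteration. Returns true if any command ran,
    // which the caller can use to schedule the next pass immediately.
    bool runAll() {
        std::unique_lock<std::mutex> lock(mLock);
        if (mCommands.empty()) {
            return false;
        }
        while (!mCommands.empty()) {
            std::function<void()> command = std::move(mCommands.front());
            mCommands.pop_front();
            // Release the lock during the callout so a slow callback
            // cannot block threads that want to post new commands.
            lock.unlock();
            command();
            lock.lock();
        }
        return true;
    }

private:
    std::mutex mLock;
    std::deque<std::function<void()>> mCommands;
};
```

Deferring the callout this way keeps the detection code short and avoids calling into another component while internal state is locked.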
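Of course, if the app is healthy, none of this is reached. The socket communication is bidirectional, so once the app has finished handling an event it sends a reply back to InputDispatcher. The reply is handled in InputDispatcher's handleReceiveCallback (that logic was covered earlier) and eventually reaches doDispatchCycleFinishedLockedInterruptible, which erases the entry from mAnrTracker. The next processAnrsLocked check therefore does not go down the ANR path.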
void InputDispatcher::doDispatchCycleFinishedLockedInterruptible(CommandEntry* commandEntry) {
    sp<Connection> connection = commandEntry->connection;
    ...
    connection->waitQueue.erase(dispatchEntryIt);
    mAnrTracker.erase(dispatchEntry->timeoutTime,
                      connection->inputChannel->getConnectionToken());
    ...
}
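Summary:
1. When an event is dispatched, its timeout and token are saved in mAnrTracker.
2. If the app responds in time, a CommandEntry runs and erases the corresponding timeout and token from mAnrTracker.
3. If the app does not respond in time, processAnrsLocked detects the expired timeout in mAnrTracker, obtains the token, and triggers the ANR logic through a CommandEntry.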
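The three steps above can be tied together in a small simulation with a fake clock. This is only an illustrative sketch: ToyDispatcher, Millis, Token, kNever, and kCheckNow are hypothetical names, time is counted in milliseconds for readability, and an unresponsive token is simply appended to a list instead of being reported to the upper layer.

```cpp
#include <cstdint>
#include <limits>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Millis = int64_t;     // toy clock unit (assumption)
using Token = std::string;  // stand-in for a connection token (assumption)

constexpr Millis kNever = std::numeric_limits<Millis>::max();     // nothing to wait for
constexpr Millis kCheckNow = std::numeric_limits<Millis>::min();  // run the loop again at once

class ToyDispatcher {
public:
    explicit ToyDispatcher(Millis timeout) : mTimeout(timeout) {}

    // Step 1: an event is sent; remember its deadline and owner.
    Millis dispatch(Millis now, const Token& token) {
        const Millis deadline = now + mTimeout;
        mPending.emplace(deadline, token);
        return deadline;
    }

    // Step 2: the app acknowledged the event; forget its deadline.
    void finish(Millis deadline, const Token& token) {
        auto it = mPending.find(std::make_pair(deadline, token));
        if (it != mPending.end()) {
            mPending.erase(it);
        }
    }

    // Step 3: periodic check. Flags at most one unresponsive token per call
    // and returns the time at which the next check is needed.
    Millis checkAnrs(Millis now) {
        if (mPending.empty()) {
            return kNever;
        }
        if (now < mPending.begin()->first) {
            return mPending.begin()->first;  // earliest deadline not reached yet
        }
        const Token token = mPending.begin()->second;
        // Stop tracking every pending event of the unresponsive token.
        for (auto it = mPending.begin(); it != mPending.end();) {
            it = (it->second == token) ? mPending.erase(it) : ++it;
        }
        mAnrs.push_back(token);
        return kCheckNow;
    }

    const std::vector<Token>& anrs() const { return mAnrs; }

private:
    Millis mTimeout;
    std::multiset<std::pair<Millis, Token>> mPending;
    std::vector<Token> mAnrs;
};
```

For example, with a 5000 ms timeout, an app that acknowledges its event after 1000 ms is never flagged, while an app that stays silent is flagged by the first check at or after its deadline.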