Android ANR: Principles and Mechanisms

 

I. How ANR Works

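Most ANRs encountered in practice are input ANRs, so this article uses the input ANR as its example. The mechanism is not complicated. To keep things short, only the code that arms the timer and the code that checks for a timeout are covered.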

A normal input event is dispatched through the following call chain:

InputDispatcher::dispatchOnce()
->InputDispatcher::dispatchOnceInnerLocked
->InputDispatcher::dispatchKeyLocked
->InputDispatcher::findFocusedWindowTargetsLocked
......

As its name suggests, findFocusedWindowTargetsLocked looks for the window that currently has focus.

The function is long, so it is examined piece by piece below.

No focused window and no focused application

// If there is no currently focused window and no focused application
// then drop the event.
if (focusedWindowHandle == nullptr && focusedApplicationHandle == nullptr) {
    ALOGI("Dropping %s event because there is no focused window or focused application in "
           "display %" PRId32 ".",
          NamedEnum::string(entry.type).c_str(), displayId);
    return InputEventInjectionResult::FAILED;
}

In this case the event is dropped.

No focused window, but a focused application exists

if (focusedWindowHandle == nullptr && focusedApplicationHandle != nullptr) {
        if (!mNoFocusedWindowTimeoutTime.has_value()) {
            // We just discovered that there's no focused window. Start the ANR timer
            std::chrono::nanoseconds timeout = focusedApplicationHandle->getDispatchingTimeout(
                    DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            // Set the timeout deadline; the ANR timer for this focus wait starts here
            mNoFocusedWindowTimeoutTime = currentTime + timeout.count();
            mAwaitedFocusedApplication = focusedApplicationHandle;
            mAwaitedApplicationDisplayId = displayId;
            ALOGW("Waiting because no window has focus but %s may eventually add a "
                  "window when it finishes starting up. Will wait for %" PRId64 "ms",
                  mAwaitedFocusedApplication->getName().c_str(), millis(timeout));
            *nextWakeupTime = *mNoFocusedWindowTimeoutTime;
            return InputEventInjectionResult::PENDING;
        } else if (currentTime > *mNoFocusedWindowTimeoutTime) {
            // Already raised ANR. Drop the event
            ALOGE("Dropping %s event because there is no focused window",
                  NamedEnum::string(entry.type).c_str());
            return InputEventInjectionResult::FAILED;
        } else {
            // The timer was armed earlier and the deadline has not passed yet, so keep waiting
            // Still waiting for the focused window
            return InputEventInjectionResult::PENDING;
        }
}
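The three branches above form a small state machine: arm the timer, drop after the deadline, or keep waiting. The following is a minimal standalone sketch of that logic, not the AOSP implementation. `NoFocusTimer`, `Result`, and `onEventWithoutFocusedWindow` are hypothetical names introduced for illustration, and `deadline` plays the role of mNoFocusedWindowTimeoutTime.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using nsecs_t = int64_t;  // nanoseconds, mirroring the typedef used in the excerpt

enum class Result { PENDING, FAILED };

// Hypothetical stand-in for the dispatcher state touched by the branch above.
struct NoFocusTimer {
    std::optional<nsecs_t> deadline;  // plays the role of mNoFocusedWindowTimeoutTime

    // Called each time an event arrives while the app is focused but no window is.
    Result onEventWithoutFocusedWindow(nsecs_t now, nsecs_t timeout) {
        assert(timeout > 0);
        if (!deadline.has_value()) {
            deadline = now + timeout;  // first sighting: arm the ANR timer
            return Result::PENDING;
        }
        if (now > *deadline) {
            return Result::FAILED;     // deadline already passed: drop the event
        }
        return Result::PENDING;        // timer armed, deadline not reached: keep waiting
    }

    // Clears the timer, as happens once a valid focused window is found.
    void reset() { deadline.reset(); }
};
```

With a 5s timeout armed at time 0, an event at 1s stays pending and an event at 6s is dropped.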

Resetting the timeout

// we have a valid, non-null focused window
resetNoFocusedWindowTimeoutLocked();

Reaching this point means that this call to findFocusedWindowTargetsLocked found a non-null window, so resetNoFocusedWindowTimeoutLocked is called to clear the timer.

Several other code paths also trigger this reset, for example:

  1. setFocusedApplicationLocked: the currently focused application changes.
  2. setInputDispatchMode: the dispatch mode is set.
  3. resetAndDropEverythingLocked: called from several places, such as stopFreezingDisplayLocked and performEnableScreen.

Other abnormal window states

If the focused window is in an abnormal state, the event is also left pending, which can likewise lead to an ANR. One example is a paused window:

if (focusedWindowHandle->getInfo()->paused) {
    ALOGI("Waiting because %s is paused", focusedWindowHandle->getName().c_str());
    return InputEventInjectionResult::PENDING;
}

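Other conditions also leave the event pending, such as the window not being connected, the window's connection being full, or the connection having died. They are not listed one by one here.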

Having covered the cases that leave an event pending, the natural next question is: when is an event dropped instead?

frameworks/native/services/inputflinger/dispatcher/InputDispatcher.cpp
void InputDispatcher::dropInboundEventLocked(const EventEntry& entry, DropReason dropReason) {
    const char* reason;
    switch (dropReason) {
        case DropReason::POLICY:
#if DEBUG_INBOUND_EVENT_DETAILS
            ALOGD("Dropped event because policy consumed it.");
#endif
            reason = "inbound event was dropped because the policy consumed it";
            break;
        case DropReason::DISABLED:
            if (mLastDropReason != DropReason::DISABLED) {
                ALOGI("Dropped event because input dispatch is disabled.");
            }
            reason = "inbound event was dropped because input dispatch is disabled";
            break;
        case DropReason::APP_SWITCH:
            ALOGI("Dropped event because of pending overdue app switch.");
            reason = "inbound event was dropped because of pending overdue app switch";
            break;
        case DropReason::BLOCKED:
            ALOGI("Dropped event because the current application is not responding and the user "
                  "has started interacting with a different application.");
            reason = "inbound event was dropped because the current application is not responding "
                     "and the user has started interacting with a different application";
            break;
        case DropReason::STALE:
            ALOGI("Dropped event because it is stale.");
            reason = "inbound event was dropped because it is stale";
            break;
        case DropReason::NOT_DROPPED: {
            LOG_ALWAYS_FATAL("Should not be dropping a NOT_DROPPED event");
            return;
        }
    }
    //....
}

These are the cases in which an event is dropped. dropInboundEventLocked is called from InputDispatcher::dispatchOnceInnerLocked.
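The drop reasons listed in the switch can be pictured as a priority-ordered check made before an event is dispatched. The sketch below is illustrative only. `InboundState`, its fields, `chooseDropReason`, and the order of the checks are assumptions made for this example and are not copied from dispatchOnceInnerLocked.

```cpp
#include <cassert>

enum class DropReason { NOT_DROPPED, POLICY, DISABLED, APP_SWITCH, BLOCKED, STALE };

// Hypothetical snapshot of the conditions examined before dispatching an event.
struct InboundState {
    bool policyConsumed;       // the policy consumed the event
    bool dispatchEnabled;      // input dispatch is enabled
    bool appSwitchOverdue;     // a pending app switch is overdue
    bool eventIsStale;         // the event waited too long before dispatch
    bool userMovedToOtherApp;  // current app is unresponsive and the user touched another app
};

// Picks the first matching reason; the ordering here is an assumption.
DropReason chooseDropReason(const InboundState& s) {
    if (s.policyConsumed) return DropReason::POLICY;
    if (!s.dispatchEnabled) return DropReason::DISABLED;
    if (s.appSwitchOverdue) return DropReason::APP_SWITCH;
    if (s.eventIsStale) return DropReason::STALE;
    if (s.userMovedToOtherApp) return DropReason::BLOCKED;
    return DropReason::NOT_DROPPED;
}

// Short label for logging, one per enumerator.
const char* toString(DropReason r) {
    switch (r) {
        case DropReason::NOT_DROPPED: return "NOT_DROPPED";
        case DropReason::POLICY:      return "POLICY";
        case DropReason::DISABLED:    return "DISABLED";
        case DropReason::APP_SWITCH:  return "APP_SWITCH";
        case DropReason::BLOCKED:     return "BLOCKED";
        case DropReason::STALE:       return "STALE";
    }
    assert(false && "unknown DropReason");
    return "UNKNOWN";
}
```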

That covers how the timeout is armed. When is it checked?

InputDispatcher.cpp@dispatchOnce-> InputDispatcher.cpp@processAnrsLocked
/**
 * Check if any of the connections' wait queues have events that are too old.
 * If we waited for events to be ack'ed for more than the window timeout, raise an ANR.
 * Return the time at which we should wake up next.
 */
nsecs_t InputDispatcher::processAnrsLocked() {
    const nsecs_t currentTime = now();
    nsecs_t nextAnrCheck = LONG_LONG_MAX;
    // Check if we are waiting for a focused window to appear. Raise ANR if waited too long
    if (mNoFocusedWindowTimeoutTime.has_value() && mAwaitedFocusedApplication != nullptr) {
        if (currentTime >= *mNoFocusedWindowTimeoutTime) {
            processNoFocusedWindowAnrLocked();
            mAwaitedFocusedApplication.reset();
            mNoFocusedWindowTimeoutTime = std::nullopt;
            return LONG_LONG_MIN;
        } else {
            // mNoFocusedWindowTimeoutTime is the point in time at which the wait for a focused window times out
            // Keep waiting. We will drop the event when mNoFocusedWindowTimeoutTime comes.
            nextAnrCheck = *mNoFocusedWindowTimeoutTime;
        }
    }

    // Check if any connection ANRs are due
    nextAnrCheck = std::min(nextAnrCheck, mAnrTracker.firstTimeout());
    if (currentTime < nextAnrCheck) { // most likely scenario
        return nextAnrCheck;          // everything is normal. Let's check again at nextAnrCheck
    }

    // If we reached here, we have an unresponsive connection.
    sp<Connection> connection = getConnectionLocked(mAnrTracker.firstToken());
    if (connection == nullptr) {
        ALOGE("Could not find connection for entry %" PRId64, mAnrTracker.firstTimeout());
        return nextAnrCheck;
    }
    connection->responsive = false;
    // Stop waking up for this unresponsive connection
    mAnrTracker.eraseToken(connection->inputChannel->getConnectionToken());
    onAnrLocked(connection);
    return LONG_LONG_MIN;
}
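The wake-up computation in processAnrsLocked boils down to taking the minimum of two deadlines and reacting when the current time reaches it. Below is a minimal standalone sketch of that idea. `MiniAnrTracker`, `AnrCheck`, and `checkAnrs` are hypothetical names, and integer tokens stand in for connection tokens. It is an approximation under these assumptions, not the AOSP code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <limits>
#include <optional>
#include <set>
#include <utility>

using nsecs_t = int64_t;
constexpr nsecs_t kNever = std::numeric_limits<nsecs_t>::max();    // like LONG_LONG_MAX
constexpr nsecs_t kWakeNow = std::numeric_limits<nsecs_t>::min();  // like LONG_LONG_MIN

// Hypothetical miniature tracker: (deadline, token) pairs kept sorted by deadline.
class MiniAnrTracker {
public:
    void insert(nsecs_t deadline, int token) { mTimeouts.insert({deadline, token}); }
    void eraseToken(int token) {
        for (auto it = mTimeouts.begin(); it != mTimeouts.end();) {
            it = (it->second == token) ? mTimeouts.erase(it) : std::next(it);
        }
    }
    nsecs_t firstTimeout() const { return mTimeouts.empty() ? kNever : mTimeouts.begin()->first; }
    int firstToken() const {
        assert(!mTimeouts.empty());
        return mTimeouts.begin()->second;
    }
private:
    std::set<std::pair<nsecs_t, int>> mTimeouts;
};

struct AnrCheck {
    nsecs_t nextWakeup;                    // when to check again
    std::optional<int> unresponsiveToken;  // set when a connection timed out
    bool noFocusedWindowAnr;               // set when the focus wait timed out
};

AnrCheck checkAnrs(nsecs_t now, std::optional<nsecs_t>& noFocusDeadline, MiniAnrTracker& tracker) {
    nsecs_t next = kNever;
    if (noFocusDeadline.has_value()) {
        if (now >= *noFocusDeadline) {
            noFocusDeadline.reset();  // the focus wait expired: report and clear it
            return {kWakeNow, std::nullopt, true};
        }
        next = *noFocusDeadline;      // still waiting for a focused window
    }
    next = std::min(next, tracker.firstTimeout());
    if (now < next) {
        return {next, std::nullopt, false};  // nothing is due yet
    }
    const int token = tracker.firstToken();  // earliest connection deadline has passed
    tracker.eraseToken(token);               // stop waking up for this connection
    return {kWakeNow, token, false};
}
```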

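If the current time has reached the deadline, onAnrLocked is triggered. The overload shown here takes the application handle and handles the case where the application has no focused window: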

void InputDispatcher::onAnrLocked(std::shared_ptr<InputApplicationHandle> application) {
    std::string reason =
            StringPrintf("%s does not have a focused window", application->getName().c_str());
    updateLastAnrStateLocked(*application, reason);

    std::unique_ptr<CommandEntry> commandEntry = std::make_unique<CommandEntry>(
            &InputDispatcher::doNotifyNoFocusedWindowAnrLockedInterruptible);
    commandEntry->inputApplicationHandle = std::move(application);
    postCommandLocked(std::move(commandEntry));
}

The main job of this onAnrLocked overload is to queue doNotifyNoFocusedWindowAnrLockedInterruptible via postCommandLocked.

The next time InputDispatcher::dispatchOnce runs, it executes runCommandsLockedInterruptible:
 

void InputDispatcher::dispatchOnce() {
    nsecs_t nextWakeupTime = LONG_LONG_MAX;
    { // acquire lock
        std::scoped_lock _l(mLock);
        mDispatcherIsAlive.notify_all();

        // Run a dispatch loop if there are no pending commands.
        // The dispatch loop might enqueue commands to run afterwards.
        if (!haveCommandsLocked()) {
            dispatchOnceInnerLocked(&nextWakeupTime);
        }

        // Run all pending commands if there are any.
        // If any commands were run then force the next poll to wake up immediately.
        if (runCommandsLockedInterruptible()) {
            nextWakeupTime = LONG_LONG_MIN;
        }
        //....
}

runCommandsLockedInterruptible is simple: it takes every queued command off the queue and runs it.

bool InputDispatcher::runCommandsLockedInterruptible() {
    if (mCommandQueue.empty()) {
        return false;
    }

    do {
        std::unique_ptr<CommandEntry> commandEntry = std::move(mCommandQueue.front());
        mCommandQueue.pop_front();
        Command command = commandEntry->command;
        command(*this, commandEntry.get()); // commands are implicitly 'LockedInterruptible'

        commandEntry->connection.clear();
    } while (!mCommandQueue.empty());
    return true;
}
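The post-then-drain pattern used by postCommandLocked and runCommandsLockedInterruptible can be reproduced in a few lines. The sketch below is a simplified illustration: `MiniCommandQueue` is a hypothetical class, and `std::function` replaces the CommandEntry and member-function-pointer machinery of the real dispatcher.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical miniature of the dispatcher's command queue.
class MiniCommandQueue {
public:
    using Command = std::function<void()>;

    // Appends a command; it runs on the next call to runAll().
    void post(Command c) {
        assert(static_cast<bool>(c));  // reject empty commands
        mQueue.push_back(std::move(c));
    }

    bool haveCommands() const { return !mQueue.empty(); }

    // Runs queued commands in FIFO order, including any posted while draining.
    // Returns true if at least one command ran.
    bool runAll() {
        if (mQueue.empty()) {
            return false;
        }
        do {
            Command c = std::move(mQueue.front());
            mQueue.pop_front();  // pop before running so the command may post safely
            c();
        } while (!mQueue.empty());
        return true;
    }

private:
    std::deque<Command> mQueue;
};
```

A true return value corresponds to the point where dispatchOnce sets nextWakeupTime to LONG_LONG_MIN to force an immediate wake-up.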

As an aside, log analysis often turns up a snippet like the following:

[log snippet missing from the source]

That log snippet is printed from processAnrsLocked.

II. How ANRs Are Triggered

2.1 Input event timeout (5s)

InputEvent Timeout

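a. InputDispatcher sends a key event to the focused window of the target process. An ANR occurs if that window does not exist, is paused, has a full input channel, has an unregistered or broken input channel, or fails to finish handling an event within 5s.

b. MotionEvent dispatch has one special case: if the touched window's input waitQueue contains an event older than 0.5s, InputDispatcher holds the new event and waits up to 5s. If the window's 'finish' signal still has not arrived, an ANR is triggered.

c. The ANR is raised only when the next event arrives and finds that an earlier event has timed out.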
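Rule b can be sketched as a small decision function. This is one interpretation of the rule as described in this section, not dispatcher code. `TouchedWindow`, `MotionDecision`, `decideMotion`, and both constants are hypothetical names, and the 0.5s and 5s values are taken from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <optional>

using nsecs_t = int64_t;

constexpr nsecs_t kOldHeadAge = 500000000LL;   // 0.5s, from the text
constexpr nsecs_t kAnrTimeout = 5000000000LL;  // 5s, from the text

enum class MotionDecision { DISPATCH, WAIT, ANR };

// Hypothetical per-window state: send times of events still awaiting 'finish'
// (oldest first) and the time at which the dispatcher began holding new events.
struct TouchedWindow {
    std::deque<nsecs_t> waitQueue;
    std::optional<nsecs_t> holdingSince;
};

MotionDecision decideMotion(TouchedWindow& w, nsecs_t now) {
    assert(w.waitQueue.empty() || now >= w.waitQueue.front());
    const bool headIsOld = !w.waitQueue.empty() && now - w.waitQueue.front() > kOldHeadAge;
    if (!headIsOld) {
        w.holdingSince.reset();  // window is keeping up: dispatch normally
        return MotionDecision::DISPATCH;
    }
    if (!w.holdingSince.has_value()) {
        w.holdingSince = now;    // start the 5s wait
        return MotionDecision::WAIT;
    }
    return (now - *w.holdingSince >= kAnrTimeout) ? MotionDecision::ANR : MotionDecision::WAIT;
}
```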

2.2 Broadcast timeout (foreground 10s, background 60s)

BroadcastReceiver Timeout

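a. Statically registered receivers and ordered broadcasts can ANR. Unordered broadcasts delivered to dynamically registered receivers do not.

b. When a broadcast is delivered, the system checks whether the receiver's process exists and creates it if it does not. The time spent creating the process counts toward the timeout.

c. The ANR dialog is shown only if the process has an Activity displayed in the foreground. Otherwise the process is killed directly.

d. An ANR is raised when onReceive runs longer than the threshold (foreground 10s, background 60s).

e. To send a foreground broadcast: Intent.addFlags(Intent.FLAG_RECEIVER_FOREGROUND)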

2.3 Service timeout (foreground 20s, background 200s)

Service Timeout

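a. The following Service methods can all trigger an ANR: onCreate(), onStartCommand(), onStart(), onBind(), onRebind(), onTaskRemoved(), onUnbind(), onDestroy().

b. The timeout is 20s for a foreground Service and 200s for a background Service.

c. Foreground versus background execution: a Service counts as foreground execution when the app is in a user-facing state at the time it runs.

d. User-facing state: the app has a foreground activity, is running a foreground broadcast, or has a foreground service running.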
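Rules b through d amount to picking one of two timeouts from the app's state. The following is a minimal sketch of that selection using the 20s and 200s values stated above. `AppState`, `isUserFacing`, `serviceTimeoutFor`, and `serviceTimedOut` are hypothetical names for illustration and do not correspond to framework APIs.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::seconds;

constexpr seconds kServiceTimeout{20};             // foreground execution, from the text
constexpr seconds kServiceBackgroundTimeout{200};  // background execution, from the text

// Hypothetical summary of the three conditions listed in rule d.
struct AppState {
    bool hasForegroundActivity;
    bool runningForegroundBroadcast;
    bool hasForegroundService;
};

// The app is user-facing if any one of the three conditions holds.
bool isUserFacing(const AppState& s) {
    return s.hasForegroundActivity || s.runningForegroundBroadcast || s.hasForegroundService;
}

seconds serviceTimeoutFor(const AppState& s) {
    return isUserFacing(s) ? kServiceTimeout : kServiceBackgroundTimeout;
}

// True once a Service callback has run for at least the applicable timeout.
bool serviceTimedOut(const AppState& s, seconds elapsed) {
    assert(elapsed.count() >= 0);
    return elapsed >= serviceTimeoutFor(s);
}
```

Under these assumptions, a callback that has been running for 25s counts as timed out for a user-facing app but not for a background one.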

2.4 ContentProvider

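a. A timeout while creating and publishing a ContentProvider does not raise an ANR.

b. When accessing a ContentProvider through ContentProviderClient, the caller can opt in to ANR detection and choose the timeout:
client.setDetectNotResponding(PROVIDER_ANR_TIMEOUT);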
