Android 8.0 System Exception Handling Flow

Exception handling flow

Java handles uncaught exceptions through Thread.UncaughtExceptionHandler, and Android naturally implements the same interface to deal with its own uncaught exceptions.
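To make the mechanism concrete, here is a minimal plain-Java sketch (no Android classes) of a process-wide default handler catching an exception thrown on a worker thread. The class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

class UncaughtHandlerDemo {
    /** Runs a thread that throws, and returns what the default handler observed. */
    static String runCrashingThread() {
        AtomicReference<String> seen = new AtomicReference<>();
        Thread.UncaughtExceptionHandler previous = Thread.getDefaultUncaughtExceptionHandler();
        // Process-wide fallback, used when a thread has no handler of its own.
        Thread.setDefaultUncaughtExceptionHandler(
                (thread, error) -> seen.set(thread.getName() + ": " + error.getMessage()));
        try {
            Thread worker = new Thread(() -> {
                throw new IllegalStateException("boom");
            }, "worker");
            worker.start();
            // The handler runs on the dying thread, so it has finished once join() returns.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // Put back whatever handler was installed before the demo.
            Thread.setDefaultUncaughtExceptionHandler(previous);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(runCrashingThread());
    }
}
```

The handler is invoked on the thread that is about to die, which is why the Android handler shown later can report the crash and then kill its own process.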

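Android installs its default uncaught-exception handler during process startup. The path below follows the SystemServer process; app processes forked from Zygote run the same RuntimeInit.commonInit().

When Zygote starts SystemServer, it calls ZygoteInit's forkSystemServer() method. That method uses handleSystemServerProcess() to finish setting up the SystemServer process, and the call chain eventually reaches RuntimeInit.commonInit().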

frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

protected static final void commonInit() {
    Thread.setUncaughtExceptionPreHandler(new LoggingHandler());
    // This is where the default uncaught-exception handler, KillApplicationHandler, is installed
    Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler());
   ...
}

KillApplicationHandler looks like this:

frameworks/base/core/java/com/android/internal/os/RuntimeInit.java

private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
    public void uncaughtException(Thread t, Throwable e) {
        try {
            ...
            // 1. mApplicationObject identifies the current application
            ActivityManager.getService().handleApplicationCrash(
                    mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
        } ...
        finally {
            // Whatever happens, make sure the crashed process does not stay alive
            Process.killProcess(Process.myPid());
            System.exit(10);
        }
    }
}
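The try/finally shape of this handler is the important part: reporting is best effort, termination is unconditional. Below is a plain-Java sketch of that shape. CrashReporter and the terminator Runnable are hypothetical stand-ins for the Binder call and for Process.killProcess() plus System.exit(10), so that the behaviour can be observed without killing the JVM.

```java
import java.util.ArrayList;
import java.util.List;

class ReportThenDieHandler implements Thread.UncaughtExceptionHandler {
    /** Hypothetical stand-in for the remote handleApplicationCrash() call. */
    interface CrashReporter {
        void report(Throwable error) throws Exception;
    }

    private final CrashReporter reporter;
    private final Runnable terminator; // stands in for killProcess() + System.exit(10)

    ReportThenDieHandler(CrashReporter reporter, Runnable terminator) {
        this.reporter = reporter;
        this.terminator = terminator;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
        try {
            reporter.report(e);
        } catch (Throwable reportFailure) {
            // Best effort only: a failing reporter must not keep the process alive.
        } finally {
            terminator.run();
        }
    }

    /** Returns the ordered events for a reporter that either works or fails. */
    static List<String> simulate(boolean reporterFails) {
        List<String> events = new ArrayList<>();
        ReportThenDieHandler handler = new ReportThenDieHandler(
                error -> {
                    if (reporterFails) {
                        throw new IllegalStateException("service unreachable");
                    }
                    events.add("reported " + error.getMessage());
                },
                () -> events.add("terminated"));
        handler.uncaughtException(Thread.currentThread(), new RuntimeException("boom"));
        return events;
    }
}
```

In both runs of simulate() the last event is the termination step, which mirrors the guarantee the finally block gives in the listing above.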

ActivityManager.getService() at note 1 returns a Binder proxy for ActivityManagerService (AMS), so the call is delivered to AMS over Binder. Here is how AMS handles it in handleApplicationCrash():

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

public void handleApplicationCrash(IBinder app,
        ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
    ProcessRecord r = findAppProcess(app, "Crash");
    final String processName = app == null ? "system_server"
            : (r == null ? "unknown" : r.processName);

    handleApplicationCrashInner("crash", r, processName, crashInfo);
}

void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
        ApplicationErrorReport.CrashInfo crashInfo) {
    // 1. Write the crash to the event log
    EventLog.writeEvent(EventLogTags.AM_CRASH, Binder.getCallingPid(),
            UserHandle.getUserId(Binder.getCallingUid()), processName,
            r == null ? -1 : r.info.flags,
            crashInfo.exceptionClassName,
            crashInfo.exceptionMessage,
            crashInfo.throwFileName,
            crashInfo.throwLineNumber);

    addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
    // 2. Hand the crash over to AppErrors
    mAppErrors.crashApplication(r, crashInfo);
}

Note 1 records the crash in the event log. Note 2 calls the crashApplication() method of AppErrors.

frameworks/base/services/core/java/com/android/server/am/AppErrors.java

void crashApplication(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo) {
    final int callingPid = Binder.getCallingPid();
    final int callingUid = Binder.getCallingUid();

    final long origId = Binder.clearCallingIdentity();
    try {
        // Delegate to the internal crashApplicationInner()
        crashApplicationInner(r, crashInfo, callingPid, callingUid);
    } finally {
        Binder.restoreCallingIdentity(origId);
    }
}
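crashApplication() reads the caller's pid and uid before calling Binder.clearCallingIdentity(), because after that call the thread reports the local process's identity until it is restored. The following is only a simulation of that idea in plain Java: a ThreadLocal stands in for Binder's per-thread caller state, and LOCAL_UID and the method bodies are assumptions for illustration, not the real Binder implementation.

```java
class CallingIdentityDemo {
    static final int LOCAL_UID = 1000; // illustrative uid of the service's own process

    // Simulated per-thread caller identity.
    private static final ThreadLocal<Integer> callingUid =
            ThreadLocal.withInitial(() -> LOCAL_UID);

    static int getCallingUid() {
        return callingUid.get();
    }

    /** Resets the identity to the local process and returns a token for restoring it. */
    static long clearCallingIdentity() {
        long token = callingUid.get();
        callingUid.set(LOCAL_UID);
        return token;
    }

    static void restoreCallingIdentity(long token) {
        callingUid.set((int) token);
    }

    /**
     * Simulates an incoming call from remoteUid.
     * Returns {uid captured up front, uid seen while cleared, uid after restore}.
     */
    static int[] handleIncomingCall(int remoteUid) {
        callingUid.set(remoteUid);          // what the IPC layer would do on dispatch
        int captured = getCallingUid();     // capture BEFORE clearing, as crashApplication() does
        long token = clearCallingIdentity();
        int inside;
        try {
            inside = getCallingUid();       // now reports the local identity
        } finally {
            restoreCallingIdentity(token);  // always restore, even on exceptions
        }
        return new int[] { captured, inside, getCallingUid() };
    }
}
```

This is why callingPid and callingUid are passed into crashApplicationInner() as parameters instead of being queried again inside it.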

Next, crashApplicationInner():

frameworks/base/services/core/java/com/android/server/am/AppErrors.java

void crashApplicationInner(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo,
        int callingPid, int callingUid) {
    ...
    synchronized (mService) {
        // 1. If an IActivityController is registered and has already handled the crash, no error dialog is shown
        if (handleAppCrashInActivityController(r, crashInfo, shortMsg, longMsg, stackTrace,
                timeMillis, callingPid, callingUid)) {
            return;
        }
        ...
        AppErrorDialog.Data data = new AppErrorDialog.Data();
        data.result = result;
        data.proc = r;
        ...
        // 2. Send SHOW_ERROR_UI_MSG to AMS's mUiHandler, which pops up a dialog telling the user that a process crashed
        final Message msg = Message.obtain();
        msg.what = ActivityManagerService.SHOW_ERROR_UI_MSG;

        task = data.task;
        msg.obj = data;
        mService.mUiHandler.sendMessage(msg);
    }
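    // 3. AppErrorResult.get() calls wait() internally, so this thread blocks here. Once the user has dealt with
    //    the dialog, AppErrorResult.set() is called and its notifyAll() wakes this thread.
    // Note that two threads are involved: crashApplicationInner() runs on the Binder thread that received the
    // call, while the dialog runs on AMS's UI thread.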
    
    int res = result.get();

    Intent appErrorIntent = null;
    MetricsLogger.action(mContext, MetricsProto.MetricsEvent.ACTION_APP_CRASH, res);
    // 4. Inspect the user's choice and act on it
    if (res == AppErrorDialog.TIMEOUT || res == AppErrorDialog.CANCEL) {
        res = AppErrorDialog.FORCE_QUIT;
    }
    synchronized (mService) {
        // Stop reporting crashes for this app
        if (res == AppErrorDialog.MUTE) {
            stopReportingCrashesLocked(r);
        }
        // Try to restart the process
        if (res == AppErrorDialog.RESTART) {
            mService.removeProcessLocked(r, false, true, "crash");
            if (task != null) {
                try {
                    mService.startActivityFromRecents(task.taskId,
                            ActivityOptions.makeBasic().toBundle());
                } ...
            }
        }
        // Force-stop the process
        if (res == AppErrorDialog.FORCE_QUIT) {
            long orig = Binder.clearCallingIdentity();
            try {
                // Kill it with fire!
                mService.mStackSupervisor.handleAppCrashLocked(r);
                if (!r.persistent) {
                    mService.removeProcessLocked(r, false, false, "crash");
                    mService.mStackSupervisor.resumeFocusedStackTopActivityLocked();
                }
            } finally {
                Binder.restoreCallingIdentity(orig);
            }
        }
        // Stop the process and report the error
        if (res == AppErrorDialog.FORCE_QUIT_AND_REPORT) {
            appErrorIntent = createAppErrorIntentLocked(r, timeMillis, crashInfo);
        }
        ...
    }

    if (appErrorIntent != null) {
        try {
            // Launch the error-report UI
            mContext.startActivityAsUser(appErrorIntent, new UserHandle(r.userId));
        } catch (ActivityNotFoundException e) {
            Slog.w(TAG, "bug report receiver dissappeared", e);
        }
    }
}

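Note 1 gives a crash observer the first chance to handle the crash. The observer is registered through AMS's setActivityController(); if it handles the crash, no error dialog is shown. Note 2 sends SHOW_ERROR_UI_MSG to AMS's mUiHandler to request the error dialog. Note 3 blocks the thread by calling AppErrorResult.get(). Two threads are involved here: crashApplicationInner() runs on the Binder thread that received the call, while the dialog is shown on AMS's UI thread. AppErrorResult itself is covered later. When the user responds to the dialog, or the timeout expires, get() wakes up and returns the result. Note 4 then acts on that result, for example by force-stopping or restarting the process.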
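The branching at note 4 is easier to see as a pure function. The sketch below mirrors only the branches visible in the listing above. The enum, the method name, and the action strings are illustrative; the real code uses int constants on AppErrorDialog and calls into AMS rather than returning strings.

```java
import java.util.ArrayList;
import java.util.List;

class CrashResultDispatch {
    enum Result { FORCE_QUIT, FORCE_QUIT_AND_REPORT, RESTART, MUTE, TIMEOUT, CANCEL }

    /** Returns, in order, the actions taken for a dialog result. */
    static List<String> actionsFor(Result res, boolean persistent, boolean hasTask) {
        List<String> actions = new ArrayList<>();
        // A timeout or a cancelled dialog is treated exactly like "close app".
        if (res == Result.TIMEOUT || res == Result.CANCEL) {
            res = Result.FORCE_QUIT;
        }
        if (res == Result.MUTE) {
            actions.add("stop reporting crashes");
        }
        if (res == Result.RESTART) {
            actions.add("remove process");
            if (hasTask) {
                actions.add("restart task from recents");
            }
        }
        if (res == Result.FORCE_QUIT) {
            actions.add("finish crashed activities");
            if (!persistent) {
                // Persistent processes are not removed here.
                actions.add("remove process");
                actions.add("resume top activity");
            }
        }
        if (res == Result.FORCE_QUIT_AND_REPORT) {
            actions.add("launch report UI");
        }
        return actions;
    }

    public static void main(String[] args) {
        System.out.println(actionsFor(Result.TIMEOUT, false, false));
    }
}
```

Normalising TIMEOUT and CANCEL up front means the later branches only have to deal with choices a user could actually have made.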

Now for how note 2 gets the error dialog on screen. When AMS's UiHandler receives the message, it shows the dialog.

Showing the crash dialog and handling the user's choice

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

final class UiHandler extends Handler {
    @Override
    public void handleMessage(Message msg) {
        switch (msg.what) {
        // Show the crash dialog
        case SHOW_ERROR_UI_MSG: {
            mAppErrors.handleShowAppErrorUi(msg);
            ensureBootCompleted();
        } break;
        // Show the ANR dialog
        case SHOW_NOT_RESPONDING_UI_MSG: {
            mAppErrors.handleShowAnrUi(msg);
            ensureBootCompleted();
        } break;
        ...
}

UiHandler handles both the crash dialog and the ANR dialog. For the crash dialog, the work is again delegated to AppErrors.

frameworks/base/services/core/java/com/android/server/am/AppErrors.java

void handleShowAppErrorUi(Message msg) {
    ...
    synchronized (mService) {
        ProcessRecord proc = data.proc;
        AppErrorResult res = data.result;
        // 1. A crash dialog is already showing for this process, so do not show another
        if (proc != null && proc.crashDialog != null) {
            if (res != null) {
                res.set(AppErrorDialog.ALREADY_SHOWING);
            }
            return;
        }
        
       ...
        final boolean crashSilenced = mAppsNotReportingCrashes != null &&
                mAppsNotReportingCrashes.contains(proc.info.packageName);
        if ((mService.canShowErrorDialogs() || showBackground) && !crashSilenced) {
            // 2. Create the crash dialog
            proc.crashDialog = new AppErrorDialog(mContext, mService, data);
        } else {
            // 3. No dialog is shown if AMS forbids error dialogs, the device is asleep, or crashes for this app are silenced
            if (res != null) {
                res.set(AppErrorDialog.CANT_SHOW);
            }
        }
    }
    // 4. Call Dialog.show() to display the crash dialog
    if(data.proc.crashDialog != null) {
        data.proc.crashDialog.show();
    }
}

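Note 1 checks whether the crashed process already has a dialog showing; if so, there is nothing more to show. At note 2, if the screen is on and AMS allows crash dialogs, the dialog is created; otherwise note 3 reports that it cannot be shown. Reaching note 4 with a dialog in hand means it should be displayed, so Dialog.show() is called. About the res.set() calls at notes 1 and 3: res is the AppErrorResult created in crashApplicationInner(), which called result.get() and blocked after asking AMS to show the dialog. Calling set() wakes that Binder thread, which continues and evaluates the result.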
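The three outcomes of handleShowAppErrorUi() described above can be condensed into a small decision function. This sketch covers only the branches visible in the listing (the elided parts are left out), and the enum and method names are made up for illustration.

```java
class ShowErrorUiDecision {
    enum Outcome { ALREADY_SHOWING, SHOW_DIALOG, CANT_SHOW }

    /**
     * Decides what to do with a request to show a crash dialog.
     * The parameters mirror the conditions in the listing: whether proc.crashDialog
     * is already set, mService.canShowErrorDialogs(), showBackground, and crashSilenced.
     */
    static Outcome decide(boolean dialogAlreadyShown, boolean canShowErrorDialogs,
            boolean showBackground, boolean crashSilenced) {
        if (dialogAlreadyShown) {
            // The waiting thread is released immediately with ALREADY_SHOWING.
            return Outcome.ALREADY_SHOWING;
        }
        if ((canShowErrorDialogs || showBackground) && !crashSilenced) {
            // A dialog is created; the waiting thread stays blocked until it is answered.
            return Outcome.SHOW_DIALOG;
        }
        // The waiting thread is released immediately with CANT_SHOW.
        return Outcome.CANT_SHOW;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, true, false, false));
    }
}
```

Only SHOW_DIALOG leaves the Binder thread blocked; the other two outcomes call set() right away, so get() returns without any user interaction.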

Here are AppErrorResult's get() and set():

frameworks/base/services/core/java/com/android/server/am/AppErrorResult.java

final class AppErrorResult {
    public void set(int res) {
        synchronized (this) {
            mHasResult = true;
            // 1. Store the result in mResult
            mResult = res;
            // 2. notifyAll() wakes every thread that is waiting on this object's monitor
            notifyAll();
        }
    }

    public int get() {
        synchronized (this) {
            while (!mHasResult) {
                try {
                    // 3. wait() is what actually blocks the current thread
                    wait();
                } catch (InterruptedException e) {
                }
            }
        }
        // 4. Return mResult
        return mResult;
    }

    boolean mHasResult = false;
    int mResult;
}

get() blocks the calling thread; set() updates mResult and wakes the waiting threads. Execution then resumes after wait() inside get(), and the value stored by set() is returned.
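The same wait()/notifyAll() handoff can be tried out in plain Java. In the sketch below, BlockingResult is a renamed copy of the pattern, not the AOSP class, and a sleeping "ui" thread stands in for the user answering the dialog. Unlike the listing above, the result is read while still holding the lock.

```java
class BlockingResultDemo {
    /** Same pattern as AppErrorResult: one thread blocks in get(), another calls set(). */
    static final class BlockingResult {
        private boolean hasResult;
        private int result;

        synchronized void set(int value) {
            hasResult = true;
            result = value;
            notifyAll(); // wake every thread waiting on this object's monitor
        }

        synchronized int get() {
            // Loop guards against spurious wakeups: only a real set() ends the wait.
            while (!hasResult) {
                try {
                    wait();
                } catch (InterruptedException e) {
                    // Keep waiting, as the original does.
                }
            }
            return result;
        }
    }

    /** Blocks the calling thread until a second thread supplies the answer. */
    static int waitForAnswer(int answer, long delayMillis) {
        BlockingResult result = new BlockingResult();
        Thread ui = new Thread(() -> {
            try {
                Thread.sleep(delayMillis); // pretend the user is thinking
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            result.set(answer);
        }, "ui");
        ui.start();
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(waitForAnswer(3, 50));
    }
}
```

The mHasResult flag also covers the case where set() runs before get() is ever called: get() then returns at once instead of waiting for a notification that has already happened.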

What happens once the dialog is up and the user responds, or the timeout expires? Let's look at AppErrorDialog.

frameworks/base/services/core/java/com/android/server/am/AppErrorDialog.java

@Override
public void onClick(View v) {
    // 1. Decide what to do based on which view was clicked
    switch (v.getId()) {
        // Request a restart of the process
        case com.android.internal.R.id.aerr_restart:
            mHandler.obtainMessage(RESTART).sendToTarget();
            break;
        // Request to send feedback about the crash
        case com.android.internal.R.id.aerr_report:
            mHandler.obtainMessage(FORCE_QUIT_AND_REPORT).sendToTarget();
            break;
        // Request to close the crash dialog and kill the process
        case com.android.internal.R.id.aerr_close:
            mHandler.obtainMessage(FORCE_QUIT).sendToTarget();
            break;
        // Request that the dialog not be shown again
        case com.android.internal.R.id.aerr_mute:
            mHandler.obtainMessage(MUTE).sendToTarget();
            break;
        default:
            break;
    }
}
    
// 2. On receiving the message, call setResult() and dismiss the dialog
private final Handler mHandler = new Handler() {
    public void handleMessage(Message msg) {
        setResult(msg.what);
        dismiss();
    }
};
private void setResult(int result) {
    synchronized (mService) {
        if (mProc != null && mProc.crashDialog == AppErrorDialog.this) {
            mProc.crashDialog = null;
        }
    }
    // 3. AppErrorResult.set() wakes the blocked thread and hands it the user's choice
    mResult.set(result);

    mHandler.removeMessages(TIMEOUT);
}

The notes above spell out the steps: mResult.set() finally wakes the blocked thread, which then continues executing:

frameworks/base/services/core/java/com/android/server/am/AppErrors.java

void crashApplicationInner(ProcessRecord r, ApplicationErrorReport.CrashInfo crashInfo,
        int callingPid, int callingUid) {
    ...
    // 3. Block until the user handles the dialog or the timeout fires
    int res = result.get();
    // 4. Inspect the user's choice and act on it
    ...
}
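The text mentions a timeout, and setResult() removes the pending TIMEOUT message once a result has been delivered. A rough plain-Java analogue of "user click versus timeout, first one wins" is sketched below. It uses a ScheduledExecutorService in place of Android's Handler, and the class name, method name, and string results are assumptions for illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class DialogTimeoutDemo {
    static final String TIMEOUT = "TIMEOUT";

    /**
     * Returns the user's choice if it arrives before the timeout, otherwise TIMEOUT.
     * A negative userDelayMillis means the user never clicks.
     */
    static String showDialog(String userChoice, long userDelayMillis, long timeoutMillis) {
        // Single thread plays the role of the UI thread's message queue.
        ScheduledExecutorService ui = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<String> result = new CompletableFuture<>();
        try {
            // Delayed TIMEOUT "message", posted when the dialog is created.
            ScheduledFuture<?> timeout = ui.schedule(() -> {
                result.complete(TIMEOUT);
            }, timeoutMillis, TimeUnit.MILLISECONDS);

            if (userDelayMillis >= 0) {
                ui.schedule(() -> {
                    // complete() succeeds only for the first caller.
                    if (result.complete(userChoice)) {
                        timeout.cancel(false); // analogue of removeMessages(TIMEOUT)
                    }
                }, userDelayMillis, TimeUnit.MILLISECONDS);
            }
            return result.join(); // the waiting side, like result.get() in AppErrors
        } finally {
            ui.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(showDialog("RESTART", 20, 2000));
        System.out.println(showDialog("RESTART", -1, 50));
    }
}
```

Whichever side completes the future first decides the result, so the blocked caller always gets exactly one answer even if the click and the timeout race.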

Follow-up cleanup

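From the flow above, we know that a crashed process ends up being killed. AMS then still has cleanup work to do.

First, recall part of what happens when a newly started process registers itself with AMS: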

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

// After a process starts, its ActivityThread attaches to AMS
private final boolean attachApplicationLocked(IApplicationThread thread,
            int pid) {
    ...
    final String processName = app.processName;
    try {
        // 1. Create the death ("obituary") recipient
        AppDeathRecipient adr = new AppDeathRecipient(
                app, pid, thread);
        thread.asBinder().linkToDeath(adr, 0);
        app.deathRecipient = adr;
    } 
    ...
}
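The linkToDeath() registration above follows an observer pattern: the service hands a callback to the remote object and is called back when that object's process dies. The sketch below simulates this in plain Java. FakeRemote and the local DeathRecipient interface are hypothetical stand-ins, and the real Binder delivers the notification from the kernel driver rather than from a kill() method.

```java
import java.util.ArrayList;
import java.util.List;

class DeathLinkDemo {
    /** Local stand-in for IBinder.DeathRecipient. */
    interface DeathRecipient {
        void binderDied();
    }

    /** Hypothetical stand-in for the app process's Binder object. */
    static final class FakeRemote {
        private final List<DeathRecipient> recipients = new ArrayList<>();
        private boolean alive = true;

        void linkToDeath(DeathRecipient recipient) {
            recipients.add(recipient);
        }

        void kill() {
            if (!alive) {
                return; // already dead: the obituary is delivered only once
            }
            alive = false;
            for (DeathRecipient recipient : recipients) {
                recipient.binderDied();
            }
        }
    }

    /** Attaches a recipient, kills the remote twice, and returns the event log. */
    static List<String> simulate() {
        List<String> log = new ArrayList<>();
        FakeRemote app = new FakeRemote();
        app.linkToDeath(() -> log.add("binderDied: clean up process record"));
        log.add("attached");
        app.kill();
        app.kill();
        return log;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

Because the callback is registered at attach time, the service does not need to poll for dead processes; cleanup starts as soon as the death is reported.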

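When a process attaches to AMS, AMS registers a death recipient on that process's Binder object.
So once the crashed process has been killed, AppDeathRecipient.binderDied() is called back. The source shows that binderDied() in turn calls appDiedLocked():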

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

final void appDiedLocked(ProcessRecord app, int pid, IApplicationThread thread,
        boolean fromBinderDied) {
    ...
    // 1. If the process has not been killed yet, kill it
    if (!app.killed) {
        if (!fromBinderDied) {
            killProcessQuiet(pid);
        }
        killProcessGroup(app.uid, pid);
        app.killed = true;
    }

    if (app.pid == pid && app.thread != null &&
            app.thread.asBinder() == thread.asBinder()) {
        ...
        // 2. The core of the cleanup
        handleAppDiedLocked(app, false, true);
        ...
    } ...
}

Note 1 kills the process if it is still alive. Note 2 is where the app's death is actually handled:

frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java

private final void handleAppDiedLocked(ProcessRecord app,
        boolean restarting, boolean allowRestart) {
    int pid = app.pid;
    // 1. Clean up the process's Services, ContentProviders, BroadcastReceivers and so on
    boolean kept = cleanUpApplicationRecordLocked(app, restarting, allowRestart, -1,
            false /*replacingPid*/);
    if (!kept && !restarting) {
        removeLruProcessLocked(app);
        if (pid > 0) {
            ProcessList.remove(pid);
        }
    }

    ...
    // 2. Check whether the process still had visible activities
    boolean hasVisibleActivities = mStackSupervisor.handleAppDiedLocked(app);
    // Clear the activity list
    app.activities.clear();
    ...
    try {
        if (!restarting && hasVisibleActivities
                && !mStackSupervisor.resumeFocusedStackTopActivityLocked()) {
            // 3. If the crashed process had visible activities, AMS still makes sure every visible activity is running, so the process gets restarted
            mStackSupervisor.ensureActivitiesVisibleLocked(null, 0, !PRESERVE_WINDOWS);
        }
    } finally {
        mWindowManager.continueSurfaceLayout();
    }
}

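The important part of note 1 concerns bound Services in the crashed process: the connections between each service and its clients are torn down, and a client whose importance is low enough may be killed outright. Note 2 checks whether the app still had visible activities. At note 3 the system has to keep visible activities running, so it restarts the process.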

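Summary

So that is what lies behind an "app has stopped" dialog. Crashes cannot be avoided entirely, but when one happens the dialog is an ugly experience and leaves the developer with no logs to collect. The next post looks at how developers can handle uncaught exceptions themselves.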
