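Android: A Deep Dive into How ANRs Are Triggered (Service)

1. Overview

       ANR (Application Not Responding) means that an application has stopped responding. Android requires certain events to be handled within a fixed time window; if no effective response arrives within that window, or the response takes too long, an ANR is raised. Typically a dialog then tells the user that app xxx is not responding, and the user can choose to keep waiting or to Force Close. Not every ANR shows a dialog, though; the reason is given later in this article.

Which scenarios can cause an ANR?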

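  • Service Timeout: for example, a foreground service does not finish executing within 20s, or a background service within 200s;
  • BroadcastQueue Timeout: for example, a foreground broadcast does not finish within 10s, or a background broadcast within 60s;
  • ContentProvider Timeout: a content provider takes longer than 10s to publish;
  • InputDispatching Timeout: dispatching an input event takes longer than 5s; this covers both key and touch events.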

2. Service Timeout

  2.1 Timeout thresholds

A Service Timeout is triggered when AMS.MainHandler, which runs on the "ActivityManager" thread, receives the SERVICE_TIMEOUT_MSG message.

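Services fall into two classes:

  • For a foreground service, the timeout is SERVICE_TIMEOUT = 20s;

  • For a background service, the timeout is SERVICE_BACKGROUND_TIMEOUT = 200s.

The field ProcessRecord.execServicesFg decides whether the service counts as a foreground start.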
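The choice between the two thresholds can be restated as a tiny sketch. The class name ServiceTimeouts and the method timeoutFor are made up for illustration; the values are the ones quoted in this article, and the "background = foreground x 10" relation is an assumption about how the constant is defined, so vendor builds may differ.

```java
final class ServiceTimeouts {
    // Values quoted in this article; treat them as assumptions for other builds.
    static final long SERVICE_TIMEOUT_MS = 20_000L;
    // Assumed to be ten times the foreground value, i.e. 200s.
    static final long SERVICE_BACKGROUND_TIMEOUT_MS = SERVICE_TIMEOUT_MS * 10;

    /** Same ternary the article shows in scheduleServiceTimeoutLocked(). */
    static long timeoutFor(boolean execServicesFg) {
        return execServicesFg ? SERVICE_TIMEOUT_MS : SERVICE_BACKGROUND_TIMEOUT_MS;
    }
}
```

Calling ServiceTimeouts.timeoutFor(proc.execServicesFg) would then give the delay used for the timeout message.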

2.2 startService

   While the service's process attaches to the system_server process, realStartServiceLocked() is called:

 private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {
        if (app.thread == null) {
            throw new RemoteException();
        }
        if (DEBUG_MU)
            Slog.v(TAG_MU, "realStartServiceLocked, ServiceRecord.uid = " + r.appInfo.uid
                    + ", ProcessRecord.uid = " + app.uid);
        r.setProcess(app);
        r.restartTime = r.lastActivity = SystemClock.uptimeMillis();

        final boolean newService = app.services.add(r);
        // Start the ANR clock for the "create" step (posts the delayed SERVICE_TIMEOUT_MSG)
        bumpServiceExecutingLocked(r, execInFg, "create");
        mAm.updateLruProcessLocked(app, false, null);
        updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
        mAm.updateOomAdjLocked(OomAdjuster.OOM_ADJ_REASON_START_SERVICE);

        boolean created = false;
        try {
            if (LOG_SERVICE_START_STOP) {
                String nameTerm;
                int lastPeriod = r.shortInstanceName.lastIndexOf('.');
                nameTerm = lastPeriod >= 0 ? r.shortInstanceName.substring(lastPeriod)
                        : r.shortInstanceName;
                EventLogTags.writeAmCreateService(
                        r.userId, System.identityHashCode(r), nameTerm, r.app.uid, r.app.pid);
            }
            StatsLog.write(StatsLog.SERVICE_LAUNCH_REPORTED, r.appInfo.uid, r.name.getPackageName(),
                    r.name.getClassName());
            synchronized (r.stats.getBatteryStats()) {
                r.stats.startLaunchedLocked();
            }
            mAm.notifyPackageUse(r.serviceInfo.packageName,
                                 PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
            app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
            // Cross into the app's ActivityThread to create the service. ActivityThread is the entry point
            // of every app process forked from zygote: a newly forked app process runs ActivityThread.main().
            app.thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackage(r.serviceInfo.applicationInfo),
                    app.getReportedProcState());
            r.postNotification();
            created = true;
			.......
		}

  Two calls in the code above matter most: bumpServiceExecutingLocked and scheduleCreateService. bumpServiceExecutingLocked eventually calls mAm.mHandler.sendMessageDelayed(msg, proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT) to post a delayed message, so the ANR clock starts in bumpServiceExecutingLocked. scheduleCreateService then asks the app process to instantiate the service and run its onCreate().

private final void bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why) {
        if (DEBUG_SERVICE) Slog.v(TAG_SERVICE, ">>> EXECUTING "
                + why + " of " + r + " in app " + r.app);
        else if (DEBUG_SERVICE_EXECUTING) Slog.v(TAG_SERVICE_EXECUTING, ">>> EXECUTING "
                + why + " of " + r.shortInstanceName);

        // For b/34123235: Services within the system server won't start until SystemServer
        // does Looper.loop(), so we shouldn't try to start/bind to them too early in the boot
        // process. However, since there's a little point of showing the ANR dialog in that case,
        // let's suppress the timeout until PHASE_THIRD_PARTY_APPS_CAN_START.
        //
        // (Note there are multiple services start at PHASE_THIRD_PARTY_APPS_CAN_START too,
        // which technically could also trigger this timeout if there's a system server
        // that takes a long time to handle PHASE_THIRD_PARTY_APPS_CAN_START, but that shouldn't
        // happen.)
        boolean timeoutNeeded = true;
        if ((mAm.mBootPhase < SystemService.PHASE_THIRD_PARTY_APPS_CAN_START)
                && (r.app != null) && (r.app.pid == android.os.Process.myPid())) {

            Slog.w(TAG, "Too early to start/bind service in system_server: Phase=" + mAm.mBootPhase
                    + " " + r.getComponentName());
            timeoutNeeded = false;
        }

        long now = SystemClock.uptimeMillis();
        if (r.executeNesting == 0) {
            r.executeFg = fg;
            ServiceState stracker = r.getTracker();
            if (stracker != null) {
                stracker.setExecuting(true, mAm.mProcessStats.getMemFactorLocked(), now);
            }
            if (r.app != null) {
                r.app.executingServices.add(r);
                r.app.execServicesFg |= fg;
                if (timeoutNeeded && r.app.executingServices.size() == 1) {
                    scheduleServiceTimeoutLocked(r.app);
                }
            }
        } else if (r.app != null && fg && !r.app.execServicesFg) {
            r.app.execServicesFg = true;
            if (timeoutNeeded) {
                scheduleServiceTimeoutLocked(r.app);
            }
        }
        r.executeFg |= fg; // executeFg records whether this service is executing in the foreground
        r.executeNesting++;
        r.executingStart = now; // record when this service started executing
    }
 void scheduleServiceTimeoutLocked(ProcessRecord proc) {
        if (proc.executingServices.size() == 0 || proc.thread == null) {
            return;
        }
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_TIMEOUT_MSG);
        msg.obj = proc;
        // Post the delayed SERVICE_TIMEOUT_MSG
        mAm.mHandler.sendMessageDelayed(msg,
                proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
        /// M: ANR Debug Mechanism
        mAm.mAnrManager.sendServiceMonitorMessage();
    }
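
The arm-then-disarm pattern behind sendMessageDelayed and removeMessages can be modelled without Android at all. The sketch below is a plain-Java stand-in with a fake clock; FakeHandler, its advance method, and the constant value 1 are hypothetical names chosen for this illustration and are not the real android.os.Handler.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class FakeHandler {
    static final int SERVICE_TIMEOUT_MSG = 1; // hypothetical value for this sketch

    private static final class Msg {
        final int what;
        final Object obj;
        final long when;
        Msg(int what, Object obj, long when) {
            this.what = what;
            this.obj = obj;
            this.when = when;
        }
    }

    private final List<Msg> queue = new ArrayList<>();
    private long now = 0; // fake uptime in milliseconds

    /** Arms the timeout: the message becomes due delayMs from now. */
    void sendMessageDelayed(int what, Object obj, long delayMs) {
        queue.add(new Msg(what, obj, now + delayMs));
    }

    /** Disarms the timeout: drops pending messages with the same code and object. */
    void removeMessages(int what, Object obj) {
        queue.removeIf(m -> m.what == what && m.obj == obj);
    }

    /** Moves the fake clock forward and returns the codes of messages that became due. */
    List<Integer> advance(long ms) {
        now += ms;
        List<Integer> delivered = new ArrayList<>();
        for (Iterator<Msg> it = queue.iterator(); it.hasNext();) {
            Msg m = it.next();
            if (m.when <= now) {
                delivered.add(m.what);
                it.remove();
            }
        }
        return delivered;
    }
}
```

If the work finishes and removeMessages runs before the deadline, advance never returns the timeout code; if it does not, the code is delivered once the delay has elapsed, which is the moment the ANR path would begin.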

2.3 Removing SERVICE_TIMEOUT_MSG

  So where is SERVICE_TIMEOUT_MSG removed? Intuitively it should happen once the service has run onCreate(), and that is indeed the case:

 private void handleCreateService(CreateServiceData data) {
        // If we are getting ready to gc after going to the background, well
        // we are back active so skip it.
        unscheduleGcIdler();

        LoadedApk packageInfo = getPackageInfoNoCheck(
                data.info.applicationInfo, data.compatInfo);
        Service service = null;
        try {
            java.lang.ClassLoader cl = packageInfo.getClassLoader();
            service = packageInfo.getAppFactory()
                    .instantiateService(cl, data.info.name, data.intent);
        } catch (Exception e) {
            if (!mInstrumentation.onException(service, e)) {
                throw new RuntimeException(
                    "Unable to instantiate service " + data.info.name
                    + ": " + e.toString(), e);
            }
        }

        try {
            if (localLOGV) Slog.v(TAG, "Creating service " + data.info.name);

            ContextImpl context = ContextImpl.createAppContext(this, packageInfo);
            context.setOuterContext(service);
            // Create the Application object
            Application app = packageInfo.makeApplication(false, mInstrumentation);
            service.attach(context, this, data.info.name, data.token, app,
                    ActivityManager.getService());
            service.onCreate(); // invoke the service's onCreate()
            mServices.put(data.token, service);
            try {
                // Report completion to AMS, which removes SERVICE_TIMEOUT_MSG
                ActivityManager.getService().serviceDoneExecuting(
                        data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
            } catch (RemoteException e) {
                throw e.rethrowFromSystemServer();
            }
        } catch (Exception e) {
            if (!mInstrumentation.onException(service, e)) {
                throw new RuntimeException(
                    "Unable to create service " + data.info.name
                    + ": " + e.toString(), e);
            }
        }
    }
 public void serviceDoneExecuting(IBinder token, int type, int startId, int res) {
        synchronized(this) {
            if (!(token instanceof ServiceRecord)) {
                Slog.e(TAG, "serviceDoneExecuting: Invalid service token=" + token);
                throw new IllegalArgumentException("Invalid service token");
            }
            mServices.serviceDoneExecutingLocked((ServiceRecord)token, type, startId, res);
        }
    }
 private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
            boolean finishing) {
        if (DEBUG_SERVICE) Slog.v(TAG_SERVICE, "<<< DONE EXECUTING " + r
                + ": nesting=" + r.executeNesting
                + ", inDestroying=" + inDestroying + ", app=" + r.app);
        else if (DEBUG_SERVICE_EXECUTING) Slog.v(TAG_SERVICE_EXECUTING,
                "<<< DONE EXECUTING " + r.shortInstanceName);
        r.executeNesting--;
        if (r.executeNesting <= 0) {
            if (r.app != null) {
                if (DEBUG_SERVICE) Slog.v(TAG_SERVICE,
                        "Nesting at 0 of " + r.shortInstanceName);
                r.app.execServicesFg = false;
                r.app.executingServices.remove(r); // the service is done executing, so remove it; it was added in bumpServiceExecutingLocked
                if (r.app.executingServices.size() == 0) {
                    if (DEBUG_SERVICE || DEBUG_SERVICE_EXECUTING) Slog.v(TAG_SERVICE_EXECUTING,
                            "No more executingServices of " + r.shortInstanceName);
					
                    // Remove SERVICE_TIMEOUT_MSG
                    mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
                    /// M: ANR Debug Mechanism
                    mAm.mAnrManager.removeServiceMonitorMessage();
                } else if (r.executeFg) {
                    // Need to re-evaluate whether the app still needs to be in the foreground.
                    for (int i=r.app.executingServices.size()-1; i>=0; i--) {
                        if (r.app.executingServices.valueAt(i).executeFg) {
                            r.app.execServicesFg = true;
                            break;
                        }
                    }

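The trigger mechanism for a service ANR is simple. So consider this question: under what conditions does a service ANR actually fire?

Answer: 1. The executing code times out, meaning the span from startService to the end of Service.onCreate() exceeds the limit. The most common causes are time-consuming work inside onCreate() or a deadlock. 2. Device performance: when CPU usage is high, the process may get no CPU time, or only small time slices, so execution times out. For this case, check the trace output or the cpuinfo data.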

 

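2.4 When an ANR occurs

    Consider two questions: 1. When an ANR occurs, how does the system react and what actions does it take? 2. Which logs does the system emit, and which of them tell us that an ANR has happened?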

    

final class MainHandler extends Handler {
        public MainHandler(Looper looper) {
            super(looper, null, true);
        }

        @Override
        public void handleMessage(Message msg) {
            switch (msg.what) {
            case GC_BACKGROUND_PROCESSES_MSG: {
                synchronized (ActivityManagerService.this) {
                    performAppGcsIfAppropriateLocked();
                }
            } break;
            case SERVICE_TIMEOUT_MSG: {
                /// M: ANR Debug Mechanism @{
                if (mAnrManager.delayMessage(mHandler, msg, SERVICE_TIMEOUT_MSG,
                        ActiveServices.SERVICE_TIMEOUT))
                    return; /// @}
                mServices.serviceTimeout((ProcessRecord)msg.obj);
            } break;
			
void serviceTimeout(ProcessRecord proc) {
        String anrMessage = null;
        synchronized(mAm) {
            if (proc.isDebugging()) {
                // The app's being debugged, ignore timeout.
                return;
            }
            if (proc.executingServices.size() == 0 || proc.thread == null) {
                return;
            }
            final long now = SystemClock.uptimeMillis();
            // Pick the foreground or the background timeout
            final long maxTime =  now -
                    (proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
            ServiceRecord timeout = null;
            long nextTime = 0;
            // sr.executingStart is when the service started executing; it is set in bumpServiceExecutingLocked (r.executingStart = now).
            // Services are added to proc.executingServices in bumpServiceExecutingLocked (r.app.executingServices.add(r))
            // and removed in serviceDoneExecutingLocked (r.app.executingServices.remove(r)).
            for (int i=proc.executingServices.size()-1; i>=0; i--) {
                ServiceRecord sr = proc.executingServices.valueAt(i);
                if (sr.executingStart < maxTime) {
                    timeout = sr; // found the service that timed out
                    break;
                }
                if (sr.executingStart > nextTime) {
                    nextTime = sr.executingStart;
                }
            }
			
            // If a timed-out service was found, build anrMessage; appNotResponding is called with it at the end
            if (timeout != null && mAm.mProcessList.mLruProcesses.contains(proc)) {
                Slog.w(TAG, "Timeout executing service: " + timeout);
                StringWriter sw = new StringWriter();
                PrintWriter pw = new FastPrintWriter(sw, false, 1024);
                pw.println(timeout);
                timeout.dump(pw, "    ");
                pw.close();
                mLastAnrDump = sw.toString();
                mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
                mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
                anrMessage = "executing service " + timeout.shortInstanceName;
            } else {
                // No timed-out service was found, so re-post SERVICE_TIMEOUT_MSG (a Handler message, not a broadcast) for nextTime + the timeout.
                Message msg = mAm.mHandler.obtainMessage(
                        ActivityManagerService.SERVICE_TIMEOUT_MSG);
                msg.obj = proc;
                mAm.mHandler.sendMessageAtTime(msg, proc.execServicesFg
                        ? (nextTime+SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT));
            }
        }

        if (anrMessage != null) { // a timed-out service was found: hand the anrMessage built above to proc.appNotResponding
            proc.appNotResponding(null, null, null, null, false, anrMessage);
        }
    }
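
The scan in serviceTimeout can be isolated as a pure function, which makes the two outcomes easier to see: either some service started before now minus the timeout and is reported, or nothing has expired and the next check is scheduled from the latest start time. This is a simplified sketch; ServiceTimeoutScan, Result, and the long[] input are invented for the example and stand in for the ServiceRecord collection.

```java
final class ServiceTimeoutScan {
    /** Outcome of one scan: which entry timed out (or -1) and when to check again (or -1). */
    static final class Result {
        final int timedOutIndex;
        final long nextCheckAt;
        Result(int timedOutIndex, long nextCheckAt) {
            this.timedOutIndex = timedOutIndex;
            this.nextCheckAt = nextCheckAt;
        }
    }

    static Result scan(long[] executingStart, long now, long timeoutMs) {
        long maxTime = now - timeoutMs; // anything started before this has timed out
        long nextTime = 0;              // latest start time seen among non-expired entries
        for (int i = executingStart.length - 1; i >= 0; i--) {
            if (executingStart[i] < maxTime) {
                return new Result(i, -1); // ANR candidate found, stop scanning
            }
            if (executingStart[i] > nextTime) {
                nextTime = executingStart[i];
            }
        }
        // Nothing expired yet: re-check one timeout after the latest start.
        return new Result(-1, nextTime + timeoutMs);
    }
}
```

With start times {10000, 15000}, now = 25000 and a 20000 ms timeout, nothing has expired and the next check lands at 35000.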
 void appNotResponding(String activityShortComponentName, ApplicationInfo aInfo,
            String parentShortComponentName, WindowProcessController parentProcess,
            boolean aboveSystem, String annotation) {
        ArrayList<Integer> firstPids = new ArrayList<>(5);
        SparseArray<Boolean> lastPids = new SparseArray<>(20);
       
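       // If a controller was registered via ActivityManager.getService().setActivityController (custom ANR handling),
       // it is notified first and can ask for the process to be killed right away through the kill("anr", true) callback.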
        mWindowProcessController.appEarlyNotResponding(annotation, () -> kill("anr", true));

        long anrTime = SystemClock.uptimeMillis();
        if (isMonitorCpuUsage()) {
            mService.updateCpuStatsNow(); // refresh the CPU stats
        }

        synchronized (mService) {
            // PowerManager.reboot() can block for a long time, so ignore ANRs while shutting down.
            if (mService.mAtmInternal.isShuttingDown()) { // the device is shutting down
                Slog.i(TAG, "During shutdown skipping ANR: " + this + " " + annotation);
                return;
            } else if (isNotResponding()) { // an ANR for this process is already being handled
                Slog.i(TAG, "Skipping duplicate ANR: " + this + " " + annotation);
                return;
            } else if (isCrashing()) { // the app is already crashing
                Slog.i(TAG, "Crashing app skipping ANR: " + this + " " + annotation);
                return;
            } else if (killedByAm) { // already killed by AMS, e.g. under memory pressure, where processes are chosen by adj value (background ones first)
                Slog.i(TAG, "App already killed by AM skipping ANR: " + this + " " + annotation);
                return;
            } else if (killed) { // the process is already dead, e.g. killed by the earlier mWindowProcessController.appEarlyNotResponding(annotation, () -> kill("anr", true))
                Slog.i(TAG, "Skipping died app ANR: " + this + " " + annotation);
                return;
            }
			

            // In case we come through here for the same app before completing
            // this one, mark as anring now so we will bail out.
            setNotResponding(true);

            // Log the ANR to the event log.
            // Write the event log entry; the tag is am_anr
            EventLog.writeEvent(EventLogTags.AM_ANR, userId, pid, processName, info.flags,
                    annotation);

            // Dump thread traces as quickly as we can, starting with "interesting" processes.
            firstPids.add(pid); // collect the ANR process's pid for the stack dump later

            // Don't dump other PIDs if it's a background ANR
            // For a silent (background) ANR no other pids are collected; only the ANR process itself is dumped.
            if (!isSilentAnr()) {
                int parentPid = pid;
                if (parentProcess != null && parentProcess.getPid() > 0) {
                    parentPid = parentProcess.getPid();
                }
                if (parentPid != pid) firstPids.add(parentPid);

                if (MY_PID != pid && MY_PID != parentPid) firstPids.add(MY_PID);

                for (int i = getLruProcessList().size() - 1; i >= 0; i--) {
                    ProcessRecord r = getLruProcessList().get(i);
                    if (r != null && r.thread != null) {
                        int myPid = r.pid;
                        if (myPid > 0 && myPid != pid && myPid != parentPid && myPid != MY_PID) {
                            if (r.isPersistent()) {
                                firstPids.add(myPid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding persistent proc: " + r);
                            } else if (r.treatLikeActivity) {
                                firstPids.add(myPid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding likely IME: " + r);
                            } else {
                                lastPids.put(myPid, Boolean.TRUE);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding ANR proc: " + r);
                            }
                        }
                    }
                }
            }
        }

        final ProcessRecord parentPr = parentProcess != null
                ? (ProcessRecord) parentProcess.mOwner : null;

        /// M: ANR Debug Mechanism
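        // If AnrManager handles the ANR, nothing below runs. According to the author, the stock AnrManager.startAnrDump has no implementation, so each chip vendor defines its own.
        // MTK uses the persist.vendor.anr.enhancement property to decide whether to handle it; it is normally off, so the regular Google ANR flow runs.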
        if (mService.mAnrManager.startAnrDump(mService, this, activityShortComponentName, aInfo,
                parentShortComponentName, parentPr, aboveSystem, annotation, getShowBackground(),
                anrTime))
            return;

        // Log the ANR to the main log.
        // Build the ANR text that goes to the main log.
        StringBuilder info = new StringBuilder();
        info.setLength(0);
        info.append("ANR in ").append(processName);
        if (activityShortComponentName != null) {
            info.append(" (").append(activityShortComponentName).append(")");
        }
        info.append("\n");
        info.append("PID: ").append(pid).append("\n");
        if (annotation != null) {
            info.append("Reason: ").append(annotation).append("\n");
        }
        if (parentShortComponentName != null
                && parentShortComponentName.equals(activityShortComponentName)) {
            info.append("Parent: ").append(parentShortComponentName).append("\n");
        }

        ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);

        // don't dump native PIDs for background ANRs unless it is the process of interest
        String[] nativeProcs = null;
        // For a silent (background) ANR, native pids are not dumped unless the ANR process itself is one of NATIVE_STACKS_OF_INTEREST.
        if (isSilentAnr()) {
            for (int i = 0; i < NATIVE_STACKS_OF_INTEREST.length; i++) {
                if (NATIVE_STACKS_OF_INTEREST[i].equals(processName)) {
                    nativeProcs = new String[] { processName };
                    break;
                }
            }
        } else {
            nativeProcs = NATIVE_STACKS_OF_INTEREST;
        }

        int[] pids = nativeProcs == null ? null : Process.getPidsForCommands(nativeProcs);
        ArrayList<Integer> nativePids = null;

        if (pids != null) {
            nativePids = new ArrayList<>(pids.length);
            for (int i : pids) {
                nativePids.add(i);
            }
        }

        // For background ANRs, don't pass the ProcessCpuTracker to
        // avoid spending 1/2 second collecting stats to rank lastPids.
        // Dump the ANR stacks to a file under data/anr. The file name has the form anr_yyyy-MM-dd-HH-mm-ss-SSS; see createAnrDumpFile in AMS.
        File tracesFile = ActivityManagerService.dumpStackTraces(firstPids,
                (isSilentAnr()) ? null : processCpuTracker, (isSilentAnr()) ? null : lastPids,
                nativePids);

        String cpuInfo = null;
        if (isMonitorCpuUsage()) {
            mService.updateCpuStatsNow(); // refresh the CPU stats
            synchronized (mService.mProcessCpuTracker) {
                cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
            }
            info.append(processCpuTracker.printCurrentLoad());
            info.append(cpuInfo);
        }

        info.append(processCpuTracker.printCurrentState(anrTime));

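        Slog.e(TAG, info.toString()); // write the ANR summary and CPU usage to the main log
        if (tracesFile == null) { // the file under data/anr could not be created: send SIGQUIT so the process dumps its threads to the log (this does not kill it)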
            // There is no trace file, so dump (only) the alleged culprit's threads to the log
            Process.sendSignal(pid, Process.SIGNAL_QUIT);
        }

        StatsLog.write(StatsLog.ANR_OCCURRED, uid, processName,
                activityShortComponentName == null ? "unknown": activityShortComponentName,
                annotation,
                (this.info != null) ? (this.info.isInstantApp()
                        ? StatsLog.ANROCCURRED__IS_INSTANT_APP__TRUE
                        : StatsLog.ANROCCURRED__IS_INSTANT_APP__FALSE)
                        : StatsLog.ANROCCURRED__IS_INSTANT_APP__UNAVAILABLE,
                isInterestingToUserLocked()
                        ? StatsLog.ANROCCURRED__FOREGROUND_STATE__FOREGROUND
                        : StatsLog.ANROCCURRED__FOREGROUND_STATE__BACKGROUND,
                getProcessClassEnum(),
                (this.info != null) ? this.info.packageName : "");
				
       // Add the report to DropBoxManager; at this point the ANR information from DropBoxManager shows up in the main log
        mService.addErrorToDropBox("anr", this, processName, activityShortComponentName,
                parentShortComponentName, parentPr, annotation, cpuInfo, tracesFile, null);

        if (mWindowProcessController.appNotResponding(info.toString(), () -> kill("anr", true),
                () -> {
                    synchronized (mService) {
                        mService.mServices.scheduleServiceTimeoutLocked(this);
                    }
                })) {
            return;
        }

        synchronized (mService) {
            // mBatteryStatsService can be null if the AMS is constructed with injector only. This
            // will only happen in tests.
            if (mService.mBatteryStatsService != null) {
                mService.mBatteryStatsService.noteProcessAnr(processName, uid); // notify BatteryStatsService
            }

            if (isSilentAnr() && !isDebugging()) { // a silent (background) ANR: kill the process directly, no dialog
                kill("bg anr", true);
                return;
            }

            // Set the app's notResponding state, and look up the errorReportReceiver
            makeAppNotRespondingLocked(activityShortComponentName,
                    annotation != null ? "ANR " + annotation : "ANR", info.toString());

            // mUiHandler can be null if the AMS is constructed with injector only. This will only
            // happen in tests.
            // Send a message to the UI handler to show the AppNotResponding dialog.
            if (mService.mUiHandler != null) {
                // Bring up the infamous App Not Responding dialog
                Message msg = Message.obtain();
                msg.what = ActivityManagerService.SHOW_NOT_RESPONDING_UI_MSG;
                msg.obj = new AppNotRespondingDialog.Data(this, aInfo, aboveSystem);

                mService.mUiHandler.sendMessage(msg);
            }
        }
    }
 public boolean appNotResponding(String info, Runnable killAppCallback,
            Runnable serviceTimeoutCallback) {
        Runnable targetRunnable = null;
        synchronized (mAtm.mGlobalLock) {
            if (mAtm.mController == null) {
                return false;
            }

            try {
                // 0 == show dialog, 1 = keep waiting, -1 = kill process immediately
                int res = mAtm.mController.appNotResponding(mName, mPid, info);
                if (res != 0) {
                    if (res < 0 && mPid != MY_PID) {
                        targetRunnable = killAppCallback;
                    } else {
                        targetRunnable = serviceTimeoutCallback;
                    }
                }
            } catch (RemoteException e) {
                mAtm.mController = null;
                Watchdog.getInstance().setActivityController(null);
                return false;
            }
        }
        if (targetRunnable != null) {
            targetRunnable.run();
            return true;
        }
        return false;
    }
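
The return-value convention noted in the method above (0 shows the dialog, 1 keeps waiting, -1 kills the process immediately) can be restated as a small decision function. AnrControllerDecision and Action are names invented for this sketch; the branch order follows the method shown above, including the rule that a negative result never kills the system's own pid.

```java
final class AnrControllerDecision {
    enum Action { SHOW_DIALOG, KEEP_WAITING, KILL_PROCESS }

    /**
     * @param res       value returned by the controller's appNotResponding callback
     * @param pid       pid of the process that is not responding
     * @param systemPid pid of system_server (MY_PID in the listing)
     */
    static Action decide(int res, int pid, int systemPid) {
        if (res == 0) {
            return Action.SHOW_DIALOG;  // controller declines; the normal flow continues
        }
        if (res < 0 && pid != systemPid) {
            return Action.KILL_PROCESS; // killAppCallback runs
        }
        return Action.KEEP_WAITING;     // serviceTimeoutCallback re-arms the timeout
    }
}
```

Note that SHOW_DIALOG only means the caller keeps going; a silent background ANR is still killed without a dialog further down.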
public static File dumpStackTraces(ArrayList<Integer> firstPids,
            ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids,
            ArrayList<Integer> nativePids) {
        ArrayList<Integer> extraPids = null;

        Slog.i(TAG, "dumpStackTraces pids=" + lastPids + " nativepids=" + nativePids); // written to the system log

        // Measure CPU usage as soon as we're called in order to get a realistic sampling
        // of the top users at the time of the request.
        // For a silent (background) ANR processCpuTracker is null, so the top CPU consumers are not sampled
        if (processCpuTracker != null) {
            processCpuTracker.init();
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
            }

            processCpuTracker.update();

            // We'll take the stack crawls of just the top apps using CPU.
            final int N = processCpuTracker.countWorkingStats();
            extraPids = new ArrayList<>();
            for (int i = 0; i < N && extraPids.size() < 5; i++) {
                ProcessCpuTracker.Stats stats = processCpuTracker.getWorkingStats(i);
                if (lastPids.indexOfKey(stats.pid) >= 0) {
                    if (DEBUG_ANR) Slog.d(TAG, "Collecting stacks for extra pid " + stats.pid);

                    extraPids.add(stats.pid);
                } else {
                    Slog.i(TAG, "Skipping next CPU consuming process, not a java proc: "
                            + stats.pid);
                }
            }
        }

        final File tracesDir = new File(ANR_TRACE_DIR); // the data/anr directory
        // Each set of ANR traces is written to a separate file and dumpstate will process
        // all such files and add them to a captured bug report if they're recent enough.
        maybePruneOldTraces(tracesDir);

        // NOTE: We should consider creating the file in native code atomically once we've
        // gotten rid of the old scheme of dumping and lot of the code that deals with paths
        // can be removed.
        File tracesFile = createAnrDumpFile(tracesDir); // create a file named anr_yyyy-MM-dd-HH-mm-ss-SSS
        if (tracesFile == null) {
            return null;
        }

        dumpStackTraces(tracesFile.getAbsolutePath(), firstPids, nativePids, extraPids);
        return tracesFile;
    }
	
	
	 public static void dumpStackTraces(String tracesFile, ArrayList<Integer> firstPids,
            ArrayList<Integer> nativePids, ArrayList<Integer> extraPids) {

        Slog.i(TAG, "Dumping to " + tracesFile); // written to the system log

        // We don't need any sort of inotify based monitoring when we're dumping traces via
        // tombstoned. Data is piped to an "intercept" FD installed in tombstoned so we're in full
        // control of all writes to the file in question.

        // We must complete all stack dumps within 20 seconds.
        long remainingTime = 20 * 1000;

        // First collect all of the stacks of the most important pids.
        // Write the stack traces of the Java processes to tracesFile
        if (firstPids != null) {
            int num = firstPids.size();
            for (int i = 0; i < num; i++) {
                Slog.i(TAG, "Collecting stacks for pid " + firstPids.get(i));
                final long timeTaken = dumpJavaTracesTombstoned(firstPids.get(i), tracesFile,
                                                                remainingTime);

                remainingTime -= timeTaken;
                if (remainingTime <= 0) {
                    Slog.e(TAG, "Aborting stack trace dump (current firstPid=" + firstPids.get(i) +
                           "); deadline exceeded.");
                    return;
                }

                if (DEBUG_ANR) {
                    Slog.d(TAG, "Done with pid " + firstPids.get(i) + " in " + timeTaken + "ms");
                }
            }
        }

        // Next collect the stacks of the native pids
        // Write the stack traces of the native processes to tracesFile
        if (nativePids != null) {
            for (int pid : nativePids) {
                Slog.i(TAG, "Collecting stacks for native pid " + pid);
                final long nativeDumpTimeoutMs = Math.min(NATIVE_DUMP_TIMEOUT_MS, remainingTime);

                final long start = SystemClock.elapsedRealtime();
                Debug.dumpNativeBacktraceToFileTimeout(
                        pid, tracesFile, (int) (nativeDumpTimeoutMs / 1000));
                final long timeTaken = SystemClock.elapsedRealtime() - start;

                remainingTime -= timeTaken;
                if (remainingTime <= 0) {
                    Slog.e(TAG, "Aborting stack trace dump (current native pid=" + pid +
                        "); deadline exceeded.");
                    return;
                }

                if (DEBUG_ANR) {
                    Slog.d(TAG, "Done with native pid " + pid + " in " + timeTaken + "ms");
                }
            }
        }

        // Lastly, dump stacks for all extra PIDs from the CPU tracker.
        if (extraPids != null) {
            for (int pid : extraPids) {
                Slog.i(TAG, "Collecting stacks for extra pid " + pid);

                final long timeTaken = dumpJavaTracesTombstoned(pid, tracesFile, remainingTime);

                remainingTime -= timeTaken;
                if (remainingTime <= 0) {
                    Slog.e(TAG, "Aborting stack trace dump (current extra pid=" + pid +
                            "); deadline exceeded.");
                    return;
                }

                if (DEBUG_ANR) {
                    Slog.d(TAG, "Done with extra pid " + pid + " in " + timeTaken + "ms");
                }
            }
        }
        Slog.i(TAG, "Done dumping"); // the dump is complete
    }
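
The 20-second budget shared by all three loops above follows one simple rule: subtract the time each dump took and stop as soon as nothing is left. A stripped-down sketch of that rule, with DumpBudget and the costMs array invented so the example runs without a device:

```java
import java.util.ArrayList;
import java.util.List;

final class DumpBudget {
    /**
     * Simulates dumping pids in order under a shared time budget.
     * costMs[i] is how long dumping pids[i] takes (hypothetical input for this sketch).
     * Returns the pids whose dump was attempted before the budget ran out.
     */
    static List<Integer> dumpWithinBudget(int[] pids, long[] costMs, long budgetMs) {
        List<Integer> dumped = new ArrayList<>();
        long remaining = budgetMs;
        for (int i = 0; i < pids.length; i++) {
            dumped.add(pids[i]);   // the dump itself would be bounded by 'remaining'
            remaining -= costMs[i];
            if (remaining <= 0) {
                break;             // deadline exceeded: later pids are skipped
            }
        }
        return dumped;
    }
}
```

This is why the order of firstPids matters: processes listed late may never be dumped when earlier ones are slow.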
  private static synchronized File createAnrDumpFile(File tracesDir) {
        if (sAnrFileDateFormat == null) {
            sAnrFileDateFormat = new SimpleDateFormat("yyyy-MM-dd-HH-mm-ss-SSS");
        }

        final String formattedDate = sAnrFileDateFormat.format(new Date());
        final File anrFile = new File(tracesDir, "anr_" + formattedDate);

        try {
            if (anrFile.createNewFile()) {
                FileUtils.setPermissions(anrFile.getAbsolutePath(), 0600, -1, -1); // -rw-------
                return anrFile;
            } else {
                Slog.w(TAG, "Unable to create ANR dump file: createNewFile failed");
            }
        } catch (IOException ioe) {
            Slog.w(TAG, "Exception creating ANR dump file:", ioe);
        }

        return null;
    }
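
To see what a name produced by this pattern looks like, the formatting step can be run on its own. The sketch below reuses the pattern string from createAnrDumpFile; the class AnrFileName and the explicit time-zone and locale parameters are additions for reproducibility, whereas the method above relies on the defaults.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class AnrFileName {
    /** Builds "anr_" + timestamp using the pattern shown in createAnrDumpFile. */
    static String forDate(Date date, TimeZone tz) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd-HH-mm-ss-SSS", Locale.US);
        fmt.setTimeZone(tz); // fixed zone so the output does not depend on the host
        return "anr_" + fmt.format(date);
    }
}
```

For example, AnrFileName.forDate(new Date(1234L), TimeZone.getTimeZone("UTC")) yields anr_1970-01-01-00-00-01-234.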

 

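The code above answers the two questions raised earlier. To summarize:

Question 1: how does the system react when an ANR occurs?

  1. It determines whether the process is executing services in the foreground or in the background. In the foreground case it checks whether any service has been executing for longer than SERVICE_TIMEOUT (20s); in the background case, longer than SERVICE_BACKGROUND_TIMEOUT (200s). If no service is over the limit yet, it re-posts the message with mAm.mHandler.sendMessageAtTime(msg, proc.execServicesFg ? (nextTime + SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT)).

  2. If a service has timed out, foreground or background alike, it builds anrMessage, logs Slog.w(TAG, "Timeout executing service: " + timeout), and hands anrMessage to proc.appNotResponding.

  3. If a controller was registered through ActivityManager.getService().setActivityController, that is, ANRs are handled by custom code, the controller is notified first and the process can be killed with kill("anr", true).

  4. The CPU stats are refreshed. In some special cases (shutting down, duplicate ANR, app already crashing or already killed) the method returns early and no file is written under data/anr; see the code comments above.

  5. An event log entry is written; the tag is am_anr.

  6. Background processes get reduced handling: the top CPU consumers are not sampled and the stack traces of other pids are not collected.

  7. If the chip vendor has customized AnrManager and AnrManager handles the ANR, nothing further happens, including the data/anr dump and the dialog. According to the author, the stock AnrManager.startAnrDump has no implementation, so each chip vendor defines its own. MTK uses the persist.vendor.anr.enhancement property to decide whether to handle it; it is normally off, so the regular Google ANR flow runs.

  8. The ANR stacks are dumped to a file under data/anr. The file name has the form anr_yyyy-MM-dd-HH-mm-ss-SSS; see createAnrDumpFile in AMS.

  9. A background process is killed directly with no dialog. For a foreground process, a message is sent to the UI handler and the AppNotResponding dialog is shown.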

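Points to note from the above: 1. Not every ANR produces a file under data/anr. None is written when custom ANR handling kills the process early, or when creating the trace file fails. 2. A background process never gets a dialog; it is killed directly. For a background ANR only the ANR process itself is traced, and the stacks of other processes are not collected.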

 

 

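Question 2: which logs does the system emit when an ANR occurs, and which of them show that an ANR has happened?

  1. Slog.w(TAG, "Timeout executing service: " + timeout), with TAG ActivityManager.

  2. EventLog.writeEvent(EventLogTags.AM_ANR, userId, pid, processName, info.flags, annotation). This entry is not always written: it is skipped when custom ANR handling has already killed the process, and in the other early-return cases (isShuttingDown, isNotResponding, isCrashing, killedByAm, killed).

  3. A file under data/anr that holds the stack traces of the collected pids. The ANR summary and the CPU usage go to the main log through Slog.e(TAG, info.toString()), whose text starts with "ANR in".

  4. DropBoxManager also records the ANR together with its trace information.

  5. For a foreground process there is also an AppNotRespondingDialog.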
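
When triaging, the main-log text built in appNotResponding is often the quickest signal, because its layout is fixed by the code: "ANR in <process>", then "PID: <pid>", then "Reason: <annotation>". A minimal parser for that text might look like the sketch below. AnrLogHeader is a hypothetical helper, and it assumes the message payload has already been stripped of any logcat prefix (timestamp, tag, pid).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnrLogHeader {
    final String process;
    final int pid;
    final String reason; // null when no "Reason:" line is present

    private AnrLogHeader(String process, int pid, String reason) {
        this.process = process;
        this.pid = pid;
        this.reason = reason;
    }

    private static final Pattern PROCESS = Pattern.compile("^ANR in (\\S+)", Pattern.MULTILINE);
    private static final Pattern PID = Pattern.compile("^PID: (\\d+)$", Pattern.MULTILINE);
    private static final Pattern REASON = Pattern.compile("^Reason: (.*)$", Pattern.MULTILINE);

    /** Returns the parsed header, or null if the text does not look like an ANR summary. */
    static AnrLogHeader parse(String text) {
        Matcher p = PROCESS.matcher(text);
        Matcher i = PID.matcher(text);
        if (!p.find() || !i.find()) {
            return null;
        }
        Matcher r = REASON.matcher(text);
        String reason = r.find() ? r.group(1) : null;
        return new AnrLogHeader(p.group(1), Integer.parseInt(i.group(1)), reason);
    }
}
```

For a service timeout the reason starts with "executing service ", matching the anrMessage built in serviceTimeout, so that prefix is a convenient filter for this particular ANR type.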

   

 

 

 

 

 

 
