一、ANR概述
ANR(Application Not responding)是指应用程序未响应,Android系统对于一些事件需要在一定的时间范围内完成,如果超过预定时间能未能得到有效响应或者响应时间过长,都会造成ANR。
1.1、Anr的触发时机
anr的触发时机有很多种,基本分为以下几点:
- 输入事件:5s内无响应,如屏幕触碰事件;
- 广播:执行前台广播(BroadcastReceiver)的onReceive()方法时10s没有处理完成,后台广播的超时时间为60s;
- Service:前台Service20s内没有执行完毕;后台Service200s内没有执行完毕;
- ContentProvider:10s内ContentProvider的publish未执行完;
从上面4点来看,每一种触发的时机都是在规定的时间内,看你消不消费得完自身的事件,而本文所探讨的问题就是,这些时间是由谁来制定,并且Anr的核心点在哪里。
二、Service的ANR
前台Service20s内没有执行完毕,后台Service200s内没有执行完毕,就会触发ANR。
开启service有两种方式,一个是startService一个是bindService,本文从startService开始分析,我们可以先看看service的启动流程。
简单概括,AMS开启了service服务,传递到ActiveServices,ActiveServices是协助AMS完成启动、绑定等逻辑操作,在ActiveService的realStartServiceLocked方法中创建了service对象,开启了一个service的onCreate()。
service的ANR在于时间的控制,那开始计时的逻辑怎么看?
2.1、计时开始
从启动service的流程中发现,service启动计时的逻辑是从ActiveServices.realStartServiceLocked() 开始。
private void realStartServiceLocked(ServiceRecord r, ProcessRecord app,
IApplicationThread thread, int pid, UidRecord uidRecord, boolean execInFg,
boolean enqueueOomAdj) throws RemoteException {
bumpServiceExecutingLocked(r, execInFg, "create", null /* oomAdjReason */);
mAm.updateLruProcessLocked(app, false, null);
updateServiceForegroundLocked(psr, /* oomAdj= */ false);
thread.scheduleCreateService(r, r.serviceInfo,
mAm.compatibilityInfoForPackage(r.serviceInfo.applicationInfo),
app.mState.getReportedProcState());
requestServiceBindingsLocked(r, execInFg);
updateServiceClientActivitiesLocked(psr, null, true);
sendServiceArgsLocked(r, execInFg, true);
}
复制代码
开始计时的操作bumpServiceExecutingLocked方法中
private boolean bumpServiceExecutingLocked(ServiceRecord r, boolean fg, String why,
@Nullable String oomAdjReason) {
...
if (timeoutNeeded) {
scheduleServiceTimeoutLocked(r.app);
}
}
复制代码
抛出handler开始计时
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
if (proc.mServices.numberOfExecutingServices() == 0 || proc.getThread() == null) {
return;
}
Message msg = mAm.mHandler.obtainMessage(
ActivityManagerService.SERVICE_TIMEOUT_MSG);
msg.obj = proc;
mAm.mHandler.sendMessageDelayed(msg, proc.mServices.shouldExecServicesFg()
? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}
复制代码
从scheduleServiceTimeoutLocked方法中可以看到,内部发出了一个延时的消息,延时的时间由proc.mServices.shouldExecServicesFg()来判断,而proc.mServices.shouldExecServicesFg()指代的是是否在前台执行服务,也就是我们经常说的前台服务与后台服务的判断,
当为前台服务时,delay的时间为SERVICE_TIMEOUT = 20s
static final int SERVICE_TIMEOUT = 20 * 1000 * Build.HW_TIMEOUT_MULTIPLIER;
复制代码
当为后台服务时,SERVICE_BACKGROUND_TIMEOUT = 20 * 10 = 200s
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
复制代码
我们再来看看Handler中msg=SERVICE_TIMEOUT_MSG时发生了什么?
final class MainHandler extends Handler {
@Override
public void handleMessage(Message msg) {
switch (msg.what) {
case SERVICE_TIMEOUT_MSG: {
mServices.serviceTimeout((ProcessRecord) msg.obj);
} break;
}
复制代码
void serviceTimeout(ProcessRecord proc) {
String anrMessage = null;
synchronized(mAm) {
final ProcessServiceRecord psr = proc.mServices;
final long now = SystemClock.uptimeMillis();
//计算当前时间减去服务超时时间
final long maxTime = now -
(psr.shouldExecServicesFg() ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
ServiceRecord timeout = null;
long nextTime = 0;
//遍历执行service列表,找出超时的service
for (int i = psr.numberOfExecutingServices() - 1; i >= 0; i--) {
ServiceRecord sr = psr.getExecutingServiceAt(i);
if (sr.executingStart < maxTime) {
timeout = sr;
break;
}
//重新计时
if (sr.executingStart > nextTime) {
nextTime = sr.executingStart;
}
}
if (timeout != null && mAm.mProcessList.isInLruListLOSP(proc)) {
//该服务已超时,记录anr的一些信息
Slog.w(TAG, "Timeout executing service: " + timeout);
StringWriter sw = new StringWriter();
PrintWriter pw = new FastPrintWriter(sw, false, 1024);
pw.println(timeout);
timeout.dump(pw, " ");
pw.close();
mLastAnrDump = sw.toString();
mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
anrMessage = "executing service " + timeout.shortInstanceName;
} else {
//服务未超时,则重新监控下一个service是否超时
Message msg = mAm.mHandler.obtainMessage(
ActivityManagerService.SERVICE_TIMEOUT_MSG);
msg.obj = proc;
mAm.mHandler.sendMessageAtTime(msg, psr.shouldExecServicesFg()
? (nextTime+SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT));
}
}
//将根据需要创建一个独立的线程,开始向用户报告ANR的相关信息
if (anrMessage != null) {
mAm.mAnrHelper.appNotResponding(proc, anrMessage);
}
}
复制代码
最后回进入到 ActiveServices.serviceTimeout()方法中,基本逻辑如下:
- 计算当前时间减去服务超时时间;
- 遍历服务列表,找出超时的服务;
- 如果该服务超时,记录anr信息;如果没有超时,则重新健康下一个service是否超时;
- 超时的service,会更加需要创建一个独立进程,开始向用户报告ANR的相关信息;
有开始ANRmsg,必然也有结束计时的逻辑,我们且往下屡屡。
2.2、计时结束
在调用bumpServiceExecutingLocked方法开启计时,接着就开始调用 scheduleCreateService开始服务的创建;
#ActivityThread
public final void scheduleCreateService(IBinder token,
ServiceInfo info, CompatibilityInfo compatInfo, int processState) {
...
sendMessage(H.CREATE_SERVICE, s);
}
case CREATE_SERVICE:
handleCreateService((CreateServiceData)msg.obj);
break;
复制代码
private void handleCreateService(CreateServiceData data) {
Service service = null;
try {
...
service = packageInfo.getAppFactory()
.instantiateService(cl, data.info.name, data.intent);
...
service.onCreate();
...
ActivityManager.getService().serviceDoneExecuting(
data.token, SERVICE_DONE_EXECUTING_ANON, 0, 0);
}
}
复制代码
在handleCreateService中service开始回调onCreate(),也就是正式开启了一个service,而在开启service之后,AMS就开始结束service的计时。
private void serviceDoneExecutingLocked(ServiceRecord r, boolean inDestroying,
boolean finishing, boolean enqueueOomAdj) {
...
mAm.mHandler.removeMessages(ActivityManagerService.SERVICE_TIMEOUT_MSG, r.app);
}
复制代码
利用Handler的removeMessage移除SERVICE_TIMEOUT_MSG消息,结束ANR的计时。
三、Broadcast的ANR
从广播的发送开始,利用Context来sendBroadcast,传递到AMS,同servcice的原理一样,有一个BroadcastQueue来协助AMS处理前台和后台广播。具体流程可以看下面的UML图。
3.1、计时开始
关于ANR计时开始的逻辑从AMS的broadcastIntentLocked()方法就开始了,
final int broadcastIntentLocked() {
//处理并行广播
//
int NR = registeredReceivers != null ? registeredReceivers.size() : 0;
if (!ordered && NR > 0) {
//根据标志位FLAG_RECEIVER_FOREGROUND决定返回前台广播队列还是后台广播队列
final BroadcastQueue queue = broadcastQueueForIntent(intent);
BroadcastRecord r = new BroadcastRecord(queue, intent, callerApp, callerPackage,
callerFeatureId, callingPid, callingUid, callerInstantApp, resolvedType,
requiredPermissions, excludedPermissions, appOp, brOptions, registeredReceivers,
resultTo, resultCode, resultData, resultExtras, ordered, sticky, false, userId,
allowBackgroundActivityStarts, backgroundActivityStartsToken,
timeoutExempt);
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Enqueueing parallel broadcast " + r);
final boolean replaced = replacePending
&& (queue.replaceParallelBroadcastLocked(r) != null);
// Note: We assume resultTo is null for non-ordered broadcasts.
if (!replaced) {
queue.enqueueParallelBroadcastLocked(r);
queue.scheduleBroadcastsLocked();
}
registeredReceivers = null;
NR = 0;
}
//处理串行广播
if ((receivers != null && receivers.size() > 0)
|| resultTo != null) {
BroadcastQueue queue = broadcastQueueForIntent(intent);
BroadcastRecord r = new BroadcastRecord(queue, intent, callerApp, callerPackage,
callerFeatureId, callingPid, callingUid, callerInstantApp, resolvedType,
requiredPermissions, excludedPermissions, appOp, brOptions,
receivers, resultTo, resultCode, resultData, resultExtras,
ordered, sticky, false, userId, allowBackgroundActivityStarts,
backgroundActivityStartsToken, timeoutExempt);
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Enqueueing ordered broadcast " + r);
final BroadcastRecord oldRecord =
replacePending ? queue.replaceOrderedBroadcastLocked(r) : null;
if (oldRecord != null) {
// Replaced, fire the result-to receiver.
if (oldRecord.resultTo != null) {
final BroadcastQueue oldQueue = broadcastQueueForIntent(oldRecord.intent);
try {
oldQueue.performReceiveLocked(oldRecord.callerApp, oldRecord.resultTo,
oldRecord.intent,
Activity.RESULT_CANCELED, null, null,
false, false, oldRecord.userId);
}
}
} else {
queue.enqueueOrderedBroadcastLocked(r);
queue.scheduleBroadcastsLocked();
}
} else {
// There was nobody interested in the broadcast, but we still want to record
// that it happened.
if (intent.getComponent() == null && intent.getPackage() == null
&& (intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY) == 0) {
// This was an implicit broadcast... let's record it for posterity.
addBroadcastStatLocked(intent.getAction(), callerPackage, 0, 0, 0);
}
}
return ActivityManager.BROADCAST_SUCCESS;
}
复制代码
根据标识位来区分是否时前台广播还是后台广播
BroadcastQueue broadcastQueueForIntent(Intent intent) {
final boolean isFg = (intent.getFlags() & Intent.FLAG_RECEIVER_FOREGROUND) != 0;
return (isFg) ? mFgBroadcastQueue : mBgBroadcastQueue;
}
复制代码
拿到前台或者后台的广播队列后,调用scheduleBroadcastsLocked方法开始Handler消息的分发,也就是ANR计时开始,
//前台广播超时时间 = 10s;后台广播 = 60s
static final int BROADCAST_FG_TIMEOUT = 10 * 1000 * Build.HW_TIMEOUT_MULTIPLIER;
static final int BROADCAST_BG_TIMEOUT = 60 * 1000 * Build.HW_TIMEOUT_MULTIPLIER;
复制代码
//设置超时时间
long timeoutTime = r.receiverTime + mConstants.TIMEOUT;
setBroadcastTimeoutLocked(timeoutTime);
复制代码
最后在handler发送超时时间的消息后,还是会在AnrHelper中发送ANR的相关信息。
3.2、计时结束
有handler发送消息的地方,也就有取消message的地方,在上面计时开始的流程中,processNextBroadcastLocked方法中在处理串行广播时,如果判断没有超时,就取消message。
do {
cancelBroadcastTimeoutLocked();
// ... and on to the next...
addBroadcastToHistoryLocked(r);
} while (r == null);
复制代码
final void cancelBroadcastTimeoutLocked() {
if (mPendingBroadcastTimeoutMessage) {
mHandler.removeMessages(BROADCAST_TIMEOUT_MSG, this);
mPendingBroadcastTimeoutMessage = false;
}
}
复制代码
四、ContentProvider的ANR
ContentProvider的创建和启动一般是在进程启动的时候就开始,我们在创建一个进程的时候都会调用AMS的attachApplication()方法,在这个方法中,会拿到ProvideInfo信息即ContentProvider,接着会调用新建进程的bindApplication()将查询到的ProviderInfo进行创建并启动ContentProvider。
4.1、计时开始
在ContentProvider的创建启动过程中, 有一个地方需要需要注意,那就是AMS的attachApplication()方法,
private boolean attachApplicationLocked(@NonNull IApplicationThread thread,
int pid, int callingUid, long startSeq) {
...
List<ProviderInfo> providers = normalMode
? mCpHelper.generateApplicationProvidersLocked(app)
: null;
if (providers != null && mCpHelper.checkAppInLaunchingProvidersLocked(app)) {
Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
msg.obj = app;
mHandler.sendMessageDelayed(msg,
ContentResolver.CONTENT_PROVIDER_PUBLISH_TIMEOUT_MILLIS);
}
...
thread.bindApplication(processName, appInfo, providerList,
instr2.mClass,
profilerInfo, instr2.mArguments,
instr2.mWatcher,
instr2.mUiAutomationConnection, testMode,
mBinderTransactionTrackingEnabled, enableTrackAllocation,
isRestrictedBackupMode || !normalMode, app.isPersistent(),
new Configuration(app.getWindowProcessController().getConfiguration()),
app.getCompat(), getCommonServicesLocked(app.isolated),
mCoreSettingsObserver.getCoreSettingsLocked(),
buildSerial, autofillOptions, contentCaptureOptions,
app.getDisabledCompatChanges(), serializedSystemFontMap);
...
}
复制代码
在这个方法中,会先通过ContentProviderHelper拿到ProviderInfo列表,然后通过Handler延时发送CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG消息,而延时的时间为10s,
public static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT_MILLIS =
10 * 1000 * Build.HW_TIMEOUT_MULTIPLIER;
复制代码
跟到最后其实和service、广播的流程一样,最后会触发appNotResponding,向用户展示ANR相关信息。
public void appNotRespondingViaProvider(IBinder connection) {
final ContentProviderConnection conn = (ContentProviderConnection) connection;
...
final ProcessRecord host = conn.provider.proc;
...
mHandler.post(new Runnable() {
@Override
public void run() {
mAppErrors.appNotResponding(host, null, null, false,
"ContentProvider not responding");
}
});
}
复制代码
4.2、计时结束
目前有两个地方会结束计时,
- 进程消失
当进程消失时,会清除进程的主要代码,在这里会remove掉ANR的计时,因为在新建进程时,会再次开启一个新的计时。
final boolean cleanUpApplicationRecordLocked(ProcessRecord app, int pid,
boolean restarting, boolean allowRestart, int index, boolean replacingPid,
boolean fromBinderDied) {
...
mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, app);
}
复制代码
- ContentProvider发布完成同样会移除ANR的超时消息。
public final void publishContentProviders(IApplicationThread caller,
List<ContentProviderHolder> providers) {
if (wasInLaunchingProviders) {
mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
}
}
}
}
作者:付十一
链接:https://juejin.cn/post/7121217696594657293
来源:稀土掘金
著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。