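The previous article analyzed the LRU algorithm used by AMS process management, based on Android 12. Its conclusion was that, depending on process state (whether the process has an activity or a service), AMS moves the two indices mLruProcessServiceStart and mLruProcessActivityStart and so maintains the ordering of processes within each of the three regions of the LRU list. (Previous article: "AMS进程管理--LRU篇", CSDN blog.)
LRU acts mostly as a way to order processes by a set of conditions; when memory runs low, processes are picked for killing starting from the head of the list. Does that mean core services that have no activity die more easily? Of course not, and that is where the ADJ algorithm, the topic of this article, comes in. Building on the LRU order, it computes an additional priority for each process, and when the system is short of memory the low-priority processes are killed.
As with the LRU code, in Android 12 AMS delegates this algorithm to the OomAdjuster class, whose main job is to compute each process's oom_adj value, that is, its priority. This follows the single-responsibility principle.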
private void updateOomAdjLockedInner(String oomAdjReason, final ProcessRecord topApp,
        ArrayList<ProcessRecord> processes, ActiveUids uids, boolean potentialCycles,
        boolean startProfiling) {
    if (startProfiling) {
        mService.mOomAdjProfiler.oomAdjStarted();
    }
    final long now = SystemClock.uptimeMillis();
    final long nowElapsed = SystemClock.elapsedRealtime();
    // An empty process may stay alive for at most 30 minutes; oldTime is 30 minutes ago.
    final long oldTime = now - ProcessList.MAX_EMPTY_TIME;
    final boolean fullUpdate = processes == null;
    ActiveUids activeUids = uids;
    // On a full update, take every process in the LRU list.
    ArrayList<ProcessRecord> activeProcesses = fullUpdate ? mProcessList.mLruProcesses
            : processes;
    final int numProc = activeProcesses.size();
    // Initialize the active uids.
    if (activeUids == null) {
        final int numUids = mActiveUids.size();
        activeUids = mTmpUidRecords;
        activeUids.clear();
        for (int i = 0; i < numUids; i++) {
            UidRecord r = mActiveUids.valueAt(i);
            activeUids.put(r.uid, r);
        }
    }
    // Reset state in all uid records.
    for (int i = activeUids.size() - 1; i >= 0; i--) {
        final UidRecord uidRec = activeUids.valueAt(i);
        if (DEBUG_UID_OBSERVERS) {
            Slog.i(TAG_UID_OBSERVERS, "Starting update of " + uidRec);
        }
        uidRec.reset();
    }
    if (mService.mAtmInternal != null) {
        mService.mAtmInternal.rankTaskLayersIfNeeded();
    }
    // Assign a new sequence number to this update round.
    mAdjSeq++;
    if (fullUpdate) {
        mNewNumServiceProcs = 0;
        mNewNumAServiceProcs = 0;
    }
    boolean retryCycles = false;
    boolean computeClients = fullUpdate || potentialCycles;
    // need to reset cycle state before calling computeOomAdjLocked because of service conns
    for (int i = numProc - 1; i >= 0; i--) {
        ProcessRecord app = activeProcesses.get(i);
        app.mReachable = false;
        // No need to compute again it has been evaluated in previous iteration
        if (app.adjSeq != mAdjSeq) {
            app.containsCycle = false;
            app.setCurRawProcState(PROCESS_STATE_CACHED_EMPTY);
            app.setCurRawAdj(ProcessList.UNKNOWN_ADJ);
            app.setCapability = PROCESS_CAPABILITY_NONE;
            app.resetCachedInfo();
        }
    }
    for (int i = numProc - 1; i >= 0; i--) {
        ProcessRecord app = activeProcesses.get(i);
        if (!app.killedByAm && app.thread != null) {
            app.procStateChanged = false;
            // Core method.
            // It won't enter cycle if not computing clients.
            computeOomAdjLocked(app, ProcessList.UNKNOWN_ADJ, topApp, fullUpdate, now, false,
                    computeClients);
            // if any app encountered a cycle, we need to perform an additional loop later
            retryCycles |= app.containsCycle;
            // Keep the completedAdjSeq to up to date.
            app.completedAdjSeq = mAdjSeq;
        }
    }
    ...
    // applyOomAdjLocked is called from inside this method; in Android 12 the call is
    // guarded by an added check on whether adj computation has completed for this round.
    boolean allChanged = updateAndTrimProcessLocked(now, nowElapsed, oldTime, activeUids);
    ...
}
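The Android 12 version follows the same approach: the core methods are computeOomAdjLocked and applyOomAdjLocked, and the second is now only called after a check that adj computation has completed. Let's look at what each of these two methods does.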
computeOomAdjLocked
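- Use the mAdjSeq field to check whether adj has already been computed in this update round; if so, return the current app.curRawAdj directly.
- Check whether the process's client-side thread exists; if it does not, set adj to CACHED_APP_MAX_ADJ.
- Check whether the process is a foreground process; if it is not, compute adj from parameters such as TOP_APP, app.hasTopUi, activitiesSize, and systemNoUi.
- For a foreground process, continue and initialize some foreground-related defaults, which are refined later according to the specific situation.
- Assign adj and the related values according to whether the process is TOP_APP, whether it is receiving an animation, whether it has a service that is executing, and whether it has running activities and what state those activities are in.
- Set adj for visible processes and for processes that have a perceptible foreground service or a background service.
- Set the priority for background processes.
- Iterate over the services running in the process and further update adj and the related values according to each service's state.
- Do the same for ContentProviders: iterate over the providers in the process and further update adj and the related values according to each provider's state.
- Based on how a cached process is running, split cached processes further into cached processes and empty processes.
- Assign the computed adj and related values to the corresponding fields of the process.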
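A few of the steps above can be condensed into a toy model. This is a minimal sketch, not the AOSP implementation: the class AdjSketch, the Proc type, and all of its fields are hypothetical, and only three steps are modeled (the per-round sequence check, the missing-thread shortcut, and raising a service host because a more important client is bound to it). The clamp to VISIBLE_APP_ADJ for bound clients is a simplification of the real binding-flag logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a few computeOomAdjLocked steps; all names here are hypothetical. */
final class AdjSketch {
    static final int UNKNOWN_ADJ = 1001;
    static final int CACHED_APP_MAX_ADJ = 999;
    static final int CACHED_APP_MIN_ADJ = 900;
    static final int VISIBLE_APP_ADJ = 100;
    static final int FOREGROUND_APP_ADJ = 0;

    static final class Proc {
        final String name;
        boolean hasThread = true;      // stands in for app.thread != null
        boolean isTop;                 // process hosts the top activity
        boolean hasVisibleActivity;
        final List<Proc> clients = new ArrayList<>(); // processes bound to our services
        int adjSeq;                    // round in which curRawAdj was last computed
        int curRawAdj = UNKNOWN_ADJ;

        Proc(String name) { this.name = name; }
    }

    private int mAdjSeq;

    /** Starts a new update round, mirroring mAdjSeq++ in the article's code. */
    void beginRound() { mAdjSeq++; }

    int compute(Proc app) {
        if (app.adjSeq == mAdjSeq) {
            return app.curRawAdj;      // already evaluated in this round
        }
        app.adjSeq = mAdjSeq;          // mark first, so binding cycles terminate
        if (!app.hasThread) {
            return app.curRawAdj = CACHED_APP_MAX_ADJ;
        }
        int adj;
        if (app.isTop) {
            adj = FOREGROUND_APP_ADJ;
        } else if (app.hasVisibleActivity) {
            adj = VISIBLE_APP_ADJ;
        } else {
            adj = CACHED_APP_MIN_ADJ;
        }
        app.curRawAdj = adj;
        // A service host follows its most important client, but in this sketch
        // a client never lifts it above the visible level.
        for (Proc client : app.clients) {
            int clientAdj = Math.max(compute(client), VISIBLE_APP_ADJ);
            adj = Math.min(adj, clientAdj);
        }
        return app.curRawAdj = adj;
    }
}
```

With a top process bound to a service in another process, the service host ends up at the visible level rather than in the cached range, which is the effect that keeps activity-less service processes alive.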
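Background concepts
The ADJ levels are defined in ProcessList, in the categories listed below. As the int value goes from high to low, priority increases. Processes with an ADJ value below 0 are system processes; they can hardly ever be killed, and if one is killed it is restarted immediately.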
// Adjustment used in certain places where we don't know it yet.
// (Generally this is something that is going to be cached, but we
// don't know the exact value in the cached range to assign yet.)
// Unknown; typically a process that will end up cached. Lowest priority.
static final int UNKNOWN_ADJ = 1001;
// This is a process only hosting activities that are not visible,
// so it can be killed without any disruption.
// Maximum and minimum adj for processes with no visible activities; these are killed first when memory is low.
static final int CACHED_APP_MAX_ADJ = 999;
static final int CACHED_APP_MIN_ADJ = 900;
// This is the oom_adj level that we allow to die first. This cannot be equal to
// CACHED_APP_MAX_ADJ unless processes are actively being assigned an oom_score_adj of
// CACHED_APP_MAX_ADJ.
// The minimum level at which low-memory kills start.
static final int CACHED_APP_LMK_FIRST_ADJ = 950;
// Number of levels we have available for different service connection group importance
// levels.
// Used to compute levels for the different service connection groups.
static final int CACHED_APP_IMPORTANCE_LEVELS = 5;
// The B list of SERVICE_ADJ -- these are the old and decrepit
// services that aren't as shiny and interesting as the ones in the A list.
// Inactive processes: services in the B list (running for longer and less likely to be used).
static final int SERVICE_B_ADJ = 800;
// This is the process of the previous application that the user was in.
// This process is kept above other things, because it is very common to
// switch back to the previous app. This is important both for recent
// task switch (toggling between the two top recent apps) as well as normal
// UI flow such as clicking on a URI in the e-mail app to view in the browser,
// and then pressing back to return to e-mail.
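// Process of the previous app (the one whose activity was stopped last). In the example above, a URI clicked in the e-mail app opens in the browser; after pressing back to return to e-mail, the browser is the previous process.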
static final int PREVIOUS_APP_ADJ = 700;
// This is a process holding the home application -- we want to try
// avoiding killing it, even if it would normally be in the background,
// because the user interacts with it so much.
// The process hosting the home application.
static final int HOME_APP_ADJ = 600;
// This is a process holding an application service -- killing it will not
// have much of an impact as far as the user is concerned.
// Service process; killing it has no noticeable effect on the user.
static final int SERVICE_ADJ = 500;
// This is a process with a heavy-weight application. It is in the
// background, but we want to try to avoid killing it. Value set in
// system/rootdir/init.rc on startup.
// Heavy-weight process; avoid killing this kind of process.
static final int HEAVY_WEIGHT_APP_ADJ = 400;
// This is a process currently hosting a backup operation. Killing it
// is not entirely fatal but is generally a bad idea.
// Backup process (bindBackupAgent).
static final int BACKUP_APP_ADJ = 300;
// This is a process bound by the system (or other app) that's more important than services but
// not so perceptible that it affects the user immediately if killed.
// A process bound by the system (or another app) that is more important than a service.
static final int PERCEPTIBLE_LOW_APP_ADJ = 250;
// This is a process only hosting components that are perceptible to the
// user, and we really want to avoid killing them, but they are not
// immediately visible. An example is background music playback.
// Perceptible process, e.g. one playing music in the background.
static final int PERCEPTIBLE_APP_ADJ = 200;
// This is a process only hosting activities that are visible to the
// user, so we'd prefer they don't disappear.
// Visible process.
static final int VISIBLE_APP_ADJ = 100;
// Maximum layer value for visible processes (99); keeps their adj between VISIBLE_APP_ADJ and PERCEPTIBLE_APP_ADJ.
static final int VISIBLE_APP_LAYER_MAX = PERCEPTIBLE_APP_ADJ - VISIBLE_APP_ADJ - 1;
// This is a process that was recently TOP and moved to FGS. Continue to treat it almost
// like a foreground app for a while.
// @see TOP_TO_FGS_GRACE_PERIOD
// The app has a foreground service: it moved from TOP to a foreground service and was in the foreground within the last 15s.
static final int PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ = 50;
// This is the process running the current foreground app. We'd really
// rather not kill it!
// Foreground process.
static final int FOREGROUND_APP_ADJ = 0;
// This is a process that the system or a persistent process has bound to,
// and indicated it is important.
// Tied to the system or a persistent process (a process started via startIsolatedProcess(), or a service process bound by system_server or a persistent process).
static final int PERSISTENT_SERVICE_ADJ = -700;
// This is a system persistent process, such as telephony. Definitely
// don't want to kill it, but doing so is not completely fatal.
// System persistent process such as telephony (normally not killed; if it is killed or crashes, it is restarted immediately).
static final int PERSISTENT_PROC_ADJ = -800;
// The system process runs at the default adjustment.
// System process (the system_server process).
static final int SYSTEM_ADJ = -900;
// Special code for native processes that are not being managed by the system (so
// don't have an oom adj assigned by the system).
// Native process (forked by init and not managed by the system).
static final int NATIVE_ADJ = -1000;
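To make the list easier to read at a glance, here is a small sketch that maps an adj value to the band it falls into, using the constant values listed above. The class name AdjBands and the band labels are hypothetical; the framework itself compares against the ProcessList constants directly and has no such helper. Values that sit between two named levels are simply reported as the level at or below them.

```java
/** Maps an oom adj value to a coarse band name; the labels are illustrative only. */
final class AdjBands {
    static final int PERCEPTIBLE_APP_ADJ = 200;
    static final int VISIBLE_APP_ADJ = 100;
    // 200 - 100 - 1 = 99, so VISIBLE_APP_ADJ + layer always stays below PERCEPTIBLE_APP_ADJ.
    static final int VISIBLE_APP_LAYER_MAX = PERCEPTIBLE_APP_ADJ - VISIBLE_APP_ADJ - 1;

    static String describe(int adj) {
        if (adj > 999) return "unknown";
        if (adj >= 900) return "cached";
        if (adj >= 800) return "service B";
        if (adj >= 700) return "previous app";
        if (adj >= 600) return "home";
        if (adj >= 500) return "service";
        if (adj >= 400) return "heavy weight";
        if (adj >= 300) return "backup";
        if (adj >= 250) return "perceptible (low)";
        if (adj >= 200) return "perceptible";
        if (adj >= 100) return "visible";
        if (adj >= 50) return "recent foreground";
        if (adj >= 0) return "foreground";
        if (adj >= -700) return "persistent service";
        if (adj >= -800) return "persistent process";
        if (adj >= -900) return "system";
        return "native";
    }
}
```

For example, a visible process on the deepest layer gets VISIBLE_APP_ADJ + VISIBLE_APP_LAYER_MAX = 199, which still falls in the visible band.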
Scheduling group: schedGroup
This indicates the scheduling group the process currently belongs to. It is also defined in ProcessList.
// Activity manager's version of Process.THREAD_GROUP_BACKGROUND
// Background group.
static final int SCHED_GROUP_BACKGROUND = 0;
// Activity manager's version of Process.THREAD_GROUP_RESTRICTED
// Restricted group.
static final int SCHED_GROUP_RESTRICTED = 1;
// Activity manager's version of Process.THREAD_GROUP_DEFAULT
// Foreground (default) group.
static final int SCHED_GROUP_DEFAULT = 2;
// Activity manager's version of Process.THREAD_GROUP_TOP_APP
// Top-app group.
public static final int SCHED_GROUP_TOP_APP = 3;
// Activity manager's version of Process.THREAD_GROUP_TOP_APP
// Disambiguate between actual top app and processes bound to the top app
static final int SCHED_GROUP_TOP_APP_BOUND = 4;
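Having gone through computeOomAdjLocked, we know that it mainly assigns an ADJ value based on process state. Next comes applyOomAdjLocked, which does three things:
- Set the process priority: pass the previously computed curAdj to the lmkd service.
- Set the process scheduling policy: move the process into the scheduling group that matches schedGroup.
- Set the process state: report the process state curProcState back to the app process's ApplicationThread.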
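All three steps share one pattern: each value has a freshly computed "cur" copy and a last-applied "set" (or "reported") copy, and work is only done when the two differ. A minimal sketch of that pattern follows. ApplySketch, State, and the effect strings are hypothetical stand-ins for ProcessRecord, lmkd, the process-group handler, and ApplicationThread.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the "apply only what changed" pattern; every name here is hypothetical. */
final class ApplySketch {
    static final class State {
        int curAdj;
        int setAdj = Integer.MIN_VALUE;            // nothing applied yet
        int curSchedGroup;
        int setSchedGroup = Integer.MIN_VALUE;
        int curProcState;
        int reportedProcState = Integer.MIN_VALUE;
    }

    /** Applies pending changes and returns a log of the side effects, in order. */
    static List<String> apply(State s) {
        List<String> effects = new ArrayList<>();
        if (s.curAdj != s.setAdj) {
            effects.add("lmkd adj=" + s.curAdj);          // stands in for ProcessList.setOomAdj
            s.setAdj = s.curAdj;
        }
        if (s.curSchedGroup != s.setSchedGroup) {
            effects.add("schedGroup=" + s.curSchedGroup); // stands in for the process-group message
            s.setSchedGroup = s.curSchedGroup;
        }
        if (s.curProcState != s.reportedProcState) {
            effects.add("procState=" + s.curProcState);   // stands in for thread.setProcessState
            s.reportedProcState = s.curProcState;
        }
        return effects;
    }
}
```

Calling apply twice in a row on an unchanged State produces effects only the first time, which is why a full update over many processes stays cheap when little has changed.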
I. Setting the process priority
if (app.curAdj != app.setAdj) {
    // This calls into ProcessList, where the actual work is done.
    ProcessList.setOomAdj(app.pid, app.uid, app.curAdj);
    if (DEBUG_SWITCH || DEBUG_OOM_ADJ || mService.mCurOomAdjUid == app.info.uid) {
        String msg = "Set " + app.pid + " " + app.processName + " adj "
                + app.curAdj + ": " + app.adjType;
        reportOomAdjMessageLocked(TAG_OOM_ADJ, msg);
    }
    app.setAdj = app.curAdj;
    app.verifiedAdj = ProcessList.INVALID_ADJ;
}
Next, the methods that do the actual work:
public static void setOomAdj(int pid, int uid, int amt) {
    // This indicates that the process is not started yet and so no need to proceed further.
    if (pid <= 0) {
        return;
    }
    if (amt == UNKNOWN_ADJ)
        return;
    long start = SystemClock.elapsedRealtime();
    ByteBuffer buf = ByteBuffer.allocate(4 * 4);
    buf.putInt(LMK_PROCPRIO);
    buf.putInt(pid);
    buf.putInt(uid);
    buf.putInt(amt);
    // Send the packet to lmkd; no reply buffer is passed.
    writeLmkd(buf, null);
    long now = SystemClock.elapsedRealtime();
    if ((now-start) > 250) {
        Slog.w("ActivityManager", "SLOW OOM ADJ: " + (now-start) + "ms for pid " + pid
                + " = " + amt);
    }
}

private static boolean writeLmkd(ByteBuffer buf, ByteBuffer repl) {
    if (!sLmkdConnection.isConnected()) {
        // try to connect immediately and then keep retrying
        sKillHandler.sendMessage(
                sKillHandler.obtainMessage(KillHandler.LMKD_RECONNECT_MSG));
        // wait for connection retrying 3 times (up to 3 seconds)
        if (!sLmkdConnection.waitForConnection(3 * LMKD_RECONNECT_DELAY_MS)) {
            return false;
        }
    }
    return sLmkdConnection.exchange(buf, repl);
}

// Called through sLmkdConnection above.
public boolean exchange(ByteBuffer req, ByteBuffer repl) {
    if (repl == null) {
        return write(req);
    }
    boolean result = false;
    // set reply buffer to user-defined one to fill it
    synchronized (mReplyBufLock) {
        mReplyBuf = repl;
        if (write(req)) {
            try {
                // wait for the reply
                mReplyBufLock.wait();
                result = (mReplyBuf != null);
            } catch (InterruptedException ie) {
                result = false;
            }
        }
        // reset reply buffer
        mReplyBuf = null;
    }
    return result;
}
The adj, pid, and uid are written to the socket named lmkd. From this point on, lmkd is responsible for updating the process's adj.
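The message that setOomAdj builds is just four 32-bit integers in a row. The sketch below packs and unpacks such a message with the JDK's ByteBuffer so the layout can be inspected outside Android. The class LmkPacketSketch and its methods are hypothetical, and the numeric value 1 for LMK_PROCPRIO is an assumption, since the article only shows the symbolic name.

```java
import java.nio.ByteBuffer;

/** Packs and unpacks a four-int priority message; an illustration, not framework code. */
final class LmkPacketSketch {
    // Assumed command code; the article only uses the symbolic name LMK_PROCPRIO.
    static final int LMK_PROCPRIO = 1;

    static ByteBuffer encode(int pid, int uid, int adj) {
        ByteBuffer buf = ByteBuffer.allocate(4 * 4); // four 32-bit ints
        buf.putInt(LMK_PROCPRIO);
        buf.putInt(pid);
        buf.putInt(uid);
        buf.putInt(adj);
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    static int[] decode(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // leave the caller's position untouched
        return new int[] { view.getInt(), view.getInt(), view.getInt(), view.getInt() };
    }
}
```

Because the command code comes first, the receiver can read one int to decide how to interpret the remaining three.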
II. Process scheduling policy
if (app.setSchedGroup != curSchedGroup) {
    int oldSchedGroup = app.setSchedGroup;
    app.setSchedGroup = curSchedGroup;
    if (DEBUG_SWITCH || DEBUG_OOM_ADJ || mService.mCurOomAdjUid == app.uid) {
        String msg = "Setting sched group of " + app.processName
                + " to " + curSchedGroup + ": " + app.adjType;
        reportOomAdjMessageLocked(TAG_OOM_ADJ, msg);
    }
    if (app.waitingToKill != null && app.curReceivers.isEmpty()
            && app.setSchedGroup == ProcessList.SCHED_GROUP_BACKGROUND) {
        app.kill(app.waitingToKill, ApplicationExitInfo.REASON_USER_REQUESTED,
                ApplicationExitInfo.SUBREASON_UNKNOWN, true);
        success = false;
    } else {
        int processGroup;
        switch (curSchedGroup) {
            case ProcessList.SCHED_GROUP_BACKGROUND:
                processGroup = THREAD_GROUP_BACKGROUND;
                break;
            case ProcessList.SCHED_GROUP_TOP_APP:
            case ProcessList.SCHED_GROUP_TOP_APP_BOUND:
                processGroup = THREAD_GROUP_TOP_APP;
                break;
            case ProcessList.SCHED_GROUP_RESTRICTED:
                processGroup = THREAD_GROUP_RESTRICTED;
                break;
            default:
                processGroup = THREAD_GROUP_DEFAULT;
                break;
        }
        // Before Android 12, Process.setProcessGroup was called directly here. That call
        // can be slow when the process has many threads, so it now goes through a handler
        // message; the handler is created in the constructor. The call becomes
        // asynchronous instead of synchronous, which avoids blocking.
        mProcessGroupHandler.sendMessage(mProcessGroupHandler.obtainMessage(
                0 /* unused */, app.pid, processGroup, app.processName));
        try {
            if (curSchedGroup == ProcessList.SCHED_GROUP_TOP_APP) {
                // do nothing if we already switched to RT
                if (oldSchedGroup != ProcessList.SCHED_GROUP_TOP_APP) {
                    app.getWindowProcessController().onTopProcChanged();
                    if (mService.mUseFifoUiScheduling) {
                        // Switch UI pipeline for app to SCHED_FIFO
                        app.savedPriority = Process.getThreadPriority(app.pid);
                        mService.scheduleAsFifoPriority(app.pid, /* suppressLogs */true);
                        if (app.renderThreadTid != 0) {
                            mService.scheduleAsFifoPriority(app.renderThreadTid,
                                    /* suppressLogs */true);
                            if (DEBUG_OOM_ADJ) {
                                Slog.d("UI_FIFO", "Set RenderThread (TID " +
                                        app.renderThreadTid + ") to FIFO");
                            }
                        } else {
                            if (DEBUG_OOM_ADJ) {
                                Slog.d("UI_FIFO", "Not setting RenderThread TID");
                            }
                        }
                    } else {
                        // Boost priority for top app UI and render threads
                        setThreadPriority(app.pid, TOP_APP_PRIORITY_BOOST);
                        if (app.renderThreadTid != 0) {
                            try {
                                setThreadPriority(app.renderThreadTid,
                                        TOP_APP_PRIORITY_BOOST);
                            } catch (IllegalArgumentException e) {
                                // thread died, ignore
                            }
                        }
                    }
                }
            } else if (oldSchedGroup == ProcessList.SCHED_GROUP_TOP_APP &&
                    curSchedGroup != ProcessList.SCHED_GROUP_TOP_APP) {
                app.getWindowProcessController().onTopProcChanged();
                if (mService.mUseFifoUiScheduling) {
                    try {
                        // Reset UI pipeline to SCHED_OTHER
                        setThreadScheduler(app.pid, SCHED_OTHER, 0);
                        setThreadPriority(app.pid, app.savedPriority);
                        if (app.renderThreadTid != 0) {
                            setThreadScheduler(app.renderThreadTid,
                                    SCHED_OTHER, 0);
                        }
                    } catch (IllegalArgumentException e) {
                        Slog.w(TAG,
                                "Failed to set scheduling policy, thread does not exist:\n"
                                        + e);
                    } catch (SecurityException e) {
                        Slog.w(TAG, "Failed to set scheduling policy, not allowed:\n" + e);
                    }
                } else {
                    // Reset priority for top app UI and render threads
                    setThreadPriority(app.pid, 0);
                }
                if (app.renderThreadTid != 0) {
                    setThreadPriority(app.renderThreadTid, THREAD_PRIORITY_DISPLAY);
                }
            }
        } catch (Exception e) {
            if (DEBUG_ALL) {
                Slog.w(TAG, "Failed setting thread priority of " + app.pid, e);
            }
        }
    }
}
OomAdjuster(ActivityManagerService service, ProcessList processList, ActiveUids activeUids,
        ServiceThread adjusterThread) {
    ...
    mProcessGroupHandler = new Handler(adjusterThread.getLooper(), msg -> {
        final int pid = msg.arg1;
        final int group = msg.arg2;
        try {
            setProcessGroup(pid, group);
        } catch (Exception e) {
            if (DEBUG_ALL) {
                Slog.w(TAG, "Failed setting process group of " + pid + " to " + group, e);
            }
        } finally {
            Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
        }
        return true;
    });
    ...
}
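1. Process.setProcessGroup(int pid, int group) is called asynchronously to set the process scheduling policy. The goal is to use the Linux cgroup mechanism: depending on its state, a process is placed into a predefined cgroup whose configuration covers sub-resources such as CPU usage, cpuset, and CPU frequency scaling, so that processes in a given state get the system resources they need.
2. When schedGroup switches into or out of the top-app group, setThreadPriority is called to change the priority of the main thread and the render thread, which improves responsiveness for the user.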
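Moving from a direct call to a handler message is a plain offloading pattern. The sketch below uses a single-threaded ExecutorService from the JDK as a stand-in for the Android Handler on the adjuster thread, so that it can run outside Android. ProcessGroupOffload, slowSetProcessGroup, and the applied list are hypothetical names introduced only for this illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Offloads a potentially slow per-process call to one worker thread, in posting order. */
final class ProcessGroupOffload {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    final List<String> applied = new CopyOnWriteArrayList<>();

    /** Returns immediately; the work itself runs later on the worker thread. */
    void post(int pid, int group) {
        worker.execute(() -> {
            try {
                slowSetProcessGroup(pid, group);
            } catch (RuntimeException e) {
                // As in the article's handler, a failure for one pid is swallowed
                // so that later messages are still processed.
            }
        });
    }

    // Hypothetical stand-in for the real, possibly slow, setProcessGroup call.
    private void slowSetProcessGroup(int pid, int group) {
        applied.add(pid + "->" + group);
    }

    /** Drains queued work and stops the worker; returns true if it finished in time. */
    boolean shutdownAndWait() {
        worker.shutdown();
        try {
            return worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A single worker keeps the updates for a given pid in the order they were posted, which matters when a process moves between groups several times in quick succession.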
III. Setting the process state
if (app.getReportedProcState() != app.getCurProcState()) {
    app.setReportedProcState(app.getCurProcState());
    if (app.thread != null) {
        try {
            if (false) {
                //RuntimeException h = new RuntimeException("here");
                Slog.i(TAG, "Sending new process state " + app.getReportedProcState()
                        + " to " + app /*, h*/);
            }
            app.thread.setProcessState(app.getReportedProcState());
        } catch (RemoteException e) {
        }
    }
}
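This calls setProcessState on the app process's ApplicationThread. We won't go into that code here; it is enough to know that the purpose is to tell the ART runtime how perceptible the current process is, which ART uses to switch its GC algorithm between foreground GC and background GC. The foreground GC is more efficient but produces fragmentation; the background GC is less efficient but does not.
Summary
The ADJ algorithm sets priorities by observing process state and defines around 20 levels internally. When memory is low, processes with a high ADJ value (that is, a low priority) are picked from the LRU queue and reclaimed.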
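As a closing illustration, the selection rule in the summary can be written in a few lines of Java. This is only a sketch of the idea under stated assumptions: the kill decision itself is made by lmkd rather than by code like this, the Candidate type and pickVictim helper are hypothetical, and skipping every process with adj below 0 is a simplification of "system processes are almost never killed".

```java
import java.util.List;

/** Illustrates "highest adj goes first, LRU order breaks ties"; not framework code. */
final class VictimSketch {
    static final class Candidate {
        final String name;
        final int adj;

        Candidate(String name, int adj) {
            this.name = name;
            this.adj = adj;
        }
    }

    /**
     * Picks the process with the highest adj, i.e. the lowest priority. The list is
     * assumed to be in LRU order, least recently used first, so on a tie the earlier
     * entry wins. Returns null when no process is eligible.
     */
    static Candidate pickVictim(List<Candidate> lruOrder) {
        Candidate victim = null;
        for (Candidate c : lruOrder) {
            if (c.adj < 0) {
                continue; // sketch assumption: system-level processes are never picked
            }
            if (victim == null || c.adj > victim.adj) {
                victim = c;
            }
        }
        return victim;
    }
}
```

Given a system process at -900, a music player at 200, and two cached processes at 950, the sketch picks the cached process that appears first in the LRU list.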