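1. Preface
I have read many blog posts about the Android boot sequence and still only half understood it. I am convinced that the only way to understand it properly is to walk through the flow myself, so this post is my record of tracing the Android system startup. Because my knowledge of C and C++ is limited, I start tracing from the Java layer. The notes are based on the Android 9.0 source: http://androidxref.com/9.0.0_r3/
2. From the power button to the init process
2.1 When the power button is pressed, the Bootloader is loaded into RAM (Random Access Memory).
2.2 The Bootloader is a small program that runs before the Android operating system starts; its main job is to load the OS and start it running. The Linux kernel then boots, mainly loading drivers and mounting the root file system; once the file systems are mounted, the kernel starts the init process.
2.3 The init process mainly: a. creates some directories and mounts devices; b. initializes and starts the property service; c. parses the init.rc configuration file and starts the zygote process.
2.4 The zygote process calls the main method of ZygoteInit, which is where execution crosses from C++ into the Java framework layer.
The steps above are based on: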
https://www.jianshu.com/p/9f978d57c683
http://liuwangshu.cn/framework/booting/1-init.html
http://liuwangshu.cn/framework/booting/2-zygote.html
The analysis below starts from the main method of ZygoteInit.
3. ZygoteInit
public static void main(String argv[]) {
ZygoteServer zygoteServer = new ZygoteServer();
//1 Create the ZygoteServer instance
// Mark zygote start. This ensures that thread creation will throw
// an error.
ZygoteHooks.startZygoteNoThreadCreation();
// Zygote goes into its own process group.
try {
Os.setpgid(0, 0);
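//2 Put zygote into its own process group. setpgid(int pid, int pgid) takes a process id and a process group id; (0, 0) makes the calling process the leader of a new group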
} catch (ErrnoException ex) {
throw new RuntimeException("Failed to setpgid(0,0)", ex);
}
final Runnable caller;
try {
// Report Zygote start time to tron unless it is a runtime restart
//3 Report the Zygote start time, unless this is a runtime restart
if (!"1".equals(SystemProperties.get("sys.boot_completed"))) {
MetricsLogger.histogram(null, "boot_zygote_init",
(int) SystemClock.elapsedRealtime());
}
String bootTimeTag = Process.is64Bit() ? "Zygote64Timing" : "Zygote32Timing";
TimingsTraceLog bootTimingsTraceLog = new TimingsTraceLog(bootTimeTag,
Trace.TRACE_TAG_DALVIK);
bootTimingsTraceLog.traceBegin("ZygoteInit");
RuntimeInit.enableDdms();
boolean startSystemServer = false;
String socketName = "zygote";
//4 The default socket name is "zygote"
String abiList = null;
boolean enableLazyPreload = false;
for (int i = 1; i < argv.length; i++) { //5 argv is the String array passed to main; walk it to decide what to do
if ("start-system-server".equals(argv[i])) {
startSystemServer = true; //6 Whether to start the SystemServer process; normally set for the primary zygote
} else if ("--enable-lazy-preload".equals(argv[i])) {
enableLazyPreload = true; //7 Whether to preload lazily
} else if (argv[i].startsWith(ABI_LIST_ARG)) {
abiList = argv[i].substring(ABI_LIST_ARG.length());
} else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
socketName = argv[i].substring(SOCKET_NAME_ARG.length());
//8 Use the socket name passed in, if any; otherwise keep the default "zygote"
} else {
throw new RuntimeException("Unknown command line argument: " + argv[i]);
}
}
if (abiList == null) {
throw new RuntimeException("No ABI list supplied.");
}
zygoteServer.registerServerSocketFromEnv(socketName); //9 Create the server-side socket
// In some configurations, we avoid preloading resources and classes eagerly.
// In such cases, we will preload things prior to our first fork.
if (!enableLazyPreload) { //10 Without --enable-lazy-preload (the usual case for the primary zygote at boot), preload() runs eagerly here
bootTimingsTraceLog.traceBegin("ZygotePreload");
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());
preload(bootTimingsTraceLog); //11 Preload classes and resources
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());
bootTimingsTraceLog.traceEnd(); // ZygotePreload
} else {
Zygote.resetNicePriority(); //12 Lazy-preload case: nothing is preloaded here, only the nice priority is reset
}
// Do an initial gc to clean up after startup
//13 As the comment above says, run a GC after startup to clean up
bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
gcAndFinalize(); //14 Force a GC
bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC
bootTimingsTraceLog.traceEnd(); // ZygoteInit
// Disable tracing so that forked processes do not inherit stale tracing tags from
// Zygote.
Trace.setTracingEnabled(false, 0);
Zygote.nativeSecurityInit();
// Zygote process unmounts root storage spaces.
Zygote.nativeUnmountStorageOnInit(); //15 Unmount the root storage spaces
ZygoteHooks.stopZygoteNoThreadCreation();
if (startSystemServer) { //16 Check whether SystemServer should be started
Runnable r = forkSystemServer(abiList, socketName, zygoteServer);
// {@code r == null} in the parent (zygote) process, and {@code r != null} in the
// child (system_server) process.
if (r != null) {
r.run();
return;
}
}
Log.i(TAG, "Accepting command socket connections");
// The select loop returns early in the child process after a fork and
// loops forever in the zygote.
caller = zygoteServer.runSelectLoop(abiList); //17 Enter the loop and wait for socket messages
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with exception", ex);
throw ex;
} finally {
zygoteServer.closeServerSocket();
}
// We're in the child process and have exited the select loop. Proceed to execute the
// command.
if (caller != null) {
caller.run();
}
}
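Note 1: A ZygoteServer object is created; what ZygoteServer actually does is analyzed later, where it is used.
Note 2: Puts the current process into its own process group by calling Os.setpgid. Per setpgid(2), a pid of 0 means the calling process, and a pgid of 0 means "use that process's own pid as the group id":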
http://androidxref.com/9.0.0_r3/xref/libcore/luni/src/main/java/android/system/Os.java#501
/**
* See <a href="http://man7.org/linux/man-pages/man2/setpgid.2.html">setpgid(2)</a>.
*/
/** @hide */ public static void setpgid(int pid, int pgid) throws ErrnoException { Libcore.os.setpgid(pid, pgid); }
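Note 4: The default socket name is "zygote"; it is replaced if the arguments passed to main contain a socket name. When the socket is actually set up, a common prefix is prepended to socketName.
Notes 5, 6, 7, 8: main receives a String array argv; iterating over it decides whether to start SystemServer, whether to preload lazily, which ABI list to use, and the socket name.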
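To make the argument handling easier to try outside AOSP, here is a minimal standalone sketch of the same loop. The ZygoteArgs class is hypothetical, and the values of ABI_LIST_ARG and SOCKET_NAME_ARG ("--abi-list=" and "--socket-name=") are my assumption, since the excerpt above does not show them.

```java
/** Hypothetical, simplified holder for the options ZygoteInit.main() derives from argv. */
final class ZygoteArgs {
    // Assumed values of the ABI_LIST_ARG / SOCKET_NAME_ARG constants (not shown in the excerpt).
    static final String ABI_LIST_ARG = "--abi-list=";
    static final String SOCKET_NAME_ARG = "--socket-name=";

    boolean startSystemServer = false;
    boolean enableLazyPreload = false;
    String socketName = "zygote"; // default when no socket name argument is given
    String abiList = null;

    static ZygoteArgs parse(String[] argv) {
        ZygoteArgs out = new ZygoteArgs();
        // Index 0 is skipped, mirroring the loop in main().
        for (int i = 1; i < argv.length; i++) {
            String arg = argv[i];
            if ("start-system-server".equals(arg)) {
                out.startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(arg)) {
                out.enableLazyPreload = true;
            } else if (arg.startsWith(ABI_LIST_ARG)) {
                out.abiList = arg.substring(ABI_LIST_ARG.length());
            } else if (arg.startsWith(SOCKET_NAME_ARG)) {
                out.socketName = arg.substring(SOCKET_NAME_ARG.length());
            } else {
                throw new IllegalArgumentException("Unknown command line argument: " + arg);
            }
        }
        // The ABI list is the only mandatory option.
        if (out.abiList == null) {
            throw new IllegalArgumentException("No ABI list supplied.");
        }
        return out;
    }

    public static void main(String[] args) {
        ZygoteArgs parsed = parse(new String[] {
            "com.android.internal.os.ZygoteInit", "start-system-server", "--abi-list=arm64-v8a"});
        System.out.println(parsed.startSystemServer + " " + parsed.socketName + " " + parsed.abiList);
    }
}
```

Unknown arguments and a missing ABI list are treated as fatal, as in the original; the sketch throws IllegalArgumentException where the original throws RuntimeException.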
Note 9: Calls registerServerSocketFromEnv() on zygoteServer to register the server-side socket, passing in socketName:
void registerServerSocketFromEnv(String socketName) {
if (mServerSocket == null) {
int fileDesc;
final String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
try {
String env = System.getenv(fullSocketName);
fileDesc = Integer.parseInt(env);
} catch (RuntimeException ex) {
throw new RuntimeException(fullSocketName + " unset or invalid", ex);
}
try {
FileDescriptor fd = new FileDescriptor();
fd.setInt$(fileDesc);
mServerSocket = new LocalServerSocket(fd);
mCloseSocketFd = true;
} catch (IOException ex) {
throw new RuntimeException(
"Error binding to local socket '" + fileDesc + "'", ex);
}
}
}
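First the full socket name is assembled as "ANDROID_SOCKET_" + socketName, which is ANDROID_SOCKET_zygote by default. Next, the socket's file descriptor number is read from the environment variable of that name (init creates the socket and publishes its fd this way), and finally a LocalServerSocket is created around that fd as the server-side socket.
Notes 10, 11, 12: The arguments passed to main decide between eager and lazy preloading. For the primary zygote at boot, preloading is normally eager, and the eager path mainly calls preload():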
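The environment lookup can be tried in isolation. The following is only a sketch: SocketEnvLookup and lookupFd are hypothetical names, and a plain Map stands in for System.getenv() so the example runs anywhere. It stops at the fd number, because FileDescriptor.setInt$ and LocalServerSocket are Android-only.

```java
import java.util.HashMap;
import java.util.Map;

final class SocketEnvLookup {
    // Prefix quoted in the text above.
    static final String ANDROID_SOCKET_PREFIX = "ANDROID_SOCKET_";

    /** Returns the fd number stored under ANDROID_SOCKET_<socketName>; env stands in for System.getenv(). */
    static int lookupFd(Map<String, String> env, String socketName) {
        String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
        try {
            // Integer.parseInt(null) throws NumberFormatException, so an unset
            // variable and a non-numeric value both end up in the catch block.
            return Integer.parseInt(env.get(fullSocketName));
        } catch (RuntimeException ex) {
            throw new RuntimeException(fullSocketName + " unset or invalid", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("ANDROID_SOCKET_zygote", "9"); // made-up fd number for the demo
        System.out.println(lookupFd(env, "zygote"));
    }
}
```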
static void preload(TimingsTraceLog bootTimingsTraceLog) {
Log.d(TAG, "begin preload");
bootTimingsTraceLog.traceBegin("BeginIcuCachePinning");
beginIcuCachePinning();
bootTimingsTraceLog.traceEnd(); // BeginIcuCachePinning
bootTimingsTraceLog.traceBegin("PreloadClasses");
preloadClasses();
bootTimingsTraceLog.traceEnd(); // PreloadClasses
bootTimingsTraceLog.traceBegin("PreloadResources");
preloadResources();
bootTimingsTraceLog.traceEnd(); // PreloadResources
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadAppProcessHALs");
nativePreloadAppProcessHALs();
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadOpenGL");
preloadOpenGL();
Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
preloadSharedLibraries();
preloadTextResources();
// Ask the WebViewFactory to do any initialization that must run in the zygote process,
// for memory sharing purposes.
WebViewFactory.prepareWebViewInZygote();
endIcuCachePinning();
warmUpJcaProviders();
Log.d(TAG, "end preload");
sPreloadComplete = true;
}
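Preloading covers classes, system resources, app-process HALs, OpenGL, shared libraries and text resources, plus the WebViewFactory initialization that must run in the zygote process.
Notes 13, 14: Once the classes and system resources have been preloaded, gcAndFinalize() forces a garbage collection: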
/**
* Runs several special GCs to try to clean up a few generations of
* softly- and final-reachable objects, along with any other garbage.
* This is only useful just before a fork().
*/
/*package*/ static void gcAndFinalize() {
final VMRuntime runtime = VMRuntime.getRuntime();
/* runFinalizationSync() lets finalizers be called in Zygote,
* which doesn't have a HeapWorker thread.
*/
System.gc();
runtime.runFinalizationSync();
System.gc();
}
Note 16: During boot, the following code is executed:
if (startSystemServer) {
Runnable r = forkSystemServer(abiList, socketName, zygoteServer);
// {@code r == null} in the parent (zygote) process, and {@code r != null} in the
// child (system_server) process.
if (r != null) {
r.run();
return;
}
}
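forkSystemServer() forks the system_server process. In the child it returns a Runnable, which main() then runs directly on the current thread (a Runnable is not a thread); in the parent (zygote) it returns null. The core of forkSystemServer is: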
private static Runnable forkSystemServer(String abiList, String socketName,
ZygoteServer zygoteServer) {
...
...
/* Request to fork the system server process */
pid = Zygote.forkSystemServer(
parsedArgs.uid, parsedArgs.gid,
parsedArgs.gids,
parsedArgs.runtimeFlags,
null,
parsedArgs.permittedCapabilities,
parsedArgs.effectiveCapabilities);
} catch (IllegalArgumentException ex) {
throw new RuntimeException(ex);
}
/* For child process */
if (pid == 0) {
if (hasSecondZygote(abiList)) {
waitForSecondaryZygote(socketName);
}
zygoteServer.closeServerSocket();
return handleSystemServerProcess(parsedArgs);
}
return null;
}
Zygote.forkSystemServer creates the new process that will run SystemServer and returns the pid:
public static int forkSystemServer(int uid, int gid, int[] gids, int runtimeFlags,
int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
VM_HOOKS.preFork();
// Resets nice priority for zygote process.
resetNicePriority();
int pid = nativeForkSystemServer(
uid, gid, gids, runtimeFlags, rlimits, permittedCapabilities, effectiveCapabilities);
// Enable tracing as soon as we enter the system_server.
if (pid == 0) {
Trace.setTracingEnabled(true, runtimeFlags);
}
VM_HOOKS.postForkCommon();
return pid;
}
It calls the native method nativeForkSystemServer to fork the process and return the pid. Back to the code above:
/* For child process */
if (pid == 0) {
if (hasSecondZygote(abiList)) {
waitForSecondaryZygote(socketName);
}
zygoteServer.closeServerSocket();
return handleSystemServerProcess(parsedArgs);
}
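fork returns twice: in the parent it returns the child's pid, and in the child it returns 0, so pid == 0 means this code is running in the newly created child process. If a second zygote exists (for example the 32-bit zygote on a device that also supports 32-bit apps), the child waits until that secondary zygote is reachable. It then closes the socket inherited from Zygote: the forked process starts with a copy of Zygote's state, including its open file descriptors, but system_server has no use for the server socket created in Zygote, so it calls zygoteServer.closeServerSocket() to close it. system_server is the first process that zygote forks.
The handleSystemServerProcess method: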
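Java itself cannot fork, so the following is only a simulation of the control flow, not of process creation. The names ForkDispatch, afterFork and mainLike are hypothetical; the pid is passed in by hand to show how one code path serves both the parent and the child.

```java
final class ForkDispatch {
    /**
     * Stand-in for the code after a fork-like call: pid is 0 in the child and
     * the child's pid (greater than 0) in the parent.
     */
    static Runnable afterFork(int pid, Runnable childWork) {
        if (pid == 0) {
            return childWork; // child: hand the work back to the caller
        }
        return null; // parent: nothing to run, carry on to the select loop
    }

    /** Mirrors the shape of the startSystemServer branch in main(). */
    static String mainLike(int pid) {
        StringBuilder trace = new StringBuilder();
        Runnable r = afterFork(pid, () -> trace.append("system_server main"));
        if (r != null) {
            r.run(); // runs on the current thread, no new thread is created
            return trace.toString();
        }
        trace.append("zygote select loop");
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(mainLike(0));    // child path
        System.out.println(mainLike(1234)); // parent path, 1234 is a made-up pid
    }
}
```

Returning the Runnable instead of calling it deep inside the fork helpers lets main() run it from a shallow stack, which appears to be the point of this arrangement.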
private static Runnable handleSystemServerProcess(ZygoteConnection.Arguments parsedArgs) {
...
...
if (parsedArgs.invokeWith != null) {
...
...
} else {
ClassLoader cl = null;
if (systemServerClasspath != null) {
cl = createPathClassLoader(systemServerClasspath, parsedArgs.targetSdkVersion);
Thread.currentThread().setContextClassLoader(cl);
}
/*
* Pass the remaining arguments to SystemServer.
*/
return ZygoteInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
}
/* should never reach here */
}
The core of this method is the call to ZygoteInit.zygoteInit():
public static final Runnable zygoteInit(int targetSdkVersion, String[] argv, ClassLoader classLoader) {
if (RuntimeInit.DEBUG) {
Slog.d(RuntimeInit.TAG, "RuntimeInit: Starting application from zygote");
}
Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "ZygoteInit");
RuntimeInit.redirectLogStreams();
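// commonInit() installs process-wide defaults such as the uncaught exception handlers and the time zone
RuntimeInit.commonInit();
// Starts the Binder thread pool of the system_server process
ZygoteInit.nativeZygoteInit();
// Locates the static main method of the start class and returns a Runnable that invokes it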
return RuntimeInit.applicationInit(targetSdkVersion, argv, classLoader);
}
The commonInit method:
protected static final void commonInit() {
...
...
LoggingHandler loggingHandler = new LoggingHandler();
// Pre-handler: logs the process's uncaught exceptions
Thread.setUncaughtExceptionPreHandler(loggingHandler);
// Default handler: runs the crash handling flow
Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));
/*
* Install a TimezoneGetter subclass for ZoneInfo.db
*/
TimezoneGetter.setInstance(new TimezoneGetter() {
@Override
public String getId() {
return SystemProperties.get("persist.sys.timezone");
}
});
TimeZone.setDefault(null);
...
...
}
LoggingHandler is a static nested class of RuntimeInit:
private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
public volatile boolean mTriggered = false;
@Override
public void uncaughtException(Thread t, Throwable e) {
mTriggered = true;
// Don't re-enter if KillApplicationHandler has already run
if (mCrashing) return;
// mApplicationObject is null for non-zygote java programs (e.g. "am")
// There are also apps running with the system UID. We don't want the
// first clause in either of these two cases, only for system_server.
if (mApplicationObject == null && (Process.SYSTEM_UID == Process.myUid())) {
Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
} else {
StringBuilder message = new StringBuilder();
// The "FATAL EXCEPTION" string is still used on Android even though
// apps can set a custom UncaughtExceptionHandler that renders uncaught
// exceptions non-fatal.
message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
final String processName = ActivityThread.currentProcessName();
if (processName != null) {
message.append("Process: ").append(processName).append(", ");
}
message.append("PID: ").append(Process.myPid());
Clog_e(TAG, message.toString(), e);
}
}
}
A Java crash in an app is logged with the prefix FATAL EXCEPTION, while a Java crash in the system process is logged with the prefix *** FATAL EXCEPTION IN SYSTEM PROCESS.
The KillApplicationHandler class:
private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
private final LoggingHandler mLoggingHandler;
public KillApplicationHandler(LoggingHandler loggingHandler) {
this.mLoggingHandler = Objects.requireNonNull(loggingHandler);
}
@Override
public void uncaughtException(Thread t, Throwable e) {
try {
ensureLogging(t, e);
if (mCrashing) return;
mCrashing = true;
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
} catch (Throwable t2) {
if (t2 instanceof DeadObjectException) {
// System process is dead; ignore
} else {
try {
Clog_e(TAG, "Error reporting crash", t2);
} catch (Throwable t3) {
// Even Clog_e() fails! Oh well.
}
}
} finally {
// Try everything to make sure this process goes away.
Process.killProcess(Process.myPid());
System.exit(10);
}
}
private void ensureLogging(Thread t, Throwable e) {
if (!mLoggingHandler.mTriggered) {
try {
mLoggingHandler.uncaughtException(t, e);
} catch (Throwable loggingThrowable) {
// Ignored.
}
}
}
}
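This class is also a static nested class of RuntimeInit. When a crash occurs it first makes sure the exception has been logged, then reports the crash to AMS, which shows the crash dialog, and finally kills the process in the finally block: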
// Bring up crash dialog, wait for it to be dismissed
ActivityManager.getService().handleApplicationCrash(
mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
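The two-stage arrangement (log first, then report and always terminate) can be reproduced on a plain JVM. This is a sketch under assumptions: Thread.setUncaughtExceptionPreHandler is Android-specific, so the stages are chained by hand here, and the report and terminate callbacks are hypothetical stand-ins for the AMS call and for Process.killProcess plus System.exit.

```java
final class CrashHandlers {
    /** First stage: only logs. Plays the role of LoggingHandler. */
    static final class LoggingStage implements Thread.UncaughtExceptionHandler {
        volatile boolean triggered = false;
        final StringBuilder log = new StringBuilder();

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            triggered = true;
            log.append("FATAL EXCEPTION: ").append(t.getName()).append('\n');
        }
    }

    /** Second stage: ensures logging happened, reports, then always terminates. */
    static final class KillStage implements Thread.UncaughtExceptionHandler {
        private final LoggingStage logging;
        private final Runnable report;    // hypothetical stand-in for the AMS call
        private final Runnable terminate; // hypothetical stand-in for killProcess + exit

        KillStage(LoggingStage logging, Runnable report, Runnable terminate) {
            this.logging = logging;
            this.report = report;
            this.terminate = terminate;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                if (!logging.triggered) {
                    logging.uncaughtException(t, e);
                }
                report.run();
            } catch (Throwable ignored) {
                // A failure while reporting must never prevent termination.
            } finally {
                terminate.run();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LoggingStage logging = new LoggingStage();
        KillStage kill = new KillStage(logging,
                () -> System.out.println("report crash (stand-in)"),
                () -> System.out.println("terminate (stand-in, nothing is killed)"));
        Thread worker = new Thread(() -> { throw new IllegalStateException("boom"); }, "worker-1");
        worker.setUncaughtExceptionHandler(kill);
        worker.start();
        worker.join();
        System.out.print(logging.log);
    }
}
```

Putting the termination in finally is what guarantees the process goes away even when reporting itself throws.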
Back to ZygoteInit.nativeZygoteInit(), which ZygoteInit.zygoteInit() calls:
nativeZygoteInit is a JNI method registered in AndroidRuntime.cpp; it ends up starting the Binder thread pool.
Next, the RuntimeInit.applicationInit() method:
protected static Runnable applicationInit(int targetSdkVersion, String[] argv,
ClassLoader classLoader) {
VMRuntime.getRuntime().setTargetHeapUtilization(0.75f);
VMRuntime.getRuntime().setTargetSdkVersion(targetSdkVersion);
final Arguments args = new Arguments(argv);
// The end of of the RuntimeInit event (see #zygoteInit).
Trace.traceEnd(Trace.TRACE_TAG_ACTIVITY_MANAGER);
// Remaining arguments are passed to the start class's static main
return findStaticMain(args.startClass, args.startArgs, classLoader);
}
The core here is the call to findStaticMain():
protected static Runnable findStaticMain(String className, String[] argv,
ClassLoader classLoader) {
Class<?> cl;
try {
cl = Class.forName(className, true, classLoader);
} catch (ClassNotFoundException ex) {
throw new RuntimeException(
"Missing class when invoking static main " + className,
ex);
}
Method m;
try {
m = cl.getMethod("main", new Class[] { String[].class });
} catch (NoSuchMethodException ex) {
throw new RuntimeException(
"Missing static main on " + className, ex);
} catch (SecurityException ex) {
throw new RuntimeException(
"Problem getting static main on " + className, ex);
}
int modifiers = m.getModifiers();
if (! (Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
throw new RuntimeException(
"Main method is not public and static on " + className);
}
/*
* This throw gets caught in ZygoteInit.main(), which responds
* by invoking the exception's run() method. This arrangement
* clears up all the stack frames that were required in setting
* up the process.
*/
return new MethodAndArgsCaller(m, argv);
}
The class name passed in here is "com.android.server.SystemServer". The method loads that class, looks up its main method by reflection to obtain a Method object, and then wraps it in a MethodAndArgsCaller:
static class MethodAndArgsCaller implements Runnable {
/** method to call */
private final Method mMethod;
/** argument array */
private final String[] mArgs;
public MethodAndArgsCaller(Method method, String[] args) {
mMethod = method;
mArgs = args;
}
public void run() {
try {
mMethod.invoke(null, new Object[] { mArgs });
} catch (IllegalAccessException ex) {
throw new RuntimeException(ex);
} catch (InvocationTargetException ex) {
Throwable cause = ex.getCause();
if (cause instanceof RuntimeException) {
throw (RuntimeException) cause;
} else if (cause instanceof Error) {
throw (Error) cause;
}
throw new RuntimeException(ex);
}
}
}
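The key line is mMethod.invoke(null, new Object[] { mArgs }), which invokes SystemServer's main method. At this point SystemServer is officially started; it will be analyzed in a later post.
After that long detour, back to the main method of ZygoteInit.
Note 17:
caller = zygoteServer.runSelectLoop(abiList); //17 Enter the loop and wait for socket messages
This enters a loop that waits for socket messages. The runSelectLoop method: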
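The same find-then-defer pattern works on any JVM. In this sketch StaticMainFinder and DemoServer are hypothetical; DemoServer stands in for com.android.server.SystemServer, and a lambda replaces MethodAndArgsCaller.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class StaticMainFinder {
    /** Hypothetical target class standing in for SystemServer. */
    public static final class DemoServer {
        static String lastArg;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : null;
        }
    }

    /** Resolves className.main(String[]) now, but defers the call to the returned Runnable. */
    static Runnable findStaticMain(String className, String[] argv, ClassLoader loader) {
        final Method m;
        try {
            Class<?> cl = Class.forName(className, true, loader);
            m = cl.getMethod("main", String[].class);
        } catch (ClassNotFoundException | NoSuchMethodException ex) {
            throw new RuntimeException("Cannot resolve static main on " + className, ex);
        }
        int modifiers = m.getModifiers();
        if (!(Modifier.isStatic(modifiers) && Modifier.isPublic(modifiers))) {
            throw new RuntimeException("Main method is not public and static on " + className);
        }
        return () -> {
            try {
                // Wrapping argv in an Object[] passes it as one String[] argument
                // instead of letting varargs spread its elements.
                m.invoke(null, new Object[] {argv});
            } catch (IllegalAccessException | InvocationTargetException ex) {
                throw new RuntimeException(ex);
            }
        };
    }

    public static void main(String[] args) {
        Runnable caller = findStaticMain(DemoServer.class.getName(),
                new String[] {"hello"}, StaticMainFinder.class.getClassLoader());
        caller.run();
        System.out.println(DemoServer.lastArg);
    }
}
```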
Runnable runSelectLoop(String abiList) {
...
...
while (true) {
...
...
for (int i = pollFds.length - 1; i >= 0; --i) {
if ((pollFds[i].revents & POLLIN) == 0) {
continue;
}
if (i == 0) {
ZygoteConnection newPeer = acceptCommandPeer(abiList);
peers.add(newPeer);
fds.add(newPeer.getFileDesciptor());
} else {
try {
ZygoteConnection connection = peers.get(i);
final Runnable command = connection.processOneCommand(this);
if (mIsForkChild) {
if (command == null) {
throw new IllegalStateException("command == null");
}
return command;
} else {
if (command != null) {
throw new IllegalStateException("command != null");
}
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(i);
fds.remove(i);
}
}
} catch (Exception e) {
if (!mIsForkChild) {
Slog.e(TAG, "Exception executing zygote command: ", e);
ZygoteConnection conn = peers.remove(i);
conn.closeSocket();
fds.remove(i);
} else {
Log.e(TAG, "Caught post-fork exception in child process.", e);
throw e;
}
} finally {
mIsForkChild = false;
}
}
}
}
}
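The method runs a while loop and accepts connection requests. When i == 0, a connection request has arrived on the server socket: acceptCommandPeer() establishes the socket connection with the client and adds it to the polled lists, and the loop then waits for commands on that socket. When i > 0, a command has arrived on an existing connection, for example ActivityManagerService asking Zygote to create an application process; the loop calls processOneCommand() on the ZygoteConnection, which forks the new application process. In the child (mIsForkChild is true) the returned Runnable is handed back to main(); in the zygote, the connection is removed from peers and fds only once the peer has closed it or an exception has occurred.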
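The index handling in the loop can be shown with a small simulation that needs no sockets. Everything here is hypothetical: Peer stands in for ZygoteConnection, a boolean array stands in for the POLLIN flags, and the null placeholder at index 0 of peers is my assumption about how the listening socket's slot is kept aligned with fds.

```java
import java.util.ArrayList;
import java.util.List;

final class SelectLoopSketch {
    /** Hypothetical stand-in for a connection; closedByPeer mimics isClosedByPeer(). */
    static final class Peer {
        final String name;
        final boolean closedByPeer;

        Peer(String name, boolean closedByPeer) {
            this.name = name;
            this.closedByPeer = closedByPeer;
        }
    }

    /**
     * One pass over the ready flags. Index 0 is the listening socket; peers.get(0)
     * is a null placeholder so that peer indices line up with the flags.
     */
    static List<String> handleReady(List<Peer> peers, boolean[] ready, Peer incoming) {
        List<String> events = new ArrayList<>();
        // Walk backwards so that removing index i does not shift entries still to be visited.
        for (int i = ready.length - 1; i >= 0; --i) {
            if (!ready[i]) {
                continue;
            }
            if (i == 0) {
                peers.add(incoming); // new connection on the listening socket
                events.add("accepted " + incoming.name);
            } else {
                Peer p = peers.get(i);
                events.add("command from " + p.name);
                if (p.closedByPeer) {
                    peers.remove(i); // dropped only when the peer has closed it
                    events.add("dropped " + p.name);
                }
            }
        }
        return events;
    }

    public static void main(String[] args) {
        List<Peer> peers = new ArrayList<>();
        peers.add(null); // slot 0: listening socket
        peers.add(new Peer("A", false));
        peers.add(new Peer("B", true));
        System.out.println(handleReady(peers, new boolean[] {true, false, true}, new Peer("C", false)));
    }
}
```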
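4. Summary
The Zygote process mainly does the following:
a. registers a server-side socket;
b. preloads classes and system resources;
c. starts the SystemServer process;
d. enters a loop that receives socket messages.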