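Android 9.0 (P) system startup, a source-code walkthrough: series contents
Android 9.0 (P) system startup: the init process (stage one)
Android 9.0 (P) system startup: the init process (stage two)
Android 9.0 (P) system startup: init.rc syntax rules
Android 9.0 (P) system startup: the init process (stage three)
Android 9.0 (P) system startup: core services and key processes
Android 9.0 (P) system startup: the Zygote process
Android 9.0 (P) system startup: SystemServer
Android system startup: the Zygote process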
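0. Overview
In the previous article of this series, Android 9.0 (P) system startup: core services and key processes, we briefly summarized how Android P starts its core services and key processes, and touched on part of the zygote startup that this article covers in depth. zygote is a native service process. It is one of the services that init obtains by parsing the init.xxx.rc files, as described in the earlier init articles. These are Android's core native services; they are usually called daemons and run in the background to keep the system going.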
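The importance of zygote needs no explanation, and it does a great deal during startup. We go through it step by step against the source code. This article covers:
- An overview of the Zygote startup flow
- Where the Zygote process comes from
- Parsing the zygote creation arguments
- Creating the virtual machine
- Registering the JNI functions
- Starting ZygoteInit's main method through JNI
- Entering the Java world
People say zygote created the Android world, gave rise to the colorful Android ecosystem, and hatched every kind of app. This article shows how zygote actually does it.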
Note: the code shown in this article comes from the Android 9.0 source tree for the amlogic platform. The files involved are:
frameworks/base/cmds/app_process/app_main.cpp
frameworks/base/include/android_runtime/AndroidRuntime.h
frameworks/base/core/jni/AndroidRuntime.cpp
frameworks/base/core/java/com/android/internal/os/ZygoteInit.java
frameworks/base/core/jni/com_android_internal_os_Zygote.cpp
system/core/rootdir/init.rc
system/core/rootdir/init.zygoteXX.rc
frameworks/base/core/java/com/android/internal/os/ZygoteServer.java
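1. An overview of the Zygote startup flow
Zygote startup involves a lot of detail. To keep a clear picture, we start with an overall flow chart of the Zygote startup, so that the reader can take in the whole process before reading the details.
The flow chart lets us break Zygote startup into the following steps. These few modules hold up the entire Android Java world:
- startVm creates the virtual machine
- startReg registers the JNI functions
- registerServerSocketFromEnv registers the zygote communication channel
- preload preloads classes and resources
- forkSystemServer creates system_server
The following sections walk through Zygote startup in that order.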
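2. Where the Zygote process comes from
Zygote has a respectable pedigree. It neither falls from the sky nor springs out of a rock; it comes with proper birth papers. Let us trace those papers and see where zygote comes from.
2.1 Loading init.zygoteXX.rc
As mentioned before, since Android 7 each native service normally has its own rc configuration file, and zygote is no exception. By convention that file would sit in the service's source directory, frameworks/base/cmds/app_process, but there is none there.
Zygote is a special case: its rc files live in system/core/rootdir/, which contains several zygote rc files.
How does the right zygote rc file get loaded into init and parsed into a service? init.rc contains the following import statement, and Android uses the ro.zygote property to decide which zygote rc file to load.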
import /init.${ro.zygote}.rc
On the device I am analyzing, ro.zygote is set as follows, so this device loads init.zygote32.rc.
console:/ # getprop | grep ro.zygote
[ro.zygote]: [zygote32]
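Other devices may be configured differently. Let us look at why.
2.1.1 Reading the init.zygoteXX.rc files
As we have seen, the Android source tree has four init.zygoteXX.rc files. What does each one do, and when is it used? They differ as follows:
- init.zygote32.rc: zygote runs app_process (pure 32-bit mode). This was essentially the only mode before Android 5.
- init.zygote64.rc: zygote runs app_process64 (pure 64-bit mode). This is still uncommon, since few Android devices run entirely in 64-bit, although Google's recent policy seems to be pushing newer Android versions toward mandatory 64-bit.
- init.zygote32_64.rc: starts two zygote processes (named zygote and zygote_secondary), running app_process32 (primary) and app_process64 respectively.
- init.zygote64_32.rc: starts two zygote processes (named zygote and zygote_secondary), running app_process64 (primary) and app_process32 respectively. This is the common case.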
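The mapping in the list above can be written down as a small lookup. The following is only a sketch of that mapping, not AOSP code: zygoteServicesFor and zygoteRcPath are hypothetical helper names, and the binary names are taken directly from the list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One zygote service as declared in an init.zygoteXX.rc file.
struct ZygoteService {
    std::string name;    // service name used by init
    std::string binary;  // executable under /system/bin
};

// Hypothetical helper (not part of AOSP): maps a ro.zygote value to the
// services described in the list above. Unknown values yield an empty list.
std::vector<ZygoteService> zygoteServicesFor(const std::string& roZygote) {
    if (roZygote == "zygote32") {
        return {ZygoteService{"zygote", "app_process"}};
    }
    if (roZygote == "zygote64") {
        return {ZygoteService{"zygote", "app_process64"}};
    }
    if (roZygote == "zygote32_64") {
        return {ZygoteService{"zygote", "app_process32"},
                ZygoteService{"zygote_secondary", "app_process64"}};
    }
    if (roZygote == "zygote64_32") {
        return {ZygoteService{"zygote", "app_process64"},
                ZygoteService{"zygote_secondary", "app_process32"}};
    }
    return {};
}

// Hypothetical helper: the file name that `import /init.${ro.zygote}.rc`
// expands to once ${ro.zygote} is substituted.
std::string zygoteRcPath(const std::string& roZygote) {
    return "/init." + roZygote + ".rc";
}
```

For the device used in this article, zygoteRcPath("zygote32") gives /init.zygote32.rc and a single primary zygote.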
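2.1.2 Why there are several init.zygoteXX.rc configurations
Why four files rather than one? One file would not be enough, mainly for two reasons:
- As hardware improved and Android evolved, Google introduced 64-bit Android, partly to catch up with Apple's user experience. But not every device is high-end, and not every app is ready for 64-bit, so Android has to support all of these modes.
- Device makers ship both flagship and budget models, which may run at different bit widths, and each needs the matching configuration.
The different zygote rc files are largely the same; the main difference is whether they start a 32-bit or a 64-bit process. init.zygote32_64.rc and init.zygote64_32.rc start two processes, one primary and one secondary. We use init.zygote64_32.rc as the example:
# A normal service declaration. As explained in the init articles, a service needs at least two parameters: the service name and the executable path. Everything after that is the argument list, here -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
# Note: although the service is named zygote here, that is not necessarily the process name that ps shows on a running device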
service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
class main
priority -20
user root
group root readproc reserved_disk
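# Create a Unix domain stream socket named zygote (not TCP), mode 660, owner root, group system
socket zygote stream 660 root system
# onrestart: run the command that follows whenever this service restarts
onrestart write /sys/android_power/request_state wake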
onrestart write /sys/power/state on
onrestart restart audioserver
onrestart restart cameraserver
onrestart restart media
onrestart restart netd
onrestart restart wificond
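# When the service process is forked, write its pid to /dev/cpuset/foreground/tasks
writepid /dev/cpuset/foreground/tasks
# Declare the secondary zygote service, zygote_secondary. You will not find this name in ps, because the process renames itself at runtime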
service zygote_secondary /system/bin/app_process32 -Xzygote /system/bin --zygote --socket-name=zygote_secondary --enable-lazy-preload
class main
priority -20
user root
group root readproc reserved_disk
socket zygote_secondary stream 660 root system
onrestart restart zygote
writepid /dev/cpuset/foreground/tasks
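To make the structure of a service line concrete (service, then the name, then the executable path, then the arguments), here is a minimal sketch that splits such a line. It is an illustration only: parseServiceLine and ServiceDecl are hypothetical names, and the real init parser handles quoting, property expansion and the option lines that follow, none of which is attempted here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// The three parts of a `service <name> <path> [args...]` line.
struct ServiceDecl {
    std::string name;
    std::string path;
    std::vector<std::string> args;
};

// Hypothetical, simplified parser: splits on whitespace only.
// Returns false if the line is not a service line or lacks name/path.
bool parseServiceLine(const std::string& line, ServiceDecl* out) {
    std::istringstream in(line);
    std::string keyword;
    if (!(in >> keyword) || keyword != "service") {
        return false;
    }
    ServiceDecl decl;
    if (!(in >> decl.name >> decl.path)) {
        return false;  // a service needs at least a name and a path
    }
    for (std::string token; in >> token;) {
        decl.args.push_back(token);
    }
    *out = decl;
    return true;
}
```

Applied to the primary zygote line above, this yields the name zygote, the path /system/bin/app_process64 and five arguments, which are exactly the arguments that app_main.cpp later receives in argv.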
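2.1.3 How zygote actually appears on a running device
The service names zygote and zygote_secondary in the rc file do not match the process names on a running device, and ps will never show a process called zygote_secondary.
Run ps | grep zygote to see the zygote processes on the device.
Run ps | grep PID (with a zygote's PID) to see which processes that zygote has forked.
3. Starting the Zygote process
The previous section settled where zygote comes from. This section settles when it is started.
3.1 What triggers the Zygote startup
In the earlier article Android 9.0 (P) system startup: the init process (stage three) we looked at the order in which triggers fire. At the end of its third stage, init runs the following code to queue a trigger: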
// Don't mount filesystems or start core system services in charger mode.
std::string bootmode = GetProperty("ro.bootmode", "");
if (bootmode == "charger") {
am.QueueEventTrigger("charger");
} else {
am.QueueEventTrigger("late-init");
}
The device used here does not have encryption enabled, so ro.crypto.state is unencrypted. zygote is therefore started by the zygote-start trigger that on late-init raises. The code is in system/core/rootdir/init.rc.
......
import /init.${ro.zygote}.rc
......
# Mount filesystems and start core system services.
on late-init
......
# Fire the zygote-start trigger
# Now we can start zygote for devices with file based encryption
trigger zygote-start
......
trigger early-boot
trigger boot
......
# It is recommended to put unnecessary data/ initialization from post-fs-data
# to start-zygote in device's init.rc to unblock zygote start.
# (annotation) The zygote-start actions below are what actually start zygote
on zygote-start && property:ro.crypto.state=unencrypted
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start netd
start zygote
start zygote_secondary
on zygote-start && property:ro.crypto.state=unsupported
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start netd
start zygote
start zygote_secondary
on zygote-start && property:ro.crypto.state=encrypted && property:ro.crypto.type=file
# A/B update verifier that marks a successful boot.
exec_start update_verifier_nonencrypted
start netd
start zygote
start zygote_secondary
......
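The three zygote-start actions differ only in their property conditions: an action runs when its event trigger fires and every property condition holds. The following is a minimal sketch of that matching rule, under the assumption that conditions are simple equality tests. Action, actionMatches, commandsFor and zygoteStartActions are hypothetical names for illustration, not init's real classes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Properties = std::map<std::string, std::string>;

// Simplified model of `on <event> && property:<k>=<v> ...`.
struct Action {
    std::string eventTrigger;
    Properties propertyConds;           // every entry must match
    std::vector<std::string> commands;  // executed in order when matched
};

bool actionMatches(const Action& action, const std::string& event,
                   const Properties& props) {
    if (action.eventTrigger != event) return false;
    for (const auto& cond : action.propertyConds) {
        auto it = props.find(cond.first);
        if (it == props.end() || it->second != cond.second) return false;
    }
    return true;
}

// Collects the commands of every action that matches the fired event.
std::vector<std::string> commandsFor(const std::vector<Action>& actions,
                                     const std::string& event,
                                     const Properties& props) {
    std::vector<std::string> result;
    for (const Action& action : actions) {
        if (!actionMatches(action, event, props)) continue;
        result.insert(result.end(), action.commands.begin(), action.commands.end());
    }
    return result;
}

// The three zygote-start actions from the init.rc excerpt above.
std::vector<Action> zygoteStartActions() {
    const std::vector<std::string> cmds = {
        "exec_start update_verifier_nonencrypted", "start netd",
        "start zygote", "start zygote_secondary"};
    std::vector<Action> actions(3);
    for (Action& action : actions) {
        action.eventTrigger = "zygote-start";
        action.commands = cmds;
    }
    actions[0].propertyConds["ro.crypto.state"] = "unencrypted";
    actions[1].propertyConds["ro.crypto.state"] = "unsupported";
    actions[2].propertyConds["ro.crypto.state"] = "encrypted";
    actions[2].propertyConds["ro.crypto.type"] = "file";
    return actions;
}
```

With ro.crypto.state=unencrypted, as on the device used in this article, only the first action matches, so start zygote and start zygote_secondary each run once.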
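3.2 Zygote actually starts
After these steps, the start command launches both zygote services. From Android 9.0 (P) system startup: the init process (stage three) we know that start maps to the entry below, defined in system/core/init/builtins.cpp.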
static const Map builtin_functions = {
...
{"start", {1, 1, do_start}},
...
};
The start command is handled by the function do_start:
static Result<Success> do_start(const BuiltinArguments& args) {
Service* svc = ServiceList::GetInstance().FindService(args[1]);// look up the service by name
if (!svc) return Error() << "service " << args[1] << " not found";
if (auto result = svc->Start(); !result) {// svc->Start() launches the service
return Error() << "Could not start service: " << result.error();
}
return Success();
}
This code is simple: it looks the service up in ServiceList, the list of service sections that init parsed earlier, and starts it.
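The shape of the builtin table, a command name mapped to a minimum argument count, a maximum argument count and a handler, can be illustrated with a small stand-alone sketch. It is not init's implementation: Builtin, runBuiltin and makeDemoTable are hypothetical names, errors are plain strings rather than Result, and the known services are a hard-coded list.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Args = std::vector<std::string>;  // args[0] is the command name

// Mirrors the {min_args, max_args, function} triple of the builtin map.
struct Builtin {
    size_t minArgs;
    size_t maxArgs;
    std::function<std::string(const Args&)> fn;  // returns "" on success
};

// Looks the command up, checks the argument count, then dispatches.
std::string runBuiltin(const std::map<std::string, Builtin>& table, const Args& args) {
    if (args.empty()) return "empty command";
    auto it = table.find(args[0]);
    if (it == table.end()) return "unknown command " + args[0];
    const size_t argCount = args.size() - 1;
    if (argCount < it->second.minArgs || argCount > it->second.maxArgs) {
        return "wrong number of arguments for " + args[0];
    }
    return it->second.fn(args);
}

// Demo table with a single "start" command. Started service names are
// appended to *started so that callers can observe the effect.
std::map<std::string, Builtin> makeDemoTable(std::vector<std::string>* started) {
    std::map<std::string, Builtin> table;
    table["start"] = Builtin{1, 1, [started](const Args& args) -> std::string {
        static const std::vector<std::string> known = {"zygote", "zygote_secondary", "netd"};
        for (const std::string& name : known) {
            if (name == args[1]) {
                started->push_back(name);
                return "";
            }
        }
        return "service " + args[1] + " not found";
    }};
    return table;
}
```

The {1, 1, do_start} entry means start takes exactly one argument, the service name, which is why do_start reads args[1] without further checks.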
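The source for zygote and zygote_secondary is in frameworks/base/cmds/app_process. Android.mk shows that both processes are built from the same source file; the makefile simply produces different executables. This is common in Android: adb, for example, builds its Windows and Linux versions from one code base.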
......
app_process_src_files := \
app_main.cpp
LOCAL_SRC_FILES:= $(app_process_src_files)
......
LOCAL_MODULE:= app_process
LOCAL_MULTILIB := both
LOCAL_MODULE_STEM_32 := app_process32
LOCAL_MODULE_STEM_64 := app_process64
......
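4. The main function of the Zygote process
The previous sections covered where zygote comes from and how it gets launched. This section analyzes the main function of the zygote process, in frameworks/base/cmds/app_process/app_main.cpp.
The main function in app_main.cpp does not do much. It parses the arguments passed in and, depending on the result, starts in one of two modes:
- zygote mode: the mode this article covers, which initializes the zygote process. The arguments are -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote, where --start-system-server asks zygote to start SystemServer and --socket-name=zygote names the socket.
- application mode: starts a Java class directly in a stand-alone, non-zygote process. The arguments are the class name followed by that class's own arguments.
Finally, the parsed arguments decide whether AppRuntime's start function launches ZygoteInit or RuntimeInit. Since we are analyzing zygote startup, the arguments take us down the ZygoteInit branch.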
4.1 Parsing the zygote arguments
// From frameworks/base/cmds/app_process/app_main.cpp
int main(int argc, char* const argv[])
{
setpriority(0, getpid(), -20);
syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, getpid(), 1 | (IOPRIO_CLASS_RT << IOPRIO_CLASS_SHIFT));
// Collect argv into argv_String and log it.
// The arguments passed by `start zygote` are: -Xzygote /system/bin --zygote --start-system-server --socket-name=zygote
if (true || !LOG_NDEBUG) {
String8 argv_String;
for (int i = 0; i < argc; ++i) {
argv_String.append("\"");
argv_String.append(argv[i]);
argv_String.append("\" ");
}
ALOGD("app_process main with argv: %s", argv_String.string());
// Log line printed when zygote starts:
//app_process main with argv: "/system/bin/app_process64" "-Xzygote" "/system/bin" "--zygote" "--start-system-server" "--socket-name=zygote"
}
ALOGD("get prio: %d", getpriority(PRIO_PROCESS, 0));
// Construct the AppRuntime object from the arguments
AppRuntime runtime(argv[0], computeArgBlockSize(argc, argv));
// Process command line arguments
// ignore argv[0]
argc--;
argv++;
// Everything up to '--' or first non '-' arg goes to the vm.
//
// The first argument after the VM args is the "parent dir", which
// is currently unused.
//
// After the parent dir, we expect one or more the following internal
// arguments :
//
// --zygote : Start in zygote mode
// --start-system-server : Start the system server.
// --application : Start in application (stand alone, non zygote) mode.
// --nice-name : The nice name for this process.
//
// For non zygote starts, these arguments will be followed by
// the main class name. All remaining arguments are passed to
// the main method of this class.
//
// For zygote starts, all remaining arguments are passed to the zygote.
// main function.
//
// Note that we must copy argument string values since we will rewrite the
// entire argument block when we apply the nice name to argv0.
//
// As an exception to the above rule, anything in "spaced commands"
// goes to the vm even though it has a space in it.
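/*
 * Annotation: every argument up to "--", or up to the first argument that does not
 * start with "-", is passed to the VM. The exception is the spaced_commands array:
 * the argument that follows one of these options is passed to the VM as well,
 * even though it does not start with "-".
 */
const char* spaced_commands[] = { "-cp", "-classpath" };// classpath options: where the VM looks for the classes and jars the program depends on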
// Allow "spaced commands" to be succeeded by exactly 1 argument (regardless of -s).
bool known_command = false;
int i;
for (i = 0; i < argc; i++) {
if (known_command == true) {
runtime.addOption(strdup(argv[i]));// the argument that follows a spaced command is also added to the AppRuntime options
// The static analyzer gets upset that we don't ever free the above
// string. Since the allocation is from main, leaking it doesn't seem
// problematic. NOLINTNEXTLINE
ALOGV("app_process main add known option '%s'", argv[i]);
known_command = false;
continue;
}
for (int j = 0;
j < static_cast<int>(sizeof(spaced_commands) / sizeof(spaced_commands[0]));
++j) {
if (strcmp(argv[i], spaced_commands[j]) == 0) {// is this argument one of the spaced_commands?
known_command = true;
ALOGV("app_process main found known command '%s'", argv[i]);
}
}
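if (argv[i][0] != '-') {// leave the loop at the first argument that does not start with '-'; -Xzygote does start with '-', so the loop ends at the next argument, /system/bin
break;
}
if (argv[i][1] == '-' && argv[i][2] == 0) {// matches only a bare "--" separator, which the zygote command line does not contain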
++i; // Skip --.
break;
}
runtime.addOption(strdup(argv[i]));
// The static analyzer gets upset that we don't ever free the above
// string. Since the allocation is from main, leaking it doesn't seem
// problematic. NOLINTNEXTLINE
ALOGV("app_process main add option '%s'", argv[i]);
// Log line printed:
//app_process main add option '-Xzygote'
}
// Parse runtime arguments. Stop at first unrecognized option.
bool zygote = false;
bool startSystemServer = false;
bool application = false;
String8 niceName;
String8 className;
++i; // Skip unused "parent dir" argument.
// -Xzygote was consumed above; the ++i skips the second argument, /system/bin, the so-called "parent dir". The remaining arguments are parsed next.
while (i < argc) {
const char* arg = argv[i++];
if (strcmp(arg, "--zygote") == 0) {// the third argument is --zygote, so this is zygote mode
zygote = true;
niceName = ZYGOTE_NICE_NAME;// the name the zygote process renames itself to at runtime, as mentioned earlier; depending on the platform it is zygote64 or zygote
} else if (strcmp(arg, "--start-system-server") == 0) {// SystemServer must be started
startSystemServer = true;
} else if (strcmp(arg, "--application") == 0) {// application mode (stand-alone, non-zygote)
application = true;
} else if (strncmp(arg, "--nice-name=", 12) == 0) {// process nice name
niceName.setTo(arg + 12);
} else if (strncmp(arg, "--", 2) != 0) {// the class to start in non-zygote mode
className.setTo(arg);
break;
} else {
--i;
break;
}
}
wait_for_ready("/data/dalvik-cache/", 5000);
ALOGD("waited /data/dalvik-cache...");
Vector<String8> args;
if (!className.isEmpty()) {// a class name was given
// We're not in zygote mode, the only argument we need to pass
// to RuntimeInit is the application argument.
//
// The Remainder of args get passed to startup class main(). Make
// copies of them before we overwrite them with the process name.
args.add(application ? String8("application") : String8("tool"));
runtime.setClassNameAndArgs(className, argc - i, argv + i);// hand the class name and its arguments to the runtime
if (!LOG_NDEBUG) {// log the arguments passed to the class
String8 restOfArgs;
char* const* argv_new = argv + i;
int argc_new = argc - i;
for (int k = 0; k < argc_new; ++k) {
restOfArgs.append("\"");
restOfArgs.append(argv_new[k]);
restOfArgs.append("\" ");
}
ALOGV("Class name = %s, args = %s", className.string(), restOfArgs.string());
}
} else {// zygote mode, the case analyzed in this article
// We're in zygote mode.
maybeCreateDalvikCache();// create the Dalvik cache directory if needed
if (startSystemServer) {
args.add(String8("start-system-server"));// add the start-system-server argument
}
char prop[PROP_VALUE_MAX];
if (property_get(ABI_LIST_PROPERTY, prop, NULL) == 0) {
LOG_ALWAYS_FATAL("app_process: Unable to determine ABI list from property %s.",
ABI_LIST_PROPERTY);
return 11;
}
String8 abiFlag("--abi-list=");// add the --abi-list= argument
abiFlag.append(prop);
args.add(abiFlag);
// In zygote mode, pass all remaining arguments to the zygote
// main() method.
for (; i < argc; ++i) {// append the remaining arguments to args
args.add(String8(argv[i]));
}
}
if (!niceName.isEmpty()) {// set the process name; this is why app_process later shows up as zygote
runtime.setArgv0(niceName.string(), true /* setProcName */);
}
if (zygote) {// zygote mode: start ZygoteInit
runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
} else if (className) {// a class name was given: start RuntimeInit
runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
} else {
fprintf(stderr, "Error: no class name or --zygote supplied.\n");
app_usage();
LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
}
}
That concludes the analysis of main in app_main.cpp. The last thing it calls is runtime.start, so next we follow the start flow of AppRuntime.
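The parsing logic above is easier to follow when it is stripped of logging and AppRuntime calls. The following is a simplified re-implementation for illustration only. ParsedArgs and parseAppProcessArgs are hypothetical names, the input excludes argv[0], and kZygoteNiceName is fixed to "zygote" although the real ZYGOTE_NICE_NAME depends on the build.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumption: the real ZYGOTE_NICE_NAME is "zygote" or "zygote64".
static const char* const kZygoteNiceName = "zygote";

struct ParsedArgs {
    std::vector<std::string> vmOptions;  // what main() passes to addOption
    bool zygote = false;
    bool startSystemServer = false;
    bool application = false;
    std::string niceName;
    std::string className;
    std::vector<std::string> remaining;  // left for ZygoteInit or the class
};

// `args` is argv without argv[0].
ParsedArgs parseAppProcessArgs(const std::vector<std::string>& args) {
    ParsedArgs out;
    size_t i = 0;
    bool takeNext = false;  // previous argument was -cp or -classpath
    for (; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (takeNext) {
            out.vmOptions.push_back(arg);
            takeNext = false;
            continue;
        }
        if (arg == "-cp" || arg == "-classpath") takeNext = true;
        if (arg.empty() || arg[0] != '-') break;  // first non-option ends VM args
        if (arg == "--") { ++i; break; }          // explicit separator
        out.vmOptions.push_back(arg);
    }
    ++i;  // skip the unused "parent dir" argument
    while (i < args.size()) {
        const std::string& arg = args[i++];
        if (arg == "--zygote") {
            out.zygote = true;
            out.niceName = kZygoteNiceName;
        } else if (arg == "--start-system-server") {
            out.startSystemServer = true;
        } else if (arg == "--application") {
            out.application = true;
        } else if (arg.compare(0, 12, "--nice-name=") == 0) {
            out.niceName = arg.substr(12);
        } else if (arg.compare(0, 2, "--") != 0) {
            out.className = arg;
            break;
        } else {
            --i;  // unrecognized "--" option: leave it in the remainder
            break;
        }
    }
    for (; i < args.size(); ++i) out.remaining.push_back(args[i]);
    return out;
}
```

For the zygote command line, -Xzygote becomes a VM option, /system/bin is skipped as the parent dir, and --socket-name=zygote is not recognized by this loop, so it ends up in the remainder that is handed to ZygoteInit.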
5. AndroidRuntime::start
We said we would analyze AppRuntime's start, yet the heading says AndroidRuntime. The reason is that AppRuntime inherits from AndroidRuntime and does not override start, so the zygote flow continues in AndroidRuntime.cpp.
/*
* Start the Android runtime. This involves starting the virtual machine
* and calling the "static void main(String[] args)" method in the class
* named by "className".
*
* Passes the main function two arguments, the class name and the specified
* options string.
*/
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
......
// Print some logs and read the ANDROID_ROOT environment variable
/* start the virtual machine */
JniInvocation jni_invocation;
jni_invocation.Init(NULL);// initialize JNI by loading libart.so
JNIEnv* env;
if (startVm(&mJavaVM, &env, zygote) != 0) {// create the virtual machine
return;
}
// Call back into AppRuntime's onVmCreated.
// In the zygote startup flow it does nothing.
onVmCreated(env);
/*
* Register android functions.
*/
if (startReg(env) < 0) {// register the system JNI functions
ALOGE("Unable to register all android natives\n");
return;
}
/*
* We want to call main() with a String array with arguments in it.
* At present we have two arguments, the class name and an option string.
* Create an array to hold them.
*/
jclass stringClass;
jobjectArray strArray;
jstring classNameStr;
stringClass = env->FindClass("java/lang/String");
assert(stringClass != NULL);
strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
assert(strArray != NULL);
classNameStr = env->NewStringUTF(className);
assert(classNameStr != NULL);
env->SetObjectArrayElement(strArray, 0, classNameStr);
for (size_t i = 0; i < options.size(); ++i) {
jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
assert(optionsStr != NULL);
env->SetObjectArrayElement(strArray, i + 1, optionsStr);
}
/*
* Start VM. This thread becomes the main thread of the VM, and will
* not return until the VM exits.
*/
char* slashClassName = toSlashClassName(className != NULL ? className : "");
jclass startClass = env->FindClass(slashClassName);
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");// look up ZygoteInit's static main method through JNI
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);
#if 0
if (env->ExceptionCheck())
threadExitUncaughtException(env);
#endif
}
}
......
}
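From this brief look at the backbone of start, we can see that it does the following:
- Calls JniInvocation.Init to initialize JNI
- Creates the virtual machine. That is only a few words, but a lot is involved, so it gets its own section below
- Registers the system JNI functions
- Calls the main function of the ZygoteInit class through JNI
5.1 JniInvocation.Init
This code is defined in libnativehelper/JniInvocation.cpp.
Init initializes JNI. It first loads libart.so with dlopen to get a handle, then uses dlsym to find the addresses of three functions in libart.so, JNI_GetDefaultJavaVMInitArgs, JNI_CreateJavaVM and JNI_GetCreatedJavaVMs, and stores them in the corresponding members. These three functions are used later when the virtual machine is created.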
bool JniInvocation::Init(const char* library) {
#ifdef __ANDROID__
char buffer[PROP_VALUE_MAX];
#else
char* buffer = NULL;
#endif
library = GetLibrary(library, buffer);// returns libart.so by default
// Load with RTLD_NODELETE in order to ensure that libart.so is not unmapped when it is closed.
// This is due to the fact that it is possible that some threads might have yet to finish
// exiting even after JNI_DeleteJavaVM returns, which can lead to segfaults if the library is
// unloaded.
const int kDlopenFlags = RTLD_NOW | RTLD_NODELETE;
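/*
 * 1. dlopen opens the given shared library in the given mode and returns a handle
 * 2. RTLD_NOW resolves all undefined symbols before dlopen returns; if any cannot be resolved, dlopen returns NULL
 * 3. RTLD_NODELETE keeps the library loaded across dlclose(), and its static variables are not reinitialized if the library is later reloaded with dlopen()
 */
handle_ = dlopen(library, kDlopenFlags);// get a handle to libart.so
if (handle_ == NULL) {// on failure, log the error and retry with the fallback library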
if (strcmp(library, kLibraryFallback) == 0) {
// Nothing else to try.
ALOGE("Failed to dlopen %s: %s", library, dlerror());
return false;
}
// Note that this is enough to get something like the zygote
// running, we can't property_set here to fix this for the future
// because we are root and not the system user. See
// RuntimeInit.commonInit for where we fix up the property to
// avoid future fallbacks. http://b/11463182
ALOGW("Falling back from %s to %s after dlopen error: %s",
library, kLibraryFallback, dlerror());
library = kLibraryFallback;
handle_ = dlopen(library, kDlopenFlags);
if (handle_ == NULL) {
ALOGE("Failed to dlopen %s: %s", library, dlerror());
return false;
}
}
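/*
 * 1. FindSymbol calls dlsym internally
 * 2. dlsym takes a shared library handle and a symbol name and returns the address of that symbol
 * 3. Here the addresses of JNI_GetDefaultJavaVMInitArgs and the other two functions in libart.so are stored in JNI_GetDefaultJavaVMInitArgs_ and its sibling members
 */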
if (!FindSymbol(reinterpret_cast<void**>(&JNI_GetDefaultJavaVMInitArgs_),
"JNI_GetDefaultJavaVMInitArgs")) {
return false;
}
if (!FindSymbol(reinterpret_cast<void**>(&JNI_CreateJavaVM_),
"JNI_CreateJavaVM")) {
return false;
}
if (!FindSymbol(reinterpret_cast<void**>(&JNI_GetCreatedJavaVMs_),
"JNI_GetCreatedJavaVMs")) {
return false;
}
return true;
}
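The fallback logic in Init (try the requested library, and if that fails and it was not already the fallback, try the fallback once) can be isolated from dlopen. The sketch below injects the opener as a function so that it runs anywhere. openWithFallback, LoadResult and fakeOpen are hypothetical names, and fakeOpen is a test double standing in for dlopen.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Stand-in for dlopen: returns a non-null handle on success.
using Opener = std::function<void*(const std::string&)>;

struct LoadResult {
    void* handle;            // nullptr if nothing could be opened
    std::string loadedFrom;  // which library name finally succeeded
};

// Same decision structure as JniInvocation::Init, without logging.
LoadResult openWithFallback(const std::string& requested,
                            const std::string& fallback,
                            const Opener& open) {
    void* handle = open(requested);
    if (handle != nullptr) return {handle, requested};
    if (requested == fallback) return {nullptr, ""};  // nothing else to try
    handle = open(fallback);
    if (handle != nullptr) return {handle, fallback};
    return {nullptr, ""};
}

// Test double (an assumption for this sketch): only "libart.so" opens.
void* fakeOpen(const std::string& library) {
    static int dummy = 0;
    return library == "libart.so" ? static_cast<void*>(&dummy) : nullptr;
}
```

In the real code the opener is dlopen(library, RTLD_NOW | RTLD_NODELETE) and the fallback name is kLibraryFallback.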
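5.2 AndroidRuntime::startVm creates the virtual machine
This code is defined in frameworks/base/core/jni/AndroidRuntime.cpp. startVm mainly initializes the VM options and then calls JNI_CreateJavaVM to create the VM. Let us look at it in detail.
5.2.1 The options involved in creating the VM
A quick look at this function shows a large number of options being parsed. Unless you work on system tuning or very low-level code, you do not need to study them closely: the function reads values from various system properties, stores them in AndroidRuntime's mOptions array through addOption, and then calls the JNI_CreateJavaVM function located earlier in libart.so, passing those options in. I do not know the exact meaning of every option myself; readers interested in the virtual machine can dig deeper.
All of these options are added to the mOptions array through addOption. The figure below shows just one example; the real code has many more.
5.2.2 A brief look at the startVm code
As described above, startVm parses the VM options, collects them, and finally creates the virtual machine through JNI_CreateJavaVM. Here is the code: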
void AndroidRuntime::addOption(const char* optionString, void* extraInfo)
{
JavaVMOption opt;
opt.optionString = optionString;
opt.extraInfo = extraInfo;
mOptions.add(opt);
}
/*
* Start the Dalvik Virtual Machine.
*
* Various arguments, most determined by system properties, are passed in.
* The "mOptions" vector is updated.
*
* CAUTION: when adding options in here, be careful not to put the
* char buffer inside a nested scope. Adding the buffer to the
* options using mOptions.add() does not copy the buffer, so if the
* buffer goes out of scope the option may be overwritten. It's best
* to put the buffer at the top of the function so that it is more
* unlikely that someone will surround it in a scope at a later time
* and thus introduce a bug.
*
* Returns 0 on success.
*/
int AndroidRuntime::startVm(JavaVM** pJavaVM, JNIEnv** pEnv, bool zygote)
{
JavaVMInitArgs initArgs;
......
/* route exit() to our handler */
addOption("exit", (void*) runtime_exit);// one of many options added to the mOptions list
.......
initArgs.version = JNI_VERSION_1_4;
initArgs.options = mOptions.editArray();// hand mOptions to initArgs
initArgs.nOptions = mOptions.size();
initArgs.ignoreUnrecognized = JNI_FALSE;
/*
* Initialize the VM.
*
* The JavaVM* is essentially per-process, and the JNIEnv* is per-thread.
* If this call succeeds, the VM is ready, and we can start issuing
* JNI calls.
*/
if (JNI_CreateJavaVM(pJavaVM, pEnv, &initArgs) < 0) {// calls JNI_CreateJavaVM, the libart.so function located in section 5.1
ALOGE("JNI_CreateJavaVM failed\n");
return -1;
}
return 0;
}
// From libnativehelper/JniInvocation.cpp
jint JniInvocation::JNI_CreateJavaVM(JavaVM** p_vm, JNIEnv** p_env, void* vm_args) {
return JNI_CreateJavaVM_(p_vm, p_env, vm_args);// call the JNI_CreateJavaVM_ pointer initialized earlier
}
extern "C" jint JNI_CreateJavaVM(JavaVM** p_vm, JNIEnv** p_env, void* vm_args) {
return JniInvocation::GetJniInvocation().JNI_CreateJavaVM(p_vm, p_env, vm_args);// create the VM with the options collected above
}
}
JniInvocation& JniInvocation::GetJniInvocation() {
LOG_ALWAYS_FATAL_IF(jni_invocation_ == NULL,
"Failed to create JniInvocation instance before using JNI invocation API");
return *jni_invocation_;
}
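Most of the options that startVm collects come from system properties through a helper called parseRuntimeOption; it appears in the heap-size excerpt in section 5.2.3. The pattern is: read the property, fall back to a default when it is unset, prepend the VM flag, and add the result as an option. The sketch below reproduces that pattern with a std::map standing in for the property store. parseRuntimeOptionSketch and Properties are hypothetical names, and unlike the real helper it copies strings, so the buffer-lifetime caution in the comment above does not apply to it.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for the system property store (an assumption for this sketch).
using Properties = std::map<std::string, std::string>;

// Reads `property`; if it is unset or empty, uses `defaultValue`.
// If a value is available, appends runtimeArg + value to *options and
// returns true. With no value and an empty default, nothing is added.
bool parseRuntimeOptionSketch(const Properties& props,
                              const std::string& property,
                              const std::string& runtimeArg,
                              const std::string& defaultValue,
                              std::vector<std::string>* options) {
    auto it = props.find(property);
    const std::string value =
        (it != props.end() && !it->second.empty()) ? it->second : defaultValue;
    if (value.empty()) return false;
    options->push_back(runtimeArg + value);
    return true;
}
```

With dalvik.vm.heapsize set to 512m this produces -Xmx512m. With the property unset and a default of 16m it produces -Xmx16m, which is the small default heap that section 5.2.3 warns about.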
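5.2.3 A caveat on startVm option configuration
As Android versions and hardware have advanced, mainstream Android devices now typically ship with around 6 GB of RAM and 128 GB of storage. At the same time, apps consume more and more memory, so pay particular attention to how the dalvik heap size is initialized. If the options are configured incorrectly, or left at their defaults, the device may fail to reach the Launcher: apps today need a large heap, and with the default values they easily run out of memory and restart over and over.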
/*
* The default starting and maximum size of the heap. Larger
* values should be specified in a product property override.
*/
parseRuntimeOption("dalvik.vm.heapstartsize", heapstartsizeOptsBuf, "-Xms", "4m");
parseRuntimeOption("dalvik.vm.heapsize", heapsizeOptsBuf, "-Xmx", "16m");
parseRuntimeOption("dalvik.vm.heapgrowthlimit", heapgrowthlimitOptsBuf, "-XX:HeapGrowthLimit=");
parseRuntimeOption("dalvik.vm.heapminfree", heapminfreeOptsBuf, "-XX:HeapMinFree=");
parseRuntimeOption("dalvik.vm.heapmaxfree", heapmaxfreeOptsBuf, "-XX:HeapMaxFree=");
parseRuntimeOption("dalvik.vm.heaptargetutilization",
heaptargetutilizationOptsBuf,
"-XX:HeapTargetUtilization=");
The right heap sizes depend on the device's hardware. Run getprop | grep dalvik.vm.heap to see the values on a device; the values on our device are as follows:
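5.3 AndroidRuntime::startReg
startReg does the following:
- Calls androidSetCreateThreadFunc to set the function Android uses to create threads.
- Pushes a local reference frame with a capacity of 200, so that local references do not overflow. PushLocalFrame and PopLocalFrame are used as a pair: they create a nested frame that holds a given number of local references, every local reference created between the two calls lives in that frame, and popping the frame releases them all at once, so there is no need to release each one individually.
- Calls register_jni_procs to register the JNI functions: it walks the gRegJNI array and calls each entry's register function to register its native methods. Once they are registered, the main function of the Java class ZygoteInit is called through JNI, which takes us into the Java world.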
/*
* Register android native functions with the VM.
*/
/*static*/ int AndroidRuntime::startReg(JNIEnv* env)
{
ATRACE_NAME("RegisterAndroidNatives");
/*
* This hook causes all future threads created in this process to be
* attached to the JavaVM. (This needs to go away in favor of JNI
* Attach calls.)
*/
// Install javaCreateThreadEtc as the thread creation function; it ends up in androidCreateRawThreadEtc, which creates the thread with pthread_create
androidSetCreateThreadFunc((android_create_thread_fn) javaCreateThreadEtc);
ALOGV("--- registering native functions ---\n");
/*
* Every "register" function calls one or more things that return
* a local reference (e.g. FindClass). Because we haven't really
* started the VM yet, they're all getting stored in the base frame
* and never released. Use Push/Pop to manage the storage.
*/
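env->PushLocalFrame(200);// push a local reference frame with room for 200 references; local references are JNI handles to Java objects, valid until the frame is popped
if (register_jni_procs(gRegJNI, NELEM(gRegJNI), env) < 0) {// register the JNI functions
env->PopLocalFrame(NULL);
return -1;
}
env->PopLocalFrame(NULL);// paired with PushLocalFrame; releases the local reference frame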
//createJavaThread("fubar", quickTest, (void*) "hello");
return 0;
}
5.3.1 androidSetCreateThreadFunc
This function is defined in system/core/libutils/Threads.cpp. It points the thread creation function pointer gCreateThreadFn at javaCreateThreadEtc.
void androidSetCreateThreadFunc(android_create_thread_fn func)
{
gCreateThreadFn = func;
}
Next comes javaCreateThreadEtc, defined in frameworks/base/core/jni/AndroidRuntime.cpp. The flow behind it is fairly long and is not the focus of this article.
/*
* This is invoked from androidCreateThreadEtc() via the callback
* set with androidSetCreateThreadFunc().
*
* We need to create the new thread in such a way that it gets hooked
* into the VM before it really starts executing.
*/
/*static*/ int AndroidRuntime::javaCreateThreadEtc(
android_thread_func_t entryFunction,
void* userData,
const char* threadName,
int32_t threadPriority,
size_t threadStackSize,
android_thread_id_t* threadId)
{
void** args = (void**) malloc(3 * sizeof(void*)); // javaThreadShell must free
int result;
LOG_ALWAYS_FATAL_IF(threadName == nullptr, "threadName not provided to javaCreateThreadEtc");
args[0] = (void*) entryFunction;
args[1] = userData;
args[2] = (void*) strdup(threadName); // javaThreadShell must free
result = androidCreateRawThreadEtc(AndroidRuntime::javaThreadShell, args,
threadName, threadPriority, threadStackSize, threadId);
return result;
}
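The mechanism here is a replaceable global function pointer: thread creation always goes through the pointer, and the runtime swaps in its own implementation. The following is a minimal sketch of that hook pattern. All names are hypothetical, and to stay deterministic both implementations run the entry function synchronously instead of creating a real thread; the "attach"/"detach" log merely marks where the real javaThreadShell would attach the thread to the VM and detach it again.

```cpp
#include <cassert>
#include <string>

using ThreadFunc = int (*)(void* userData);
using CreateThreadFn = int (*)(ThreadFunc entry, void* userData, const char* name);

// Default implementation: just runs the entry function.
int defaultCreateThread(ThreadFunc entry, void* userData, const char* /*name*/) {
    return entry(userData);
}

// The hook itself, analogous in role to gCreateThreadFn.
static CreateThreadFn gCreateThreadFnSketch = defaultCreateThread;

void setCreateThreadFuncSketch(CreateThreadFn fn) { gCreateThreadFnSketch = fn; }

// All callers go through the pointer, so they pick up whatever is installed.
int createThreadSketch(ThreadFunc entry, void* userData, const char* name) {
    return gCreateThreadFnSketch(entry, userData, name);
}

// Records what the replacement implementation did, for inspection.
static std::string gCreateThreadLog;

// Replacement: wraps the entry function with attach/detach bookkeeping.
int attachingCreateThread(ThreadFunc entry, void* userData, const char* name) {
    gCreateThreadLog += std::string("attach:") + name + ";";
    const int result = entry(userData);
    gCreateThreadLog += "detach;";
    return result;
}

// Example entry function: increments the int that userData points to.
int incrementEntry(void* userData) {
    *static_cast<int*>(userData) += 1;
    return 0;
}
```

Because the hook is installed before any other thread is created, every later thread in the process goes through the VM-aware path.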
5.3.2 register_jni_procs
/*static*/ int AndroidRuntime::startReg(JNIEnv* env)
{
......
if (register_jni_procs(gRegJNI, NELEM(gRegJNI), env) < 0) {
env->PopLocalFrame(NULL);
return -1;
}
......
return 0;
}
register_jni_procs is defined in frameworks/base/core/jni/AndroidRuntime.cpp.
It delegates the work to the mProc member of RegJNIRec. RegJNIRec is a very simple struct, and mProc is a function pointer.
static int register_jni_procs(const RegJNIRec array[], size_t count, JNIEnv* env)
{
for (size_t i = 0; i < count; i++) {
if (array[i].mProc(env) < 0) {// call mProc
#ifndef NDEBUG
ALOGD("----------!!! %s failed to load\n", array[i].mName);
#endif
return -1;
}
}
return 0;
}
Now look at gRegJNI, the RegJNIRec array passed to register_jni_procs. It is simply a long list of function pointers:
static const RegJNIRec gRegJNI[] = {
REG_JNI(register_com_android_internal_os_RuntimeInit),
REG_JNI(register_com_android_internal_os_ZygoteInit_nativeZygoteInit),
REG_JNI(register_android_os_SystemClock),
REG_JNI(register_android_util_EventLog),
REG_JNI(register_android_util_Log),
REG_JNI(register_android_util_MemoryIntArray),
REG_JNI(register_android_util_PathParser),
REG_JNI(register_android_util_StatsLog),
REG_JNI(register_android_app_admin_SecurityLog),
REG_JNI(register_android_content_AssetManager),
REG_JNI(register_android_content_StringBlock),
REG_JNI(register_android_content_XmlBlock),
REG_JNI(register_android_content_res_ApkAssets),
REG_JNI(register_android_text_AndroidCharacter),
REG_JNI(register_android_text_Hyphenator),
REG_JNI(register_android_text_MeasuredParagraph),
REG_JNI(register_android_text_StaticLayout),
REG_JNI(register_android_view_InputDevice),
REG_JNI(register_android_view_KeyCharacterMap),
REG_JNI(register_android_os_Process),
REG_JNI(register_android_os_SystemProperties),
REG_JNI(register_android_os_Binder),
REG_JNI(register_android_os_Parcel),
REG_JNI(register_android_os_HidlSupport),
REG_JNI(register_android_os_HwBinder),
REG_JNI(register_android_os_HwBlob),
REG_JNI(register_android_os_HwParcel),
REG_JNI(register_android_os_HwRemoteBinder),
......
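};
Take one entry at random, register_com_android_internal_os_RuntimeInit. It follows the standard pattern for defining JNI functions and registering them dynamically, and internally it calls JNI's RegisterNatives. After registration, the native method nativeFinishInit of the Java class RuntimeInit calls com_android_internal_os_RuntimeInit_nativeFinishInit, and nativeSetExitWithoutCleanup calls com_android_internal_os_RuntimeInit_nativeSetExitWithoutCleanup.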
int register_com_android_internal_os_RuntimeInit(JNIEnv* env)
{
const JNINativeMethod methods[] = {
{ "nativeFinishInit", "()V",
(void*) com_android_internal_os_RuntimeInit_nativeFinishInit },
{ "nativeSetExitWithoutCleanup", "(Z)V",
(void*) com_android_internal_os_RuntimeInit_nativeSetExitWithoutCleanup },
};
return jniRegisterNativeMethods(env, "com/android/internal/os/RuntimeInit",
methods, NELEM(methods));
}
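The table-driven registration used by register_jni_procs and gRegJNI can be sketched without JNI. In the sketch below, Env is a hypothetical stand-in for JNIEnv, REG_SKETCH mimics the idea of the REG_JNI macro by recording the function name next to the pointer, and the register_* functions are invented examples.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for JNIEnv: just records which modules registered themselves.
struct Env {
    std::vector<std::string> registered;
};

// One table entry: a registration function plus its name for diagnostics.
struct RegRec {
    int (*proc)(Env*);
    const char* name;
};
#define REG_SKETCH(fn) { fn, #fn }

int register_alpha(Env* env) { env->registered.push_back("alpha"); return 0; }
int register_beta(Env* env) { env->registered.push_back("beta"); return 0; }
int register_broken(Env*) { return -1; }  // simulates a failing module

static const RegRec gGoodTable[] = {
    REG_SKETCH(register_alpha),
    REG_SKETCH(register_beta),
};

static const RegRec gBadTable[] = {
    REG_SKETCH(register_alpha),
    REG_SKETCH(register_broken),
    REG_SKETCH(register_beta),
};

// Calls every entry in order and stops at the first failure.
int registerAll(const RegRec* table, size_t count, Env* env) {
    for (size_t i = 0; i < count; i++) {
        if (table[i].proc(env) < 0) {
            return -1;
        }
    }
    return 0;
}
```

As in the real code, a single failing entry aborts the whole registration, so entries after the failure are never called.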
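6. Starting ZygoteInit's main method through JNI
By this point in the zygote startup the virtual machine has been created and the system JNI functions are registered. Everything is ready, and the next step is to call into a Java class through JNI.
The code is the familiar recipe: after all the preparation, CallStaticVoidMethod finally invokes the main function of ZygoteInit.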
void AndroidRuntime::start(const char* className, const Vector<String8>& options, bool zygote)
{
......
/*
* We want to call main() with a String array with arguments in it.
* At present we have two arguments, the class name and an option string.
* Create an array to hold them.
*/
// What follows is ordinary JNI code whose purpose is to call the main function of the ZygoteInit class
jclass stringClass;// declare a few JNI variables
jobjectArray strArray;
jstring classNameStr;
stringClass = env->FindClass("java/lang/String");
assert(stringClass != NULL);
strArray = env->NewObjectArray(options.size() + 1, stringClass, NULL);
assert(strArray != NULL);
classNameStr = env->NewStringUTF(className);
assert(classNameStr != NULL);
env->SetObjectArrayElement(strArray, 0, classNameStr);
for (size_t i = 0; i < options.size(); ++i) {
jstring optionsStr = env->NewStringUTF(options.itemAt(i).string());
assert(optionsStr != NULL);
env->SetObjectArrayElement(strArray, i + 1, optionsStr);
}
/*
* Start VM. This thread becomes the main thread of the VM, and will
* not return until the VM exits.
*/
/*
 * Convert the '.' characters in the name to '/'. The string passed in is com.android.internal.os.ZygoteInit,
 * which becomes com/android/internal/os/ZygoteInit, the form that JNI requires
 */
char* slashClassName = toSlashClassName(className != NULL ? className : "");// convert '.' to '/'
jclass startClass = env->FindClass(slashClassName);// find the class
if (startClass == NULL) {
ALOGE("JavaVM unable to locate class '%s'\n", slashClassName);
/* keep going */
} else {
jmethodID startMeth = env->GetStaticMethodID(startClass, "main",
"([Ljava/lang/String;)V");// look up ZygoteInit's static main method through JNI
if (startMeth == NULL) {
ALOGE("JavaVM unable to find main() in '%s'\n", className);
/* keep going */
} else {
env->CallStaticVoidMethod(startClass, startMeth, strArray);// call the main method
#if 0
if (env->ExceptionCheck())
threadExitUncaughtException(env);
#endif
}
}
free(slashClassName);
ALOGD("Shutting down VM\n");
if (mJavaVM->DetachCurrentThread() != JNI_OK)// detach the current (main) thread from the VM
ALOGW("Warning: unable to detach main thread\n");
if (mJavaVM->DestroyJavaVM() != 0)// wait for the remaining threads to finish, then shut the VM down
ALOGW("Warning: VM did not shut down cleanly\n");
}
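The class name conversion performed before FindClass is a plain character substitution. The following is a minimal sketch of the same idea using std::string; toSlashClassNameSketch is a hypothetical name, and the real toSlashClassName returns a heap-allocated char* that the caller frees, as the free(slashClassName) call above shows.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Converts a dotted Java class name to the slash-separated form
// that JNI's FindClass expects.
std::string toSlashClassNameSketch(const std::string& dottedName) {
    std::string result = dottedName;
    std::replace(result.begin(), result.end(), '.', '/');
    return result;
}
```

The method signature string "([Ljava/lang/String;)V" uses the same slash form: it describes a method that takes a String[] and returns void.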
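7. Entering the Java world
After a long detour, the call to CallStaticVoidMethod finally lands us in the Java world. Let us take a look around.
7.1 ZygoteInit.main
This code is defined in frameworks/base/core/java/com/android/internal/os/ZygoteInit.java. We will not explain it in great detail here. The main method does the following:
- Parses the arguments
- Calls zygoteServer.registerServerSocketFromEnv to create the server side of the zygote socket
- Calls forkSystemServer to start system_server
- Calls zygoteServer.runSelectLoop to enter the loop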
private static final String SOCKET_NAME_ARG = "--socket-name=";
public static void main(String argv[]) {
......
String socketName = "zygote";
String abiList = null;
boolean enableLazyPreload = false;
for (int i = 1; i < argv.length; i++) {// parse the arguments
if ("start-system-server".equals(argv[i])) {
startSystemServer = true;
} else if ("--enable-lazy-preload".equals(argv[i])) {
enableLazyPreload = true;
} else if (argv[i].startsWith(ABI_LIST_ARG)) {
abiList = argv[i].substring(ABI_LIST_ARG.length());
} else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
socketName = argv[i].substring(SOCKET_NAME_ARG.length());
} else {
throw new RuntimeException("Unknown command line argument: " + argv[i]);
}
}
if (abiList == null) {
throw new RuntimeException("No ABI list supplied.");
}
zygoteServer.registerServerSocketFromEnv(socketName);// register the channel through which AMS talks to the zygote process
// In some configurations, we avoid preloading resources and classes eagerly.
// In such cases, we will preload things prior to our first fork.
if (!enableLazyPreload) {
bootTimingsTraceLog.traceBegin("ZygotePreload");
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
SystemClock.uptimeMillis());
//firstPreload(bootTimingsTraceLog);
preloadTextResources();
EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
SystemClock.uptimeMillis());
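bootTimingsTraceLog.traceEnd(); // ZygotePreload: classes and resources are preloaded
} else {
// Lazy preload, as the comment above says:
// the zygote process priority is reset to normal,
// and preloading happens at the first fork instead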
Zygote.resetNicePriority();
}
// Do an initial gc to clean up after startup
bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
gcAndFinalize(); // run a GC; well worth it if a lot was just preloaded
bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC
bootTimingsTraceLog.traceEnd(); // ZygoteInit
// Disable tracing so that forked processes do not inherit stale tracing tags from
// Zygote.
Trace.setTracingEnabled(false, 0);
Zygote.nativeSecurityInit();
// Security-related setup (nativeSecurityInit above and the calls below)
// Zygote process unmounts root storage spaces.
Zygote.nativeUnmountStorageOnInit();
ZygoteHooks.stopZygoteNoThreadCreation();
if (startSystemServer) {
Runnable r = forkSystemServer(abiList, socketName, zygoteServer); // start system_server
// {@code r == null} in the parent (zygote) process, and {@code r != null} in the
// child (system_server) process.
if (r != null) {
r.run();
return;
}
}
//preloadTextResources();
Log.i(TAG, "Accepting command socket connections");
// The select loop returns early in the child process after a fork and
// loops forever in the zygote.
caller = zygoteServer.runSelectLoop(abiList); // enter the select loop
} catch (Throwable ex) {
Log.e(TAG, "System zygote died with exception", ex);
throw ex;
} finally {
zygoteServer.closeServerSocket();
}
// We're in the child process and have exited the select loop. Proceed to execute the
// command.
if (caller != null) {
caller.run();
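// This is where the new process's entry method is invoked via reflection;
// it is covered later, when we look at how new processes are started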
}
}
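To make the argument handling easier to follow in isolation, here is a minimal sketch of the same parsing loop in plain Java. The class ZygoteArgsSketch and the holder Parsed are hypothetical names made up for this illustration, and the value of ABI_LIST_ARG is assumed to be "--abi-list=", since the excerpt above does not show it.

```java
/** Hypothetical stand-alone sketch of the argument loop in ZygoteInit.main. */
final class ZygoteArgsSketch {
    // Assumed value; the excerpt above only shows SOCKET_NAME_ARG.
    static final String ABI_LIST_ARG = "--abi-list=";
    static final String SOCKET_NAME_ARG = "--socket-name=";

    /** Plain holder for the parsed values (illustration only). */
    static final class Parsed {
        boolean startSystemServer = false;
        boolean enableLazyPreload = false;
        String socketName = "zygote"; // same default as in the excerpt
        String abiList = null;
    }

    static Parsed parse(String[] argv) {
        Parsed p = new Parsed();
        // Index 0 is skipped, exactly as the loop above starts at i = 1.
        for (int i = 1; i < argv.length; i++) {
            String arg = argv[i];
            if ("start-system-server".equals(arg)) {
                p.startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(arg)) {
                p.enableLazyPreload = true;
            } else if (arg.startsWith(ABI_LIST_ARG)) {
                p.abiList = arg.substring(ABI_LIST_ARG.length());
            } else if (arg.startsWith(SOCKET_NAME_ARG)) {
                p.socketName = arg.substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + arg);
            }
        }
        if (p.abiList == null) {
            throw new RuntimeException("No ABI list supplied.");
        }
        return p;
    }

    public static void main(String[] args) {
        Parsed p = parse(new String[] {
            "placeholder", "start-system-server",
            "--abi-list=arm64-v8a", "--socket-name=zygote"});
        System.out.println(p.startSystemServer + " " + p.abiList + " " + p.socketName);
    }
}
```

Note that an unknown argument is a hard error rather than something silently ignored, so a typo in the rc file makes zygote fail loudly at startup.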
7.2 zygoteServer.registerServerSocketFromEnv
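This code lives in frameworks/base/core/java/com/android/internal/os/ZygoteServer.java. It reads the zygote socket's file descriptor from an environment variable that was set up when zygote was started, and wraps that descriptor in a server socket used to communicate with ActivityManagerService (AMS). AMS creates new processes through Process.start, which first connects to the zygote process over this socket; zygote then performs the actual process creation. The argument passed in here is --socket-name=zygote.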
/**
* Registers a server socket for zygote command connections. This locates the server socket
* file descriptor through an ANDROID_SOCKET_ environment variable.
*
* @throws RuntimeException when open fails
*/
void registerServerSocketFromEnv(String socketName) {
if (mServerSocket == null) {
int fileDesc;
final String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
try {
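// Remember? When init.zygote.rc was loaded, it declared a socket named zygote.
// When the process is created, init creates the matching file descriptor and exports it in an environment variable,
// so at this point we can read that environment variable back.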
String env = System.getenv(fullSocketName);
fileDesc = Integer.parseInt(env);
} catch (RuntimeException ex) {
throw new RuntimeException(fullSocketName + " unset or invalid", ex);
}
try {
FileDescriptor fd = new FileDescriptor();
fd.setInt$(fileDesc); // set the raw file descriptor
mServerSocket = new LocalServerSocket(fd); // create the local server socket on top of it
mCloseSocketFd = true;
} catch (IOException ex) {
throw new RuntimeException(
"Error binding to local socket '" + fileDesc + "'", ex);
}
}
}
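The lookup itself is small enough to try outside Android. The sketch below keeps only the name construction and the parsing, and takes the environment as a Map so it can run anywhere; SocketEnvSketch and resolveSocketFd are hypothetical names, and the prefix value "ANDROID_SOCKET_" is taken from the Javadoc comment in the excerpt above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: resolve the zygote socket fd from an environment map. */
final class SocketEnvSketch {
    static final String ANDROID_SOCKET_PREFIX = "ANDROID_SOCKET_";

    static int resolveSocketFd(String socketName, Map<String, String> env) {
        String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
        try {
            // A missing variable yields null, and Integer.parseInt(null) throws
            // NumberFormatException, so "unset" and "not a number" share one path.
            return Integer.parseInt(env.get(fullSocketName));
        } catch (RuntimeException ex) {
            throw new RuntimeException(fullSocketName + " unset or invalid", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();
        env.put("ANDROID_SOCKET_zygote", "9"); // made-up descriptor number
        System.out.println(resolveSocketFd("zygote", env));
    }
}
```

On a real device the value comes from System.getenv, and the number is only meaningful inside the zygote process, because the descriptor was inherited from init.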
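LocalSocket and LocalServerSocket are Android's wrappers around Unix domain sockets (AF_UNIX). Because the traffic never passes through the network protocol stack, they are cheaper than TCP sockets for communication between processes on the same device. Interested readers can explore these classes further. Note that zygote uses the LocalServerSocket(FileDescriptor) constructor, because the socket was already created and bound by init. For comparison, here is the name-based LocalServerSocket constructor, which shows the usual create, bind, listen sequence: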
public LocalServerSocket(String name) throws IOException
{
impl = new LocalSocketImpl();
// Create an AF_UNIX socket of type SOCKET_STREAM
impl.create(LocalSocket.SOCKET_STREAM);
localAddress = new LocalSocketAddress(name);
// Bind it to the given address
impl.bind(localAddress);
// Start listening
impl.listen(LISTEN_BACKLOG);
}
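As mentioned, ActivityManagerService creates new processes through Process.start, which first connects to the zygote process over the socket; zygote then performs the actual process creation. The code below shows how a request to create an app process reaches zygote.
// Defined in frameworks/base/core/java/android/os/Process.java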
public static final ProcessStartResult start(final String processClass,
final String niceName,
int uid, int gid, int[] gids,
int runtimeFlags, int mountExternal,
int targetSdkVersion,
String seInfo,
String abi,
String instructionSet,
String appDataDir,
String invokeWith,
String[] zygoteArgs) {
return zygoteProcess.start(processClass, niceName, uid, gid, gids,
runtimeFlags, mountExternal, targetSdkVersion, seInfo,
abi, instructionSet, appDataDir, invokeWith, zygoteArgs);
}
// Defined in frameworks/base/core/java/android/os/ZygoteProcess.java
public final Process.ProcessStartResult start(final String processClass,
final String niceName,
int uid, int gid, int[] gids,
int runtimeFlags, int mountExternal,
int targetSdkVersion,
String seInfo,
String abi,
String instructionSet,
String appDataDir,
String invokeWith,
String[] zygoteArgs) {
try {
return startViaZygote(processClass, niceName, uid, gid, gids,
runtimeFlags, mountExternal, targetSdkVersion, seInfo,
abi, instructionSet, appDataDir, invokeWith, false /* startChildZygote */,
zygoteArgs);
} catch (ZygoteStartFailedEx ex) {
Log.e(LOG_TAG,
"Starting VM process through Zygote failed");
throw new RuntimeException(
"Starting VM process through Zygote failed", ex);
}
}
private Process.ProcessStartResult startViaZygote(final String processClass,
final String niceName,
final int uid, final int gid,
final int[] gids,
int runtimeFlags, int mountExternal,
int targetSdkVersion,
String seInfo,
String abi,
String instructionSet,
String appDataDir,
String invokeWith,
boolean startChildZygote,
String[] extraArgs)
throws ZygoteStartFailedEx {
......
synchronized(mLock) {
return zygoteSendArgsAndGetResult(openZygoteSocketIfNeeded(abi), argsForZygote);
}
......
}
/**
* Tries to open socket to Zygote process if not already open. If
* already open, does nothing. May block and retry. Requires that mLock be held.
*/
@GuardedBy("mLock")
private ZygoteState openZygoteSocketIfNeeded(String abi) throws ZygoteStartFailedEx {
Preconditions.checkState(Thread.holdsLock(mLock), "ZygoteProcess lock not held");
if (primaryZygoteState == null || primaryZygoteState.isClosed()) {
try {
primaryZygoteState = ZygoteState.connect(mSocket);
} catch (IOException ioe) {
throw new ZygoteStartFailedEx("Error connecting to primary zygote", ioe);
}
maybeSetApiBlacklistExemptions(primaryZygoteState, false);
maybeSetHiddenApiAccessLogSampleRate(primaryZygoteState);
}
if (primaryZygoteState.matches(abi)) {
return primaryZygoteState;
}
// The primary zygote didn't match. Try the secondary.
if (secondaryZygoteState == null || secondaryZygoteState.isClosed()) {
try {
secondaryZygoteState = ZygoteState.connect(mSecondarySocket);
} catch (IOException ioe) {
throw new ZygoteStartFailedEx("Error connecting to secondary zygote", ioe);
}
maybeSetApiBlacklistExemptions(secondaryZygoteState, false);
maybeSetHiddenApiAccessLogSampleRate(secondaryZygoteState);
}
if (secondaryZygoteState.matches(abi)) {
return secondaryZygoteState;
}
throw new ZygoteStartFailedEx("Unsupported zygote ABI: " + abi);
}
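The primary/secondary selection in openZygoteSocketIfNeeded boils down to "ask the primary zygote first, fall back to the secondary". The following sketch shows only that decision and leaves out the lazy socket connection; ZygoteAbiSketch, State and select are hypothetical names, and matches is assumed to be a simple membership test on the zygote's ABI list.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of choosing a zygote by ABI. */
final class ZygoteAbiSketch {
    static final class State {
        final String name;
        final List<String> abiList;

        State(String name, String... abis) {
            this.name = name;
            this.abiList = Arrays.asList(abis);
        }

        // Assumption: a zygote "matches" when the requested ABI is in its list.
        boolean matches(String abi) {
            return abiList.contains(abi);
        }
    }

    static State select(State primary, State secondary, String abi) {
        if (primary.matches(abi)) {
            return primary;
        }
        // The primary zygote did not match; try the secondary one.
        if (secondary != null && secondary.matches(abi)) {
            return secondary;
        }
        throw new IllegalArgumentException("Unsupported zygote ABI: " + abi);
    }

    public static void main(String[] args) {
        State primary = new State("zygote", "arm64-v8a");
        State secondary = new State("zygote_secondary", "armeabi-v7a", "armeabi");
        System.out.println(select(primary, secondary, "armeabi-v7a").name);
    }
}
```

On a 64-bit device with 32-bit app support, this is how a 32-bit app would end up being forked by the secondary zygote.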
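At this point you may be wondering why AMS talks to zygote over a socket instead of Binder, Android's signature IPC mechanism. It is worth thinking about why the designers chose this; the considerations usually given are:
- It is often argued that zygote starts before service manager, so there is no service manager for zygote to register a Binder service with. (In the stock Android 9 init.rc, servicemanager is actually started earlier than zygote, so this argument mostly reduces to the next point.)
- Both zygote and service manager are started by init. Even if service manager is started first, nothing guarantees that it is ready by the time zygote comes up, so extra synchronization would be needed.
- The socket is owned by root with group system, so only privileged system users can read or write it, which adds another layer of protection. (It is a Unix domain socket, not an Internet domain socket.)
- Most importantly, zygote creates processes with fork, and forking a multi-threaded process is unsafe: only the calling thread survives in the child, so a lock held by another thread at fork time stays locked forever and can deadlock the child. Binder relies on a thread pool, so a plain socket is used to avoid this trouble altogether.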
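7.3 preload
This step preloads classes and resources. Every Android Java process is forked from zygote, so preloading in zygote speeds up the child processes and reduces memory use. Because the set of preloaded classes and resources is large, the time spent in preload also deserves attention when optimizing boot time.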
static void preload(BootTimingsTraceLog bootTimingsTraceLog) {
.........
// Pin ICU data: character set conversion resources and the like
beginIcuCachePinning();
.........
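// Read the file system/etc/preloaded-classes and load each listed class via reflection.
// The list is usually defined by the vendor and can contain thousands of classes, one reason booting is slow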
preloadClasses();
..........
// Loads commonly used system resources
preloadResources();
........
// Graphics related
preloadOpenGL();
.......
// A few required shared libraries
preloadSharedLibraries();
// Language-related text resources
preloadTextResources();
// Ask the WebViewFactory to do any initialization that must run in the zygote process,
//for memory sharing purposes.
WebViewFactory.prepareWebViewInZygote();
endIcuCachePinning();
// Security related
warmUpJcaProviders();
Log.d(TAG, "end preload");
sPreloadComplete = true;
}
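- preloadClasses() reads the list at /system/etc/preloaded-classes. The list is long, and each class is loaded with Class.forName().
- preloadResources() mainly preloads com.android.internal.R.array.preloaded_drawables and com.android.internal.R.array.preloaded_color_state_lists. These resources come from framework-res.apk, which you can decompile to inspect. Resources that an application sees under com.android.internal.R.xxx were loaded into memory by zygote at this point.
After these steps the zygote process holds everything loaded by preload(). When a new process has to be forked, COW (copy-on-write) comes into play; a figure borrowed from gityuan, who has since left the Android scene, illustrates this:
Copy-on-write improves app startup speed because it reuses what is already in the parent's memory as much as possible, and sharing the preloaded libraries lowers overall memory use. When a new app is created through fork(), memory is not copied, because copying is expensive; the child simply shares the parent's pages, which are identical at that moment. Only when the child needs to modify a shared page is that page copied into the child's own address space and then changed. This is why our apps can use the preloaded resources, shared libraries and so on without loading them again, and why the com.android.internal.R.xxx resources are already available in every app process.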
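As a rough illustration of the class preloading described above, the sketch below reads a list of class names and loads each one with Class.forName. It assumes a file format of one class name per line, with blank lines and lines starting with # ignored; PreloadSketch and its method are hypothetical names, and the real preloadClasses() does considerably more bookkeeping.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical sketch of loading a preloaded-classes style list. */
final class PreloadSketch {
    /** Returns how many classes were loaded successfully. */
    static int preloadClasses(BufferedReader reader) throws IOException {
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            // Assumed format: skip comments and blank lines.
            if (line.startsWith("#") || line.isEmpty()) {
                continue;
            }
            try {
                // initialize = true so static initializers run now, in the parent,
                // and forked children inherit the initialized class.
                Class.forName(line, true, PreloadSketch.class.getClassLoader());
                count++;
            } catch (ClassNotFoundException e) {
                System.err.println("Class not found for preloading: " + line);
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String list = "# sample list\njava.lang.String\n\njava.util.ArrayList\n";
        System.out.println(preloadClasses(new BufferedReader(new StringReader(list))));
    }
}
```

Running the static initializers in the parent is the point of the exercise: the work is paid once in zygote instead of once per app.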
7.4 forkSystemServer
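This code is defined in frameworks/base/core/java/com/android/internal/os/ZygoteInit.java. Its job is to prepare the arguments and start the system_server process. The Android Java system services started later all live in that process, which makes it the core of the Android framework. The arguments set the uid and gid, the process name and the class name for system_server. After the fork, the child closes the zygote server socket it inherited; in addition, on devices with two zygote processes, it waits for the second zygote to come up.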
/**
* Prepare the arguments and forks for the system server process.
*
* Returns an {@code Runnable} that provides an entrypoint into system_server code in the
* child process, and {@code null} in the parent.
*/
private static Runnable forkSystemServer(String abiList, String socketName,
ZygoteServer zygoteServer) {
long capabilities = posixCapabilitiesAsBits(
OsConstants.CAP_IPC_LOCK,
OsConstants.CAP_KILL,
OsConstants.CAP_NET_ADMIN,
OsConstants.CAP_NET_BIND_SERVICE,
OsConstants.CAP_NET_BROADCAST,
OsConstants.CAP_NET_RAW,
OsConstants.CAP_SYS_MODULE,
OsConstants.CAP_SYS_NICE,
OsConstants.CAP_SYS_PTRACE,
OsConstants.CAP_SYS_TIME,
OsConstants.CAP_SYS_TTY_CONFIG,
OsConstants.CAP_WAKE_ALARM,
OsConstants.CAP_BLOCK_SUSPEND
);
/* Containers run without some capabilities, so drop any caps that are not available. */
StructCapUserHeader header = new StructCapUserHeader(
OsConstants._LINUX_CAPABILITY_VERSION_3, 0);
StructCapUserData[] data;
try {
data = Os.capget(header);
} catch (ErrnoException ex) {
throw new RuntimeException("Failed to capget()", ex);
}
capabilities &= ((long) data[0].effective) | (((long) data[1].effective) << 32);
/* Hardcoded command line to start the system server */
String args[] = { // prepare the arguments
"--setuid=1000",
"--setgid=1000",
"--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1024,1032,1065,3001,3002,3003,3006,3007,3009,3010",
"--capabilities=" + capabilities + "," + capabilities,
"--nice-name=system_server",
"--runtime-args",
"--target-sdk-version=" + VMRuntime.SDK_VERSION_CUR_DEVELOPMENT,
"com.android.server.SystemServer",
};
ZygoteConnection.Arguments parsedArgs = null;
int pid;
try {
// Parse the arguments into the structured form used below
parsedArgs = new ZygoteConnection.Arguments(args);
ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);
boolean profileSystemServer = SystemProperties.getBoolean(
"dalvik.vm.profilesystemserver", false);
if (profileSystemServer) {
parsedArgs.runtimeFlags |= Zygote.PROFILE_SYSTEM_SERVER;
}
/* Request to fork the system server process */
// Fork the child process that will run system_server
pid = Zygote.forkSystemServer(
parsedArgs.uid, parsedArgs.gid,
parsedArgs.gids,
parsedArgs.runtimeFlags,
null,
parsedArgs.permittedCapabilities,
parsedArgs.effectiveCapabilities);
} catch (IllegalArgumentException ex) {
throw new RuntimeException(ex);
}
/* For child process */
if (pid == 0) { // we are in the child process, system_server
if (hasSecondZygote(abiList)) {
waitForSecondaryZygote(socketName);
}
zygoteServer.closeServerSocket();
// Finish the remaining setup of the system_server process
return handleSystemServerProcess(parsedArgs);
}
return null;
}
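The capability handling at the top of forkSystemServer is easier to see with concrete numbers. The sketch below builds a bit mask from capability numbers and then drops every bit that is not in the effective set, which is what the "capabilities &= ..." line is meant to do. CapabilityMaskSketch and its methods are hypothetical; the three constants use the standard Linux capability numbers; and the & 0xFFFFFFFFL masks are my own addition to keep the int-to-long widening from sign-extending, not something taken from the excerpt.

```java
/** Hypothetical sketch of the capability mask arithmetic. */
final class CapabilityMaskSketch {
    // Standard Linux capability numbers (see capabilities(7)).
    static final int CAP_KILL = 5;
    static final int CAP_NET_ADMIN = 12;
    static final int CAP_BLOCK_SUSPEND = 36;

    /** One bit per capability number, in a 64-bit mask. */
    static long capsAsBits(int... caps) {
        long result = 0;
        for (int cap : caps) {
            result |= (1L << cap);
        }
        return result;
    }

    /**
     * Keeps only the requested capabilities that are actually available.
     * effectiveLow holds capabilities 0..31, effectiveHigh holds 32..63.
     */
    static long maskWithEffective(long requested, int effectiveLow, int effectiveHigh) {
        long available = (effectiveLow & 0xFFFFFFFFL)
                | ((effectiveHigh & 0xFFFFFFFFL) << 32);
        return requested & available;
    }

    public static void main(String[] args) {
        long requested = capsAsBits(CAP_KILL, CAP_NET_ADMIN, CAP_BLOCK_SUSPEND);
        // Pretend the container only grants CAP_KILL and CAP_NET_ADMIN.
        int low = (1 << CAP_KILL) | (1 << CAP_NET_ADMIN);
        System.out.println(Long.toHexString(maskWithEffective(requested, low, 0)));
    }
}
```

The resulting value is what ends up in the "--capabilities=" argument, once as the permitted set and once as the effective set.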
7.5 ZygoteServer.runSelectLoop
This code is defined in frameworks/base/core/java/com/android/internal/os/ZygoteServer.java. At this stage zygote enters a loop and waits for AMS to contact it, so that it can spawn new apps.
/**
* Runs the zygote process's select loop. Accepts new connections as
* they happen, and reads commands from connections one spawn-request's
* worth at a time.
*/
Runnable runSelectLoop(String abiList) {
ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();
// mServerSocket is the server end of the socket, i.e. the zygote process itself. It is stored in fds[0]
fds.add(mServerSocket.getFileDescriptor());
peers.add(null);
while (true) {
// On every pass, rebuild the pollFds array of descriptors to watch
StructPollfd[] pollFds = new StructPollfd[fds.size()];
for (int i = 0; i < pollFds.length; ++i) {
pollFds[i] = new StructPollfd();
pollFds[i].fd = fds.get(i);
// We are interested in incoming data
pollFds[i].events = (short) POLLIN;
}
try {
// Poll: continue when an event arrives on any of pollFds, otherwise block here
Os.poll(pollFds, -1);
} catch (ErrnoException ex) {
throw new RuntimeException("poll failed", ex);
}
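/* Note that the entries are processed in reverse order. Some blog posts say this gives priority to data on established connections over new connection requests.
* I do not think that is quite accurate: reverse order does serve established connections first, but among them the most recently established connection with pending data is served before older ones.
* mServerSocket is handled last, which is the case where a new client is asking to connect.
*/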
for (int i = pollFds.length - 1; i >= 0; --i) {
// With I/O multiplexing, continue below only if this fd has a connection request or data pending;
// otherwise skip to the next fd.
if ((pollFds[i].revents & POLLIN) == 0) {
continue;
}
if (i == 0) {
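// i == 0 is fds[0], the mServerSocket that was added first, so a client is requesting a connection;
// create a ZygoteConnection for it and add it to fds.
ZygoteConnection newPeer = acceptCommandPeer(abiList);
// Add it to peers and fds so that it is polled from the next pass on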
peers.add(newPeer);
fds.add(newPeer.getFileDesciptor());
} else {
try {
// i > 0: data arrived from a connected peer, so read it and execute the command
ZygoteConnection connection = peers.get(i);
final Runnable command = connection.processOneCommand(this);
if (mIsForkChild) {
// We're in the child. We should always have a command to run at this
// stage if processOneCommand hasn't called "exec".
if (command == null) {
throw new IllegalStateException("command == null");
}
return command;
} else {
// We're in the server - we should never have any commands to run.
if (command != null) {
throw new IllegalStateException("command != null");
}
// We don't know whether the remote side of the socket was closed or
// not until we attempt to read from it from processOneCommand. This shows up as
// a regular POLLIN event in our regular processing loop.
if (connection.isClosedByPeer()) {
connection.closeSocket();
peers.remove(i);
fds.remove(i); // the peer closed the connection, so drop its descriptor from fds
}
}
} catch (Exception e) {
if (!mIsForkChild) {
// We're in the server so any exception here is one that has taken place
// pre-fork while processing commands or reading / writing from the
// control socket. Make a loud noise about any such exceptions so that
// we know exactly what failed and why.
Slog.e(TAG, "Exception executing zygote command: ", e);
// Make sure the socket is closed so that the other end knows immediately
// that something has gone wrong and doesn't time out waiting for a
// response.
ZygoteConnection conn = peers.remove(i);
conn.closeSocket();
fds.remove(i);
} else {
// We're in the child so any exception caught here has happened post
// fork and before we execute ActivityThread.main (or any other main()
// method). Log the details of the exception and bring down the process.
Log.e(TAG, "Caught post-fork exception in child process.", e);
throw e;
}
} finally {
// Reset the child flag, in the event that the child process is a child-
// zygote. The flag will not be consulted this loop pass after the Runnable
// is returned.
mIsForkChild = false;
}
}
}
}
}
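As the code above shows, zygote uses I/O multiplexing (poll): it sleeps while there is no connection request or pending data, and wakes up to serve clients otherwise. Initially fds contains only the server socket, so the first event takes the i == 0 branch. That means a new connection has to be set up, so acceptCommandPeer is called. Let us see what it actually does.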
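One practical benefit of walking the list backwards is that entries can be removed, and new ones appended, without disturbing the indices that are still to be visited. The sketch below simulates a single pass of such a loop with plain strings instead of file descriptors; ReverseLoopSketch, processPass and the ready/closedByPeer arrays are all hypothetical stand-ins for the poll results and ZygoteConnection state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical simulation of one pass of a runSelectLoop style loop. */
final class ReverseLoopSketch {
    /**
     * ready[i] says whether entry i has an event; closedByPeer[i] says whether
     * the peer hung up. Both are snapshots taken before the pass, like pollFds.
     */
    static List<String> processPass(List<String> fds, boolean[] ready, boolean[] closedByPeer) {
        List<String> log = new ArrayList<>();
        for (int i = ready.length - 1; i >= 0; --i) {
            if (!ready[i]) {
                continue;
            }
            if (i == 0) {
                // Entry 0 is the server socket: accept and start watching the new peer.
                fds.add("newPeer");
                log.add("accept -> newPeer");
            } else {
                log.add("command from " + fds.get(i));
                if (closedByPeer[i]) {
                    // Removing index i only shifts entries above i, all already visited.
                    fds.remove(i);
                }
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> fds = new ArrayList<>(Arrays.asList("server", "peerA", "peerB"));
        List<String> log = processPass(fds,
                new boolean[] {true, true, true},
                new boolean[] {false, true, false});
        System.out.println(log);
        System.out.println(fds);
    }
}
```

With a forward loop, removing peerA at index 1 would move peerB into index 1 and the loop would have to compensate; going backwards avoids that bookkeeping.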
7.6 ZygoteServer.acceptCommandPeer
The code is as follows:
/**
* Waits for and accepts a single command connection. Throws
* RuntimeException on failure.
*/
private ZygoteConnection acceptCommandPeer(String abiList) {
try {
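// In socket programming, accept() is used with connection-oriented socket types such as SOCK_STREAM and SOCK_SEQPACKET.
// It takes the first pending request from the listening socket's queue, creates a new socket, and returns a file descriptor for it.
// The new socket is not in the listening state, and the original listening socket is unaffected by accept(). This is basic socket programming.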
return createNewConnection(mServerSocket.accept(), abiList);
} catch (IOException ex) {
throw new RuntimeException(
"IOException during accept()", ex);
}
}
protected ZygoteConnection createNewConnection(LocalSocket socket, String abiList)
throws IOException {
return new ZygoteConnection(socket, abiList);
}
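As the code shows, acceptCommandPeer is basic socket programming: it calls accept on the server socket to wait for a client connection. When a new connection is established, zygote gets a new socket for talking to that client and adds it to fds. So once a client has connected, fds holds at least two sockets.
When poll detects incoming data on any of these sockets, it returns from blocking, and we then have to work out which socket received the data.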
7.7 ZygoteConnection.processOneCommand
This code is defined in frameworks/base/core/java/com/android/internal/os/ZygoteConnection.java. It parses the arguments sent by the socket client (AMS) and then calls forkAndSpecialize to create the app process.
/**
* Reads one start command from the command socket. If successful, a child is forked and a
* {@code Runnable} that calls the childs main method (or equivalent) is returned in the child
* process. {@code null} is always returned in the parent process (the zygote).
*
* If the client closes the socket, an {@code EOF} condition is set, which callers can test
* for by calling {@code ZygoteConnection.isClosedByPeer}.
*/
Runnable processOneCommand(ZygoteServer zygoteServer) {
String args[];
Arguments parsedArgs = null;
FileDescriptor[] descriptors;
try {
// Read the argument list sent by the socket client
args = readArgumentList();
descriptors = mSocket.getAncillaryFileDescriptors();
} catch (IOException ex) {
throw new IllegalStateException("IOException on command socket", ex);
}
// readArgumentList returns null only when it has reached EOF with no available
// data to read. This will only happen when the remote socket has disconnected.
if (args == null) {
isEof = true;
return null;
}
int pid = -1;
FileDescriptor childPipeFd = null;
FileDescriptor serverPipeFd = null;
// Parse the arguments received from the socket client into an Arguments object
parsedArgs = new Arguments(args);
......
pid = Zygote.forkAndSpecialize(parsedArgs.uid, parsedArgs.gid, parsedArgs.gids,
parsedArgs.runtimeFlags, rlimits, parsedArgs.mountExternal, parsedArgs.seInfo,
parsedArgs.niceName, fdsToClose, fdsToIgnore, parsedArgs.startChildZygote,
parsedArgs.instructionSet, parsedArgs.appDataDir);
try {
if (pid == 0) {
// in child
// Executed in the child process
zygoteServer.setForkChild();
zygoteServer.closeServerSocket();
IoUtils.closeQuietly(serverPipeFd);
serverPipeFd = null;
return handleChildProc(parsedArgs, descriptors, childPipeFd,
parsedArgs.startChildZygote);
} else {
// In the parent. A pid < 0 indicates a failure and will be handled in
// handleParentProc.
// Executed in the parent process
IoUtils.closeQuietly(childPipeFd);
childPipeFd = null;
handleParentProc(pid, descriptors, serverPipeFd);
return null;
}
} finally {
IoUtils.closeQuietly(childPipeFd);
IoUtils.closeQuietly(serverPipeFd);
}
}
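That completes the walk through runSelectLoop. It polls in reverse order, and since the server socket was the first entry added to fds, it is visited last. Only that last entry needs to handle new connections; when any other socket receives data, the loop simply calls processOneCommand on its ZygoteConnection to carry out the request. Once the peer closes a connection, its socket and ZygoteConnection are removed from the lists.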
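The excerpts show zygoteSendArgsAndGetResult on the client side and readArgumentList on the zygote side, but not the bytes in between. The sketch below assumes a simple text format, a line with the argument count followed by one argument per line, which is how we read the Android 9 sources; treat it as an illustration rather than the exact implementation. ZygoteWireSketch, encode and decode are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a count-prefixed, line-based argument format. */
final class ZygoteWireSketch {
    /** Client side: count line, then one line per argument. */
    static String encode(List<String> args) {
        StringBuilder sb = new StringBuilder();
        sb.append(args.size()).append('\n');
        for (String arg : args) {
            if (arg.indexOf('\n') >= 0) {
                // A newline inside an argument would corrupt the framing.
                throw new IllegalArgumentException("embedded newlines not allowed");
            }
            sb.append(arg).append('\n');
        }
        return sb.toString();
    }

    /** Server side: returns null on EOF, mirroring the "args == null" check above. */
    static String[] decode(BufferedReader reader) throws IOException {
        String countLine = reader.readLine();
        if (countLine == null) {
            return null; // peer closed the connection
        }
        int argc = Integer.parseInt(countLine);
        String[] result = new String[argc];
        for (int i = 0; i < argc; i++) {
            result[i] = reader.readLine();
            if (result[i] == null) {
                throw new IOException("truncated request");
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String wire = encode(Arrays.asList("--setuid=10001", "--nice-name=com.example.app"));
        System.out.println(Arrays.toString(decode(new BufferedReader(new StringReader(wire)))));
    }
}
```

The null return on EOF is what lets the select loop notice that a client has gone away and clean up its entry.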
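8. Summary
This is essentially the end of the zygote startup story. The call flow of zygote startup is shown in the following figure:
Looking back, zygote startup does the following major things: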
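- Parses the arguments passed in from init.zygotexxx.rc, creates AppRuntime, and calls AppRuntime.start().
- Calls AndroidRuntime's startVm() to create the virtual machine, then startReg() to register the JNI functions.
- Once the virtual machine and the JNI environment are in place, calls ZygoteInit.main() through JNI and formally enters the Java world.
- Calls registerServerSocketFromEnv() to set up the socket channel, with zygote as the server end that answers client requests. It is worth asking yourself here why a socket is used instead of Binder.
- preload() preloads common classes, drawable and color resources, OpenGL, shared libraries and WebView, which speeds up app startup.
- With most of its work done, zygote calls forkSystemServer() to fork its right-hand helper, the system_server process, which hosts the upper-layer framework.
- Its job done, zygote calls runSelectLoop() and stays on standby; when a request to create a new process arrives, it wakes up immediately and handles it.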
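We said earlier that zygote is the first Java process, but as the whole analysis shows, a Java process is really running on top of a C++ process; the Java virtual machine simply hides that fact. Zygote startup moves step by step from the C++ world into the Java world, and each world does its own share of the preparation.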
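The C++ world (entry point app_main.cpp):
- Dynamically loads the VM shared library and starts the Java virtual machine
- Registers the native JNI functions, which lightens the VM's load
- Loads ZygoteInit into the Java virtual machine and formally enters the Java world
The Java world (entry point ZygoteInit.java):
- Binds the socket that receives requests to run new Android applications
- Preloads Android resources to speed up app process startup
- Starts and runs SystemServer (which hosts core services such as AMS and PMS)
- Handles requests to run new Android applications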
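Zygote startup is not particularly hard to understand; it is mostly tedious. Reading source code is dry work, and you only get something out of it if you settle down and take your time.
This article skips many details and mainly presents the overall flow. If you spot any mistakes, please point them out, and if you found it useful, a like is appreciated.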