Symptom: when the system server process crashes, the zygote process is killed as well, after which both zygote and system server are restarted.
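Analysis: when init parses init.rc, the Zygote process is defined as a service and declared to restart automatically. So as soon as the Zygote process exits, init receives the child-exit signal and restarts the zygote service, and Zygote in turn starts System Server. Likewise, once System Server has been started by Zygote as a child process, Zygote watches that child's state through a signal; if the child exits, Zygote kills itself and waits for init to run it again. In addition, the system server process watches the service manager process: if service manager exits, system server kills itself, which in turn causes zygote to be restarted.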
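The restart behaviour described above can be illustrated with a small standalone sketch. It is not init's code: `run_service` and `supervise` are hypothetical names, the "service" is a child that simply exits, and the loop is bounded so that the example terminates. It only shows the pattern of a parent that waits for its child to exit and then starts it again.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a service such as zygote: it exits at once. */
static void run_service(void)
{
    _exit(1);
}

/*
 * Hypothetical init-like supervisor: start the service, block until it
 * exits, then start it again, for at most max_starts starts.
 * Returns the number of times the service was started, or -1 on error.
 */
static int supervise(int max_starts)
{
    int starts = 0;

    assert(max_starts >= 0);

    while (starts < max_starts) {
        int status;
        pid_t pid = fork();

        if (pid < 0) {
            return -1;              /* fork failed */
        }
        if (pid == 0) {
            run_service();          /* child: never returns */
        }
        starts++;

        /* Parent: wait for this child to exit before restarting it. */
        if (waitpid(pid, &status, 0) < 0) {
            return -1;
        }
    }
    return starts;
}
```

Calling `supervise(3)` from a `main` function starts the stand-in service three times in sequence. A real init reacts to SIGCHLD instead of blocking in `waitpid`, and keeps restarting the service without an upper bound.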
The relevant code follows.
Entry point where Zygote starts system server:
libcore/dalvik/src/main/java/dalvik/system/Zygote.java
/**
 * Special method to start the system server process.
 * @deprecated use {@link Zygote#forkSystemServer(int, int, int[], int, int[][])}
 */
@Deprecated
public static int forkSystemServer(int uid, int gid, int[] gids,
        boolean enableDebugger, int[][] rlimits) {
    int debugFlags = enableDebugger ? DEBUG_ENABLE_DEBUGGER : 0;
    return forkAndSpecialize(uid, gid, gids, debugFlags, rlimits);
}
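forkAndSpecialize is a JNI function; for its definition see Dalvik_dalvik_system_Zygote_fork(). That function registers a signal handler which checks the pid whenever a child process exits, and kills the current process (the zygote process) only if the terminated child's pid is that of system server.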
dalvik_system_Zygote.c
/* native public static int fork(); */
static void Dalvik_dalvik_system_Zygote_fork(const u4* args, JValue* pResult)
{
    pid_t pid;

    if (!gDvm.zygote) {
        dvmThrowException("Ljava/lang/IllegalStateException;",
            "VM instance not started with -Xzygote");
        RETURN_VOID();
    }

    if (!dvmGcPreZygoteFork()) {
        LOGE("pre-fork heap failed\n");
        dvmAbort();
    }

    setSignalHandler(); // register the signal handler here to monitor child process state

    dvmDumpLoaderStats("zygote");
    pid = fork();

#ifdef HAVE_ANDROID_OS
    if (pid == 0) {
        /* child process */
        extern int gMallocLeakZygoteChild;
        gMallocLeakZygoteChild = 1;
    }
#endif

    RETURN_INT(pid);
}
/*
 * configure sigchld handler for the zygote process
 * This is configured very late, because earlier in the dalvik lifecycle
 * we can fork() and exec() for the verifier/optimizer, and we
 * want to waitpid() for those rather than have them be harvested immediately.
 *
 * This ends up being called repeatedly before each fork(), but there's
 * no real harm in that.
 */
static void setSignalHandler()
{
    int err;
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigchldHandler; // address of the signal handler function
    err = sigaction (SIGCHLD, &sa, NULL); // install the handler invoked when a child process terminates
    if (err < 0) {
        LOGW("Error setting SIGCHLD handler: %s", strerror(errno));
    }
}
/*
 * This signal handler is for zygote mode, since the zygote
 * must reap its children
 */
static void sigchldHandler(int s)
{
    pid_t pid;
    int status;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) { // get the pid of the terminated child process
/* Log process-death status that we care about. In general it is not
safe to call LOG(...) from a signal handler because of possible
reentrancy. However, we know a priori that the current implementation
of LOG() is safe to call from a SIGCHLD handler in the zygote process.
If the LOG() implementation changes its locking s