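I previously wrote an introduction to init in Android 5.0. This post covers init in Android 6.0, because the earlier one did not explain some of the code in detail and its walkthrough of the init.rc parsing code was not tied to init.rc itself.
I. Preparation work in the main function
Let's walk through the source: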
int main(int argc, char** argv) {
if (!strcmp(basename(argv[0]), "ueventd")) {
return ueventd_main(argc, argv);
}
if (!strcmp(basename(argv[0]), "watchdogd")) {
return watchdogd_main(argc, argv);
}
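ueventd and watchdogd share the init binary, so at startup the program checks the name it was invoked under to decide which process to run. Continuing: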
// Clear the umask.
umask(0);
add_environment("PATH", _PATH_DEFPATH); // add the PATH environment variable
bool is_first_stage = (argc == 1) || (strcmp(argv[1], "--second-stage") != 0);
// Get the basic filesystem setup we need put together in the initramdisk
// on / and then we'll let the rc file figure out the rest.
if (is_first_stage) {
mount("tmpfs", "/dev", "tmpfs", MS_NOSUID, "mode=0755");
mkdir("/dev/pts", 0755);
mkdir("/dev/socket", 0755);
mount("devpts", "/dev/pts", "devpts", 0, NULL);
mount("proc", "/proc", "proc", 0, NULL);
mount("sysfs", "/sys", "sysfs", 0, NULL);
}
This code adds the environment variable and, in the first stage only, mounts the basic filesystems.
open_devnull_stdio();
klog_init();
klog_set_level(KLOG_NOTICE_LEVEL); // initialize kernel logging
NOTICE("init%s started!\n", is_first_stage ? "" : " second stage");
if (!is_first_stage) {
// Indicate that booting is in progress to background fw loaders, etc.
close(open("/dev/.booting", O_WRONLY | O_CREAT | O_CLOEXEC, 0000)); // create the .booting file while the system is booting
property_init(); // initialize the property area
// If arguments are passed both on the command line and in DT,
// properties set in DT always have priority over the command-line ones.
process_kernel_dt();
process_kernel_cmdline();
// Propogate the kernel variables to internal variables
// used by init as well as the current required properties.
export_kernel_boot_props(); // set a few properties
}
Notice the is_first_stage variable here. Looking further down, in the first stage init calls execv to re-execute itself, this time passing "--second-stage" as an argument.
if (is_first_stage) {
if (restorecon("/init") == -1) {
ERROR("restorecon failed: %s\n", strerror(errno));
security_failure();
}
char* path = argv[0];
char* args[] = { path, const_cast<char*>("--second-stage"), nullptr };
if (execv(path, args) == -1) {
ERROR("execv(\"%s\") failed: %s\n", path, strerror(errno));
security_failure();
}
}
On this second run is_first_stage is false, because the argument list now contains "--second-stage":
bool is_first_stage = (argc == 1) || (strcmp(argv[1], "--second-stage") != 0);
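Going back to the code above: open_devnull_stdio redirects standard input, output, and error to a null device. In the second stage init then creates the .booting file to indicate that the system is booting, initializes the property area, and reads and sets some boot-related system properties. Here is open_devnull_stdio: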
void open_devnull_stdio(void)
{
// Try to avoid the mknod() call if we can. Since SELinux makes
// a /dev/null replacement available for free, let's use it.
int fd = open("/sys/fs/selinux/null", O_RDWR);
if (fd == -1) {
// OOPS, /sys/fs/selinux/null isn't available, likely because
// /sys/fs/selinux isn't mounted. Fall back to mknod.
static const char *name = "/dev/__null__";
if (mknod(name, S_IFCHR | 0600, (1 << 8) | 3) == 0) {
fd = open(name, O_RDWR);
unlink(name);
}
if (fd == -1) {
exit(1);
}
}
dup2(fd, 0);
dup2(fd, 1);
dup2(fd, 2);
if (fd > 2) {
close(fd);
}
}
property_init() initializes the property system; I covered it in my earlier post on the property system.
Next, process_kernel_dt:
static void process_kernel_dt(void)
{
static const char android_dir[] = "/proc/device-tree/firmware/android";
std::string file_name = android::base::StringPrintf("%s/compatible", android_dir);
std::string dt_file;
android::base::ReadFileToString(file_name, &dt_file);
if (!dt_file.compare("android,firmware")) { // compare the contents of the compatible file with "android,firmware"
ERROR("firmware/android is not compatible with 'android,firmware'\n");
return;
}
std::unique_ptr<DIR, int(*)(DIR*)>dir(opendir(android_dir), closedir);
if (!dir)
return;
struct dirent *dp;
while ((dp = readdir(dir.get())) != NULL) { // visit every file in the directory
if (dp->d_type != DT_REG || !strcmp(dp->d_name, "compatible"))
continue;
file_name = android::base::StringPrintf("%s/%s", android_dir, dp->d_name);
android::base::ReadFileToString(file_name, &dt_file);
std::replace(dt_file.begin(), dt_file.end(), ',', '.');
std::string property_name = android::base::StringPrintf("ro.boot.%s", dp->d_name); // file name becomes the property name, file contents the value
property_set(property_name.c_str(), dt_file.c_str());
}
}
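This function works in the /proc/device-tree/firmware/android directory. It first checks the contents of the compatible file against android,firmware (note that std::string::compare returns 0 on equality, so as quoted the early return fires only on an exact match). It then treats each remaining file name as a property name and the file contents as the property value. On our device that yields two properties, ro.boot.hardware and ro.boot.name: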
root@lte26007:/proc/device-tree/firmware/android # ls
compatible
hardware
name
Next, process_kernel_cmdline:
static void process_kernel_cmdline(void)
{
/* don't expose the raw commandline to nonpriv processes */
chmod("/proc/cmdline", 0440);
/* first pass does the common stuff, and finds if we are in qemu.
* second pass is only necessary for qemu to export all kernel params
* as props.
*/
import_kernel_cmdline(false, import_kernel_nv);
if (qemu[0])
import_kernel_cmdline(true, import_kernel_nv);
}
import_kernel_cmdline reads the contents of /proc/cmdline, splits them on spaces, and calls import_kernel_nv on each token to set system properties.
void import_kernel_cmdline(bool in_qemu, std::function<void(char*,bool)> import_kernel_nv)
{
char cmdline[2048];
char *ptr;
int fd;
fd = open("/proc/cmdline", O_RDONLY | O_CLOEXEC);
if (fd >= 0) {
int n = read(fd, cmdline, sizeof(cmdline) - 1);
if (n < 0) n = 0;
/* get rid of trailing newline, it happens */
if (n > 0 && cmdline[n-1] == '\n') n--;
cmdline[n] = 0;
close(fd);
} else {
cmdline[0] = 0;
}
ptr = cmdline;
while (ptr && *ptr) {
char *x = strchr(ptr, ' ');
if (x != 0) *x++ = 0;
import_kernel_nv(ptr, in_qemu);
ptr = x;
}
}
import_kernel_nv sets the properties, but only tokens with the androidboot. prefix produce ro.boot.* properties. The cmdline on our device contains no such tokens, so none of these properties are set.
static void import_kernel_nv(char *name, bool for_emulator)
{
char *value = strchr(name, '=');
int name_len = strlen(name);
if (value == 0) return;
*value++ = 0;
if (name_len == 0) return;
if (for_emulator) {
/* in the emulator, export any kernel option with the
* ro.kernel. prefix */
char buff[PROP_NAME_MAX];
int len = snprintf( buff, sizeof(buff), "ro.kernel.%s", name );
if (len < (int)sizeof(buff))
property_set( buff, value );
return;
}
if (!strcmp(name,"qemu")) {
strlcpy(qemu, value, sizeof(qemu));
} else if (!strncmp(name, "androidboot.", 12) && name_len > 12) {
const char *boot_prop_name = name + 12;
char prop[PROP_NAME_MAX];
int cnt;
cnt = snprintf(prop, sizeof(prop), "ro.boot.%s", boot_prop_name);
if (cnt < PROP_NAME_MAX)
property_set(prop, value);
}
}
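The mapping from kernel command line tokens to ro.boot.* properties can be sketched in isolation. This is a minimal illustration, not the AOSP code: boot_props_from_cmdline is a hypothetical helper, a std::map stands in for the property store, tokens are split on any whitespace rather than only on spaces, and the PROP_NAME_MAX length check and the qemu path are left out. emplace keeps the first value seen, roughly mirroring the fact that ro.* properties can only be set once.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: returns the ro.boot.* properties that the non-emulator
// branch of import_kernel_nv() would derive from a kernel command line.
std::map<std::string, std::string> boot_props_from_cmdline(const std::string& cmdline) {
    std::map<std::string, std::string> props;
    const std::string prefix = "androidboot.";
    std::istringstream in(cmdline);
    std::string token;
    while (in >> token) {
        std::string::size_type eq = token.find('=');
        if (eq == std::string::npos) continue;  // tokens without '=' are ignored
        std::string name = token.substr(0, eq);
        std::string value = token.substr(eq + 1);
        // Only names that start with "androidboot." and have something after it.
        if (name.size() > prefix.size() && name.compare(0, prefix.size(), prefix) == 0) {
            props.emplace("ro.boot." + name.substr(prefix.size()), value);
        }
    }
    return props;
}
```

For example, a command line such as "console=ttyS0 androidboot.hardware=foo quiet" (the values are made up) produces a single entry, ro.boot.hardware=foo; console and quiet are dropped.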
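Now export_kernel_boot_props. It sets a few ro.* properties from the corresponding ro.boot.* values, falling back to a default (usually unknown) when the source property is not set. Since we have ro.boot.hardware from earlier, ro.hardware is set from it.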
static void export_kernel_boot_props() {
struct {
const char *src_prop;
const char *dst_prop;
const char *default_value;
} prop_map[] = {
//{ "ro.boot.serialno", "ro.serialno", "", },
{ "ro.boot.mode", "ro.bootmode", "unknown", },
{ "ro.boot.baseband", "ro.baseband", "unknown", },
{ "ro.boot.bootloader", "ro.bootloader", "unknown", },
{ "ro.boot.hardware", "ro.hardware", "unknown", },
{ "ro.boot.revision", "ro.revision", "0", },
};
for (size_t i = 0; i < ARRAY_SIZE(prop_map); i++) {
char value[PROP_VALUE_MAX];
int rc = property_get(prop_map[i].src_prop, value);
property_set(prop_map[i].dst_prop, (rc > 0) ? value : prop_map[i].default_value);
}
}
The next part is all SELinux-related, so we won't analyze it.
// Set up SELinux, including loading the SELinux policy if we're in the kernel domain.
selinux_initialize(is_first_stage);
// If we're in the kernel domain, re-exec init to transition to the init domain now
// that the SELinux policy has been loaded.
if (is_first_stage) {
if (restorecon("/init") == -1) {
ERROR("restorecon failed: %s\n", strerror(errno));
security_failure();
}
char* path = argv[0];
char* args[] = { path, const_cast<char*>("--second-stage"), nullptr };
if (execv(path, args) == -1) {
ERROR("execv(\"%s\") failed: %s\n", path, strerror(errno));
security_failure();
}
}
// These directories were necessarily created before initial policy load
// and therefore need their security context restored to the proper value.
// This must happen before /dev is populated by ueventd.
INFO("Running restorecon...\n");
restorecon("/dev");
restorecon("/dev/socket");
restorecon("/dev/__properties__");
restorecon_recursive("/sys");
Then init creates an epoll fd:
epoll_fd = epoll_create1(EPOLL_CLOEXEC);
if (epoll_fd == -1) {
ERROR("epoll_create1 failed: %s\n", strerror(errno));
exit(1);
}
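Continuing: signal_handler_init deals with the signal the parent receives when a child process dies (for example, when it is killed). The signal handler writes to one end of a socketpair, and the fd of the other end is registered with epoll. A later section is devoted to this. property_load_boot_defaults parses the properties in default.prop in the root directory and sets them. start_property_service adds the fd of the property socket to epoll and registers its handler; the property system was covered in an earlier post.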
signal_handler_init();
property_load_boot_defaults();
start_property_service();
Here is signal_handler_init, which sets up the handling for child process termination:
static void SIGCHLD_handler(int) {
if (TEMP_FAILURE_RETRY(write(signal_write_fd, "1", 1)) == -1) {
ERROR("write(signal_write_fd) failed: %s\n", strerror(errno));
}
}
void signal_handler_init() {
// Create a signalling mechanism for SIGCHLD.
int s[2];
if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, s) == -1) {
ERROR("socketpair failed: %s\n", strerror(errno));
exit(1);
}
signal_write_fd = s[0];
signal_read_fd = s[1];
// Write to signal_write_fd if we catch SIGCHLD.
struct sigaction act;
memset(&act, 0, sizeof(act));
act.sa_handler = SIGCHLD_handler;
act.sa_flags = SA_NOCLDSTOP;
sigaction(SIGCHLD, &act, 0);
reap_any_outstanding_children();
register_epoll_handler(signal_read_fd, handle_signal);
}
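The socketpair-plus-epoll pattern used by signal_handler_init can be tried outside of init. The sketch below is a standalone, Linux-only illustration under stated assumptions: sigchld_wakes_epoll and on_sigchld are hypothetical names, raise(SIGCHLD) stands in for a real child exiting, and error handling is reduced to returning false.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

static int g_signal_write_fd = -1;

// The handler only does async-signal-safe work: it writes one byte.
static void on_sigchld(int) {
    int saved_errno = errno;
    char byte = '1';
    ssize_t n = write(g_signal_write_fd, &byte, 1);
    (void)n;
    errno = saved_errno;
}

// Returns true if a SIGCHLD delivered to this process makes epoll_wait() wake up.
bool sigchld_wakes_epoll() {
    int s[2];
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, s) == -1) {
        return false;
    }
    g_signal_write_fd = s[0];
    int read_fd = s[1];

    struct sigaction act = {};
    struct sigaction old_act = {};
    act.sa_handler = on_sigchld;
    act.sa_flags = SA_NOCLDSTOP;
    sigaction(SIGCHLD, &act, &old_act);

    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd == -1) {
        sigaction(SIGCHLD, &old_act, nullptr);
        close(s[0]);
        close(s[1]);
        return false;
    }
    epoll_event ev = {};
    ev.events = EPOLLIN;
    ev.data.fd = read_fd;
    epoll_ctl(epfd, EPOLL_CTL_ADD, read_fd, &ev);

    raise(SIGCHLD);  // stand-in for a child process exiting

    epoll_event out = {};
    int nr = epoll_wait(epfd, &out, 1, 1000);
    bool woke = (nr == 1 && out.data.fd == read_fd);
    if (woke) {
        char buf[16];
        ssize_t n = read(read_fd, buf, sizeof(buf));  // drain the wake-up byte
        (void)n;
    }

    sigaction(SIGCHLD, &old_act, nullptr);  // restore the previous handler
    close(epfd);
    close(s[0]);
    close(s[1]);
    return woke;
}
```

The point of the design is that the signal handler itself does almost nothing; the real work (reaping children, restarting services) happens later in the main loop, when epoll reports the read end as readable.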
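II. Parsing init.rc
Now let's look at how init.rc is parsed, reading the code alongside init.rc itself.
For the init.rc language, see this post: http://blog.csdn.net/kc58236582/article/details/52042331. The main point is that init.rc has two kinds of sections, Actions and Services.
init_parse_config_file parses init.rc: it reads the file into data and then calls parse_config to parse it.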
int init_parse_config_file(const char* path) {
INFO("Parsing %s...\n", path);
Timer t;
std::string data;
if (!read_file(path, &data)) {
return -1;
}
data.push_back('\n'); // TODO: fix parse_config.
parse_config(path, data);
dump_parser_state();
NOTICE("(Parsing %s took %.2fs.)\n", path, t.duration());
return 0;
}
First, dump_parser_state. Once parsing is finished, this function can print every service and action, although its body is disabled by if (false) as shipped.
void dump_parser_state() {
if (false) {
struct listnode* node;
list_for_each(node, &service_list) {
service* svc = node_to_item(node, struct service, slist);
INFO("service %s\n", svc->name);
INFO(" class '%s'\n", svc->classname);
INFO(" exec");
for (int n = 0; n < svc->nargs; n++) {
INFO(" '%s'", svc->args[n]);
}
INFO("\n");
for (socketinfo* si = svc->sockets; si; si = si->next) {
INFO(" socket %s %s 0%o\n", si->name, si->type, si->perm);
}
}
list_for_each(node, &action_list) {
action* act = node_to_item(node, struct action, alist);
INFO("on ");
char name_str[256] = "";
build_triggers_string(name_str, sizeof(name_str), act);
INFO("%s", name_str);
INFO("\n");
struct listnode* node2;
list_for_each(node2, &act->commands) {
command* cmd = node_to_item(node2, struct command, clist);
INFO(" %p", cmd->func);
for (int n = 0; n < cmd->nargs; n++) {
INFO(" %s", cmd->args[n]);
}
INFO("\n");
}
INFO("\n");
}
}
}
Back to the main topic: parse_config parses the data read from init.rc.
static void parse_config(const char *fn, const std::string& data)
{
struct listnode import_list;
struct listnode *node;
char *args[INIT_PARSER_MAXARGS];
int nargs = 0;
parse_state state;
state.filename = fn;
state.line = 0;
state.ptr = strdup(data.c_str()); // TODO: fix this code!
state.nexttoken = 0;
state.parse_line = parse_line_no_op; // starts out as a no-op
list_init(&import_list);
state.priv = &import_list;
for (;;) {
switch (next_token(&state)) {
case T_EOF:
state.parse_line(&state, 0, 0);
goto parser_done;
case T_NEWLINE:
state.line++;
if (nargs) {
int kw = lookup_keyword(args[0]);
if (kw_is(kw, SECTION)) {
state.parse_line(&state, 0, 0);
parse_new_section(&state, kw, nargs, args);
} else {
state.parse_line(&state, nargs, args);
}
nargs = 0;
}
break;
case T_TEXT:
if (nargs < INIT_PARSER_MAXARGS) {
args[nargs++] = state.text;
}
break;
}
}
parser_done:
list_for_each(node, &import_list) {
struct import *import = node_to_item(node, struct import, list);
int ret;
ret = init_parse_config_file(import->filename);
if (ret)
ERROR("could not import file '%s' from '%s'\n",
import->filename, fn);
}
}
Let's start with next_token:
int next_token(struct parse_state *state)
{
char *x = state->ptr;
char *s;
if (state->nexttoken) { // 0 on the first call
int t = state->nexttoken;
state->nexttoken = 0;
return t;
}
for (;;) {
switch (*x) {
case 0:
state->ptr = x;
return T_EOF;
case '\n':
x++;
state->ptr = x;
return T_NEWLINE;
case ' ':
case '\t':
case '\r':
x++;
continue;
case '#':
while (*x && (*x != '\n')) x++;
if (*x == '\n') {
state->ptr = x+1;
return T_NEWLINE;
} else {
state->ptr = x;
return T_EOF;
}
default: // ordinary text ends up here
goto text;
}
}
textdone:
state->ptr = x;
*s = 0;
return T_TEXT;
text:
state->text = s = x; // set state->text
textresume:
for (;;) {
switch (*x) {
case 0:
goto textdone;
case ' ':
case '\t':
case '\r': // whitespace ends the word: return T_TEXT
x++;
goto textdone;
case '\n':
state->nexttoken = T_NEWLINE; // a newline ends the word: remember T_NEWLINE for the next call
x++;
goto textdone;
case '"':
x++;
for (;;) {
switch (*x) {
case 0:
/* unterminated quoted thing */
state->ptr = x;
return T_EOF;
case '"':
x++;
goto textresume;
default:
*s++ = *x++;
}
}
break;
case '\\':
x++;
switch (*x) {
case 0:
goto textdone;
case 'n':
*s++ = '\n';
break;
case 'r':
*s++ = '\r';
break;
case 't':
*s++ = '\t';
break;
case '\\':
*s++ = '\\';
break;
case '\r':
/* \ <cr> <lf> -> line continuation */
if (x[1] != '\n') {
x++;
continue;
}
case '\n':
/* \ <lf> -> line continuation */
state->line++;
x++;
/* eat any extra whitespace */
while((*x == ' ') || (*x == '\t')) x++;
continue;
default:
/* unknown escape -- just copy */
*s++ = *x++;
}
continue;
default:
*s++ = *x++; // ordinary character: copy it and advance
}
}
return T_EOF;
}
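For this function we only need to know the following. In the ordinary case, when a word ends at a space rather than a newline, it returns T_TEXT and nexttoken stays 0.
On T_TEXT, parse_config just stores state.text in the args array and moves on. When a word ends at a newline (\n), next_token still returns T_TEXT, but nexttoken is set to T_NEWLINE.
So the next call returns T_NEWLINE immediately, and on T_NEWLINE parse_config calls lookup_keyword.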
for (;;) {
switch (next_token(&state)) {
case T_EOF:
state.parse_line(&state, 0, 0);
goto parser_done;
case T_NEWLINE:
state.line++;
if (nargs) {
int kw = lookup_keyword(args[0]);
if (kw_is(kw, SECTION)) {
state.parse_line(&state, 0, 0);
parse_new_section(&state, kw, nargs, args);
} else {
state.parse_line(&state, nargs, args);
}
nargs = 0;
}
break;
case T_TEXT:
if (nargs < INIT_PARSER_MAXARGS) {
args[nargs++] = state.text;
}
break;
}
}
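The interplay between next_token and the loop in parse_config can be reproduced with a much smaller tokenizer. This is a simplified sketch, not the AOSP parser: MiniState, mini_next_token and split_lines are hypothetical names, and quoting, backslash escapes, and line continuations are deliberately left out. A bool plays the role of nexttoken.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum Token { T_EOF, T_TEXT, T_NEWLINE };

struct MiniState {
    std::string data;              // whole file contents
    std::size_t pos = 0;           // read cursor
    bool newline_pending = false;  // plays the role of state->nexttoken
    std::string text;              // last word returned with T_TEXT
};

Token mini_next_token(MiniState& st) {
    if (st.newline_pending) {  // the previous word ended at '\n'
        st.newline_pending = false;
        return T_NEWLINE;
    }
    const std::string& d = st.data;
    while (st.pos < d.size() &&
           (d[st.pos] == ' ' || d[st.pos] == '\t' || d[st.pos] == '\r')) {
        ++st.pos;  // skip blanks between words
    }
    if (st.pos >= d.size()) return T_EOF;
    if (d[st.pos] == '\n') { ++st.pos; return T_NEWLINE; }
    if (d[st.pos] == '#') {  // comment: skip to the end of the line
        while (st.pos < d.size() && d[st.pos] != '\n') ++st.pos;
        if (st.pos >= d.size()) return T_EOF;
        ++st.pos;
        return T_NEWLINE;
    }
    st.text.clear();
    while (st.pos < d.size()) {
        char c = d[st.pos++];
        if (c == ' ' || c == '\t' || c == '\r') break;
        if (c == '\n') { st.newline_pending = true; break; }
        st.text.push_back(c);
    }
    return T_TEXT;
}

// Groups words into lines the way parse_config() fills args[] and resets nargs.
std::vector<std::vector<std::string>> split_lines(const std::string& data) {
    MiniState st;
    st.data = data;
    std::vector<std::vector<std::string>> lines;
    std::vector<std::string> args;
    for (;;) {
        Token t = mini_next_token(st);
        if (t == T_EOF) break;
        if (t == T_TEXT) {
            args.push_back(st.text);
        } else if (!args.empty()) {  // T_NEWLINE with collected words
            lines.push_back(args);
            args.clear();
        }
    }
    return lines;
}
```

As in the original loop, words collected on a final line that lacks a trailing newline are dropped at T_EOF, which is why init_parse_config_file appends '\n' to the data before parsing.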
lookup_keyword just looks at the first word and returns the matching K_* value.
static int lookup_keyword(const char *s)
{
switch (*s++) {
case 'b':
if (!strcmp(s, "ootchart_init")) return K_bootchart_init;
break;
case 'c':
if (!strcmp(s, "opy")) return K_copy;
if (!strcmp(s, "lass")) return K_class;
if (!strcmp(s, "lass_start")) return K_class_start;
if (!strcmp(s, "lass_stop")) return K_class_stop;
if (!strcmp(s, "lass_reset")) return K_class_reset;
if (!strcmp(s, "onsole")) return K_console;
if (!strcmp(s, "hown")) return K_chown;
if (!strcmp(s, "hmod")) return K_chmod;
if (!strcmp(s, "ritical")) return K_critical;
break;
case 'd':
if (!strcmp(s, "isabled")) return K_disabled;
if (!strcmp(s, "omainname")) return K_domainname;
break;
case 'e':
if (!strcmp(s, "nable")) return K_enable;
if (!strcmp(s, "xec")) return K_exec;
if (!strcmp(s, "xport")) return K_export;
break;
case 'g':
if (!strcmp(s, "roup")) return K_group;
break;
case 'h':
if (!strcmp(s, "ostname")) return K_hostname;
break;
case 'i':
if (!strcmp(s, "oprio")) return K_ioprio;
if (!strcmp(s, "fup")) return K_ifup;
if (!strcmp(s, "nsmod")) return K_insmod;
if (!strcmp(s, "mport")) return K_import;
if (!strcmp(s, "nstallkey")) return K_installkey;
break;
case 'k':
if (!strcmp(s, "eycodes")) return K_keycodes;
break;
case 'l':
if (!strcmp(s, "oglevel")) return K_loglevel;
if (!strcmp(s, "oad_persist_props")) return K_load_persist_props;
if (!strcmp(s, "oad_system_props")) return K_load_system_props;
break;
case 'm':
if (!strcmp(s, "kdir")) return K_mkdir;
if (!strcmp(s, "ount_all")) return K_mount_all;
if (!strcmp(s, "ount")) return K_mount;
break;
Next, this macro:
#define kw_is(kw, type) (keyword_info[kw].flags & (type))
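Here is how the table behind it is defined. First a note on the preprocessor operators: ## pastes the adjacent tokens together, and # turns the macro argument into a string literal.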
#define KEYWORD(symbol, flags, nargs, func) \
[ K_##symbol ] = { #symbol, func, nargs + 1, flags, },
static struct {
const char *name;
int (*func)(int nargs, char **args);
unsigned char nargs;
unsigned char flags;
} keyword_info[KEYWORD_COUNT] = {
[ K_UNKNOWN ] = { "unknown", 0, 0, 0 },
#include "keywords.h"
};
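Now look at the keywords.h header. It makes clear that this table classifies each keyword as a SECTION, COMMAND, or OPTION.
#ifndef KEYWORD // skipped at this include, because KEYWORD is already defined above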
int do_bootchart_init(int nargs, char **args);
......
#endif
KEYWORD(bootchart_init, COMMAND, 0, do_bootchart_init)
KEYWORD(chmod, COMMAND, 2, do_chmod)
KEYWORD(chown, COMMAND, 2, do_chown)
KEYWORD(class, OPTION, 0, 0)
......
KEYWORD(import, SECTION, 1, 0)
.....
.....
KEYWORD(service, SECTION, 0, 0)
KEYWORD(writepid, OPTION, 0, 0)
#ifdef __MAKE_KEYWORD_ENUM__
KEYWORD_COUNT,
};
#undef __MAKE_KEYWORD_ENUM__
#undef KEYWORD
#endif
This lets us test whether a keyword is a SECTION with kw_is(kw, SECTION).
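The "one keyword list, expanded several times" technique can be shown with a smaller, self-contained example. This is a sketch under stated assumptions rather than the AOSP code: DEMO_KEYWORDS lists only a handful of keywords, the list is passed as a macro parameter instead of being re-included as a header, the flag values are chosen for illustration, and the lookup is a plain linear search.

```cpp
#include <cassert>
#include <cstring>

enum { SECTION = 0x01, COMMAND = 0x02, OPTION = 0x04 };

#define DEMO_KEYWORDS(X) \
    X(chmod, COMMAND)    \
    X(class, OPTION)     \
    X(import, SECTION)   \
    X(on, SECTION)       \
    X(service, SECTION)

// First expansion: ## pastes K_ and the symbol into an enumerator name.
#define AS_ENUM(symbol, flags) K_##symbol,
enum Keyword { K_UNKNOWN, DEMO_KEYWORDS(AS_ENUM) KEYWORD_COUNT };
#undef AS_ENUM

// Second expansion: # turns the symbol into a string literal.
struct KeywordInfo {
    const char* name;
    unsigned char flags;
};
#define AS_INFO(symbol, flags) { #symbol, flags },
static const KeywordInfo keyword_info[KEYWORD_COUNT] = {
    { "unknown", 0 },
    DEMO_KEYWORDS(AS_INFO)
};
#undef AS_INFO

// Same idea as the kw_is macro: test a flag bit in the table entry.
inline bool kw_is(int kw, int type) {
    return (keyword_info[kw].flags & type) != 0;
}

// Linear search; the real lookup_keyword switches on the first letter instead.
int lookup_keyword(const char* s) {
    for (int kw = 1; kw < KEYWORD_COUNT; ++kw) {
        if (std::strcmp(keyword_info[kw].name, s) == 0) return kw;
    }
    return K_UNKNOWN;
}
```

Because the enum and the table are generated from the same list in the same order, keyword_info[K_service] always describes service, and adding a keyword means touching only one place.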
Back in parse_config: when the keyword is a SECTION, the first state.parse_line call is still the no-op.
if (kw_is(kw, SECTION)) {
state.parse_line(&state, 0, 0);
parse_new_section(&state, kw, nargs, args);
} else {
state.parse_line(&state, nargs, args);
}
Now parse_new_section:
static void parse_new_section(struct parse_state *state, int kw,
int nargs, char **args)
{
printf("[ %s %s ]\n", args[0],
nargs > 1 ? args[1] : "");
switch(kw) {
case K_service: // a service section
state->context = parse_service(state, nargs, args);
if (state->context) {
state->parse_line = parse_line_service;
return;
}
break;
case K_on: // an on (action) section
state->context = parse_action(state, nargs, args);
if (state->context) {
state->parse_line = parse_line_action;
return;
}
break;
case K_import: // an import
parse_import(state, nargs, args);
break;
}
state->parse_line = parse_line_no_op;
}
2.1 Parsing service
For a service, parse_service is called first:
static void *parse_service(struct parse_state *state, int nargs, char **args)
{
if (nargs < 3) {
parse_error(state, "services must have a name and a program\n");
return 0;
}
if (!valid_name(args[1])) {
parse_error(state, "invalid service name '%s'\n", args[1]);
return 0;
}
service* svc = (service*) service_find_by_name(args[1]); // look up the service by name
if (svc) { // already present, so this is a duplicate definition
parse_error(state, "ignored duplicate definition of service '%s'\n", args[1]);
return 0;
}
nargs -= 2;
svc = (service*) calloc(1, sizeof(*svc) + sizeof(char*) * nargs); // allocate a new service
if (!svc) {
parse_error(state, "out of memory\n");
return 0;
}
svc->name = strdup(args[1]); // initialize the fields
svc->classname = "default";
memcpy(svc->args, args + 2, sizeof(char*) * nargs);
trigger* cur_trigger = (trigger*) calloc(1, sizeof(*cur_trigger));
svc->args[nargs] = 0;
svc->nargs = nargs;
list_init(&svc->onrestart.triggers);
cur_trigger->name = "onrestart";
list_add_tail(&svc->onrestart.triggers, &cur_trigger->nlist);
list_init(&svc->onrestart.commands);
list_add_tail(&service_list, &svc->slist); // add the service to service_list
return svc;
}
state->parse_line is now set to parse_line_service. Stepping back out to parse_config: when the next line arrives and it is not a SECTION, parse_line_service is called to parse it.
case T_NEWLINE:
state.line++;
if (nargs) {
int kw = lookup_keyword(args[0]);
if (kw_is(kw, SECTION)) {
state.parse_line(&state, 0, 0);
parse_new_section(&state, kw, nargs, args);
} else {
state.parse_line(&state, nargs, args);
}
nargs = 0;
}
break;
Here is parse_line_service. It parses the various options and fills in the service structure.
static void parse_line_service(struct parse_state *state, int nargs, char **args)
{
struct service *svc = (service*) state->context;
struct command *cmd;
int i, kw, kw_nargs;
if (nargs == 0) {
return;
}
svc->ioprio_class = IoSchedClass_NONE;
kw = lookup_keyword(args[0]);
switch (kw) {
case K_class:
if (nargs != 2) {
parse_error(state, "class option requires a classname\n");
} else {
svc->classname = args[1];
}
break;
case K_console:
svc->flags |= SVC_CONSOLE;
break;
case K_disabled:
2.2 Parsing the on keyword
Now the handling of the on keyword:
case K_on:
state->context = parse_action(state, nargs, args);
if (state->context) {
state->parse_line = parse_line_action;
return;
}
break;
First, parse_action:
static void *parse_action(struct parse_state *state, int nargs, char **args)
{
struct trigger *cur_trigger;
int i;
if (nargs < 2) {
parse_error(state, "actions must have a trigger\n");
return 0;
}
action* act = (action*) calloc(1, sizeof(*act)); // allocate a new action
list_init(&act->triggers);
for (i = 1; i < nargs; i++) {
if (!(i % 2)) {
if (strcmp(args[i], "&&")) { // a trigger can combine several conditions, e.g. two properties that must both hold
struct listnode *node;
struct listnode *node2;
parse_error(state, "& is the only symbol allowed to concatenate actions\n");
list_for_each_safe(node, node2, &act->triggers) {
struct trigger *trigger = node_to_item(node, struct trigger, nlist);
free(trigger);
}
free(act);
return 0;
} else
continue;
}
cur_trigger = (trigger*) calloc(1, sizeof(*cur_trigger));
cur_trigger->name = args[i];
list_add_tail(&act->triggers, &cur_trigger->nlist);
}
list_init(&act->commands);
list_init(&act->qlist);
list_add_tail(&action_list, &act->alist); // add the action to action_list
/* XXX add to hash */
return act;
}
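This creates a new action and adds it to action_list. The key point is that a trigger can have several conditions, for example two property requirements that must both be satisfied; all of them are stored in the action's triggers list.
Similarly, parse_line_action handles the individual commands: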
static void parse_line_action(struct parse_state* state, int nargs, char **args)
{
struct action *act = (action*) state->context;
int kw, n;
if (nargs == 0) {
return;
}
kw = lookup_keyword(args[0]);
if (!kw_is(kw, COMMAND)) {
parse_error(state, "invalid command '%s'\n", args[0]);
return;
}
n = kw_nargs(kw);
if (nargs < n) {
parse_error(state, "%s requires %d %s\n", args[0], n - 1,
n > 2 ? "arguments" : "argument");
return;
}
command* cmd = (command*) malloc(sizeof(*cmd) + sizeof(char*) * nargs);
cmd->func = kw_func(kw);
cmd->line = state->line;
cmd->filename = state->filename;
cmd->nargs = nargs;
memcpy(cmd->args, args, sizeof(char*) * nargs);
list_add_tail(&act->commands, &cmd->clist); // append to act->commands
}
Note the kw_func macro here. It works like the kw_is macro above and selects the handler function for each command.
2.3 Handling import
For import, look at parse_import. It is simple: it saves the imported file name in import_list.
static void parse_import(struct parse_state *state, int nargs, char **args)
{
struct listnode *import_list = (listnode*) state->priv;
char conf_file[PATH_MAX];
int ret;
if (nargs != 2) {
ERROR("single argument needed for import\n");
return;
}
ret = expand_props(conf_file, args[1], sizeof(conf_file));
if (ret) {
ERROR("error while handling import on line '%d' in '%s'\n",
state->line, state->filename);
return;
}
struct import* import = (struct import*) calloc(1, sizeof(struct import));
import->filename = strdup(conf_file);
list_add_tail(import_list, &import->list);
INFO("Added '%s' to import list\n", import->filename);
}
Finally, once every keyword in init.rc has been parsed, parse_config walks import_list and calls init_parse_config_file on each file.
parser_done:
list_for_each(node, &import_list) {
struct import *import = node_to_item(node, struct import, list);
int ret;
ret = init_parse_config_file(import->filename);
if (ret)
ERROR("could not import file '%s' from '%s'\n",
import->filename, fn);
}
As a result, the actions and services from files imported by init.rc generally sit in the action and service lists after those defined directly in init.rc.
III. Adding actions to the execution queue
With init.rc parsed, this section covers how actions are added to the execution queue.
action_for_each_trigger("early-init", action_add_queue_tail);
// Queue an action that waits for coldboot done so we know ueventd has set up all of /dev...
queue_builtin_action(wait_for_coldboot_done_action, "wait_for_coldboot_done");
// ... so that we can start queuing up actions that require stuff from /dev.
queue_builtin_action(mix_hwrng_into_linux_rng_action, "mix_hwrng_into_linux_rng");
queue_builtin_action(keychord_init_action, "keychord_init");
queue_builtin_action(console_init_action, "console_init");
// Trigger all the boot actions to get us started.
action_for_each_trigger("init", action_add_queue_tail);
// Repeat mix_hwrng_into_linux_rng in case /dev/hw_random or /dev/random
// wasn't ready immediately after wait_for_coldboot_done
queue_builtin_action(mix_hwrng_into_linux_rng_action, "mix_hwrng_into_linux_rng");
// Don't mount filesystems or start core system services in charger mode.
char bootmode[PROP_VALUE_MAX];
if (property_get("ro.bootmode", bootmode) > 0 && strcmp(bootmode, "charger") == 0) {
action_for_each_trigger("charger", action_add_queue_tail);
} else {
action_for_each_trigger("late-init", action_add_queue_tail);
}
// Run all property triggers based on current state of the properties.
queue_builtin_action(queue_property_triggers_action, "queue_property_triggers");
First, action_for_each_trigger:
void action_for_each_trigger(const char *trigger,
void (*func)(struct action *act))
{
struct listnode *node, *node2;
struct action *act;
struct trigger *cur_trigger;
list_for_each(node, &action_list) { // walk every action
act = node_to_item(node, struct action, alist);
list_for_each(node2, &act->triggers) { // walk the triggers of this action
cur_trigger = node_to_item(node2, struct trigger, nlist);
if (!strcmp(cur_trigger->name, trigger)) { // does it match the requested trigger name?
func(act); // invoke the callback
}
}
}
}
Now the callback passed in, action_add_queue_tail, which appends the action to the execution queue:
void action_add_queue_tail(struct action *act)
{
if (list_empty(&act->qlist)) {
list_add_tail(&action_queue, &act->qlist);
}
}
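The trigger matching and the "queue only once" check can be modelled with standard containers. The sketch below is an illustration only: DemoAction and DemoInit are hypothetical types, a bool replaces the list_empty(&act->qlist) test, and nothing ever removes actions from the queue, which the real init does when it executes them.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

struct DemoAction {
    std::vector<std::string> triggers;
    bool queued = false;  // stands in for !list_empty(&act->qlist)
};

struct DemoInit {
    std::vector<DemoAction> actions;  // plays the role of action_list
    std::deque<DemoAction*> queue;    // plays the role of action_queue

    // Append the action unless it is already waiting in the queue.
    void add_queue_tail(DemoAction* act) {
        if (!act->queued) {
            act->queued = true;
            queue.push_back(act);
        }
    }

    // Queue every action that has a trigger with the given name.
    // Pointers into `actions` stay valid only while the vector is not resized.
    void for_each_trigger(const std::string& trigger) {
        for (DemoAction& act : actions) {
            for (const std::string& t : act.triggers) {
                if (t == trigger) add_queue_tail(&act);
            }
        }
    }
};
```

Calling for_each_trigger("early-init") and then for_each_trigger("init") queues the actions in that order, no matter where they appear in the action list, which is how main controls the boot sequence.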
1. So the first statement checks every action for an early-init trigger and queues those that have one.
action_for_each_trigger("early-init", action_add_queue_tail);
In init.rc, the early-init action sets the oom_score_adj of the init process, starts ueventd, and so on.
on early-init
# Set init and its forked children's oom_adj.
write /proc/1/oom_score_adj -1000
# Set the security context of /adb_keys if present.
restorecon /adb_keys
start ueventd
#add for amt
mkdir /amt 0775 root system
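Next, queue_builtin_action. It creates an action directly, allocates a command for it and, most importantly, sets the command's func to the supplied callback. Finally it adds the action to the execution queue.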
void queue_builtin_action(int (*func)(int nargs, char **args), const char *name)
{
action* act = (action*) calloc(1, sizeof(*act));
trigger* cur_trigger = (trigger*) calloc(1, sizeof(*cur_trigger));
cur_trigger->name = name;
list_init(&act->triggers);
list_add_tail(&act->triggers, &cur_trigger->nlist);
list_init(&act->commands);
list_init(&act->qlist);
command* cmd = (command*) calloc(1, sizeof(*cmd));
cmd->func = func;
cmd->args[0] = const_cast<char*>(name);
cmd->nargs = 1;
list_add_tail(&act->commands, &cmd->clist);
list_add_tail(&action_list, &act->alist);
action_add_queue_tail(act);
}
2. That brings us to wait_for_coldboot_done_action, which waits for the /dev/.coldboot_done file:
static int wait_for_coldboot_done_action(int nargs, char **args) {
Timer t;
NOTICE("Waiting for %s...\n", COLDBOOT_DONE);
// Any longer than 1s is an unreasonable length of time to delay booting.
// If you're hitting this timeout, check that you didn't make your
// sepolicy regular expressions too expensive (http://b/19899875).
if (wait_for_file(COLDBOOT_DONE, 1)) {
ERROR("Timed out waiting for %s\n", COLDBOOT_DONE);
}
NOTICE("Waiting for %s took %.2fs.\n", COLDBOOT_DONE, t.duration());
return 0;
}
wait_for_file waits for /dev/.coldboot_done with a timeout of 1 second.
int wait_for_file(const char *filename, int timeout)
{
struct stat info;
uint64_t timeout_time_ns = gettime_ns() + timeout * UINT64_C(1000000000);
int ret = -1;
while (gettime_ns() < timeout_time_ns && ((ret = stat(filename, &info)) < 0))
usleep(10000);
return ret;
}
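The same poll-with-deadline idea can be written with std::chrono. This is a minimal sketch, not the init helper: wait_for_path is a hypothetical name, the timeout is given in milliseconds rather than seconds, and unlike wait_for_file it calls stat at least once even when the timeout is zero.

```cpp
#include <cassert>
#include <chrono>
#include <sys/stat.h>
#include <unistd.h>

// Polls every 10 ms until `path` exists or `timeout` has elapsed.
// Returns 0 if the path was found and -1 otherwise.
int wait_for_path(const char* path, std::chrono::milliseconds timeout) {
    const std::chrono::steady_clock::time_point deadline =
        std::chrono::steady_clock::now() + timeout;
    struct stat info;
    int ret = -1;
    while ((ret = stat(path, &info)) < 0 &&
           std::chrono::steady_clock::now() < deadline) {
        usleep(10000);  // same 10 ms polling interval as wait_for_file
    }
    return ret;
}
```

A monotonic clock is used for the deadline so that a wall-clock adjustment during boot cannot stretch or shorten the wait.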
3. mix_hwrng_into_linux_rng_action reads 512 bytes from the hardware RNG device file /dev/hw_random and writes them to the Linux RNG device file /dev/urandom.
4. keychord_init_action initializes the key-chord listener; it calls keychord_init.
static int keychord_init_action(int nargs, char **args)
{
keychord_init();
return 0;
}
void keychord_init() {
service_for_each(add_service_keycodes);
// Nothing to do if no services require keychords.
if (!keychords) {
return;
}
keychord_fd = TEMP_FAILURE_RETRY(open("/dev/keychord", O_RDWR | O_CLOEXEC));
if (keychord_fd == -1) {
ERROR("could not open /dev/keychord: %s\n", strerror(errno));
return;
}
int ret = write(keychord_fd, keychords, keychords_length);
if (ret != keychords_length) {
ERROR("could not configure /dev/keychord %d: %s\n", ret, strerror(errno));
close(keychord_fd);
}
free(keychords);
keychords = nullptr;
register_epoll_handler(keychord_fd, handle_keychord);
}
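keychord_init first iterates over every service and calls add_service_keycodes. That function checks whether the service has keycodes; if so, it appends a new keychord and copies the service's keycodes into it. Everything is stored in one global buffer, keychords, so all of the data can be reached through that global pointer.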
void add_service_keycodes(struct service *svc)
{
struct input_keychord *keychord;
int i, size;
if (svc->keycodes) {
/* add a new keychord to the list */
size = sizeof(*keychord) + svc->nkeycodes * sizeof(keychord->keycodes[0]);
keychords = (input_keychord*) realloc(keychords, keychords_length + size);
if (!keychords) {
ERROR("could not allocate keychords\n");
keychords_length = 0;
keychords_count = 0;
return;
}
keychord = (struct input_keychord *)((char *)keychords + keychords_length);
keychord->version = KEYCHORD_VERSION;
keychord->id = keychords_count + 1;
keychord->count = svc->nkeycodes;
svc->keychord_id = keychord->id;
for (i = 0; i < svc->nkeycodes; i++) {
keychord->keycodes[i] = svc->keycodes[i];
}
keychords_count++;
keychords_length += size;
}
}
The keychords buffer is then written to /dev/keychord, and register_epoll_handler registers the fd with epoll.
int ret = write(keychord_fd, keychords, keychords_length);
if (ret != keychords_length) {
ERROR("could not configure /dev/keychord %d: %s\n", ret, strerror(errno));
close(keychord_fd);
}
free(keychords);
keychords = nullptr;
register_epoll_handler(keychord_fd, handle_keychord);
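When data arrives on this fd, init reads the id and uses service_find_by_keychord to find the service whose keychord matches; if there is one, it starts that service. This only happens when adb_enabled (the value of init.svc.adbd) is running.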
static void handle_keychord() {
struct service *svc;
char adb_enabled[PROP_VALUE_MAX];
int ret;
__u16 id;
// Only handle keychords if adb is enabled.
property_get("init.svc.adbd", adb_enabled);
ret = read(keychord_fd, &id, sizeof(id));
if (ret != sizeof(id)) {
ERROR("could not read keychord id\n");
return;
}
if (!strcmp(adb_enabled, "running")) {
svc = service_find_by_keychord(id);
if (svc) {
INFO("Starting service %s from keychord\n", svc->name);
service_start(svc, NULL);
} else {
ERROR("service for keychord %d not found\n", id);
}
}
}
5. console_init_action prints the "A N D R O I D" text logo.
static int console_init_action(int nargs, char **args)
{
char console[PROP_VALUE_MAX];
if (property_get("ro.boot.console", console) > 0) {
snprintf(console_name, sizeof(console_name), "/dev/%s", console);
}
int fd = open(console_name, O_RDWR | O_CLOEXEC);
if (fd >= 0)
have_console = 1; // a console is available
close(fd);
fd = open("/dev/tty0", O_WRONLY | O_CLOEXEC);
if (fd >= 0) {
const char *msg;
msg = "\n"
"\n"
"\n"
"\n"
"\n"
"\n"
"\n" // console is 40 cols x 30 lines
"\n"
"\n"
"\n"
"\n"
"\n"
"\n"
"\n"
" A N D R O I D ";
write(fd, msg, strlen(msg)); // print the ANDROID text
close(fd);
}
return 0;
}
6. action_for_each_trigger("init", action_add_queue_tail); fires the init trigger, which mainly mounts some filesystems and creates some directories.
on init
sysclktz 0
# Backward compatibility.
symlink /system/etc /etc
symlink /sys/kernel/debug /d
# Link /vendor to /system/vendor for devices without a vendor partition.
symlink /system/vendor /vendor
# Create cgroup mount point for cpu accounting
mkdir /acct
mount cgroup none /acct cpuacct
mkdir /acct/uid
# Create cgroup mount point for memory
mount tmpfs none /sys/fs/cgroup mode=0750,uid=0,gid=1000
......
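7. mix_hwrng_into_linux_rng_action is queued a second time; it is the same RNG action as in step 3.
8. charger and late-init: ro.bootmode decides whether the charger trigger or the late-init trigger fires.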
char bootmode[PROP_VALUE_MAX];
if (property_get("ro.bootmode", bootmode) > 0 && strcmp(bootmode, "charger") == 0) {
action_for_each_trigger("charger", action_add_queue_tail);
} else {
action_for_each_trigger("late-init", action_add_queue_tail);
}
The late-init action looks like this:
on late-init
trigger early-fs
trigger fs
trigger post-fs
# Load properties from /system/ + /factory after fs mount. Place
# this in another action so that the load will be scheduled after the prior
# issued fs triggers have completed.
# (annotation) loads the system properties
trigger load_system_props_action
# Now we can mount /data. File encryption requires keymaster to decrypt
# /data, which in turn can only be loaded when system properties are present
trigger post-fs-data
# (annotation) loads the persist properties
trigger load_persist_props_action
# Remove a file to wake up anything waiting for firmware.
trigger firmware_mounts_complete
trigger early-boot
# (annotation) the boot action starts the core and main services
trigger boot
The charger action, on the other hand, starts the charger service:
on charger
class_start charger
service charger /charger
seclabel u:r:healthd:s0
oneshot
9. queue_property_triggers_action checks which actions currently have their conditions satisfied and adds them to the execution queue.
static int queue_property_triggers_action(int nargs, char **args)
{
queue_all_property_triggers();
/* enable property triggers */
property_triggers_enabled = 1;
return 0;
}
void queue_all_property_triggers()
{
queue_property_triggers(NULL, NULL);
}
This ends up in queue_property_triggers, which walks all actions that have property triggers and queues each one whose conditions are met.
void queue_property_triggers(const char *name, const char *value)
{
struct listnode *node, *node2;
struct action *act;
struct trigger *cur_trigger;
bool match;
int name_length;
list_for_each(node, &action_list) {
act = node_to_item(node, struct action, alist);
match = !name;
list_for_each(node2, &act->triggers) {
cur_trigger = node_to_item(node2, struct trigger, nlist);
if (!strncmp(cur_trigger->name, "property:", strlen("property:"))) {
const char *test = cur_trigger->name + strlen("property:");
if (!match) {
name_length = strlen(name);
if (!strncmp(name, test, name_length) &&
test[name_length] == '=' &&
(!strcmp(test + name_length + 1, value) ||
!strcmp(test + name_length + 1, "*"))) {
match = true;
continue;
}
}
const char* equals = strchr(test, '=');
if (equals) {
char prop_name[PROP_NAME_MAX + 1];
char value[PROP_VALUE_MAX];
int length = equals - test;
if (length <= PROP_NAME_MAX) {
int ret;
memcpy(prop_name, test, length);
prop_name[length] = 0;
/* does the property exist, and match the trigger value? */
ret = property_get(prop_name, value);
if (ret > 0 && (!strcmp(equals + 1, value) ||
!strcmp(equals + 1, "*"))) {
continue;
}
}
}
}
match = false;
break;
}
if (match) {
action_add_queue_tail(act);
}
}
}
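The core test in the function above is whether a trigger of the form `property:<name>=<value>` matches a given property, with `*` accepted as a wildcard value. A minimal sketch of that single comparison, written with std::string rather than the pointer arithmetic init uses; `property_trigger_matches` is a hypothetical helper, not an init function:

```cpp
#include <string>

// Hypothetical helper: does a trigger of the form "property:<name>=<value>"
// match the given property name and value? "*" matches any value.
bool property_trigger_matches(const std::string& trigger,
                              const std::string& name,
                              const std::string& value) {
    const std::string prefix = "property:";
    if (trigger.compare(0, prefix.size(), prefix) != 0) {
        return false;  // not a property trigger at all
    }
    const std::string test = trigger.substr(prefix.size());
    const std::string::size_type eq = test.find('=');
    if (eq == std::string::npos) {
        return false;  // malformed: no '=' separator
    }
    // The part before '=' must be exactly the property name, not a prefix of it.
    if (test.compare(0, eq, name) != 0) {
        return false;
    }
    const std::string expected = test.substr(eq + 1);
    return expected == "*" || expected == value;
}
```

For example, `property_trigger_matches("property:sys.boot_completed=1", "sys.boot_completed", "1")` is true, while the same trigger does not match the value "0".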
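IV. The property system
start_property_service adds the fd of the property socket to epoll. On the init side, the property system mainly checks, whenever a property changes, which actions now satisfy their conditions and have to be triggered. It also saves persist properties and starts or stops services through the ctl properties.
An earlier post, http://blog.csdn.net/kc58236582/article/details/51939322, covers this in detail, so it is not repeated here.
V. Executing the actions in the execution queue
Commands are executed by execute_one_command, which is called from the while loop in main. The execution queue keeps changing, so the function has to be called on every iteration.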
while (true) {
if (!waiting_for_exec) {
execute_one_command();
restart_processes();
}
int timeout = -1;
if (process_needs_restart) {
timeout = (process_needs_restart - gettime()) * 1000;
if (timeout < 0)
timeout = 0;
}
if (!action_queue_empty() || cur_action) {
timeout = 0;
}
bootchart_sample(&timeout);
epoll_event ev;
int nr = TEMP_FAILURE_RETRY(epoll_wait(epoll_fd, &ev, 1, timeout));
if (nr == -1) {
ERROR("epoll_wait failed: %s\n", strerror(errno));
} else if (nr == 1) {
((void (*)()) ev.data.ptr)();
}
}
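How long the loop above blocks in epoll_wait depends on two things: whether a service restart is pending and whether there is still work in the action queue. A minimal sketch of just that timeout calculation; `compute_epoll_timeout_ms` is a hypothetical helper, and the bootchart_sample adjustment is left out:

```cpp
#include <ctime>

// Hypothetical helper mirroring the timeout logic of the loop above.
// next_restart: absolute time in seconds of the earliest pending service
//               restart, or 0 if no restart is pending.
// Returns the epoll_wait timeout in milliseconds; -1 means "block until an
// event arrives".
int compute_epoll_timeout_ms(time_t next_restart, time_t now, bool actions_pending) {
    int timeout = -1;
    if (next_restart != 0) {
        timeout = static_cast<int>((next_restart - now) * 1000);
        if (timeout < 0) {
            timeout = 0;  // the deadline has already passed
        }
    }
    if (actions_pending) {
        timeout = 0;  // more commands to run: only poll, never block
    }
    return timeout;
}
```

With nothing pending the loop sleeps until the next fd event; while actions are queued it only polls, so commands and epoll events are interleaved.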
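Now the function itself, which is fairly simple. When there is no current action, or the last command of the current action has already run, it calls action_remove_queue_head to take the next action and fetches that action's first command; otherwise it advances to the next command of the current action. Either way it then calls the command's func callback.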
void execute_one_command() {
Timer t;
char cmd_str[256] = "";
char name_str[256] = "";
if (!cur_action || !cur_command || is_last_command(cur_action, cur_command)) {
cur_action = action_remove_queue_head();
cur_command = NULL;
if (!cur_action) {
return;
}
build_triggers_string(name_str, sizeof(name_str), cur_action);
INFO("processing action %p (%s)\n", cur_action, name_str);
cur_command = get_first_command(cur_action);
} else {
cur_command = get_next_command(cur_action, cur_command);
}
if (!cur_command) {
return;
}
int result = cur_command->func(cur_command->nargs, cur_command->args);
if (klog_get_level() >= KLOG_INFO_LEVEL) {
for (int i = 0; i < cur_command->nargs; i++) {
strlcat(cmd_str, cur_command->args[i], sizeof(cmd_str));
if (i < cur_command->nargs - 1) {
strlcat(cmd_str, " ", sizeof(cmd_str));
}
}
char source[256];
if (cur_command->filename) {
snprintf(source, sizeof(source), " (%s:%d)", cur_command->filename, cur_command->line);
} else {
*source = '\0';
}
INFO("Command '%s' action=%s%s returned %d took %.2fs\n",
cmd_str, cur_action ? name_str : "", source, result, t.duration());
}
}
VI. Handling killed processes and restarting services
The earlier walkthrough of signal_handler_init skipped the details, so here they are.
void signal_handler_init() {
// Create a signalling mechanism for SIGCHLD.
int s[2];
if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, s) == -1) {
ERROR("socketpair failed: %s\n", strerror(errno));
exit(1);
}
signal_write_fd = s[0];
signal_read_fd = s[1];
// Write to signal_write_fd if we catch SIGCHLD.
struct sigaction act;
memset(&act, 0, sizeof(act));
act.sa_handler = SIGCHLD_handler;
act.sa_flags = SA_NOCLDSTOP;
sigaction(SIGCHLD, &act, 0); // SIGCHLD is sent to the parent when a child terminates
reap_any_outstanding_children();
register_epoll_handler(signal_read_fd, handle_signal);
}
First the signal handler: SIGCHLD_handler simply writes one byte to one end of the socketpair.
static void SIGCHLD_handler(int) {
if (TEMP_FAILURE_RETRY(write(signal_write_fd, "1", 1)) == -1) {
ERROR("write(signal_write_fd) failed: %s\n", strerror(errno));
}
}
The other end of the socketpair is registered with epoll. Its handler, handle_signal, reads whatever is in the socketpair and then calls reap_any_outstanding_children, the same function that signal_handler_init also calls.
static void handle_signal() {
// Clear outstanding requests.
char buf[32];
read(signal_read_fd, buf, sizeof(buf));
reap_any_outstanding_children();
}
reap_any_outstanding_children simply calls wait_for_one_process in a while loop:
static void reap_any_outstanding_children() {
while (wait_for_one_process()) {
}
}
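The pattern used here (the signal handler writes one byte to a socketpair, and the event loop reads the other end and reaps children) can be tried out in isolation. A minimal, Linux-only sketch, assuming poll() in place of init's epoll loop; `on_sigchld` and `reap_children_via_socketpair` are hypothetical names, not init functions:

```cpp
#include <cerrno>
#include <csignal>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g_write_fd = -1;  // end of the socketpair written by the handler

static void on_sigchld(int) {
    const int saved_errno = errno;
    // write() is async-signal-safe; the byte is only a wake-up token.
    ssize_t ignored = write(g_write_fd, "1", 1);
    (void)ignored;
    errno = saved_errno;
}

// Forks `children` short-lived processes and reaps them from a poll()-driven
// loop. Returns the number of children reaped, or -1 if setup fails.
int reap_children_via_socketpair(int children) {
    int s[2];
    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, s) == -1) {
        return -1;
    }
    g_write_fd = s[0];

    struct sigaction act = {};
    struct sigaction old_act = {};
    act.sa_handler = on_sigchld;
    act.sa_flags = SA_NOCLDSTOP;
    sigaction(SIGCHLD, &act, &old_act);

    for (int i = 0; i < children; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            _exit(0);  // child: exit immediately
        }
        if (pid < 0) {
            children = i;  // only wait for the children that were created
            break;
        }
    }

    int reaped = 0;
    while (reaped < children) {
        struct pollfd pfd = { s[1], POLLIN, 0 };
        int nr = poll(&pfd, 1, 5000);
        if (nr == -1 && errno == EINTR) {
            continue;  // interrupted by SIGCHLD itself; poll again
        }
        if (nr <= 0) {
            break;  // timeout or error
        }
        char buf[32];
        ssize_t n = read(s[1], buf, sizeof(buf));  // drain the wake-up bytes
        (void)n;
        // Several exits can collapse into one wake-up, so reap until none is left.
        while (waitpid(-1, nullptr, WNOHANG) > 0) {
            reaped++;
        }
    }

    sigaction(SIGCHLD, &old_act, nullptr);
    close(s[0]);
    close(s[1]);
    return reaped;
}
```

The inner waitpid loop is the important part: SIGCHLD is not queued per child, so one wake-up may stand for several exited children, which is why init also loops until waitpid reports nothing left.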
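wait_for_one_process first calls waitpid with pid -1, which means any child process, and WNOHANG, which means do not block. When waitpid returns 0 (no child has exited) or -1 (error), the function returns false and the while loop ends. Otherwise it handles one exited child and returns true, so the loop keeps reaping exited children one after another.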
static bool wait_for_one_process() {
int status;
pid_t pid = TEMP_FAILURE_RETRY(waitpid(-1, &status, WNOHANG)); // WNOHANG: do not block
if (pid == 0) {
return false;
} else if (pid == -1) {
ERROR("waitpid failed: %s\n", strerror(errno));
return false;
}
service* svc = service_find_by_pid(pid); // look up the service that owns this pid
std::string name;
if (svc) {
name = android::base::StringPrintf("Service '%s' (pid %d)", svc->name, pid);
} else {
name = android::base::StringPrintf("Untracked pid %d", pid);
}
NOTICE("%s %s\n", name.c_str(), DescribeStatus(status).c_str());
if (!svc) {
return true; // not a known service: done, go on to the next exited child
}
// TODO: all the code from here down should be a member function on service.
if (!(svc->flags & SVC_ONESHOT) || (svc->flags & SVC_RESTART)) { // not oneshot, or explicitly marked for restart
NOTICE("Service '%s' (pid %d) killing any children in process group\n", svc->name, pid);
kill(-pid, SIGKILL); // kill every process in the service's process group
}
// Remove any sockets we may have created.
for (socketinfo* si = svc->sockets; si; si = si->next) {
char tmp[128];
snprintf(tmp, sizeof(tmp), ANDROID_SOCKET_DIR"/%s", si->name);
unlink(tmp);
}
if (svc->flags & SVC_EXEC) {
INFO("SVC_EXEC pid %d finished...\n", svc->pid);
waiting_for_exec = false;
list_remove(&svc->slist);
free(svc->name);
free(svc);
return true;
}
svc->pid = 0;
svc->flags &= (~SVC_RUNNING); // clear the running flag
// Oneshot processes go into the disabled state on exit,
// except when manually restarted.
if ((svc->flags & SVC_ONESHOT) && !(svc->flags & SVC_RESTART)) {
svc->flags |= SVC_DISABLED; // oneshot without the restart flag: mark as disabled
}
// Disabled and reset processes do not get restarted automatically.
if (svc->flags & (SVC_DISABLED | SVC_RESET)) { // disabled or reset: stop here
svc->NotifyStateChange("stopped");
return true;
}
time_t now = gettime();
if ((svc->flags & SVC_CRITICAL) && !(svc->flags & SVC_RESTART)) {
if (svc->time_crashed + CRITICAL_CRASH_WINDOW >= now) {
if (++svc->nr_crashed > CRITICAL_CRASH_THRESHOLD) {
ERROR("critical process '%s' exited %d times in %d minutes; "
"rebooting into recovery mode\n", svc->name,
CRITICAL_CRASH_THRESHOLD, CRITICAL_CRASH_WINDOW / 60);
android_reboot(ANDROID_RB_RESTART2, 0, "recovery");
return true;
}
} else {
svc->time_crashed = now;
svc->nr_crashed = 1;
}
}
svc->flags &= (~SVC_RESTART);
svc->flags |= SVC_RESTARTING; // restarting: the service is waiting to be started again
// Execute all onrestart commands for this service.
struct listnode* node;
list_for_each(node, &svc->onrestart.commands) { // run the service's onrestart commands before it is restarted
command* cmd = node_to_item(node, struct command, clist);
cmd->func(cmd->nargs, cmd->args);
}
svc->NotifyStateChange("restarting");
return true;
}
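The flag handling at the end of wait_for_one_process is easier to follow as a pure function from old flags to new flags. A minimal sketch under stated assumptions: the flag values below are illustrative, `flags_after_exit` is a hypothetical helper, and the SVC_EXEC and SVC_CRITICAL branches are left out:

```cpp
// Illustrative flag values; only their distinctness matters for this sketch.
enum : unsigned {
    SVC_DISABLED   = 0x001,
    SVC_ONESHOT    = 0x002,
    SVC_RUNNING    = 0x004,
    SVC_RESTARTING = 0x008,
    SVC_RESET      = 0x040,
    SVC_RESTART    = 0x100,
};

// Hypothetical helper: the flags a service ends up with after its process
// has exited, following the same order of checks as the function above.
unsigned flags_after_exit(unsigned flags) {
    flags &= ~SVC_RUNNING;
    // A oneshot service stays down unless a restart was explicitly requested.
    if ((flags & SVC_ONESHOT) && !(flags & SVC_RESTART)) {
        flags |= SVC_DISABLED;
    }
    // Disabled and reset services are not restarted automatically.
    if (flags & (SVC_DISABLED | SVC_RESET)) {
        return flags;
    }
    flags &= ~SVC_RESTART;
    flags |= SVC_RESTARTING;  // picked up later by restart_processes()
    return flags;
}
```

A plain running service therefore ends in SVC_RESTARTING, while a oneshot service ends in SVC_DISABLED unless SVC_RESTART was set beforehand.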
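This function mainly updates the service's flags. An ordinary service that gets killed ends up with SVC_RESTARTING set, and if it has onrestart commands, those are executed first. A service that is already disabled or reset is left stopped, and a oneshot service without the restart flag is marked SVC_DISABLED.
Take the servicemanager service below: when servicemanager restarts, healthd and the other listed services are restarted as well.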
service servicemanager /system/bin/servicemanager
class core
user system
group system
critical
onrestart restart healthd
onrestart restart zygote
onrestart restart media
onrestart restart surfaceflinger
onrestart restart drm
The restart command maps to do_restart, which calls service_restart to restart the service:
int do_restart(int nargs, char **args)
{
struct service *svc;
svc = service_find_by_name(args[1]);
if (svc) {
service_restart(svc);
}
return 0;
}
Next, service's NotifyStateChange, which sets the property init.svc.<service name> to the service's latest state:
void service::NotifyStateChange(const char* new_state) {
if (!properties_initialized()) {
// If properties aren't available yet, we can't set them.
return;
}
if ((flags & SVC_EXEC) != 0) {
// 'exec' commands don't have properties tracking their state.
return;
}
char prop_name[PROP_NAME_MAX];
if (snprintf(prop_name, sizeof(prop_name), "init.svc.%s", name) >= PROP_NAME_MAX) {
// If the property name would be too long, we can't set it.
ERROR("Property name \"init.svc.%s\" too long; not setting to %s\n", name, new_state);
return;
}
property_set(prop_name, new_state);
}
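The length check in NotifyStateChange relies on snprintf returning the length the full string would have had, even when the output was truncated. A minimal sketch of that check; `kPropNameMax` is an assumed stand-in for PROP_NAME_MAX and `service_state_property` is a hypothetical helper:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Assumed limit, standing in for PROP_NAME_MAX from the system headers.
constexpr std::size_t kPropNameMax = 32;

// Hypothetical helper: returns "init.svc.<name>", or an empty string when the
// property name would not fit in the buffer (including the terminating NUL).
std::string service_state_property(const char* service_name) {
    char prop_name[kPropNameMax];
    const int n = std::snprintf(prop_name, sizeof(prop_name), "init.svc.%s", service_name);
    // snprintf reports the untruncated length, so n >= buffer size means truncation.
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof(prop_name)) {
        return std::string();
    }
    return std::string(prop_name);
}
```

With the assumed limit of 32, the prefix "init.svc." uses 9 characters, so service names of up to 22 characters fit and longer ones are rejected.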
Now combine this with restart_processes, which main calls inside its while loop:
static void restart_processes()
{
process_needs_restart = 0;
service_for_each_flags(SVC_RESTARTING,
restart_service_if_needed);
}
service_for_each_flags walks all services and calls restart_service_if_needed on every service whose flags include SVC_RESTARTING:
void service_for_each_flags(unsigned matchflags,
void (*func)(struct service *svc))
{
struct listnode *node;
struct service *svc;
list_for_each(node, &service_list) {
svc = node_to_item(node, struct service, slist);
if (svc->flags & matchflags) {
func(svc);
}
}
}
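restart_service_if_needed clears SVC_RESTARTING and calls service_start to launch the process, but only once at least 5 seconds have passed since the service was last started. Otherwise it records the earliest pending start time in process_needs_restart, which the main loop turns into its epoll timeout.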
static void restart_service_if_needed(struct service *svc)
{
time_t next_start_time = svc->time_started + 5;
if (next_start_time <= gettime()) {
svc->flags &= (~SVC_RESTARTING);
service_start(svc, NULL);
return;
}
if ((next_start_time < process_needs_restart) ||
(process_needs_restart == 0)) {
process_needs_restart = next_start_time;
}
}
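The 5-second throttle above can be expressed as a small decision function. A minimal sketch; `RestartDecision` and `decide_restart` are hypothetical names, and the deadline is passed in and returned instead of living in a global like process_needs_restart:

```cpp
#include <ctime>

// Hypothetical result type: whether to start the service now, and the
// (possibly updated) earliest time at which some service must be started.
struct RestartDecision {
    bool start_now;
    time_t earliest_deadline;  // 0 means "no restart pending"
};

RestartDecision decide_restart(time_t time_started, time_t now, time_t earliest_deadline) {
    const time_t next_start_time = time_started + 5;
    if (next_start_time <= now) {
        // Enough time has passed since the last start: restart immediately.
        return { true, earliest_deadline };
    }
    // Too early: remember the soonest deadline so the main loop wakes up in time.
    if (earliest_deadline == 0 || next_start_time < earliest_deadline) {
        earliest_deadline = next_start_time;
    }
    return { false, earliest_deadline };
}
```

A service started at t=100 that dies at t=102 is therefore not restarted until t=105, which keeps a crash-looping service from spinning.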
So an ordinary service that is killed with kill gets started again by init. Now service_start, which begins by clearing some flags:
void service_start(struct service *svc, const char *dynamic_args)
{
// Starting a service removes it from the disabled or reset state and
// immediately takes it out of the restarting state if it was in there.
svc->flags &= (~(SVC_DISABLED|SVC_RESTARTING|SVC_RESET|SVC_RESTART|SVC_DISABLED_START)); // clear these flags
svc->time_started = 0;
// Running processes require no additional work --- if they're in the
// process of exiting, we've ensured that they will immediately restart
// on exit, unless they are ONESHOT.
if (svc->flags & SVC_RUNNING) { // already running: nothing to do
return;
}
bool needs_console = (svc->flags & SVC_CONSOLE); // does the service need a console?
if (needs_console && !have_console) { // no console available: disable the service and return
ERROR("service '%s' requires console\n", svc->name);
svc->flags |= SVC_DISABLED;
return;
}
struct stat s;
if (stat(svc->args[0], &s) != 0) {
ERROR("cannot find '%s', disabling '%s'\n", svc->args[0], svc->name);
svc->flags |= SVC_DISABLED;
return;
}
if ((!(svc->flags & SVC_ONESHOT)) && dynamic_args) {
ERROR("service '%s' must be one-shot to use dynamic args, disabling\n",
svc->args[0]);
svc->flags |= SVC_DISABLED;
return;
}
Next comes the SELinux setup, followed by the fork and the preparation of the child process's environment:
char* scon = NULL;
if (is_selinux_enabled() > 0) {
......
}
NOTICE("Starting service '%s'...\n", svc->name);
pid_t pid = fork();
if (pid == 0) {
......
_exit(127);
}
Finally it records the start time and the pid, sets SVC_RUNNING, and publishes the running state through the init.svc property:
if (pid < 0) {
ERROR("failed to start '%s'\n", svc->name);
svc->pid = 0;
return;
}
svc->time_started = gettime();
svc->pid = pid;
svc->flags |= SVC_RUNNING;
if ((svc->flags & SVC_EXEC) != 0) {
INFO("SVC_EXEC pid %d (uid %d gid %d+%zu context %s) started; waiting...\n",
svc->pid, svc->uid, svc->gid, svc->nr_supp_gids,
svc->seclabel ? : "default");
waiting_for_exec = true;
}
svc->NotifyStateChange("running");
}
Since init restarts an ordinary service after it is killed, how can a service be killed without init bringing it back? Use the stop command; here is do_stop:
int do_stop(int nargs, char **args)
{
struct service *svc;
svc = service_find_by_name(args[1]);
if (svc) {
service_stop(svc);
}
return 0;
}
service_stop calls service_stop_or_reset with SVC_DISABLED:
void service_stop(struct service *svc)
{
service_stop_or_reset(svc, SVC_DISABLED);
}
service_stop_or_reset kills the whole process group at the end, and because the flag has been set to SVC_DISABLED, init will not start the process again.
static void service_stop_or_reset(struct service *svc, int how)
{
/* The service is still SVC_RUNNING until its process exits, but if it has
* already exited it shouldn't attempt a restart yet. */
svc->flags &= ~(SVC_RESTARTING | SVC_DISABLED_START); // clear the related flags
if ((how != SVC_DISABLED) && (how != SVC_RESET) && (how != SVC_RESTART)) {
/* Hrm, an illegal flag. Default to SVC_DISABLED */
how = SVC_DISABLED;
}
/* if the service has not yet started, prevent
* it from auto-starting with its class
*/
if (how == SVC_RESET) {
svc->flags |= (svc->flags & SVC_RC_DISABLED) ? SVC_DISABLED : SVC_RESET; // services declared disabled in the rc file become disabled, the rest become reset
} else {
svc->flags |= how;
}
if (svc->pid) {
NOTICE("Service '%s' is being killed...\n", svc->name);
kill(-svc->pid, SIGKILL); // kill the whole process group
svc->NotifyStateChange("stopping");
} else {
svc->NotifyStateChange("stopped");
}
}
You can also issue the start command yourself; do_start then starts the service:
int do_start(int nargs, char **args)
{
struct service *svc;
svc = service_find_by_name(args[1]);
if (svc) {
service_start(svc, NULL);
}
return 0;
}
And do_restart again:
int do_restart(int nargs, char **args)
{
struct service *svc;
svc = service_find_by_name(args[1]);
if (svc) {
service_restart(svc);
}
return 0;
}
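In service_restart, a running service is simply killed with SVC_RESTART set. When init later handles the child's exit, it marks the service SVC_RESTARTING and then restarts it. A service that is neither running nor already restarting is started directly.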
void service_restart(struct service *svc)
{
if (svc->flags & SVC_RUNNING) {
/* Stop, wait, then start the service. */
service_stop_or_reset(svc, SVC_RESTART);
} else if (!(svc->flags & SVC_RESTARTING)) {
/* Just start the service since it's not running. */
service_start(svc, NULL);
} /* else: Service is restarting anyways. */
}
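The stop and start commands themselves can only be used in init.rc, but the same effect is available at runtime by setting the ctl.start, ctl.stop and ctl.restart properties. How those are handled was covered in the property system analysis.
A single service only has stop, start and restart; there is no reset. Reset exists only at the class level, as class_reset next to class_stop and class_start. Here is how these commands are handled: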
int do_class_stop(int nargs, char **args)
{
service_for_each_class(args[1], service_stop);
return 0;
}
int do_class_reset(int nargs, char **args)
{
service_for_each_class(args[1], service_reset);
return 0;
}
Together with service_for_each_class, this shows that class_stop and class_reset just walk all services and call service_stop or service_reset, respectively, on every service whose class matches:
void service_for_each_class(const char *classname,
void (*func)(struct service *svc))
{
struct listnode *node;
struct service *svc;
list_for_each(node, &service_list) {
svc = node_to_item(node, struct service, slist);
if (!strcmp(svc->classname, classname)) {
func(svc);
}
}
}
do_class_start is slightly different: it walks the services of the matching class and calls service_start_if_not_disabled:
int do_class_start(int nargs, char **args)
{
/* Starting a class does not start services
* which are explicitly disabled. They must
* be started individually.
*/
service_for_each_class(args[1], service_start_if_not_disabled);
return 0;
}
service_start_if_not_disabled calls service_start only when the SVC_DISABLED bit is not set; otherwise it just sets SVC_DISABLED_START.
static void service_start_if_not_disabled(struct service *svc)
{
if (!(svc->flags & SVC_DISABLED)) {
service_start(svc, NULL);
} else {
svc->flags |= SVC_DISABLED_START;
}
}
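As a result, calling class_start after class_stop does not bring the services of that class back, because service_stop left them SVC_DISABLED. Only services that were stopped through class_reset, and are not declared disabled in the rc file, are started again by class_start.
After class_stop, each service has to be started individually with the start command.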
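The difference between class_stop and class_reset comes down to which flag each one leaves on the service. A minimal sketch of just the flag logic, with illustrative flag values and hypothetical helper names (`apply_class_stop`, `apply_class_reset`, `class_start_would_start`); process killing and property updates are left out:

```cpp
// Illustrative flag values; only their distinctness matters for this sketch.
enum : unsigned {
    SVC_DISABLED       = 0x001,
    SVC_RESET          = 0x040,
    SVC_RC_DISABLED    = 0x080,
    SVC_DISABLED_START = 0x200,
};

// class_stop: every service of the class is marked disabled.
unsigned apply_class_stop(unsigned flags) {
    return flags | SVC_DISABLED;
}

// class_reset: services declared "disabled" in the rc file become disabled,
// all others are only marked reset.
unsigned apply_class_reset(unsigned flags) {
    return flags | ((flags & SVC_RC_DISABLED) ? SVC_DISABLED : SVC_RESET);
}

// class_start only starts services that do not carry SVC_DISABLED.
bool class_start_would_start(unsigned flags) {
    return !(flags & SVC_DISABLED);
}
```

Under these assumptions, `class_start_would_start(apply_class_stop(0))` is false while `class_start_would_start(apply_class_reset(0))` is true, which matches the behaviour described above.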