Android Memory Management: How LowMemoryKiller Works

Introduction

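Android phones keep getting more powerful hardware; RAM has grown from 1 GB or 2 GB to 6 GB or 8 GB.
At the same time, the memory that apps consume keeps rising, especially for large games and for many apps running at once.
When memory is close to full, killing some apps in a sensible order to reclaim memory is one of the ways Android keeps the system responsive.
This is the mechanism commonly referred to as LowMemoryKiller (LMK).
On phones from vendors such as Huawei and Xiaomi, you may notice that background processes are sometimes killed very quickly; this comes from vendor customizations of LMK.
This article analyzes the LMK mechanism and how it is implemented.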

Thresholds

minfree
adb shell cat /sys/module/lowmemorykiller/parameters/minfree                                                                                                                
18432,23040,27648,32256,55296,80640

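minfree holds a comma-separated list of numbers; each number is a free-memory threshold, expressed in pages.
When free memory drops below a given threshold, the corresponding action is taken.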

adj
cat /sys/module/lowmemorykiller/parameters/adj                                                                                                                    
0,100,200,250,900,950

adj is also a comma-separated list of numbers; each number is a process priority level.

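minfree and adj correspond position by position: when free memory falls below the i-th minfree threshold, processes whose priority value is at or above the i-th adj value become candidates for killing.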
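
The pairing between the two lists can be sketched as a small lookup. This is a simplified illustration, not the driver's code: it uses the example values printed above and compares a single free-page count, whereas the real logic also looks at file cache pages. The function name `min_adj_for_free_pages` is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MAX 1000

/* Example thresholds copied from the sysfs output above (minfree is in pages). */
static const int lowmem_minfree[] = {18432, 23040, 27648, 32256, 55296, 80640};
static const int lowmem_adj[]     = {0, 100, 200, 250, 900, 950};

/*
 * Return the lowest adj level that becomes eligible for killing at the given
 * number of free pages, or OOM_SCORE_ADJ_MAX + 1 when no threshold is crossed.
 * Simplification (assumption): only one free-page figure is compared.
 */
static int min_adj_for_free_pages(long free_pages) {
    size_t n = sizeof(lowmem_minfree) / sizeof(lowmem_minfree[0]);

    assert(n == sizeof(lowmem_adj) / sizeof(lowmem_adj[0]));
    for (size_t i = 0; i < n; i++) {
        if (free_pages < lowmem_minfree[i]) {
            return lowmem_adj[i];
        }
    }
    return OOM_SCORE_ADJ_MAX + 1;
}
```

With these values, a system with 20000 free pages has crossed the second threshold, so processes with adj 100 and above are candidates.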

In the lmkd source we can see that both nodes are written together, so the two lists stay in sync.

    if (has_inkernel_module) {
        char minfreestr[128];
        char killpriostr[128];

        minfreestr[0] = '\0';
        killpriostr[0] = '\0';

        for (i = 0; i < lowmem_targets_size; i++) {
            char val[40];

            if (i) {
                strlcat(minfreestr, ",", sizeof(minfreestr));
                strlcat(killpriostr, ",", sizeof(killpriostr));
            }

            snprintf(val, sizeof(val), "%d", use_inkernel_interface ? lowmem_minfree[i] : 0);
            strlcat(minfreestr, val, sizeof(minfreestr));
            snprintf(val, sizeof(val), "%d", use_inkernel_interface ? lowmem_adj[i] : 0);
            strlcat(killpriostr, val, sizeof(killpriostr));
        }

        writefilestring(INKERNEL_MINFREE_PATH, minfreestr, true);
        writefilestring(INKERNEL_ADJ_PATH, killpriostr, true);
    }

The two paths written here are the same nodes we read with cat above:

/* gid containing AID_SYSTEM required */
#define INKERNEL_MINFREE_PATH "/sys/module/lowmemorykiller/parameters/minfree"
#define INKERNEL_ADJ_PATH "/sys/module/lowmemorykiller/parameters/adj"

oom_adj
cat /proc/1/oom_adj                                                                                                                                               
-17

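Every process has a directory under /proc named after its PID, and that directory contains the process's oom_adj file.
This value is the process's priority as seen through the kernel's legacy interface, with a range of -17 to 15.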

oom_score_adj
cat /proc/1/oom_score_adj                                                                                                                                         
-1000

oom_score_adj is the priority used by the framework layer; it matches the adj constants defined in ProcessList and ranges from -1000 to 1000.

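oom_adj and oom_score_adj describe the same priority on two scales: oom_adj is the older kernel-side scale, oom_score_adj is the scale the framework and lmkd use, and the kernel converts between them. The resulting value determines which processes fall within the kill range.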
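
As a rough sketch of how the two scales relate, the functions below model a linear mapping between -17..15 and -1000..1000. The formula follows my understanding of the kernel's legacy compatibility code and should be treated as an assumption; the function names are invented for this example.

```c
#include <assert.h>

#define OOM_DISABLE        (-17)   /* lowest legacy oom_adj value */
#define OOM_ADJUST_MAX     15      /* highest legacy oom_adj value */
#define OOM_SCORE_ADJ_MIN  (-1000)
#define OOM_SCORE_ADJ_MAX  1000

/* Scale a legacy oom_adj value up to the oom_score_adj range (assumed mapping). */
static int oom_adj_to_score_adj(int oom_adj) {
    assert(oom_adj >= OOM_DISABLE && oom_adj <= OOM_ADJUST_MAX);
    if (oom_adj == OOM_ADJUST_MAX) {
        return OOM_SCORE_ADJ_MAX;  /* the top value maps to the top value */
    }
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}

/* Scale an oom_score_adj value down to the legacy oom_adj range (assumed mapping). */
static int score_adj_to_oom_adj(int score_adj) {
    assert(score_adj >= OOM_SCORE_ADJ_MIN && score_adj <= OOM_SCORE_ADJ_MAX);
    if (score_adj == OOM_SCORE_ADJ_MAX) {
        return OOM_ADJUST_MAX;
    }
    return (score_adj * -OOM_DISABLE) / OOM_SCORE_ADJ_MAX;
}
```

Under this mapping the two values shown for PID 1 above agree: -17 on the legacy scale corresponds to -1000 on the oom_score_adj scale.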

How It Works

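In Android, the low memory killer generally starts with background apps.
It normally does not kill system processes or system apps, because killing them would break device functionality.
For example, the process with PID 1 shown above has negative values for both nodes.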

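Process management and killing are therefore mostly aimed at upper-layer apps, and the priorities of those processes are adjusted in AMS (ActivityManagerService).
AMS continually recomputes each process's priority from the state of its components, and each new value is promptly written to the process's file node.
AMS does not write the file node itself; lmkd does, and the two communicate over a socket.
lmkd is a resident daemon on the device that applies the priority updates sent by ActivityManager after updateOomAdj.
When necessary, it kills processes to free memory.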

Starting lmkd

Path: system/memory/lmkd/lmkd.rc

service lmkd /system/bin/lmkd
    class core
    user lmkd
    group lmkd system readproc
    capabilities DAC_OVERRIDE KILL IPC_LOCK SYS_NICE SYS_RESOURCE
    critical
    socket lmkd seqpacket+passcred 0660 system system
    writepid /dev/cpuset/system-background/tasks

on property:lmkd.reinit=1
    exec_background /system/bin/lmkd --reinit

The rc file defines a service named lmkd that runs the /system/bin/lmkd binary.

    # Start lmkd before any other services run so that it can register them
    chown root system /sys/module/lowmemorykiller/parameters/adj
    chmod 0664 /sys/module/lowmemorykiller/parameters/adj
    chown root system /sys/module/lowmemorykiller/parameters/minfree
    chmod 0664 /sys/module/lowmemorykiller/parameters/minfree
    start lmkd

lmkd is started in the on init section of init.rc, so it is launched and begins monitoring early in boot.

Building lmkd

The lmkd source lives in system/memory/lmkd/.
Here is the key part of its Android.bp:

cc_binary {
    name: "lmkd",

    srcs: ["lmkd.cpp"],
    shared_libs: [
        "libcutils",
        "liblog",
        "libprocessgroup",
        "libpsi",
        "libstatssocket",
    ],
    static_libs: [
        "libstatslogc",
        "libstatslog_lmkd",
        "liblmkd_utils",
    ],
    local_include_dirs: ["include"],
    cflags: [
        "-Wall",
        "-Werror",
        "-Wextra",
        "-DLMKD_TRACE_KILLS"
    ],
    init_rc: ["lmkd.rc"],
    defaults: ["stats_defaults"],
    logtags: ["event.logtags"],
}

As we can see, lmkd is built as a single binary, and its main source file is lmkd.cpp.

Initializing lmkd

Let's start with the implementation of lmkd's main function:

int main(int argc, char **argv) {
    if ((argc > 1) && argv[1] && !strcmp(argv[1], "--reinit")) {
        if (property_set(LMKD_REINIT_PROP, "0")) {
            ALOGE("Failed to reset " LMKD_REINIT_PROP " property");
        }
        return issue_reinit();
    }

    update_props();

    ctx = create_android_logger(KILLINFO_LOG_TAG);

    if (!init()) {
        if (!use_inkernel_interface) {
            /*
             * MCL_ONFAULT pins pages as they fault instead of loading
             * everything immediately all at once. (Which would be bad,
             * because as of this writing, we have a lot of mapped pages we
             * never use.) Old kernels will see MCL_ONFAULT and fail with
             * EINVAL; we ignore this failure.
             *
             * N.B. read the man page for mlockall. MCL_CURRENT | MCL_ONFAULT
             * pins ⊆ MCL_CURRENT, converging to just MCL_CURRENT as we fault
             * in pages.
             */
            /* CAP_IPC_LOCK required */
            if (mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT) && (errno != EINVAL)) {
                ALOGW("mlockall failed %s", strerror(errno));
            }

            /* CAP_NICE required */
            struct sched_param param = {
                    .sched_priority = 1,
            };
            if (sched_setscheduler(0, SCHED_FIFO, &param)) {
                ALOGW("set SCHED_FIFO failed %s", strerror(errno));
            }
        }

        mainloop();
    }

    android_log_destroy(&ctx);

    ALOGI("exiting");
    return 0;
}
  • As seen in the rc file, setting lmkd.reinit=1 triggers a reinit, so main checks for the --reinit argument first.
    if ((argc > 1) && argv[1] && !strcmp(argv[1], "--reinit")) {
        if (property_set(LMKD_REINIT_PROP, "0")) {
            ALOGE("Failed to reset " LMKD_REINIT_PROP " property");
        }
        return issue_reinit();
    }
  • It calls update_props to initialize a number of settings from system properties.
static void update_props() {
    /* By default disable low level vmpressure events */
    level_oomadj[VMPRESS_LEVEL_LOW] =
        property_get_int32("ro.lmk.low", OOM_SCORE_ADJ_MAX + 1);
    level_oomadj[VMPRESS_LEVEL_MEDIUM] =
        property_get_int32("ro.lmk.medium", 800);
    level_oomadj[VMPRESS_LEVEL_CRITICAL] =
        property_get_int32("ro.lmk.critical", 0);
    debug_process_killing = property_get_bool("ro.lmk.debug", false);

    /* By default disable upgrade/downgrade logic */
    enable_pressure_upgrade =
        property_get_bool("ro.lmk.critical_upgrade", false);
    upgrade_pressure =
        (int64_t)property_get_int32("ro.lmk.upgrade_pressure", 100);
    downgrade_pressure =
        (int64_t)property_get_int32("ro.lmk.downgrade_pressure", 100);
    kill_heaviest_task =
        property_get_bool("ro.lmk.kill_heaviest_task", false);
    low_ram_device = property_get_bool("ro.config.low_ram", false);
    kill_timeout_ms =
        (unsigned long)property_get_int32("ro.lmk.kill_timeout_ms", 0);
    use_minfree_levels =
        property_get_bool("ro.lmk.use_minfree_levels", false);
    per_app_memcg =
        property_get_bool("ro.config.per_app_memcg", low_ram_device);
    swap_free_low_percentage = clamp(0, 100, property_get_int32("ro.lmk.swap_free_low_percentage",
        DEF_LOW_SWAP));
    psi_partial_stall_ms = property_get_int32("ro.lmk.psi_partial_stall_ms",
        low_ram_device ? DEF_PARTIAL_STALL_LOWRAM : DEF_PARTIAL_STALL);
    psi_complete_stall_ms = property_get_int32("ro.lmk.psi_complete_stall_ms",
        DEF_COMPLETE_STALL);
    thrashing_limit_pct = max(0, property_get_int32("ro.lmk.thrashing_limit",
        low_ram_device ? DEF_THRASHING_LOWRAM : DEF_THRASHING));
    thrashing_limit_decay_pct = clamp(0, 100, property_get_int32("ro.lmk.thrashing_limit_decay",
        low_ram_device ? DEF_THRASHING_DECAY_LOWRAM : DEF_THRASHING_DECAY));
}
  • It calls init to perform the actual initialization.
  • It calls mainloop, which loops forever waiting for events on the registered file descriptors.

Let's look at init first.

Analysis of lmkd init

Here is the implementation of init:

static int init(void) {
    static struct event_handler_info kernel_poll_hinfo = { 0, kernel_event_handler };
    struct reread_data file_data = {
        .filename = ZONEINFO_PATH,
        .fd = -1,
    };
    struct epoll_event epev;
    int pidfd;
    int i;
    int ret;

    page_k = sysconf(_SC_PAGESIZE);
    if (page_k == -1)
        page_k = PAGE_SIZE;
    page_k /= 1024;

    epollfd = epoll_create(MAX_EPOLL_EVENTS);
    if (epollfd == -1) {
        ALOGE("epoll_create failed (errno=%d)", errno);
        return -1;
    }

    // mark data connections as not connected
    for (int i = 0; i < MAX_DATA_CONN; i++) {
        data_sock[i].sock = -1;
    }

    ctrl_sock.sock = android_get_control_socket("lmkd");
    if (ctrl_sock.sock < 0) {
        ALOGE("get lmkd control socket failed");
        return -1;
    }

    ret = listen(ctrl_sock.sock, MAX_DATA_CONN);
    if (ret < 0) {
        ALOGE("lmkd control socket listen failed (errno=%d)", errno);
        return -1;
    }

    epev.events = EPOLLIN;
    ctrl_sock.handler_info.handler = ctrl_connect_handler;
    epev.data.ptr = (void *)&(ctrl_sock.handler_info);
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_sock.sock, &epev) == -1) {
        ALOGE("epoll_ctl for lmkd control socket failed (errno=%d)", errno);
        return -1;
    }
    maxevents++;

    has_inkernel_module = !access(INKERNEL_MINFREE_PATH, W_OK);
    use_inkernel_interface = has_inkernel_module;

    if (use_inkernel_interface) {
        ALOGI("Using in-kernel low memory killer interface");
        if (init_poll_kernel()) {
            epev.events = EPOLLIN;
            epev.data.ptr = (void*)&kernel_poll_hinfo;
            if (epoll_ctl(epollfd, EPOLL_CTL_ADD, kpoll_fd, &epev) != 0) {
                ALOGE("epoll_ctl for lmk events failed (errno=%d)", errno);
                close(kpoll_fd);
                kpoll_fd = -1;
            } else {
                maxevents++;
                /* let the others know it does support reporting kills */
                property_set("sys.lmk.reportkills", "1");
            }
        }
    } else {
        if (!init_monitors()) {
            return -1;
        }
        /* let the others know it does support reporting kills */
        property_set("sys.lmk.reportkills", "1");
    }

    for (i = 0; i <= ADJTOSLOT(OOM_SCORE_ADJ_MAX); i++) {
        procadjslot_list[i].next = &procadjslot_list[i];
        procadjslot_list[i].prev = &procadjslot_list[i];
    }

    memset(killcnt_idx, KILLCNT_INVALID_IDX, sizeof(killcnt_idx));

    /*
     * Read zoneinfo as the biggest file we read to create and size the initial
     * read buffer and avoid memory re-allocations during memory pressure
     */
    if (reread_file(&file_data) == NULL) {
        ALOGE("Failed to read %s: %s", file_data.filename, strerror(errno));
    }

    /* check if kernel supports pidfd_open syscall */
    pidfd = TEMP_FAILURE_RETRY(sys_pidfd_open(getpid(), 0));
    if (pidfd < 0) {
        pidfd_supported = (errno != ENOSYS);
    } else {
        pidfd_supported = true;
        close(pidfd);
    }
    ALOGI("Process polling is %s", pidfd_supported ? "supported" : "not supported" );

    return 0;
}
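  • init first creates the epoll file descriptor: epollfd = epoll_create(MAX_EPOLL_EVENTS);
    The argument MAX_EPOLL_EVENTS is intended as the maximum number of fds to monitor.
    The function is declared as follows:
int epoll_create(int size)

This function creates a file descriptor dedicated to epoll.
It reserves space in the kernel that records which events have occurred on the fds you are interested in.
size was originally a hint for how many fds would be watched on this epoll fd; since Linux 2.6.8 it is ignored but must still be greater than zero.
  • It obtains the lmkd socket fd: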
ctrl_sock.sock = android_get_control_socket("lmkd");
  • It listens on the socket:
ret = listen(ctrl_sock.sock, MAX_DATA_CONN)
  • It registers the handler that deals with incoming connections:
ctrl_sock.handler_info.handler = ctrl_connect_handler;

ctrl_connect_handler is implemented as follows:

static void ctrl_connect_handler(int data __unused, uint32_t events __unused,
                                 struct polling_params *poll_params __unused) {
    struct epoll_event epev;
    int free_dscock_idx = get_free_dsock();

    if (free_dscock_idx < 0) {
        /*
         * Number of data connections exceeded max supported. This should not
         * happen but if it does we drop all existing connections and accept
         * the new one. This prevents inactive connections from monopolizing
         * data socket and if we drop ActivityManager connection it will
         * immediately reconnect.
         */
        for (int i = 0; i < MAX_DATA_CONN; i++) {
            ctrl_data_close(i);
        }
        free_dscock_idx = 0;
    }

    data_sock[free_dscock_idx].sock = accept(ctrl_sock.sock, NULL, NULL);
    if (data_sock[free_dscock_idx].sock < 0) {
        ALOGE("lmkd control socket accept failed; errno=%d", errno);
        return;
    }

    ALOGI("lmkd data connection established");
    /* use data to store data connection idx */
    data_sock[free_dscock_idx].handler_info.data = free_dscock_idx;
    data_sock[free_dscock_idx].handler_info.handler = ctrl_data_handler;
    data_sock[free_dscock_idx].async_event_mask = 0;
    epev.events = EPOLLIN;
    epev.data.ptr = (void *)&(data_sock[free_dscock_idx].handler_info);
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, data_sock[free_dscock_idx].sock, &epev) == -1) {
        ALOGE("epoll_ctl for data connection socket failed; errno=%d", errno);
        ctrl_data_close(free_dscock_idx);
        return;
    }
    maxevents++;
}

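Here we can see that the connection is accepted and the new fd is registered with epollfd through epoll_ctl.
The handler registered for it is ctrl_data_handler, which hands incoming data to ctrl_command_handler, implemented as follows: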

static void ctrl_command_handler(int dsock_idx) {
    LMKD_CTRL_PACKET packet;
    struct ucred cred;
    int len;
    enum lmk_cmd cmd;
    int nargs;
    int targets;
    int kill_cnt;
    int result;

    len = ctrl_data_read(dsock_idx, (char *)packet, CTRL_PACKET_MAX_SIZE, &cred);
    if (len <= 0)
        return;

    if (len < (int)sizeof(int)) {
        ALOGE("Wrong control socket read length len=%d", len);
        return;
    }

    cmd = lmkd_pack_get_cmd(packet);
    nargs = len / sizeof(int) - 1;
    if (nargs < 0)
        goto wronglen;

    switch(cmd) {
    case LMK_TARGET:
        targets = nargs / 2;
        if (nargs & 0x1 || targets > (int)ARRAY_SIZE(lowmem_adj))
            goto wronglen;
        cmd_target(targets, packet);
        break;
    case LMK_PROCPRIO:
        /* process type field is optional for backward compatibility */
        if (nargs < 3 || nargs > 4)
            goto wronglen;
        cmd_procprio(packet, nargs, &cred);
        break;
    case LMK_PROCREMOVE:
        if (nargs != 1)
            goto wronglen;
        cmd_procremove(packet, &cred);
        break;
    case LMK_PROCPURGE:
        if (nargs != 0)
            goto wronglen;
        cmd_procpurge(&cred);
        break;
    case LMK_GETKILLCNT:
        if (nargs != 2)
            goto wronglen;
        kill_cnt = cmd_getkillcnt(packet);
        len = lmkd_pack_set_getkillcnt_repl(packet, kill_cnt);
        if (ctrl_data_write(dsock_idx, (char *)packet, len) != len)
            return;
        break;
    case LMK_SUBSCRIBE:
        if (nargs != 1)
            goto wronglen;
        cmd_subscribe(dsock_idx, packet);
        break;
    case LMK_PROCKILL:
        /* This command code is NOT expected at all */
        ALOGE("Received unexpected command code %d", cmd);
        break;
    case LMK_UPDATE_PROPS:
        if (nargs != 0)
            goto wronglen;
        update_props();
        if (!use_inkernel_interface) {
            /* Reinitialize monitors to apply new settings */
            destroy_monitors();
            result = init_monitors() ? 0 : -1;
        } else {
            result = 0;
        }
        len = lmkd_pack_set_update_props_repl(packet, result);
        if (ctrl_data_write(dsock_idx, (char *)packet, len) != len) {
            ALOGE("Failed to report operation results");
        }
        if (!result) {
            ALOGI("Properties reinitilized");
        } else {
            /* New settings can't be supported, crash to be restarted */
            ALOGE("New configuration is not supported. Exiting...");
            exit(1);
        }
        break;
    default:
        ALOGE("Received unknown command code %d", cmd);
        return;
    }

    return;

wronglen:
    ALOGE("Wrong control socket read length cmd=%d len=%d", cmd, len);
}

This reads the data received on the socket and dispatches on the command type.
The commands are defined as follows:


/*
 * Supported LMKD commands
 */
enum lmk_cmd {
    LMK_TARGET = 0, /* Associate minfree with oom_adj_score */
    LMK_PROCPRIO,   /* Register a process and set its oom_adj_score */
    LMK_PROCREMOVE, /* Unregister a process */
    LMK_PROCPURGE,  /* Purge all registered processes */
    LMK_GETKILLCNT, /* Get number of kills */
    LMK_SUBSCRIBE,  /* Subscribe for asynchronous events */
    LMK_PROCKILL,   /* Unsolicited msg to subscribed clients on proc kills */
    LMK_UPDATE_PROPS, /* Reinit properties */
};
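
To make the packet layout concrete, here is a minimal sketch of how a client could pack an LMK_PROCPRIO request and how the receiver derives nargs from the byte length, as ctrl_command_handler does. The assumption that fields are 32-bit integers in network byte order reflects my reading of the lmkd helper header; the names `pack_procprio`, `unpack_cmd`, `unpack_arg` and `count_args` are invented for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the command used in this sketch; the value matches the enum above. */
enum { SKETCH_LMK_PROCPRIO = 1 };

/* Pack cmd + pid + uid + oomadj as big-endian words; returns bytes written. */
static size_t pack_procprio(uint32_t *packet, size_t capacity_words,
                            int pid, int uid, int oomadj) {
    assert(packet != NULL && capacity_words >= 4);
    packet[0] = htonl((uint32_t)SKETCH_LMK_PROCPRIO);
    packet[1] = htonl((uint32_t)pid);
    packet[2] = htonl((uint32_t)uid);
    packet[3] = htonl((uint32_t)oomadj);
    return 4 * sizeof(uint32_t);
}

/* The first word of every packet is the command code. */
static int unpack_cmd(const uint32_t *packet) {
    return (int)ntohl(packet[0]);
}

/* Arguments follow the command word; idx is zero-based. */
static int unpack_arg(const uint32_t *packet, int idx) {
    return (int)(int32_t)ntohl(packet[idx + 1]);
}

/* Same arithmetic as the handler: words received minus the command word. */
static int count_args(size_t len) {
    return (int)(len / sizeof(uint32_t)) - 1;
}
```

A packet built this way is 16 bytes long and yields nargs == 3, which passes the `nargs < 3 || nargs > 4` check for LMK_PROCPRIO.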

Going back to init, there is a check on use_inkernel_interface:

    if (use_inkernel_interface) {
        ALOGI("Using in-kernel low memory killer interface");
        if (init_poll_kernel()) {
            epev.events = EPOLLIN;
            epev.data.ptr = (void*)&kernel_poll_hinfo;
            if (epoll_ctl(epollfd, EPOLL_CTL_ADD, kpoll_fd, &epev) != 0) {
                ALOGE("epoll_ctl for lmk events failed (errno=%d)", errno);
                close(kpoll_fd);
                kpoll_fd = -1;
            } else {
                maxevents++;
                /* let the others know it does support reporting kills */
                property_set("sys.lmk.reportkills", "1");
            }
        }
    } else {
        if (!init_monitors()) {
            return -1;
        }
        /* let the others know it does support reporting kills */
        property_set("sys.lmk.reportkills", "1");
    }

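The flag is determined by whether lmkd has write access to /sys/module/lowmemorykiller/parameters/minfree. With write access, the in-kernel lowmemorykiller driver is used; without it, lmkd sets up its own userspace monitors through init_monitors.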

    has_inkernel_module = !access(INKERNEL_MINFREE_PATH, W_OK);
    use_inkernel_interface = has_inkernel_module;
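
The probe itself is just an access(2) call. A minimal standalone sketch follows; the function name `can_use_inkernel_lmk` is made up here, and the path is passed in so the check can be tried on any file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Returns true when the calling process may write to the given sysfs node,
 * which is the condition lmkd uses to decide on the in-kernel interface.
 */
static bool can_use_inkernel_lmk(const char *minfree_path) {
    assert(minfree_path != NULL);
    /* access() returns 0 when the requested permission is granted. */
    return access(minfree_path, W_OK) == 0;
}
```

On a device without the lowmemorykiller module the node does not exist, access fails, and the userspace path is taken.
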

Implementation of mainloop

Once initialization is done, lmkd enters the endless loop mentioned earlier.
Let's look at the implementation of mainloop:

static void mainloop(void) {
    struct event_handler_info* handler_info;
    struct polling_params poll_params;
    struct timespec curr_tm;
    struct epoll_event *evt;
    long delay = -1;

    poll_params.poll_handler = NULL;
    poll_params.paused_handler = NULL;

    while (1) {
        struct epoll_event events[MAX_EPOLL_EVENTS];
        int nevents;
        int i;

        if (poll_params.poll_handler) {
            bool poll_now;

            clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm);
            if (poll_params.poll_handler == poll_params.paused_handler) {
                /*
                 * Just transitioned into POLLING_RESUME. Reset paused_handler
                 * and poll immediately
                 */
                poll_params.paused_handler = NULL;
                poll_now = true;
                nevents = 0;
            } else {
                /* Calculate next timeout */
                delay = get_time_diff_ms(&poll_params.last_poll_tm, &curr_tm);
                delay = (delay < poll_params.polling_interval_ms) ?
                    poll_params.polling_interval_ms - delay : poll_params.polling_interval_ms;

                /* Wait for events until the next polling timeout */
                nevents = epoll_wait(epollfd, events, maxevents, delay);

                /* Update current time after wait */
                clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm);
                poll_now = (get_time_diff_ms(&poll_params.last_poll_tm, &curr_tm) >=
                    poll_params.polling_interval_ms);
            }
            if (poll_now) {
                call_handler(poll_params.poll_handler, &poll_params, 0);
            }
        } else {
            if (kill_timeout_ms && is_waiting_for_kill()) {
                clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm);
                delay = kill_timeout_ms - get_time_diff_ms(&last_kill_tm, &curr_tm);
                /* Wait for pidfds notification or kill timeout to expire */
                nevents = (delay > 0) ? epoll_wait(epollfd, events, maxevents, delay) : 0;
                if (nevents == 0) {
                    /* Kill notification timed out */
                    stop_wait_for_proc_kill(false);
                    if (polling_paused(&poll_params)) {
                        clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm);
                        resume_polling(&poll_params, curr_tm);
                    }
                }
            } else {
                /* Wait for events with no timeout */
                nevents = epoll_wait(epollfd, events, maxevents, -1);
            }
        }

        if (nevents == -1) {
            if (errno == EINTR)
                continue;
            ALOGE("epoll_wait failed (errno=%d)", errno);
            continue;
        }

        /*
         * First pass to see if any data socket connections were dropped.
         * Dropped connection should be handled before any other events
         * to deallocate data connection and correctly handle cases when
         * connection gets dropped and reestablished in the same epoll cycle.
         * In such cases it's essential to handle connection closures first.
         */
        for (i = 0, evt = &events[0]; i < nevents; ++i, evt++) {
            if ((evt->events & EPOLLHUP) && evt->data.ptr) {
                ALOGI("lmkd data connection dropped");
                handler_info = (struct event_handler_info*)evt->data.ptr;
                ctrl_data_close(handler_info->data);
            }
        }

        /* Second pass to handle all other events */
        for (i = 0, evt = &events[0]; i < nevents; ++i, evt++) {
            if (evt->events & EPOLLERR) {
                ALOGD("EPOLLERR on event #%d", i);
            }
            if (evt->events & EPOLLHUP) {
                /* This case was handled in the first pass */
                continue;
            }
            if (evt->data.ptr) {
                handler_info = (struct event_handler_info*)evt->data.ptr;
                call_handler(handler_info, &poll_params, evt->events);
            }
        }
    }
}

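In mainloop, epoll_wait blocks until an event arrives.
Every event that epoll_wait returns belongs to an fd that was added to the epoll instance with epoll_ctl,
so it is dispatched to the handlers registered earlier.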
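
The register-then-dispatch pattern can be tried in isolation. The sketch below is not lmkd code: it uses a pipe instead of a socket, a simplified `handler_info` struct, and runs a single round instead of looping forever. All names in it are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Simplified stand-in for lmkd's event_handler_info. */
struct handler_info {
    int data;
    void (*handler)(int data, uint32_t events);
};

static int last_handled = -1;

/* Example handler: remember which registration fired. */
static void record_handler(int data, uint32_t events) {
    if (events & EPOLLIN) {
        last_handled = data;
    }
}

/*
 * Register the read end of a pipe, make it readable, then wait once and
 * dispatch through the pointer stored in epoll_event.data.ptr.
 * Returns the number of events dispatched, or -1 on error.
 */
static int run_one_epoll_round(void) {
    static struct handler_info info = { 42, record_handler };
    struct epoll_event epev;
    struct epoll_event events[4];
    int fds[2];
    int dispatched = -1;
    int epfd = epoll_create1(0);

    if (epfd < 0) {
        return -1;
    }
    if (pipe(fds) < 0) {
        close(epfd);
        return -1;
    }

    epev.events = EPOLLIN;
    epev.data.ptr = &info;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &epev) == 0 &&
        write(fds[1], "x", 1) == 1) {
        int n = epoll_wait(epfd, events, 4, 1000);
        for (int i = 0; i < n; i++) {
            struct handler_info *hi = events[i].data.ptr;
            assert(hi != NULL);
            hi->handler(hi->data, events[i].events);
        }
        dispatched = n;
    }

    close(fds[0]);
    close(fds[1]);
    close(epfd);
    return dispatched;
}
```

Storing a pointer to the handler record in `data.ptr` is what lets the loop stay generic: it never needs to know which fd fired, only which handler to call.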

There are too many handlers to cover them all, so we will only analyze a few of the simpler ones.

Implementation of LMK_TARGET

In ctrl_command_handler, LMK_TARGET is handled like this:

    case LMK_TARGET:
        targets = nargs / 2;
        if (nargs & 0x1 || targets > (int)ARRAY_SIZE(lowmem_adj))
            goto wronglen;
        cmd_target(targets, packet);
        break;

The work is done by cmd_target:

static void cmd_target(int ntargets, LMKD_CTRL_PACKET packet) {
    int i;
    struct lmk_target target;
    char minfree_str[PROPERTY_VALUE_MAX];
    char *pstr = minfree_str;
    char *pend = minfree_str + sizeof(minfree_str);
    static struct timespec last_req_tm;
    struct timespec curr_tm;

    if (ntargets < 1 || ntargets > (int)ARRAY_SIZE(lowmem_adj))
        return;

    /*
     * Ratelimit minfree updates to once per TARGET_UPDATE_MIN_INTERVAL_MS
     * to prevent DoS attacks
     */
    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm) != 0) {
        ALOGE("Failed to get current time");
        return;
    }

    if (get_time_diff_ms(&last_req_tm, &curr_tm) <
        TARGET_UPDATE_MIN_INTERVAL_MS) {
        ALOGE("Ignoring frequent updated to lmkd limits");
        return;
    }

    last_req_tm = curr_tm;

    for (i = 0; i < ntargets; i++) {
        lmkd_pack_get_target(packet, i, &target);
        lowmem_minfree[i] = target.minfree;
        lowmem_adj[i] = target.oom_adj_score;

        pstr += snprintf(pstr, pend - pstr, "%d:%d,", target.minfree,
            target.oom_adj_score);
        if (pstr >= pend) {
            /* if no more space in the buffer then terminate the loop */
            pstr = pend;
            break;
        }
    }

    lowmem_targets_size = ntargets;

    /* Override the last extra comma */
    pstr[-1] = '\0';
    property_set("sys.lmk.minfree_levels", minfree_str);

    if (has_inkernel_module) {
        char minfreestr[128];
        char killpriostr[128];

        minfreestr[0] = '\0';
        killpriostr[0] = '\0';

        for (i = 0; i < lowmem_targets_size; i++) {
            char val[40];

            if (i) {
                strlcat(minfreestr, ",", sizeof(minfreestr));
                strlcat(killpriostr, ",", sizeof(killpriostr));
            }

            snprintf(val, sizeof(val), "%d", use_inkernel_interface ? lowmem_minfree[i] : 0);
            strlcat(minfreestr, val, sizeof(minfreestr));
            snprintf(val, sizeof(val), "%d", use_inkernel_interface ? lowmem_adj[i] : 0);
            strlcat(killpriostr, val, sizeof(killpriostr));
        }

        writefilestring(INKERNEL_MINFREE_PATH, minfreestr, true);
        writefilestring(INKERNEL_ADJ_PATH, killpriostr, true);
    }
}

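The logic above boils down to:

  1. Read the targets from the packet in order and store them in lmkd's lowmem_minfree and lowmem_adj arrays.
  2. When the in-kernel module is present, join the values of each array into a comma-separated string.
  3. Write the string built from lowmem_minfree to "/sys/module/lowmemorykiller/parameters/minfree".
  4. Write the string built from lowmem_adj to "/sys/module/lowmemorykiller/parameters/adj".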
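
The string-building step can be sketched without strlcat, which is not available in every C library. This is an illustrative helper, not the lmkd implementation; `join_ints` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Join integers into "a,b,c" inside a caller-provided buffer.
 * Returns the string length on success, or -1 if the buffer is too small.
 */
static int join_ints(const int *vals, size_t count, char *out, size_t out_size) {
    size_t used = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        /* Prefix every element except the first with a comma. */
        int n = snprintf(out + used, out_size - used, "%s%d",
                         i ? "," : "", vals[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

The result has the same shape as the content of the minfree and adj nodes shown at the start of the article.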

LMK_PROCPRIO

In ctrl_command_handler, LMK_PROCPRIO is handled like this:

    case LMK_PROCPRIO:
        /* process type field is optional for backward compatibility */
        if (nargs < 3 || nargs > 4)
            goto wronglen;
        cmd_procprio(packet, nargs, &cred);
        break;

The work is done by cmd_procprio:

static void cmd_procprio(LMKD_CTRL_PACKET packet, int field_count, struct ucred *cred) {
    struct proc *procp;
    char path[LINE_MAX];
    char val[20];
    int soft_limit_mult;
    struct lmk_procprio params;
    bool is_system_server;
    struct passwd *pwdrec;
    int tgid;

    lmkd_pack_get_procprio(packet, field_count, &params);

    if (params.oomadj < OOM_SCORE_ADJ_MIN ||
        params.oomadj > OOM_SCORE_ADJ_MAX) {
        ALOGE("Invalid PROCPRIO oomadj argument %d", params.oomadj);
        return;
    }

    if (params.ptype < PROC_TYPE_FIRST || params.ptype >= PROC_TYPE_COUNT) {
        ALOGE("Invalid PROCPRIO process type argument %d", params.ptype);
        return;
    }

    /* Check if registered process is a thread group leader */
    tgid = proc_get_tgid(params.pid);
    if (tgid >= 0 && tgid != params.pid) {
        ALOGE("Attempt to register a task that is not a thread group leader (tid %d, tgid %d)",
            params.pid, tgid);
        return;
    }

    /* gid containing AID_READPROC required */
    /* CAP_SYS_RESOURCE required */
    /* CAP_DAC_OVERRIDE required */
    snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", params.pid);
    snprintf(val, sizeof(val), "%d", params.oomadj);
    if (!writefilestring(path, val, false)) {
        ALOGW("Failed to open %s; errno=%d: process %d might have been killed",
              path, errno, params.pid);
        /* If this file does not exist the process is dead. */
        return;
    }

    if (use_inkernel_interface) {
        stats_store_taskname(params.pid, proc_get_name(params.pid, path, sizeof(path)));
        return;
    }

    /* lmkd should not change soft limits for services */
    if (params.ptype == PROC_TYPE_APP && per_app_memcg) {
        if (params.oomadj >= 900) {
            soft_limit_mult = 0;
        } else if (params.oomadj >= 800) {
            soft_limit_mult = 0;
        } else if (params.oomadj >= 700) {
            soft_limit_mult = 0;
        } else if (params.oomadj >= 600) {
            // Launcher should be perceptible, don't kill it.
            params.oomadj = 200;
            soft_limit_mult = 1;
        } else if (params.oomadj >= 500) {
            soft_limit_mult = 0;
        } else if (params.oomadj >= 400) {
            soft_limit_mult = 0;
        } else if (params.oomadj >= 300) {
            soft_limit_mult = 1;
        } else if (params.oomadj >= 200) {
            soft_limit_mult = 8;
        } else if (params.oomadj >= 100) {
            soft_limit_mult = 10;
        } else if (params.oomadj >=   0) {
            soft_limit_mult = 20;
        } else {
            // Persistent processes will have a large
            // soft limit 512MB.
            soft_limit_mult = 64;
        }

        snprintf(path, sizeof(path), MEMCG_SYSFS_PATH
                 "apps/uid_%d/pid_%d/memory.soft_limit_in_bytes",
                 params.uid, params.pid);
        snprintf(val, sizeof(val), "%d", soft_limit_mult * EIGHT_MEGA);

        /*
         * system_server process has no memcg under /dev/memcg/apps but should be
         * registered with lmkd. This is the best way so far to identify it.
         */
        is_system_server = (params.oomadj == SYSTEM_ADJ &&
                            (pwdrec = getpwnam("system")) != NULL &&
                            params.uid == pwdrec->pw_uid);
        writefilestring(path, val, !is_system_server);
    }

    /* ... the rest of the function is not included in this excerpt ... */
}

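In summary:

  1. The main job of LMK_PROCPRIO is to update a process's oom_score_adj.
  2. It writes the data passed down from the framework (the PID and its priority) to the process's file node, /proc/<pid>/oom_score_adj.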
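
The two steps can be sketched as a pair of helpers: one formats the node path, the other writes a string to a file. Both names are hypothetical, and the write helper uses stdio rather than lmkd's own writefilestring. Lowering another process's oom_score_adj generally requires privileges, so trying the write on a real /proc node may fail on an ordinary machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Build "/proc/<pid>/oom_score_adj" in out.
 * Returns the path length, or -1 if the buffer is too small.
 */
static int format_oom_score_adj_path(int pid, char *out, size_t out_size) {
    int n;

    assert(out != NULL);
    n = snprintf(out, out_size, "/proc/%d/oom_score_adj", pid);
    return (n > 0 && (size_t)n < out_size) ? n : -1;
}

/* Write val to path; returns false if the file cannot be opened or written. */
static bool write_file_string(const char *path, const char *val) {
    FILE *f = fopen(path, "w");
    bool ok;

    if (f == NULL) {
        return false;  /* for a /proc node this usually means the process is gone */
    }
    ok = fputs(val, f) >= 0;
    if (fclose(f) != 0) {
        ok = false;
    }
    return ok;
}
```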

Killing a process

/*
 * Find one process to kill at or above the given oom_adj level.
 * Returns size of the killed process.
 */
static int find_and_kill_process(int min_score_adj, int kill_reason, const char *kill_desc,
                                 union meminfo *mi, struct timespec *tm) {
    int i;
    int killed_size = 0;
    bool lmk_state_change_start = false;

    for (i = OOM_SCORE_ADJ_MAX; i >= min_score_adj; i--) {
        struct proc *procp;

        while (true) {
            procp = kill_heaviest_task ?
                proc_get_heaviest(i) : proc_adj_lru(i);

            if (!procp)
                break;

            killed_size = kill_one_process(procp, min_score_adj, kill_reason, kill_desc, mi, tm);
            if (killed_size >= 0) {
                if (!lmk_state_change_start) {
                    lmk_state_change_start = true;
                    stats_write_lmk_state_changed(
                            android::lmkd::stats::LMK_STATE_CHANGED__STATE__START);
                }
                break;
            }
        }
        if (killed_size) {
            break;
        }
    }

    if (lmk_state_change_start) {
        stats_write_lmk_state_changed(android::lmkd::stats::LMK_STATE_CHANGED__STATE__STOP);
    }

    return killed_size;
}

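The key part is the for loop: it walks the priority levels from OOM_SCORE_ADJ_MAX down to min_score_adj (the adj level chosen from the minfree/adj thresholds) and, as soon as it finds a process at one of those levels, tries to kill it.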
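
The selection order can be illustrated with a flat array in place of lmkd's per-adj linked lists. This is a simplified model under that assumption: `struct proc_sketch` and `pick_victim` are invented names, and the tie-break by largest size only mirrors the idea behind kill_heaviest_task.

```c
#include <assert.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MAX 1000

/* Minimal stand-in for a registered process record. */
struct proc_sketch {
    int pid;
    int oomadj;
    long rss_pages;
};

/*
 * Scan adj levels from the highest down to min_score_adj and return the index
 * of the first match, preferring the largest process within a level.
 * Returns -1 when no process is at or above min_score_adj.
 */
static int pick_victim(const struct proc_sketch *procs, size_t count,
                       int min_score_adj) {
    assert(procs != NULL || count == 0);
    for (int adj = OOM_SCORE_ADJ_MAX; adj >= min_score_adj; adj--) {
        int best = -1;
        for (size_t i = 0; i < count; i++) {
            if (procs[i].oomadj != adj) {
                continue;
            }
            if (best < 0 || procs[i].rss_pages > procs[best].rss_pages) {
                best = (int)i;
            }
        }
        if (best >= 0) {
            return best;
        }
    }
    return -1;
}
```

Because the scan starts at the top of the range, cached and background processes are always considered before anything closer to the foreground.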

/* Kill one process specified by procp.  Returns the size of the process killed */
static int kill_one_process(struct proc* procp, int min_oom_score, int kill_reason,
                            const char *kill_desc, union meminfo *mi, struct timespec *tm) {
    int pid = procp->pid;
    int pidfd = procp->pidfd;
    uid_t uid = procp->uid;
    int tgid;
    char *taskname;
    int tasksize;
    int r;
    int result = -1;
    struct memory_stat *mem_st;
    char buf[LINE_MAX];

    tgid = proc_get_tgid(pid);
    if (tgid >= 0 && tgid != pid) {
        ALOGE("Possible pid reuse detected (pid %d, tgid %d)!", pid, tgid);
        goto out;
    }

    taskname = proc_get_name(pid, buf, sizeof(buf));
    if (!taskname) {
        goto out;
    }

    tasksize = proc_get_size(pid);
    if (tasksize <= 0) {
        goto out;
    }

    mem_st = stats_read_memory_stat(per_app_memcg, pid, uid);

    TRACE_KILL_START(pid);

    /* CAP_KILL required */
    if (pidfd < 0) {
        start_wait_for_proc_kill(pid);
        r = kill(pid, SIGKILL);
    } else {
        start_wait_for_proc_kill(pidfd);
        r = sys_pidfd_send_signal(pidfd, SIGKILL, NULL, 0);
    }

    TRACE_KILL_END();

    if (r) {
        stop_wait_for_proc_kill(false);
        ALOGE("kill(%d): errno=%d", pid, errno);
        /* Delete process record even when we fail to kill so that we don't get stuck on it */
        goto out;
    }

    set_process_group_and_prio(pid, SP_FOREGROUND, ANDROID_PRIORITY_HIGHEST);

    last_kill_tm = *tm;

    inc_killcnt(procp->oomadj);

    killinfo_log(procp, min_oom_score, tasksize, kill_reason, mi);

    if (kill_desc) {
        ALOGI("Kill '%s' (%d), uid %d, oom_adj %d to free %ldkB; reason: %s", taskname, pid,
              uid, procp->oomadj, tasksize * page_k, kill_desc);
    } else {
        ALOGI("Kill '%s' (%d), uid %d, oom_adj %d to free %ldkB", taskname, pid,
              uid, procp->oomadj, tasksize * page_k);
    }

    stats_write_lmk_kill_occurred(uid, taskname, procp->oomadj, min_oom_score, tasksize, mem_st);

    ctrl_data_write_lmk_kill_occurred((pid_t)pid, uid);

    result = tasksize;

out:
    /*
     * WARNING: After pid_remove() procp is freed and can't be used!
     * Therefore placed at the end of the function.
     */
    pid_remove(pid);
    return result;
}

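In the end, lmkd kills the process with kill (or sys_pidfd_send_signal when a pidfd is available) and removes its record with pid_remove.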

That is how lmkd is implemented.
