A Brief Analysis of the LowMemoryKiller Mechanism

  • LowMemoryKiller overview

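LowMemoryKiller is Android's process-management facility, customized from the Linux OOM killer. By managing processes it keeps the Android system running smoothly and avoids system failures caused by running out of memory.

All application processes are forked from zygote and recorded in the mLruProcesses list in AMS (ActivityManagerService), which manages them centrally. AMS updates each process's oom_adj value according to the process state, and that value is passed down to the kernel through a file. The kernel has a low-memory reclaim mechanism: once memory falls to a certain threshold, it kills processes with high oom_adj values to free more memory.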

  • LowMemoryKiller kill flow
  • Framework layer

ProcessList.java defines three command types. Their definitions must match those in lmkd.c exactly. The formats are:

LMK_TARGET <minfree> <minkillprio> ... (up to 6 pairs)

LMK_PROCPRIO <pid> <prio>

LMK_PROCREMOVE <pid>

 

Each of the three commands is issued through a method in ProcessList.java (PL):

Command          Function                     Method
LMK_PROCPRIO     Set a process's adj          PL.setOomAdj()
LMK_TARGET       Update the oom_adj levels    PL.updateOomLevels()
LMK_PROCREMOVE   Remove a process             PL.remove()

AMS.applyOomAdjLocked() sets the adj of a given process.

AMS.updateConfiguration() updates the oom_adj information for every level.

AMS.cleanUpApplicationRecordLocked() and handleAppDiedLocked() remove a process from lmkd's policy.
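The command formats listed above can be sketched as a small packing routine. This is a minimal illustration, not the framework's code: the numeric command codes (0, 1, 2) and the wire encoding (32-bit integers in big-endian order) are assumptions that the text does not state, and the helper names put_be32 and pack_procprio are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed command codes; the article only names the commands. */
enum lmk_cmd { LMK_TARGET = 0, LMK_PROCPRIO = 1, LMK_PROCREMOVE = 2 };

/* Store one 32-bit value in big-endian byte order (assumed wire format). */
static void put_be32(uint8_t *dst, int32_t value)
{
    uint32_t v = (uint32_t)value;
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/* Pack "LMK_PROCPRIO <pid> <prio>"; returns the number of bytes written. */
static size_t pack_procprio(uint8_t *buf, size_t cap, int32_t pid, int32_t prio)
{
    assert(buf != NULL && cap >= 12);
    put_be32(buf + 0, LMK_PROCPRIO);
    put_be32(buf + 4, pid);
    put_be32(buf + 8, prio);
    return 12;
}

/* Pack "LMK_PROCREMOVE <pid>"; returns the number of bytes written. */
static size_t pack_procremove(uint8_t *buf, size_t cap, int32_t pid)
{
    assert(buf != NULL && cap >= 8);
    put_be32(buf + 0, LMK_PROCREMOVE);
    put_be32(buf + 4, pid);
    return 8;
}
```

A caller would fill a 12-byte buffer with pack_procprio(buf, sizeof buf, pid, prio) and write it to the lmkd socket in a single send.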

  • lmkd

lmkd is a daemon that the init process starts while parsing init.rc. lmkd creates a socket named lmkd, whose node is /dev/socket/lmkd; the framework layer above uses this socket to talk to lmkd.

  • Kernel layer

The lowmemorykiller driver lives in drivers/staging/android/lowmemorykiller.c.

  1. lowmemorykiller initialization

static struct shrinker lowmem_shrinker = {
    .scan_objects = lowmem_scan,
    .count_objects = lowmem_count,
    .seeks = DEFAULT_SEEKS * 16
};

static int __init lowmem_init(void)
{
    register_shrinker(&lowmem_shrinker);
    return 0;
}

static void __exit lowmem_exit(void)
{
    unregister_shrinker(&lowmem_shrinker);
}

module_init(lowmem_init);
module_exit(lowmem_exit);

 

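The driver calls register_shrinker at initialization and unregister_shrinker at exit.

  2. shrinker

The LMK driver is implemented by registering a shrinker. The shrinker is the Linux kernel's standard mechanism for reclaiming memory pages, and the kernel thread kswapd is responsible for monitoring it.

When memory runs low, kswapd walks a list of shrinkers and calls back each registered shrinker function to reclaim pages; kswapd also wakes up periodically to do memory work. Each zone maintains an active_list and an inactive_list, the kernel moves pages between the two according to how active they are, and pages are finally reclaimed through shrink_slab and shrink_zone. Readers who want more detail on Linux memory reclaim can study it on their own; the rest of this section returns to LowMemoryKiller.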
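The callback pattern described here can be modeled in user space. The sketch below is a toy stand-in, not kernel code: struct toy_shrinker, toy_register_shrinker and toy_shrink_all are hypothetical names, and the real kernel interface passes a struct shrink_control rather than a bare count.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model of a shrinker: one "count" and one "scan" callback. */
struct toy_shrinker {
    unsigned long (*count_objects)(void *ctx);
    unsigned long (*scan_objects)(void *ctx, unsigned long nr_to_scan);
    void *ctx;
    struct toy_shrinker *next;
};

static struct toy_shrinker *shrinker_list;

/* Add a shrinker to the head of the global list. */
static void toy_register_shrinker(struct toy_shrinker *s)
{
    assert(s != NULL && s->count_objects != NULL && s->scan_objects != NULL);
    s->next = shrinker_list;
    shrinker_list = s;
}

/* Walk every registered shrinker, as reclaim does under memory pressure. */
static unsigned long toy_shrink_all(unsigned long nr_to_scan)
{
    unsigned long freed = 0;
    struct toy_shrinker *s;

    for (s = shrinker_list; s != NULL; s = s->next) {
        if (s->count_objects(s->ctx) == 0)
            continue;               /* nothing reclaimable in this cache */
        freed += s->scan_objects(s->ctx, nr_to_scan);
    }
    return freed;
}

/* Example cache: ctx points at a counter of reclaimable objects. */
static unsigned long cache_count(void *ctx)
{
    return *(unsigned long *)ctx;
}

static unsigned long cache_scan(void *ctx, unsigned long nr_to_scan)
{
    unsigned long *objs = ctx;
    unsigned long n = nr_to_scan < *objs ? nr_to_scan : *objs;

    *objs -= n;                     /* "free" up to nr_to_scan objects */
    return n;
}
```

In this model the LMK driver corresponds to a shrinker whose scan callback frees memory by killing a process instead of dropping cached objects.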

  3. lowmem_count

static unsigned long lowmem_count(struct shrinker *s,
                  struct shrink_control *sc)
{
    return global_page_state(NR_ACTIVE_ANON) +
        global_page_state(NR_ACTIVE_FILE) +
        global_page_state(NR_INACTIVE_ANON) +
        global_page_state(NR_INACTIVE_FILE);
}

 

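ANON denotes anonymous mappings, which have no backing store; FILE denotes file mappings. The count is computed as: active anonymous pages + active file pages + inactive anonymous pages + inactive file pages.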

  4. lowmem_scan

static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
{
    struct task_struct *tsk;
    struct task_struct *selected = NULL;
    unsigned long rem = 0;
    int tasksize;
    int i;
    int ret = 0;
    short min_score_adj = OOM_SCORE_ADJ_MAX + 1;
    int minfree = 0;
    int selected_tasksize = 0;
    short selected_oom_score_adj;
    int array_size = ARRAY_SIZE(lowmem_adj);
    int other_free;
    int other_file;

    /* take the scan mutex; give up if interrupted while waiting */
    if (mutex_lock_interruptible(&scan_mutex) < 0)
        return 0;

    /* number of free pages */
    other_free = global_page_state(NR_FREE_PAGES);

    /* file pages (plus zcache), excluding shmem and swap cache */
    if (global_page_state(NR_SHMEM) + total_swapcache_pages() <
        global_page_state(NR_FILE_PAGES) + zcache_pages())
        other_file = global_page_state(NR_FILE_PAGES) + zcache_pages() -
                     global_page_state(NR_SHMEM) -
                     total_swapcache_pages();
    else
        other_file = 0;

    tune_lmk_param(&other_free, &other_file, sc);

    /* take min_score_adj from the first minfree level that both counters fall below */
    if (lowmem_adj_size < array_size)
        array_size = lowmem_adj_size;
    if (lowmem_minfree_size < array_size)
        array_size = lowmem_minfree_size;
    for (i = 0; i < array_size; i++) {
        minfree = lowmem_minfree[i];
        if (other_free < minfree && other_file < minfree) {
            min_score_adj = lowmem_adj[i];
            break;
        }
    }

    ret = adjust_minadj(&min_score_adj);

    lowmem_print(3, "lowmem_scan %lu, %x, ofree %d %d, ma %hd\n",
                 sc->nr_to_scan, sc->gfp_mask, other_free,
                 other_file, min_score_adj);

    /* no level was crossed: nothing to kill */
    if (min_score_adj == OOM_SCORE_ADJ_MAX + 1) {
        trace_almk_shrink(0, ret, other_free, other_file, 0);
        lowmem_print(5, "lowmem_scan %lu, %x, return 0\n",
                     sc->nr_to_scan, sc->gfp_mask);
        mutex_unlock(&scan_mutex);
        return 0;
    }

    selected_oom_score_adj = min_score_adj;

    rcu_read_lock();
    for_each_process(tsk) {
        struct task_struct *p;
        short oom_score_adj;

        /* skip kernel threads */
        if (tsk->flags & PF_KTHREAD)
            continue;

        /* if task no longer has any memory ignore it */
        if (test_task_flag(tsk, TIF_MM_RELEASED))
            continue;

        /* a previously killed task is still dying: wait instead of killing again */
        if (time_before_eq(jiffies, lowmem_deathpending_timeout)) {
            if (test_task_flag(tsk, TIF_MEMDIE)) {
                rcu_read_unlock();
                /* give the system time to free up the memory */
                msleep_interruptible(20);
                mutex_unlock(&scan_mutex);
                return 0;
            }
        }

        p = find_lock_task_mm(tsk);
        if (!p)
            continue;

        /* ignore tasks below the threshold */
        oom_score_adj = p->signal->oom_score_adj;
        if (oom_score_adj < min_score_adj) {
            task_unlock(p);
            continue;
        }
        tasksize = get_mm_rss(p->mm);
        task_unlock(p);
        if (tasksize <= 0)
            continue;
        /* prefer the highest oom_score_adj, then the largest RSS */
        if (selected) {
            if (oom_score_adj < selected_oom_score_adj)
                continue;
            if (oom_score_adj == selected_oom_score_adj &&
                tasksize <= selected_tasksize)
                continue;
        }
        selected = p;
        selected_tasksize = tasksize;
        selected_oom_score_adj = oom_score_adj;
        lowmem_print(3, "select '%s' (%d), adj %hd, size %d, to kill\n",
                     p->comm, p->pid, oom_score_adj, tasksize);
    }
    if (selected) {
        long cache_size = other_file * (long)(PAGE_SIZE / 1024);
        long cache_limit = minfree * (long)(PAGE_SIZE / 1024);
        long free = other_free * (long)(PAGE_SIZE / 1024);
        trace_lowmemory_kill(selected, cache_size, cache_limit, free);
        lowmem_print(1, "Killing '%s' (%d), adj %hd,\n" \
                        "   to free %ldkB on behalf of '%s' (%d) because\n" \
                        "   cache %ldkB is below limit %ldkB for oom_score_adj %hd\n" \
                        "   Free memory is %ldkB above reserved.\n" \
                        "   Free CMA is %ldkB\n" \
                        "   Total reserve is %ldkB\n" \
                        "   Total free pages is %ldkB\n" \
                        "   Total file cache is %ldkB\n" \
                        "   Total zcache is %ldkB\n" \
                        "   GFP mask is 0x%x\n",
                     selected->comm, selected->pid,
                     selected_oom_score_adj,
                     selected_tasksize * (long)(PAGE_SIZE / 1024),
                     current->comm, current->pid,
                     cache_size, cache_limit,
                     min_score_adj,
                     other_free * (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_FREE_CMA_PAGES) *
                        (long)(PAGE_SIZE / 1024),
                     totalreserve_pages * (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_FREE_PAGES) *
                        (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_FILE_PAGES) *
                        (long)(PAGE_SIZE / 1024),
                     (long)zcache_pages() * (long)(PAGE_SIZE / 1024),
                     sc->gfp_mask);

        if (lowmem_debug_level >= 2 && selected_oom_score_adj == 0) {
            show_mem(SHOW_MEM_FILTER_NODES);
            dump_tasks(NULL, NULL);
        }

        /* mark the victim as dying and send SIGKILL */
        lowmem_deathpending_timeout = jiffies + HZ;
        set_tsk_thread_flag(selected, TIF_MEMDIE);
        send_sig(SIGKILL, selected, 0);
        rem += selected_tasksize;
        rcu_read_unlock();
        /* give the system time to free up the memory */
        msleep_interruptible(20);
        trace_almk_shrink(selected_tasksize, ret,
                          other_free, other_file, selected_oom_score_adj);
    } else {
        trace_almk_shrink(1, ret, other_free, other_file, 0);
        rcu_read_unlock();
    }

    lowmem_print(4, "lowmem_scan %lu, %x, return %lu\n",
                 sc->nr_to_scan, sc->gfp_mask, rem);
    mutex_unlock(&scan_mutex);
    return rem;
}

 

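  • Summary

The flow starts in the framework, where ProcessList.java adjusts adj and sends the event over a socket to the native daemon lmkd. lmkd then acts on the specific command; its main jobs are to update a process's oom_score_adj value and the lowmemorykiller driver's parameters (minfree and adj).

The lowmemorykiller driver registers a shrinker and so hooks into Linux's standard memory reclaim. From the system's currently available memory and the configured parameters (adj, minfree) it derives a threshold, min_score_adj. Among all processes whose oom_score_adj is at or above that threshold, it picks the one with the highest oom_score_adj, breaking ties by the largest RSS, and kills it to free memory.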
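The victim-selection loop in lowmem_scan can be isolated as a plain function over an array. This is a user-space sketch under stated assumptions: struct proc_info and select_victim are hypothetical names, and the array stands in for the kernel's for_each_process walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of the fields lowmem_scan looks at. */
struct proc_info {
    int pid;
    short oom_score_adj;
    int rss_pages;
};

/*
 * Return the index of the process to kill, or -1 if none qualifies.
 * Mirrors the comparisons in lowmem_scan: ignore anything below
 * min_score_adj, prefer the highest oom_score_adj, and break ties by
 * the largest RSS.
 */
static int select_victim(const struct proc_info *procs, size_t n,
                         short min_score_adj)
{
    int selected = -1;
    size_t i;

    assert(procs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const struct proc_info *p = &procs[i];

        if (p->oom_score_adj < min_score_adj)
            continue;               /* below the threshold */
        if (p->rss_pages <= 0)
            continue;               /* killing it would free nothing */
        if (selected >= 0) {
            const struct proc_info *s = &procs[selected];

            if (p->oom_score_adj < s->oom_score_adj)
                continue;
            if (p->oom_score_adj == s->oom_score_adj &&
                p->rss_pages <= s->rss_pages)
                continue;
        }
        selected = (int)i;
    }
    return selected;
}
```

Note that a large process with a low oom_score_adj loses to a small one with a higher oom_score_adj: RSS only matters among processes with equal scores.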

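  • LowMemoryKiller parameter settings
  1. lmkd parameters

oom_adj: the process priority. The larger the value, the lower the priority and the more likely the process is to be killed. Range: [-16, 15].

oom_score_adj: range [-1000, 1000].

oom_score: does not appear to be used anywhere in the LMK policy; it is probably used only by the OOM killer.

To see these three values for a process, you only need its pid. Read the following nodes: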

/proc/<pid>/oom_adj

/proc/<pid>/oom_score_adj

/proc/<pid>/oom_score
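Reading one of these nodes from a program is a plain file read. The sketch below is one possible way to do it on Linux; read_oom_score_adj is a hypothetical helper name, and the function simply reports failure if the node cannot be opened or parsed.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read /proc/<pid>/oom_score_adj into *out.
 * Returns 0 on success, -1 if the node is missing or unreadable.
 */
static int read_oom_score_adj(pid_t pid, int *out)
{
    char path[64];
    FILE *f;
    int ok;

    assert(out != NULL);
    snprintf(path, sizeof path, "/proc/%ld/oom_score_adj", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Convenience wrapper for the calling process. */
static int read_own_oom_score_adj(int *out)
{
    return read_oom_score_adj(getpid(), out);
}
```

The same pattern works for oom_adj and oom_score by changing the file name in the path.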

 

oom_adj is mapped to oom_score_adj by lowmem_oom_adj_to_oom_score_adj():

If oom_adj = 15, then oom_score_adj = 1000.

If oom_adj < 15, then oom_score_adj = oom_adj * 1000 / 17 (integer division).

For example:

oom_adj:       0, 1, 2, 3, 9, 12, 15

oom_score_adj: 0, 58, 117, 176, 529, 705, 1000
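The mapping rule can be written out directly. This is a sketch of the two cases given above, not a copy of the driver's function: the macro names and oom_adj_to_oom_score_adj are illustrative, and the division is C integer division, which truncates toward zero.

```c
#include <assert.h>

#define ADJ_MAX        15    /* largest oom_adj */
#define ADJ_DIVISOR    17    /* divisor from the rule above */
#define SCORE_ADJ_MAX  1000  /* largest oom_score_adj */

/* Map an oom_adj value to the oom_score_adj scale. */
static short oom_adj_to_oom_score_adj(short oom_adj)
{
    assert(oom_adj >= -ADJ_DIVISOR && oom_adj <= ADJ_MAX);
    if (oom_adj == ADJ_MAX)
        return SCORE_ADJ_MAX;       /* special case: 15 maps to 1000 */
    return (short)((oom_adj * SCORE_ADJ_MAX) / ADJ_DIVISOR);
}
```

The special case for 15 exists because 15 * 1000 / 17 would otherwise give 882 rather than the maximum score.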

  2. driver parameters

/sys/module/lowmemorykiller/parameters/minfree (in pages)

/sys/module/lowmemorykiller/parameters/adj (oom_score_adj values)

Example:

Parameter settings:

Write 1,6 to /sys/module/lowmemorykiller/parameters/adj

Write 1024,8192 to /sys/module/lowmemorykiller/parameters/minfree

How to read the policy:

When available memory falls below 8192 pages, processes with oom_score_adj >= 6 are killed.

When available memory falls below 1024 pages, processes with oom_score_adj >= 1 are killed.
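The policy reading above follows from the threshold loop in lowmem_scan, which can be sketched as a standalone function. The names pick_min_score_adj and NO_KILL_ADJ are hypothetical; the comparison (both the free-page count and the file-page count must be below a minfree level) is taken from the driver code shown earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel meaning "no level crossed" (the driver uses OOM_SCORE_ADJ_MAX + 1). */
#define NO_KILL_ADJ 1001

/*
 * Return the adj threshold for the first minfree level that both
 * other_free and other_file fall below, or NO_KILL_ADJ if none does.
 */
static short pick_min_score_adj(const int *minfree, const short *adj, size_t n,
                                int other_free, int other_file)
{
    size_t i;

    assert(minfree != NULL && adj != NULL);
    for (i = 0; i < n; i++) {
        if (other_free < minfree[i] && other_file < minfree[i])
            return adj[i];
    }
    return NO_KILL_ADJ;
}
```

With minfree = {1024, 8192} and adj = {1, 6}, a system with 4000 free pages and 5000 file pages crosses only the 8192 level, so the threshold is 6. If either counter stays above 8192, no level is crossed and nothing is killed.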

 
