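- Background: during a software upgrade, the device ran out of RAM. For example, with about 200 MB of RAM in total (free -m shows the total), once free memory drops to roughly 5 MB (the threshold is set through /proc/sys/vm/min_free_kbytes), any running process that requests memory can invoke the Linux oom-killer. dmesg then prints information about the processes, the kernel's scoring mechanism assigns each process a score, and the process with the highest score is killed to free RAM.
- When a process is killed because of OOM, dmesg prints output like the following. The OOM sequence is: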
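1. OOM trigger conditions:
- The system cannot allocate physical memory for an application.
- An OOM kill reflects the case where a user-space program's memory request succeeds but cannot be satisfied at run time: malloc succeeds (malloc does not operate on physical memory directly), but physical memory turns out to be insufficient when the memory is actually used. An allocation first reserves virtual memory, which is mapped to physical memory only when it is used, so malloc can succeed and the process can still be killed when it touches the memory.
- Out of memory is triggered:
    <4>[ 551.701040] test invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
    <4>[ 551.701062] CPU: 0 PID: 1691 Comm: test Tainted: G O 3.18.48 #1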
2. The call stack is printed
    <4>[ 551.701095] [<c0013ca0>] (unwind_backtrace) from [<c0011d0c>] (show_stack+0x10/0x14)
    <4>[ 551.701114] [<c0011d0c>] (show_stack) from [<c079126c>] (dump_header+0x7c/0xc0)
    <4>[ 551.701132] [<c079126c>] (dump_header) from [<c009ba10>] (oom_kill_process+0xa8/0x450)
    <4>[ 551.701148] [<c009ba10>] (oom_kill_process) from [<c009c23c>] (out_of_memory+0x280/0x30c)
    <4>[ 551.701163] [<c009c23c>] (out_of_memory) from [<c00a03b8>] (__alloc_pages_nodemask+0x890/0x90c)
    <4>[ 551.701178] [<c00a03b8>] (__alloc_pages_nodemask) from [<c009a88c>] (filemap_fault+0x280/0x4b4)
    <4>[ 551.701194] [<c009a88c>] (filemap_fault) from [<c00bc868>] (__do_fault+0x34/0x90)
    <4>[ 551.701208] [<c00bc868>] (__do_fault) from [<c00bf418>] (do_read_fault+0x19c/0x288)
    <4>[ 551.701222] [<c00bf418>] (do_read_fault) from [<c00bfc1c>] (handle_mm_fault+0x3d8/0x898)
    <4>[ 551.701237] [<c00bfc1c>] (handle_mm_fault) from [<c0016a38>] (do_page_fault+0x11c/0x378)
    <4>[ 551.701252] [<c0016a38>] (do_page_fault) from [<c0008618>] (do_PrefetchAbort+0x34/0x128)
    <4>[ 551.701265] [<c0008618>] (do_PrefetchAbort) from [<c0012d48>] (ret_from_exception+0x0/0x18)
    <4>[ 551.701274] Exception stack(0xcb76dfb0 to 0xcb76dff8)
    <4>[ 551.701285] dfa0: b5dfde1c b5dfd60c 000004a0 00000002
    <4>[ 551.701298] dfc0: b5dfde20 b5dfde1c 000004a0 b5dfd60c 00000000 000306d8 b5dfd554 000004a0
    <4>[ 551.701309] dfe0: 00016f44 b5dfd470 00011c04 4cb64f00 80070010 ffffffff
3. Memory statistics and the current parameters of each process are printed (some processes omitted)
    <4>[ 551.701317] Mem-info:
    <4>[ 551.701325] Normal per-cpu:
    <4>[ 551.701333] CPU 0: hi: 42, btch: 7 usd: 22
    <4>[ 551.701349] active_anon:27711 inactive_anon:32 isolated_anon:0
    <4>[ 551.701349] active_file:82 inactive_file:136 isolated_file:0
    <4>[ 551.701349] unevictable:0 dirty:0 writeback:0 unstable:0
    <4>[ 551.701349] free:393 slab_reclaimable:1326 slab_unreclaimable:3406
    <4>[ 551.701349] mapped:70 shmem:469 pagetables:773 bounce:0
    <4>[ 551.701349] free_cma:0
    <4>[ 551.701381] Normal free:1572kB min:1572kB low:1964kB high:2356kB active_anon:110844kB inactive_anon:128kB active_file:328kB inactive_file:544kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:256000kB managed:155528kB mlocked:0kB dirty:0kB writeback:0kB mapped:280kB shmem:1876kB slab_reclaimable:5304kB slab_unreclaimable:13624kB kernel_stack:4056kB pagetables:3092kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:5244 all_unreclaimable? yes
    <4>[ 551.701389] lowmem_reserve[]: 0 0
    <4>[ 551.701399] Normal: 13*4kB (UER) 8*8kB (UMR) 1*16kB (R) 1*32kB (R) 0*64kB 1*128kB (R) 1*256kB (R) 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 1572kB
    <4>[ 551.701431] 695 total pagecache pages
    <4>[ 551.701440] 0 pages in swap cache
    <4>[ 551.701448] Swap cache stats: add 0, delete 0, find 0/0
    <4>[ 551.701455] Free swap = 0kB
    <4>[ 551.701461] Total swap = 0kB
    <4>[ 551.703198] 42752 pages of RAM
    <4>[ 551.703208] 1124 free pages
    <4>[ 551.703215] 3870 reserved pages
    <4>[ 551.703221] 3594 slab pages
    <4>[ 551.703228] 209 pages shared
    <4>[ 551.703234] 0 pages swap cached
    <6>[ 551.703243] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
    ... ... ...
    <6>[ 551.704003] [20364] 0 20364 738 30 4 0 0 hello
    <6>[ 551.704015] [20365] 0 20365 19363 15108 41 0 0 UpdateAPP
Scoring mechanism:
    <3>[ 551.704024] Out of memory: Kill process 20365 (UpdateAPP) score 377 or sacrifice child
    <3>[ 551.704036] Killed process 20365 (UpdateAPP) total-vm:77452kB, anon-rss:60412kB, file-rss:20kB
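Parameter explanation
- total-vm: the total virtual memory used by the process.
- anon-rss: anonymous memory, for example memory obtained with malloc; the amount of RAM currently allocated to the process for such memory.
- file-rss: resident memory pages that are mapped to devices and files.
rss (Resident Set Size): the amount of physical memory the process actually occupies right now (including all memory occupied by shared libraries).
rss = 15108 pages * 4 KB per page = 60432 kB = anon-rss: 60412 kB + file-rss: 20 kB (see the scoring mechanism)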
4. Scoring mechanism:
The kernel scores every process. When the system runs short of memory, it picks the process with the highest score and kills it. (Here UpdateAPP has the highest score, so it is the one killed.)
    <3>[ 551.704024] Out of memory: Kill process 20365 (UpdateAPP) score 377 or sacrifice child
    <3>[ 551.704036] Killed process 20365 (UpdateAPP) total-vm:77452kB, anon-rss:60412kB, file-rss:20kB
..
- Source of the Linux out_of_memory() function
- OOM source: when the OOM mechanism is triggered, the kernel enters __out_of_memory()
    static void __out_of_memory(struct zonelist *zonelist, gfp_t gfp_mask,
                                int order, nodemask_t *nodemask, bool force_kill)
    {
        ...
        /* check whether panic_on_oom is set; if so, panic instead of killing */
        check_panic_on_oom(constraint, gfp_mask, order, mpol_mask);
        ...
        /* select a process */
        p = select_bad_process(&points, totalpages, mpol_mask, force_kill);
        /* Found nothing?!?! Either we hang forever, or we panic. */
        if (!p) {
            dump_header(NULL, gfp_mask, order, NULL, mpol_mask);
            panic("Out of memory and no killable processes...\n");
        }
        if (p != (void *)-1UL) {
            /* kill the selected process */
            oom_kill_process(p, gfp_mask, order, points, totalpages, NULL,
                             nodemask, "Out of memory");
            killed = 1;
        }
        ...
    }
- Checking whether panic-on-OOM is enabled
    void check_panic_on_oom(enum oom_constraint constraint, gfp_t gfp_mask,
                            int order, const nodemask_t *nodemask)
    {
        if (likely(!sysctl_panic_on_oom))
            return;
        if (sysctl_panic_on_oom != 2) {
            if (constraint != CONSTRAINT_NONE)
                return;
        }
        dump_header(NULL, gfp_mask, order, NULL, nodemask);
        panic("Out of memory: %s panic_on_oom is enabled\n",
              sysctl_panic_on_oom == 2 ? "compulsory" : "system-wide");
    }
The kernel variable sysctl_panic_on_oom corresponds to /proc/sys/vm/panic_on_oom, which can be read with the following command:
$ cat /proc/sys/vm/panic_on_oom
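The related parameters are as follows:
- oom_kill_allocating_task configures which process the OOM killer targets
(1) Kill whichever process triggered the OOM (oom_kill_allocating_task is non-zero)
(2) Kill whichever process is the "worst" (oom_kill_allocating_task is 0, the default)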
- Scoring mechanism
    select_bad_process() ---> calls oom_badness()
    {
        dump_header(p, gfp_mask, order, memcg, nodemask);   /* print the call stack */
        ...
        /* if the process's score adjustment was set to OOM_SCORE_ADJ_MIN
         * before scoring, the process is never killed */
        if (adj == OOM_SCORE_ADJ_MIN) {
            task_unlock(p);
            return 0;
        }
        ...
    }
- Before changing the value, check the current one with cat /proc/<pid>/oom_score_adj
- echo -1000 > /proc/<pid>/oom_score_adj prevents the process from being killed by the OOM killer
- Displaying memory
    void show_mem(unsigned int filter)
    {
        int free = 0, total = 0, reserved = 0;
        int shared = 0, cached = 0, slab = 0;
        struct memblock_region *reg;

        printk("Mem-info:\n");
        show_free_areas(filter);

        for_each_memblock (memory, reg) {
            unsigned int pfn1, pfn2;
            struct page *page, *end;

            pfn1 = memblock_region_memory_base_pfn(reg);
            pfn2 = memblock_region_memory_end_pfn(reg);

            page = pfn_to_page(pfn1);
            end = pfn_to_page(pfn2 - 1) + 1;

            do {
                total++;
                if (PageReserved(page))
                    reserved++;
                else if (PageSwapCache(page))
                    cached++;
                else if (PageSlab(page))
                    slab++;
                else if (!page_count(page))
                    free++;
                else
                    shared += page_count(page) - 1;
                pfn1++;
                page = pfn_to_page(pfn1);
            } while (pfn1 < pfn2);
        }

        printk("%d pages of RAM\n", total);
        /* number of free pages; free pages * 4 KB = free memory */
        printk("%d free pages\n", free);
        printk("%d reserved pages\n", reserved);
        printk("%d slab pages\n", slab);
        printk("%d pages shared\n", shared);
        printk("%d pages swap cached\n", cached);
    }
- How to trigger OOM manually
- cat /proc/sys/vm/panic_on_oom
A value of 0 means the system runs the OOM killer when OOM occurs
A value of 1 means the system panics when OOM occurs
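- Some parameters
The lower limit of free memory that the system keeps in reserve: /proc/sys/vm/min_free_kbytes
When available memory (not counting buffers and cache) falls below the watermarks derived from this value, the system starts the kernel thread kswapd to reclaim memory. If memory is still insufficient after reclaim, the OOM killer is triggered, which means memory is genuinely exhausted. The OOM killer can also be triggered directly before or during reclaim.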
- Execution flow