WeChat official account: 奔跑吧linux社区
This article is excerpted from Section 6.3.3 of Volume 1 of 《奔跑吧Linux内核》, 2nd edition.
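1. Problem description
Below is a problematic OOM Killer kernel log in which free memory is 86048KB, the min watermark is 22528KB, and the low watermark is 28160KB. Readers may find this puzzling: why can the kernel not allocate even a single physical page when free memory is far above the min watermark?
<Problematic OOM Killer kernel log>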
[ 150.257731] insmod invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0
...
[ 150.272821] Node 0 DMA32 free:86048KB min:22528KB low:28160KB high:33792KB active_anon:16384KB inactive_anon:6316KB active_file:896KB inactive_file:808KB unevictable:0KB writepending:0KB present:1048576KB managed:999784KB mlocked:0KB kernel_stack:2848KB pagetables:812KB bounce:0KB free_pcp:1864KB local_pcp:756KB free_cma:64280KB
[ 150.335591] lowmem_reserve[]: 0 0 0
...
Oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-0.slice/user@0.service,task=(sd-pam),pid=512,uid=0
[ 150.297054] Out of memory: Kill process 512 ((sd-pam)) score 2 or sacrifice child
[ 150.299368] Killed process 512 ((sd-pam)) total-vm:166912KB, anon-rss:2616KB, file-rss:0KB, shmem-rss:0KB
[ 150.357941] oom_reaper: reaped process 512 ((sd-pam)), now anon-rss:0KB, file-rss:0KB, shmem-rss:0KB
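One figure in this log is worth checking by hand: free_cma is 64280KB out of 86048KB free. The Python sketch below is an illustration under stated assumptions, not the kernel's code and not necessarily the full diagnosis. It models only the order-0 min-watermark comparison, and it assumes that free CMA pages are not counted for allocations that cannot use CMA (GFP_KERNEL requests are not movable). The helper names parse_zone_line and passes_min_watermark are hypothetical. lowmem_reserve is taken as 0, as in the log, and any lowering of min for high-priority allocations is ignored.

```python
import re

def parse_zone_line(line):
    """Hypothetical helper: collect every 'name:<number>KB' field of a
    zone summary line from the OOM report into a dict (values in KB)."""
    return {name: int(value)
            for name, value in re.findall(r"(\w+):\s*(\d+)KB", line)}

def passes_min_watermark(zone, can_use_cma):
    """Simplified order-0 check: usable free memory must exceed min.

    Assumptions: lowmem_reserve is 0 (as in this log) and min is not
    reduced for high-priority allocations.
    """
    usable = zone["free"]
    if not can_use_cma:
        # Free CMA pages are left out for allocations that cannot use CMA.
        usable -= zone.get("free_cma", 0)
    return usable > zone["min"]

# Selected fields copied from the zone line of the log above.
zone = parse_zone_line(
    "Node 0 DMA32 free:86048KB min:22528KB low:28160KB high:33792KB "
    "free_pcp:1864KB local_pcp:756KB free_cma:64280KB"
)

usable_without_cma = zone["free"] - zone["free_cma"]
print(usable_without_cma)                             # 21768
print(passes_min_watermark(zone, can_use_cma=True))   # True
print(passes_min_watermark(zone, can_use_cma=False))  # False
```

Under these assumptions, the memory usable by a GFP_KERNEL request is 86048KB - 64280KB = 21768KB, which is below the min watermark of 22528KB even though the raw free figure is well above it.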
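2. Problem analysis
Today's servers, phones, and similar devices ship with large amounts of memory. Even so, as a server's workload grows, system memory comes under pressure, and the system may be unable to allocate even a single page, which triggers the OOM Killer mechanism.
Let us first analyze a normal OOM Killer kernel log.
<Normal OOM Killer kernel log>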
[ 296.106260] systemd invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
...
[ 296.134445] Node 0 DMA32 free:23592KB min:24576KB low:30208KB high:35840KB active_anon:40680KB inactive_anon:5000KB active_file:72KB inactive_file:112KB unevictable:0KB writepending:0KB present:1048576KB managed:738068KB mlocked:0KB kernel_stack:2432KB pagetables:1268KB bounce:0KB free_pcp:32KB local_pcp:32KB free_cma:0KB
[ 296.137154] lowmem_reserve[]: 0 0 0
[ 296.137980] Node 0 DMA32: 1322*4KB (UME) 834*8KB (UME) 378*16KB (UME) 119*32KB
(UME) 28