vm.panic_on_oom | |
---|---|
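0 (default) | On OOM, the kernel kills some processes so that the operating system keeps running |
1 | On OOM, the system panics immediately |
2 | Stricter still: even an OOM inside a memory cgroup panics the whole system |
* Values 1 and 2 are meant for cases where you want the system to panic, so that in a clustered setup the panic triggers a failover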
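To make the table above easier to check on a live machine, here is a minimal Python sketch that reads the current value and maps it to the behavior listed. The helper names `read_vm_sysctl` and `describe_panic_on_oom` are made up for this note, and the sketch assumes the usual `/proc/sys/vm` layout; on a system without it the reader simply returns `None`.

```python
from pathlib import Path

# Meanings taken from the table above (0 is the default).
PANIC_ON_OOM_MEANING = {
    0: "kill some processes, keep the system running",
    1: "panic the system on OOM",
    2: "panic the system even on a memory cgroup OOM",
}


def describe_panic_on_oom(value):
    """Return a short description for a vm.panic_on_oom value."""
    return PANIC_ON_OOM_MEANING.get(value, "unknown value")


def read_vm_sysctl(name, root="/proc/sys/vm"):
    """Read an integer vm.* sysctl from procfs; return None if unavailable.

    `root` is a parameter only so the function can be pointed at a
    different directory (hypothetical helper, not a kernel API).
    """
    try:
        return int((Path(root) / name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


current = read_vm_sysctl("panic_on_oom")
if current is None:
    print("vm.panic_on_oom is not readable on this system")
else:
    print(f"vm.panic_on_oom = {current}: {describe_panic_on_oom(current)}")
```

Changing the value still goes through `sysctl -w vm.panic_on_oom=<n>` as root; the sketch only reads it.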
sysctl vm.min_free_kbytes
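The amount of free memory the system keeps in reserve. Raising it increases the risk of OOM.
But what is the reserved memory actually used for?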
min_free_kbytes:
This is used to force the Linux VM to keep a minimum number
of kilobytes free. The VM uses this number to compute a
watermark[WMARK_MIN] value for each lowmem zone in the system.
Each lowmem zone gets a number of reserved free pages based
proportionally on its size.
Some minimal amount of memory is needed to satisfy PF_MEMALLOC
allocations; if you set this to lower than 1024KB, your system will
become subtly broken, and prone to deadlock under high loads.
Setting this too high will OOM your machine instantly.
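The two warnings in the quoted text (below 1024KB is dangerous, too high causes OOM) can be turned into a small sanity check. This is only a sketch: the 1024KB floor comes from the documentation above, but the "too high" threshold of 5% of total memory is an arbitrary assumption of mine, not a kernel rule, and `check_min_free_kbytes` is a hypothetical helper name.

```python
MIN_SAFE_KB = 1024  # lower bound quoted in the kernel documentation above


def check_min_free_kbytes(min_free_kb, mem_total_kb, high_fraction=0.05):
    """Classify a vm.min_free_kbytes value.

    high_fraction is an assumed rule of thumb (5% of RAM), chosen only
    for illustration; the kernel documentation gives no exact number.
    """
    if min_free_kb < MIN_SAFE_KB:
        return "too low"
    if min_free_kb > mem_total_kb * high_fraction:
        return "suspiciously high"
    return "ok"


# Example with made-up numbers: an 8 GiB machine.
mem_total_kb = 8 * 1024 * 1024
for candidate in (512, 67584, 1_000_000):
    print(candidate, check_min_free_kbytes(candidate, mem_total_kb))
```

On a real machine the two inputs would come from `/proc/sys/vm/min_free_kbytes` and the `MemTotal` line of `/proc/meminfo`.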
sysctl vm.admin_reserve_kbytes
admin_reserve_kbytes
The amount of free memory in the system that should be reserved for users
with the capability cap_sys_admin.
admin_reserve_kbytes defaults to min(3% of free pages, 8MB)
That should provide enough for the admin to log in and kill a process,
if necessary, under the default overcommit 'guess' mode.
Systems running under overcommit 'never' should increase this to account
for the full Virtual Memory Size of programs used to recover. Otherwise,
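root may not be able to log in to recover the system.

My reading of this sentence: under overcommit 'never', the kernel grants an allocation only if the full requested size can be accounted for. Suppose a program requests 100MB but in practice only touches 10MB. If only 11MB is left, it cannot start under 'never', whereas under modes 0 and 1 it can.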
How do you calculate a minimum useful reserve?
sshd or login + bash (or some other shell) + top (or ps, kill, etc.)
For overcommit 'guess', we can sum resident set sizes (RSS).
On x86_64 this is about 8MB.
For overcommit 'never', we can take the max of their virtual sizes (VSZ)
and add the sum of their RSS.
On x86_64 this is about 128MB.
Changing this takes effect whenever an application requests memory.
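The sizing rules quoted above are simple arithmetic, so here is a minimal Python sketch of them: the default of min(3% of free pages, 8MB), the sum of RSS for overcommit 'guess', and max VSZ plus sum of RSS for overcommit 'never'. The `Proc` type, the function names, and the process sizes are all invented for illustration; real values would come from `ps -o rss,vsz` for the recovery programs on your own system.

```python
from typing import NamedTuple


class Proc(NamedTuple):
    name: str
    rss_kb: int  # resident set size
    vsz_kb: int  # virtual memory size


def default_admin_reserve_kb(free_kb):
    """Default from the text: min(3% of free memory, 8MB)."""
    return min(free_kb * 3 // 100, 8 * 1024)


def reserve_for_guess_kb(procs):
    """Overcommit 'guess': sum of the resident set sizes."""
    return sum(p.rss_kb for p in procs)


def reserve_for_never_kb(procs):
    """Overcommit 'never': largest virtual size plus the sum of RSS."""
    return max((p.vsz_kb for p in procs), default=0) + reserve_for_guess_kb(procs)


# Hypothetical recovery tool set with made-up sizes.
recovery = [
    Proc("sshd", rss_kb=5000, vsz_kb=70000),
    Proc("bash", rss_kb=2000, vsz_kb=20000),
    Proc("top", rss_kb=1500, vsz_kb=25000),
]

print("guess reserve (KB):", reserve_for_guess_kb(recovery))
print("never reserve (KB):", reserve_for_never_kb(recovery))
print("default with 1 GB free (KB):", default_admin_reserve_kb(1_000_000))
```

The gap between the two results shows why systems running under 'never' need a much larger `admin_reserve_kbytes`: the full virtual size of at least one recovery program has to be accounted for, not just the memory it touches.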