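I found that a deployed service had died for no apparent reason, leaving behind neither a memory-overflow dump file nor any error log. With the help of the almighty search engine, I looked in /var/log/messages to find out what had killed it.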
thread1 invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
....
Mar 8 10:40:12 localhost kernel: Out of memory: Kill process 17946 (java) score 184 or sacrifice child
Mar 8 10:40:12 localhost kernel: Killed process 17946 (java) total-vm:6021548kB, anon-rss:1307952kB, file-rss:0kB, shmem-rss:0kB
Mar 8 10:40:12 localhost kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Mar 8 10:40:12 localhost kernel: java cpuset=/ mems_allowed=0
Mar 8 10:40:12 localhost kernel: CPU: 1 PID: 26718 Comm: java Kdump: loaded Not tainted 3.10.0-957.el7.x86_64 #1
....
Mar 8 10:40:12 localhost kernel: Out of memory: Kill process 17963 (java) score 184 or sacrifice child
Mar 8 10:40:12 localhost kernel: Killed process 17963 (java) total-vm:6021548kB, anon-rss:1311320kB, file-rss:0kB, shmem-rss:0kB
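The log shows that a task named thread1 asked for memory the system could not supply, which invoked Linux's oom-killer, and the oom-killer killed a java process (PID 17946). That still did not free enough memory, so the oom-killer was invoked again (this time by java) and killed another java process (PID 17963)...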
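When the log is long, picking out the kill events by eye gets tedious. Here is a minimal Python sketch that extracts them, assuming the "Killed process" line format is the same as in the excerpt above (other kernel versions may print it differently). The function name find_oom_kills and the dict keys are made up for illustration.

```python
import re

# Matches lines such as:
#   kernel: Killed process 17946 (java) total-vm:6021548kB, anon-rss:1307952kB, ...
# Assumption: the format follows the 3.10 kernel log excerpt in this post.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\)"
    r".*?total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def find_oom_kills(lines):
    """Return one dict per 'Killed process' line found in an iterable of log lines."""
    kills = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match:
            kills.append({
                "pid": int(match.group("pid")),
                "comm": match.group("comm"),
                "total_vm_kb": int(match.group("total_vm")),
                "anon_rss_kb": int(match.group("anon_rss")),
            })
    return kills


# Two lines copied from the log excerpt above, used as sample input.
sample = [
    "Mar 8 10:40:12 localhost kernel: Out of memory: Kill process 17946 (java) score 184 or sacrifice child",
    "Mar 8 10:40:12 localhost kernel: Killed process 17946 (java) total-vm:6021548kB, anon-rss:1307952kB, file-rss:0kB, shmem-rss:0kB",
]

for kill in find_oom_kills(sample):
    # Report resident anonymous memory in MiB, which is easier to read than kB.
    print(f"pid={kill['pid']} comm={kill['comm']} anon-rss={kill['anon_rss_kb'] / 1024:.0f} MiB")
```

To scan the real log, pass an open file object instead of the sample list, for example find_oom_kills(open("/var/log/messages", errors="replace")). Reading that file usually requires root.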
The company's 4-core, 8 GB virtual machine is simply too small for what it hosts: 3 MongoDB instances and 4 Java applications...