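On one node of a k8s cluster, newly created pods never start: they stay in the ContainerCreating state, and describing the pod shows that the pod sandbox keeps failing to be created.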
kubectl describe pod xxxxx -n xxx
The events look like this:
Normal SandboxChanged 22m (x90 over 32m) kubelet, node4 Pod sandbox changed, it will be killed and re-created.
Warning FailedCreatePodSandBox 32m (x7 over 32m) kubelet, node4 Failed create pod sandbox.
Warning FailedSync 27m (x48 over 32m) kubelet, node4 Error syncing pod
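To confirm that the problem is limited to one node, it can help to list every pod stuck in ContainerCreating there. The following is a minimal sketch, not a definitive tool: stuck_on_node is a hypothetical helper name, and it assumes the input is the whitespace-separated table printed by kubectl get pods --all-namespaces -o wide, with the namespace and pod name in the first two columns.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces -o wide` output
# on stdin and print namespace/name for pods that are in ContainerCreating
# on the given node. The node name is matched as a whole column, so the
# exact column position of NODE does not matter.
stuck_on_node() {
  awk -v node="$1" '
    /ContainerCreating/ {
      for (i = 1; i <= NF; i++) {
        if ($i == node) { print $1 "/" $2; break }
      }
    }'
}
```

Usage, with node4 taken from the events above: kubectl get pods --all-namespaces -o wide | stuck_on_node node4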
Log in to the node and check the system log:
tail -f /var/log/messages
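The log output is below. The kernel could not satisfy an order-6 page allocation (64 contiguous pages) for runc: the node's memory is filled up by buffer/cache, so the sandbox cannot be created.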
Jan 25 15:12:52 node4 kubelet: W0125 15:12:52.446651 20224 cni.go:265] CNI failed to retrieve network namespace path: Cannot find network namespace for the terminated container "a722b111fe3a8e78d1d7ee49280ab743d80c6e6ba955195b65bcfe60d5cf3264"
Jan 25 15:12:53 node4 docker: time="2019-01-25T15:12:53.102069754+08:00" level=error msg="Handler for POST /v1.26/containers/a722b111fe3a8e78d1d7ee49280ab743d80c6e6ba955195b65bcfe60d5cf3264/stop returned error: Container a722b111fe3a8e78d1d7ee49280ab743d80c6e6ba955195b65bcfe60d5cf3264 is already stopped"
Jan 25 15:12:53 node4 kernel: runc:[1:CHILD]: page allocation failure: order:6, mode:0x10c0d0
Jan 25 15:12:53 node4 kernel: CPU: 2 PID: 26598 Comm: runc:[1:CHILD] Tainted: G ---------
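The order:6 in the kernel message means the request was for 2^6 contiguous pages. As a rough sketch for reading that number and checking whether the node still has free blocks that large, the two helpers below may be useful. Both names (order_kib, free_blocks_at_or_above) are hypothetical, and the second assumes its input has the layout of /proc/buddyinfo, where each line reads "Node N, zone NAME" followed by the count of free blocks for order 0, 1, 2 and so on.

```shell
# Size in KiB of an order-N allocation: 2^N pages times the page size.
# $1 = order, $2 = page size in bytes (defaults to the running system's).
order_kib() {
  echo $(( (1 << $1) * ${2:-$(getconf PAGESIZE)} / 1024 ))
}

# Read buddyinfo-style lines on stdin and print, per zone:
#   <node> <zone> <number of free blocks of order >= $1>
# Field 5 holds the order-0 count, so order N lives in field 5 + N.
free_blocks_at_or_above() {
  awk -v o="$1" '{
    sub(/,/, "", $2)
    n = 0
    for (i = 5 + o; i <= NF; i++) n += $i
    print $2, $4, n
  }'
}
```

Usage: order_kib 6 4096 gives the size of the failed request on a system with 4 KiB pages, and free_blocks_at_or_above 6 < /proc/buddyinfo shows how many sufficiently large free blocks each zone has left. A zone reporting 0 cannot serve such a request without reclaiming or compacting memory first.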