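Last night I ran a load test against our K8S environment with 50 concurrent runs. When I checked the monitoring this morning, the system appeared to have been down since about 8 p.m. the previous evening: the master node and one worker node had both gone NotReady. The main troubleshooting steps were as follows: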
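1. Check kubelet status on the master
systemctl status kubelet (status normal)
2. Check kube-proxy status on the master
systemctl status kube-proxy (status normal)
3. Check kube-apiserver status on the master
systemctl status kube-apiserver (status normal)
4. Check kube-scheduler status on the master
systemctl status kube-scheduler (status normal)
5. Check etcd status on the master
systemctl status etcd (status normal)
6. Check flannel status
The kubernetes-dashboard showed that flannel had crashed; the log read as follows: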
Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-amd64-sc7sr": Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"signal: broken pipe\"": unknown
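The manual checks above could be scripted roughly as follows. This is only a sketch, not what was actually run that morning: the function names (is_active, check_services, not_ready_nodes) are hypothetical, it assumes the control-plane components run as systemd units under the names used in steps 1 to 5, and it assumes kubectl is configured on the machine where it runs.

```shell
#!/usr/bin/env bash

# Thin wrapper so the check can be stubbed out.
# Assumption: each component runs as a systemd unit.
is_active() {
  systemctl is-active --quiet "$1"
}

# Print one line per unit; return 1 if any unit is not active.
check_services() {
  local failed=0 svc
  for svc in "$@"; do
    if is_active "$svc"; then
      echo "$svc: active"
    else
      echo "$svc: NOT active"
      failed=1
    fi
  done
  return "$failed"
}

# Print name and status of every node whose STATUS column is not
# exactly "Ready" (so "Ready,SchedulingDisabled" is listed too).
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1, $2 }'
}

# Example usage (unit names taken from steps 1 to 5):
#   check_services kubelet kube-proxy kube-apiserver kube-scheduler etcd
#   not_ready_nodes
```

For the failing flannel pod itself, `kubectl describe pod kube-flannel-ds-amd64-sc7sr` with the appropriate `-n` namespace (often kube-system for this DaemonSet, though that is an assumption here) would show the same sandbox-creation events as the dashboard.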