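Background
A Kubernetes v1.24.1 cluster was deployed from binary packages on CentOS 7.9, with kubelet using containerd as its container runtime. kubelet failed to start; this post records how the problem was investigated and resolved.
Version information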
Component | Version |
CentOS | 7.9 |
Kernel | 5.4.195-1.el7.elrepo.x86_64 |
Kubernetes | v1.24.1 |
containerd | v1.6.4 |
Troubleshooting and resolution
kubelet fails to start
[root@machine5 ~]$ systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Fri 2022-06-10 21:56:47 CST; 304ms ago
Check the error messages
[root@machine5 ~]$ journalctl -xe -u kubelet
Jun 10 22:23:33 machine5 kubelet[11122]: I0610 22:23:33.098633 11122 remote_runtime.go:114] "Finding the CRI API runtime version"
Jun 10 22:23:33 machine5 kubelet[11122]: W0610 22:23:33.838519 11122 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to { <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial unix: missing address". Reconnecting...
Jun 10 22:23:33 machine5 kubelet[11122]: Error: failed to run Kubelet: unable to determine runtime API version: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix: missing address"
The key error is "failed to run Kubelet: unable to determine runtime API version".
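The journal mixes informational (I...), warning (W...) and fatal lines, so it can help to isolate the fatal one. The following is a minimal sketch, not part of the original write-up: `fatal_lines` is a hypothetical helper name, and the pattern assumes the log prefix format shown above (`kubelet[PID]: Error: ...`).

```shell
#!/bin/sh
# Hypothetical helper: read kubelet journal text on stdin and print only
# the lines where kubelet itself reports a fatal "Error:".
# Assumes the "kubelet[PID]: Error: ..." prefix seen in the log above.
fatal_lines() {
    grep -E 'kubelet\[[0-9]+\]: Error:'
}

# Example use on the affected host (assumes the unit is named "kubelet"):
#   journalctl -u kubelet --no-pager -n 200 | fatal_lines
```

Reading from stdin keeps the helper usable on saved log files as well as on live `journalctl` output.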
Judging from the error, kubelet cannot reach the interface exposed by the containerd service, even though containerd is already running.
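The "dial unix: missing address" part of the message suggests that the socket address kubelet tried to dial was empty. One possible cause, which this excerpt does not confirm, is that no `--container-runtime-endpoint` value was passed to kubelet. The sketch below checks a kubelet command line for that flag; `runtime_endpoint` and `check_endpoint` are hypothetical helper names, and the socket path in the usage comment is containerd's default, which may differ on a customized install.

```shell
#!/bin/sh
# Hypothetical helper: print the value of --container-runtime-endpoint
# found in the text passed as $1. Prints nothing if the flag is absent.
runtime_endpoint() {
    printf '%s\n' "$1" |
        sed -n 's/.*--container-runtime-endpoint[= ]\([^ ]*\).*/\1/p'
}

# Hypothetical helper: report whether a non-empty endpoint is configured.
# Returns 0 if one is found, 1 otherwise.
check_endpoint() {
    ep=$(runtime_endpoint "$1")
    if [ -z "$ep" ]; then
        echo "no container runtime endpoint set"
        return 1
    fi
    echo "endpoint: $ep"
}

# Example use on the affected host (assumes the flags live in the unit
# file itself rather than in a separate EnvironmentFile):
#   check_endpoint "$(systemctl cat kubelet)"
#   [ -S /run/containerd/containerd.sock ] && echo "containerd socket present"
```

If the flag turns out to be set, the next thing to compare is whether its value matches the socket that containerd is actually listening on.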
containerd service status
[root@machine5 ~]$ systemctl status containerd -l
● containerd.service - containerd container runtime
Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; vendor preset: disabled)
Active: active (running) since Fri 2022-06-10 22:20:06 CST; 6s ago
Docs: https://containerd.io
Process: 9923 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 9925 (containerd)
Tasks: 9
Memory: 26.0M
CGroup: /system.slice/containerd.service
└─9925 /usr/bin/containerd
Jun 10 22:20:06 machine5 containerd[9925]: time="2022-06-10T22:20:06.913907117+08:00" level=info msg="loading plugin \"io.containerd.grpc.v1.version\&