kubeadm init fails while initializing a Kubernetes cluster

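Before you start

Check first whether your error matches the one shown here before reading the solution. I am a beginner, and I picked up a few troubleshooting techniques along the way after a lot of searching.
If you only want the fix, skip to the end of the post.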

Environment
Tencent Cloud, CentOS 7

Start command
kubeadm init \
  --apiserver-advertise-address=ip \
  --kubernetes-version v1.18.0 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16

Troubleshooting

First, this is the error from the failed run:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

        Unfortunately, an error has occurred:
                timed out waiting for the condition

        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'

        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.

        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

The error message suggests checking the logs with journalctl -xeu kubelet. Running it prints the following:

-- Logs begin at 二 2022-04-19 16:20:01 CST, end at 三 2022-04-20 13:28:56 CST. --
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.386568   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.486711   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.572931   31175 csi_plugin.go:271] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condition; caused by: [Get https://ip:6443/apis/storage.k8s.io/v1/csinodes/k8s-master: net/http: TLS handshake timeout, Get https://ip:6443/apis/storage.k8s.io/v1/csinodes/k8s-master: dial tcp ip:6443: connect: connection refused]
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.586832   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.686950   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.788088   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:31 k8s-master kubelet[31175]: I0420 13:27:31.843736   31175 kubelet_node_status.go:294] Setting node annotation to enable volume controller attach/detach
4月 20 13:27:31 k8s-master kubelet[31175]: I0420 13:27:31.845909   31175 kubelet_node_status.go:70] Attempting to register node k8s-master
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.846661   31175 kubelet_node_status.go:92] Unable to register node "k8s-master" with API server: Post https://ip:6443/api/v1/nodes: dial tcp ip:6443: connect: connection refused
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.888212   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:31 k8s-master kubelet[31175]: E0420 13:27:31.988321   31175 kubelet.go:2267] node "k8s-master" not found
4月 20 13:27:32 k8s-master kubelet[31175]: E0420 13:27:32.005184   31175 csi_plugin.go:271] Failed to initialize CSINodeInfo: error updating CSINode annotation: timed out waiting for the condition; caused by: Get https://ip:6443/apis/storage.k8s.io/v1/csinodes/k8s-master: dial tcp ip:6443: connect: connection refused

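Two kinds of errors appear here, and I looked into both:

node "k8s-master" not found: this is a secondary error; chasing it leads nowhere because it is not the root cause.
Failed to initialize CSINodeInfo, dial tcp ip:6443: connect: connection refused: the kubelet cannot reach the apiserver on port 6443, so initialization fails.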
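
Since the node "k8s-master" not found lines drown out everything else, it can help to filter them away before reading the log. The following is only a minimal sketch: kubelet_root_errors is a hypothetical helper name, and the patterns assume the klog line format shown in the output above (a severity letter E followed by the date).

```shell
# Hypothetical helper (not part of kubeadm or kubelet): read kubelet log
# lines on stdin, keep only error-level entries (E followed by MMDD), and
# drop the repetitive 'node "..." not found' noise.
kubelet_root_errors() {
  grep -E 'kubelet\[[0-9]+\]: E[0-9]{4} ' | grep -v 'node ".*" not found'
}

# Usage on a real node (commented out so the block has no side effects):
#   journalctl -u kubelet --no-pager | kubelet_root_errors | tail -n 20
```

With the log above, what remains are the connection refused errors against port 6443, which point at the apiserver.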

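At this point it is fairly clear that the apiserver is not running, and docker ps -a confirms that the apiserver container has exited. While searching for connection refused, I found an article suggesting docker logs, so I printed the apiserver logs: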

W0420 06:27:37.750969       1 clientconn.go:1208] grpc: addrConn.createTransport failed to connect to {https://127.0.0.1:2379  <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...

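The end of the log is a loop of attempts to connect to port 2379. As that article points out, 2379 is the etcd client port, so the apiserver is failing because etcd failed to start.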
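
The apiserver and etcd lookups in this walkthrough follow the same pattern: find the container, then read its logs. A rough sketch is below. component_logs is a hypothetical helper, and it assumes the Docker runtime names control-plane containers with a k8s_<component> prefix (for example k8s_etcd_...) and that docker ps lists the newest container first; check both on your own node.

```shell
# Hypothetical helper: print the last log lines of the newest container
# (running or exited) whose name contains k8s_<component>,
# e.g. component_logs kube-apiserver or component_logs etcd.
component_logs() {
  component="$1"
  # Assumption: pause containers are named k8s_POD_..., so this name
  # filter skips them without needing 'grep -v pause'.
  id=$(docker ps -a --filter "name=k8s_${component}" --format '{{.ID}}' | head -n 1)
  if [ -z "$id" ]; then
    echo "no container found for ${component}" >&2
    return 1
  fi
  # kube components log to stderr, so merge it into stdout.
  docker logs --tail 20 "$id" 2>&1
}

# Usage on a real node (commented out so the block has no side effects):
#   component_logs kube-apiserver
#   component_logs etcd
```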

So next we look at the etcd container's docker logs:

2022-04-20 06:31:13.533316 C | etcdmain: listen tcp ip:2380: bind: cannot assign requested address

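The last line says the address cannot be assigned. I first suspected the security group, but that was not the cause. Searching further led to the answer in a GitHub issue.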

Solution

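This is a public-cloud issue: kubeadm's apiserver-advertise-address parameter must be set to the instance's private (internal) address, not its public address. On such instances the public address is typically not assigned to any local network interface, which is why etcd could not bind to it.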
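
Before re-running the init command, one way to catch this mistake early is to check that the address you intend to advertise actually exists on a local interface. This is a sketch under assumptions: is_local_addr is a hypothetical helper, ADVERTISE_ADDR is a placeholder variable, and the parsing relies on the one-line-per-address output of ip -4 -o addr show from iproute2.

```shell
# Hypothetical helper: succeed only if the given IPv4 address is assigned
# to a local network interface, which etcd needs in order to bind to it.
is_local_addr() {
  addr="$1"
  # 'ip -4 -o addr show' prints one line per address; the 4th field is
  # the address with its prefix length, e.g. 10.0.0.5/24.
  ip -4 -o addr show | awk '{print $4}' | cut -d/ -f1 | grep -Fxq "$addr"
}

# Usage (commented out; ADVERTISE_ADDR is a placeholder you set yourself):
#   if is_local_addr "$ADVERTISE_ADDR"; then
#     echo "ok: address is on a local interface"
#   else
#     echo "address is not local; etcd would fail to bind" >&2
#   fi
# After a failed attempt, 'kubeadm reset' is usually needed before
# running 'kubeadm init' again.
```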