[k8s cluster failure] unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf

Notes on a node going NotReady after the certificates of a k8s cluster were renewed.

At first, the pods scheduled to node-1 were all stuck in the Terminating state.
Log in to node-1 and list the pods:

kubectl get pod -A
error: You must be logged in to the server (Unauthorized)

Copy /etc/kubernetes/admin.conf from the master node to node-1, then point kubectl at it:

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
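
Appending with >> adds another copy of the line every time the command is repeated. A minimal sketch of an idempotent variant follows; the function name add_kubeconfig_export is hypothetical and not part of any tool.

```shell
# Hypothetical helper: append the KUBECONFIG export to a profile file only once.
add_kubeconfig_export() {
  profile="$1"
  line='export KUBECONFIG=/etc/kubernetes/admin.conf'
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF "$line" "$profile" 2>/dev/null; then
    printf '%s\n' "$line" >> "$profile"
  fi
}

# Usage (on node-1):
#   add_kubeconfig_export ~/.bash_profile && . ~/.bash_profile
```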

Next, check the node's taints:

kubectl describe node node-1 | grep Taint

The node carries node.kubernetes.io/unreachable:NoExecute.
Try removing the taint:

kubectl taint node node-1 node.kubernetes.io/unreachable-

The taint simply came back as node.kubernetes.io/unreachable:NoSchedule.

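The documentation describes the built-in taints as follows:

node.kubernetes.io/not-ready: the node is not ready. This corresponds to the NodeCondition Ready being False.

node.kubernetes.io/unreachable: the node is unreachable from the node controller. This corresponds to the NodeCondition Ready being Unknown.

node.kubernetes.io/out-of-disk: the node is out of disk space.

node.kubernetes.io/memory-pressure: the node has memory pressure.

node.kubernetes.io/disk-pressure: the node has disk pressure.

node.kubernetes.io/network-unavailable: the node's network is unavailable.

node.kubernetes.io/unschedulable: the node is unschedulable.

node.cloudprovider.kubernetes.io/uninitialized: when kubelet is started with an external cloud provider, this taint is set on the node to mark it as unusable. After a controller from cloud-controller-manager initializes the node, kubelet removes the taint.

When a node is to be evicted, the node controller or kubelet adds the relevant taint with the NoExecute effect. When the fault condition returns to normal, kubelet or the node controller can remove the taint. See the documentation: https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/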
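
The relationship between the Ready condition and the two taints seen here can be sketched as a small lookup. This is only an illustration of the mapping described above; the function name ready_status_to_taint is hypothetical. It also suggests why deleting the taint by hand did not stick: as long as Ready stays Unknown, the node controller presumably puts the taint back.

```shell
# Hypothetical helper: map the status of a node's Ready condition
# to the taint key the documentation associates with it.
ready_status_to_taint() {
  case "$1" in
    False)   echo "node.kubernetes.io/not-ready" ;;
    Unknown) echo "node.kubernetes.io/unreachable" ;;
    True)    : ;;  # healthy node: neither taint applies, print nothing
    *)       echo "unexpected status: $1" >&2; return 1 ;;
  esac
}

# Usage against a live cluster (requires a working kubectl):
#   status=$(kubectl get node node-1 \
#     -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
#   ready_status_to_taint "$status"
```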

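In short, the taint is there because Kubernetes itself considers the node unable to work, so it adds the taint to keep Pods from being scheduled onto it. Removing the taint therefore treats the symptom, not the cause: something was wrong on the node itself. Start with the kubelet status, which was not healthy: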

systemctl status kubelet                 
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; disabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: inactive (dead)
     Docs: https://kubernetes.io/docs/

Check the kubelet log:

journalctl -xefu kubelet

9月 11 17:06:14 node-1 systemd[1]: kubelet.service holdoff time over, scheduling restart.
9月 11 17:06:14 node-1 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished shutting down.
9月 11 17:06:14 node-1 systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished starting up.
-- 
-- The start-up result is done.
9月 11 17:06:14 node-1 kubelet[11167]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
9月 11 17:06:14 node-1 kubelet[11167]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
9月 11 17:06:14 node-1 kubelet[11167]: I0911 17:06:14.821215   11167 server.go:417] Version: v1.18.0
9月 11 17:06:14 node-1 kubelet[11167]: I0911 17:06:14.821648   11167 plugins.go:100] No cloud provider specified.
9月 11 17:06:14 node-1 kubelet[11167]: I0911 17:06:14.821680   11167 server.go:837] Client rotation is on, will bootstrap in background
9月 11 17:06:14 node-1 kubelet[11167]: E0911 17:06:14.825197   11167 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2024-09-06 06:47:45 +0000 UTC
9月 11 17:06:14 node-1 kubelet[11167]: F0911 17:06:14.825242   11167 server.go:274] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
9月 11 17:06:14 node-1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
9月 11 17:06:14 node-1 systemd[1]: Unit kubelet.service entered failed state.
9月 11 17:06:14 node-1 systemd[1]: kubelet.service failed.
9月 11 17:06:24 node-1 systemd[1]: kubelet.service holdoff time over, scheduling restart.
9月 11 17:06:24 node-1 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished shutting down
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished shutting down.
9月 11 17:06:24 node-1 systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: Unit kubelet.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit kubelet.service has finished starting up.
-- 
-- The start-up result is done.
9月 11 17:06:25 node-1 kubelet[11236]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
9月 11 17:06:25 node-1 kubelet[11236]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
9月 11 17:06:25 node-1 kubelet[11236]: I0911 17:06:25.053194   11236 server.go:417] Version: v1.18.0
9月 11 17:06:25 node-1 kubelet[11236]: I0911 17:06:25.053574   11236 plugins.go:100] No cloud provider specified.
9月 11 17:06:25 node-1 kubelet[11236]: I0911 17:06:25.053611   11236 server.go:837] Client rotation is on, will bootstrap in background
9月 11 17:06:25 node-1 kubelet[11236]: E0911 17:06:25.057193   11236 bootstrap.go:265] part of the existing bootstrap client certificate is expired: 2024-09-06 06:47:45 +0000 UTC
9月 11 17:06:25 node-1 kubelet[11236]: F0911 17:06:25.057233   11236 server.go:274] failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory
9月 11 17:06:25 node-1 systemd[1]: kubelet.service: main process exited, code=exited, status=255/n/a
9月 11 17:06:25 node-1 systemd[1]: Unit kubelet.service entered failed state.
9月 11 17:06:25 node-1 systemd[1]: kubelet.service failed.
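
The restart loop buries the two lines that matter among systemd notices and deprecation warnings. A minimal sketch for filtering the journal down to kubelet's error (E) and fatal (F) entries, relying on the severity letter that starts each kubelet message in the log above; the function name kubelet_errors is hypothetical.

```shell
# Hypothetical helper: read journal text on stdin and keep only kubelet
# lines whose message starts with severity E or F followed by the MMDD date.
kubelet_errors() {
  grep -E 'kubelet\[[0-9]+\]: [EF][0-9]{4} '
}

# Usage:
#   journalctl -u kubelet --no-pager | kubelet_errors
```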

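The key line is: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory. The client certificate in kubelet's existing kubeconfig had expired, so kubelet fell back to the bootstrap kubeconfig, which does not exist on this node.
On node-1, replace kubelet.conf with the admin.conf copied from the master: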

cp -a /etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf

systemctl daemon-reload  && systemctl restart kubelet 

After the restart, the node recovered.
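
Since the log reported an expired client certificate, it can be worth confirming that the credentials now in place are still valid. A minimal sketch, assuming the kubeconfig embeds its certificate as client-certificate-data (as a kubeadm-generated admin.conf does) and that openssl and base64 are installed; the function name kubeconfig_cert_enddate is hypothetical.

```shell
# Hypothetical helper: print the expiry date of the client certificate
# embedded in a kubeconfig file.
kubeconfig_cert_enddate() {
  # client-certificate-data holds the base64-encoded PEM certificate;
  # take the first occurrence, decode it, and let openssl print notAfter.
  awk '/client-certificate-data:/ {print $2; exit}' "$1" \
    | base64 -d \
    | openssl x509 -noout -enddate
}

# Usage:
#   kubeconfig_cert_enddate /etc/kubernetes/kubelet.conf
```

The output is a single notAfter=... line; a date in the past means the same failure will come back on the next kubelet restart.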
