How to rejoin a node to the cluster after accidentally deleting it

Latest recommended article published on 2024-09-11 19:03:21

**AE86**

Category: K8S. Tags: k8s, operations

Article link: https://blog.csdn.net/qq_39965424/article/details/140690898

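What happened: I meant to remove a label from node01 but ran the wrong command, and node01 disappeared from the cluster. kubectl delete node treats every argument after "node" as a node name, so it deleted node01 and then failed to find a node called disk=ssd.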

[root@master test-yaml]# kubectl delete node node01 disk=ssd 
node "node01" deleted
Error from server (NotFound): nodes "disk=ssd" not found
[root@master test-yaml]# kubectl get node
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   13d   v1.30.0
node02   Ready    <none>          13d   v1.30.0
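
For reference, kubectl removes a label with the label subcommand, by appending a trailing "-" to the label key. Below is a minimal sketch of the operation that was intended. The helper name remove_node_label and the KUBECTL variable are assumptions of this sketch (not part of kubectl); setting KUBECTL=echo previews the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove a label from a node.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

remove_node_label() {
    local node="$1" key="$2"
    # Removal takes only the label key followed by "-", never key=value.
    case "$key" in
        *=*) echo "pass the label key only, not key=value" >&2; return 1 ;;
    esac
    "$KUBECTL" label node "$node" "${key}-"
}
```

Calling remove_node_label node01 disk runs kubectl label node node01 disk-, which leaves the node itself untouched.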

Recovery steps

1. Clean up configuration data on node01

# 1. Stop the related services first
[root@node01 ~]# systemctl stop kubelet 
[root@node01 ~]# systemctl stop docker
[root@node01 ~]# systemctl stop cri-docker

# 2. Delete the old configuration files
[root@node01 ~]# rm -rf /var/lib/cni/
[root@node01 ~]# rm -rf /var/lib/kubelet/
[root@node01 ~]# rm -rf /etc/cni/
[root@node01 ~]# rm -rf /etc/kubernetes/

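# 3. Start the services again, container runtime first.
#    kubelet keeps restarting until kubeadm join writes its new configuration.
[root@node01 ~]# systemctl start docker
[root@node01 ~]# systemctl start cri-docker
[root@node01 ~]# systemctl start kubelet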
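
kubeadm also ships a reset subcommand that performs most of this cleanup by itself. The following is a hedged alternative sketch, not what was run in this post. It assumes the same cri-dockerd socket path used in step 3; reset_worker, RUN and CRI_SOCKET are hypothetical names for this sketch, and RUN=echo prints the commands instead of executing them. kubeadm reset does not remove the CNI configuration, so that directory is still deleted by hand.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around `kubeadm reset` for a worker node.
# RUN=echo turns this into a dry run that only prints the commands.
RUN="${RUN:-}"
CRI_SOCKET="${CRI_SOCKET:-unix:///var/run/cri-dockerd.sock}"

reset_worker() {
    # Revert what `kubeadm join` set up on this host (kubelet state, /etc/kubernetes).
    $RUN kubeadm reset -f --cri-socket "$CRI_SOCKET" || return 1
    # kubeadm reset leaves CNI configuration in place; remove it manually.
    $RUN rm -rf /etc/cni/net.d
}
```

Running RUN=echo first and reading the printed commands is a cheap safeguard before executing anything destructive on the node.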

2. Generate a new join token on the master node

[root@master ~]# kubeadm token create --print-join-command
kubeadm join 192.168.0.11:6443 --token 4dx9gu.sb95v5mqq3an77ns --discovery-token-ca-cert-hash sha256:1ff346f4ddd8de598cc6998148d2856b5c5aff4c5ba401796eb772b2c9360571

[root@master test-yaml]# kubeadm token list
TOKEN                     TTL         EXPIRES                USAGES                   DESCRIPTION                                                EXTRA GROUPS
4dx9gu.sb95v5mqq3an77ns   23h         2024-07-26T06:51:02Z   authentication,signing   <none>                                                     system:bootstrappers:kubeadm:default-node-token
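
The value after sha256: is the hex-encoded SHA-256 digest of the cluster CA public key, so it must be exactly 64 hexadecimal characters; a hash that loses a character while being copied is rejected by kubeadm join. A small sanity check, as a sketch (the function name is_valid_ca_hash is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed only if the argument looks like
# "sha256:" followed by exactly 64 hexadecimal characters.
is_valid_ca_hash() {
    printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-fA-F]{64}$'
}
```

For example, is_valid_ca_hash "sha256:<hash>" && echo ok can be run on the pasted value before calling kubeadm join. It checks the format only, not whether the hash matches this cluster's CA.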

3. Run the join command on node01

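Kubernetes 1.30 no longer includes dockershim (it was removed in 1.24), so Docker is reached through cri-dockerd and the join command needs the extra flag --cri-socket=unix:///var/run/cri-dockerd.sock. The token in the transcript below differs from the one printed in step 2; substitute the token that kubeadm token create printed for you.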

[root@node01 ~]# kubeadm join 192.168.0.11:6443 --token 30vo5d.q0jmlzkvzorx8drq --discovery-token-ca-cert-hash sha256:1ff346f4ddd8de598cc6998148d2856b5c5aff4c5ba401796eb772b2c9360571 --cri-socket=unix:///var/run/cri-dockerd.sock
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.078678ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

[root@node01 ~]#
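
The command printed by kubeadm token create --print-join-command does not carry a CRI socket, so the flag has to be appended by hand. A minimal sketch of that step; with_cri_socket is a hypothetical helper, and the default socket path is the one used in this post:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append --cri-socket to a printed join command
# unless the command already contains that flag.
with_cri_socket() {
    local join_cmd="$1" socket="${2:-unix:///var/run/cri-dockerd.sock}"
    # The printed command may end with whitespace; trim it first.
    join_cmd=$(printf '%s' "$join_cmd" | sed 's/[[:space:]]*$//')
    case "$join_cmd" in
        *--cri-socket*) printf '%s\n' "$join_cmd" ;;
        *) printf '%s --cri-socket=%s\n' "$join_cmd" "$socket" ;;
    esac
}
```

The result is only printed, not executed, so it can be reviewed before being pasted on the node.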

4. Verify the cluster status

[root@master ~]# kubectl get node -A
NAME     STATUS     ROLES           AGE   VERSION
master   Ready      control-plane   13d   v1.30.0
node01   NotReady   <none>          25s   v1.30.0
node02   Ready      <none>          13d   v1.30.0
[root@master ~]# kubectl get node -o wide 
NAME     STATUS   ROLES           AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                 CONTAINER-RUNTIME
master   Ready    control-plane   13d   v1.30.0   192.168.0.11   <none>        CentOS Linux 7 (Core)   3.10.0-1160.119.1.el7.x86_64   docker://26.1.4
node01   Ready    <none>          31s   v1.30.0   192.168.0.12   <none>        CentOS Linux 7 (Core)   3.10.0-1160.119.1.el7.x86_64   docker://26.1.4
node02   Ready    <none>          13d   v1.30.0   192.168.0.13   <none>        CentOS Linux 7 (Core)   3.10.0-1160.119.1.el7.x86_64   docker://26.1.4
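
In the two listings above node01 is NotReady at 25s and Ready at 31s; a freshly joined node typically reports NotReady for a short while until its network plugin is up. A small sketch for reading one node's status out of the kubectl get node table (node_status is a hypothetical helper that only parses text from stdin):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the STATUS column for one node,
# reading `kubectl get node` output from stdin.
node_status() {
    # Skip the header row, match the NAME column exactly.
    awk -v n="$1" 'NR > 1 && $1 == n { print $2 }'
}
```

Usage: kubectl get node | node_status node01. To block until the node is ready, kubectl wait --for=condition=Ready node/node01 --timeout=120s does the same job without polling. Labels and taints that were set on the old Node object are not restored by the join, so anything still wanted has to be re-applied with kubectl label or kubectl taint.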