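1.Rejoining a node to a k8s cluster
When a joined node is misbehaving, the quickest fix is to remove the node, restore its environment, and join it again.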
# kubectl get node
NAME                STATUS     ROLES                  AGE   VERSION
host-10-15-49-26    NotReady   <none>                 16h   v1.23.4
host-10-19-83-151   Ready      control-plane,master   16h   v1.23.4
# kubectl get po -A
NAMESPACE      NAME                                        READY   STATUS              RESTARTS        AGE
kube-flannel   kube-flannel-ds-8hzt7                       1/1     Running             0               69s
kube-flannel   kube-flannel-ds-s8mss                       0/1     Init:0/2            0               69s
kube-system    coredns-6d8c4cb4d-7fcvf                     1/1     Running             1 (15h ago)     16h
kube-system    coredns-6d8c4cb4d-nggh8                     1/1     Running             1 (15h ago)     16h
kube-system    etcd-host-10-19-83-151                      1/1     Running             134 (15h ago)   16h
kube-system    kube-apiserver-host-10-19-83-151            1/1     Running             1 (15h ago)     16h
kube-system    kube-controller-manager-host-10-19-83-151   1/1     Running             1 (15h ago)     16h
kube-system    kube-flannel-ds-amd64-74m7h                 0/1     Init:0/1            0               3m52s
kube-system    kube-flannel-ds-amd64-7dhkm                 0/1     CrashLoopBackOff    19 (44s ago)    16h
kube-system    kube-proxy-bfps9                            1/1     Running             1 (15h ago)     16h
kube-system    kube-proxy-ntqvw                            0/1     ContainerCreating   0               16h
kube-system    kube-scheduler-host-10-19-83-151            1/1     Running             1 (15h ago)     16h
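To spot the problem entries in listings like the two above without reading every row, the output can be filtered by its STATUS column. The following is a minimal sketch; the function names not_ready_nodes and unhealthy_pods are made up for illustration, and the filter is deliberately simple (it would also report a cordoned node or a Completed pod).

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get node` style output on stdin (header line is skipped).
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Print "namespace/name status" for pods whose STATUS column is not "Running".
# Reads `kubectl get po -A` style output on stdin (header line is skipped).
unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" { print $1 "/" $2 " " $4 }'
}

# Usage on the control-plane node:
#   kubectl get node | not_ready_nodes
#   kubectl get po -A | unhealthy_pods
```

Both functions only parse text, so they can be tried on a saved copy of the output before being pointed at a live cluster.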
2.Run kubeadm reset on the faulty node
# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0609 09:56:16.297463 15581 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
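The reset output above names several things that kubeadm leaves for manual cleanup: the CNI configuration, iptables rules, IPVS tables, and the kubeconfig file. A minimal sketch of that cleanup follows. The DRY_RUN switch and the function names are assumptions added for illustration, and flushing iptables removes all rules on the host, not only the ones Kubernetes created, so review the printed commands before running them for real.

```shell
# Print the command when DRY_RUN=1 (the default here), execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Manual cleanup steps listed in the `kubeadm reset` output.
post_reset_cleanup() {
  run rm -rf /etc/cni/net.d          # CNI configuration
  run iptables -F                    # flush the filter table
  run iptables -t nat -F             # flush the nat table
  run iptables -t mangle -F          # flush the mangle table
  run iptables -X                    # delete user-defined chains
  run ipvsadm --clear                # only relevant if kube-proxy used IPVS
  run rm -f "$HOME/.kube/config"     # stale kubeconfig, if one exists on this host
}

# Usage on the node that was reset:
#   post_reset_cleanup              # dry run, prints the commands
#   DRY_RUN=0 post_reset_cleanup    # actually runs them (as root)
```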
3.Restore the cluster to a single-node setup with only the master
# kubectl delete node host-10-15-49-26
node "host-10-15-49-26" deleted
[root@host-10-19-83-151 ~]# kubectl get node
NAME                STATUS   ROLES                  AGE   VERSION
host-10-19-83-151   Ready    control-plane,master   17h   v1.23.4
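The transcript above deletes the node object directly. On a node that still runs workloads, a common variant is to drain it first so pods are evicted before the object is removed. The sketch below shows that sequence; the function name remove_node and the KUBECTL override (used only so the function can be dry-run with KUBECTL=echo) are assumptions, not part of the original steps. On a broken node the drain may time out, so the sketch continues to the delete either way.

```shell
# Drain (best effort) and then delete a worker node.
# KUBECTL is a hypothetical override for dry runs; it defaults to kubectl.
remove_node() {
  node="$1"
  kubectl_bin="${KUBECTL:-kubectl}"
  # Evict workloads first; do not abort if the node is too broken to drain.
  "$kubectl_bin" drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=60s \
    || echo "drain did not finish cleanly, deleting the node object anyway" >&2
  "$kubectl_bin" delete node "$node"
}

# Usage on the control-plane node:
#   remove_node host-10-15-49-26
```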
4.Regenerate the join command
# kubeadm token create --print-join-command
kubeadm join 10.19.83.151:6443 --token lp4th8.b50xqkec1frk16tm --discovery-token-ca-cert-hash sha256:b98856b3969a0bca3f3a34a2d16e64f74e6c05535405c93063c5a0deaedb86e5
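The --discovery-token-ca-cert-hash value in the join command is the SHA-256 digest of the cluster CA's public key. If there is any doubt that a pasted join command belongs to this cluster, the hash can be recomputed on the control-plane node and compared. The sketch below follows the openssl pipeline from the kubeadm documentation; the function name ca_cert_hash is made up, and it assumes an RSA CA key at the default kubeadm path.

```shell
# Print the discovery hash ("sha256:<hex>") for a CA certificate.
# Defaults to the kubeadm CA path; assumes the CA uses an RSA key.
ca_cert_hash() {
  cert="${1:-/etc/kubernetes/pki/ca.crt}"
  hex=$(openssl x509 -pubkey -noout -in "$cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //')
  echo "sha256:$hex"
}

# Usage on the control-plane node:
#   ca_cert_hash
# The result should match the value printed by
#   kubeadm token create --print-join-command
```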
5.Rejoin the new machine to the cluster
# kubeadm join 10.19.83.151:6443 --token lp4th8.b50xqkec1frk16tm --discovery-token-ca-cert-hash sha256:b98856b3969a0bca3f3a34a2d16e64f74e6c05535405c93063c5a0deaedb86e5
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
6.Check the current status
# kubectl get node
NAME                STATUS     ROLES                  AGE   VERSION
host-10-15-49-26    NotReady   <none>                 28s   v1.23.4
host-10-19-83-151   Ready      control-plane,master   17h   v1.23.4
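A node that joined 28 seconds ago may simply not have finished starting, so it can help to wait for the Ready condition before digging further. The following sketch waits with a timeout and prints the node description if the node never becomes Ready. The function name wait_node_ready, the 120s default, and the KUBECTL override (for dry runs) are assumptions for illustration.

```shell
# Wait until a node reports Ready; on timeout, show its conditions and events.
# KUBECTL is a hypothetical override for dry runs; it defaults to kubectl.
wait_node_ready() {
  node="$1"
  timeout="${2:-120s}"
  kubectl_bin="${KUBECTL:-kubectl}"
  if ! "$kubectl_bin" wait --for=condition=Ready "node/$node" --timeout="$timeout"; then
    # The Conditions and Events sections usually hint at why the kubelet is unhappy.
    "$kubectl_bin" describe node "$node"
    return 1
  fi
}

# Usage on the control-plane node:
#   wait_node_ready host-10-15-49-26
```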
7.Find the cause: docker is unusable on the node
# docker images
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
[root@host-10-15-49-26 ~]# docker --version
Docker version 20.10.12, build e91ed57
[root@host-10-15-49-26 ~]# docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
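The client is installed but cannot reach the daemon, which usually means the docker service is not running. A reasonable next step, sketched below, is to check the service, try to start it, and read its recent log if the start fails. This assumes a systemd host where the unit is named docker; the function name and the SYSTEMCTL and JOURNALCTL overrides (for dry runs) are made up for illustration.

```shell
# Check the docker service, try to start it, and show recent logs on failure.
# SYSTEMCTL and JOURNALCTL are hypothetical overrides for dry runs.
ensure_docker_running() {
  systemctl_bin="${SYSTEMCTL:-systemctl}"
  journalctl_bin="${JOURNALCTL:-journalctl}"
  if "$systemctl_bin" is-active --quiet docker; then
    echo "docker is active"
    return 0
  fi
  echo "docker is not active, trying to start it"
  if ! "$systemctl_bin" start docker; then
    # The last log lines normally contain the reason the daemon refused to start.
    "$journalctl_bin" -u docker --no-pager -n 50
    return 1
  fi
}

# Usage on the affected node (as root):
#   ensure_docker_running
```

If the daemon starts, rerunning kubectl get node on the control-plane shows whether the node moves to Ready.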