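I needed to remove a node because pods failed to be created on node2. After I added a taint to node2 so that the scheduler placed the pods on node1, they were created successfully. I never found the root cause. The fix I found while searching was to delete node2 from the cluster and join it again, so I tried that, and it worked. Below is the procedure for deleting a node from the cluster and then rejoining it:
Delete the node from the cluster: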
[root@master ~]# kubectl delete node node2
node "node2" deleted
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 10d v1.21.0
node1 Ready <none> 10d v1.21.0
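If node2 is still running workloads, it is usually safer to drain it before deleting the Node object. The following is a minimal sketch of that, not what was run above. The function name remove_node and the KUBECTL override are hypothetical and exist only so the commands can be rehearsed without touching the cluster. On kubectl older than v1.20 the flag --delete-emptydir-data was called --delete-local-data.

```shell
#!/bin/sh
# Sketch: evict workloads from a node, then delete its Node object.
# KUBECTL is a hypothetical override (not a kubectl feature) used for dry runs.

remove_node() {
    node="$1"
    kubectl_bin="${KUBECTL:-kubectl}"
    # Cordon the node and evict its pods; DaemonSet-managed pods are skipped.
    "$kubectl_bin" drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
    # Remove the Node object from the cluster.
    "$kubectl_bin" delete node "$node"
}

# Usage on the control-plane node:
#   remove_node node2
# Dry run that only prints the arguments that would be passed to kubectl:
#   KUBECTL=echo remove_node node2
```

Deleting the Node object also discards any taints and labels set on it, so the taint added to node2 earlier does not come back when the node rejoins.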
Generate the join command on the master:
[root@master ~]# kubeadm token create --print-join-command
kubeadm join 192.168.204.130:6443 --token 9upog9.x1huogm7non75g7n --discovery-token-ca-cert-hash sha256:b85c1afaa3ba92935ae67caf515b893ce92af375568b8a7ecdc559f81a3d3257
Run that command on node2 to join the cluster:
[root@node2 ~]# kubeadm join 192.168.204.130:6443 --token 9upog9.x1huogm7non75g7n --discovery-token-ca-cert-hash sha256:b85c1afaa3ba92935ae67caf515b893ce92af375568b8a7ecdc559f81a3d3257
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 24.0.7. Latest validated version: 20.10
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
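Because node2 had already joined the cluster once through kubeadm join, its old state is still present (kubelet.conf, the CA certificate, and a kubelet listening on port 10250) and conflicts with the new join, hence the errors. Running kubeadm reset on the node clears it.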
[root@node2 ~]# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0107 22:55:49.383503 72775 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
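The reset output above lists what kubeadm reset leaves behind: the CNI configuration in /etc/cni/net.d, iptables rules, and IPVS tables. A rough sketch of cleaning those up follows. The helper names run and post_reset_cleanup and the DRY_RUN and USE_IPVS switches are hypothetical, introduced here only for illustration. Note that flushing iptables also removes any firewall rules on the host that have nothing to do with Kubernetes, so rehearse with DRY_RUN first.

```shell
#!/bin/sh
# Sketch: clean up what `kubeadm reset` reports it does not clean.

run() {
    # Print the command instead of executing it when DRY_RUN is set (hypothetical switch).
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

post_reset_cleanup() {
    # CNI configuration left behind by kubeadm reset.
    run rm -rf /etc/cni/net.d
    # Flush the filter, nat and mangle tables, then delete user-defined chains.
    run iptables -F
    run iptables -t nat -F
    run iptables -t mangle -F
    run iptables -X
    # Only relevant when kube-proxy ran in IPVS mode (USE_IPVS is a hypothetical switch).
    if [ -n "${USE_IPVS:-}" ]; then run ipvsadm --clear; fi
}

# Usage as root on node2:
#   DRY_RUN=1; post_reset_cleanup      # print the commands only
#   unset DRY_RUN; post_reset_cleanup  # actually run them
```

In this walkthrough the join succeeded without this cleanup, so treat it as optional housekeeping rather than a required step.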
Run the join command again:
[root@node2 ~]# kubeadm join 192.168.204.130:6443 --token 9upog9.x1huogm7non75g7n --discovery-token-ca-cert-hash sha256:b85c1afaa3ba92935ae67caf515b893ce92af375568b8a7ecdc559f81a3d3257
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 24.0.7. Latest validated version: 20.10
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
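Another small hiccup: the bridge-nf-call-iptables check still fails. My first guess was that kubeadm reset had also reset the kernel parameter, but the same error already appeared in the first join attempt, so the value was not 1 even before the reset. Set the bridge-nf-call-iptables parameter manually: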
[root@node2 ~]# echo "1" >/proc/sys/net/bridge/bridge-nf-call-iptables
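A value written to /proc this way lasts only until the next reboot, and the /proc entry exists only while the br_netfilter module is loaded. The following is a minimal sketch of making the setting persistent and checking it. The function names are hypothetical, and the file name k8s.conf is just a common convention, not something the steps above created.

```shell
#!/bin/sh
# Sketch: persist and verify the bridge netfilter setting that kubeadm's preflight checks.

bridge_nf_enabled() {
    # Succeeds only when the given file contains exactly "1".
    file="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

write_bridge_sysctl_conf() {
    # Writes the persistent settings; the default path is an assumed file name.
    conf="${1:-/etc/sysctl.d/k8s.conf}"
    printf '%s\n' \
        'net.bridge.bridge-nf-call-iptables = 1' \
        'net.bridge.bridge-nf-call-ip6tables = 1' > "$conf"
}

# Usage as root on node2:
#   modprobe br_netfilter                             # load the module now
#   echo br_netfilter > /etc/modules-load.d/k8s.conf  # load it again at boot
#   write_bridge_sysctl_conf && sysctl --system       # persist and apply the settings
#   bridge_nf_enabled && echo "bridge-nf-call-iptables is 1"
```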
Try joining the cluster once more:
[root@node2 ~]# kubeadm join 192.168.204.130:6443 --token vo9o87.p07f0pv6fscubzyz --discovery-token-ca-cert-hash sha256:b85c1afaa3ba92935ae67caf515b893ce92af375568b8a7ecdc559f81a3d3257
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 24.0.7. Latest validated version: 20.10
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
The node joined successfully!
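As the join output suggests, the result can be confirmed from the control plane with kubectl get nodes. Here is a small sketch of checking it from a script. The helper name node_is_ready is hypothetical, and it simply parses the plain-text table that kubectl get nodes prints. A freshly joined node may show NotReady for a short while until its network plugin pod is running.

```shell
#!/bin/sh
# Sketch: check a node's STATUS column in `kubectl get nodes` output.

node_is_ready() {
    # Reads `kubectl get nodes` output on stdin; succeeds only if the
    # named node is listed with a STATUS of exactly "Ready".
    awk -v name="$1" '$1 == name && $2 == "Ready" { ok = 1 } END { exit ok ? 0 : 1 }'
}

# Usage on the control-plane node:
#   kubectl get nodes | node_is_ready node2 && echo "node2 is Ready"
#   kubectl describe node node2 | grep -i taints   # confirm the old taint is gone
```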