Fixing a k8s cluster that breaks after its virtual machines are migrated to another host

学海无涯，学习中

Category: Kubernetes. Tags: kubernetes, containers, cloud native

Article link: https://blog.csdn.net/caosnowflower/article/details/132318582

Cause of the problem

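A k8s cluster with one master and two worker nodes was installed on virtual machines running on machine A. The virtual machines were later copied as a whole to machine B, after which the cluster stopped working. The cause is that the IP addresses changed, so the cluster's internal network configuration is no longer valid.

Once the IPs of the master and the nodes change, communication breaks: the k8s service configuration still uses the old, fixed IPs, and commands such as kubectl get nodes can no longer connect because they still target the old address.

The cluster machines therefore have to be reconfigured with the new IPs.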
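Before changing anything, it can help to see which files still carry the old address. The following is a minimal sketch, assuming kubeadm's default layout under /etc/kubernetes; the helper name `find_stale_ip_refs` and the address in the example are hypothetical placeholders, not taken from the original setup.

```shell
# List files under a directory that still mention a given IP address.
# -F treats the address as a literal string; -w avoids matching a longer
# address that merely starts with it (e.g. .12 inside .129).
find_stale_ip_refs() {
    local old_ip="$1" dir="${2:-/etc/kubernetes}"
    grep -rlFw -- "$old_ip" "$dir" 2>/dev/null | sort
}

# Example (the old address is a placeholder):
# find_stale_ip_refs 192.168.1.10
```

Certificates under /etc/kubernetes/pki will not show up in this search, because the addresses inside them are stored in encoded form; they need a separate check.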

Solution

Preliminary step

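Update /etc/hosts on every machine in the cluster: the IPs have changed, but the hosts file still contains the old ones. Delete the stale entries for these hostnames first, because the command below only appends and an older entry for the same name would otherwise still be matched first.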

cat >> /etc/hosts << EOF 
172.16.149.129 k8s-master 
172.16.149.130 k8s-node1 
172.16.149.131 k8s-node2 
EOF
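Because `cat >>` only appends, running it twice or leaving the old lines in place produces conflicting entries for the same hostname. A hedged sketch of an idempotent alternative follows; `update_hosts_entry` is a hypothetical helper name, and the file path is passed in so the function can be tried on a copy first.

```shell
# Replace (or add) the hosts entry for one hostname.
# Usage: update_hosts_entry FILE IP HOSTNAME
update_hosts_entry() {
    local file="$1" ip="$2" name="$3" tmp
    tmp="$(mktemp)"
    # Keep comments and every line that does not list $name as a hostname.
    awk -v name="$name" '
        /^[[:space:]]*#/ { print; next }
        { keep = 1; for (i = 2; i <= NF; i++) if ($i == name) keep = 0 }
        keep { print }
    ' "$file" > "$tmp"
    printf '%s %s\n' "$ip" "$name" >> "$tmp"
    cat "$tmp" > "$file"   # overwrite in place, keeping owner and mode
    rm -f "$tmp"
}

# Example:
# update_hosts_entry /etc/hosts 172.16.149.129 k8s-master
# update_hosts_entry /etc/hosts 172.16.149.130 k8s-node1
# update_hosts_entry /etc/hosts 172.16.149.131 k8s-node2
```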

1. Operations on the master node

(1) Switch to /etc/kubernetes/manifests and replace the old IP address in etcd.yaml and kube-apiserver.yaml with the new IP.

[root@k8s-test ~]# cd /etc/kubernetes/manifests 
[root@k8s-test manifests]# vim etcd.yaml 
[root@k8s-test manifests]# vim kube-apiserver.yaml
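The manual edits in vim can also be scripted. The sketch below is one possible approach, not the author's method: `replace_ip_in_file` is a hypothetical helper, and the old address is a placeholder. Backups go to a separate directory on the assumption that the kubelet may try to load any regular file it finds in the manifests directory.

```shell
# Replace one IP address with another in a file, keeping a backup elsewhere.
# Usage: replace_ip_in_file FILE OLD_IP NEW_IP BACKUP_DIR
replace_ip_in_file() {
    local file="$1" old="$2" new="$3" backup_dir="$4" old_re tmp
    # Escape the dots so they match literally in the regular expression.
    old_re="$(printf '%s' "$old" | sed 's/\./\\./g')"
    mkdir -p "$backup_dir"
    cp -p "$file" "$backup_dir/$(basename "$file").bak"
    tmp="$(mktemp)"
    # Only match the address when it is not part of a longer address.
    sed -E "s/(^|[^0-9.])${old_re}([^0-9.]|\$)/\\1${new}\\2/g" "$file" > "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (old address is a placeholder):
# for f in etcd.yaml kube-apiserver.yaml; do
#     replace_ip_in_file "/etc/kubernetes/manifests/$f" 192.168.1.10 172.16.149.129 /root/manifests-backup
# done
```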

(2) Switch to /etc/kubernetes and generate a new kubeconfig file

[root@k8s-test manifests]# cd ..
[root@k8s-test kubernetes]# mv admin.conf admin.conf.bak
[root@k8s-test kubernetes]# kubeadm init phase kubeconfig admin --apiserver-advertise-address <new-ip>

(3) Switch to /etc/kubernetes/pki, move the old apiserver certificate out of the way, and generate a new one

[root@k8s-test kubernetes]# cd pki
[root@k8s-test pki]# mv apiserver.key apiserver.key.bak
[root@k8s-test pki]# mv apiserver.crt apiserver.crt.bak
[root@k8s-test pki]# kubeadm init phase certs apiserver  --apiserver-advertise-address <new-ip>
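To confirm that the regenerated certificate really covers the new address, its Subject Alternative Names can be inspected. This is a small sketch; `san_lists_ip` is a hypothetical helper that reads the text form of a certificate from standard input, so the matching logic can be tried without a real certificate.

```shell
# Succeeds if the certificate text on stdin lists the given IP among its
# Subject Alternative Names (openssl prints them as "IP Address:<ip>").
san_lists_ip() {
    local ip_re
    ip_re="$(printf '%s' "$1" | sed 's/\./\\./g')"
    grep -Eq "IP Address:${ip_re}(,|[[:space:]]|\$)"
}

# Example:
# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
#     | san_lists_ip 172.16.149.129 && echo "new IP is in the certificate"
```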

(4) Restart docker and the kubelet. Use the restart command that matches your operating system and version.

[root@k8s-test pki]# cd ..
[root@k8s-test kubernetes]# service docker restart
[root@k8s-test kubernetes]# service kubelet restart
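After the restart, the control-plane containers usually need some time before the API server answers again. A generic retry helper such as the sketch below can bridge that gap; the name `retry` and the attempt counts in the example are arbitrary choices, not part of the original procedure.

```shell
# Run a command until it succeeds, up to $1 attempts, sleeping $2 seconds
# between attempts. Returns 0 on success, 1 if every attempt failed.
retry() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Example: wait up to about a minute for the API server.
# retry 30 2 kubectl get nodes --kubeconfig=/etc/kubernetes/admin.conf
```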

(5) Export the kubeconfig file

[root@k8s-test kubernetes]# kubectl get nodes --kubeconfig=admin.conf  #  communication works again at this point
[root@k8s-test kubernetes]# sz admin.conf

(6) Replace the default kubeconfig file with admin.conf, so that kubectl get nodes can be used directly without the --kubeconfig flag

[root@k8s-test kubernetes]# cp admin.conf ~/.kube/config

Once admin.conf is installed on a client machine, that machine can reach this k8s cluster through the API.
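A quick way to double-check that a kubeconfig points at the new address is to read its `server:` field. A minimal sketch, assuming a single-cluster kubeconfig as generated by kubeadm; `kubeconfig_server` is a hypothetical helper name.

```shell
# Print the API server URL of the first cluster in a kubeconfig file.
kubeconfig_server() {
    awk '$1 == "server:" { print $2; exit }' "$1"
}

# Example:
# kubeconfig_server ~/.kube/config
```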

Note: if the cluster still does not work after the steps above, continue with the steps below.

(7) Use kubeadm reset to wipe all configuration, then run init again

[root@k8s-test ~]# kubeadm reset --cri-socket /var/run/cri-dockerd.sock

(8) After the reset succeeds, run init again with the updated apiserver address

Every IP-related parameter in kubeadm.yaml must be updated.
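The original kubeadm.yaml is not shown, so the following is only a sketch of how its address-related fields could be listed for review. It assumes the file uses the usual kubeadm configuration field names (`advertiseAddress`, `controlPlaneEndpoint`, `podSubnet`, `serviceSubnet`); `show_address_fields` is a hypothetical helper.

```shell
# Print address-related fields of a kubeadm config file with line numbers.
# The subnets normally stay the same, but are listed so they can be checked
# against the new node network for overlap.
show_address_fields() {
    grep -nE '^[[:space:]]*(advertiseAddress|controlPlaneEndpoint|podSubnet|serviceSubnet):' "$1"
}

# Example:
# show_address_fields kubeadm.yaml
```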

[root@k8s-test ~]# kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification

On success it prints:

Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.149.129:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:b99c1931f186e47c19753dca9b5b7b191628e7c0c44425be488ec01fb3782c3b \
    --cri-socket=unix:///var/run/cri-dockerd.sock

(9) Then run

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

At this point kubectl get nodes shows the master node as Ready, but the worker nodes are missing, so they have to be re-joined to the cluster.

2. Operations on the worker nodes

Re-join the worker nodes to the cluster.

(1) Use kubeadm reset to wipe all configuration

kubeadm reset --cri-socket /var/run/cri-dockerd.sock

(2) Join the worker node to the cluster's master node

kubeadm join 172.16.149.129:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:b99c1931f186e47c19753dca9b5b7b191628e7c0c44425be488ec01fb3782c3b --cri-socket=unix:///var/run/cri-dockerd.sock
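By default a bootstrap token expires after 24 hours, so the token printed by kubeadm init may no longer be valid when the workers are re-joined later. In that case a fresh join command can be generated on the master with `kubeadm token create --print-join-command`. The sketch below wraps that call and appends the cri-dockerd socket used in this article; `build_join_command` is a hypothetical helper name.

```shell
# Print a fresh join command with the CRI socket appended.
# Usage: build_join_command [CRI_SOCKET]   (run on the master node)
build_join_command() {
    local socket="${1:-unix:///var/run/cri-dockerd.sock}" join_cmd
    join_cmd="$(kubeadm token create --print-join-command)" || return 1
    printf '%s --cri-socket=%s\n' "$join_cmd" "$socket"
}

# Example: print the command, then paste it on each worker node.
# build_join_command
```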

3. Reinstall the Calico network plugin

Download calico.yaml.

(1) Run the following command to install it

kubectl apply -f calico.yaml

(2) Check the cluster status, the status of the built-in pods, and the component status

[root@k8s-master opt]# kubectl  get  nodes
NAME         STATUS   ROLES           AGE     VERSION
k8s-master   Ready    control-plane   2d20h   v1.27.0
k8s-node1    Ready    <none>          2d19h   v1.27.0
k8s-node2    Ready    <none>          2d19h   v1.27.0
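The same check can be scripted instead of read by eye. A minimal sketch, assuming the default column layout of `kubectl get nodes --no-headers` (name first, status second); `all_nodes_ready` is a hypothetical helper that reads that output from standard input.

```shell
# Succeeds only if at least one node is listed and every node's STATUS
# column is exactly "Ready" (so "NotReady" or "Ready,SchedulingDisabled"
# count as failures).
all_nodes_ready() {
    awk '$2 != "Ready" { bad = 1 } END { exit ((bad || NR == 0) ? 1 : 0) }'
}

# Example:
# kubectl get nodes --no-headers | all_nodes_ready && echo "all nodes Ready"
```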

Check the component status

[root@k8s-master opt]# kubectl get pods -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS   AGE     IP                NODE         NOMINATED NODE   READINESS GATES
calico-kube-controllers-75c9ddd877-wvr94   1/1     Running   0          2d20h   10.244.169.129    k8s-node2    <none>           <none>
calico-node-8js74                          1/1     Running   0          2d20h   192.168.204.132   k8s-node1    <none>           <none>
calico-node-chdt6                          1/1     Running   0          2d20h   192.168.204.133   k8s-node2    <none>           <none>
calico-node-pb9cq                          1/1     Running   0          2d20h   192.168.204.131   k8s-master   <none>           <none>
coredns-65dcc469f7-b2m6x                   1/1     Running   0          2d20h   10.244.169.131    k8s-node2    <none>           <none>
coredns-65dcc469f7-vtb7t                   1/1     Running   0          2d20h   10.244.169.130    k8s-node2    <none>           <none>
etcd-k8s-master                            1/1     Running   0          2d20h   192.168.204.131   k8s-master   <none>           <none>
kube-apiserver-k8s-master                  1/1     Running   0          2d20h   192.168.204.131   k8s-master   <none>           <none>
kube-controller-manager-k8s-master         1/1     Running   0          2d20h   192.168.204.131   k8s-master   <none>           <none>
kube-proxy-6v24c                           1/1     Running   0          2d20h   192.168.204.132   k8s-node1    <none>           <none>
kube-proxy-b85mq                           1/1     Running   0          2d20h   192.168.204.133   k8s-node2    <none>           <none>
kube-proxy-cv6gx                           1/1     Running   0          2d20h   192.168.204.131   k8s-master   <none>           <none>
kube-scheduler-k8s-master                  1/1     Running   0          2d20h   192.168.204.131   k8s-master   <none>           <none>
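For a longer pod list, filtering out the healthy rows makes problems easier to spot. The sketch below assumes the default column order of `kubectl get pods --no-headers` (name, READY, STATUS); `unhealthy_pods` is a hypothetical helper reading that output from standard input.

```shell
# Print the names of pods that are not Running or not fully ready.
# Note: pods of finished Jobs (STATUS "Completed") are also reported.
unhealthy_pods() {
    awk '{
        split($2, r, "/")                      # READY column, e.g. "1/1"
        if ($3 != "Running" || r[1] != r[2]) print $1
    }'
}

# Example:
# kubectl get pods -n kube-system --no-headers | unhealthy_pods
```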

4. Testing

After the steps above are complete, run kubectl get nodes to check the nodes: all of them are listed and their status is Ready.

(1) Create an nginx deployment

# Create an nginx deployment
kubectl create deployment nginx --image=nginx:1.14-alpine
# Expose its port through a NodePort service
kubectl expose deploy nginx --port=80 --target-port=80 --type=NodePort

二，查看服务kubectl get pod,svc

[root@k8s-master opt]# kubectl get pod,svc
NAME                         READY   STATUS    RESTARTS   AGE
pod/nginx-6d56fc78fc-hwwxv   1/1     Running   0          101s

NAME                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP        2d15h
service/nginx        NodePort    10.98.207.17   <none>        80:32283/TCP   18s
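In the PORT(S) column, `80:32283/TCP` means service port 80 is published on node port 32283. The ClusterIP is only reachable from inside the cluster, whereas a NodePort service can also be reached on a node's own IP at that node port. The sketch below extracts the node port from such a value; `node_port_from_ports` is a hypothetical helper, and `kubectl get svc nginx -o jsonpath='{.spec.ports[0].nodePort}'` is a more direct way to get the same number.

```shell
# Extract the node port from a PORT(S) value such as "80:32283/TCP".
# Prints nothing if the value has no node port (e.g. "443/TCP").
node_port_from_ports() {
    printf '%s\n' "$1" | sed -nE 's/^[0-9]+:([0-9]+)\/[A-Za-z]+$/\1/p'
}

# Example: reach nginx through a worker node's address.
# curl "http://172.16.149.130:$(node_port_from_ports 80:32283/TCP)"
```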

(3) Request the nginx page with curl http://10.98.207.17:80

[root@k8s-master opt]# curl http://10.98.207.17:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>