Problem background
A Kubernetes cluster (one master, two worker nodes) was installed on virtual machines on machine A. The VMs were copied wholesale to machine B, after which the cluster broke: the VM IP addresses had changed, but the cluster's network configuration still pointed at the old addresses.
Once the master and node IPs change, communication fails, because every Kubernetes service configuration embeds the old fixed IPs; even a simple command to list the nodes can no longer connect.
The fix is therefore to reconfigure the cluster for the servers' new IP addresses.
Solution
Prerequisite
Update /etc/hosts on every machine in the cluster; it still maps the hostnames to the old IPs. Remove the stale entries first (`cat >>` only appends), then add the new ones:
cat >> /etc/hosts << EOF
172.16.149.129 k8s-master
172.16.149.130 k8s-node1
172.16.149.131 k8s-node2
EOF
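If the old hostname entries are still in /etc/hosts, appending alone leaves stale lines behind; replacing the old IPs in place avoids that. A minimal sketch using sed, run against a temporary copy so the real file is untouched (the 192.168.204.x "old" addresses are hypothetical placeholders):

```shell
#!/bin/sh
# Sketch: swap old IPs for new ones in a hosts-style file.
# The 192.168.204.x old addresses are hypothetical placeholders;
# a temporary file stands in for /etc/hosts.
tmp=$(mktemp)
cat > "$tmp" << 'EOF'
192.168.204.131 k8s-master
192.168.204.132 k8s-node1
192.168.204.133 k8s-node2
EOF
# one in-place substitution per old/new pair
sed -i 's/192\.168\.204\.131/172.16.149.129/' "$tmp"
sed -i 's/192\.168\.204\.132/172.16.149.130/' "$tmp"
sed -i 's/192\.168\.204\.133/172.16.149.131/' "$tmp"
cat "$tmp"
rm -f "$tmp"
```

GNU sed's -i edits the file in place; on BSD/macOS sed the flag needs an explicit (possibly empty) suffix argument.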
1. Master node operations
Step 1: cd into /etc/kubernetes/manifests and replace the old IP with the new one in etcd.yaml and kube-apiserver.yaml:
[root@k8s-test ~]# cd /etc/kubernetes/manifests
[root@k8s-test manifests]# vim etcd.yaml
[root@k8s-test manifests]# vim kube-apiserver.yaml
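Editing each manifest by hand works, but every file that still mentions the old address can also be rewritten in one loop. A sketch, demonstrated on a temporary directory holding a one-line stand-in manifest (OLD_IP is a hypothetical placeholder; in practice set DIR=/etc/kubernetes/manifests and OLD_IP/NEW_IP to your real addresses):

```shell
#!/bin/sh
# Sketch: replace the old apiserver IP across all static-pod manifests.
OLD_IP=192.168.204.131          # hypothetical old address
NEW_IP=172.16.149.129           # the new master address
DIR=$(mktemp -d)                # stand-in for /etc/kubernetes/manifests
printf -- '- --advertise-client-urls=https://%s:2379\n' "$OLD_IP" > "$DIR/etcd.yaml"
# grep -rl lists the files that still mention the old IP; sed rewrites them
grep -rl "$OLD_IP" "$DIR" | while read -r f; do
  sed -i "s/$OLD_IP/$NEW_IP/g" "$f"
done
cat "$DIR/etcd.yaml"            # now references the new IP
rm -rf "$DIR"
```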
Step 2: generate a new kubeconfig. cd back up to /etc/kubernetes:
[root@k8s-test manifests]# cd ..
[root@k8s-test kubernetes]# mv admin.conf admin.conf.bak
[root@k8s-test kubernetes]# kubeadm init phase kubeconfig admin --apiserver-advertise-address <new-ip>
Step 3: back up the old apiserver certificate and key and generate new ones. cd into /etc/kubernetes/pki:
[root@k8s-test kubernetes]# cd pki
[root@k8s-test pki]# mv apiserver.key apiserver.key.bak
[root@k8s-test pki]# mv apiserver.crt apiserver.crt.bak
[root@k8s-test pki]# kubeadm init phase certs apiserver --apiserver-advertise-address <new-ip>
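To confirm the regenerated certificate really carries the new address, its Subject Alternative Names can be inspected with openssl. The sketch below creates a throwaway self-signed certificate so it runs anywhere (requires OpenSSL 1.1.1+ for -addext); on the master you would point the final command at /etc/kubernetes/pki/apiserver.crt:

```shell
#!/bin/sh
# Sketch: check that an apiserver certificate's SANs include the new IP.
# A throwaway self-signed cert stands in for /etc/kubernetes/pki/apiserver.crt.
d=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=kube-apiserver" \
  -addext "subjectAltName=IP:172.16.149.129,DNS:k8s-master" \
  -keyout "$d/apiserver.key" -out "$d/apiserver.crt" 2>/dev/null
# prints the SAN section; the new IP must appear here
openssl x509 -in "$d/apiserver.crt" -noout -text \
  | grep -A1 "Subject Alternative Name"
rm -rf "$d"
```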
Step 4: restart Docker and the kubelet (use the restart command appropriate to your OS and init system, e.g. `systemctl restart docker`):
[root@k8s-test pki]# cd ..
[root@k8s-test kubernetes]# service docker restart
[root@k8s-test kubernetes]# service kubelet restart
Step 5: verify connectivity with the new kubeconfig, then download it:
[root@k8s-test kubernetes]# kubectl get nodes --kubeconfig=admin.conf  # communication with the apiserver works again
[root@k8s-test kubernetes]# sz admin.conf  # sz (from lrzsz) downloads the file through the terminal
Step 6: make admin.conf the default kubeconfig so that a plain `kubectl get nodes` works:
[root@k8s-test kubernetes]# mv admin.conf ~/.kube/config
Copying admin.conf to another machine lets that machine access this cluster through the API as well.
Note: if the cluster still misbehaves after the steps above, continue with the steps below.
Step 7: wipe all configuration with `kubeadm reset`, then re-initialize:
[root@k8s-test ~]# kubeadm reset --cri-socket /var/run/cri-dockerd.sock
Step 8: after the reset succeeds, re-run init with the updated apiserver address.
Every IP-related parameter in kubeadm.yaml must first be changed to the new address.
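The kubeadm.yaml used here is not shown; as a sketch, these are the fields of a kubeadm v1beta3 configuration (the config API version matching this v1.27 cluster) that typically embed the address and must point at the new IP. The values are taken from this document's setup; controlPlaneEndpoint applies only if your original file sets it:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.149.129        # the master's NEW IP
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.27.0
controlPlaneEndpoint: "172.16.149.129:6443"  # only if present in your file
```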
[root@k8s-test ~]# kubeadm init --config=kubeadm.yaml --ignore-preflight-errors=SystemVerification
On success it prints:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 172.16.149.129:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:b99c1931f186e47c19753dca9b5b7b191628e7c0c44425be488ec01fb3782c3b \
        --cri-socket=unix:///var/run/cri-dockerd.sock
Step 9: then run the commands it printed:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
At this point `kubectl get nodes` shows the master as Ready, but the worker nodes are missing; they need to be re-joined to the cluster.
2. Worker node operations
Re-join each worker node to the cluster.
Step 1: wipe the node's configuration with `kubeadm reset`:
kubeadm reset --cri-socket /var/run/cri-dockerd.sock
Step 2: join the node to the master:
kubeadm join 172.16.149.129:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:b99c1931f186e47c19753dca9b5b7b191628e7c0c44425be488ec01fb3782c3b --cri-socket=unix:///var/run/cri-dockerd.sock
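The long --discovery-token-ca-cert-hash value is not arbitrary: it is the SHA-256 digest of the cluster CA's public key in DER form, so it can be recomputed on the master whenever it has been lost. The sketch below generates a throwaway CA certificate to stay self-contained; on the real master the input would be /etc/kubernetes/pki/ca.crt. (Alternatively, running `kubeadm token create --print-join-command` on the master prints a complete, fresh join command.)

```shell
#!/bin/sh
# Sketch: recompute the discovery-token-ca-cert-hash from a CA certificate.
# A throwaway CA cert stands in for /etc/kubernetes/pki/ca.crt.
d=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=kubernetes" -keyout "$d/ca.key" -out "$d/ca.crt" 2>/dev/null
# hash = sha256 over the DER-encoded public key extracted from the CA cert
hash=$(openssl x509 -pubkey -in "$d/ca.crt" \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | awk '{print $NF}')
echo "sha256:$hash"
rm -rf "$d"
```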
3. Reinstall the Calico network plugin
Download calico.yaml.
Step 1: apply it:
kubectl apply -f calico.yaml
Step 2: check node status, the built-in pods, and component health:
[root@k8s-master opt]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master Ready control-plane 2d20h v1.27.0
k8s-node1 Ready <none> 2d19h v1.27.0
k8s-node2 Ready <none> 2d19h v1.27.0
Check the component pods:
[root@k8s-master opt]# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-75c9ddd877-wvr94 1/1 Running 0 2d20h 10.244.169.129 k8s-node2 <none> <none>
calico-node-8js74 1/1 Running 0 2d20h 172.16.149.130 k8s-node1 <none> <none>
calico-node-chdt6 1/1 Running 0 2d20h 172.16.149.131 k8s-node2 <none> <none>
calico-node-pb9cq 1/1 Running 0 2d20h 172.16.149.129 k8s-master <none> <none>
coredns-65dcc469f7-b2m6x 1/1 Running 0 2d20h 10.244.169.131 k8s-node2 <none> <none>
coredns-65dcc469f7-vtb7t 1/1 Running 0 2d20h 10.244.169.130 k8s-node2 <none> <none>
etcd-k8s-master 1/1 Running 0 2d20h 172.16.149.129 k8s-master <none> <none>
kube-apiserver-k8s-master 1/1 Running 0 2d20h 172.16.149.129 k8s-master <none> <none>
kube-controller-manager-k8s-master 1/1 Running 0 2d20h 172.16.149.129 k8s-master <none> <none>
kube-proxy-6v24c 1/1 Running 0 2d20h 172.16.149.130 k8s-node1 <none> <none>
kube-proxy-b85mq 1/1 Running 0 2d20h 172.16.149.131 k8s-node2 <none> <none>
kube-proxy-cv6gx 1/1 Running 0 2d20h 172.16.149.129 k8s-master <none> <none>
kube-scheduler-k8s-master 1/1 Running 0 2d20h 172.16.149.129 k8s-master <none> <none>
4. Test
With all the steps above complete, `kubectl get nodes` lists every node with status Ready.
Step 1: create an nginx deployment:
# create an nginx deployment
kubectl create deployment nginx --image=nginx:1.14-alpine
# expose port 80 via a NodePort service
kubectl expose deploy nginx --port=80 --target-port=80 --type=NodePort
Step 2: check the pod and the service:
[root@k8s-master opt]# kubectl get pod,svc
NAME READY STATUS RESTARTS AGE
pod/nginx-6d56fc78fc-hwwxv 1/1 Running 0 101s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d15h
service/nginx NodePort 10.98.207.17 <none> 80:32283/TCP 18s
Step 3: fetch the nginx welcome page via the service's cluster IP:
[root@k8s-master opt]# curl http://10.98.207.17:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Step 4: the page can also be opened in a browser via node IP + port.
Use the IP of any of the three VMs with NodePort 32283, which maps to the container's port 80.