- Prepare 3 new physical machines, apply the base OS configuration, and install kubeadm, kubelet, kubectl, etc.
- Since the cluster's etcd runs as an external cluster, configure kubelet on the new nodes as follows.
Create a new systemd drop-in that takes precedence over the kubeadm-provided kubelet unit file, so kubelet manages etcd directly:
cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
[Service]
ExecStart=
# Replace "systemd" with the cgroup driver of your container runtime. The default value in the kubelet is "cgroupfs".
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd
Restart=always
EOF
##########################
systemctl daemon-reload
systemctl restart kubelet
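A quick sanity check that kubelet picked up the drop-in and is now running under this override:
systemctl cat kubelet          # the drop-in 20-etcd-service-manager.conf should be listed
systemctl status kubelet --no-pager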
- Generate a kubeadm-config.yaml for each of the 3 new etcd nodes. The first node's file is shown below; a sketch for generating the other two follows it.
[root@etcd1-test ~]# cat kubeadm-config.yaml
apiVersion: "kubeadm.k8s.io/v1beta2"
kind: ClusterConfiguration
etcd:
local:
serverCertSANs:
- "10.120.37.100"
peerCertSANs:
- "10.120.37.100"
dataDir: "/ssd1/etcd"
extraArgs:
quota-backend-bytes: "8589934592"
max-snapshots: "5"
auto-compaction-retention: "1"
max-wals: "8"
initial-cluster: etcd1=https://etcd4-test.yidian-inc.com:2380,etcd2=https://etcd5-test.yidian-inc.com:2380,etcd3=https://etcd6-test.yidian-inc.com:2380
initial-cluster-state: existing
name: etcd1
listen-peer-urls: https://10.120.37.100:2380
listen-client-urls: https://10.120.37.100:2379
advertise-client-urls: https://10.120.37.100:2379
initial-advertise-peer-urls: https://10.120.37.100:2380
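The other two nodes need the same file with only the node name and IPs changed. A minimal sketch (the three new-node IPs are an assumption based on the etcdctl endpoints used later in this article):
HOSTS=(10.120.37.100 10.120.37.101 10.120.42.103)
NAMES=(etcd1 etcd2 etcd3)
for i in "${!HOSTS[@]}"; do
  # each node's config differs only in name, the listen/advertise URLs, and the cert SANs
  sed -e "s/10.120.37.100/${HOSTS[$i]}/g" -e "s/name: etcd1/name: ${NAMES[$i]}/" \
      kubeadm-config.yaml > kubeadm-config-${NAMES[$i]}.yaml
done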
- Sync the old etcd cluster's ca.crt and ca.key to the 3 new etcd nodes (sketched below), then generate certificates for the new etcd members:
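A minimal sketch of the sync step, assuming root SSH from an old etcd node to the new ones (the new-node IPs are taken from the ectl alias used later; paths are the kubeadm defaults):
for host in 10.120.37.100 10.120.37.101 10.120.42.103; do
  ssh root@${host} "mkdir -p /etc/kubernetes/pki/etcd"
  scp /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/etcd/ca.key root@${host}:/etc/kubernetes/pki/etcd/
done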
kubeadm init phase certs etcd-server --config=/root/kubeadm-config.yaml
kubeadm init phase certs etcd-peer --config=/root/kubeadm-config.yaml
kubeadm init phase certs etcd-healthcheck-client --config=/root/kubeadm-config.yaml
kubeadm init phase certs apiserver-etcd-client --config=/root/kubeadm-config.yaml
- Extend the new etcd certificates' validity to 10 years, using:
https://github.com/yuyicai/update-kube-cert
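A sketch of running it (check the repository's README for exact usage; the sub-command here is my assumption of its documented default):
git clone https://github.com/yuyicai/update-kube-cert.git
cd update-kube-cert && chmod +x update-kubeadm-cert.sh
./update-kubeadm-cert.sh all    # renew the certificates under /etc/kubernetes/pki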
Check the certificate validity after the renewal:
[root@etcd4-test pki]# openssl x509 -in apiserver-etcd-client.crt -noout -dates
notBefore=Mar 10 03:03:33 2023 GMT
notAfter=Mar 7 03:03:33 2033 GMT
- Generate the etcd static pod manifest:
kubeadm init phase etcd local --config=/root/kubeadm-config.yaml
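kubelet then starts etcd as a static pod from the generated manifest. A quick check (assuming a Docker runtime, as the image tag era suggests):
ls /etc/kubernetes/manifests/etcd.yaml
docker ps | grep etcd    # or: crictl ps | grep etcd; may crash-loop until member add in the next step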
- Add the first etcd node to the existing cluster:
alias ectl="etcdctl --endpoints=10.120.37.100:2379,10.120.37.101:2379,10.120.42.103:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/apiserver-etcd-client.crt --key=/etc/kubernetes/pki/apiserver-etcd-client.key"
ectl member add etcd1 --peer-urls="https://10.136.45.19:2380"
ectl member list -w table  (check the members)
- Modify the configuration of the first etcd node (10.136.45.19).
Note 1: initial-cluster can only include one newly added node at a time, never all of them at once, otherwise etcd fails with "member count is unequal".
Note 2: member add registered the peer URL as 10.136.45.19:2380, so initial-cluster must list exactly 10.136.45.19:2380 for this member, otherwise peering fails.
Note 3: if peering still fails, replace every hostname in initial-cluster with its IP address.
The resulting etcd static pod manifest (/etc/kubernetes/manifests/etcd.yaml):
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.136.45.20:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.136.45.20:2379
    - --auto-compaction-retention=1
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/ssd1/etcd
    - --initial-advertise-peer-urls=https://10.136.45.20:2380
    - --initial-cluster=etcd1=https://etcd1-test.yidian-inc.com:2380,etcd2=https://etcd2-test.yidian-inc.com:2380,etcd3=https://etcd3-test.yidian-inc.com:2380,etcd4=https://10.136.45.19:2380
    - --initial-cluster-state=existing
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://10.136.45.20:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.136.45.20:2380
    - --max-snapshots=5
    - --max-wals=8
    - --name=etcd5
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --quota-backend-bytes=8589934592
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.4.3-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /ssd1/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /ssd1/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
- Similarly, add the second and third etcd nodes to the cluster.
- Update initial-cluster on the first two added nodes (etcd1 and etcd2) so their member lists match etcd3's, keeping the membership consistent across all nodes.
- Check the state of the whole cluster: make sure every node is OK and the data is fully synced; the "RAFT APPLIED INDEX" values should match across all members (see the check below).
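A sketch of that check, reusing the ectl alias defined above:
ectl endpoint health             # every endpoint should report healthy
ectl endpoint status -w table    # RAFT APPLIED INDEX should be identical across members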
- Because our Calico configuration connects directly to etcd, Calico must now be pointed at the new etcd cluster: first update the secrets, then update the etcd endpoints in the calico ConfigMap (sketched below).
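A sketch, assuming the stock Calico etcd manifest names (calico-etcd-secrets and calico-config in kube-system); verify against your deployment:
kubectl -n kube-system edit secret calico-etcd-secrets    # update etcd-ca / etcd-cert / etcd-key if the certs changed
kubectl -n kube-system edit configmap calico-config       # point etcd_endpoints at the new etcd cluster
kubectl -n kube-system rollout restart daemonset/calico-node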
- If everything so far is fine, continue with the remaining steps.
- Take the old endpoints out of LVS, renew the etcd certificates on the master nodes, and update the etcd configuration in the apiserver (sketched below).
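A minimal sketch of the apiserver change, assuming the new endpoints are the ones in the ectl alias above. Editing the static pod manifest makes kubelet restart the apiserver automatically:
sed -i 's#--etcd-servers=.*#--etcd-servers=https://10.120.37.100:2379,https://10.120.37.101:2379,https://10.120.42.103:2379#' /etc/kubernetes/manifests/kube-apiserver.yaml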
- Edit initial-cluster on the new etcd nodes to drop the old members' entries, applying the change as a rolling update.
- Start decommissioning the old etcd nodes:
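A sketch of removing one old member at a time, using the ectl alias:
ectl member list -w table     # note the ID of the old member (first column)
ectl member remove <MEMBER_ID>
ectl endpoint health          # confirm the cluster stays healthy before removing the next one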
- Move the etcd leader ("master") to one of the new nodes:
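etcdctl can transfer leadership directly. A sketch, using the ectl alias (the target member ID comes from member list):
ectl endpoint status -w table    # the IS LEADER column shows the current leader
ectl move-leader <NEW_NODE_MEMBER_ID>
# if move-leader errors, re-run it with --endpoints pointed at the current leader only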
- Take the last old etcd node offline.
- Test the result of the migration.
Start a test pod and check that it comes up normally. If it fails to start, you may need to restart the calico-node pods in bulk; if it starts normally, nothing further is needed.
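A minimal smoke test (nginx as a stand-in image; the pod name is arbitrary):
kubectl run migration-test --image=nginx --restart=Never
kubectl get pod migration-test -o wide    # should get a pod IP and reach Running
kubectl delete pod migration-test
# if pods hang in ContainerCreating with CNI errors, restart calico-node in bulk:
kubectl -n kube-system rollout restart daemonset/calico-node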