Kubernetes uses certificates in many places, and installing Kubernetes and configuring component startup parameters involves a large number of certificate-related settings. They broadly fall into two categories:
Certificates used for communication among the cluster Master, Etcd, and other control-plane components;
Certificates used for communication between the kubelet client components in the cluster and the API server.
As with community Kubernetes, the certificates TKGm uses for communication among the cluster Master, Etcd, and other components are valid for 10 years, while the certificates for communication between the kubelet client components and the API server are valid for 1 year by default. Once the cluster has been running for a year, it reports a "certificate has expired or is not yet valid" error, and cluster Nodes can no longer communicate with the cluster Master. Community Kubernetes generally has three ways to avoid certificate expiration: first, upgrade the cluster, which renews the certificates as a side effect of upgrading Kubernetes; second, modify the source code, i.e. recompile kubeadm; third, regenerate the certificates.
TKGm currently supports two ways to renew certificates: upgrading the cluster, or regenerating the certificates.
It is therefore important to monitor certificate status and upgrade the cluster or renew the certificates before they expire. To save effort, many solutions simply extend the validity of the kubelet-to-API-server client certificates to 99 years, which carries real security risks. TKGm recommends upgrading the cluster regularly or renewing the certificates manually.
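A minimal monitoring sketch is shown below (assuming GNU `date` and the standard kubeadm certificate layout under /etc/kubernetes/pki; the demo generates a throwaway certificate so the script can run anywhere):

```shell
#!/bin/sh
# Minimal sketch: report how many days remain before a certificate expires.
# In real use, point this at a node's certificates, e.g.
# /etc/kubernetes/pki/apiserver.crt (standard kubeadm layout; an assumption).
days_until_expiry() {
  # "notAfter=Sep  1 12:00:00 2025 GMT" -> epoch seconds -> days from now
  end=$(openssl x509 -enddate -noout -in "$1" | cut -d= -f2)
  end_s=$(date -d "$end" +%s)          # GNU date
  echo $(( (end_s - $(date +%s)) / 86400 ))
}

# Demo on a throwaway self-signed certificate valid for 365 days.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" -days 365 \
  -keyout "$tmp/key.pem" -out "$tmp/cert.pem" 2>/dev/null
left=$(days_until_expiry "$tmp/cert.pem")
echo "certificate expires in $left days"
if [ "$left" -lt 30 ]; then echo "WARNING: renew soon"; fi
```

A cron job on each control-plane node running a check like this can raise an alert well before the one-year mark.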
The TKGm certificate renewal steps are as follows (this procedure applies before the certificates expire; if they have already expired, different steps are required).
Note: the steps are the same for management clusters and workload clusters.
Tested versions:
TKGm version: 1.4
Kubernetes version: v1.21.2+vmware.1
Certificate renewal steps:
1. Log in to the TKGm cluster control-plane nodes (you need to log in to every control-plane node) and check the client certificate expiration times.
1) Check the cluster Node IP information
$ kubectl get nodes -o wide
[root@bootstrap yaml]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
workload01n-control-plane-kkpvh Ready control-plane,master 2d19h v1.21.2+vmware.1 192.168.110.173 192.168.110.173 VMware Photon OS/Linux 4.19.198-1.ph3 containerd://1.4.6
workload01n-md-0-5f47967fdb-7x4bx Ready <none> 20h v1.21.2+vmware.1 192.168.110.178 192.168.110.178 VMware Photon OS/Linux 4.19.198-1.ph3 containerd://1.4.6
workload01n-md-0-5f47967fdb-8rclf Ready <none> 21h v1.21.2+vmware.1 192.168.110.177 192.168.110.177 VMware Photon OS/Linux 4.19.198-1.ph3 containerd://1.4.6
workload01n-md-0-5f47967fdb-zrmvl Ready <none> 20h v1.21.2+vmware.1 192.168.110.179 192.168.110.179 VMware Photon OS/Linux 4.19.198-1.ph3 containerd://1.4.6
2) From the bootstrap machine, log in to a cluster control-plane node
ssh capv@<control-plane-node-IP>
Note: the bootstrap machine's SSH public key was injected into the nodes when the management and workload clusters were deployed.
See the relevant steps in my earlier article "Tanzu Learning Series: TKGm 1.4 for vSphere Quick Deployment".
[workload01n-admin@workload01n|default] [root@bootstrap yaml]# ssh capv@192.168.110.173
The authenticity of host '192.168.110.173 (192.168.110.173)' can't be established.
ECDSA key fingerprint is SHA256:0v5C1Zh3m6CNRd1B+QzCmE1UiqVJyTpd52QJpZRKfVU.
ECDSA key fingerprint is MD5:69:e5:de:e5:9d:82:0a:dc:6a:e4:ac:29:66:98:82:88.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.110.173' (ECDSA) to the list of known hosts.
Enter passphrase for key '/root/.ssh/id_rsa':
08:25:33 up 2 days, 19:11, 0 users, load average: 1.03, 1.95, 1.96
29 Security notice(s)
Run 'tdnf updateinfo info' to see the details.
capv@workload01n-control-plane-kkpvh [ ~ ]$
3) Check the certificate status; the EXPIRES column shows each certificate's expiration time. Confirm when the client component certificates expire.
$ sudo kubeadm certs check-expiration
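Independently of kubeadm, the same check can be cross-verified with openssl's `-checkend` flag, which exits non-zero if a certificate expires within the given number of seconds. A hedged sketch (on a control-plane node you would run it against /etc/kubernetes/pki, assuming the standard kubeadm layout; the demo uses throwaway certificates so it can run anywhere):

```shell
#!/bin/sh
# Sketch: flag every certificate in a directory that expires within 30 days,
# using `openssl x509 -checkend` (non-zero exit status means "expiring").
check_certs() {
  for c in "$1"/*.crt; do
    if openssl x509 -checkend $((30*24*3600)) -noout -in "$c" >/dev/null; then
      echo "$c: ok"
    else
      echo "$c: expires within 30 days"
    fi
  done
}

# Demo: one long-lived and one short-lived throwaway certificate.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=long" -days 365 \
  -keyout "$tmp/long.key" -out "$tmp/long.crt" 2>/dev/null
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=short" -days 7 \
  -keyout "$tmp/short.key" -out "$tmp/short.crt" 2>/dev/null
check_certs "$tmp"
```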
2. Log in to each control-plane node and renew the certificates
1) Renew the kube-apiserver, kube-controller-manager, kube-scheduler, and etcd certificates. When the renewal completes, you are prompted to restart kube-apiserver, kube-controller-manager, kube-scheduler, and etcd.
$ sudo kubeadm certs renew all
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo kubeadm certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificate
2) Find the corresponding container IDs
$ sudo crictl ps|grep kube-apiserver
$ sudo crictl ps|grep kube-controller-manager
$ sudo crictl ps|grep kube-scheduler
$ sudo crictl ps|grep etcd
3) Restart the corresponding containers by running the stop command; they will be restarted automatically.
$ sudo crictl stop <container-id>
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-apiserver
fd3898665adb4 0b9437b832f65 2 days ago Running kube-apiserver 0 98947a498ed46
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-apiserver
fd3898665adb4 0b9437b832f65 2 days ago Running kube-apiserver 0 98947a498ed46
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-controller-manager
82ce0ee8c5263 060eb69223237 15 hours ago Running kube-controller-manager 1 3735dd4223e35
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-scheduler
c9e4a91c75a55 640b7ee0df98b 15 hours ago Running kube-scheduler 1 056ad6b97126e
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep etcd
bba626f73748e 6f7c29e5ac889 2 days ago Running etcd 0 b51830438ede1
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl stop fd3898665adb4 82ce0ee8c5263 c9e4a91c75a55 bba626f73748e
fd3898665adb4
82ce0ee8c5263
c9e4a91c75a55
bba626f73748e
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-apiserver
fc53ce36869df 0b9437b832f65 7 seconds ago Running kube-apiserver 1 98947a498ed46
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-controller-manager
d7f91f3826c21 060eb69223237 16 seconds ago Running kube-controller-manager 2 3735dd4223e35
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep kube-scheduler
be6cca6996ecb 640b7ee0df98b 20 seconds ago Running kube-scheduler 2 056ad6b97126e
capv@workload01n-control-plane-kkpvh [ ~ ]$ sudo crictl ps|grep etcd
39e12db582577 6f7c29e5ac889 23 seconds ago Running etcd 1 b51830438ede1
capv@workload01n-control-plane-kkpvh [ ~ ]$
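The lookup-and-stop sequence above can be scripted. As a hedged sketch (assuming `crictl`'s `--name` filter and `-q`/quiet flag, which prints only container IDs), it emits the stop commands for review rather than running them directly:

```shell
#!/bin/sh
# Sketch: emit one `crictl stop` command per control-plane component instead
# of executing it, so the plan can be reviewed before it touches the node.
plan_restarts() {
  for name in kube-apiserver kube-controller-manager kube-scheduler etcd; do
    echo "sudo crictl stop \$(sudo crictl ps -q --name $name | head -n1)"
  done
}

plan_restarts    # review the output, then: plan_restarts | sh
```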
4) Restart the kubelet service
$ sudo systemctl restart kubelet
5) Check the new certificate expiration times
$ sudo kubeadm certs check-expiration
The certificates are now renewed for another year.
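The renewed expiry can also be confirmed from the client certificate embedded in a kubeconfig. A sketch (assuming an inline base64 `client-certificate-data` field, as kubeadm writes in /etc/kubernetes/admin.conf; the demo builds a throwaway kubeconfig so it can run anywhere):

```shell
#!/bin/sh
# Sketch: print the expiry of the client certificate embedded in a kubeconfig.
kubeconfig_cert_expiry() {
  awk '/client-certificate-data:/ {print $2; exit}' "$1" \
    | base64 -d | openssl x509 -enddate -noout
}

# Demo on a throwaway kubeconfig built around a fresh certificate.
tmp=$(mktemp -d)
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" -days 365 \
  -keyout "$tmp/key.pem" -out "$tmp/cert.pem" 2>/dev/null
printf 'users:\n- user:\n    client-certificate-data: %s\n' \
  "$(base64 -w0 "$tmp/cert.pem")" > "$tmp/kubeconfig"
kubeconfig_cert_expiry "$tmp/kubeconfig"
```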
(End of article)