1. Preparation
- Before you start, add the following entries to /etc/hosts on all nodes:
$ cat /etc/hosts
10.32.137.202 webservice-rs17.idcyz.hb1.kwaidc.com
10.32.137.27 webservice-rs18.idcyz.hb1.kwaidc.com
10.62.172.233 bjpg-rs866.yz02
127.0.0.1 my-dev.ci.com
- Install the control-plane components (kubeadm, kubelet, kubectl) on the new master node.
- Pull the kube-apiserver, kube-controller-manager, kube-scheduler, etcd, and coredns images in advance, matching the versions used by the existing masters:
$ kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.23.4
k8s.gcr.io/kube-controller-manager:v1.23.4
k8s.gcr.io/kube-scheduler:v1.23.4
k8s.gcr.io/kube-proxy:v1.23.4
k8s.gcr.io/pause:3.6
k8s.gcr.io/etcd:3.5.1-0
k8s.gcr.io/coredns/coredns:v1.8.6
Note: pulling directly from k8s.gcr.io is usually slow (it requires getting past the GFW); you can pull from an Aliyun mirror instead and then retag the images as k8s.gcr.io/xxxxx.
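The mirror-and-retag approach can be scripted. A dry-run sketch that prints the pull/tag commands (remove the `echo`s to execute); the `registry.aliyuncs.com/google_containers` mirror path is an assumption, so verify it actually carries your versions:

```shell
#!/bin/sh
# Pull each image from an Aliyun mirror, then retag it under k8s.gcr.io
# so kubeadm finds it locally. Dry run: the commands are only printed.
MIRROR=registry.aliyuncs.com/google_containers
images="kube-apiserver:v1.23.4 kube-controller-manager:v1.23.4 \
kube-scheduler:v1.23.4 kube-proxy:v1.23.4 pause:3.6 etcd:3.5.1-0 \
coredns:v1.8.6"

for img in $images; do
  # CoreDNS keeps a nested path upstream: k8s.gcr.io/coredns/coredns,
  # which mirrors typically flatten to a single path component.
  case "$img" in
    coredns:*) dst="coredns/$img" ;;
    *)         dst="$img" ;;
  esac
  echo docker pull "$MIRROR/$img"
  echo docker tag "$MIRROR/$img" "k8s.gcr.io/$dst"
done
```

After retagging, `kubeadm config images list` should show no missing images when the join runs.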
2. Update the APIServer certificate
When the APIServer address changes, or when a new master node is added, any hostname or IP address that is not in the certificate's Subject Alternative Name (SAN) list will cause TLS errors on access: clients report that the certificate is not valid for the given IP address or hostname. The certificate therefore has to be regenerated so that its SAN list covers every IP address and hostname you will use to reach the APIServer.
First we need a kubeadm configuration file. If you installed the cluster from a config file in the first place, you can update that file directly. If you ran plain `kubeadm init` instead, you can recover the configuration from the cluster, because kubeadm writes it into a ConfigMap named kubeadm-config in the kube-system namespace.
- Export that configuration with the following command:
$ kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > kubeadm.yaml
The command above exports the following:
controlPlaneEndpoint: 10.32.137.202:6443
apiServer:
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.23.1
networking:
  dnsDomain: cluster.local
  podSubnet: 172.20.0.0/16
  serviceSubnet: 172.21.0.0/20
scheduler: {}
The configuration above does not list any extra SAN entries. To add them, put a certSANs list under the apiServer section.
Since this was a test, I added only one extra master node, so the certSANs list contains just that machine's hostname and IP (plus the current master's).
After the edit:
Note: controlPlaneEndpoint must be set as well, otherwise the later `kubeadm join --control-plane` fails with an error about the cluster not having "a stable controlPlaneEndpoint address".
apiServer:
  certSANs:
  - webservice-rs17.idcyz.hb1.kwaidc.com
  - bjpg-rs866.yz02
  - 10.32.137.202
  - 10.62.172.233
  extraArgs:
    authorization-mode: Node,RBAC
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 10.32.137.202:6443
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.23.1
networking:
  dnsDomain: cluster.local
  podSubnet: 172.20.0.0/16
  serviceSubnet: 172.21.0.0/20
scheduler: {}
- Or edit the ConfigMap in place:
$ kubectl -n kube-system edit cm kubeadm-config
Move the existing APIServer certificate and key out of the way: if kubeadm detects that they already exist in the expected location, it will not create new ones.
$ mv /etc/kubernetes/pki/apiserver.{crt,key} ~
Then generate a new certificate with kubeadm:
$ sudo kubeadm init phase certs apiserver --config kubeadm.yaml
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [bjpg-rs866.yz02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local webservice-rs17.idcyz.hb1 webservice-rs17.idcyz.hb1.kwaidc.com] and IPs [172.21.0.1 10.32.137.202 10.62.172.233]
The output lists the DNS names and IP addresses the APIServer certificate is signed for. Compare them carefully against the names you intend to use; if anything is missing, add it to the certSANs list above and regenerate the certificate.
The command uses the kubeadm config file specified above to generate a new certificate and key for the APIServer; because that file contains the certSANs list, kubeadm automatically includes those SANs in the new certificate.
The last step is to restart the APIServer so it picks up the new certificate. The simplest way is to kill the APIServer container directly:
$ docker ps | grep kube-apiserver | grep -v pause
d18a2d3291cd b6d7abedde39 "kube-apiserver --ad…" 7 weeks ago
$ docker kill <container-id>
Once the container is killed, the kubelet automatically restarts it, and the restarted container serves the new certificate. After the APIServer is back up, you can connect to it via the newly added IP address or hostname, e.g. bjpg-rs866.yz02.
Use openssl to check whether the generated certificate contains the newly added SAN entries:
$ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text
X509v3 Subject Alternative Name:
DNS:bjpg-rs866.yz02, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:webservice-rs17.idcyz.hb1, DNS:webservice-rs17.idcyz.hb1.kwaidc.com, IP Address:172.21.0.1, IP Address:10.32.137.202, IP Address:10.62.172.233
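To print only the SAN section instead of the whole dump, OpenSSL 1.1.1+ supports `-noout -ext subjectAltName`. A self-contained sketch using a throwaway certificate (the /tmp file names are placeholders; on the master you would point `-in` at /etc/kubernetes/pki/apiserver.crt):

```shell
#!/bin/sh
# Create a throwaway key and certificate carrying SAN entries, then
# print only the SAN section of the certificate.
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/demo.key -out /tmp/demo.crt -days 1 \
  -subj "/CN=kube-apiserver" \
  -addext "subjectAltName=DNS:bjpg-rs866.yz02,IP:10.62.172.233"

# Prints the "X509v3 Subject Alternative Name" block and nothing else
openssl x509 -in /tmp/demo.crt -noout -ext subjectAltName
```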
Many blog posts say that after modifying the configuration you also need to upload it back to the cluster, but according to the official documentation the `kubeadm config upload from-file` command no longer exists. On recent versions the equivalent appears to be `kubeadm init phase upload-config kubeadm --config kubeadm.yaml`, but verify this against your kubeadm version; if anyone knows for sure, please add a comment.
3. Sync master certificates to the new master node
Only the following files need to be copied to the new master; the remaining certificates are generated automatically during the join. (`${port}` and `${new_master}` are placeholders for your SSH port and the new master's address; the target paths mirror the source paths.)
scp -P ${port} /etc/kubernetes/pki/ca.crt ${new_master}:/etc/kubernetes/pki/
scp -P ${port} /etc/kubernetes/pki/ca.key ${new_master}:/etc/kubernetes/pki/
scp -P ${port} /etc/kubernetes/pki/sa.key ${new_master}:/etc/kubernetes/pki/
scp -P ${port} /etc/kubernetes/pki/sa.pub ${new_master}:/etc/kubernetes/pki/
scp -P ${port} /etc/kubernetes/pki/front-proxy-ca.crt ${new_master}:/etc/kubernetes/pki/
scp -P ${port} /etc/kubernetes/pki/front-proxy-ca.key ${new_master}:/etc/kubernetes/pki/
scp -P ${port} /etc/kubernetes/pki/etcd/ca.crt ${new_master}:/etc/kubernetes/pki/etcd/
scp -P ${port} /etc/kubernetes/pki/etcd/ca.key ${new_master}:/etc/kubernetes/pki/etcd/
scp -P ${port} /etc/kubernetes/admin.conf ${new_master}:/etc/kubernetes/
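The copies above can be wrapped in a small script. A dry-run sketch that prints the commands (NEW_MASTER and PORT are placeholders for your environment; remove the `echo`s to execute):

```shell
#!/bin/sh
# Copy only the shared CA/SA material and admin.conf to the new master;
# kubeadm generates every other certificate during the join.
NEW_MASTER=bjpg-rs866.yz02
PORT=22
files="pki/ca.crt pki/ca.key pki/sa.key pki/sa.pub \
pki/front-proxy-ca.crt pki/front-proxy-ca.key \
pki/etcd/ca.crt pki/etcd/ca.key admin.conf"

# The pki/etcd directory must exist on the target before the copies run
echo ssh -p "$PORT" "root@$NEW_MASTER" "mkdir -p /etc/kubernetes/pki/etcd"

n=0
for f in $files; do
  echo scp -P "$PORT" "/etc/kubernetes/$f" "root@$NEW_MASTER:/etc/kubernetes/$f"
  n=$((n + 1))
done
```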
4. Add the control plane
- Upload the cluster's certificates into the cluster so the other control-plane nodes can use them (the resulting certificate key is only valid for a limited time, two hours by default, so re-run this step if the join happens later):
$ kubeadm init phase upload-certs --upload-certs
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
4ac12cf2fa145b7ca7713426c96e4sf3b20a2d0656fb4ac0eb1236ebd10bacd
- Generate a join token on an existing master:
kubeadm token create --print-join-command
- Join as a control-plane node:
kubeadm join 10.32.137.202:6443 --token qp5scd.nk5i23hqbswhbh7ubb --discovery-token-ca-cert-hash sha256:5ba43ac04768af8635652c71bf02c72b9b488a6ca67c3b2752b12c0273d9d90b --control-plane --certificate-key 4ac12cf2fa145b7ca7713426c96e4sf3b20a2d0656fb4ac0eb1236ebd10bacd
The --control-plane flag tells kubeadm join to create a new control-plane instance; the certificate key from the upload-certs step lets it download the shared certificates.
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [bjpg-rs866.yz02 localhost] and IPs [10.62.172.233 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [bjpg-rs866.yz02 localhost] and IPs [10.62.172.233 127.0.0.1 ::1]
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [bjpg-rs866.yz02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [172.21.0.1 10.62.172.233 10.32.137.202]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
[mark-control-plane] Marking the node bjpg-rs866.yz02 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node bjpg-rs866.yz02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
Reference: https://jishuin.proginn.com/p/763bfbd2c2b9