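1. Introduction
Environment: Ubuntu 18.04, Kubernetes cluster v1.22.0.
Note: renewing the certificates causes the kubectl contexts to be lost; check them with kubectl config get-contexts.
Renewing the certificates
Today the cluster returned an error when I tried to use it.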
root@k8s-master:~# kubectl get nodes
The connection to the server <master>:6443 was refused - did you specify the right host or port?
1.1 Check the kubelet status
The kubelet fails to start. Checking Docker shows that the api-server container is not running either.
root@k8s-master:~# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Wed 2022-12-14 09:25:38 CST; 3s ago
Docs: https://kubernetes.io/docs/home/
Process: 4364 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 4364 (code=exited, status=1/FAILURE)
root@k8s-master:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
The logs show that bootstrap-kubelet.conf cannot be found.
root@k8s-master:~# tail -n 20 -f /var/log/syslog
Dec 14 09:28:22 k8s-master systemd[1]: Started kubelet: The Kubernetes Node Agent.
Dec 14 09:28:22 k8s-master kubelet[5166]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Dec 14 09:28:22 k8s-master kubelet[5166]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Dec 14 09:28:22 k8s-master systemd[1]: Started Kubernetes systemd probe.
Dec 14 09:28:22 k8s-master kubelet[5166]: I1214 09:28:22.095203 5166 server.go:440] "Kubelet version" kubeletVersion="v1.22.0"
Dec 14 09:28:22 k8s-master kubelet[5166]: I1214 09:28:22.095517 5166 server.go:868] "Client rotation is on, will bootstrap in background"
Dec 14 09:28:22 k8s-master kubelet[5166]: E1214 09:28:22.096812 5166 bootstrap.go:265] part of the existing bootstrap client certificate in /etc/kubernetes/kubelet.conf is expired: 2022-12-08 06:32:35 +0000 UTC
Dec 14 09:28:22 k8s-master kubelet[5166]: E1214 09:28:22.096858 5166 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
Dec 14 09:28:22 k8s-master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Dec 14 09:28:22 k8s-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
Dec 14 09:28:32 k8s-master systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Dec 14 09:28:32 k8s-master systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 59.
Dec 14 09:28:32 k8s-master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Dec 14 09:28:32 k8s-master systemd[1]: Started kubelet: The Kubernetes Node Agent.
Dec 14 09:28:32 k8s-master kubelet[5215]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Dec 14 09:28:32 k8s-master kubelet[5215]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Dec 14 09:28:32 k8s-master systemd[1]: Started Kubernetes systemd probe.
Dec 14 09:28:32 k8s-master kubelet[5215]: I1214 09:28:32.342032 5215 server.go:440] "Kubelet version" kubeletVersion="v1.22.0"
Dec 14 09:28:32 k8s-master kubelet[5215]: I1214 09:28:32.342792 5215 server.go:868] "Client rotation is on, will bootstrap in background"
Dec 14 09:28:32 k8s-master kubelet[5215]: E1214 09:28:32.345247 5215 bootstrap.go:265] part of the existing bootstrap client certificate in /etc/kubernetes/kubelet.conf is expired: 2022-12-08 06:32:35 +0000 UTC
Dec 14 09:28:32 k8s-master kubelet[5215]: E1214 09:28:32.345538 5215 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
Dec 14 09:28:32 k8s-master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Dec 14 09:28:32 k8s-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
Dec 14 09:28:42 k8s-master systemd[1]: kubelet.service: Service hold-off time over, scheduling restart.
Dec 14 09:28:42 k8s-master systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 60.
Dec 14 09:28:42 k8s-master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Dec 14 09:28:42 k8s-master systemd[1]: Started kubelet: The Kubernetes Node Agent.
Dec 14 09:28:42 k8s-master kubelet[5262]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Dec 14 09:28:42 k8s-master kubelet[5262]: Flag --network-plugin has been deprecated, will be removed along with dockershim.
Dec 14 09:28:42 k8s-master systemd[1]: Started Kubernetes systemd probe.
Dec 14 09:28:42 k8s-master kubelet[5262]: I1214 09:28:42.601131 5262 server.go:440] "Kubelet version" kubeletVersion="v1.22.0"
Dec 14 09:28:42 k8s-master kubelet[5262]: I1214 09:28:42.601981 5262 server.go:868] "Client rotation is on, will bootstrap in background"
Dec 14 09:28:42 k8s-master kubelet[5262]: E1214 09:28:42.604829 5262 bootstrap.go:265] part of the existing bootstrap client certificate in /etc/kubernetes/kubelet.conf is expired: 2022-12-08 06:32:35 +0000 UTC
Dec 14 09:28:42 k8s-master kubelet[5262]: E1214 09:28:42.604871 5262 server.go:294] "Failed to run kubelet" err="failed to run Kubelet: unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf: no such file or directory"
Dec 14 09:28:42 k8s-master systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Dec 14 09:28:42 k8s-master systemd[1]: kubelet.service: Failed with result 'exit-code'.
kubelet.conf (named bootstrap-kubelet.conf during TLS bootstrapping) contains either a bootstrap token or an embedded client certificate. Every node has its own copy.
root@k8s-master:/etc# ll /etc/kubernetes/kubelet.conf
-rw------- 1 root root 1979 Dec 8 2021 /etc/kubernetes/kubelet.conf
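To see when the client certificate inside a kubeconfig expires, the embedded data can be decoded and handed to openssl. The sketch below is only an illustration: the function names are my own, and it assumes the certificate is embedded as client-certificate-data. If the file points to a certificate path instead, the first function prints nothing.

```shell
#!/usr/bin/env bash
# Print the base64 payload of client-certificate-data from a kubeconfig file.
# Prints nothing when the file references a certificate path instead.
embedded_client_cert() {
  awk '$1 == "client-certificate-data:" { print $2; exit }' "$1"
}

# Print the notAfter date of the embedded client certificate, if there is one.
client_cert_enddate() {
  local data
  data=$(embedded_client_cert "$1")
  if [ -z "$data" ]; then
    echo "no embedded client certificate in $1" >&2
    return 1
  fi
  printf '%s' "$data" | base64 -d | openssl x509 -noout -enddate
}

# Usage (hypothetical, on the master node):
#   client_cert_enddate /etc/kubernetes/kubelet.conf
```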
1.2 Check whether the certificates have expired
root@k8s-master:/etc# openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -text |grep Not
Not Before: Dec 8 06:32:31 2021 GMT
Not After : Dec 6 06:32:31 2031 GMT
root@k8s-master:/etc# openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text |grep Not
Not Before: Dec 8 06:32:31 2021 GMT
Not After : Dec 8 06:32:32 2022 GMT
root@k8s-master:/etc# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Dec 08, 2022 06:32 UTC <invalid> no
apiserver Dec 08, 2022 06:32 UTC <invalid> ca no
apiserver-etcd-client Dec 08, 2022 06:32 UTC <invalid> etcd-ca no
apiserver-kubelet-client Dec 08, 2022 06:32 UTC <invalid> ca no
controller-manager.conf Dec 08, 2022 06:32 UTC <invalid> no
etcd-healthcheck-client Dec 08, 2022 06:32 UTC <invalid> etcd-ca no
etcd-peer Dec 08, 2022 06:32 UTC <invalid> etcd-ca no
etcd-server Dec 08, 2022 06:32 UTC <invalid> etcd-ca no
front-proxy-client Dec 08, 2022 06:32 UTC <invalid> front-proxy-ca no
scheduler.conf Dec 08, 2022 06:32 UTC <invalid> no
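The CA certificate is still valid, but the apiserver certificate and all the other cluster certificates have expired.
Command parameters:
First command: shows the validity period of the CA certificate
Second command: shows the validity period of the apiserver certificate
x509: the openssl subcommand for displaying and processing certificates
-in: the PEM-format certificate to process
-noout: do not print the encoded certificate itself
-text: print the certificate in human-readable text form
## kubeadm init generates cluster certificates that are valid for one year; the CA certificate is valid for ten years, as the dates above show.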
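Instead of checking the files one by one, every certificate under the pki directory can be tested in a loop. This is a minimal sketch, assuming openssl is installed; the function name is hypothetical. It relies on openssl x509 -checkend 0, which exits non-zero once a certificate is past its notAfter date.

```shell
#!/usr/bin/env bash
# Print every *.crt under the given directory that is already past its
# notAfter date. Files that openssl cannot parse are listed as well,
# because the check also fails for them.
list_expired_certs() {
  local dir=${1:-/etc/kubernetes/pki} crt
  find "$dir" -type f -name '*.crt' | sort | while IFS= read -r crt; do
    if ! openssl x509 -in "$crt" -noout -checkend 0 >/dev/null 2>&1; then
      echo "$crt"
    fi
  done
}

# Usage (hypothetical):
#   list_expired_certs /etc/kubernetes/pki
```

On the cluster above I would expect apiserver.crt to be listed and ca.crt not to be.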
2. !!! Back up the existing cluster configuration files
mkdir ~/confirm # back up the original cluster files
cp -rf /etc/kubernetes/ ~/confirm/
ll ~/confirm/kubernetes/
mkdir ~/confirm/data_etcd # back up etcd
cp -rf /var/lib/etcd/* ~/confirm/data_etcd
cp /usr/bin/kubeadm /usr/bin/kubeadm.bak
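If the backup has to be repeated, a timestamped target keeps an earlier copy from being overwritten. The helper below is a sketch with a hypothetical name. It uses cp -a so that ownership and permissions are preserved, and a plain file copy of /var/lib/etcd is only safe while etcd is stopped, as it is here.

```shell
#!/usr/bin/env bash
# Copy a directory into DEST_ROOT/<name>-<timestamp>, preserving ownership,
# permissions and timestamps, and print the path of the copy.
backup_dir() {
  local src=${1%/} dest_root=$2 stamp target
  stamp=$(date +%Y%m%d-%H%M%S)
  target="$dest_root/$(basename "$src")-$stamp"
  mkdir -p "$dest_root" || return 1
  cp -a "$src" "$target" || return 1
  echo "$target"
}

# Usage (hypothetical):
#   backup_dir /etc/kubernetes ~/confirm
#   backup_dir /var/lib/etcd   ~/confirm
```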
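3. Renew the certificates the official way (valid for one year)
kubeadm certs renew all # renew all certificates
kubeadm certs check-expiration # check the cluster certificate information
kubeadm certs renew all renews the cluster certificates but does not renew the certificate in kubelet.conf,
so starting the kubelet still fails at this point.
Open /etc/kubernetes/admin.conf with vim and copy the user entry together with the base64-encoded values below it (starting at line 17)
into kubelet.conf.
On each worker node, replace /etc/kubernetes/kubelet.conf with the kubelet.conf from the master node.
Then restart the kubelet service; restarting Docker as well restarts the control-plane containers so that they load the renewed certificates.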
systemctl restart docker.service && systemctl restart kubelet.service
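3.1 Copy the user entry and the base64-encoded values below it (starting at line 17) into kubelet.conf
Note:
admin.conf contains the cluster address, the admin user, and the certificate and key data. When the cluster is created, the admin user in this file is granted the highest privileges; clients use this file to locate the API server and then to create and access resources.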
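To see exactly which lines are meant by "the user entry and the values below it", the block can be printed before copying it by hand. This is a sketch with a hypothetical function name. It assumes users: is the last top-level key of the file, which matches the admin.conf that kubeadm generated on my cluster.

```shell
#!/usr/bin/env bash
# Print the credentials block of a kubeconfig: everything from the first
# "user:" line inside the "users:" section to the end of the file.
print_user_block() {
  awk '
    /^users:/                          { in_users = 1; next }
    in_users && /^[[:space:]]+user:/   { printing = 1 }
    printing                           { print }
  ' "$1"
}

# Usage (hypothetical):
#   print_user_block /etc/kubernetes/admin.conf
```

The output starts at the user: line and contains client-certificate-data and client-key-data, so treat it as a secret.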
3.2 Verify that the cluster works
root@k8s-master:~# kubectl get nodes
error: You must be logged in to the server (Unauthorized)
Cause: the renewed admin.conf has not been copied into the current user's kubeconfig ($HOME/.kube/config).
root@k8s-master:~# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
cp: overwrite '/root/.kube/config'? y
root@k8s-master:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
front-end-6f94965fd9-vsvz5 1/1 Running 0 30m
guestbook-86bb8f5bc9-8pqcl 1/1 Running 0 30m
guestbook-86bb8f5bc9-d9tph 1/1 Running 0 30m
nfs-client-provisioner-56dd5765dc-sdsxh 1/1 Running 0 30m
task-2-ds-kh94m 1/1 Running 5 (354d ago) 355d
task-2-ds-v5dvj 1/1 Running 4 (354d ago) 355d
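Overwriting $HOME/.kube/config is also the point where any extra contexts stored in that file disappear. A small wrapper that keeps a timestamped copy of the old file first makes them recoverable. This is only a sketch, and the function name and backup suffix are my own choices.

```shell
#!/usr/bin/env bash
# Copy a kubeconfig into place, keeping a timestamped backup of the file
# it replaces and restricting the new file to the owner.
install_kubeconfig() {
  local src=$1 dest=${2:-$HOME/.kube/config}
  mkdir -p "$(dirname "$dest")" || return 1
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.bak.$(date +%Y%m%d-%H%M%S)" || return 1
  fi
  cp "$src" "$dest" && chmod 600 "$dest"
}

# Usage (hypothetical):
#   install_kubeconfig /etc/kubernetes/admin.conf
```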
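4. Set the certificate validity to 99 years with a rebuilt open-source kubeadm (best done when initializing the cluster)
4.1 Download the required components: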
Golang
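Official download page: the Golang website
Open the download page, choose the build for your system, and copy the download link, for example https://studygolang.com/dl/golang/go1.16.5.linux-amd64.tar.gz
Kubernetes source code:
https://codeload.github.com/kubernetes/kubernetes/zip/refs/tags/v1.22.0
Note: if the download from GitHub is slow, the Gitee mirror can be used instead:
https://gitee.com/mirrors/Kubernetes.git
Download the source that matches the version of your current cluster.
apt-get install gcc automake autoconf libtool make # install the packages needed for the build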
4.2 Check the Kubernetes version
[root@k8s-master ~]# kubectl version
If you cloned the repository with git, switch to the tag that matches your version before editing the source; mine is v1.22.0.
cd kubernetes
git checkout v1.22.0
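For the zip download, the archive URL can be derived from the installed version so that the source always matches the cluster. The sketch below only builds the URL in the codeload form used earlier; the function name and the output file name in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# Build the GitHub archive URL for a Kubernetes release tag.
# Accepts the version with or without the leading "v".
source_zip_url() {
  local version=$1
  case $version in
    v*) ;;
    *)  version="v$version" ;;
  esac
  echo "https://codeload.github.com/kubernetes/kubernetes/zip/refs/tags/$version"
}

# Usage (hypothetical):
#   curl -L -o kubernetes-src.zip "$(source_zip_url "$(kubeadm version -o short)")"
```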
4.3 Change the validity period in the source
unzip kubernetes-1.22.0.zip
cd kubernetes-1.22.0/
vim cmd/kubeadm/app/constants/constants.go
Find CertificateValidity and change it as follows:
....
const (
// KubernetesDir is the directory Kubernetes owns for storing various configuration files
KubernetesDir = "/etc/kubernetes"
// ManifestsSubDirName defines directory name to store manifests
ManifestsSubDirName = "manifests"
// TempDirForKubeadm defines temporary directory for kubeadm
// should be joined with KubernetesDir.
TempDirForKubeadm = "tmp"
// CertificateValidity defines the validity for all the signed certificates generated by kubeadm
CertificateValidity = time.Hour * 24 * 365 * 100
....
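To summarize, unpack the source archive and make two edits:
Certificate validity: CertificateValidity = time.Hour * 24 * 365 * 100
CA validity: NotAfter: now.Add(duration365d * 100).UTC(), (to my knowledge, in v1.22.0 this line is in staging/src/k8s.io/client-go/util/cert/cert.go; it applies only to CA certificates created by kubeadm init, so renewing does not change the existing CAs)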
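The arithmetic behind the new constant can be checked quickly. A Go time.Duration is a signed 64-bit count of nanoseconds, so the value must stay below 9223372036854775807 (roughly 292 years). The helper names below are hypothetical.

```shell
#!/usr/bin/env bash
# Validity expressed the same way as the constant: 24 hours * 365 days * years.
validity_hours() { echo $(( 24 * 365 * $1 )); }
validity_days()  { echo $(( 365 * $1 )); }
# The same duration in nanoseconds, the unit Go stores internally.
validity_ns()    { echo $(( 24 * 365 * $1 * 3600 * 1000000000 )); }

# Usage:
#   validity_hours 100   # 876000
#   validity_days 100    # 36500
#   validity_ns 100      # 3153600000000000000, below the int64 limit
```

Because the constant counts 365-day years, a freshly renewed certificate has slightly less than 100 such years left, which is consistent with check-expiration reporting 99y.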
4.4 Build and install
Note: install the Go environment first.
tar zxf go1.16.5.linux-amd64.tar.gz -C /usr/local/
vim /etc/profile
export GOROOT=/usr/local/go
export PATH=$PATH:/usr/local/go/bin
export GOPATH=/go
source /etc/profile
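If these steps are run more than once, the same export lines pile up in /etc/profile. A small helper that appends a line only when it is not already present avoids that. This is a sketch with a hypothetical function name.

```shell
#!/usr/bin/env bash
# Append LINE to FILE unless an identical line is already there.
append_once() {
  local line=$1 file=$2
  touch "$file" || return 1
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (hypothetical):
#   append_once 'export GOROOT=/usr/local/go' /etc/profile
#   append_once 'export PATH=$PATH:/usr/local/go/bin' /etc/profile
#   append_once 'export GOPATH=/go' /etc/profile
```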
root@k8s-master:~/test# go version
go version go1.16.5 linux/amd64
root@k8s-master:~/test# go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.5"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build4057545257=/tmp/go-build -gno-record-gcc-switches"
root@k8s-master:~/test/kubernetes-1.22.0# make WHAT=cmd/kubeadm
+++ [1214 15:37:11] Building go targets for linux/amd64:
./vendor/k8s.io/code-generator/cmd/prerelease-lifecycle-gen
Generating prerelease lifecycle code for 27 targets
+++ [1214 15:37:22] Building go targets for linux/amd64:
./vendor/k8s.io/code-generator/cmd/deepcopy-gen
Generating deepcopy code for 234 targets
+++ [1214 15:37:29] Building go targets for linux/amd64:
./vendor/k8s.io/code-generator/cmd/defaulter-gen
Generating defaulter code for 93 targets
+++ [1214 15:37:37] Building go targets for linux/amd64:
./vendor/k8s.io/code-generator/cmd/conversion-gen
Generating conversion code for 128 targets
+++ [1214 15:37:53] Building go targets for linux/amd64:
./vendor/k8s.io/kube-openapi/cmd/openapi-gen
Generating openapi code for KUBE
Generating openapi code for AGGREGATOR
Generating openapi code for APIEXTENSIONS
Generating openapi code for CODEGEN
Generating openapi code for SAMPLEAPISERVER
+++ [1214 15:38:09] Building go targets for linux/amd64:
cmd/kubeadm
4.5 Replace kubeadm with the new build and renew the certificates
The build produces the following directory and binaries:
root@k8s-master:~/test/kubernetes-1.22.0# ll _output/bin/
total 86596
drwxr-xr-x 2 root root 4096 Dec 14 15:39 ./
drwxr-xr-x 3 root root 4096 Dec 14 15:37 ../
-rwxr-xr-x 1 root root 7540736 Dec 14 15:37 conversion-gen*
-rwxr-xr-x 1 root root 7204864 Dec 14 15:37 deepcopy-gen*
-rwxr-xr-x 1 root root 7233536 Dec 14 15:37 defaulter-gen*
-rwxr-xr-x 1 root root 3555103 Dec 14 15:37 go2make*
-rwxr-xr-x 1 root root 45817856 Dec 14 15:39 kubeadm*
-rwxr-xr-x 1 root root 10141696 Dec 14 15:38 openapi-gen*
-rwxr-xr-x 1 root root 7172096 Dec 14 15:37 prerelease-lifecycle-gen*
Back up the original kubeadm and the certificate files
cp /usr/bin/kubeadm{,.bak20210901}
cp -r /etc/kubernetes/pki{,.bak20210901}
Replace kubeadm with the new build
cp _output/bin/kubeadm /usr/bin/kubeadm
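Generate the new certificates
cd /etc/kubernetes/pki
kubeadm certs renew all
The output is as follows (the capture below uses the older kubeadm alpha certs spelling of the same subcommand)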
[root@k8s-master kubernetes]# cd /etc/kubernetes/pki
[root@k8s-master pki]# kubeadm alpha certs renew all
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
Verify the result
kubeadm certs check-expiration
The output is as follows
[root@k8s-master pki]# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Aug 08, 2121 02:32 UTC 99y no
apiserver Aug 08, 2121 02:32 UTC 99y ca no
apiserver-etcd-client Aug 08, 2121 02:32 UTC 99y etcd-ca no
apiserver-kubelet-client Aug 08, 2121 02:32 UTC 99y ca no
controller-manager.conf Aug 08, 2121 02:32 UTC 99y no
etcd-healthcheck-client Aug 08, 2121 02:32 UTC 99y etcd-ca no
etcd-peer Aug 08, 2121 02:32 UTC 99y etcd-ca no
etcd-server Aug 08, 2121 02:32 UTC 99y etcd-ca no
front-proxy-client Aug 08, 2121 02:32 UTC 99y front-proxy-ca no
scheduler.conf Aug 08, 2121 02:32 UTC 99y no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Aug 28, 2031 07:52 UTC 9y no
etcd-ca Aug 28, 2031 07:53 UTC 9y no
front-proxy-ca Aug 28, 2031 07:53 UTC 9y no
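The report can also be checked mechanically: any row whose residual time is <invalid> is an expired certificate. The filter below is a sketch with a hypothetical name. It reads the check-expiration output on stdin and prints the names of expired entries, so empty output means nothing is expired.

```shell
#!/usr/bin/env bash
# Read `kubeadm certs check-expiration` output on stdin and print the name
# (first column) of every entry whose residual time is <invalid>.
expired_from_report() {
  awk '/<invalid>/ { print $1 }'
}

# Usage (hypothetical):
#   kubeadm certs check-expiration | expired_from_report
```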
Check that the cluster is healthy.
[root@k8s-master pki]# kubectl get node
NAME STATUS ROLES AGE VERSION
k8s-master Ready master 42h v1.22.0
k8s-node1 Ready <none> 42h v1.22.0
k8s-node2 Ready <none> 42h v1.22.0
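With more than a handful of nodes it is easier to list only the ones that are not healthy. The filter below is a sketch with a hypothetical name. It reads the kubectl get node table on stdin and treats anything other than a plain Ready status as a problem, including cordoned nodes.

```shell
#!/usr/bin/env bash
# Read `kubectl get node` output on stdin, skip the header row, and print
# the name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Usage (hypothetical):
#   kubectl get node | not_ready_nodes
```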
Check the pods
[root@k8s-master pki]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7ff77c879f-6pwrg 1/1 Running 1 42h
coredns-7ff77c879f-d6s95 1/1 Running 2 42h
etcd-k8s-master 1/1 Running 3 42h
kube-apiserver-k8s-master 1/1 Running 2 42h
kube-controller-manager-k8s-master 1/1 Running 3 42h
kube-flannel-ds-fs8dj 1/1 Running 3 42h
kube-flannel-ds-g6d4l 1/1 Running 2 42h
kube-flannel-ds-tnrzq 1/1 Running 1 42h
kube-proxy-dngh8 1/1 Running 1 42h
kube-proxy-nxb5q 1/1 Running 2 42h
kube-proxy-zz5xn 1/1 Running 3 42h
kube-scheduler-k8s-master 1/1 Running 2 42h