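kubernetes node startup error: No valid private key

Troubleshooting a kubernetes node startup error

Error scenario:

While installing and deploying a kubernetes cluster, I deployed the kubelet service on a node and ran systemctl start kubelet. tailf /var/log/messages then showed a large number of certificate verification errors.

Error output: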

May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.583305    5336 feature_gate.go:206] feature gates: &{map[]}
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589637    5336 mount_linux.go:180] Detected OS with systemd
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589680    5336 server.go:407] Version: v1.13.4
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589732    5336 feature_gate.go:206] feature gates: &{map[]}
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589825    5336 feature_gate.go:206] feature gates: &{map[]}
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589899    5336 plugins.go:103] No cloud provider specified.
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589916    5336 server.go:523] No cloud provider specified: "" from the config file: ""
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.589938    5336 bootstrap.go:65] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.593022    5336 bootstrap.go:96] No valid private key and/or certificate found, reusing existing private key or creating a new one
May  5 22:23:40 kubnode-01 kubelet: I0505 22:23:40.612493    5336 bootstrap.go:239] Failed to connect to apiserver: Get https://172.20.101.157:6443/healthz?timeout=1s: x509: certificate signed by unknown authority
May  5 22:23:42 kubnode-01 kubelet: I0505 22:23:42.909358    5336 bootstrap.go:239] Failed to connect to apiserver: Get https://172.20.101.157:6443/healthz?timeout=1s: x509: certificate signed by unknown authority
May  5 22:23:45 kubnode-01 kubelet: I0505 22:23:45.036663    5336 bootstrap.go:239] Failed to connect to apiserver: Get https://172.20.101.157:6443/healthz?timeout=1s: x509: certificate signed by unknown authority
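
The repeated bootstrap.go:239 lines above are the ones to watch: they show that the kubelet could not verify the apiserver's certificate. As a rough sketch, a small helper can count how many of these lines a log file contains, which makes it easy to see whether the errors stop after a fix. The function name count_x509_errors is hypothetical and not part of any tool. The log path in the usage comment is the one used in this post.

```shell
# Count log lines that report an untrusted apiserver certificate.
# Hypothetical helper; usage: count_x509_errors /var/log/messages
count_x509_errors() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when nothing matches, so "|| true" keeps the helper usable under "set -e".
    grep -c 'x509: certificate signed by unknown authority' "$1" || true
}
```

Running it before and after restarting kubelet gives a quick indication of whether new errors are still being logged.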

Solution:

On the master node, bind the kubelet-bootstrap user to the system:node-bootstrapper cluster role
[root@kubm-01 ~]# kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
clusterrolebinding "kubelet-bootstrap" created
Start the service on the node
[root@k8s-node01 ~]# systemctl start kubelet
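After kubelet starts on the node, it submits a certificate signing request (CSR) to the master, and the request has to be approved on the master
On the master node, list the CSRs; the status is Pending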
[root@kubm-01 ~]# kubectl get csr
NAME                                                   AGE     REQUESTOR           CONDITION
node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI   4m11s   kubelet-bootstrap   Pending
On the master node, approve the certificate request
[root@kubm-01 ~]# kubectl certificate approve node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI
On the master node, check the CSR again after approving it; the status changes to Approved,Issued
[root@kubm-01 ~]# kubectl get csr
NAME                                                   AGE     REQUESTOR           CONDITION
node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI   5m39s   kubelet-bootstrap   Approved,Issued
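
When several nodes are bootstrapped at once, approving each CSR by name gets tedious. The following is a minimal sketch, not a definitive procedure. It assumes the table layout shown in the kubectl get csr output above, with NAME in the first column and CONDITION in the last. The function names pending_csrs and approve_pending_csrs are hypothetical. Review the pending list before approving anything, since approval grants the node a client certificate.

```shell
# Print the NAME of every CSR whose CONDITION column is exactly "Pending".
# Reads the table printed by "kubectl get csr" on stdin and skips the header row.
pending_csrs() {
    awk 'NR > 1 && $NF == "Pending" { print $1 }'
}

# Approve every pending CSR (hypothetical helper).
# Needs kubectl on the master with permission to approve CSRs.
approve_pending_csrs() {
    kubectl get csr | pending_csrs | while read -r name; do
        kubectl certificate approve "$name"
    done
}

# Usage on the master:
#   kubectl get csr | pending_csrs    # review the list first
#   approve_pending_csrs
```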
Verify on the node

The ssl directory on the node now contains four new kubelet certificate files


[root@kubnode-02 kubernetes]# ls /kubernetes/ssl/kubelet*
/kubernetes/ssl/kubelet-client-2019-05-05-22-15-53.pem  /kubernetes/ssl/kubelet-client-current.pem  /kubernetes/ssl/kubelet.crt  /kubernetes/ssl/kubelet.key
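
The same check can be scripted for repeated use across nodes. This is a sketch only. The file names are taken from the listing above, the /kubernetes/ssl directory is specific to this setup, and the function name check_kubelet_certs is hypothetical. The timestamped kubelet-client-*.pem file is not checked by name because its name differs on every node.

```shell
# Report whether the expected kubelet files exist in a certificate directory.
# Returns non-zero if any of them is missing. Hypothetical helper.
check_kubelet_certs() {
    dir=$1
    missing=0
    for f in kubelet-client-current.pem kubelet.crt kubelet.key; do
        if [ -e "$dir/$f" ]; then
            echo "found:   $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the node:
#   check_kubelet_certs /kubernetes/ssl
```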
Delete the CSR (optional, run as needed)

[root@kubm-01 ~]# kubectl delete csr node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI
certificatesigningrequest.certificates.k8s.io "node-csr-mgZK4Cqvb7kZA7tDqVmszNQYLq27Yydia5LCqKJnnEI" deleted

Verify the deletion:
kubectl get csr

No CSRs are returned.

The troubleshooting process had a few pitfalls...

Reference:

https://www.liuyalei.top/1433.html
