kube-proxy Failed to retrieve node info: Unauthorized

Overview

The flannel container failed to start, with the following in its logs:

I1102 02:32:56.069875       1 main.go:488] Using interface with name bond0.170 and address xx.xx.xx.xx
I1102 02:32:56.069940       1 main.go:505] Defaulting external address to interface address (xx.xx.xx.xx)
E1102 02:32:56.265305       1 main.go:232] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-4rh69': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-amd64-4rh69: dial tcp 10.96.0.1:443: getsockopt: network is unreachable

My first suspicion was a firewall rule in iptables, but checking ruled that out.
The iptables NAT rules also contained no entry for 10.96.0.1, which suggested kube-proxy had not generated the correct forwarding rules.
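The NAT-rule check can be reproduced with something like this (a diagnostic sketch to run on the affected node; `KUBE-SERVICES` is the chain kube-proxy programs in iptables mode):

```shell
# In iptables mode, every ClusterIP should appear in the KUBE-SERVICES
# chain of the NAT table. The kubernetes Service VIP is 10.96.0.1 here.
iptables -t nat -S KUBE-SERVICES | grep 10.96.0.1

# No output means kube-proxy never wrote rules for the Service VIP,
# which matches flannel's "network is unreachable" error above.
```

An empty result here points the investigation at kube-proxy itself rather than at the CNI plugin.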

The kube-proxy logs looked like this:

k logs -f -n kube-system kube-proxy-4zs2c
W1102 03:52:17.820455       1 server_others.go:559] Unknown proxy mode "", assuming iptables proxy
E1102 03:52:17.825488       1 node.go:125] Failed to retrieve node info: Unauthorized
E1102 03:52:18.827659       1 node.go:125] Failed to retrieve node info: Unauthorized
E1102 03:52:21.175085       1 node.go:125] Failed to retrieve node info: Unauthorized
E1102 03:52:25.966158       1 node.go:125] Failed to retrieve node info: Unauthorized
E1102 03:52:35.352455       1 node.go:125] Failed to retrieve node info: Unauthorized
E1102 03:52:52.327513       1 node.go:125] Failed to retrieve node info: Unauthorized
I1102 03:52:52.327542       1 server_others.go:178] can't determine this node's IP, assuming 127.0.0.1; if this is incorrect, please set the --bind-address flag
I1102 03:52:52.327553       1 server_others.go:186] Using iptables Proxier.
I1102 03:52:52.327778       1 server.go:583] Version: v1.18.20
I1102 03:52:52.328152       1 conntrack.go:52] Setting nf_conntrack_max to 2097152
I1102 03:52:52.328341       1 config.go:133] Starting endpoints config controller
I1102 03:52:52.328361       1 shared_informer.go:223] Waiting for caches to sync for endpoints config
I1102 03:52:52.328389       1 config.go:315] Starting service config controller
I1102 03:52:52.328422       1 shared_informer.go:223] Waiting for caches to sync for service config
E1102 03:52:52.330188       1 event.go:260] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"etcd1.16b39e53c745056e", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"etcd1", UID:"etcd1", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kube-proxy.", Source:v1.EventSource{Component:"kube-proxy", Host:"etcd1"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xc0584b6513913d6e, ext:34567606865, loc:(*time.Location)(0x28998a0)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xc0584b6513913d6e, ext:34567606865, loc:(*time.Location)(0x28998a0)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Unauthorized' (will not retry!)
E1102 03:52:52.330813       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Endpoints: Unauthorized
E1102 03:52:52.331160       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Service: Unauthorized
E1102 03:52:53.667507       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Endpoints: Unauthorized
E1102 03:52:53.860980       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Service: Unauthorized
E1102 03:52:56.219253       1 reflector.go:178] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Endpoints: Unauthorized

Judging from the errors, kube-proxy's credentials were failing validation against the API server. A fix for this issue turned up on GitHub: delete the secret that kube-proxy depends on.
k delete secret -n kube-system kube-proxy-token-hljcr

After the secret is deleted, a new one is generated automatically. Then delete the corresponding kube-proxy pods; once recreated, they start normally.
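The full recovery sequence is roughly the following (a sketch; the secret name comes from the logs above, and the `k8s-app=kube-proxy` label selector is the one a standard kubeadm deployment applies, so verify it on your cluster):

```shell
# Delete the stale service-account token secret; the
# controller-manager regenerates it automatically.
kubectl delete secret -n kube-system kube-proxy-token-hljcr

# Delete the kube-proxy pods so the DaemonSet recreates them
# with the freshly generated token mounted.
kubectl delete pod -n kube-system -l k8s-app=kube-proxy

# Confirm the pods come back up and stop logging "Unauthorized".
kubectl get pod -n kube-system -l k8s-app=kube-proxy
```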

Summary

The root cause was that a colleague had accidentally re-run kubeadm, so the credentials stored in the cluster no longer matched the newly generated certificates.
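A mismatch like this can be confirmed by testing the token from the secret directly against the API server (a diagnostic sketch; the secret name and the 10.96.0.1 VIP are taken from the logs above, and `curl`/`base64` are assumed to be available on the node):

```shell
# Pull the service-account token that kube-proxy mounts.
TOKEN=$(kubectl get secret -n kube-system kube-proxy-token-hljcr \
  -o jsonpath='{.data.token}' | base64 -d)

# Call the API server with it. A token signed against the old
# kubeadm-generated keys returns HTTP 401 (Unauthorized).
curl -sk -o /dev/null -w '%{http_code}\n' \
  -H "Authorization: Bearer ${TOKEN}" https://10.96.0.1:443/api
```

A `401` here, while kubectl with the admin kubeconfig works, narrows the problem to stale service-account credentials rather than network reachability.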

References

https://github.com/kubernetes/kubernetes/issues/84244
