calico-node启动失败,calico/node is not ready: bird/confd is not live

一主一从两台k8s集群启动后,master节点全部正常,work节点的calico-node一直启动失败。查看日志发现如下错误:

检查容器事件及日志,发现有如下报错:

Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp [  
│ :1[]:9099: connect: connection refused

 Liveness probe failed: calico/node is not ready: bird/confd is not live: exit status 1

-----------------------------------------------------------------------------------------

2022-07-21 08:08:10.064 [INFO][9] startup/startup.go 416: Hit error connecting to datastore - retry error=Get "https://10.233.64.1:443/api/v1/nodes/foo": dial tcp 10.233.6  
│ 4.1:443: i/o timeout  

-----------------------------------------------------------------------------------------

Events:                                                                                                                                                                      
│   Type     Reason     Age                      From     Message                                                                                                              
│   ----     ------     ----                     ----     -------                                                                                                              
│   Warning  Unhealthy  44m (x10743 over 17d)    kubelet  Liveness probe failed: calico/node is not ready: Felix is not live: Get "http://localhost:9099/liveness": dial tcp   
│ ::1[]:9099: connect: connection refused                                                                                                                                      
│   Normal   Pulled     24m (x7006 over 17d)     kubelet  Container image "calico/node:v3.20.0" already present on machine                                                     
│   Warning  Unhealthy  9m40s (x55319 over 17d)  kubelet  Readiness probe failed: calico/node is not ready: BIRD is not ready: Failed to stat() nodename file: stat /var/lib/  
│ calico/nodename: no such file or directory                                                                                                                                   
│   Warning  BackOff    4m39s (x86064 over 17d)  kubelet  Back-off restarting failed container                                                             

从pod事件来看,是健康检查没过。Failed to stat() nodename file: stat /var/lib/  
│ calico/nodename: no such file or directory 

从容器日志来看,是 “https://10.233.64.1:443/api/v1/nodes/foo”这个请求没调通。

首先检测网络,在宿主机上执行 ping 10.233.64.1,发现网络是通的。执行 telnet 10.233.64.1 443,发现443端口不通

10.233.64.1是service的clusterip

检查 ipvsadm -ln   发现链路都正常。

检查 ipvsadm -lnc 发现tcp状态不对,无法建立tcp连接。

这个问题大概率是历史安装文件没有清理干净,比较靠谱的做法:

1、清理历史配置

rm -rf /etc/cni/net.d/*

rm -rf /var/lib/calico/*

ipvsadm --clear

2、修改网段,让service网段和pod网段区分开

    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18

3、重新创建calico
 

  • 1
    点赞
  • 4
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值