Rancher node: flannel "failed to add vxlanRoute", containers cannot ping each other (debugging notes)

Symptom: of three machines f1, f2 and f3, container IPs cannot be pinged between f3 and either of the other two.

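1) Check that the flanneld process is healthy, look at the flannel subnet configuration, and ping each node's subnet address from the other nodes to confirm the failure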

cat /run/flannel/subnet.env

## Output
FLANNEL_NETWORK=10.42.0.0/16
FLANNEL_SUBNET=10.42.1.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true

## e.g. from f2 and f3, try to ping f1
ping 10.42.1.1
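
The lookup in step 1 can be scripted so the address to ping is read from the file instead of being copied by hand. This is a minimal sketch: the helper names flannel_env_get and flannel_gateway_ip are made up for illustration, and it assumes the file contains plain KEY=value lines like the output shown above.

```shell
# Hypothetical helpers (not part of flannel): read values from a
# subnet.env-style file made of plain KEY=value lines.

# Usage: flannel_env_get FILE KEY
# Prints the value of the first line that starts with "KEY=".
flannel_env_get() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Usage: flannel_gateway_ip FILE
# Prints the address part of FLANNEL_SUBNET, e.g. 10.42.1.1 for 10.42.1.1/24.
flannel_gateway_ip() {
    flannel_env_get "$1" FLANNEL_SUBNET | cut -d/ -f1
}

# Example (run on f1, then ping the printed address from f2 and f3):
#   flannel_gateway_ip /run/flannel/subnet.env
```

Printing the address on one node and pinging it from the others keeps the check identical on every machine.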

2) Check the logs of the flanneld process (container)

docker ps | grep flanneld

docker logs 44465a91c2d3
I1015 01:53:31.040048       1 main.go:474] Determining IP address of default interface
I1015 01:53:31.040382       1 main.go:487] Using interface with name eth0 and address 10.186.24.202
I1015 01:53:31.040398       1 main.go:504] Defaulting external address to interface address (10.186.24.202)
I1015 01:53:31.066320       1 kube.go:130] Waiting 10m0s for node controller to sync
I1015 01:53:31.066375       1 kube.go:283] Starting kube subnet manager
I1015 01:53:32.066502       1 kube.go:137] Node controller sync successful
I1015 01:53:32.066531       1 main.go:234] Created subnet manager: Kubernetes Subnet Manager - finot2
I1015 01:53:32.066536       1 main.go:237] Installing signal handlers
I1015 01:53:32.066588       1 main.go:352] Found network config - Backend type: vxlan
I1015 01:53:32.066634       1 vxlan.go:119] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
I1015 01:53:32.124906       1 main.go:299] Wrote subnet file to /run/flannel/subnet.env
I1015 01:53:32.124922       1 main.go:303] Running backend.
I1015 01:53:32.124929       1 main.go:321] Waiting for all goroutines to exit
I1015 01:53:32.124943       1 vxlan_network.go:56] watching for new subnet leases
I1015 01:53:32.128063       1 iptables.go:114] Some iptables rules are missing; deleting and recreating rules
I1015 01:53:32.128078       1 iptables.go:136] Deleting iptables rule: -s 10.42.0.0/16 -j ACCEPT
I1015 01:53:32.128431       1 iptables.go:114] Some iptables rules are missing; deleting and recreating rules
I1015 01:53:32.128447       1 iptables.go:136] Deleting iptables rule: -s 10.42.0.0/16 -d 10.42.0.0/16 -j RETURN
I1015 01:53:32.129338       1 iptables.go:136] Deleting iptables rule: -d 10.42.0.0/16 -j ACCEPT
I1015 01:53:32.129754       1 iptables.go:136] Deleting iptables rule: -s 10.42.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
I1015 01:53:32.130338       1 iptables.go:124] Adding iptables rule: -s 10.42.0.0/16 -j ACCEPT
I1015 01:53:32.130728       1 iptables.go:136] Deleting iptables rule: ! -s 10.42.0.0/16 -d 10.42.2.0/24 -j RETURN
I1015 01:53:32.131929       1 iptables.go:136] Deleting iptables rule: ! -s 10.42.0.0/16 -d 10.42.0.0/16 -j MASQUERADE
I1015 01:53:32.133119       1 iptables.go:124] Adding iptables rule: -d 10.42.0.0/16 -j ACCEPT
I1015 01:53:33.133267       1 iptables.go:124] Adding iptables rule: -s 10.42.0.0/16 -d 10.42.0.0/16 -j RETURN
I1015 01:53:33.225369       1 iptables.go:124] Adding iptables rule: -s 10.42.0.0/16 ! -d 224.0.0.0/4 -j MASQUERADE
I1015 01:53:33.227734       1 iptables.go:124] Adding iptables rule: ! -s 10.42.0.0/16 -d 10.42.2.0/24 -j RETURN
I1015 01:53:33.230012       1 iptables.go:124] Adding iptables rule: ! -s 10.42.0.0/16 -d 10.42.0.0/16 -j MASQUERADE
E1015 01:53:49.498557       1 vxlan_network.go:158] failed to add vxlanRoute (10.42.0.0/24 -> 10.42.0.0): invalid argument
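
The informational lines in a log like the one above can be filtered out so that only problems remain. The sketch below relies on the glog line format visible in the output, where the first character is the severity (I, W, E or F) followed by the month and day. The function name flannel_log_problems is a hypothetical helper, not a flannel or docker command.

```shell
# Hypothetical helper: keep only warning, error and fatal lines from
# glog-formatted output read on stdin (severity letter, then MMDD, then a space).
flannel_log_problems() {
    grep -E '^[WEF][0-9]{4} '
}

# Example, using the container ID from the step above. docker logs replays the
# container's stderr on stderr, so both streams are merged before filtering:
#   docker logs 44465a91c2d3 2>&1 | flannel_log_problems
```

On the log shown above this would leave only the final "failed to add vxlanRoute" line.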

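3) Analysis

The last log line is an error: adding a route failed. List all network devices and their subnets, paying particular attention to 10.42.0.0/24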

ip a

## Output
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
   ......... (omitted)
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,PROMISC,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:5d:e6:cb:e2 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet 10.42.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether 2a:de:09:69:5f:8c brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 0a:58:0a:2a:01:01 brd ff:ff:ff:ff:ff:ff
    inet 10.42.1.1/24 scope global cni0
       valid_lft forever preferred_lft forever
6: vethb6725a8f@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 6a:3b:bb:50:1f:fa brd ff:ff:ff:ff:ff:ff link-netnsid 0
7: veth6fdbf195@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP group default
    link/ether 3e:f9:73:06:28:8b brd ff:ff:ff:ff:ff:ff link-netnsid 1

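The third entry, docker0, holds the address 10.42.0.1/16, which overlaps the network declared by FLANNEL_NETWORK at the top (10.42.0.0/16). This conflict makes the route addition fail, so the overlay network cannot forward traffic.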
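
The overlap spotted by eye in the ip a output can also be checked mechanically. The following is a rough sketch under stated assumptions: find_overlay_conflicts is a hypothetical helper name, it expects the one-line-per-address output of ip -o -4 addr show on stdin, and it treats interfaces whose names start with flannel or cni as legitimate owners of overlay addresses (an assumption that matches this machine but may not hold for every setup).

```shell
# Hypothetical helper: list IPv4 addresses that fall inside the overlay
# network but sit on an interface other than flannel*/cni*.
# Usage: ip -o -4 addr show | find_overlay_conflicts 10.42.0.0/16
find_overlay_conflicts() {
    awk -v net="$1" '
        # Convert a dotted-quad address to a number.
        function ip2int(ip,   p) {
            split(ip, p, ".")
            return ((p[1] * 256 + p[2]) * 256 + p[3]) * 256 + p[4]
        }
        BEGIN {
            split(net, n, "/")
            size = 2 ^ (32 - n[2])                  # number of addresses in the network
            base = int(ip2int(n[1]) / size) * size  # first address of the network
        }
        # In "ip -o -4 addr show" output, $2 is the interface and $4 is ADDR/PREFIX.
        $3 == "inet" && $2 !~ /^(flannel|cni)/ {
            split($4, a, "/")
            addr = ip2int(a[1])
            if (addr >= base && addr < base + size) print $2, $4
        }
    '
}
```

Any line printed is an interface and address worth a closer look. No output means no address outside the excluded interfaces lies inside the given network.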

Deleting the 10.42.0.1/16 address from the docker0 device (on f1 and f2) resolved the problem

ip addr del 10.42.0.1/16 dev docker0

# Verify
ip addr show docker0
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:cd:ff:50:c1 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
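
Because the same cleanup had to be repeated on more than one machine, it may help to print the delete commands first and review them before running anything. This is only a sketch: plan_addr_cleanup is a hypothetical helper, it reads ip -o -4 addr show output on stdin, and it matches addresses by a literal leading string such as "10.42." rather than doing real CIDR arithmetic.

```shell
# Hypothetical helper: print (do not run) an "ip addr del" command for every
# IPv4 address on stdin whose text starts with the given literal prefix.
# Usage: ip -o -4 addr show dev docker0 | plan_addr_cleanup 10.42.
plan_addr_cleanup() {
    awk -v pfx="$1" '
        # $2 is the interface, $4 is ADDR/PREFIX; keep the prefix length so
        # the printed command names exactly one address.
        $3 == "inet" && index($4, pfx) == 1 {
            print "ip addr del " $4 " dev " $2
        }
    '
}
```

The printed lines can be reviewed and then executed by hand, which keeps the destructive step explicit.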

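In hindsight, the machines had previously run Rancher 1.X.

Rancher 1.X adds a 10.42 address to the docker0 device for its ipsec forwarding. For reasons unknown it was not cleaned up completely, and it conflicted with the default network range in flannel's configuration.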

Reference: https://github.com/coreos/flannel/issues/844

Reposted from: https://zhuanlan.zhihu.com/p/46804841
