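Packet capture with Flannel VXLAN
- For Pods in a Kubernetes cluster, capturing packets inside the container is inconvenient, so captures are usually taken, filtered by Pod IP, on the devices the Pod's packets traverse: the veth device, the docker0 bridge, the CNI plugin devices (such as cni0 and flannel.1), and the NIC of the node hosting the Pod. Which device to choose depends on the suspected cause of the network problem: narrow the scope step by step, moving from the source toward the destination. If the CNI plugin is suspected, for example, capture on the CNI plugin devices. A packet sent from a Pod passes in turn through the veth device, cni0, flannel.1 (for traffic to Pods on other nodes), and the host NIC before reaching the peer, so you can capture on each device in that order to locate where the problem occurs.
- Note that the source and destination IP addresses used in the capture filter must change with the device. For example, when capturing a Pod's ping {host} packets, you can filter by Pod IP on the veth device and cni0, but on the host NIC filtering by Pod IP captures nothing, because by then the Pod IP has been translated (SNAT) to the host NIC's IP.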
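The address rule above can be sketched as a small helper that picks the tcpdump filter for a given capture device. This is only an illustration: capture_filter is a hypothetical function name, and it assumes Pod traffic to external hosts is SNATed to the node IP before it leaves the host NIC.

```shell
# capture_filter IFACE POD_IP NODE_IP: print a tcpdump filter for traffic
# from the Pod to an external host, as seen on IFACE.
# Assumption: the packet is SNATed to NODE_IP before leaving the host NIC.
capture_filter() {
  case "$1" in
    veth*|cni0) printf 'host %s\n' "$2" ;;  # before SNAT: Pod IP is visible
    *)          printf 'host %s\n' "$3" ;;  # after SNAT: node IP replaces it
  esac
}

# Example with the addresses used in this article:
#   tcpdump -i cni0  -nn "$(capture_filter cni0  10.70.0.80 192.168.0.10)"
#   tcpdump -i ens33 -nn "$(capture_filter ens33 10.70.0.80 192.168.0.10)"
```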
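The figure below shows the network model for cross-node communication with flannel in VXLAN mode; pay attention to the corresponding network interfaces when capturing.
# List the host's network interfaces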
[root@k8s-master-1 ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
link/ether 00:0c:29:1a:8f:b1 brd ff:ff:ff:ff:ff:ff
3: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: br-4fd7344be7da: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:06:0d:fe:05 brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:fe:f7:31:65 brd ff:ff:ff:ff:ff:ff
6: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN mode DEFAULT group default
link/ether 46:12:74:d8:4a:42 brd ff:ff:ff:ff:ff:ff
7: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether ae:91:6a:eb:bc:c6 brd ff:ff:ff:ff:ff:ff
13: veth0939f343@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
link/ether 12:15:90:e9:f2:97 brd ff:ff:ff:ff:ff:ff link-netnsid 5
14: vethee6d5517@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
link/ether 3e:8c:58:6f:7a:00 brd ff:ff:ff:ff:ff:ff link-netnsid 6
17: veth1ed00cb1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master cni0 state UP mode DEFAULT group default
link/ether aa:26:ba:3e:fb:bf brd ff:ff:ff:ff:ff:ff link-netnsid 9
# Show the current Pod information
[root@k8s-master-1 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
centos 1/1 Running 8 10d 10.70.0.80 k8s-master-1 <none> <none>
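# Get the container ID for the centos Pod (if the Pod has several containers, you can inspect the pause container instead, since all containers in the Pod share its network namespace)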
[root@k8s-master-1 ~]# docker ps | grep centos
ae03ccef71ed eeb6ee3f44bd "/bin/bash" 35 minutes ago Up 35 minutes k8s_centos_centos_default_b2c2c87b-d40c-4ac2-9a89-9d19959241b3_8
09d3ecb20ca9 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 35 minutes ago Up 35 minutes k8s_POD_centos_default_b2c2c87b-d40c-4ac2-9a89-9d19959241b3_53
# Get the PID of the centos container
[root@k8s-master-1 ~]# docker inspect --format "{{ .State.Pid }}" ae03ccef71ed
11090
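# Determine the interface name inside the container. From eth0@if17 we know that eth0's peer device on the host has interface index 17 (veth1ed00cb1, which is attached to cni0); you can also enter the container to confirm the index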
[root@k8s-master-1 ~]# nsenter -t 11090 -n ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
3: eth0@if17: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default
link/ether 22:ed:7c:37:00:87 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.70.0.80/24 brd 10.70.0.255 scope global eth0
valid_lft forever preferred_lft forever
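The eth0@if17 suffix carries the interface index of the host-side veth peer, so the matching host device can be looked up by that index. A minimal sketch follows. peer_ifindex and ifname_by_index are hypothetical helper names, and both only parse text in the format of the ip output shown above.

```shell
# peer_ifindex [IFACE]: read `ip addr` output (from inside the container)
# on stdin and print the peer index N taken from "IFACE@ifN".
peer_ifindex() {
  sed -n "s/^[0-9]*: ${1:-eth0}@if\([0-9]*\):.*/\1/p"
}

# ifname_by_index INDEX: read `ip link show` output (from the host) on
# stdin and print the name of the interface with that index.
ifname_by_index() {
  awk -F': ' -v idx="$1" '$1 == idx { sub(/@.*/, "", $2); print $2 }'
}

# Example with the PID used in this article:
#   idx=$(nsenter -t 11090 -n ip addr | peer_ifindex eth0)
#   ip link show | ifname_by_index "$idx"
```

With the listings above, index 17 resolves to veth1ed00cb1, which is the device to capture on when you want only this Pod's traffic on the host side.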
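Capturing container traffic from the host
- With this approach, no tools are needed inside the container to capture and analyze packets; the host only needs the nsenter and tcpdump commands.
# Capture container traffic while running ping -c 2 114.114.114.114 inside the centos7 container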
[root@k8s-master-1 ~]# nsenter -t 11090 -n tcpdump -i eth0 -Nnnvvl
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:26:39.852905 IP (tos 0x0, ttl 64, id 25376, offset 0, flags [DF], proto ICMP (1), length 84)
10.70.0.80 > 114.114.114.114: ICMP echo request, id 4, seq 1, length 64
14:26:39.869139 IP (tos 0x0, ttl 127, id 32150, offset 0, flags [none], proto ICMP (1), length 84)
114.114.114.114 > 10.70.0.80: ICMP echo reply, id 4, seq 1, length 64
14:26:40.854643 IP (tos 0x0, ttl 64, id 25962, offset 0, flags [DF], proto ICMP (1), length 84)
10.70.0.80 > 114.114.114.114: ICMP echo request, id 4, seq 2, length 64
14:26:40.869847 IP (tos 0x0, ttl 127, id 32151, offset 0, flags [none], proto ICMP (1), length 84)
114.114.114.114 > 10.70.0.80: ICMP echo reply, id 4, seq 2, length 64
^C
4 packets captured
4 packets received by filter
0 packets dropped by kernel
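Since the nsenter approach depends only on tools installed on the host, it can be worth checking for them before starting a capture session. A small sketch, where missing_tools is a hypothetical helper name:

```shell
# missing_tools TOOL...: print each tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: prints nothing when both tools are installed on the host
#   missing_tools nsenter tcpdump
```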
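Capturing container traffic on cni0
- In flannel VXLAN mode, all traffic leaving a Pod passes through cni0, and traffic destined for Pods on other nodes is then routed on to flannel.1, so container traffic can be captured on these two interfaces.
# Capture traffic leaving the container while running ping -c 2 114.114.114.114 inside the centos7 container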
[root@k8s-master-1 ~]# tcpdump -i cni0 -Nnnvvl src host 10.70.0.80
tcpdump: listening on cni0, link-type EN10MB (Ethernet), capture size 262144 bytes
14:41:48.459050 IP (tos 0x0, ttl 64, id 19995, offset 0, flags [DF], proto ICMP (1), length 84)
10.70.0.80 > 114.114.114.114: ICMP echo request, id 7, seq 1, length 64
14:41:49.460986 IP (tos 0x0, ttl 64, id 19999, offset 0, flags [DF], proto ICMP (1), length 84)
10.70.0.80 > 114.114.114.114: ICMP echo request, id 7, seq 2, length 64
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
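Capturing traffic on flannel.1
External network
- Once traffic reaches cni0, the host makes a routing decision. Traffic from the container to an external address matches the default route via ens33, so it is never forwarded to flannel.1.
# Show the host's routing table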
[root@k8s-master-1 ~]# route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.0.2 0.0.0.0 UG 100 0 0 ens33
10.70.0.0 0.0.0.0 255.255.255.0 U 0 0 0 cni0
10.70.1.0 10.70.1.0 255.255.255.0 UG 0 0 0 flannel.1
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
172.18.0.0 0.0.0.0 255.255.0.0 U 0 0 0 br-4fd7344be7da
192.168.0.0 0.0.0.0 255.255.255.0 U 100 0 0 ens33
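The routing decision behind this table is a longest-prefix match on the destination address. The sketch below reproduces that lookup over route -n output so you can predict which interface to capture on. route_iface is a hypothetical helper name; it handles IPv4 only and ignores metrics and policy routing, so on a live host ip route get is the authoritative check.

```shell
# route_iface DEST: read `route -n` output on stdin and print the interface
# selected by longest-prefix match for DEST (IPv4 only, metrics ignored).
route_iface() {
  awk -v dest="$1" '
    function ip2int(ip,   p) {
      split(ip, p, ".")
      return ((p[1] * 256 + p[2]) * 256 + p[3]) * 256 + p[4]
    }
    # Count the one-bits in a dotted netmask (the prefix length for
    # contiguous masks)
    function masklen(m,   v, n) {
      v = ip2int(m); n = 0
      while (v > 0) { n += v % 2; v = int(v / 2) }
      return n
    }
    # True when ip and net agree on their first len bits
    function same_net(ip, net, len,   shift) {
      shift = 2 ^ (32 - len)
      return int(ip2int(ip) / shift) == int(ip2int(net) / shift)
    }
    BEGIN { best = -1 }
    # Only rows whose first column is an IPv4 address are route entries
    $1 ~ /^[0-9]+\./ {
      len = masklen($3)
      if (same_net(dest, $1, len) && len > best) { best = len; iface = $8 }
    }
    END { if (iface != "") print iface }
  '
}

# Example:
#   route -n | route_iface 114.114.114.114
#   route -n | route_iface 10.70.1.46
```

Against the table above, an external destination such as 114.114.114.114 only matches the default route on ens33, while 10.70.1.46 matches the 10.70.1.0/24 route on flannel.1.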
# Run ping -c 2 114.114.114.114 inside the centos7 container
[root@k8s-master-1 ~]# tcpdump -i flannel.1 -Nnnvvl # no packets are captured here
tcpdump: listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
^C
0 packets captured
0 packets received by filter
0 packets dropped by kernel
# Run ping -c 2 114.114.114.114 inside the centos7 container; this capture shows that the packets are SNATed before they leave through the host's ens33 NIC
[root@k8s-master-1 ~]# tcpdump -i ens33 icmp -Nnnvvl
tcpdump: listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
14:56:35.362894 IP (tos 0x0, ttl 63, id 32929, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.0.10 > 114.114.114.114: ICMP echo request, id 59555, seq 1, length 64
14:56:35.378949 IP (tos 0x0, ttl 128, id 32286, offset 0, flags [none], proto ICMP (1), length 84)
114.114.114.114 > 192.168.0.10: ICMP echo reply, id 59555, seq 1, length 64
14:56:36.364858 IP (tos 0x0, ttl 63, id 33787, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.0.10 > 114.114.114.114: ICMP echo request, id 59555, seq 2, length 64
14:56:36.381098 IP (tos 0x0, ttl 128, id 32287, offset 0, flags [none], proto ICMP (1), length 84)
114.114.114.114 > 192.168.0.10: ICMP echo reply, id 59555, seq 2, length 64
^C
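Internal network
# From the centos7 container, ping the IP of a Pod on another node: ping -c 2 10.70.1.46
# Capture on flannel.1. After traffic reaches cni0 it is matched against the routing table and forwarded to flannel.1; as the capture shows, the packets have not yet been VXLAN-encapsulated at this point
[root@k8s-master-1 ~]# tcpdump -i flannel.1 icmp -Nnnvvl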
tcpdump: listening on flannel.1, link-type EN10MB (Ethernet), capture size 262144 bytes
15:04:30.230644 IP (tos 0x0, ttl 63, id 40259, offset 0, flags [DF], proto ICMP (1), length 84)
10.70.0.80 > 10.70.1.46: ICMP echo request, id 12, seq 1, length 64
15:04:30.230836 IP (tos 0x0, ttl 63, id 24309, offset 0, flags [none], proto ICMP (1), length 84)
10.70.1.46 > 10.70.0.80: ICMP echo reply, id 12, seq 1, length 64
15:04:31.232329 IP (tos 0x0, ttl 63, id 41261, offset 0, flags [DF], proto ICMP (1), length 84)
10.70.0.80 > 10.70.1.46: ICMP echo request, id 12, seq 2, length 64
15:04:31.232607 IP (tos 0x0, ttl 63, id 24780, offset 0, flags [none], proto ICMP (1), length 84)
10.70.1.46 > 10.70.0.80: ICMP echo reply, id 12, seq 2, length 64
# Look up the MAC address of the peer node's flannel.1 device (the VTEP that acts as the gateway for 10.70.1.0/24):
[root@k8s-master-1 ~]# ip neigh | grep 10.70.1
10.70.1.0 dev flannel.1 lladdr 56:1f:85:1c:05:19 PERMANENT
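# Show the FDB table. VXLAN forwarding relies mainly on the FDB (Forwarding Database): the VXLAN device looks up the VTEP IP address for a destination MAC address, then encapsulates the layer-2 frame and sends it to that VTEP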
[root@k8s-master-1 ~]# bridge fdb show dev flannel.1
9a:5b:c1:a7:26:5a dst 192.168.0.11 self permanent
56:1f:85:1c:05:19 dst 192.168.0.11 self permanent
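The two lookups chain together: the neighbor table gives the MAC of the remote flannel.1 device, and the FDB maps that MAC to the node that hosts it. A minimal sketch of the second step, where vtep_for_mac is a hypothetical helper name that only parses bridge fdb output in the format shown above:

```shell
# vtep_for_mac MAC: read `bridge fdb show dev flannel.1` output on stdin
# and print the VTEP IP that follows "dst" on the entry for MAC.
vtep_for_mac() {
  awk -v mac="$1" '$1 == mac { for (i = 2; i < NF; i++) if ($i == "dst") print $(i + 1) }'
}

# Example with the MAC learned from `ip neigh` above:
#   bridge fdb show dev flannel.1 | vtep_for_mac 56:1f:85:1c:05:19
```

With the entries above, 56:1f:85:1c:05:19 maps to 192.168.0.11, the node IP that the encapsulated packets are sent to.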
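# After traffic reaches flannel.1 and the FDB lookup resolves the remote VTEP, the packet is VXLAN-encapsulated and leaves through the host's physical NIC ens33. On ens33 you can no longer capture it by filtering on the Pod IP, because the original packet is now carried inside a UDP datagram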
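The encapsulated traffic can still be captured on ens33 by filtering on the outer UDP header instead of the Pod IP. The sketch below assumes flannel's default VXLAN port on Linux, 8472. That is an assumption about this cluster, so confirm it with ip -d link show flannel.1 (look for dstport). vxlan_filter is a hypothetical helper name.

```shell
# vxlan_filter PEER_NODE_IP [PORT]: print a tcpdump filter matching VXLAN
# traffic exchanged with the peer node.
# Assumption: PORT defaults to 8472, flannel's default VXLAN port on Linux.
vxlan_filter() {
  printf 'udp port %s and host %s\n' "${2:-8472}" "$1"
}

# Example: capture encapsulated Pod traffic to and from node 192.168.0.11
#   tcpdump -i ens33 -nn -T vxlan "$(vxlan_filter 192.168.0.11)"
```

The -T vxlan option asks tcpdump to decode the matched packets as VXLAN, so the inner Pod-to-Pod packet is printed after the outer node-to-node header.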