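OVS VXLAN Mode
OVS supports the GRE, VXLAN, STT, Geneve, and IPsec tunnel protocols. These protocols are the foundation of overlay networks: by adding a layer of encapsulation on top of the physical network, they relieve the shortage of layer-2 networks, minimize the dependence on the underlying physical topology, and maximize control over the network. A VXLAN tunnel endpoint (VTEP) port can be created in one-to-one mode or in one-to-many mode. In one-to-one mode, both local_ip and remote_ip are explicit IPv4 addresses. In one-to-many mode, local_ip is an explicit IPv4 address and remote_ip=flow, which means the port can reach any other VTEP, but the outer destination IP of the VXLAN encapsulation must be set in the flow table before a packet can be sent to the remote VTEP.
Setting up a test environment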
// host1
ip netns add ns10
ip l a veth10 type veth peer name ovs-veth10
ip l s veth10 netns ns10
ovs-vsctl add-br br-int
ovs-vsctl add-port br-int ovs-veth10
ip l s ovs-veth10 up
ip netns exec ns10 ip link set veth10 address fe:fe:fe:fe:fe:aa
ip netns exec ns10 ip a a 1.1.1.1/24 dev veth10
ip netns exec ns10 ip l s veth10 up
ip netns exec ns10 arp -s 1.1.1.2 fe:fe:fe:fe:fe:bb
// core: one-to-one mode, remote_ip is a specific IP; key=flow lets the flow table set the tunnel ID (VNI)
ovs-vsctl add-port br-int aa -- set interface aa type=vxlan options:local_ip=10.128.128.27 options:remote_ip=10.128.128.52 options:key=flow
ovs-ofctl add-flow br-int 'table=0,priority=100,ip,in_port=ovs-veth10 action=set_field:0x7->tun_id,normal'
// host2
ip netns add ns20
ip l a veth20 type veth peer name ovs-veth20
ip l s veth20 netns ns20
ovs-vsctl add-br br-int
ovs-vsctl add-port br-int ovs-veth20
ip l s ovs-veth20 up
ip netns exec ns20 ip link set veth20 address fe:fe:fe:fe:fe:bb
ip netns exec ns20 ip a a 1.1.1.2/24 dev veth20
ip netns exec ns20 ip l s veth20 up
ip netns exec ns20 arp -s 1.1.1.1 fe:fe:fe:fe:fe:aa
// core: one-to-one mode, remote_ip is a specific IP; key=flow lets the flow table set the tunnel ID (VNI)
ovs-vsctl add-port br-int aa -- set interface aa type=vxlan options:local_ip=10.128.128.52 options:remote_ip=10.128.128.27 options:key=flow
ovs-ofctl add-flow br-int 'table=0,priority=100,ip,in_port=ovs-veth20 action=set_field:0x7->tun_id,normal'
// verify one-to-one packets
// host1: ping host2 ns20 from ns10
ip netns exec ns10 ping 1.1.1.2
PING 1.1.1.2 (1.1.1.2) 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.983 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.400 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.471 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.439 ms
// host1: a capture on the vxlan port shows packets without the VXLAN header
tcpdump -i vxlan_sys_4789 -nn -vv -e
tcpdump: listening on vxlan_sys_4789, link-type EN10MB (Ethernet), capture size 262144 bytes
07:57:18.911039 fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 53553, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 10982, seq 1, length 64
07:57:18.911610 fe:fe:fe:fe:fe:bb > fe:fe:fe:fe:fe:aa, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 29337, offset 0, flags [none], proto ICMP (1), length 84)
1.1.1.2 > 1.1.1.1: ICMP echo reply, id 10982, seq 1, length 64
07:57:19.911891 fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 53650, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 10982, seq 2, length 64
07:57:19.912243 fe:fe:fe:fe:fe:bb > fe:fe:fe:fe:fe:aa, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 29374, offset 0, flags [none], proto ICMP (1), length 84)
1.1.1.2 > 1.1.1.1: ICMP echo reply, id 10982, seq 2, length 64
// host1: a capture on the physical NIC shows the VXLAN-encapsulated packets
tcpdump -i eth0 -nn -vv -e dst 10.128.128.52
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
07:58:02.418249 00:50:56:95:b0:b2 > 00:50:56:95:59:53, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 62027, offset 0, flags [DF], proto UDP (17), length 134)
10.128.128.27.42104 > 10.128.128.52.4789: [no cksum] VXLAN, flags [I] (0x08), vni 7
fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 59454, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 11762, seq 1, length 64
07:58:03.419160 00:50:56:95:b0:b2 > 00:50:56:95:59:53, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 62065, offset 0, flags [DF], proto UDP (17), length 134)
10.128.128.27.42104 > 10.128.128.52.4789: [no cksum] VXLAN, flags [I] (0x08), vni 7
fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 59519, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 11762, seq 2, length 64
tcpdump -i eth0 -nn -vv -e src 10.128.128.52
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
08:05:34.582869 00:50:56:95:59:53 > 00:50:56:95:b0:b2, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 18217, offset 0, flags [DF], proto UDP (17), length 134)
10.128.128.52.40479 > 10.128.128.27.4789: [no cksum] VXLAN, flags [I] (0x08), vni 7
fe:fe:fe:fe:fe:bb > fe:fe:fe:fe:fe:aa, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 24861, offset 0, flags [none], proto ICMP (1), length 84)
1.1.1.2 > 1.1.1.1: ICMP echo reply, id 20116, seq 1, length 64
08:05:35.583508 00:50:56:95:59:53 > 00:50:56:95:b0:b2, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 18395, offset 0, flags [DF], proto UDP (17), length 134)
10.128.128.52.40479 > 10.128.128.27.4789: [no cksum] VXLAN, flags [I] (0x08), vni 7
fe:fe:fe:fe:fe:bb > fe:fe:fe:fe:fe:aa, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 25007, offset 0, flags [none], proto ICMP (1), length 84)
1.1.1.2 > 1.1.1.1: ICMP echo reply, id 20116, seq 2, length 64
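The captures above allow a quick sanity check of the encapsulation overhead: the inner frame on vxlan_sys_4789 is 98 bytes, and the outer frame on the physical NIC is 148 bytes. The following is a minimal sketch of that arithmetic, assuming an untagged outer Ethernet header and an outer IPv4 header without options; the variable names are illustrative only.

```shell
#!/bin/sh
# Header sizes in bytes for VXLAN over IPv4
# (assumption: no VLAN tag on the outer frame, no IPv4 options).
outer_eth=14
outer_ipv4=20
outer_udp=8
vxlan_hdr=8
inner_frame=98   # frame length reported on vxlan_sys_4789

overhead=$((outer_eth + outer_ipv4 + outer_udp + vxlan_hdr))
outer_frame=$((inner_frame + overhead))
echo "overhead=${overhead} outer_frame=${outer_frame}"
```

The result matches the "length 148" reported for the outer frame, and the outer IP length of 134 is that value minus the 14-byte Ethernet header.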
// Delete the vxlan port on both hosts and switch to one-to-many mode. Note: in one-to-many mode the flow must set tun_dst, otherwise the peer vxlan port receives no packets
// host1 and host2
ovs-vsctl del-port aa
// host1
ovs-vsctl add-port br-int bb -- set interface bb type=vxlan options:local_ip=10.128.128.27 options:remote_ip=flow options:key=flow
ovs-ofctl add-flow br-int 'table=0,priority=200,ip,in_port=ovs-veth10 action=set_field:0x7->tun_id,set_field:10.128.128.52->tun_dst,normal'
// host2
ovs-vsctl add-port br-int bb -- set interface bb type=vxlan options:local_ip=10.128.128.52 options:remote_ip=flow options:key=flow
ovs-ofctl add-flow br-int 'table=0,priority=200,ip,in_port=ovs-veth20 action=set_field:0x7->tun_id,set_field:10.128.128.27->tun_dst,normal'
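With only two hosts, a single flow per host is enough. With more peers, one flow-based port could serve all of them if each flow picks tun_dst according to the inner destination address. The sketch below only prints the ovs-ofctl commands and does not run them. The helper name gen_peer_flows and the third host (1.1.1.3 behind 10.128.128.60) are hypothetical and not part of the environment above.

```shell
#!/bin/sh
# Print one ovs-ofctl command per remote peer for a flow-based VXLAN port.
# Arguments: bridge, local in_port, VNI, then "inner_ip=vtep_ip" pairs.
gen_peer_flows() {
    bridge=$1; in_port=$2; vni=$3
    shift 3
    for pair in "$@"; do
        inner_ip=${pair%%=*}   # destination address inside the overlay
        vtep_ip=${pair#*=}     # underlay address of the host that owns it
        printf "ovs-ofctl add-flow %s 'table=0,priority=200,ip,in_port=%s,nw_dst=%s actions=set_field:%s->tun_id,set_field:%s->tun_dst,normal'\n" \
            "$bridge" "$in_port" "$inner_ip" "$vni" "$vtep_ip"
    done
}

# 1.1.1.3 / 10.128.128.60 is a hypothetical third host used for illustration.
gen_peer_flows br-int ovs-veth10 0x7 1.1.1.2=10.128.128.52 1.1.1.3=10.128.128.60
```

Printing the commands instead of executing them makes it possible to review the generated flows before applying them on host1.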
// verify one-to-many packets
// host1: ping host2 ns20 from ns10
ip netns exec ns10 ping 1.1.1.2
PING 1.1.1.2 (1.1.1.2) 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=1.07 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.462 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.435 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.432 ms
// host1: a capture on the vxlan port shows packets without the VXLAN header
tcpdump -i vxlan_sys_4789 -nn -vv -e
tcpdump: listening on vxlan_sys_4789, link-type EN10MB (Ethernet), capture size 262144 bytes
08:29:06.935436 fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 37795, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 46159, seq 1, length 64
08:29:06.936214 fe:fe:fe:fe:fe:bb > fe:fe:fe:fe:fe:aa, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 10425, offset 0, flags [none], proto ICMP (1), length 84)
1.1.1.2 > 1.1.1.1: ICMP echo reply, id 46159, seq 1, length 64
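// host1: a capture on the physical NIC shows the VXLAN-encapsulated packets; the outer destination IP is the one set in the flow table, and the outer destination MAC is that of host2's physical NIC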
tcpdump -i ens160 -nn -vv -e dst 10.128.128.52
tcpdump: listening on ens160, link-type EN10MB (Ethernet), capture size 262144 bytes
08:29:06.935475 00:50:56:95:b0:b2 > 00:50:56:95:59:53, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 5977, offset 0, flags [DF], proto UDP (17), length 134)
10.128.128.27.42104 > 10.128.128.52.4789: [no cksum] VXLAN, flags [I] (0x08), vni 7
fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 37795, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 46159, seq 1, length 64
08:29:07.936513 00:50:56:95:b0:b2 > 00:50:56:95:59:53, ethertype IPv4 (0x0800), length 148: (tos 0x0, ttl 64, id 5979, offset 0, flags [DF], proto UDP (17), length 134)
10.128.128.27.42104 > 10.128.128.52.4789: [no cksum] VXLAN, flags [I] (0x08), vni 7
fe:fe:fe:fe:fe:aa > fe:fe:fe:fe:fe:bb, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 38013, offset 0, flags [DF], proto ICMP (1), length 84)
1.1.1.1 > 1.1.1.2: ICMP echo request, id 46159, seq 2, length 64
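Summary: in one-to-one mode every pair of hosts in the cluster must be connected, so each host creates n-1 tunnel ports, each with an explicit local IP and remote IP. In one-to-many mode each host has a single tunnel port with an explicit local IP and remote_ip=flow, but the remote IP must then be set in the flow table.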
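The port counts in the summary can be turned into a small calculation. This is a minimal sketch; the function names and the cluster size n=10 are assumptions chosen for illustration.

```shell
#!/bin/sh
# Tunnel port counts for a cluster of n hosts under each mode.
ports_per_host_one_to_one()  { echo $(( $1 - 1 )); }
ports_per_host_one_to_many() { echo 1; }
total_ports_one_to_one()     { echo $(( $1 * ($1 - 1) )); }
total_ports_one_to_many()    { echo "$1"; }

n=10   # hypothetical cluster size
echo "one-to-one:  $(ports_per_host_one_to_one "$n") per host, $(total_ports_one_to_one "$n") in total"
echo "one-to-many: $(ports_per_host_one_to_many) per host, $(total_ports_one_to_many "$n") in total"
```

In one-to-one mode the total number of ports grows quadratically with the cluster size, while in one-to-many mode it grows linearly and the per-peer state moves into the flow table.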