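1. Lab Objectives
Understand distributed gateways and how BGP EVPN works.
2. Lab Environment
Lab topology
eNSP software version: V100R003C00SPC100. CE12800 software version: Version 8.180 (CE12800 V200R005C10SPC607B607). This version fully supports VXLAN in the simulator.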
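- Build the network according to the lab environment.
- Basic configuration: configure OSPF so that all devices learn each other's Loopback addresses.
- Complete the BGP EVPN configuration and the distributed gateway configuration.
- Server vm1 can ping vm3.
- vm2 can ping vm3.
- vm1, vm2, and vm3 can ping R1's address 177.1.1.1.
- Check the VXLAN, EVPN, and routing status.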
Data preparation
Server configuration data
Name | IP address | Gateway | VLAN ID |
VM-1 | 192.168.1.1/24 | 192.168.1.254 | 10 |
VM-2 | 192.168.2.1/24 | 192.168.2.254 | 20 |
VM-3 | 192.168.1.2/24 | 192.168.1.254 | 30 |
vSwitch configuration data
vSwitch-1 | VLANs 10 and 20; trunk link to Leaf1 |
vSwitch-2 | VLAN 30; access link to Leaf2 |
Leaf1/Leaf2/Leaf3/Spine1/Spine2 configuration data
Device | LoopBack0 (Router ID) | LoopBack1 (VTEP IP) | VID | BD | L2VNI |
Leaf-1 | 10.1.1.1 | 20.1.1.1 | 10 | 10 | 10 |
Leaf-1 | 10.1.1.1 | 20.1.1.1 | 20 | 20 | 20 |
Leaf-2 | 10.1.1.2 | 20.1.1.2 | 30 | 10 | 10 |
Leaf-3 | 10.1.1.3 | 20.1.1.3 |  |  |  |
Spine-1 | 10.1.1.4 |  |  |  |  |
Spine-2 | 10.1.1.5 |  |  |  |  |
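In this lab the Spines do not perform VXLAN encapsulation; they only forward at Layer 3 and need no VTEP address. Leaf3 has no service access point, so it needs no Layer 2 VXLAN or bridge domain configuration.
Interconnect interface addresses: 10.1.xy.x or 10.1.xy.y with a /24 mask, where x and y are the numbers of the two devices on the link.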
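The addressing rule can be sketched in Python as follows. This is only an illustration: the function name `link_addresses` is made up for this example, and the device numbering (Leaf-1 to Leaf-3 = 1 to 3, Spine-1 = 4, Spine-2 = 5, R1 = 6) is an assumption read off the last octet of the addresses used in the configurations in this document. It only works for single-digit device numbers.

```python
import ipaddress


def link_addresses(x: int, y: int) -> dict:
    """Return the /24 subnet and both interface addresses for the link x<->y.

    The third octet is formed by writing the smaller device number followed
    by the larger one (for example devices 1 and 4 give 14).
    """
    lo, hi = sorted((x, y))
    # ip_network validates that the generated string is a real /24 network.
    subnet = ipaddress.ip_network(f"10.1.{lo}{hi}.0/24")
    return {
        "subnet": str(subnet),
        lo: f"10.1.{lo}{hi}.{lo}",
        hi: f"10.1.{lo}{hi}.{hi}",
    }


if __name__ == "__main__":
    # Leaf-1 (1) to Spine-1 (4)
    print(link_addresses(1, 4))
    # Spine-1 (4) to Spine-2 (5)
    print(link_addresses(4, 5))
```

For the Leaf-1 to Spine-1 link this yields 10.1.14.1 on Leaf-1 and 10.1.14.4 on Spine-1, which matches the interface addresses configured in section 4.1.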
EVPN configuration data
Device | BD | L2VNI | RD (L2) | RT (L2) | L3VNI | RD (L3) | RT (L3) |
Leaf-1 | 10 | 10 | 10:1 | 10:1, 1000:1 (ERT) | 100 | 100:1 | 100:1, 1000:1 (evpn) |
Leaf-1 | 20 | 20 | 20:1 | 20:1, 1000:1 (ERT) |  |  |  |
Leaf-2 | 10 | 10 | 10:1 | 10:1, 1000:1 (ERT) | 100 | 100:1 | 100:1, 1000:1 (evpn) |
Leaf-3 |  |  |  |  | 100 | 100:1 | 100:1, 1000:1 (evpn) |
Leaf-3 has no L2VPN configuration; it communicates with Leaf1 and Leaf2 only through the L3VNI.
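3. Configuration Files
See the eNSP project file.
4. Configuration Steps
4.1 Basic configuration
- Configure the IP addresses and gateways of vm1, vm2, and vm3.
- Configure vSwitch-1: assign the ports to VLANs and configure the trunk link to Leaf1. On vSwitch-2, assign the ports to VLAN 30 and configure the access link to Leaf2.
a. vSwitch-1 configuration: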
#
interface Ethernet0/0/1
port link-type trunk
port trunk allow-pass vlan 10 20
#
interface Ethernet0/0/2
port link-type access
port default vlan 10
#
interface Ethernet0/0/3
port link-type access
port default vlan 20
#
b. vSwitch-2 configuration:
interface Ethernet0/0/1
port link-type access
port default vlan 30
#
interface Ethernet0/0/2
port link-type access
port default vlan 30
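- Configure the loopback addresses and interconnect addresses on Spine1/Spine2/Leaf1/Leaf2/Leaf3, and configure OSPF so that the Loopback addresses are reachable.
a. Spine-1 configuration: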
sysname Spine-1
#
interface GE1/0/0
undo portswitch
undo shutdown
ip address 10.1.45.4 255.255.255.0
ospf network-type p2p
#
interface GE1/0/1
undo portswitch
undo shutdown
ip address 10.1.14.4 255.255.255.0
ospf network-type p2p
#
interface GE1/0/2
undo portswitch
undo shutdown
ip address 10.1.24.4 255.255.255.0
ospf network-type p2p
#
interface GE1/0/3
undo portswitch
undo shutdown
ip address 10.1.34.4 255.255.255.0
ospf network-type p2p
#
interface LoopBack0
ip address 10.1.1.4 255.255.255.255
#
ospf 1 router-id 10.1.1.4
area 0.0.0.0
network 10.1.1.4 0.0.0.0
network 10.1.14.4 0.0.0.0
network 10.1.24.4 0.0.0.0
network 10.1.34.4 0.0.0.0
network 10.1.45.4 0.0.0.0
#
b. Spine-2 configuration:
sysname Spine-2
#
interface GE1/0/0
undo portswitch
undo shutdown
ip address 10.1.45.5 255.255.255.0
ospf network-type p2p
#
interface GE1/0/1
undo portswitch
undo shutdown
ip address 10.1.15.5 255.255.255.0
ospf network-type p2p
#
interface GE1/0/2
undo portswitch
undo shutdown
ip address 10.1.25.5 255.255.255.0
ospf network-type p2p
#
interface GE1/0/3
undo portswitch
undo shutdown
ip address 10.1.35.5 255.255.255.0
ospf network-type p2p
#
interface LoopBack0
ip address 10.1.1.5 255.255.255.255
#
ospf 1 router-id 10.1.1.5
area 0.0.0.0
network 10.1.1.5 0.0.0.0
network 10.1.15.5 0.0.0.0
network 10.1.25.5 0.0.0.0
network 10.1.35.5 0.0.0.0
network 10.1.45.5 0.0.0.0
#
c. Leaf-1 configuration:
sysname Leaf-1
#
interface GE1/0/0
undo portswitch
undo shutdown
ip address 10.1.14.1 255.255.255.0
ospf network-type p2p
#
interface GE1/0/2
undo portswitch
undo shutdown
ip address 10.1.15.1 255.255.255.0
ospf network-type p2p
#
interface LoopBack0
ip address 10.1.1.1 255.255.255.255
#
interface LoopBack1
ip address 20.1.1.1 255.255.255.255
#
ospf 1 router-id 10.1.1.1
area 0.0.0.0
network 10.1.1.1 0.0.0.0
network 10.1.14.1 0.0.0.0
network 10.1.15.1 0.0.0.0
network 20.1.1.1 0.0.0.0
#
d. Leaf-2 configuration:
sysname Leaf-2
#
interface GE1/0/0
undo portswitch
undo shutdown
ip address 10.1.24.2 255.255.255.0
ospf network-type p2p
#
interface GE1/0/2
undo portswitch
undo shutdown
ip address 10.1.25.2 255.255.255.0
ospf network-type p2p
#
interface LoopBack0
ip address 10.1.1.2 255.255.255.255
#
interface LoopBack1
ip address 20.1.1.2 255.255.255.255
#
ospf 1 router-id 10.1.1.2
area 0.0.0.0
network 10.1.1.2 0.0.0.0
network 10.1.24.2 0.0.0.0
network 10.1.25.2 0.0.0.0
network 20.1.1.2 0.0.0.0
#
e. Leaf-3 configuration:
sysname Leaf-3
#
interface GE1/0/0
undo portswitch
undo shutdown
ip address 10.1.34.3 255.255.255.0
ospf network-type p2p
#
interface GE1/0/2
undo portswitch
undo shutdown
ip address 10.1.35.3 255.255.255.0
ospf network-type p2p
#
interface LoopBack0
ip address 10.1.1.3 255.255.255.255
#
interface LoopBack1
ip address 20.1.1.3 255.255.255.255
#
ospf 1 router-id 10.1.1.3
area 0.0.0.0
network 10.1.1.3 0.0.0.0
network 10.1.34.3 0.0.0.0
network 10.1.35.3 0.0.0.0
network 20.1.1.3 0.0.0.0
4.2 Configuring service access points
Configure service access points on Leaf1 and Leaf2.
1) Leaf1 configuration:
#
bridge-domain 10
#
bridge-domain 20
#
interface GE1/0/1
undo shutdown
#
interface GE1/0/1.10 mode l2
encapsulation dot1q vid 10
bridge-domain 10
#
interface GE1/0/1.20 mode l2
encapsulation dot1q vid 20
bridge-domain 20
#
2) Leaf2 configuration:
bridge-domain 10
#
interface GE1/0/1
undo shutdown
#
interface GE1/0/1.10 mode l2
encapsulation untag
bridge-domain 10
4.3 Configuring BGP EVPN peers
1) Spine1 (RR) configuration:
#
evpn-overlay enable
#
bgp 100
router-id 10.1.1.4
peer 10.1.1.1 as-number 100
peer 10.1.1.1 connect-interface LoopBack0
peer 10.1.1.2 as-number 100
peer 10.1.1.2 connect-interface LoopBack0
peer 10.1.1.3 as-number 100
peer 10.1.1.3 connect-interface LoopBack0
peer 10.1.1.5 as-number 100
peer 10.1.1.5 connect-interface LoopBack0
#
ipv4-family unicast
undo peer 10.1.1.1 enable
undo peer 10.1.1.2 enable
undo peer 10.1.1.3 enable
undo peer 10.1.1.5 enable
#
l2vpn-family evpn
undo policy vpn-target
peer 10.1.1.1 enable
peer 10.1.1.1 reflect-client
peer 10.1.1.2 enable
peer 10.1.1.2 reflect-client
peer 10.1.1.3 enable
peer 10.1.1.3 reflect-client
peer 10.1.1.5 enable
peer 10.1.1.5 reflect-client
#
2) Spine2 (RR) configuration:
#
evpn-overlay enable
#
bgp 100
router-id 10.1.1.5
peer 10.1.1.1 as-number 100
peer 10.1.1.1 connect-interface LoopBack0
peer 10.1.1.2 as-number 100
peer 10.1.1.2 connect-interface LoopBack0
peer 10.1.1.3 as-number 100
peer 10.1.1.3 connect-interface LoopBack0
peer 10.1.1.4 as-number 100
peer 10.1.1.4 connect-interface LoopBack0
#
ipv4-family unicast
undo peer 10.1.1.1 enable
undo peer 10.1.1.2 enable
undo peer 10.1.1.3 enable
undo peer 10.1.1.4 enable
#
l2vpn-family evpn
undo policy vpn-target
peer 10.1.1.1 enable
peer 10.1.1.1 reflect-client
peer 10.1.1.2 enable
peer 10.1.1.2 reflect-client
peer 10.1.1.3 enable
peer 10.1.1.3 reflect-client
peer 10.1.1.4 enable
peer 10.1.1.4 reflect-client
#
3) Leaf1 BGP EVPN configuration:
#
evpn-overlay enable
#
bgp 100
router-id 10.1.1.1
peer 10.1.1.4 as-number 100
peer 10.1.1.4 connect-interface LoopBack0
peer 10.1.1.5 as-number 100
peer 10.1.1.5 connect-interface LoopBack0
#
ipv4-family unicast
undo peer 10.1.1.4 enable
undo peer 10.1.1.5 enable
#
l2vpn-family evpn
policy vpn-target
peer 10.1.1.4 enable
peer 10.1.1.5 enable
#
4) Leaf2 BGP EVPN configuration:
#
evpn-overlay enable
#
bgp 100
router-id 10.1.1.2
peer 10.1.1.4 as-number 100
peer 10.1.1.4 connect-interface LoopBack0
peer 10.1.1.5 as-number 100
peer 10.1.1.5 connect-interface LoopBack0
#
ipv4-family unicast
undo peer 10.1.1.4 enable
undo peer 10.1.1.5 enable
#
l2vpn-family evpn
policy vpn-target
peer 10.1.1.4 enable
peer 10.1.1.5 enable
#
5) Leaf3 BGP EVPN configuration:
#
evpn-overlay enable
#
bgp 100
router-id 10.1.1.3
peer 10.1.1.4 as-number 100
peer 10.1.1.4 connect-interface LoopBack0
peer 10.1.1.5 as-number 100
peer 10.1.1.5 connect-interface LoopBack0
#
ipv4-family unicast
undo peer 10.1.1.4 enable
undo peer 10.1.1.5 enable
#
l2vpn-family evpn
policy vpn-target
peer 10.1.1.4 enable
peer 10.1.1.5 enable
#
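undo policy vpn-target
By default, a PE filters the VPNv4 routes it receives by VPN target. Routes that pass the filter are added to the routing table, and routes that do not are discarded. Therefore, if a PE has no VPN instance configured, or its VPN instances have no VPN target configured, the PE discards all received VPNv4 routes. The same filtering applies to EVPN routes in the l2vpn-family evpn view.
Spine1 and Spine2 (the RRs) have no VPN instance configured, yet an RR must store all EVPN routes in order to advertise them to the Leaf peers. The RRs must therefore accept all EVPN routes without applying VPN-target filtering.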
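The filtering decision described above can be modelled with a short Python sketch. It is a simplified illustration rather than the device's implementation: the function name `accept_route` and its parameters are invented for this example, and RT values are treated as plain strings.

```python
def accept_route(route_rts, import_rts, policy_vpn_target=True):
    """Decide whether a received VPN/EVPN route is kept.

    route_rts:         RT values carried by the received route
    import_rts:        union of the import RTs of all local instances
    policy_vpn_target: False models 'undo policy vpn-target' (no filtering)
    """
    if not policy_vpn_target:
        # Filtering disabled: keep every route, as required on the RRs.
        return True
    # Filtering enabled: keep the route only if at least one RT matches.
    return bool(set(route_rts) & set(import_rts))


if __name__ == "__main__":
    type2_from_leaf1 = {"10:1", "1000:1"}

    # A Spine has no VPN/EVPN instance, so it has no import RTs at all.
    print(accept_route(type2_from_leaf1, set()))                           # False
    print(accept_route(type2_from_leaf1, set(), policy_vpn_target=False))  # True

    # Leaf3 imports 100:1 and 1000:1 in vpn1, so RT 1000:1 matches.
    print(accept_route(type2_from_leaf1, {"100:1", "1000:1"}))             # True
```

With filtering left enabled, a Spine without any instance would drop every route and have nothing to reflect, which is why the RRs are configured with undo policy vpn-target.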
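4.4 Configuring the VPN instance and EVPN instances
The Leaf1 configuration is shown first; Leaf2 and Leaf3 are similar. Leaf3 has no bridge-domain configured, so it has no EVPN instance.
1) VPN instance and EVPN instances on Leaf1: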
ip vpn-instance vpn1
ipv4-family
route-distinguisher 100:1
vpn-target 100:1 export-extcommunity
vpn-target 1000:1 export-extcommunity evpn
vpn-target 100:1 import-extcommunity
vpn-target 1000:1 import-extcommunity evpn
vxlan vni 100
#
bridge-domain 10
vxlan vni 10
evpn
route-distinguisher 10:1
vpn-target 10:1 export-extcommunity
vpn-target 1000:1 export-extcommunity
vpn-target 10:1 import-extcommunity
#
bridge-domain 20
vxlan vni 20
evpn
route-distinguisher 20:1
vpn-target 20:1 export-extcommunity
vpn-target 1000:1 export-extcommunity
vpn-target 20:1 import-extcommunity
#
2) VPN instance and EVPN instances on Leaf2:
ip vpn-instance vpn1
ipv4-family
route-distinguisher 100:1
vpn-target 100:1 export-extcommunity
vpn-target 1000:1 export-extcommunity evpn
vpn-target 100:1 import-extcommunity
vpn-target 1000:1 import-extcommunity evpn
vxlan vni 100
#
bridge-domain 10
vxlan vni 10
evpn
route-distinguisher 10:1
vpn-target 10:1 export-extcommunity
vpn-target 1000:1 export-extcommunity
vpn-target 10:1 import-extcommunity
#
bridge-domain 20
vxlan vni 20
evpn
route-distinguisher 20:1
vpn-target 20:1 export-extcommunity
vpn-target 1000:1 export-extcommunity
vpn-target 20:1 import-extcommunity
#
3) VPN instance on Leaf3:
ip vpn-instance vpn1
ipv4-family
route-distinguisher 100:1
vpn-target 100:1 export-extcommunity
vpn-target 1000:1 export-extcommunity evpn
vpn-target 100:1 import-extcommunity
vpn-target 1000:1 import-extcommunity evpn
vxlan vni 100
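Configuring vpn-target 1000:1 export-extcommunity evpn under the L3VPN (ip vpn-instance) applies to the IP prefixes originated by that L3VPN. When they are advertised as EVPN Type 5 routes in BGP Update messages over MP-BGP EVPN, they carry RT 1000:1, which remote L3VPN instances use to filter and accept the prefix routes.
Configuring vpn-target 1000:1 export-extcommunity under the L2VPN (evpn) applies to the EVPN Type 2 routes (MAC/IP information) originated by the EVPN instance. The IP information in them (a 32-bit host route) is advertised over MP-BGP EVPN with RT 1000:1, which remote L3VPN instances use to filter and accept the host routes.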
4.5 Enabling head-end replication
1) Leaf1 configuration:
interface Nve1
source 20.1.1.1
vni 10 head-end peer-list protocol bgp
vni 20 head-end peer-list protocol bgp
#
2) Leaf2 configuration:
interface Nve1
source 20.1.1.2
vni 10 head-end peer-list protocol bgp
vni 20 head-end peer-list protocol bgp
#
3) Leaf3 configuration:
interface Nve1
source 20.1.1.3
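On Leaf3 it is enough to configure the NVE interface and specify its source address. Head-end replication is not needed because there is no BUM traffic.
After vni 10 head-end peer-list protocol bgp is configured, the device generates an EVPN Type 3 route (inclusive multicast route) in a BGP Update message. The route carries VNI 10 and the VTEP address 20.1.1.1, and it tells the other VTEPs that this device is a member of that VNI (an L2VNI). A remote VTEP that receives the route and finds the address reachable establishes a VXLAN tunnel and adds this VTEP to its own head-end replication list for the VNI, which is used to send BUM traffic.
Use display vxlan peer to view the head-end replication list of each VNI.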
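The way a VTEP could derive its head-end replication lists from received Type 3 routes can be sketched as follows. This is a deliberately simplified model: the function name `build_head_end_lists` and the `(vni, originating_ip)` tuple representation are assumptions made for this example, and the RT check and the underlay reachability check are left out.

```python
from collections import defaultdict


def build_head_end_lists(local_vtep, local_vnis, type3_routes):
    """Build {vni: [remote VTEP IPs]} from Type 3 routes.

    local_vtep:   this device's VTEP IP (NVE source address)
    local_vnis:   set of L2VNIs configured on this device
    type3_routes: iterable of (vni, originating_router_ip) tuples
    """
    peers = defaultdict(list)
    for vni, origin_ip in type3_routes:
        # Skip VNIs that are not configured locally and our own routes.
        if vni not in local_vnis or origin_ip == local_vtep:
            continue
        if origin_ip not in peers[vni]:
            peers[vni].append(origin_ip)
    return dict(peers)


if __name__ == "__main__":
    routes = [
        (10, "20.1.1.1"), (10, "20.1.1.2"),
        (20, "20.1.1.1"), (20, "20.1.1.2"),
    ]
    # Leaf1: VTEP 20.1.1.1, L2VNIs 10 and 20
    print(build_head_end_lists("20.1.1.1", {10, 20}, routes))
    # Leaf3: VTEP 20.1.1.3, no L2VNI, so the result is empty
    print(build_head_end_lists("20.1.1.3", set(), routes))
```

For Leaf1 the sketch gives 20.1.1.2 as the only peer for VNI 10 and VNI 20, which corresponds to the display vxlan peer output shown in section 5.2.2.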
4.6 Configuring the VXLAN Layer 3 gateway
1) Leaf1 configuration:
interface Vbdif10
ip binding vpn-instance vpn1
ip address 192.168.1.254 255.255.255.0
mac-address 0001-0001-0001
vxlan anycast-gateway enable
arp collect host enable
#
interface Vbdif20
ip binding vpn-instance vpn1
ip address 192.168.2.254 255.255.255.0
mac-address 0002-0002-0002
vxlan anycast-gateway enable
arp collect host enable
#
2) Leaf2 configuration:
interface Vbdif10
ip binding vpn-instance vpn1
ip address 192.168.1.254 255.255.255.0
mac-address 0001-0001-0001
vxlan anycast-gateway enable
arp collect host enable
#
interface Vbdif20
ip binding vpn-instance vpn1
ip address 192.168.2.254 255.255.255.0
mac-address 0002-0002-0002
vxlan anycast-gateway enable
arp collect host enable
Configuration notes:
vxlan anycast-gateway enable
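Run this command when the gateway is to act as a distributed gateway and should learn only from the ARP, ND, or DHCP packets sent by user-side hosts. After the distributed gateway function is enabled:
The gateway processes only the ARP, ND, or DHCP packets received from user-side hosts and generates host routes from them.
The gateway deletes the ARP, ND, or DHCP entries it has already learned from the network side, together with the corresponding host routes.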
arp collect host enable
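Enables the Layer 3 gateway to collect the host information table.
In a distributed-gateway VXLAN deployment (BGP EVPN mode) where the VXLAN gateways advertise IRB routes to each other, arp collect host enable must be configured so that host routes can be advertised.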
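4.7 Configuring BGP to advertise IRB routes to peers
Leaf1 and Leaf2 are configured in the same way to advertise IRB routes to their peers:
1) Leaf1 configuration: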
bgp 100
l2vpn-family evpn
peer 10.1.1.4 advertise irb
peer 10.1.1.5 advertise irb
#
2) Leaf2 configuration:
bgp 100
l2vpn-family evpn
peer 10.1.1.4 advertise irb
peer 10.1.1.5 advertise irb
#
3) Spine1 configuration:
bgp 100
l2vpn-family evpn
undo policy vpn-target
peer 10.1.1.1 advertise irb
peer 10.1.1.2 advertise irb
peer 10.1.1.3 advertise irb
peer 10.1.1.5 advertise irb
4) Spine2 configuration:
bgp 100
l2vpn-family evpn
undo policy vpn-target
peer 10.1.1.1 advertise irb
peer 10.1.1.2 advertise irb
peer 10.1.1.3 advertise irb
peer 10.1.1.4 advertise irb
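As RRs, the Spines must also enable IRB route advertisement to their peers; otherwise IRB routes are not passed on to the clients after they reach the reflector.
4.8 Configuring BGP to advertise IP prefix routes to peers
A Leaf node can advertise network segment routes only in the following two scenarios:
The segment attached to the Leaf node is unique in the whole VXLAN network and the number of valid host routes is large. Advertising the segment route of the hosts then reduces the route storage load on the Leaf nodes.
Hosts in the VXLAN network need to reach external networks. The Leaf node can then advertise its attached external segment routes into the VXLAN network so that the other Leaf nodes learn routes to the external networks.
Configure IP prefix route advertisement on Leaf3 as follows: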
bgp 100
ipv4-family vpn-instance vpn1
import-route ospf 2
advertise l2vpn evpn
#
advertise l2vpn evpn
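// Advertises the Layer 3 routes of the vpn-instance through BGP EVPN.
On Leaf1 and Leaf2, the import-route command could also be used to advertise prefix routes, converting the 192.168.1.0/24 and 192.168.2.0/24 routes of the IP VPN instance into EVPN Type 5 routes. However, because both Leaf1 and Leaf2 host the 192.168.1.0/24 and 192.168.2.0/24 segments, Leaf3 would by default select only the prefix routes sent by Leaf1 under the BGP route selection rules. This could also cause unexpected problems in actual forwarding, so it is sufficient for Leaf1 and Leaf2 to advertise IRB routes. In addition, once a host is online its application processes exchange packets on their own, so there is no need to worry that Leaf1 and Leaf2, as gateways, will fail to collect the IRB route information of VM1, VM2, and VM3.
If Leaf1 and Leaf2 advertised a prefix for 192.168.1.0/24, traffic from Leaf3 to 192.168.1.3 (an address not used by any VM) would still be sent through the L3 VXLAN tunnel to Leaf1 or Leaf2, placing unnecessary load on the network. Using IRB routes avoids this problem.
Also, if R1 behind Leaf3 needs to initiate access to 192.168.1.1, decide case by case whether to advertise an IP prefix route on Leaf1. An IRB route is a 32-bit host route collected from ARP entries. If the ARP entry for 192.168.1.1 ages out, the 32-bit IRB route for 192.168.1.1 on Leaf3 also disappears, and 192.168.1.1 becomes unreachable in that situation.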
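The trade-off between host routes and a prefix route can be illustrated with a small longest-prefix-match sketch in Python. The route tables below are assumptions built from the addresses in this lab (the next hops are the VTEP addresses of Leaf1 and Leaf2), and `lookup` is a hypothetical helper, not a model of the device's FIB.

```python
import ipaddress


def lookup(table, dst):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return None if best is None else best[1]


# Leaf3's view when Leaf1/Leaf2 advertise only IRB (host) routes.
irb_only = {
    "192.168.1.1/32": "20.1.1.1",
    "192.168.1.2/32": "20.1.1.2",
}

# Leaf3's view if Leaf1 additionally advertised the segment as a prefix route.
with_prefix = dict(irb_only)
with_prefix["192.168.1.0/24"] = "20.1.1.1"

if __name__ == "__main__":
    # Unused address: no route with IRB only, tunnelled to Leaf1 with the prefix.
    print(lookup(irb_only, "192.168.1.3"))     # None
    print(lookup(with_prefix, "192.168.1.3"))  # 20.1.1.1
    # A known host still follows its more specific /32 route.
    print(lookup(with_prefix, "192.168.1.2"))  # 20.1.1.2
```

The same helper also shows the ARP-aging case: removing "192.168.1.1/32" from `irb_only` makes `lookup(irb_only, "192.168.1.1")` return None, whereas the table that includes the /24 prefix still resolves the address.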
4.9 Other configuration
Configure the interconnection and routing between R1 and Leaf3:
1) R1 configuration:
sysname R1
#
interface GigabitEthernet0/0/1
ip address 10.1.36.6 255.255.255.0
#
interface LoopBack0
ip address 177.1.1.1 255.255.255.255
#
ospf 1 router-id 10.1.36.6
area 0.0.0.0
network 177.1.1.1 0.0.0.0
network 10.1.36.6 0.0.0.0
#
2) Leaf3 configuration:
interface GE1/0/1
undo portswitch
undo shutdown
ip binding vpn-instance vpn1
ip address 10.1.36.3 255.255.255.0
#
ospf 2 router-id 10.1.36.3 vpn-instance vpn1
import-route bgp
area 0.0.0.0
network 10.1.36.3 0.0.0.0
#
5. Verification
5.1 Checking that the EVPN peers are established
Check the BGP EVPN peer status on a Spine:
[~Spine-1]display bgp evpn peer
BGP local router ID : 10.1.1.4
Local AS number : 100
Total number of peers : 4
Peers in established state : 3
Peer V AS MsgRcvd MsgSent OutQ Up/Down State PrefRcv
10.1.1.1 4 100 11 15 0 00:00:59 Established 10
10.1.1.2 4 100 10 15 0 00:01:01 Established 8
10.1.1.3 4 100 5 12 0 00:00:58 Established 2
10.1.1.5 4 100 0 0 0 01:31:30 Active 0
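When the check has to be repeated on several devices, the peer table can be scanned with a few lines of Python. This is a sketch under the assumption that each peer row has exactly the nine whitespace-separated columns shown above; `non_established_peers` is a hypothetical helper, not a Huawei tool.

```python
def non_established_peers(output: str):
    """Return [(peer_ip, state)] for peers whose State is not Established."""
    problems = []
    for line in output.splitlines():
        fields = line.split()
        # Peer rows have 9 columns and start with a dotted IPv4 address;
        # this also skips the header row, whose first field is "Peer".
        if len(fields) == 9 and fields[0].count(".") == 3:
            peer, state = fields[0], fields[7]
            if state != "Established":
                problems.append((peer, state))
    return problems


if __name__ == "__main__":
    sample = """\
Peer V AS MsgRcvd MsgSent OutQ Up/Down State PrefRcv
10.1.1.1 4 100 11 15 0 00:00:59 Established 10
10.1.1.2 4 100 10 15 0 00:01:01 Established 8
10.1.1.3 4 100 5 12 0 00:00:58 Established 2
10.1.1.5 4 100 0 0 0 01:31:30 Active 0
"""
    print(non_established_peers(sample))
```

Applied to the output above, the helper reports only 10.1.1.5 in the Active state, consistent with the summary line showing three of four peers established.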
5.2 Viewing EVPN routes
5.2.1 Initial BGP EVPN routes on Leaf1
1) Check the BGP EVPN peer status on Leaf1:
[~Leaf-1]display bgp evpn peer
BGP local router ID : 10.1.1.1
Local AS number : 100
Total number of peers : 2
Peers in established state : 1
Peer V AS MsgRcvd MsgSent OutQ Up/Down State PrefRcv
10.1.1.4 4 100 20 15 0 00:05:34 Established 5
10.1.1.5 4 100 0 0 0 01:33:55 Active 0
2) View the routing table of VPN instance vpn1 on Leaf1 at this point:
[~Leaf-1]display ip routing-table vpn-instance vpn1
Proto: Protocol Pre: Preference
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : vpn1
Destinations : 10 Routes : 10
Destination/Mask Proto Pre Cost Flags NextHop Interface
10.1.36.0/24 IBGP 255 2 RD 20.1.1.3 VXLAN
177.1.1.1/32 IBGP 255 2 RD 20.1.1.3 VXLAN // Type 5 IP prefix route
192.168.1.0/24 Direct 0 0 D 192.168.1.254 Vbdif10
192.168.1.2/32 IBGP 255 0 RD 20.1.1.2 VXLAN // Type 2 IRB route
192.168.1.254/32 Direct 0 0 D 127.0.0.1 Vbdif10
192.168.1.255/32 Direct 0 0 D 127.0.0.1 Vbdif10
192.168.2.0/24 Direct 0 0 D 192.168.2.254 Vbdif20
192.168.2.254/32 Direct 0 0 D 127.0.0.1 Vbdif20
192.168.2.255/32 Direct 0 0 D 127.0.0.1 Vbdif20
255.255.255.255/32 Direct 0 0 D 127.0.0.1 InLoopBack0
3) View the EVPN routes on Leaf1:
[~Leaf-1]display bgp evpn all routing-table
Local AS number : 100
BGP Local router ID is 10.1.1.1
Status codes: * - valid, > - best, d - damped, x - best external, a - add path,
h - history, i - internal, s - suppressed, S - Stale
Origin : i - IGP, e - EGP, ? - incomplete
EVPN address family:
Number of Mac Routes: 4
Route Distinguisher: 10:1
Network(EthTagId/MacAddrLen/MacAddr/IpAddrLen/IpAddr) NextHop
*> 0:48:0001-0001-0001:0:0.0.0.0 0.0.0.0
*> 0:48:5489-9805-6139:32:192.168.1.1 0.0.0.0
Route Distinguisher: 20:1
Network(EthTagId/MacAddrLen/MacAddr/IpAddrLen/IpAddr) NextHop
*> 0:48:0002-0002-0002:0:0.0.0.0 0.0.0.0
*> 0:48:5489-9897-18ac:32:192.168.2.1 0.0.0.0
EVPN-Instance 10:
Number of Mac Routes: 2
Network(EthTagId/MacAddrLen/MacAddr/IpAddrLen/IpAddr) NextHop
*> 0:48:0001-0001-0001:0:0.0.0.0 0.0.0.0
*> 0:48:5489-9805-6139:32:192.168.1.1 0.0.0.0
EVPN-Instance 20:
Number of Mac Routes: 2
Network(EthTagId/MacAddrLen/MacAddr/IpAddrLen/IpAddr) NextHop
*> 0:48:0002-0002-0002:0:0.0.0.0 0.0.0.0
*> 0:48:5489-9897-18ac:32:192.168.2.1 0.0.0.0
EVPN address family:
Number of Inclusive Multicast Routes: 4
Route Distinguisher: 10:1
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
Route Distinguisher: 20:1
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
EVPN-Instance 10:
Number of Inclusive Multicast Routes: 2
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
EVPN-Instance 20:
Number of Inclusive Multicast Routes: 2
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
EVPN address family:
Number of Ip Prefix Routes: 2
Route Distinguisher: 100:1
Network(EthTagId/IpPrefix/IpPrefixLen) NextHop
*>i 0:10.1.36.0:24 20.1.1.3
*>i 0:177.1.1.1:32 20.1.1.3
EVPN-Instance __vni100__:
Number of Ip Prefix Routes: 2
Network(EthTagId/IpPrefix/IpPrefixLen) NextHop
*>i 0:10.1.36.0:24 20.1.1.3
*>i 0:177.1.1.1:32 20.1.1.3
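Four Type 3 routes, originated by Leaf1 and Leaf2 for VNI 10 and VNI 20.
Four Type 2 routes: two MAC routes and two routes generated from ARP entries.
Two Type 5 routes, which are the OSPF routes imported on Leaf3.
4) The two Type 5 routes are injected from the L3VPN: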
[~Leaf-1]display bgp vpnv4 all routing-table
BGP Local router ID is 10.1.1.1
Status codes: * - valid, > - best, d - damped, h - history,
i - internal, s - suppressed, S - Stale
Origin : i - IGP, e - EGP, ? - incomplete
Total number of routes from all PE: 4
Route Distinguisher: 100:1
Network NextHop MED LocPrf PrefVal Path/Ogn
*> 192.168.1.0 0.0.0.0 0 0 ?
*> 192.168.1.254/32 0.0.0.0 0 0 ?
*> 192.168.2.0 0.0.0.0 0 0 ?
*> 192.168.2.254/32 0.0.0.0 0 0 ?
VPN-Instance vpn1, Router ID 10.1.1.1:
Total Number of Routes: 4
Network NextHop MED LocPrf PrefVal Path/Ogn
*> 192.168.1.0 0.0.0.0 0 0 ?
*> 192.168.1.254/32 0.0.0.0 0 0 ?
*> 192.168.2.0 0.0.0.0 0 0 ?
*> 192.168.2.254/32 0.0.0.0 0 0 ?
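To see how an individual prefix route was learned, view its details (for example 0:177.1.1.1:32). The output shows that the route was received from 10.1.1.4 with Originator 10.1.1.3 and appears in the __vni100__ instance as a "Remote-Cross route":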
[~Leaf-1]display bgp evpn all routing-table prefix-route 0:177.1.1.1:32
BGP local router ID : 10.1.1.1
Local AS number : 100
Total routes of Route Distinguisher(100:1): 1
BGP routing table entry information of 0:177.1.1.1:32:
Label information (Received/Applied): 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h18m18s
Relay IP Nexthop: 10.1.14.4
Relay Tunnel Out-Interface:
Original nexthop: 20.1.1.3
Qos information : 0x0
Ext-Community:RT <100 : 1>, RT <1000 : 1>, OSPF DOMAIN ID <0.0.0.0 : 0>, OSPF ROUTER ID <10.1.36.3 : 0>, OSPF RT <0.0.0.0 : 0 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e8e6-5d85>
AS-path Nil, origin incomplete, MED 2, localpref 100, pref-val 0, valid, internal, best, select, pre 255, IGP cost 2
Originator: 10.1.1.3
Cluster list: 10.1.1.4
Route Type: 5 (Ip Prefix Route)
Ethernet Tag ID: 0, IP Prefix/Len: 177.1.1.1/32, ESI: 0000.0000.0000.0000.0000, GW IP Address: 0.0.0.0
Not advertised to any peer yet
EVPN-Instance __vni100__:
Number of Ip Prefix Routes: 1
BGP routing table entry information of 0:177.1.1.1:32:
Route Distinguisher: 100:1
Remote-Cross route
Label information (Received/Applied): 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h18m18s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.3
Qos information : 0x0
Ext-Community:RT <100 : 1>, RT <1000 : 1>, OSPF DOMAIN ID <0.0.0.0 : 0>, OSPF ROUTER ID <10.1.36.3 : 0>, OSPF RT <0.0.0.0 : 0 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e8e6-5d85>
AS-path Nil, origin incomplete, MED 2, localpref 100, pref-val 0, valid, internal, best, select, pre 255
Originator: 10.1.1.3
Cluster list: 10.1.1.4
Route Type: 5 (Ip Prefix Route)
Ethernet Tag ID: 0, IP Prefix/Len: 177.1.1.1/32, ESI: 0000.0000.0000.0000.0000, GW IP Address: 0.0.0.0
Not advertised to any peer yet
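5) On Leaf1, capture packets on the interface connected to Spine1 and inspect the BGP Update messages.
5.2.2 Type 3 routes
First, BGP EVPN peering is established between Leaf1 and Leaf2. Next, a Layer 2 broadcast domain (BD) is created on each of them and bound to a Layer 2 VNI. An EVPN instance is then created under the BD, and its RD, export VPN target (ERT), and import VPN target (IRT) are configured. Once the local VTEP IP address is configured, Leaf1 and Leaf2 each generate a BGP EVPN route and send it to the peer. The route carries the export VPN target of the local EVPN instance and the Type 3 route newly defined by BGP EVPN, the Inclusive Multicast route. An Inclusive Multicast route consists of a prefix and a PMSI attribute: the VTEP IP address is stored in the Originating Router's IP Address field of the prefix, and the Layer 2 VNI is stored in the MPLS Label field of the PMSI attribute. The VTEP IP address is also placed in the next-hop attribute.
a. Packet capture of the BGP Update with Type 3 routes sent from Leaf1 to Spine1: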
Frame 483: 510 bytes on wire (4080 bits), 510 bytes captured (4080 bits) on interface 0
Ethernet II, Src: 4e:01:00:11:01:00 (4e:01:00:11:01:00), Dst: 4e:01:00:41:01:01 (4e:01:00:41:01:01)
Internet Protocol Version 4, Src: 10.1.1.1, Dst: 10.1.1.4
Transmission Control Protocol, Src Port: 50046, Dst Port: 179, Seq: 65, Ack: 65, Len: 456
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 108
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 85
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - LOCAL_PREF: 100
Path Attribute - EXTENDED_COMMUNITIES
Flags: 0xc0, Optional, Transitive: Optional, Transitive, Complete
Type Code: EXTENDED_COMMUNITIES (16)
Length: 24
Carried extended communities: (3 communities)
Community Transitive Two-Octet AS Route Target: 10:1
Community Transitive Two-Octet AS Route Target: 1000:1
Community Transitive Opaque Encapsulation: VXLAN Encapsulation
Path Attribute - PMSI_TUNNEL_ATTRIBUTE
Flags: 0xc0, Optional, Transitive: Optional, Transitive, Complete
Type Code: PMSI_TUNNEL_ATTRIBUTE (22)
Length: 9
Flags: 0
Tunnel Type: Ingress Replication (6)
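0000 0000 0000 0000 0000 .... = MPLS Label: 0 // The full 24-bit field is 0x00000a, which is decimal 10, so the L2VNI is 10. Wireshark shows only the upper 20 bits as the label, hence 0.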
Tunnel ID: tunnel end point -> 20.1.1.1
Tunnel type ingress replication IP end point: 20.1.1.1
Path Attribute - MP_REACH_NLRI
Flags: 0x90, Optional, Length: Optional, Non-transitive, Complete, Extended Length
Type Code: MP_REACH_NLRI (14)
Length: 28
Address family identifier (AFI): Layer-2 VPN (25)
Subsequent address family identifier (SAFI): EVPN (70)
Next hop network address (4 bytes) // The VTEP IP address is also placed in the next-hop attribute, that is, 20.1.1.1
Number of Subnetwork points of attachment (SNPA): 0
Network layer reachability information (19 bytes)
EVPN NLRI: Inclusive Multicast Route
AFI: Inclusive Multicast Route (3)
Length: 17
Route Distinguisher: 0000000a00000001 (10:1)
Ethernet Tag ID: 0
IP Address Length: 32
IPv4 address: 20.1.1.1 // Originating Router's IP Address is 20.1.1.1
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 108
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 85
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - LOCAL_PREF: 100
Path Attribute - EXTENDED_COMMUNITIES
Flags: 0xc0, Optional, Transitive: Optional, Transitive, Complete
Type Code: EXTENDED_COMMUNITIES (16)
Length: 24
Carried extended communities: (3 communities)
Community Transitive Two-Octet AS Route Target: 20:1
Community Transitive Two-Octet AS Route Target: 1000:1
Community Transitive Opaque Encapsulation: VXLAN Encapsulation
Path Attribute - PMSI_TUNNEL_ATTRIBUTE
Flags: 0xc0, Optional, Transitive: Optional, Transitive, Complete
Type Code: PMSI_TUNNEL_ATTRIBUTE (22)
Length: 9
Flags: 0
Tunnel Type: Ingress Replication (6)
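0000 0000 0000 0000 0001 .... = MPLS Label: 1 // The full 24-bit field is 0x000014, which is decimal 20, so the L2VNI is 20. Wireshark shows only the upper 20 bits as the label, hence 1.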
Tunnel ID: tunnel end point -> 20.1.1.1
Tunnel type ingress replication IP end point: 20.1.1.1
Path Attribute - MP_REACH_NLRI
Flags: 0x90, Optional, Length: Optional, Non-transitive, Complete, Extended Length
Type Code: MP_REACH_NLRI (14)
Length: 28
Address family identifier (AFI): Layer-2 VPN (25)
Subsequent address family identifier (SAFI): EVPN (70)
Next hop network address (4 bytes)
Number of Subnetwork points of attachment (SNPA): 0
Network layer reachability information (19 bytes)
EVPN NLRI: Inclusive Multicast Route
AFI: Inclusive Multicast Route (3)
Length: 17
Route Distinguisher: 0000001400000001 (20:1)
Ethernet Tag ID: 0
IP Address Length: 32
IPv4 address: 20.1.1.1 // Originating Router's IP Address is 20.1.1.1
Border Gateway Protocol - UPDATE Message
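The relationship between the VNI and the "MPLS Label" value shown by Wireshark in these captures can be reproduced with a short Python sketch. It assumes that the 3-byte label field carries the 24-bit VNI and that the dissector displays only the upper 20 bits as an MPLS label, which is what the captured values suggest. The function name `decode_label_field` is made up for this example.

```python
def decode_label_field(field: bytes) -> dict:
    """Interpret a 3-byte label field both as a VNI and as an MPLS label."""
    if len(field) != 3:
        raise ValueError("the label field must be exactly 3 bytes")
    value = int.from_bytes(field, "big")
    return {
        "vni": value,             # VXLAN reading: all 24 bits
        "mpls_label": value >> 4,  # MPLS reading: upper 20 bits only
    }


if __name__ == "__main__":
    for vni in (10, 20, 100):
        raw = vni.to_bytes(3, "big")
        print(raw.hex(), decode_label_field(raw))
```

For the L2VNIs 10 and 20 the MPLS reading is 0 and 1, as displayed in the captures above. For the L3VNI 100 (0x000064) it is 6, which corresponds to the "MPLS Label Stack: 6" line in the Type 5 capture in section 5.2.3.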
b. Packet capture of the BGP Update with Type 3 routes sent from Spine1 to Leaf1:
Frame 487: 1012 bytes on wire (8096 bits), 1012 bytes captured (8096 bits) on interface 0
Ethernet II, Src: 4e:01:00:41:01:01 (4e:01:00:41:01:01), Dst: 4e:01:00:11:01:00 (4e:01:00:11:01:00)
Internet Protocol Version 4, Src: 10.1.1.4, Dst: 10.1.1.1
Transmission Control Protocol, Src Port: 179, Dst Port: 50046, Seq: 577, Ack: 551, Len: 958
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 122
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 99
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - LOCAL_PREF: 100
Path Attribute - ORIGINATOR_ID: 10.1.1.2
Path Attribute - CLUSTER_LIST: 10.1.1.4
Path Attribute - EXTENDED_COMMUNITIES
Flags: 0xc0, Optional, Transitive: Optional, Transitive, Complete
Type Code: EXTENDED_COMMUNITIES (16)
Length: 24
Carried extended communities: (3 communities)
Community Transitive Two-Octet AS Route Target: 10:1
Community Transitive Two-Octet AS Route Target: 1000:1
Community Transitive Opaque Encapsulation: VXLAN Encapsulation
Path Attribute - PMSI_TUNNEL_ATTRIBUTE
Path Attribute - MP_REACH_NLRI
Flags: 0x90, Optional, Length: Optional, Non-transitive, Complete, Extended Length
Type Code: MP_REACH_NLRI (14)
Length: 28
Address family identifier (AFI): Layer-2 VPN (25)
Subsequent address family identifier (SAFI): EVPN (70)
Next hop network address (4 bytes)
Number of Subnetwork points of attachment (SNPA): 0
Network layer reachability information (19 bytes)
EVPN NLRI: Inclusive Multicast Route
AFI: Inclusive Multicast Route (3)
Length: 17
Route Distinguisher: 0000000a00000001 (10:1)
Ethernet Tag ID: 0
IP Address Length: 32
IPv4 address: 20.1.1.2
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 122
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 99
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - LOCAL_PREF: 100
Path Attribute - ORIGINATOR_ID: 10.1.1.2
Path Attribute - CLUSTER_LIST: 10.1.1.4
Path Attribute - EXTENDED_COMMUNITIES
Path Attribute - PMSI_TUNNEL_ATTRIBUTE
Path Attribute - MP_REACH_NLRI
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
These Type 3 routes were originated by Leaf2, triggered by VNI 10 and VNI 20.
c. View the BGP EVPN Type 3 routes on Leaf1:
[~Leaf-1]display bgp evpn all routing-table inclusive-route
Local AS number : 100
BGP Local router ID is 10.1.1.1
Status codes: * - valid, > - best, d - damped, x - best external, a - add path,
h - history, i - internal, s - suppressed, S - Stale
Origin : i - IGP, e - EGP, ? - incomplete
EVPN address family:
Number of Inclusive Multicast Routes: 4
Route Distinguisher: 10:1
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
Route Distinguisher: 20:1
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
EVPN-Instance 10:
Number of Inclusive Multicast Routes: 2
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
EVPN-Instance 20:
Number of Inclusive Multicast Routes: 2
Network(EthTagId/IpAddrLen/OriginalIp) NextHop
*> 0:32:20.1.1.1 0.0.0.0
*>i 0:32:20.1.1.2 20.1.1.2
[~Leaf-1]
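There are four Type 3 routes in total. Two were originated by Leaf1 for EVPN-Instance 10 and EVPN-Instance 20 (the entries with NextHop 0.0.0.0), and two were originated by Leaf2 for EVPN-Instance 10 and EVPN-Instance 20 (the entries with NextHop 20.1.1.2).
Command for viewing a specific Type 3 route: [~Leaf-1]display bgp evpn all routing-table inclusive-route 0:32:20.1.1.1
After Leaf1 and Leaf2 receive the BGP EVPN route from the peer, each first checks the export VPN target carried in the route. If it equals the import VPN target of the local EVPN instance, the route is accepted; otherwise it is discarded. After accepting the route, Leaf1 and Leaf2 extract the peer's VTEP IP address (from the next-hop attribute) and the Layer 2 VNI. If the peer's VTEP IP address is reachable at Layer 3, a VXLAN tunnel to the peer is established. The local device also creates a VNI-based head-end replication list and adds the peer's VTEP IP address to it for subsequent BUM packet forwarding.
a. Check the VXLAN peers on Leaf1: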
[~Leaf-1]display vxlan peer
Number of peers : 2
Vni ID Source Destination Type
--------------------------------------------------------------
10 20.1.1.1 20.1.1.2 dynamic
20 20.1.1.1 20.1.1.2 dynamic
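The type is dynamic, meaning the peer was established through BGP EVPN.
This is the head-end replication list.
b. Check the VXLAN tunnels on Leaf1. Leaf1 has established VXLAN tunnels to both Leaf2 and Leaf3: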
[~Leaf-1]display vxlan tunnel
Number of vxlan tunnel : 2
Tunnel ID Source Destination State Type
--------------------------------------------------------------
4026531843 20.1.1.1 20.1.1.2 up dynamic
4026531844 20.1.1.1 20.1.1.3 up dynamic
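Note that because Leaf3 has no L2VNI configured, there is no head-end replication list for BUM traffic between Leaf1 and Leaf3 (that is, no VXLAN peer).
The tunnel can be understood as the object onto which unicast traffic is recursed for VXLAN encapsulation.
The VXLAN tunnel to Leaf3 is in fact an L3 tunnel established through Type 5 routes. The packet capture shows no Type 3 route originated by Leaf3.
VPN target is a BGP extended community attribute. An EVPN instance can be configured with export and import VPN targets, and the VPN targets of the EVPN instances at the two ends must match (the export VPN target of the local EVPN instance must equal the import VPN target of the remote EVPN instance) for EVPN routes to be exchanged; otherwise the VXLAN tunnel cannot be established. If only one end matches and accepts routes, that end can establish a tunnel to the other end, but data packets cannot be delivered: when the other end receives a packet, it checks whether it has a VXLAN tunnel to the sender and discards the packet if it does not.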
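The two-way matching requirement can be expressed as a small Python sketch. It is a simplified model under stated assumptions: each end is described only by its export and import RT sets, underlay reachability is taken for granted, and the names `accepts` and `tunnel_view` are invented for this example. The RT 99:1 in the second case is a hypothetical mismatching value, not part of the lab.

```python
def accepts(export_rts, import_rts):
    """True if a route exported with export_rts passes an import filter."""
    return bool(set(export_rts) & set(import_rts))


def tunnel_view(a, b):
    """Summarise which end can build a tunnel and whether traffic can flow.

    a and b are dicts with "export" and "import" RT sets.
    """
    a_learns_b = accepts(b["export"], a["import"])
    b_learns_a = accepts(a["export"], b["import"])
    return {
        "a_has_tunnel": a_learns_b,
        "b_has_tunnel": b_learns_a,
        # Data is delivered only if both ends have a tunnel to each other.
        "traffic_ok": a_learns_b and b_learns_a,
    }


if __name__ == "__main__":
    # BD 10 on Leaf1 and Leaf2 as configured in section 4.4.
    leaf1 = {"export": {"10:1", "1000:1"}, "import": {"10:1"}}
    leaf2 = {"export": {"10:1", "1000:1"}, "import": {"10:1"}}
    print(tunnel_view(leaf1, leaf2))

    # Hypothetical misconfiguration: the remote end imports a different RT.
    broken = {"export": {"10:1", "1000:1"}, "import": {"99:1"}}
    print(tunnel_view(leaf1, broken))
```

In the second case only the first device builds a tunnel, so `traffic_ok` is False, which mirrors the one-sided situation described in the paragraph above.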
5.2.3 Type 5 routes
Packet capture of the BGP Update with Type 5 routes sent from Spine1 to Leaf1 (in this lab the routes are originated by Leaf3):
Frame 31: 768 bytes on wire (6144 bits), 768 bytes captured (6144 bits) on interface 0
Ethernet II, Src: 4e:01:00:11:01:01 (4e:01:00:11:01:01), Dst: 4e:01:00:21:01:00 (4e:01:00:21:01:00)
Internet Protocol Version 4, Src: 10.1.1.4, Dst: 10.1.1.1
Transmission Control Protocol, Src Port: 179, Dst Port: 65055, Seq: 65, Ack: 65, Len: 714
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
[Malformed Packet: BGP]
[Expert Info (Error/Malformed): Malformed Packet (Exception occurred)]
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
[Malformed Packet: BGP]
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 202
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 179
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - MULTI_EXIT_DISC: 2
Path Attribute - LOCAL_PREF: 100
Path Attribute - ORIGINATOR_ID: 10.1.1.3
Path Attribute - CLUSTER_LIST: 10.1.1.4
Path Attribute - EXTENDED_COMMUNITIES
Flags: 0xc0, Optional, Transitive, Complete
Type Code: EXTENDED_COMMUNITIES (16)
Length: 56
Carried extended communities: (7 communities)
Route Target: 100:1 [Transitive 2-Octet AS-Specific] // vpn-target 100:1 export-extcommunity on Leaf3
Route Target: 1000:1 [Transitive 2-Octet AS-Specific] // vpn-target 1000:1 export-extcommunity evpn on Leaf3
OSPF Domain Identifier: 0:0 [Transitive 2-Octet AS-Specific]
OSPF Router ID: 10.1.36.3:0 [Transitive IPv4-Address-Specific]
OSPF Route Type: Area: 0.0.0.0, Type: Router [Transitive Opaque]
Encapsulation: VXLAN Encapsulation [Transitive Opaque]
Unknown subtype 0x03: 0x707b 0xe8e6 0x5d85 [Transitive EVPN] // Router's MAC: the MAC address of the VTEP on Leaf3 (20.1.1.3), 707b-e8e6-5d85
Path Attribute - MP_REACH_NLRI
Flags: 0x90, Optional, Extended-Length, Non-transitive, Complete
Type Code: MP_REACH_NLRI (14)
Length: 81
Address family identifier (AFI): Layer-2 VPN (25)
Subsequent address family identifier (SAFI): EVPN (70)
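Next hop network address (4 bytes) // Possibly because of the Wireshark version, the value is not decoded here; it is actually the IP address of Leaf3's VTEP, in hexadecimal.
// When Leaf1 receives this Type 5 route, it sees Leaf3's VTEP address, finds it reachable, and establishes a VXLAN tunnel.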
Number of Subnetwork points of attachment (SNPA): 0
Network layer reachability information (72 bytes)
EVPN NLRI: IP Prefix route
Route Type: IP Prefix route (5)
Length: 34
Route Distinguisher: 0000006400000001 (100:1)
ESI: 00:00:00:00:00:00:00:00:00:00
Ethernet Tag ID: 0
IP prefix length: 24
IPv4 address: 10.1.36.0
IPv4 Gateway address: 0.0.0.0
MPLS Label Stack: 6, (BOGUS: Bottom of Stack NOT set!)
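MPLS Label: 1006, // Possibly because of the Wireshark version, the value is not decoded directly; the field actually carries the L3VNI in hexadecimal.
// Converted to decimal the value is 100. Combining it with Leaf3's VTEP address, Leaf1 establishes the L3 VXLAN tunnel.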
EVPN NLRI: IP Prefix route
Route Type: IP Prefix route (5)
Length: 34
Route Distinguisher: 0000006400000001 (100:1)
ESI: 00:00:00:00:00:00:00:00:00:00
Ethernet Tag ID: 0
IP prefix length: 32
IPv4 address: 177.1.1.1
IPv4 Gateway address: 0.0.0.0
MPLS Label Stack: 6, (BOGUS: Bottom of Stack NOT set!)
MPLS Label: 1006,
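The "MPLS Label Stack: 6" lines in the capture above are consistent with Wireshark reading the 3-byte label field as an MPLS label stack entry instead of a 24-bit VNI. The Python sketch below shows both readings of the same bytes. It assumes the field holds 00 00 64 (L3VNI 100); those bytes are an assumption and were not copied from the capture, and decode_label_field is a hypothetical helper name.

```python
def decode_label_field(raw: bytes) -> dict:
    """Interpret a 3-byte EVPN label field as a VNI and as an MPLS entry."""
    if len(raw) != 3:
        raise ValueError("the label field is exactly 3 bytes")
    value = int.from_bytes(raw, "big")
    return {
        "vni": value,                          # VXLAN reading: all 24 bits
        "mpls_label": value >> 4,              # MPLS reading: top 20 bits
        "traffic_class": (value >> 1) & 0x7,   # MPLS reading: next 3 bits
        "bottom_of_stack": bool(value & 0x1),  # MPLS reading: last bit
    }


# Assumed wire bytes for L3VNI 100 (0x64).
fields = decode_label_field(bytes.fromhex("000064"))
print(fields)
```

Under that assumption the MPLS reading gives label 6 with the bottom-of-stack bit clear, which would explain the "BOGUS: Bottom of Stack NOT set!" warning, while the VNI reading gives 100.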
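Notes:
Type 5 carries the export RT values of the L3VPN instance: 100:1 and 1000:1 (evpn);
Type 5 carries only the L3VNI: 100;
Type 5 carries the Next hop network address, which is the IP address of Leaf3's VTEP;
Type 5 carries the Router's MAC, which is the MAC address of Leaf3's NVE (VTEP) interface: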
<Leaf-3>display interface Nve1
Nve1 current state : UP (ifindex: 17)
Line protocol current state : UP
Description:
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 707b-e8e6-5d85
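As a prefix route, Type 5 does not carry an L2VNI. The receiver matches the RT value 100:1 of the ordinary IP VPN instance against its local IP VPN instance and then generates the routing entry.
If Leaf1 and Leaf2 also advertise IP prefix routes for their directly connected subnets 192.168.1.0/24 and 192.168.2.0/24, they act as distributed gateways for those two subnets and send each other identical 192.168.1.0/24 and 192.168.2.0/24 routes. The details are not shown here.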
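The "Unknown subtype 0x03" line in the capture is the Router's MAC extended community, which this Wireshark version does not decode. As a rough sketch, the 8-byte community can be unpacked as below. The type byte 0x06 (EVPN) and sub-type 0x03 follow the EVPN specifications; the function name and the Huawei-style output formatting are illustrative choices, and the sample bytes are rebuilt from the capture line rather than exported from it.

```python
EVPN_TYPE = 0x06               # extended community type: EVPN
ROUTER_MAC_SUBTYPE = 0x03      # sub-type: Router's MAC


def parse_router_mac(community: bytes) -> str:
    """Return the Router's MAC in xxxx-xxxx-xxxx form."""
    if len(community) != 8:
        raise ValueError("an extended community is 8 bytes long")
    if community[0] != EVPN_TYPE or community[1] != ROUTER_MAC_SUBTYPE:
        raise ValueError("not a Router's MAC extended community")
    mac_hex = community[2:].hex()   # remaining 6 bytes are the MAC
    return "-".join(mac_hex[i:i + 4] for i in range(0, 12, 4))


# Type and sub-type bytes followed by the three words shown in the capture.
sample = bytes.fromhex("0603" "707b" "e8e6" "5d85")
print(parse_router_mac(sample))
```

The result can be compared with the hardware address reported by display interface Nve1 on Leaf3.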
View the BGP EVPN Type 5 routing table on Leaf1:
<Leaf-1>display bgp evpn all routing-table prefix-route
Local AS number : 100
BGP Local router ID is 10.1.1.1
Status codes: * - valid, > - best, d - damped, x - best external, a - add path,
h - history, i - internal, s - suppressed, S - Stale
Origin : i - IGP, e - EGP, ? - incomplete
EVPN address family:
Number of Ip Prefix Routes: 2
Route Distinguisher: 100:1
Network(EthTagId/IpPrefix/IpPrefixLen) NextHop
*>i 0:10.1.36.0:24 20.1.1.3
*>i 0:177.1.1.1:32 20.1.1.3
EVPN-Instance __vni100__:
Number of Ip Prefix Routes: 2
Network(EthTagId/IpPrefix/IpPrefixLen) NextHop
*>i 0:10.1.36.0:24 20.1.1.3
*>i 0:177.1.1.1:32 20.1.1.3
View the details of a specific BGP EVPN Type 5 route, for example 0:177.1.1.1:32:
<Leaf-1>display bgp evpn all routing-table prefix-route 0:177.1.1.1:32
BGP local router ID : 10.1.1.1
Local AS number : 100
Total routes of Route Distinguisher(100:1): 1
BGP routing table entry information of 0:177.1.1.1:32:
Label information (Received/Applied): 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h24m23s
Relay IP Nexthop: 10.1.14.4
Relay Tunnel Out-Interface:
Original nexthop: 20.1.1.3
Qos information : 0x0
Ext-Community:RT <100 : 1>, RT <1000 : 1>, OSPF DOMAIN ID <0.0.0.0 : 0>, OSPF ROUTER ID <10.1.36.3 : 0>, OSPF RT <0.0.0.0 : 0 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e8e6-5d85>
AS-path Nil, origin incomplete, MED 2, localpref 100, pref-val 0, valid, internal, best, select, pre 255, IGP cost 2
Originator: 10.1.1.3
Cluster list: 10.1.1.4
Route Type: 5 (Ip Prefix Route)
Ethernet Tag ID: 0, IP Prefix/Len: 177.1.1.1/32, ESI: 0000.0000.0000.0000.0000, GW IP Address: 0.0.0.0
Not advertised to any peer yet
EVPN-Instance __vni100__:
Number of Ip Prefix Routes: 1
BGP routing table entry information of 0:177.1.1.1:32:
Route Distinguisher: 100:1
Remote-Cross route
Label information (Received/Applied): 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h24m23s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.3
Qos information : 0x0
Ext-Community:RT <100 : 1>, RT <1000 : 1>, OSPF DOMAIN ID <0.0.0.0 : 0>, OSPF ROUTER ID <10.1.36.3 : 0>, OSPF RT <0.0.0.0 : 0 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e8e6-5d85>
AS-path Nil, origin incomplete, MED 2, localpref 100, pref-val 0, valid, internal, best, select, pre 255
Originator: 10.1.1.3
Cluster list: 10.1.1.4
Route Type: 5 (Ip Prefix Route)
Ethernet Tag ID: 0, IP Prefix/Len: 177.1.1.1/32, ESI: 0000.0000.0000.0000.0000, GW IP Address: 0.0.0.0
Not advertised to any peer yet
The route to 177.1.1.1/32 behind Leaf3 has the VTEP address 20.1.1.3 as its next hop and uses VXLAN encapsulation.
View the tunnel associated with the route:
<Leaf-1>display ip routing-table vpn-instance vpn1 177.1.1.1 verbose
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : vpn1
Summary Count : 1
Destination: 177.1.1.1/32
Protocol: IBGP Process ID: 0
Preference: 255 Cost: 2
NextHop: 20.1.1.3 Neighbour: 0.0.0.0
State: Active Adv Relied Age: 00h29m42s
Tag: 0 Priority: low
Label: NULL QoSInfo: 0x0
IndirectID: 0xAB00009C
RelayNextHop: 0.0.0.0 Interface: VXLAN
TunnelID: 0x0000000027f000000e Flags: RD
<Leaf-1>display tunnel all
Tunnel ID Type Destination Status
-----------------------------------------------------------------------------
0x0000000027f000000d vxlan_nvo3 20.1.1.2 UP
0x0000000027f000000e vxlan_nvo3 20.1.1.3 Up
5.2.4 Type 2 routes
Type 2 routes come in three kinds: MAC routes, ARP routes and IRB routes, as shown below.
Capture of the MAC route sent from Leaf1 to Spine1:
Frame 28: 510 bytes on wire (4080 bits), 510 bytes captured (4080 bits) on interface 0
Ethernet II, Src: 4e:01:00:21:01:00 (4e:01:00:21:01:00), Dst: 4e:01:00:11:01:01 (4e:01:00:11:01:01)
Internet Protocol Version 4, Src: 10.1.1.1, Dst: 10.1.1.4
Transmission Control Protocol, Src Port: 65056, Dst Port: 179, Seq: 65, Ack: 65, Len: 456
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 120
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 97
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - LOCAL_PREF: 100
Path Attribute - EXTENDED_COMMUNITIES
Flags: 0xc0, Optional, Transitive, Complete
Type Code: EXTENDED_COMMUNITIES (16)
Length: 32
Carried extended communities: (4 communities)
Route Target: 10:1 [Transitive 2-Octet AS-Specific]
Route Target: 1000:1 [Transitive 2-Octet AS-Specific]
Encapsulation: VXLAN Encapsulation [Transitive Opaque]
MAC Mobility: Sticky MAC [Transitive EVPN]
Path Attribute - MP_REACH_NLRI
Flags: 0x90, Optional, Extended-Length, Non-transitive, Complete
Type Code: MP_REACH_NLRI (14)
Length: 44
Address family identifier (AFI): Layer-2 VPN (25)
Subsequent address family identifier (SAFI): EVPN (70)
Next hop network address (4 bytes)
Number of Subnetwork points of attachment (SNPA): 0
Network layer reachability information (35 bytes)
EVPN NLRI: MAC Advertisement Route
Route Type: MAC Advertisement Route (2)
Length: 33
Route Distinguisher: 0000000a00000001 (10:1)
ESI: 00:00:00:00:00:00:00:00:00:00
Ethernet Tag ID: 0
MAC Address Length: 48
MAC Address: EquipTra_01:00:01 (00:01:00:01:00:01)
IP Address Length: 0
IP Address: NOT INCLUDED
[Expert Info (Note/Protocol): IP Address: NOT INCLUDED]
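a. This is a MAC route. The advertised MAC address is that of interface Vbdif10, on which the distributed gateway is enabled. Note that the MP_REACH_NLRI, as decoded in this capture, does not show an L2VNI value. When Leaf2 receives the route it compares the RT values carried (10:1 and 1000:1), finds that 10:1 equals the import RT (iRT) of the EVPN instance bound to its local BD 10, places the route in that EVPN instance and tags it with the L2VNI, as shown below: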
[~Leaf-2]display bgp evpn all routing-table mac 0:48:0001-0001-0001:0:0.0.0.0
BGP local router ID : 10.1.1.2
Local AS number : 100
Total routes of Route Distinguisher(10:1): 2
BGP routing table entry information of 0:48:0001-0001-0001:0:0.0.0.0:
Label information (Received/Applied): 10/NULL
From: 0.0.0.0 (0.0.0.0)
Route Duration: 0d04h04m10s
Direct Out-interface:
Original nexthop: 20.1.1.2
Qos information : 0x0
Ext-Community:RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan(8)>, Mac Mobility <flag:1 seq:0 res:0>
AS-path Nil, origin incomplete, pref-val 0, valid, local, best, select, pre 255
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 0001-0001-0001/48, IP Address/Len: 0.0.0.0/0, ESI:0000.0000.0000.0000.0000
Advertised to such 1 peers:
10.1.1.4
BGP routing table entry information of 0:48:0001-0001-0001:0:0.0.0.0:
Label information (Received/Applied): 10/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d01h04m10s
Relay IP Nexthop: 10.1.24.4
Relay Tunnel Out-Interface:
Original nexthop: 20.1.1.1
Qos information : 0x0
Ext-Community:RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan(8)>, Mac Mobility <flag:1 seq:0 res:0>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, pre 255, IGP cost 2, not preferred for route type
Originator: 10.1.1.1
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 0001-0001-0001/48, IP Address/Len: 0.0.0.0/0, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
EVPN-Instance 10:
Number of Mac Routes: 2
BGP routing table entry information of 0:48:0001-0001-0001:0:0.0.0.0:
Route Distinguisher: 10:1
Label information (Received/Applied): 10/NULL
From: 0.0.0.0 (0.0.0.0)
Route Duration: 0d04h04m12s
Direct Out-interface:
Original nexthop: 20.1.1.2
Qos information : 0x0
Ext-Community:Tunnel Type <VxLan(8)>, Mac Mobility <flag:1 seq:0 res:0>
AS-path Nil, origin incomplete, pref-val 0, valid, local, best, select, pre 255
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 0001-0001-0001/48, IP Address/Len: 0.0.0.0/0, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
BGP routing table entry information of 0:48:0001-0001-0001:0:0.0.0.0:
Route Distinguisher: 10:1
Remote-Cross route
Label information (Received/Applied): 10/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d01h04m12s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.1
Qos information : 0x0
Ext-Community:RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan(8)>, Mac Mobility <flag:1 seq:0 res:0>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, pre 255, not preferred for mac mobility
Originator: 10.1.1.1
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 0001-0001-0001/48, IP Address/Len: 0.0.0.0/0, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
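b. The MAC route sent by Leaf1 is reflected by the spine to Leaf2. In the distributed gateway scenario Leaf1 and Leaf2 use the same gateway IP address and MAC address, so Leaf2 already holds its own MAC route for 0001-0001-0001, and the local route is preferred.
ARP routes are not generated in the symmetric distributed forwarding scenario, because the configuration sets the route type advertised to the BGP EVPN peer to IRB, for example peer 10.1.1.4 advertise irb.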
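The route keys used by the display commands in this section (for example 0:48:0001-0001-0001:0:0.0.0.0) pack the Ethernet tag, MAC length, MAC, IP length and IP address into one string. The Python sketch below splits such a key and tells a MAC-only route from one that also carries a host IP. The field order is inferred from the display output in this document, the class and function names are hypothetical, and only IPv4 keys are handled.

```python
from dataclasses import dataclass


@dataclass
class Type2Key:
    eth_tag: int
    mac_len: int
    mac: str
    ip_len: int
    ip: str

    @property
    def kind(self) -> str:
        # A zero IP length means the route advertises a MAC only.
        # A non-zero length means MAC plus IP (ARP or IRB); the key
        # alone cannot tell those two apart.
        return "mac" if self.ip_len == 0 else "mac+ip"


def parse_type2_key(key: str) -> Type2Key:
    """Split 'EthTag:MacLen:MAC:IpLen:IP' (IPv4 only) into its fields."""
    eth_tag, mac_len, mac, ip_len, ip = key.split(":")
    return Type2Key(int(eth_tag), int(mac_len), mac, int(ip_len), ip)


gateway = parse_type2_key("0:48:0001-0001-0001:0:0.0.0.0")
host = parse_type2_key("0:48:5489-9805-6139:32:192.168.1.1")
print(gateway.kind, host.kind)
```

Whether a MAC plus IP route is an ARP route or an IRB route depends on the attributes it carries (L3VNI and Router's MAC), which are visible in the detailed display output rather than in the key.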
Capture of the IRB route sent from Leaf1 to Spine1:
Frame 30: 764 bytes on wire (6112 bits), 764 bytes captured (6112 bits) on interface 0
Ethernet II, Src: 4e:01:00:21:01:00 (4e:01:00:21:01:00), Dst: 4e:01:00:11:01:01 (4e:01:00:11:01:01)
Internet Protocol Version 4, Src: 10.1.1.1, Dst: 10.1.1.4
Transmission Control Protocol, Src Port: 65055, Dst Port: 179, Seq: 65, Ack: 65, Len: 710
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
[Malformed Packet: BGP]
Border Gateway Protocol - UPDATE Message
Marker: ffffffffffffffffffffffffffffffff
Length: 127
Type: UPDATE Message (2)
Withdrawn Routes Length: 0
Total Path Attribute Length: 104
Path attributes
Path Attribute - ORIGIN: INCOMPLETE
Path Attribute - AS_PATH: empty
Path Attribute - LOCAL_PREF: 100
Path Attribute - EXTENDED_COMMUNITIES
Flags: 0xc0, Optional, Transitive, Complete
Type Code: EXTENDED_COMMUNITIES (16)
Length: 32
Carried extended communities: (4 communities)
Route Target: 10:1 [Transitive 2-Octet AS-Specific]
Route Target: 1000:1 [Transitive 2-Octet AS-Specific]
Encapsulation: VXLAN Encapsulation [Transitive Opaque]
Unknown subtype 0x03: 0x707b 0xe80d 0x0337 [Transitive EVPN] // Router's MAC (707b-e80d-0337)
Path Attribute - MP_REACH_NLRI
Flags: 0x90, Optional, Extended-Length, Non-transitive, Complete
Type Code: MP_REACH_NLRI (14)
Length: 51
Address family identifier (AFI): Layer-2 VPN (25)
Subsequent address family identifier (SAFI): EVPN (70)
Next hop network address (4 bytes)
Number of Subnetwork points of attachment (SNPA): 0
Network layer reachability information (42 bytes)
EVPN NLRI: MAC Advertisement Route
Route Type: MAC Advertisement Route (2)
Length: 40
Route Distinguisher: 0000000a00000001 (10:1)
ESI: 00:00:00:00:00:00:00:00:00:00
Ethernet Tag ID: 0
MAC Address Length: 48
MAC Address: HuaweiTe_05:61:39 (54:89:98:05:61:39)
IP Address Length: 32
IPv4 address: 192.168.1.1
[Malformed Packet: BGP]
Border Gateway Protocol - UPDATE Message
Border Gateway Protocol - UPDATE Message
[Malformed Packet: BGP]
Border Gateway Protocol - UPDATE Message
[Malformed Packet: BGP]
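a. This is an IRB route. It advertises the IP and MAC addresses of vm1, which is attached to Leaf1. When vm1 communicates outward, the gateway generates the IRB route from its ARP information;
b. Leaf1 collects ARP information through the arp collect host enable command configured on the gateway and then generates the IRB route;
c. The IRB route sent by Leaf1 is reflected by the spine to Leaf2. As with the MAC route, the MP_REACH_NLRI as decoded here does not show an L2VNI value. When Leaf2 receives the route it compares the RT values carried (10:1 and 1000:1). It finds that 10:1 equals the iRT of the EVPN instance bound to its local BD 10, so it places the route in that EVPN instance and tags it locally with the L2VNI. It also finds that 1000:1 equals the iRT of its local IP VPN instance, so it places the route in that IP VPN instance and tags it locally with the L3VNI.
d. Among Type 2 routes, ARP and IRB routes differ in that an IRB route carries the L2VNI, the L3VNI and the Router's MAC. The capture does not show the L2VNI and L3VNI, but this does not affect route acceptance locally, for the reason given in step c. The details are as follows: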
[~Leaf-2]display bgp evpn all routing-table mac-route 0:48:5489-9805-6139:32:192.168.1.1
BGP local router ID : 10.1.1.2
Local AS number : 100
Total routes of Route Distinguisher(10:1): 1
BGP routing table entry information of 0:48:5489-9805-6139:32:192.168.1.1:
Label information (Received/Applied): 10 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h00m15s
Relay IP Nexthop: 10.1.24.4
Relay Tunnel Out-Interface:
Original nexthop: 20.1.1.1
Qos information : 0x0
Ext-Community:RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e80d-0337>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, pre 255, IGP cost 2
Originator: 10.1.1.1
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 5489-9805-6139/48, IP Address/Len: 192.168.1.1/32, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
EVPN-Instance 10:
Number of Mac Routes: 1
BGP routing table entry information of 0:48:5489-9805-6139:32:192.168.1.1:
Route Distinguisher: 10:1
Remote-Cross route
Label information (Received/Applied): 10 100/NULL // 10 is the L2VNI, 100 is the L3VNI
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h00m15s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.1
Qos information : 0x0
Ext-Community:RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e80d-0337>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, pre 255
Originator: 10.1.1.1
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 5489-9805-6139/48, IP Address/Len: 192.168.1.1/32, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
EVPN-Instance __vni100__:
Number of Mac Routes: 1
BGP routing table entry information of 0:48:5489-9805-6139:32:192.168.1.1:
Route Distinguisher: 10:1
Remote-Cross route
Label information (Received/Applied): 10 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h00m16s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.1
Qos information : 0x0
Ext-Community:RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan(8)>, Router's MAC <707b-e80d-0337>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, pre 255
Originator: 10.1.1.1
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 5489-9805-6139/48, IP Address/Len: 192.168.1.1/32, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
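Step c above describes how Leaf2 files a received IRB route by comparing the RTs it carries with the import RTs of its local instances. A minimal Python sketch of that comparison follows. The instance names and the import-RT sets are assumptions based on the EVPN data table of this lab and on step c (BD 10 imports 10:1, the IP VPN instance imports 1000:1 for EVPN routes); they were not read from a device.

```python
def matching_instances(route_rts, import_rts_by_instance):
    """Return the instances whose import RTs share a value with the route."""
    return [
        name
        for name, import_rts in import_rts_by_instance.items()
        if route_rts & import_rts
    ]


# Assumed import RTs on Leaf2 (hypothetical instance names).
LEAF2_IMPORTS = {
    "evpn-bd10": {"10:1"},   # Layer 2 EVPN instance bound to BD 10
    "vpn1": {"1000:1"},      # IP VPN instance, EVPN import RT
}

# IRB route for vm1 (192.168.1.1) advertised by Leaf1 from BD 10.
vm1_route = matching_instances({"10:1", "1000:1"}, LEAF2_IMPORTS)

# IRB route for vm2 (192.168.2.1) advertised by Leaf1 from BD 20.
vm2_route = matching_instances({"20:1", "1000:1"}, LEAF2_IMPORTS)

print(vm1_route, vm2_route)
```

Under these assumptions the vm1 route lands in both the BD 10 EVPN instance and the IP VPN instance, while the vm2 route lands only in the IP VPN instance because Leaf2 has no BD 20. That is what lets Leaf2 route to 192.168.2.1 through the L3VNI without holding the Layer 2 domain.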
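5.3 Ping test between servers on the same subnet under different Leafs
Operation: ping 192.168.1.2 from 192.168.1.1 and capture on the Leaf1 port facing Spine1. The round-trip path is VM1-vSwitch1-Leaf1-Spine1-Leaf2-vSwitch2-VM3.
When vm1 communicates with vm3 for the first time, it sends an ARP request to learn vm3's MAC address.
From the source MAC, Leaf1 learns the mapping between vm1's MAC address, the BD ID (Layer 2 broadcast domain identifier) and the inbound interface, and creates a MAC entry for vm1 in its local MAC table with outbound interface GE1/0/1.10.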
[~Leaf-1]display mac-address dynamic
Flags: * - Backup
BD : bridge-domain Age : dynamic MAC learned time in seconds
-------------------------------------------------------------------------------
MAC Address VLAN/VSI/BD Learned-From Type Age
-------------------------------------------------------------------------------
5489-9805-6139 -/-/10 GE1/0/1.10 dynamic -
-------------------------------------------------------------------------------
Total items: 1
[~Leaf-1]
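The VTEP on Leaf1 uses the Layer 2 broadcast domain to obtain the head-end replication tunnel list for VNI 10, replicates the frame once per tunnel in that list and applies VXLAN encapsulation. It then forwards the encapsulated packets out of the outbound interface.
a. The head-end replication tunnel list for VNI 10 on Leaf1 is as follows: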
[~Leaf-1]display vxlan peer
Number of peers : 2
Vni ID Source Destination Type Out Vni ID
-------------------------------------------------------------------------------
10 20.1.1.1 20.1.1.2 dynamic 10
20 20.1.1.1 20.1.1.2 dynamic 20
[~Leaf-1]
b. The encapsulated packet is as follows:
Frame 346: 110 bytes on wire (880 bits), 110 bytes captured (880 bits) on interface 0
Ethernet II, Src: 38:42:a9:01:01:00 (38:42:a9:01:01:00), Dst: 38:42:a9:04:01:01 (38:42:a9:04:01:01)
Internet Protocol Version 4, Src: 20.1.1.1, Dst: 20.1.1.2
User Datagram Protocol, Src Port: 4789, Dst Port: 4789
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
Group Policy ID: 0
VXLAN Network Identifier (VNI): 10
Reserved: 0
Ethernet II, Src: HuaweiTe_05:61:39 (54:89:98:05:61:39), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
Address Resolution Protocol (request)
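The VXLAN header in the capture above is 8 bytes long: one flags byte with the "VNI present" bit set, three reserved bytes, the 24-bit VNI and a final reserved byte. The Python sketch below builds and parses that header with the standard struct module. The function names are hypothetical; the layout follows the VXLAN specification (RFC 7348). Wireshark shows the flags as 0x0800 because it displays the first 16 bits together.

```python
import struct

VNI_PRESENT_FLAG = 0x08  # the "I" bit in the first header byte


def build_vxlan_header(vni: int) -> bytes:
    """Pack an 8-byte VXLAN header for the given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags byte + 24 reserved bits. Word 2: VNI + 8 reserved bits.
    return struct.pack("!II", VNI_PRESENT_FLAG << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Return the VNI from an 8-byte VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header)
    if not (flags_word >> 24) & VNI_PRESENT_FLAG:
        raise ValueError("VNI-present flag is not set")
    return vni_word >> 8


l2_header = build_vxlan_header(10)    # L2VNI used for same-subnet traffic
l3_header = build_vxlan_header(100)   # L3VNI used for routed traffic
print(l2_header.hex(), l3_header.hex())
```

The same header format carries both the L2VNI (10) seen in this section and the L3VNI (100) seen in the routed captures; only the VNI value differs.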
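After the VTEP on Leaf2 receives the VXLAN packet, it validates the packet based on the UDP destination port, the source and destination IP addresses, and the VNI. It then uses the VNI to find the corresponding Layer 2 broadcast domain, removes the VXLAN encapsulation and obtains the inner Layer 2 frame. At the same time it learns the mapping between vm1's MAC address, the BD ID (Layer 2 broadcast domain identifier) and the VTEP through which the packet arrived, and creates the MAC forwarding entry shown below: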
[~Leaf-2-GE1/0/1.20]display mac-address dynamic
Flags: * - Backup
BD : bridge-domain Age : dynamic MAC learned time in seconds
-------------------------------------------------------------------------------
MAC Address VLAN/VSI/BD Learned-From Type Age
-------------------------------------------------------------------------------
5489-9805-6139 -/-/10 20.1.1.1 dynamic -
-------------------------------------------------------------------------------
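Leaf2 checks the destination MAC of the inner Layer 2 frame, finds that it is a BUM MAC, and broadcasts the frame on the non-VXLAN-tunnel side of the corresponding Layer 2 broadcast domain. That is, Leaf2 finds all outbound interfaces on the non-tunnel side (GE1/0/1.20) together with their encapsulation information, adds VLAN tag 30 to the frame and forwards it to the terminal vm3.
vm3 receives the ARP request, finds that it asks for its own MAC address, and answers with a unicast ARP reply whose destination MAC is 5489-9805-6139.
When Leaf2 receives the ARP reply, it learns from the source MAC the mapping between vm3's MAC address, the BD ID (Layer 2 broadcast domain identifier) and the inbound interface, and creates a MAC entry for vm3 in its local MAC table with outbound interface GE1/0/1.20.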
<Leaf-2>display mac-address dynamic
Flags: * - Backup
BD : bridge-domain Age : dynamic MAC learned time in seconds
-------------------------------------------------------------------------------
MAC Address VLAN/VSI/BD Learned-From Type Age
-------------------------------------------------------------------------------
5489-9805-6139 -/-/10 20.1.1.1 dynamic -
5489-988b-4f15 -/-/10 GE1/0/1.20 dynamic -
-------------------------------------------------------------------------------
Total items: 2
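Leaf2 also looks up its MAC table, finds that the outbound interface for the destination MAC is 20.1.1.1 with VNI 10, and then consults the VXLAN tunnel information to encapsulate the frame.
a. The VXLAN tunnel information on Leaf2 is as follows: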
<Leaf-2>display vxlan tunnel
Number of vxlan tunnel : 2
Tunnel ID Source Destination State Type Uptime
-----------------------------------------------------------------------------------
4026531841 20.1.1.2 20.1.1.3 up dynamic 00:41:53
4026531842 20.1.1.2 20.1.1.1 up dynamic 00:41:51
<Leaf-2>
b. The encapsulated packet is as follows:
Frame 347: 110 bytes on wire (880 bits), 110 bytes captured (880 bits) on interface 0
Ethernet II, Src: 38:42:a9:04:01:01 (38:42:a9:04:01:01), Dst: 38:42:a9:01:01:00 (38:42:a9:01:01:00)
Internet Protocol Version 4, Src: 20.1.1.2, Dst: 20.1.1.1
User Datagram Protocol, Src Port: 4789, Dst Port: 4789
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
Group Policy ID: 0
VXLAN Network Identifier (VNI): 10
Reserved: 0
Ethernet II, Src: HuaweiTe_8b:4f:15 (54:89:98:8b:4f:15), Dst: HuaweiTe_05:61:39 (54:89:98:05:61:39)
Address Resolution Protocol (reply)
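After the VTEP on Leaf1 receives the VXLAN packet, it validates the packet based on the UDP destination port, the source and destination IP addresses, and the VNI. It then uses the VNI to find the corresponding Layer 2 broadcast domain, removes the VXLAN encapsulation and obtains the inner Layer 2 frame. At the same time it learns the mapping between vm3's MAC address, the BD ID (Layer 2 broadcast domain identifier) and the VTEP through which the packet arrived, and creates the MAC forwarding entry shown below: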
[~Leaf-1]display mac-address dynamic
Flags: * - Backup
BD : bridge-domain Age : dynamic MAC learned time in seconds
-------------------------------------------------------------------------------
MAC Address VLAN/VSI/BD Learned-From Type Age
-------------------------------------------------------------------------------
5489-988b-4f15 -/-/10 20.1.1.2 dynamic -
5489-9805-6139 -/-/10 GE1/0/1.10 dynamic -
-------------------------------------------------------------------------------
Total items: 2
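Leaf1 checks the destination MAC of the inner Layer 2 frame, finds that it is a unicast MAC, looks up the MAC table to find the outbound interface (GE1/0/1.10) and encapsulation information, adds VLAN tag 10 to the frame and forwards it to the terminal vm1.
After vm1 receives the ARP reply and learns vm3's MAC address, it completes the encapsulation of its pending packet (the ICMP request).
Leaf1 receives the packet, looks up its MAC table, finds that the outbound interface for the destination MAC is 20.1.1.2 with VNI 10, and then consults the VXLAN tunnel information to encapsulate the frame.
Encapsulated packet: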
Frame 348: 124 bytes on wire (992 bits), 124 bytes captured (992 bits) on interface 0
Ethernet II, Src: 38:42:a9:01:01:00 (38:42:a9:01:01:00), Dst: 38:42:a9:04:01:01 (38:42:a9:04:01:01)
Internet Protocol Version 4, Src: 20.1.1.1, Dst: 20.1.1.2
User Datagram Protocol, Src Port: 4789, Dst Port: 4789
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
Group Policy ID: 0
VXLAN Network Identifier (VNI): 10
Reserved: 0
Ethernet II, Src: HuaweiTe_05:61:39 (54:89:98:05:61:39), Dst: HuaweiTe_8b:4f:15 (54:89:98:8b:4f:15)
Internet Protocol Version 4, Src: 192.168.1.1, Dst: 192.168.1.2
Internet Control Message Protocol
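After the VTEP on Leaf2 receives the VXLAN packet, it validates the packet based on the UDP destination port, the source and destination IP addresses, and the VNI. It then uses the VNI to find the corresponding Layer 2 broadcast domain, removes the VXLAN encapsulation and obtains the inner ICMP request. It looks up the MAC table by destination MAC and forwards the packet out of interface GE1/0/1.20, which belongs to VNI 10 on Leaf2.
vm3 receives the ICMP request and answers with an ICMP reply. The reply is also forwarded as unicast, in the same way as the ICMP request.
Note that in the symmetric IRB distributed gateway scenario, BGP EVPN carries no MAC learning information for this Layer 2 communication. Same-subnet communication relies entirely on data-plane ARP, and the BGP EVPN control plane triggers no action (no update, no MAC learning). Because hosts on the same subnet do not resolve the gateway address, control-plane learning in BGP EVPN is not triggered.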
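The data-plane learning walked through in this section can be summarised in a short Python model. It is a simplified sketch, not the device implementation: the BridgeDomain class is hypothetical, each leaf is reduced to one local port and one remote VTEP, and it assumes that a frame arriving from a tunnel is never flooded back into tunnels (split horizon).

```python
BROADCAST = "ffff-ffff-ffff"
VM1 = "5489-9805-6139"
VM3 = "5489-988b-4f15"


class BridgeDomain:
    """Source-MAC learning plus flood-or-forward for one BD."""

    def __init__(self, vni, local_ports, remote_vteps):
        self.vni = vni
        self.local_ports = list(local_ports)
        self.remote_vteps = list(remote_vteps)  # head-end replication list
        self.mac_table = {}                     # MAC -> port or VTEP IP

    def forward(self, src_mac, dst_mac, ingress):
        self.mac_table[src_mac] = ingress       # learn where the source lives
        known = self.mac_table.get(dst_mac)
        if dst_mac != BROADCAST and known is not None:
            return [known]                      # known unicast
        # Broadcast or unknown unicast: flood, except back to the ingress.
        targets = [p for p in self.local_ports if p != ingress]
        if ingress not in self.remote_vteps:    # assumed split horizon
            targets += self.remote_vteps
        return targets


leaf1 = BridgeDomain(10, ["GE1/0/1.10"], ["20.1.1.2"])
leaf2 = BridgeDomain(10, ["GE1/0/1.20"], ["20.1.1.1"])

step1 = leaf1.forward(VM1, BROADCAST, "GE1/0/1.10")  # ARP request from vm1
step2 = leaf2.forward(VM1, BROADCAST, "20.1.1.1")    # arrives over VXLAN
step3 = leaf2.forward(VM3, VM1, "GE1/0/1.20")        # ARP reply from vm3
step4 = leaf1.forward(VM3, VM1, "20.1.1.2")          # arrives over VXLAN
print(step1, step2, step3, step4)
```

After the four steps, the model's MAC tables match the display mac-address dynamic output shown above for Leaf1 and Leaf2.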
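5.4 Ping test between servers on different subnets under the same Leaf
Operation: ping 192.168.2.1 from 192.168.1.1 and capture on the Leaf1 port facing Spine1. The round-trip path is VM1-vSwitch1-Leaf1-vSwitch1-VM2.
This process is the same as Layer 3 communication in the centralized forwarding scenario and is similar to the traditional router-on-a-stick scenario.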
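5.5 Ping between servers on different subnets under different Leafs
Operation: ping 192.168.1.2 from 192.168.2.1 and capture on the Leaf1 port facing Spine1. The round-trip path is VM2-vSwitch1-Leaf1-Leaf2-vSwitch2-VM3.
Leaf1 receives the packet from vm2, detects that the destination MAC is the gateway interface MAC, and determines that the packet needs Layer 3 forwarding.
Leaf1 uses the inbound interface to find the Layer 2 broadcast domain, and from it the L3VPN instance bound to the VBDIF interface of that domain. It looks up the destination IP address in that L3VPN instance's routing table and obtains the L3VNI and next hop of the matching route. Because the outbound interface is a VXLAN tunnel, it determines that VXLAN encapsulation is needed:
a. The 192.168.1.2/32 host route is learned from the Type 2 IRB route, as shown below: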
[~Leaf-1]display ip routing-table vpn-instance vpn1
Proto: Protocol Pre: Preference
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : vpn1
Destinations : 10 Routes : 10
Destination/Mask Proto Pre Cost Flags NextHop Interface
10.1.36.0/24 IBGP 255 2 RD 20.1.1.3 VXLAN
177.1.1.1/32 IBGP 255 2 RD 20.1.1.3 VXLAN
192.168.1.0/24 Direct 0 0 D 192.168.1.254 Vbdif10
192.168.1.2/32 IBGP 255 0 RD 20.1.1.2 VXLAN // this entry is derived from the IRB route
192.168.1.254/32 Direct 0 0 D 127.0.0.1 Vbdif10
192.168.1.255/32 Direct 0 0 D 127.0.0.1 Vbdif10
192.168.2.0/24 Direct 0 0 D 192.168.2.254 Vbdif20
192.168.2.254/32 Direct 0 0 D 127.0.0.1 Vbdif20
192.168.2.255/32 Direct 0 0 D 127.0.0.1 Vbdif20
255.255.255.255/32 Direct 0 0 D 127.0.0.1 InLoopBack0
b. The details of the 192.168.1.2/32 route on Leaf1 are as follows:
[~Leaf-1]display ip routing-table vpn-instance vpn1 192.168.1.2 verbose
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : vpn1
Summary Count : 1
Destination: 192.168.1.2/32
Protocol: IBGP Process ID: 0
Preference: 255 Cost: 0
NextHop: 20.1.1.2 Neighbour: 10.1.1.4
State: Active Adv Relied Age: 00h02m17s
Tag: 0 Priority: low
Label: NULL QoSInfo: 0x0
IndirectID: 0x1000080 Instance:
RelayNextHop: 0.0.0.0 Interface: VXLAN
TunnelID: 0x0000000027f0000001 Flags: RD
c. The EVPN route for 192.168.1.2/32 on Leaf1 is as follows:
[~Leaf-1]display bgp evpn all routing-table mac-route 0:48:5489-988b-4f15:32:192.168.1.2
EVPN-Instance __RD_1_100_1__:
Number of Mac Routes: 1
BGP routing table entry information of 0:48:5489-988b-4f15:32:192.168.1.2:
Route Distinguisher: 10:1
Remote-Cross route
Label information (Received/Applied): 10 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h17m49s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.2
Qos information : 0x0
Ext-Community: RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan>, Router's MAC <707b-e85f-46d1>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, pre 255
Originator: 10.1.1.2
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 5489-988b-4f15/48, IP Address/Len: 192.168.1.2/32, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
d. The VXLAN tunnel information on Leaf1 is as follows:
[~Leaf-1] display tunnel all
Tunnel ID Type Destination Status
----------------------------------------------------------------------------------------
0x0000000027f0000001 vxlan_nvo3 20.1.1.2 UP
0x0000000027f0000002 vxlan_nvo3 20.1.1.3 UP
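Based on the destination and source IP addresses of the VXLAN tunnel, Leaf1 obtains the corresponding MAC addresses and rewrites the inner destination and source MAC addresses.
a. The inner source MAC is the MAC address of the interface holding vm2's gateway address 192.168.2.254 (the ingress of the IP VPN instance), that is, 0002-0002-0002;
b. The inner destination MAC is the MAC address of the VTEP on Leaf2, which is carried in the Router's MAC of the IRB route.
The L3VNI is encapsulated into the packet.
The outer header carries the destination and source IP addresses of the VXLAN tunnel. The source MAC is that of Leaf1's outbound interface and the destination MAC is that of the next hop in the network.
The encapsulated packet is as follows: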
Frame 2084: 124 bytes on wire (992 bits), 124 bytes captured (992 bits) on interface 0
Ethernet II, Src: 38:42:a9:01:01:00 (38:42:a9:01:01:00), Dst: 38:42:a9:04:01:01 (38:42:a9:04:01:01)
Internet Protocol Version 4, Src: 20.1.1.1, Dst: 20.1.1.2
User Datagram Protocol, Src Port: 4789, Dst Port: 4789
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
Group Policy ID: 0
VXLAN Network Identifier (VNI): 100
Reserved: 0
Ethernet II, Src: NetSys_02:00:02 (00:02:00:02:00:02), Dst: HuaweiTe_5f:46:d1 (70:7b:e8:5f:46:d1)
Destination: HuaweiTe_5f:46:d1 (70:7b:e8:5f:46:d1)
Source: NetSys_02:00:02 (00:02:00:02:00:02)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 192.168.2.1, Dst: 192.168.1.2
Internet Control Message Protocol
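The encapsulated packet is carried across the IP network according to its outer MAC and IP information and delivered to Leaf2.
Leaf2 receives the VXLAN packet and decapsulates it. It detects that the outer destination MAC is its own MAC address, determines that the packet needs Layer 3 forwarding and strips the outer headers. It then finds that the inner destination MAC is the MAC address of its own VTEP and continues decapsulating;
Leaf2 uses the L3VNI carried in the packet to find the corresponding L3VPN instance and looks up that instance's routing table. The route for the subnet containing 192.168.1.2 shows that the next hop is the gateway interface address, so Leaf2 resolves the address through ARP, replaces the destination MAC with VM3's MAC address and the source MAC with the MAC address of its gateway interface 192.168.1.254, and forwards the packet to vm3.
The reply from vm3 to vm2 is forwarded in a similar way.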
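The forwarding decision Leaf1 makes in this section can be sketched as a longest-prefix match followed by a symmetric IRB rewrite. The Python example below uses the standard ipaddress module and a subset of the vpn1 routes shown above. The Route tuple and function names are hypothetical, and the table layout is a simplification of the real routing table, which stores the Router's MAC with the tunnel rather than in the route itself.

```python
import ipaddress
from typing import NamedTuple, Optional

L3VNI = 100


class Route(NamedTuple):
    prefix: str
    kind: str                  # "direct" or "vxlan"
    next_hop: str              # interface name or remote VTEP IP
    router_mac: Optional[str]  # Router's MAC from the EVPN route, if any


# Subset of Leaf1's vpn1 table, taken from the display output above.
VPN1 = [
    Route("192.168.1.0/24", "direct", "Vbdif10", None),
    Route("192.168.2.0/24", "direct", "Vbdif20", None),
    Route("192.168.1.2/32", "vxlan", "20.1.1.2", "707b-e85f-46d1"),
    Route("177.1.1.1/32", "vxlan", "20.1.1.3", "707b-e8e6-5d85"),
]


def lookup(dst: str) -> Optional[Route]:
    """Longest-prefix match over the table."""
    addr = ipaddress.ip_address(dst)
    hits = [r for r in VPN1 if addr in ipaddress.ip_network(r.prefix)]
    if not hits:
        return None
    return max(hits, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)


def encapsulate(dst: str, gateway_mac: str) -> Optional[dict]:
    """Return the VXLAN fields for a routed packet, or None if local."""
    route = lookup(dst)
    if route is None or route.kind != "vxlan":
        return None
    return {
        "vni": L3VNI,
        "outer_dst_ip": route.next_hop,
        "inner_src_mac": gateway_mac,
        "inner_dst_mac": route.router_mac,
    }


result = encapsulate("192.168.1.2", "0002-0002-0002")
print(result)
```

The /32 host route wins over the directly connected 192.168.1.0/24 entry, which is why the packet to 192.168.1.2 is sent over the tunnel with VNI 100 instead of being bridged locally on Vbdif10.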
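5.6 Ping between the simulated external network and the servers
Leaf3 is configured to advertise IP prefix routes, so R1's 177.1.1.1/32 route is advertised in a BGP EVPN Type 5 prefix-route update that carries the following information:
a. Type 5 carries the export RT values of the L3VPN instance: 100:1 and 1000:1 (evpn);
b. Type 5 carries only the L3VNI: 100;
c. Type 5 carries the Next hop network address, which is the IP address of Leaf3's VTEP;
d. Type 5 carries the Router's MAC, which is the MAC address of Leaf3's NVE (VTEP) interface.
See section 5.2.3 for the packet and route details.
Leaf1 and Leaf2 are configured to advertise IRB routes to their peer, so the host information they collect from ARP (192.168.1.1/32 and 192.168.2.1/32 on Leaf1; 192.168.1.2/32 on Leaf2) is advertised in BGP EVPN Type 2 IRB updates. Taking Leaf1 as an example, the update carries the following information:
a. Type 2 carries the export RT values of the EVPN instance: 10:1 and 1000:1
b. Type 2 does not carry the L2VNI and L3VNI // This differs from the reference material, possibly because of the simulator software version, but it does not affect route acceptance at the peer. See the analysis in section 5.2.4
c. Type 2 carries the Next hop network address, which is the IP address of Leaf1's VTEP;
d. Type 2 carries the Router's MAC, which is the MAC address of Leaf1's NVE (VTEP) interface.
Leaf2 uses this routing information for communication between different subnets under different Leafs (that is, between 192.168.1.2 and 192.168.2.1), and Leaf3 uses it for communication between the external network and the servers. Taking a ping from 177.1.1.1 to 192.168.1.1 as an example, the flow is as follows:
a. R1 generates the ICMP request shown below: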
Frame 4: 98 bytes on wire (784 bits), 98 bytes captured (784 bits) on interface 0
Ethernet II, Src: HuaweiTe_dc:48:25 (54:89:98:dc:48:25), Dst: 38:42:a9:03:01:01 (38:42:a9:03:01:01)
Internet Protocol Version 4, Src: 177.1.1.1, Dst: 192.168.1.1
Internet Control Message Protocol
b. Leaf3 receives the ICMP request on interface GE1/0/1, matches it to the IP VPN instance and looks up the route as follows:
<Leaf-3>display ip routing-table vpn-instance vpn1 192.168.1.1
Proto: Protocol Pre: Preference
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : vpn1
Summary Count : 1
Destination/Mask Proto Pre Cost Flags NextHop Interface
192.168.1.1/32 IBGP 255 0 RD 20.1.1.1 VXLAN
<Leaf-3>display ip routing-table vpn-instance vpn1 192.168.1.1 verbose
Route Flags: R - relay, D - download to fib, T - to vpn-instance, B - black hole route
------------------------------------------------------------------------------
Routing Table : vpn1
Summary Count : 1
Destination: 192.168.1.1/32
Protocol: IBGP Process ID: 0
Preference: 255 Cost: 0
NextHop: 20.1.1.1 Neighbour: 10.1.1.4
State: Active Adv Relied Age: 00h23m24s
Tag: 0 Priority: low
Label: NULL QoSInfo: 0x0
IndirectID: 0x1000072 Instance:
RelayNextHop: 0.0.0.0 Interface: VXLAN
TunnelID: 0x0000000027f0000004 Flags: RD
<Leaf-3>
<Leaf-3>display tunnel all
Tunnel ID Type Destination Status
----------------------------------------------------------------------------------------
0x0000000027f0000002 vxlan_nvo3 20.1.1.2 UP
0x0000000027f0000004 vxlan_nvo3 20.1.1.1 UP
<Leaf-3>display bgp evpn all routing-table mac-route 0:48:5489-9805-6139:32:192.168.1.1
EVPN-Instance __RD_1_100_1__:
Number of Mac Routes: 1
BGP routing table entry information of 0:48:5489-9805-6139:32:192.168.1.1:
Route Distinguisher: 10:1
Remote-Cross route
Label information (Received/Applied): 10 100/NULL
From: 10.1.1.4 (10.1.1.4)
Route Duration: 0d00h21m43s
Relay Tunnel Out-Interface: VXLAN
Original nexthop: 20.1.1.1
Qos information : 0x0
Ext-Community: RT <10 : 1>, RT <1000 : 1>, Tunnel Type <VxLan>, Router's MAC <707b-e80d-0337>
AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, pre 255
Originator: 10.1.1.1
Cluster list: 10.1.1.4
Route Type: 2 (MAC Advertisement Route)
Ethernet Tag ID: 0, MAC Address/Len: 5489-9805-6139/48, IP Address/Len: 192.168.1.1/32, ESI:0000.0000.0000.0000.0000
Not advertised to any peer yet
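c. Leaf3 matches the route and performs VXLAN encapsulation:
Based on the destination and source IP addresses of the VXLAN tunnel, Leaf3 obtains the corresponding MAC addresses and rewrites the inner destination and source MAC addresses.
i. The inner source MAC is the MAC address of interface GE1/0/1 (the ingress of the IP VPN instance), that is, 38:42:a9:03:01:01;
ii. The inner destination MAC is the MAC address of the VTEP on Leaf1, which is carried in the Router's MAC of the IRB route.
The L3VNI is encapsulated into the packet.
The outer header carries the destination and source IP addresses of the VXLAN tunnel. The source MAC is that of Leaf3's outbound interface and the destination MAC is that of the next hop in the network.
d. The encapsulated packet is as follows: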
Frame 3306: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits) on interface 0
Ethernet II, Src: 38:42:a9:04:01:01 (38:42:a9:04:01:01), Dst: 38:42:a9:01:01:00 (38:42:a9:01:01:00)
Internet Protocol Version 4, Src: 20.1.1.3, Dst: 20.1.1.1
User Datagram Protocol, Src Port: 4789, Dst Port: 4789
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
Group Policy ID: 0
VXLAN Network Identifier (VNI): 100
Reserved: 0
Ethernet II, Src: 38:42:a9:03:01:01 (38:42:a9:03:01:01), Dst: HuaweiTe_0d:03:37 (70:7b:e8:0d:03:37)
Internet Protocol Version 4, Src: 177.1.1.1, Dst: 192.168.1.1
Internet Control Message Protocol
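The encapsulated packet is carried across the IP network according to its outer MAC and IP information and delivered to Leaf1.
Leaf1 receives the VXLAN packet and decapsulates it. It detects that the outer destination MAC is its own MAC address, determines that the packet needs Layer 3 forwarding and strips the outer headers. It then finds that the inner destination MAC is the MAC address of its own VTEP and continues decapsulating;
Leaf1 uses the L3VNI carried in the packet to find the corresponding L3VPN instance and looks up that instance's routing table. The route for the subnet containing 192.168.1.1 shows that the next hop is the gateway interface address, so Leaf1 resolves the address through ARP, replaces the destination MAC with VM1's MAC address and the source MAC with the MAC address of its gateway interface 192.168.1.254, and forwards the packet to vm1.
The reply from vm1 to R1 is forwarded in a similar way.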
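The frame sizes in the captures of this section are consistent with a fixed 50-byte VXLAN overhead: the 98-byte ICMP request from R1 becomes a 148-byte frame after Leaf3 encapsulates it. The Python sketch below spells out that arithmetic. It assumes an untagged outer Ethernet header and an outer IPv4 header without options; the constant and function names are illustrative.

```python
OUTER_ETHERNET = 14   # destination MAC + source MAC + EtherType, no VLAN tag
OUTER_IPV4 = 20       # IPv4 header without options
OUTER_UDP = 8         # UDP header, destination port 4789
VXLAN_HEADER = 8      # flags + reserved + VNI + reserved

VXLAN_OVERHEAD = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def encapsulated_size(inner_frame_len: int) -> int:
    """Size on the wire of a VXLAN packet carrying the given inner frame."""
    return inner_frame_len + VXLAN_OVERHEAD


def inner_size(captured_len: int) -> int:
    """Size of the inner frame inside a captured VXLAN packet."""
    return captured_len - VXLAN_OVERHEAD


print(VXLAN_OVERHEAD, encapsulated_size(98), inner_size(110))
```

By the same arithmetic, the 110-byte ARP captures in section 5.3 would carry a 60-byte inner frame. This overhead is also why the underlay MTU has to be larger than the MTU the servers use.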
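5.7 ARP broadcast suppression and ARP proxy reply
ARP broadcast suppression applies to L2 and L3 gateways.
When tenants' terminals communicate for the first time, they send ARP broadcast requests, which are flooded across the Layer 2 network. To suppress the broadcast storms these ARP requests can cause, ARP broadcast suppression can be enabled on the VXLAN Layer 2 gateway. The feature depends on the host information table on the Layer 3 gateway (host IP address, MAC address, VTEP address and VNI ID; the table is populated through arp collect host enable).
a. Configuration:
Configure arp broadcast-suppress enable in the bridge-domain view to enable ARP broadcast suppression.
The configuration is similar on Leaf1 and Leaf2: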
bridge-domain 10
arp broadcast-suppress enable
#
bridge-domain 20
arp broadcast-suppress enable
#
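b. Verification:
vm1 accesses vm3, which is on the same subnet under a different Leaf. After Leaf1 receives the ARP request from vm1, it suppresses the broadcast using the IRB route information, so the inner destination MAC is vm3's unicast MAC rather than the broadcast address, as shown below: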
Frame 6: 114 bytes on wire (912 bits), 114 bytes captured (912 bits) on interface 0
Ethernet II, Src: 38:42:a9:03:01:00 (38:42:a9:03:01:00), Dst: 38:42:a9:02:01:01 (38:42:a9:02:01:01)
Internet Protocol Version 4, Src: 20.1.1.1, Dst: 20.1.1.2
User Datagram Protocol, Src Port: 4789, Dst Port: 4789
Virtual eXtensible Local Area Network
Flags: 0x0800, VXLAN Network ID (VNI)
Group Policy ID: 0
VXLAN Network Identifier (VNI): 10
Reserved: 0
Ethernet II, Src: HuaweiTe_05:61:39 (54:89:98:05:61:39), Dst: HuaweiTe_8b:4f:15 (54:89:98:8b:4f:15)
Destination: HuaweiTe_8b:4f:15 (54:89:98:8b:4f:15)
Source: HuaweiTe_05:61:39 (54:89:98:05:61:39)
Type: 802.1Q Virtual LAN (0x8100)
802.1Q Virtual LAN, PRI: 6, DEI: 0, ID: 10
Address Resolution Protocol (request)
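ARP proxy (proxy reply) applies to L2 gateways.
arp l2-proxy enable: configures Layer 2 proxy reply in the BD view. It can be used together with arp collect host: the Layer 3 gateway collects ARP information and propagates it to the Layer 2 gateway through BGP EVPN, and when the Layer 2 gateway receives an ARP request it decides whether to answer on behalf of the target.
ARP Layer 2 proxy is an effective way to spread the ARP processing load. Its core idea is to isolate the ARP broadcast domain and answer received ARP requests locally whenever possible. After arp l2-proxy enable is run, the device, on receiving an ARP packet, first checks whether it can obtain information about the target user of the ARP request. If it can, it replies directly on the user's behalf; otherwise it forwards the packet along the original path.
This function cannot be enabled on a Leaf that already has a VBDIF interface configured.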
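The difference between the two features in this section can be summarised as a decision on each received ARP request. The Python sketch below is a simplified model, not the device logic: the host table contents are taken from the IRB routes shown earlier, while the function name, the l2_proxy switch and the returned action labels are assumptions made for illustration.

```python
BROADCAST = "ffff-ffff-ffff"

# Host table built from EVPN routes: IP -> (MAC, remote VTEP).
HOST_TABLE = {
    "192.168.1.2": ("5489-988b-4f15", "20.1.1.2"),
}


def handle_arp_request(target_ip, host_table, l2_proxy=False):
    """Return (action, destination MAC, remote VTEP) for an ARP request."""
    entry = host_table.get(target_ip)
    if entry is None:
        # Unknown target: fall back to normal flooding.
        return ("flood", BROADCAST, None)
    mac, vtep = entry
    if l2_proxy:
        # Proxy reply: answer locally, nothing crosses the fabric.
        return ("reply-locally", mac, None)
    # Broadcast suppression: rewrite the broadcast into a unicast
    # and send it only to the VTEP behind which the target lives.
    return ("unicast", mac, vtep)


suppressed = handle_arp_request("192.168.1.2", HOST_TABLE)
proxied = handle_arp_request("192.168.1.2", HOST_TABLE, l2_proxy=True)
unknown = handle_arp_request("192.168.1.9", HOST_TABLE)
print(suppressed, proxied, unknown)
```

The suppression case corresponds to the capture above, where the ARP request from vm1 is sent to 20.1.1.2 with vm3's MAC as the inner destination instead of the broadcast address.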