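Part 2: Technical Walkthrough

The lab is working, but the technical details deserve a closer look. How is an mVPN structured, and how does it actually operate? Let's go through it.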

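I. MTI and the Default MDT

Two items in the PE configuration deserve attention: PIM sparse mode is enabled on Loopback0, and mdt default 239.100.0.1 is configured under the VRF.

Take PE R7 as an example. Before these two commands are configured, its PIM neighbors look like this:

The first kind is the global PIM neighbors in the provider (P) network: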

r7#sh ip pim nei       

PIM Neighbor Table

Neighbor          Interface                Uptime/Expires       Ver   DR

Address                                                                                      Prio/Mode

37.0.0.3          FastEthernet0/0.37       00:52:49/00:01:39 v2    1 / S

47.0.0.4          FastEthernet0/0.47       00:52:49/00:01:40 v2    1 / S

The second kind is the customer VPN PIM neighbors inside the VRF:

r7#sh ip pim vrf abc nei

PIM Neighbor Table

Neighbor          Interface                Uptime/Expires       Ver     DR

Address                                                                                       Prio/Mode

78.0.0.8          FastEthernet0/0.78       00:52:45/00:01:44 v2    1 / DR S

 

 

As soon as mdt default 239.100.0.1 is entered, R7 immediately logs:

*Mar  1 00:56:52.547: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel0, changed state to up

A Tunnel0 interface is created, borrowing Loopback0's address 7.7.7.7. This tunnel interface is called the MTI (multicast tunnel interface), and it lives inside the VRF.

Here are the details of Tunnel0:

r7#sh int tun 0

Tunnel0 is up, line protocol is up

  Hardware is Tunnel

  Interface is unnumbered. Using address of Loopback0 (7.7.7.7)

  MTU 1514 bytes, BW 9 Kbit, DLY 500000 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation TUNNEL, loopback not set

  Keepalive not set

  Tunnel source 7.7.7.7 (Loopback0), destination 239.100.0.1

  Tunnel protocol/transport GRE/IP Multicast, key disabled, sequencing disabled

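The output shows that this is a GRE tunnel whose transport is IP multicast: the tunnel destination is the group address 239.100.0.1.

Next, enter: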

r7(config)#int lo0

ip pim sparse-mode

(Loopback0 is the address used for the MP-iBGP sessions, so PIM-SM must be enabled on it; it also becomes the root of the MDT.)

The following messages appear:

*Mar  1 01:02:33.791: %PIM-5-DRCHG: DR change from neighbor 0.0.0.0 to 7.7.7.7 on interface Loopback0 (vrf default)

*Mar  1 01:02:45.647: %PIM-5-NBRCHG: neighbor 5.5.5.5 UP on interface Tunnel0 (vrf abc)

*Mar  1 01:02:45.975: %PIM-5-NBRCHG: neighbor 2.2.2.2 UP on interface Tunnel0 (vrf abc)

 

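R7 has formed PIM adjacencies with the other PEs over the MTI; in other words, the MTIs of the three PEs have become neighbors of one another.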

r7#sh ip pim vrf abc nei

PIM Neighbor Table

Neighbor          Interface                Uptime/Expires       Ver    DR

Address                                                             Prio/Mode

78.0.0.8          FastEthernet0/0.78       01:07:08/00:01:36       v2     1 / DR S

2.2.2.2           Tunnel0                  00:04:33/00:01:37     v2     1 / S

5.5.5.5           Tunnel0                  00:04:33/00:01:38     v2     1 / S

Seen from R2:

r2#sh ip pim vrf abc nei

PIM Neighbor Table

Neighbor          Interface                Uptime/Expires    Ver   DR

Address                                                         Prio/Mode

12.0.0.1          FastEthernet0/0.12       01:09:37/00:01:32   v2    1 / S

7.7.7.7           Tunnel0                  00:06:29/00:01:39 v2    1 / DR S

5.5.5.5           Tunnel0                  01:08:21/00:01:40 v2    1 / S

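The output shows R7 as the DR on Tunnel0. The three MTIs behave as if they were attached to a single shared segment, much like one Ethernet LAN, so a DR has to be elected.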
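The election result can be checked with a minimal sketch. It assumes the usual PIM rule (highest DR priority wins, ties go to the numerically highest IP address) and that every neighbor advertises a priority; the function name elect_dr is made up for illustration.

```python
from ipaddress import IPv4Address

def elect_dr(candidates):
    """Return the DR address among (address, dr_priority) pairs.

    Highest DR priority wins; ties go to the numerically highest address.
    """
    return max(candidates, key=lambda c: (c[1], IPv4Address(c[0])))[0]

# The three MTIs on the shared "segment", each with DR priority 1
# as in the neighbor tables above.
mti_segment = [("2.2.2.2", 1), ("5.5.5.5", 1), ("7.7.7.7", 1)]
print(elect_dr(mti_segment))  # 7.7.7.7
```

With equal priorities the highest loopback address wins, which is consistent with R7 (7.7.7.7) appearing as DR on Tunnel0.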

The MTI is the key piece: it is the intermediary between the customer VPN's PIM and the PIM of the provider's P network.

 

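As the figure shows, the network now has three kinds of PIM adjacency: global neighbors in the P network, PE-CE neighbors inside the VRF, and PE-PE neighbors across the MTI.

Now look at the multicast routing state on R4, a provider P router: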

r4#sh ip mro

IP Multicast Routing Table

Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,

       L - Local, P - Pruned, R - RP-bit set, F - Register flag,

       T - SPT-bit set, J - Join SPT, M - MSDP created entry,

       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,

       U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel

       Y - Joined MDT-data group, y - Sending to MDT-data group

Outgoing interface flags: H - Hardware switched, A - Assert winner

 Timers: Uptime/Expires

 Interface state: Interface, Next-Hop or VCD, State/Mode

 

(*, 239.100.0.1), 01:28:04/00:03:03, RP 4.4.4.4, flags: S

  Incoming interface: Null, RPF nbr 0.0.0.0

  Outgoing interface list:

    FastEthernet0/0.47, Forward/Sparse, 00:26:11/00:02:50

    FastEthernet0/0.45, Forward/Sparse, 01:28:03/00:03:03

    FastEthernet0/0.34, Forward/Sparse, 01:28:04/00:02:56

 

(2.2.2.2, 239.100.0.1), 01:27:49/00:03:23, flags: T

  Incoming interface: FastEthernet0/0.34, RPF nbr 34.0.0.3

  Outgoing interface list:

    FastEthernet0/0.45, Forward/Sparse, 01:27:49/00:03:03

 

(5.5.5.5, 239.100.0.1), 01:27:50/00:03:13, flags: T

  Incoming interface: FastEthernet0/0.45, RPF nbr 45.0.0.5

  Outgoing interface list:

    FastEthernet0/0.47, Forward/Sparse, 00:26:11/00:02:50

    FastEthernet0/0.34, Forward/Sparse, 01:27:50/00:02:55

 

(7.7.7.7, 239.100.0.1), 00:31:52/00:03:12, flags: T

  Incoming interface: FastEthernet0/0.47, RPF nbr 47.0.0.7

  Outgoing interface list:

    FastEthernet0/0.45, Forward/Sparse, 00:31:52/00:03:02

 

(*, 224.0.1.40), 01:29:00/00:03:07, RP 4.4.4.4, flags: SJCL

  Incoming interface: Null, RPF nbr 0.0.0.0

  Outgoing interface list:

    FastEthernet0/0.47, Forward/Sparse, 01:27:40/00:02:38

    FastEthernet0/0.45, Forward/Sparse, 01:28:04/00:03:07

    FastEthernet0/0.34, Forward/Sparse, 01:28:05/00:03:04

    Loopback0, Forward/Sparse, 01:29:00/00:02:01

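Compared with what we saw earlier, there are four new entries, one (*,G) and three (S,G): (*, 239.100.0.1), (2.2.2.2, 239.100.0.1), (5.5.5.5, 239.100.0.1) and (7.7.7.7, 239.100.0.1). The G here is exactly the group from mdt default 239.100.0.1 configured under the VRF. MDT stands for multicast distribution tree. Although the default MDT 239.100.0.1 is configured under the VRF, it is the multicast routers in the provider's P network that use it. The P network uses this group address to build one fully connected channel for the VPN's multicast traffic. No matter how many multicast sources and groups exist inside the customer VPN, the provider network neither cares nor needs to know; it forwards customer multicast data over this single default multicast distribution tree.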

 

All sites of the VPN join this tree. They are senders and receivers for one another, so each is both a root and a leaf.
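To make the scaling point concrete, here is a minimal sketch that lists the core state one default MDT can create on a P router: one (*,G) entry plus one (S,G) entry per PE. It assumes PIM-SM in the core and that every PE sources traffic toward the group; the helper name default_mdt_entries is hypothetical. The number of customer groups never appears in the calculation.

```python
def default_mdt_entries(mdt_group, pe_loopbacks):
    """List the core mroute entries one default MDT can create on a P router.

    One shared-tree (*, G) entry plus one (S, G) entry per PE, because every
    PE sends GRE packets from its own loopback toward the MDT group.
    """
    entries = [("*", mdt_group)]
    entries += [(pe, mdt_group) for pe in pe_loopbacks]
    return entries

entries = default_mdt_entries("239.100.0.1", ["2.2.2.2", "5.5.5.5", "7.7.7.7"])
for source, group in entries:
    print(f"({source}, {group})")
```

For the three PEs in this lab the sketch yields the same four entries seen on R4. A given P router may hold fewer of them, depending on where it sits on each tree.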

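II. MP-iBGP Updates for Multicast

PEs join the default MDT by way of MP-iBGP messages. Once the default MDT is configured, the PEs exchange MP-iBGP updates, and these updates differ from ordinary VPNv4 route updates.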

r7# sh ip bgp v a       

BGP table version is 24, local router ID is 7.7.7.7

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

              r RIB-failure, S Stale

Origin codes: i - IGP, e - EGP, ? - incomplete

 

   Network          Next Hop            Metric LocPrf Weight Path

Route Distinguisher: 100:1 (default for vrf abc)

*>i1.1.1.1/32       2.2.2.2                  2    100      0 ?

*>i6.6.6.6/32       5.5.5.5                  2    100      0 ?

*> 8.8.8.8/32       78.0.0.8                 2         32768 ?

*>i12.0.0.0/24      2.2.2.2                  0    100      0 ?

*>i56.0.0.0/24      5.5.5.5                  0    100      0 ?

*> 78.0.0.0/24      0.0.0.0                  0         32768 ?

Route Distinguisher: 2:100:1

*>i2.2.2.2/32       2.2.2.2                  0    100      0 ?

*>i5.5.5.5/32       5.5.5.5                  0    100      0 ?

*> 7.7.7.7/32       0.0.0.0                                0 ?

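The MP-iBGP updates for the MDT carry RD 2:100:1 rather than the 100:1 used by unicast VPNv4 route updates. The leading 2 denotes RD type 2, which distinguishes these updates from unicast VPNv4 routes. These updates also carry an additional extended community attribute.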

r7# sh ip bgp v a 5.5.5.5

BGP routing table entry for 2:100:1:5.5.5.5/32, version 20

Paths: (1 available, best #1, no table, not advertised to EBGP peer)

  Not advertised to any peer

  Local

    5.5.5.5 (metric 3) from 5.5.5.5 (5.5.5.5)

      Origin incomplete, metric 0, localpref 100, valid, internal, mdt, no-import, best

      Extended Community: RT:100:1 MDT:100:239.100.0.1,

      mpls labels in/out nolabel/511

The MDT information is visible in the extended community.
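As a small illustration, the MDT token can be pulled out of that line of show output. This is a sketch that parses the displayed text only, not the BGP wire format. Reading the first field as the AS number is an assumption, and parse_mdt_community is a made-up helper name.

```python
import re
from ipaddress import IPv4Address

def parse_mdt_community(line):
    """Extract (asn, group) from an 'MDT:<asn>:<group>' token in show output.

    Returns None when the line carries no MDT token.
    """
    match = re.search(r"MDT:(\d+):(\d+\.\d+\.\d+\.\d+)", line)
    if match is None:
        return None
    return int(match.group(1)), IPv4Address(match.group(2))

line = "Extended Community: RT:100:1 MDT:100:239.100.0.1,"
asn, group = parse_mdt_community(line)
print(asn, group, group.is_multicast)  # 100 239.100.0.1 True
```

The extracted group matches the address configured with mdt default under the VRF.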


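Update, 2015-03-16:

IOS 15.2 no longer uses the VPNv4 extended community (shown next to the RT above) to carry the MDT. It uses the BGP MDT address family instead and passes the information to PIM.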

router bgp 1

 bgp log-neighbor-changes

 neighbor peerall peer-group

 neighbor peerall remote-as 1

 neighbor peerall update-source Loopback0

 neighbor 9.9.0.2 peer-group peerall

 neighbor 9.9.0.3 peer-group peerall

 neighbor 9.9.0.4 peer-group peerall

 neighbor 9.9.0.5 peer-group peerall

 !

 address-family ipv4

 neighbor 9.9.0.2 activate

 neighbor 9.9.0.3 activate

 neighbor 9.9.0.4 activate

 neighbor 9.9.0.5 activate

 exit-address-family

 !       

 address-family vpnv4

 neighbor peerall send-community extended

 neighbor peerall route-reflector-client

 neighbor 9.9.0.2 activate

 neighbor 9.9.0.5 activate

 exit-address-family

 !

 address-family ipv4 mdt

  neighbor peerall send-community both

  neighbor peerall route-reflector-client

  ! Activate the peer here even if it does not support the MDT SAFI.

  neighbor 9.9.0.2 activate

  neighbor 9.9.0.5 activate

 

Verify:

r7#sh ip bgp ipv4 mdt all

BGP table version is 8, local router ID is 9.9.0.7

Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,

              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,

              x best-external, a additional-path, c RIB-compressed,

Origin codes: i - IGP, e - EGP, ? - incomplete

RPKI validation codes: V valid, I invalid, N Not found

 

     Network          Next Hop            Metric LocPrf Weight Path

Route Distinguisher: 100:1 (default for vrf abc)

 *>i 9.9.0.2/32       9.9.0.2                  0    100     0 ?

 *>i 9.9.0.5/32       9.9.0.5                  0    100     0 ?

 *> 9.9.0.7/32       0.0.0.0                                0 ?

r7#sh ip bgp ipv4 mdt all 9.9.0.2

BGP routing table entry for 100:1:9.9.0.2/32, version 8

Paths: (1 available, best #1, table IPv4-MDT-BGP-Table)

 Advertised to update-groups:

    3        

  Refresh Epoch 1

 Local, (Received from a RR-client)

   9.9.0.2 from 9.9.0.2 (9.9.0.2)

     Origin incomplete, metric 0, localpref 100, valid, internal, best,

      MDT group address: 239.0.0.1

 

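Configuring the MDT SAFI is recommended on every router that takes part in the mVPN. Although the benefits of the MDT SAFI lie in building SSM trees, it must also be configured when the mVPN uses a default MDT group with PIM-SM. From a multicast point of view, the MDT SAFI is not required for the mVPN to work across a PIM-SM core. In certain scenarios, however, the new address family must be configured for the MTI to be created; without it, the MTI is not created and the mVPN does not work.

For backward compatibility, extended communities must be enabled for all peers under the MDT SAFI. In an environment where every router supports the MDT SAFI, there is no need to configure extended communities explicitly. In a mixed environment, where some routers do not support the MDT SAFI, extended communities must be configured.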

 



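III. End-to-End Multicast Packet Forwarding

With R1 as the source and R8 as the receiver, let's trace a multicast packet along the whole path.

1. R1 pings 238.0.0.1, and the packet arrives in the MVRF on R2.

2. The multicast routing table in R2's MVRF is as follows: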

r2#sh ip mro vrf abc

 (1.1.1.1, 238.0.0.1), 00:00:08/00:03:21, flags:

  Incoming interface: FastEthernet0/0.12, RPF nbr 12.0.0.1

  Outgoing interface list:

    Tunnel0, Forward/Sparse, 00:00:08/00:03:21

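The outgoing interface is Tunnel0, so the MTI encapsulates the multicast packet in GRE. The outer header has source address 2.2.2.2 and destination address 239.100.0.1. The packet now leaves the VRF and enters the global table.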

r2#sh ip mro

(2.2.2.2, 239.100.0.1), 02:50:31/00:03:14, flags: FTZ

  Incoming interface: Loopback0, RPF nbr 0.0.0.0

  Outgoing interface list:

    FastEthernet0/0.23, Forward/Sparse, 02:50:31/00:02:50

According to the global multicast routing table, the outgoing interface is F0/0.23. The packet is sent out and R3 receives it.
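The encapsulation in step 2 can be sketched as a toy model. It only tracks addresses and nesting and does not build real GRE headers; IpPacket, mti_encapsulate and mti_decapsulate are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IpPacket:
    src: str
    dst: str
    payload: object  # bytes for customer data, or an inner IpPacket when encapsulated

def mti_encapsulate(customer_packet, pe_loopback, mdt_group):
    """Wrap a customer multicast packet in an outer header, as the MTI does with GRE."""
    return IpPacket(src=pe_loopback, dst=mdt_group, payload=customer_packet)

def mti_decapsulate(core_packet):
    """Strip the outer header on the receiving PE and return the customer packet."""
    return core_packet.payload

inner = IpPacket(src="1.1.1.1", dst="238.0.0.1", payload=b"icmp echo")
outer = mti_encapsulate(inner, pe_loopback="2.2.2.2", mdt_group="239.100.0.1")
print(outer.src, outer.dst)             # 2.2.2.2 239.100.0.1
print(mti_decapsulate(outer) == inner)  # True
```

The P routers only ever look at the outer pair (2.2.2.2, 239.100.0.1), which is why the customer group 238.0.0.1 never shows up in their global tables.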

3. R3's global multicast routing table is as follows:

  r3#sh ip mro

(2.2.2.2, 239.100.0.1), 02:49:13/00:03:02, flags: T

  Incoming interface: FastEthernet0/0.23, RPF nbr 23.0.0.2

  Outgoing interface list:

    FastEthernet0/0.37, Forward/Sparse, 01:47:35/00:03:04

    FastEthernet0/0.34, Forward/Sparse, 02:49:13/00:03:28, A

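R3 replicates the packet and sends one copy out of each of F0/0.37 and F0/0.34. R7 and R4 receive them.

4. R7's global multicast routing table is as follows: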

r7#sh ip mro

(2.2.2.2, 239.100.0.1), 01:59:49/00:02:57, flags: JTZ

  Incoming interface: FastEthernet0/0.37, RPF nbr 37.0.0.3

  Outgoing interface list:

    MVRF abc, Forward/Sparse, 01:54:09/00:02:57

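The outgoing interface is the MVRF, so the packet is decapsulated and handed into the VRF.

The multicast routing table of R7's MVRF is as follows: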

r7#sh ip mro vrf abc

(1.1.1.1, 238.0.0.1), 00:01:38/00:01:51, flags:

  Incoming interface: Tunnel0, RPF nbr 2.2.2.2

  Outgoing interface list:

    FastEthernet0/0.78, Forward/Sparse, 00:01:38/00:03:19

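The outgoing interface is F0/0.78. The packet is sent out and R8 receives it, which is the end of the line.

5. The other copy, the one R4 received, is forwarded on to R5 and discarded there, because R5's MVRF has no outgoing interface for the group.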

r5#sh ip mro vrf abc

(*, 238.0.0.1), 00:00:15/00:02:47, RP 1.1.1.1, flags: SP

  Incoming interface: Tunnel0, RPF nbr 2.2.2.2

  Outgoing interface list: Null
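The five steps above can be replayed with a small simulation. The outgoing-interface lists are transcribed from the show outputs in this article, with interfaces replaced by the neighbor they lead to. R5's global entry is not shown in the text, so its MVRF hand-off is an assumption, and trace is a hypothetical helper.

```python
# Next hop per outgoing interface in the provider core for (2.2.2.2, 239.100.0.1).
# "MVRF" means: decapsulate and look the inner packet up in vrf abc.
CORE_OIL = {
    "R2": ["R3"],
    "R3": ["R7", "R4"],
    "R4": ["R5"],
    "R7": ["MVRF"],
    "R5": ["MVRF"],  # assumption: R5's global table is not shown in the text
}

# Outgoing neighbors in vrf abc for group 238.0.0.1 on each egress PE.
VRF_OIL = {
    "R7": ["R8"],
    "R5": [],  # Outgoing interface list: Null
}

def trace(ingress_pe):
    """Walk the core tree from the ingress PE; return (delivered, dropped) lists."""
    delivered, dropped = [], []
    queue = [ingress_pe]
    while queue:
        router = queue.pop(0)
        for next_hop in CORE_OIL[router]:
            if next_hop != "MVRF":
                queue.append(next_hop)  # replicate toward the next core router
                continue
            receivers = VRF_OIL[router]
            if receivers:
                delivered.extend(receivers)
            else:
                dropped.append(router)  # no OIL in the MVRF: the packet is discarded
    return delivered, dropped

delivered, dropped = trace("R2")
print("delivered to:", delivered)  # ['R8']
print("dropped at:", dropped)      # ['R5']
```

The result matches the walkthrough: the default MDT delivers a copy to every PE in the VPN, and a PE with no interested receiver simply drops it.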