Source: www.net1980.com
Symptom:
         The OSPF adjacency cannot be established; the neighbor state machine keeps cycling between ExStart and Down.
Alarm messages:
Feb 12 11:54:58.796 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Feb 12 11:54:59.476 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.12 on Vlan2 from LOADING to FULL, Loading Done
Feb 12 11:55:58.795 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from DOWN to DOWN, Neighbor Down: Ignore timer expired
Feb 12 11:58:17.993 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Feb 12 11:58:20.912 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.12 on Vlan2 from LOADING to FULL, Loading Done
Feb 12 11:59:17.992 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from DOWN to DOWN, Neighbor Down: Ignore timer expired
Feb 12 12:01:33.301 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Feb 12 12:01:33.601 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.12 on Vlan2 from LOADING to FULL, Loading Done
Feb 12 12:02:33.300 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from DOWN to DOWN, Neighbor Down: Ignore timer expired
Feb 12 12:04:46.170 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Feb 12 12:04:50.774 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.12 on Vlan2 from LOADING to FULL, Loading Done
Feb 12 12:05:46.169 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from DOWN to DOWN, Neighbor Down: Ignore timer expired
Feb 12 12:08:03.539 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from EXSTART to DOWN, Neighbor Down: Too many retransmissions
Feb 12 12:08:06.179 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.12 on Vlan2 from LOADING to FULL, Loading Done
Feb 12 12:09:03.538 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from DOWN to DOWN, Neighbor Down: Ignore timer expired
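
A log excerpt like the one above can be summarized with a short script rather than read line by line. The following is a minimal sketch, assuming the messages keep the exact %OSPF-5-ADJCHG layout shown here; the function names `summarize_adjchg` and `stuck_in_exstart` are made up for this illustration and are not part of any vendor tool.

```python
import re
from collections import Counter

# Matches %OSPF-5-ADJCHG lines in the layout shown in the log above.
ADJCHG = re.compile(
    r"%OSPF-5-ADJCHG: Process (?P<pid>\d+), Nbr (?P<nbr>\S+) on (?P<intf>\S+) "
    r"from (?P<old>\w+) to (?P<new>\w+), (?P<reason>.+)$"
)


def summarize_adjchg(lines):
    """Count each (neighbor, old state, new state, reason) combination."""
    counts = Counter()
    for line in lines:
        m = ADJCHG.search(line)
        if m:
            key = (m["nbr"], m["old"], m["new"], m["reason"].strip())
            counts[key] += 1
    return counts


def stuck_in_exstart(counts):
    """Neighbors that fell from EXSTART to DOWN after too many retransmissions."""
    return sorted({
        nbr
        for (nbr, old, new, reason) in counts
        if old == "EXSTART" and new == "DOWN"
        and "Too many retransmissions" in reason
    })


# Three lines copied from the log above.
sample = [
    "Feb 12 11:54:58.796 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from EXSTART to DOWN, Neighbor Down: Too many retransmissions",
    "Feb 12 11:54:59.476 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.12 on Vlan2 from LOADING to FULL, Loading Done",
    "Feb 12 11:55:58.795 CCT: %OSPF-5-ADJCHG: Process 88, Nbr 201.109.132.8 on Vlan2 from DOWN to DOWN, Neighbor Down: Ignore timer expired",
]

print(stuck_in_exstart(summarize_adjchg(sample)))  # ['201.109.132.8']
```

Grouping by neighbor and reason makes it clear at a glance which neighbor never gets past ExStart.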
 
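Troubleshooting steps:
I. First, a review of the states an OSPF neighbor goes through while an adjacency forms:
1. Down
2. Init: the initial state
3. Two-way: entered once a router sees its own router ID in the Hello packets sent by the neighbor
4. ExStart: the two routers determine the master/slave relationship, that is, which one sends DBD packets first
5. Exchange: led by the master, the routers exchange DBD packets
6. Loading: the routers request the more detailed information (the LSAs) they are missing
7. Full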
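
The ordering of these seven states can be written down as an enumeration, which makes it easy to ask how far an adjacency got before it fell back. This is only an illustrative sketch: `NbrState` and `furthest_state` are hypothetical names, and the list follows the seven states above (it does not model states outside this list, or vendor spellings such as `2WAY`).

```python
from enum import IntEnum


class NbrState(IntEnum):
    """Neighbor states in the order listed above."""
    DOWN = 1
    INIT = 2
    TWO_WAY = 3
    EXSTART = 4
    EXCHANGE = 5
    LOADING = 6
    FULL = 7


def furthest_state(observed):
    """Return the most advanced state among the observed state names."""
    return max(NbrState[name.upper().replace("-", "_")] for name in observed)


# The failing neighbor in the log only ever shows DOWN and EXSTART.
print(furthest_state(["DOWN", "EXSTART"]).name)  # EXSTART
```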
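II. The logs show that the adjacency with neighbor 201.109.132.8 progresses from step 1 to step 4, cannot go any further, and then starts over from step 1. Because it reaches step 4, a down OSPF link can be ruled out. Stalling at step 4 means that the DBD negotiation is failing. During the ExStart phase, while the master/slave relationship is negotiated through DBD packets, the MTU is also checked: each DBD packet carries the sender's interface MTU, and a router rejects a DBD whose MTU is larger than what its own receiving interface can accept, so the negotiation fails when the two sides do not match. The MTU is checked in DBD packets mainly because DBD packets can be large; with mismatched MTUs they are likely to be dropped, which is why the check is part of ExStart. The next step is therefore to verify that the MTU is consistent on the interfaces that form the OSPF adjacency.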
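
The MTU check described above can be sketched in a few lines. This is a simplified model of the rule in RFC 2328 section 10.6 (a DBD is rejected when its Interface MTU field exceeds what the receiving interface can accept), not router code; `DbdPacket`, `accepts_dbd`, `exstart_can_complete`, and the MTU values 1500 and 1524 are assumptions chosen for illustration, since the source does not give the actual values.

```python
from dataclasses import dataclass


@dataclass
class DbdPacket:
    router_id: str
    interface_mtu: int  # value carried in the DBD "Interface MTU" field


def accepts_dbd(local_interface_mtu: int, dbd: DbdPacket) -> bool:
    """Accept the DBD only if its advertised MTU fits the receiving interface."""
    return dbd.interface_mtu <= local_interface_mtu


def exstart_can_complete(mtu_a: int, mtu_b: int) -> bool:
    """Both routers must accept each other's DBD for ExStart to finish."""
    a_accepts = accepts_dbd(mtu_a, DbdPacket("B", mtu_b))
    b_accepts = accepts_dbd(mtu_b, DbdPacket("A", mtu_a))
    return a_accepts and b_accepts


# Hypothetical values: one side left at 1500, the other raised to 1524.
print(exstart_can_complete(1500, 1500))  # True
print(exstart_can_complete(1500, 1524))  # False
```

In the mismatched case only the side with the smaller MTU rejects the DBD, but that is enough: the other side never receives a valid reply, keeps retransmitting, and eventually tears the neighbor down.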
 
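III. The check confirmed that the MTU values on the two OSPF devices did differ. The peer router also carries some MPLS VPN services, and to keep those services working the MTU had been adjusted on all of its ports. The port in question carries no MPLS VPN traffic, so its MTU should not have been changed. The MTU mismatch between the two sides caused the OSPF adjacency to flap. Because the MTU in DBD packets is checked only during the ExStart phase, changing the MTU while the adjacency stays in the Full state does not break it; the problem above appears only when the adjacency has to be rebuilt, and from then on the adjacency flaps continuously.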
 
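Resolution:
        The port MTU values on the two devices differed, so the OSPF ExStart negotiation could not complete, and the repeated attempts to rebuild the adjacency generated a large volume of logs and alarms. Setting the same MTU on the ports at both ends resolves the problem.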