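On several recent troubleshooting visits to a customer site, I found that a Cisco 6800IA kept rebooting for no apparent reason. The log messages look like this (captured today):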

Jan 11 11:27:22: %SATMGR-SW1-5-FABRIC_PORT_DOWN: SDP down on interface Te1/3/8, connected to FEX 101, uplink 52 (reason: SDP timeout)
Jan 11 11:27:22: %SATMGR-SW1-5-FEX_MODULE_OFFLINE: FEX 101, module 1 offline (reason: all fabric links down)
Jan 11 11:27:22: %SATMGR-SW1-5-FABRIC_PORT_DOWN: SDP down on interface Te1/3/7, connected to FEX 101, uplink 51 (reason: SDP timeout)
Jan 11 11:27:22: %OIR-SW1-6-INSREM: Switch 101 Physical Slot 1 - Module Type LINE_CARD  removed
Jan 11 03:27:28.115: SW1: %HA_EM-6-LOG: Mandatory.go_fabriclinkstatus.tcl: GOLD EEM TCL policy for TestFEXFabricLinkStatus
Jan 11 11:27:28: %CONST_DIAG-SW1-6-RSL_LINK_DOWN: RSL link on module 3 port 7-8 is down
Jan 11 11:32:06: %SATMGR-SW1-5-FABRIC_PORT_UP: SDP up on interface Te1/3/8, connected to FEX 101, uplink 52
Jan 11 11:32:09: %SATMGR-SW1-5-ONLINE: FEX 101 online
Jan 11 11:32:09: %SATMGR-SW1-5-FEX_MODULE_ONLINE: FEX 101, module 1 online
Jan 11 11:32:09: %OIR-SW1-6-INSREM: Switch 101 Physical Slot 1 - Module Type LINE_CARD  inserted
Jan 11 11:32:06.554: %LINK-3-UPDOWN: Interface Port-channel1, changed state to up (FEX-101)
Jan 11 11:32:07: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-channel1, changed state to up (FEX-101)
Jan 11 11:32:12: %SATMGR-SW1-5-FABRIC_PORT_UP: SDP up on interface Te1/3/7, connected to FEX 101, uplink 51
Jan 11 11:32:24: %DIAG-SW1-6-RUN_MINIMUM: Fex 101 Module 1: Running Minimal Diagnostics...
Jan 11 11:32:24: %DIAG-SW1-6-DIAG_OK: Fex 101 Module 1: Passed Online Diagnostics
Jan 11 11:32:25: %OIR-SW1-6-SP_INSCARD: Card inserted in Switch_number = 101, physical slot 1,interfaces are now online
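
To put a number on the outage, the FEX_MODULE_OFFLINE and FEX_MODULE_ONLINE messages in the log above can be paired up. What follows is only a minimal Python sketch, not something that was run on site: the function name `fex_outages` and its `year` parameter are hypothetical (the syslog lines carry no year, so one has to be assumed), and only the two relevant lines from the log are embedded as sample input.

```python
import re
from datetime import datetime

# Two lines copied from the log above; everything else is ignored by the parser anyway.
LOG = """\
Jan 11 11:27:22: %SATMGR-SW1-5-FEX_MODULE_OFFLINE: FEX 101, module 1 offline (reason: all fabric links down)
Jan 11 11:32:09: %SATMGR-SW1-5-FEX_MODULE_ONLINE: FEX 101, module 1 online
"""

# Matches the timestamp (optional milliseconds), the offline/online state and the FEX ID.
EVENT = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{2}:\d{2}:\d{2})(?:\.\d+)?: .*"
    r"%SATMGR-\S+-FEX_MODULE_(?P<state>OFFLINE|ONLINE): FEX (?P<fex>\d+)"
)


def fex_outages(log_text, year=2000):
    """Pair OFFLINE/ONLINE events per FEX.

    Returns a list of (fex_id, offline_time, online_time, seconds) tuples.
    `year` is an assumption, because the log timestamps do not include one.
    """
    went_offline = {}  # fex_id -> time of the last OFFLINE event
    outages = []
    for line in log_text.splitlines():
        m = EVENT.match(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        fex = int(m["fex"])
        if m["state"] == "OFFLINE":
            went_offline[fex] = ts
        elif fex in went_offline:
            start = went_offline.pop(fex)
            outages.append((fex, start, ts, int((ts - start).total_seconds())))
    return outages


for fex, start, end, seconds in fex_outages(LOG):
    print(f"FEX {fex}: offline {start:%H:%M:%S}, online {end:%H:%M:%S}, {seconds} s")
```

For the excerpt above this reports a single outage for FEX 101 of 287 seconds, from 11:27:22 to 11:32:09, so each reboot takes the access switch away for a little under five minutes.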

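I checked the ports and cabling between the 6807XL and the 6800IA, and everything looked fine. I also went through the configuration of the core switch and found nothing unusual.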

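Checking the IOS version showed that the software on the device was fairly old: it had never been upgraded since the device went into service. Could this be a software bug? So I decided to upgrade the system. I won't go through the upgrade steps here.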

After the upgrade, the 6800IA ran fine right up until this morning, with no reboots. Then at noon today it rebooted again. Unbelievable!

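I had already searched Baidu, which turned up nothing useful. With no other option I went to try my luck on Cisco's website, and, like a blind cat stumbling on a dead mouse, I actually found something. Here is the relevant excerpt: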


C6800/IA: FEX may crash when a host port configured as SPAN destination
CSCuo88038
Description
Symptom:
In Catalyst 6800 / IA setup, when a host port is configured as SPAN destination, it may cause crash / reset the FEX.

Logs reported:
%SATMGR-SW1-5-FABRIC_PORT_DOWN: SDP down on interface , connected to FEX , uplink  (reason: link down)
SATMGR-SW1-3-ERR_DUAL_ACTIVE_DETECT_INCAPABLE: channel group  is no longer dual-active detection capable

Conditions:
Issue is seen only after configuring the host port as SPAN Destination port

Workaround:
Move the SPAN destination to the parent/controller switch.


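Roughly, the passage above says that if a port on the 6800IA is configured as the destination port of a port-mirroring (SPAN) session, the 6800IA may reboot.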
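
A quick way to check whether a saved configuration is exposed to this is to look for `monitor session ... destination interface` lines that point at a FEX host port. The following is a rough Python sketch under stated assumptions: the config excerpt is made up for illustration, the function name `span_destinations_on_fex` is hypothetical, and it assumes that FEX host ports can be recognized by a FEX ID in the range 101-199 as the first number of the interface name (FEX 101 in the log above fits that pattern).

```python
import re

# Hypothetical running-config excerpt; the interface names are illustrative only.
CONFIG = """\
monitor session 1 source interface Te1/3/1
monitor session 1 destination interface Gi101/1/0/5
monitor session 2 source vlan 10
monitor session 2 destination interface Gi1/1/2
"""

# Captures the session number, the full interface name and its first number.
DEST = re.compile(
    r"^monitor session (?P<session>\d+) destination interface "
    r"(?P<ifname>[A-Za-z-]+(?P<first>\d+)/\S+)"
)


def span_destinations_on_fex(config_text, fex_ids=range(101, 200)):
    """Return (session, interface) pairs whose SPAN destination sits on a FEX.

    Assumption: an interface belongs to a FEX when the first number in its
    name is one of `fex_ids`.
    """
    hits = []
    for line in config_text.splitlines():
        m = DEST.match(line.strip())
        if m and int(m["first"]) in fex_ids:
            hits.append((int(m["session"]), m["ifname"]))
    return hits


for session, ifname in span_destinations_on_fex(CONFIG):
    print(f"monitor session {session}: destination {ifname} is on a FEX")
```

With the sample input, only session 1 is flagged. Session 2 has its destination on the parent switch, which is the arrangement the workaround in the bug note recommends.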

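And sure enough, before each of these reboots we had configured a monitor session on the 6800IA. At that moment, ten thousand little animals stampeded through my mind...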

To be clear, the reboot does not happen the moment the monitor session is configured. Today it rebooted some time after we had finished a packet capture.

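On an earlier occasion a colleague was also capturing packets on site when the 6800IA rebooted. He asked, "Did my capture just reboot it?" I said, "It's not that delicate, surely not..." Well, now we finally know the packet capture was the cause. 6800IA, what am I supposed to say to you?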


Digging further into the document


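The document lists the affected IOS versions and the versions in which this bug is fixed, but note that those versions do not apply to the SUP6T supervisor engine. I have not yet found out which SUP6T IOS releases are affected or which ones contain the fix. Unfortunately, the SUP6T is exactly what I am running here...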


I have not found a suitable version yet, but at least I now know where the problem lies.