1. Fault Description
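The Sun Fire 4810 (SF4810) at XX Unicom went down unexpectedly, the VCS resources failed to switch over automatically, and the billing collection service was interrupted.
System version: Solaris 10
VCS version: 5.0
VxVM version: 5.0
2. Handling Process
2.1 Check the hardware and boot the system
On-site inspection found faults on IB6, RP0, and RP2: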
sun4810_sc1:SC> showlog -v
Mar 14 09:53:10 sun4810_sc1 Platform.SC: [ID 846693 local0.notice] Device will not be polled
Mar 14 09:53:10 sun4810_sc1 Platform.SC: [ID 752932 local0.notice] PCI I/O Board at /N0/IB6 Device poll caused: sun.serengeti.FailedHwException: I2cComm.readCmd: CBH Port is disabled: IB6.sbbc0.regs.c0 (118000c0)
Mar 14 09:53:10 sun4810_sc1 Platform.SC: [ID 846693 local0.notice] Device will not be polled
Mar 14 09:53:11 sun4810_sc1 Platform.SC: [ID 679592 local0.error]
Mar 14 09:53:11 sun4810_sc1 Platform.SC: [ID 679592 local0.error]
Mar 14 09:53:11 sun4810_sc1 Platform.SC: [ID 898456 local0.error]
/partition0/RP0/dx0:
General Error Status[0x1e] : 0x00010001
AccCPerr [16:16] : 0x1
CPerr [00:00] : 0x1 Control Parity Error
Safari Port Error Status 6[0x25] : 0x00050004
AccIFOv [16:16] : 0x1
AccErr [18:18] : 0x1
SafPar [02:02] : 0x1 Safari input parity error
>>> Safari Port Error Status 7[0x26] : 0x00078004
AccIFOv [16:16] : 0x1
AccErr [18:18] : 0x1
AccIFPar [17:17] : 0x1
FirstError [15:15] : 0x1
SafPar [02:02] : 0x1 Safari input parity error
Mar 14 09:53:14 sun4810_sc1 Platform.SC: [ID 208063 local0.error] [AD] Event: SF4810.ASIC.AR.ADR_PERR.10433006
CSN: 138H2FE9 DomainID: A ADInfo: 1.SCAPP.20.12
Time: Sun Mar 14 09:53:13 GMT+08:00 2010
FRU-List-Count: 1; FRU-PN: 5014953; FRU-SN: 002538; FRU-LOC: RP0
Recommended-Action: Service action required
[AD] Event: SF4810.ASIC.AR.ADR_PERR.10433007
CSN: 138H2FE9 DomainID: A ADInfo: 1.SCAPP.20.12
Time: Sun Mar 14 09:53:13 GMT+08:00 2010
FRU-List-Count: 2; FRU-PN: 5014404; FRU-SN: 011488; FRU-LOC: /N0/IB6
FRU-PN: 5014953; FRU-SN: 014739; FRU-LOC: RP2
Recommended-Action: Service action required
Mar 14 09:53:14 sun4810_sc1 Platform.SC: [ID 789080 local0.crit] A fatal condition is detected on Domain A. Initiating automatic restoration for this domain.
Mar 14 09:53:16 sun4810_sc1 Platform.SC: [ID 187965 local0.error] Data Parity error polling failed. Board will no longer be polled: JtagController.tapWait: CBH Port is disabled: IB6.sdc.b0 (12c000b0)
Mar 14 09:53:38 sun4810_sc1 Platform.SC: [ID 930884 local0.error] [AD] Event: SF4810.ASIC.DX.SAF_IN_PAR_ERR.30233026
CSN: 138H2FE9 DomainID: A ADInfo: 1.SCAPP.20.12
Time: Sun Mar 14 09:53:38 GMT+08:00 2010
FRU-List-Count: 3; FRU-PN: 5014953; FRU-SN: 002538; FRU-LOC: RP0
FRU-PN: 5014404; FRU-SN: 011488; FRU-LOC: /N0/IB6
FRU-PN: 5014953; FRU-SN: 014739; FRU-LOC: RP2
Recommended-Action: Service action required
Mar 14 09:54:07 sun4810_sc1 Platform.SC: [ID 620190 local0.notice] A: CycleKeyswitch: Initiating keyswitch: off, domain A.
Mar 14 09:56:16 sun4810_sc1 Platform.SC: [ID 299301 local0.notice] A: CycleKeyswitch: Initiating keyswitch: on, domain A.
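The suspect components (IB6, RP0, RP2) can be read off the FRU-LOC fields of the [AD] events above. As a rough sketch only, and not part of the original procedure, the following Python snippet tallies the FRU entries in saved showlog output. The function name summarize_frus and the idea of saving the log to text are assumptions made for illustration; the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches entries such as "FRU-PN: 5014953; FRU-SN: 002538; FRU-LOC: RP0"
FRU_PATTERN = re.compile(
    r"FRU-PN:\s*(?P<pn>[^;\s]+);\s*FRU-SN:\s*(?P<sn>[^;\s]+);\s*FRU-LOC:\s*(?P<loc>\S+)"
)

def summarize_frus(log_text):
    """Count how often each (location, part number, serial number) is listed.

    Hypothetical helper: it only does text matching on saved showlog output
    and does not talk to the system controller.
    """
    counts = Counter()
    for match in FRU_PATTERN.finditer(log_text):
        key = (match.group("loc"), match.group("pn"), match.group("sn"))
        counts[key] += 1
    return counts

# FRU lines copied from the showlog output in this report
sample = """\
FRU-List-Count: 1; FRU-PN: 5014953; FRU-SN: 002538; FRU-LOC: RP0
FRU-List-Count: 2; FRU-PN: 5014404; FRU-SN: 011488; FRU-LOC: /N0/IB6
FRU-PN: 5014953; FRU-SN: 014739; FRU-LOC: RP2
FRU-List-Count: 3; FRU-PN: 5014953; FRU-SN: 002538; FRU-LOC: RP0
FRU-PN: 5014404; FRU-SN: 011488; FRU-LOC: /N0/IB6
FRU-PN: 5014953; FRU-SN: 014739; FRU-LOC: RP2
"""

if __name__ == "__main__":
    for (loc, pn, sn), n in sorted(summarize_frus(sample).items()):
        print(f"{loc}: part {pn}, serial {sn}, listed {n} time(s)")
```

A FRU that appears in several events is a stronger replacement candidate, but the count is only a hint; the Recommended-Action field and the on-site hardware check remain the basis for the decision.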
Boot the system:
sun4810_sc1:A> break
This will suspend Solaris in domain A.