Analyzing and Clearing a False Alarm on ORACLE ODA X9-2 (Affects: /SYS/MB)

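On several of the ORACLE ODA X9-2 systems I maintain, one compute node raised the following alert: Description: The Service Processor power-on self test has detected a problem. Probability: 100, UUID: cd1ebbdf-f099-61de-ca44-ef646defe034, Resource: /SYS/MB. Judging by the description alone, the alert looks serious, but the host was in fact running normally, and after analysis I concluded that it was a false alarm. ILOM, the hardware management platform on ORACLE ODA, provides an interface for clearing faults. After clearing the fault with the steps below, the alert disappeared, and under continued observation the system has kept running normally.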

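The steps are as follows:

1. View the alert information:

Click the alert to view its details:

2. View the alert from the command-line interface (serial numbers have been masked, so do not compare them)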

[root@aaadb1 ~]# ipmitool sunoem cli
Connected. Use ^D to exit.
-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

faultmgmtsp> fmadm faulty
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T   Critical

Problem Status           : open
Diag Engine              : fdd 1.0
System                  
   Manufacturer          : Oracle Corporation
   Name                  : ORACLE SERVER X9-2L
   Part_Number           : 7603
   Serial_Number         : 23000

System Component        
   Firmware_Manufacturer : Oracle Corporation
   Firmware_Version      : (ILOM)5.1.0.23 r147470,(BIOS)62070300
   Firmware_Release      : (ILOM)2022.09.03,(BIOS)2022.08.17

----------------------------------------
Suspect 1 of 1
   Problem class  : fault.chassis.device.sppost
   Certainty      : 100%
   Affects        : /SYS/MB
   Status         : faulted

   FRU                 
      Status            : faulty
      Location          : /SYS/MB
      Manufacturer      : Oracle Corporation
      Name              : ASM,MTHRBD,2U
      Part_Number       : 820000
      Revision          : 12
      Serial_Number     : 4650000
      Chassis          
         Manufacturer   : Oracle Corporation
         Name           : ORACLE SERVER X9-2L
         Part_Number    : 76000000
         Serial_Number  : 2300000

Description : The Service Processor power-on self test has detected a
              problem.

Response    : The service-required LED may be illuminated on the affected
              FRU and chassis.

Impact      : The Service Processor may not be able to perform necessary
              functions to power on, monitor, or manage the system.

Action      : Please refer to the associated reference document at
              http://support.oracle.com/msg/ILOM-8000-4T for the latest
              service procedures and policies regarding this diagnosis.
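When several nodes show the same alert, it can help to pull the UUID and the affected resource out of a saved copy of the `fmadm faulty` output instead of copying them by hand. The following is a minimal Python sketch under stated assumptions: the summary row has the layout `<date>/<time> <uuid> <msgid> <severity>` seen in the listing above, and the function name `parse_faults` and the `SAMPLE` text are hypothetical helpers for illustration, not part of ILOM.

```python
import re

# Assumed layout of the summary row, taken from the listing above:
#   2023-07-31/08:55:39 <uuid> ILOM-8000-4T   Critical
ROW = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2}/\d{2}:\d{2}:\d{2})\s+"
    r"(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s+"
    r"(?P<msgid>\S+)\s+(?P<severity>\S+)$"
)
# Assumed layout of the suspect detail line: "Affects        : /SYS/MB"
AFFECTS = re.compile(r"^Affects\s*:\s*(?P<path>\S+)$")


def parse_faults(text):
    """Return one dict per fault found in saved `fmadm faulty` output.

    Only the first "Affects" line after each summary row is recorded.
    """
    faults = []
    for raw in text.splitlines():
        line = raw.strip()
        row = ROW.match(line)
        if row:
            faults.append({**row.groupdict(), "affects": None})
            continue
        affects = AFFECTS.match(line)
        if affects and faults and faults[-1]["affects"] is None:
            faults[-1]["affects"] = affects.group("path")
    return faults


# Shortened excerpt of the output shown above.
SAMPLE = """\
2023-07-31/08:55:39 cd1ebbdf-f099-61de-ca44-ef646defe034 ILOM-8000-4T   Critical

Suspect 1 of 1
   Problem class  : fault.chassis.device.sppost
   Certainty      : 100%
   Affects        : /SYS/MB
   Status         : faulted
"""

if __name__ == "__main__":
    for fault in parse_faults(SAMPLE):
        print(fault["uuid"], fault["msgid"], fault["severity"], fault["affects"])
```

The sketch only reads text that has already been captured; it does not connect to ILOM itself.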

3. Clear the alert

For the clearing procedure, refer to PSH Procedural Article for ILOM-Based Diagnosis (Doc ID 1155200.1). The 'fmadm repair' command is all that is needed.

 
  • Enter the fault management shell.

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell (y/n) ? y

faultmgmtsp>

  • Use 'fmadm repair' to clear the fault.

Rather than the UUID, the FRU path (/SYS/FANBD/FM0) could also be used.


Example 3
Example 3 shows the 'fmadm repaired' command required after the suspect FRU has been replaced. Using the UUID from the 'fmadm faulty' output in Example 1 above, the command would be:
 

faultmgmtsp> fmadm repair 9df39f93-f356-6d26-e081-e4f3a9872c2f


Example 4

Example 4 shows the 'fmadm repaired' command required after the FRU has been replaced. This example shows the FRU Path from Example 2 above being used. The command would be:
 

faultmgmtsp> fmadm repair /SYS/MB

The actual session log is as follows (using the UUID of the alert event):

faultmgmtsp> fmadm repair cd1ebbdf-f099-61de-ca44-ef646defe034
faultmgmtsp> fmadm faulty
No faults found
faultmgmtsp> exit
-> exit
Disconnected
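Since 'fmadm repair' accepts either the event UUID or the FRU path, a small helper can guard against pasting a malformed target when the same cleanup is repeated on several nodes. This is a rough Python sketch, not an Oracle tool: the names `repair_command` and `is_cleared` are hypothetical, the FRU path pattern (`/SYS/...`) is an assumption based on the paths that appear in this article, and the code only builds and checks strings without contacting ILOM.

```python
import re

# UUID form as printed by `fmadm faulty` in the session above.
UUID_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")
# Assumed FRU path form, e.g. /SYS/MB or /SYS/FANBD/FM0.
FRU_RE = re.compile(r"/SYS(?:/[A-Za-z0-9_]+)+")


def repair_command(target):
    """Build the `fmadm repair` line for a UUID or an FRU path."""
    target = target.strip()
    if UUID_RE.fullmatch(target) or FRU_RE.fullmatch(target):
        return f"fmadm repair {target}"
    raise ValueError(f"not a UUID or /SYS FRU path: {target!r}")


def is_cleared(faulty_output):
    """True if saved `fmadm faulty` output reports no remaining faults."""
    return faulty_output.strip() == "No faults found"


if __name__ == "__main__":
    print(repair_command("cd1ebbdf-f099-61de-ca44-ef646defe034"))
    print(repair_command("/SYS/MB"))
    print(is_cleared("No faults found\n"))
```

The resulting command line still has to be typed or pasted into the `faultmgmtsp>` prompt as in the log above.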
