Customer report: on a SUN T5220, the rear-panel alarm indicator and the front-panel FAN indicator are both lit.


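1/ Log in to the serial console. By default it lands at the “sc>” prompt (this session is logged in as the admin user; someone had logged in earlier).

Note: if no admin user exists, log in to the serial console as root (initial password: changeme) and create the admin user: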

-> create /SP/users/admin

password: admin

-> set /SP/users/admin role=Administrator

-> set /SP/users/admin cli_mode=alom

Once the user has been created, log out:

-> exit


2/ Since we are at the “sc>” prompt, start with “help”:

sc> help

Available commands

------------------

Power and Reset control commands:

 powercycle [-y] [-f]

 poweroff [-y] [-f]

 poweron [-c] [FRU]

 reset [-y] [-c] [-d] [-f] [-n]

Console commands:

 break [-y] [-c]

 console [-f]

 consolehistory [-b lines|-e lines|-v] [-g lines] [boot|run]

Boot control commands:

 bootmode [normal|reset_nvram|bootscript="string"|config="configname"]

 setkeyswitch [-y] <normal|stby|diag|locked>

 showkeyswitch

Locator LED commands:

 setlocator [on|off]

 showlocator

Status and Fault commands:

 clearasrdb

 clearfault <UUID>

 disablecomponent [asr-key]

 enablecomponent [asr-key]

 removefru [-y] <FRU>

 setfru -c [data]

 showcomponent [asr-key]

 showenvironment

 showfaults [-v]

 showfru [FRU]

 showlogs [-b lines|-e lines|-v] [-g lines] [-p logtype[r|p]]

 shownetwork [-v]

 showplatform [-v]

 showpower [-v]

ALOM Configuration commands:

 setdate <[mmdd]HHMM | mmddHHMM[cc]yy][.SS]>

 setsc [param] [value]

 setupsc

 showdate

 showhost [version]

 showsc [-v] [param]

ALOM Administrative commands:

 flashupdate <-s IPaddr -f pathname> [-v] [-y] [-c]

 help [command]

 logout

 password

 resetsc [-y]

 restartssh [-y]

 setdefaults [-y]

 ssh-keygen [-l|-r] <-t {rsa|dsa}>

 showusers [-g lines]

 useradd <username>

 userclimode <username> <default|alom>

 userdel [-y] <username>

 userpassword <username>

 userperm <username> [c][u][a][r][o][s]

 usershow [username]


3/ From the list above, showfaults [-v] displays the current system faults.

sc> showfaults

Last POST Run: Fri Jul 20 05:12:03 2012


Post Status: Passed all devices

 ID FRU               Fault

  1 /SYS/FANBD0/FM1   SP detected fault: TACH at /SYS/FANBD0/FM1/F1 has reached low non-recoverable threshold.

  2 /SYS/FANBD0/FM1   SP detected fault: TACH at /SYS/FANBD0/FM1/F0 has reached low non-recoverable threshold.

  3 /SYS/FANBD0/FM0   SP detected fault: TACH at /SYS/FANBD0/FM0/F0 has reached low non-recoverable threshold.

  4 /SYS/FANBD0/FM0   SP detected fault: TACH at /SYS/FANBD0/FM0/F1 has reached low non-recoverable threshold.

  5 /SYS/MB           Host detected fault MSGID: PCIEX-8000-0A  UUID: 31b27bc3-0aff-6aea-c326-d2459bb1ff51


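4/ Locate the fault: per the output above, both fan modules FM0 and FM1 are faulted (two TACH faults each). “/SYS/MB” is the motherboard: Predictive Self-Healing diagnosed a motherboard fault, but the component actually at fault is the FAN.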
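
The reading of the fault table in step 4 can be sketched in Python. This is only an illustrative sketch: the function names (parse_showfaults, group_by_source) are hypothetical, the sample text is copied from the showfaults output above, and the row layout (ID, FRU path, message) is assumed to match that sample rather than any documented format.

```python
import re

# Sample rows copied from the `showfaults` output above.
SAMPLE = """\
 ID FRU               Fault
  1 /SYS/FANBD0/FM1   SP detected fault: TACH at /SYS/FANBD0/FM1/F1 has reached low non-recoverable threshold.
  2 /SYS/FANBD0/FM1   SP detected fault: TACH at /SYS/FANBD0/FM1/F0 has reached low non-recoverable threshold.
  3 /SYS/FANBD0/FM0   SP detected fault: TACH at /SYS/FANBD0/FM0/F0 has reached low non-recoverable threshold.
  4 /SYS/FANBD0/FM0   SP detected fault: TACH at /SYS/FANBD0/FM0/F1 has reached low non-recoverable threshold.
  5 /SYS/MB           Host detected fault MSGID: PCIEX-8000-0A  UUID: 31b27bc3-0aff-6aea-c326-d2459bb1ff51
"""

# Assumed row layout: numeric ID, FRU path starting with "/", then the message.
ROW = re.compile(r"^\s*(\d+)\s+(/\S+)\s+(.*\S)\s*$")


def parse_showfaults(text):
    """Return a list of (id, fru, message) tuples, one per fault row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # the header line has no numeric ID, so it is skipped
            rows.append((int(m.group(1)), m.group(2), m.group(3)))
    return rows


def group_by_source(rows):
    """Group distinct FRUs by who reported the fault ("SP" or "Host")."""
    groups = {"SP": [], "Host": []}
    for _id, fru, message in rows:
        source = message.split(" ", 1)[0]  # first word of the message
        if source in groups and fru not in groups[source]:
            groups[source].append(fru)
    return groups


if __name__ == "__main__":
    for source, frus in group_by_source(parse_showfaults(SAMPLE)).items():
        print(source, frus)
```

Grouping by reporter separates the two fan modules flagged by the SP from the single host-reported /SYS/MB entry, which is the distinction step 4 relies on.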


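5/ Fix the fault:

1) Back up data: advise the customer to back up application and business data.

2) Connect to the console: attach a serial cable to the machine and open the serial console.

3) Power off the system: in the serial console, run: -> stop /SYS

4) Wear an antistatic wrist strap: confirm it is on and attached to an unpainted part of the rack.

5) Disconnect power: unplug both the primary and the secondary power supply.

6) Record the cabling: note which cable goes where, then unplug the cables.

7) Lift the latch and open the top cover door.

8) Remove the faulty fans.

9) Place the removed fans on an antistatic surface.

10) Take the new fans out of their antistatic packaging.

11) Install the fans in the correct orientation.

12) Close the top cover door.

13) Reconnect the cables according to the recorded layout.

14) Apply power and let the self-test run; confirm that the hardware is ready.

15) Start the system: in the console, run: -> start /SYS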


6/ Check the faults: log in as root and use the “show /SP/faultmgmt” command.

-> show /SP/faultmgmt


/SP/faultmgmt

   Targets:

       shell

       0 (/SYS/MB)


   Properties:


   Commands:

       cd

       show


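7/ The fan faults are gone from the system, but the alarm LED on the panel is still lit, and the fault listing under root still shows “0 (/SYS/MB)”. As explained above, “/SYS/MB” is a false report against the motherboard (the real fault was the FAN), so this is not a genuine alarm and can be cleared.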
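
The check in step 7 (which faults are still listed under /SP/faultmgmt) can be sketched as below. This is only a sketch under assumptions: outstanding_faults is a hypothetical helper, the sample is copied from the output in step 6 with blank lines dropped, and the “index (FRU path)” entry format is taken from that sample only.

```python
import re

# Sample copied from the `show /SP/faultmgmt` output in step 6 (blank lines dropped).
SAMPLE = """\
/SP/faultmgmt
   Targets:
       shell
       0 (/SYS/MB)
   Properties:
   Commands:
       cd
       show
"""

# Assumed fault entry format: "<index> (<FRU path>)"; targets such as "shell" do not match.
ENTRY = re.compile(r"^\s*(\d+)\s+\((/[^)]*)\)\s*$")


def outstanding_faults(text):
    """Return the FRU paths listed as fault entries under 'Targets:'."""
    frus = []
    in_targets = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.endswith(":"):
            # A section header; only entries under "Targets:" are of interest.
            in_targets = stripped == "Targets:"
            continue
        if in_targets:
            m = ENTRY.match(line)
            if m:
                frus.append(m.group(2))
    return frus


if __name__ == "__main__":
    print(outstanding_faults(SAMPLE))
```

An empty result corresponds to a listing whose Targets section contains only “shell”.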


8/ Clear the alarm indicator with “set /SYS/MB clear_fault_action=true”:

-> set /SYS/MB clear_fault_action=true

Are you sure you want to clear /SYS/MB (y/n)? y

Set 'clear_fault_action' to 'true'
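
If more than one stale entry had to be cleared, the command from step 8 could be generated per FRU. The helper below is a hypothetical sketch, not part of ILOM; it only formats the same set command shown above, and the check that paths start with /SYS is an assumption made for illustration. As the transcript shows, ILOM asks for a y/n confirmation, so the generated lines are meant to be entered by hand at the -> prompt.

```python
def clear_fault_commands(frus):
    """Build one `set <FRU> clear_fault_action=true` line per FRU path."""
    commands = []
    for fru in frus:
        # Assumption for this sketch: fault targets live under /SYS.
        if not fru.startswith("/SYS"):
            raise ValueError(f"unexpected FRU path: {fru!r}")
        commands.append(f"set {fru} clear_fault_action=true")
    return commands


if __name__ == "__main__":
    for cmd in clear_fault_commands(["/SYS/MB"]):
        print("->", cmd)
```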


9/ Check the faults again with “show /SP/faultmgmt”: the motherboard alarm is no longer listed, and the panel alarm LED is off.

-> show /SP/faultmgmt


/SP/faultmgmt

   Targets:

       shell


   Properties:


   Commands:

       cd

       show


10/ Log out. Done!

-> exit