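I had only just arrived at the office this morning when a customer in Nanjing called to say that their core membership database had gone down, and asked me to take a look remotely.
Before I had even had a chance to read the newspaper, I fired up my computer and connected remotely to the customer's server to diagnose the problem.
The customer's production environment is Oracle 11.2.0.3.0 on AIX 6.1, and the failure occurred at around 1:40 a.m.
To avoid exposing any of the customer's private information, the database instance name has been replaced.
The detailed analysis follows:
1. Analysis of the database alert.log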
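When an alert.log excerpt is as repetitive as the one that follows, it can help to first tally which ORA- and TNS- codes appear and how often. The snippet below is only a minimal sketch of that idea, not part of the original diagnosis; the names `ERROR_CODE` and `count_error_codes` are hypothetical, and it assumes the log text has already been read into a string.

```python
import re
from collections import Counter

# Matches Oracle-style error codes such as ORA-03113 or TNS-12535.
ERROR_CODE = re.compile(r"\b(ORA|TNS)-(\d{1,5})\b")


def count_error_codes(log_text: str) -> Counter:
    """Count each distinct ORA-/TNS- code found in alert.log text."""
    counts = Counter()
    for line in log_text.splitlines():
        for prefix, number in ERROR_CODE.findall(line):
            # Normalise ORA-3136 and ORA-03136 to the same five-digit form.
            counts[f"{prefix}-{int(number):05d}"] += 1
    return counts


if __name__ == "__main__":
    # A few lines in the style of the excerpt, used purely as sample input.
    sample = """\
ORA-03113: end-of-file on communication channel
TNS-12535: TNS:operation timed out
WARNING: inbound connection timed out (ORA-3136)
TNS-12535: TNS:operation timed out
"""
    for code, n in count_error_codes(sample).most_common():
        print(code, n)
```

Codes that are printed without an ORA- or TNS- prefix, such as the 12170 in "Fatal NI connect error 12170", are deliberately not counted by this sketch.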
Mon Jan 05 01:40:50 2015
WARNING: ASM communication error: op 18 state 0x50 (3113)
ERROR: slave communication error with ASM
NOTE: Deferred communication with ASM instance
Errors in file /u01/app/oracle/diag/rdbms/test5/test5/trace/test5_ora_16581034.trc:
ORA-03113: end-of-file on communication channel
Process ID:
Session ID: 288 Serial number: 5649
NOTE: deferred map free for map id 4422
Mon Jan 05 01:40:55 2015
NOTE: ASMB terminating
Mon Jan 05 01:40:55 2015
***********************************************************************
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
Time: 05-JAN-2015 01:40:55
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=test2)(PORT=64460))
WARNING: inbound connection timed out (ORA-3136)
Mon Jan 05 01:40:55 2015
***********************************************************************
Fatal NI connect error 12170.
VERSION INFORMATION:
TNS for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
Time: 05-JAN-2015 01:40:55
Tracing not turned on.
Tns error struct:
ns main err code: 12535
TNS-12535: TNS:operation timed out
ns secondary err code: 12606
nt main err code: 0
nt secondary err code: 0
nt OS err code: 0
Mon Jan 05 01:40:55 2015
***********************************************************************
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=test2)(PORT=64530))
Errors in file /u01/app/oracle/diag/rdbms/test5/test5/trace/test5_asmb_5898342.trc:
ORA-15064: communication failu