Instance Down Caused by Inaccessible Storage

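This post describes an incident in which a database instance went down because it could not access its storage. Connections failed with ORA-12505, and `crs_stat -t` showed the instance as OFFLINE. The logs show that LGWR hit I/O errors and terminated the instance. The voting disk (disk heartbeat) and the storage paths also went offline, which points to a storage problem as the cause of the outage. The fix was to start the instance and relocate the services.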
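I. Symptom:
Connecting to the database from SQL Developer failed with ORA-12505: TNS:listener does not currently know of SID given in connect descriptor
II. Investigation:
1. lsnrctl status showed that the listener was running normally.
2. crs_stat -t showed that both the target state and the current state of ora....d1.inst were OFFLINE, and ps -ef | grep pmon confirmed that instance 1 was indeed down (only the ASM pmon process was running).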
$ crs_stat -t                                                                                                                                                                                           
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.exprd.db   application    ONLINE    ONLINE    trsendb2     
ora.....exp.cs application    ONLINE    ONLINE    trsendb1     
ora....rd1.srv application    ONLINE    ONLINE    trsendb1     
ora....d1.inst application    OFFLINE   OFFLINE   trsendb1    
ora.....flt.cs application    ONLINE    ONLINE    trsendb1     
ora....rd1.srv application    ONLINE    ONLINE    trsendb1      
ora....rd2.srv application    ONLINE    ONLINE    trsendb1     
ora.....mdm.cs application    ONLINE    ONLINE    trsendb2     
ora....rd1.srv application    ONLINE    ONLINE    trsendb1     
ora....rd2.srv application    ONLINE    ONLINE    trsendb2       
ora....rd1.srv application    ONLINE    ONLINE    trsendb1     
ora.....ord.cs application    ONLINE    ONLINE    trsendb2     
ora....rd2.srv application    ONLINE    ONLINE    trsendb1     
ora.....pbl.cs application    ONLINE    ONLINE    trsendb1     
ora....rd1.srv application    ONLINE    ONLINE    trsendb1     
ora.....rpt.cs application    ONLINE    ONLINE    trsendb2     
ora....rd2.srv application    ONLINE    ONLINE    trsendb1     
ora.....rut.cs application    ONLINE    ONLINE    trsendb1    
ora.....stl.cs application    ONLINE    ONLINE    trsendb2     
ora....rd2.srv application    ONLINE    ONLINE    trsendb1     
ora....SM1.asm application    ONLINE    ONLINE    trsendb1     
ora....R1.lsnr application    ONLINE    ONLINE    trsendb1     
ora....vr1.gsd application    ONLINE    ONLINE    trsendb1     
ora....vr1.ons application    ONLINE    ONLINE    trsendb1     
ora....vr1.vip application    ONLINE    ONLINE    trsendb1     
ora....SM2.asm application    ONLINE    ONLINE    trsendb2     
ora....R2.lsnr application    ONLINE    ONLINE    trsendb2     
ora....vr2.gsd application    ONLINE    ONLINE    trsendb2     
ora....vr2.ons application    ONLINE    ONLINE    trsendb2     
ora....vr2.vip application    ONLINE    ONLINE    trsendb2
   
$ ps -ef | grep pmon
  oracle  3222     1  0  Mar 12  ?         9:09 asm_pmon_+ASM1
  oracle  2147  1415  0 17:14:53 pts/1     0:00 grep pmon
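
The two checks above can be scripted so that they work on saved output as well as on a live node. The following is a minimal sketch, not part of the original troubleshooting session: the function names `offline_resources` and `db_pmon_instances` are made up for illustration, and both functions simply read the command output on standard input.

```shell
# List resources whose Target or State column is not ONLINE.
# Reads `crs_stat -t` output on stdin and prints: name, target, state.
offline_resources() {
  awk 'NF >= 4 && $1 ~ /^ora\./ && ($3 != "ONLINE" || $4 != "ONLINE") {
         print $1, $3, $4
       }'
}

# List database instances that have a running pmon process.
# Reads `ps -ef` output on stdin; ASM instances (asm_pmon_*) are ignored.
db_pmon_instances() {
  awk '$NF ~ /^ora_pmon_/ { sub(/^ora_pmon_/, "", $NF); print $NF }'
}

# Typical use on a cluster node (assumes crs_stat is on the PATH):
#   crs_stat -t | offline_resources
#   ps -ef | db_pmon_instances
```

Run against the output shown above, the first function would report only ora....d1.inst and the second would print nothing, which matches the manual reading.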

3. Logs
OS log
Apr  9 14:37:16 trsen01 sshd[1068]: SSH: Server;LType: Throughput;Remote: 192.168.8.199-56261;IN: 5112;OUT: 1812;Duration: 16.9;tPut_in: 302.7;tPut_out: 107.3
Apr 10 14:45:19 trsen01 vmunix: class : tgtpath, instance 8
Apr 10 14:45:19 trsen01 vmunix: Target path (class=tgtpath, instance=8) has gone offline.  The target path h/w path is 0/0/0/5/0/0/1.0x50001fe1501c8e0a
Apr 10 14:45:19 trsen01 vmunix:
Apr 10 14:45:26 trsen01 vmunix: class : tgtpath, instance 6
Apr 10 14:45:26 trsen01 vmunix: class : tgtpath, instance 7
Apr 10 14:45:26 trsen01 vmunix: Target path (class=tgtpath, instance=6) has gone offline.  The target path h/w path is 0/0/0/5/0/0/1.0x50001fe1501c8e0e
Apr 10 14:45:26 trsen01 vmunix: Target path (class=tgtpath, instance=7) has gone offline.  The target path h/w path is 0/0/0/5/0/0/1.0x50001fe1501c8e0f
Apr 10 14:45:26 trsen01 vmunix:
Apr 10 14:45:27 trsen01 vmunix: class : tgtpath, instance 2
Apr 10 14:45:27 trsen01 vmunix: Target path (class=tgtpath, instance=2) has gone offline.  The target path h/w path is 0/0/0/5/0/0/0.0x50001fe1501c8e0c
Apr 10 14:45:27 trsen01 vmunix: class : tgtpath, instance 3
Apr 10 14:45:27 trsen01 vmunix: Target path (class=tgtpath, instance=3) has gone offline.  The target path h/w path is 0/0/0/5/0/0/0.0x50001fe1501c8e0d
Apr 10 14:45:27 trsen01 vmunix: class : tgtpath, instance 4
Apr 10 14:45:27 trsen01 vmunix: Target path (class=tgtpath, instance=4) has gone offline.  The target path h/w path is 0/0/0/5/0/0/0.0x50001fe1501c8e08
Apr 10 14:45:27 trsen01 vmunix: class : tgtpath, instance 5
Apr 10 14:45:27 trsen01 vmunix: Target path (class=tgtpath, instance=5) has gone offline.  The target path h/w path is 0/0/0/5/0/0/0.0x50001fe1501c8e09
Apr 10 14:45:28 trsen01 vmunix: class : tgtpath, instance 9
Apr 10 14:45:28 trsen01 vmunix: Target path (class=tgtpath, instance=9) has gone offline.  The target path h/w path is 0/0/0/5/0/0/1.0x50001fe1501c8e0b
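
To see at a glance how many target paths went offline, the syslog messages can be grouped by the leading part of the h/w path. This is only a sketch: the function name `offline_paths_by_prefix` is hypothetical, and treating the part before the dot as the identifier of one adapter port is an assumption about the path layout, not something the log states.

```shell
# Count "has gone offline" messages per h/w path prefix.
# Reads syslog text on stdin; the prefix is everything before the first dot
# of the h/w path (assumed here to identify one adapter port).
offline_paths_by_prefix() {
  sed -n 's/.*has gone offline\. *The target path h\/w path is \([^ ]*\).*/\1/p' |
    awk -F'[.]' '{ count[$1]++ } END { for (p in count) print p, count[p] }' |
    LC_ALL=C sort
}

# Typical use (the syslog location is an assumption, adjust as needed):
#   offline_paths_by_prefix < /var/adm/syslog/syslog.log
```

If every prefix shows its paths going offline within the same few seconds, as in the excerpt above, the problem is more likely on the storage side than on a single adapter.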
Alert log
Thu Apr 10 14:43:46 EAT 2014
Errors in file /u01/app/oracle/product/admin/trsendb/bdump/trsendb1_lgwr_3473.trc:
ORA-00340: IO error processing online log 5 of thread 1
ORA-00345: redo log write error block 436153 count 1
ORA-00312: online log 5 thread 1: '+TDBASM2/trsendb/onlinelog/group_5.5776.752860931'
ORA-65535: Message 65535 not found;  product=RDBMS; facility=ORA
ORA-00345: redo log write error block 436153 count 1
ORA-00312: online log 5 thread 1: '+TDBASM2/trsendb/onlinelog/group_5.5777.752860951'
ORA-65535: Message 65535 not found;  product=RDBMS; facility=ORA
LGWR: terminating instance due to error 340
Thu Apr 10 14:43:46 EAT 2014
Trace dumping is performing id=[cdmp_20140410144346]
Thu Apr 10 14:43:55 EAT 2014
Termination issued to instance processes. Waiting for the processes to exit
Thu Apr 10 14:44:01 EAT 2014
Instance termination failed to kill one or more processes
Instance terminated by LGWR, pid = 3473==================> the instance is terminated
Thu Apr 10 15:15:49 EAT 2014
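
The ORA-00312 lines name the redo log members that could not be written. A small filter makes it easy to see that both members of group 5 sit in the same disk group. This is an illustrative sketch; `failed_redo_members` is a made-up name and the function only parses alert log text given on standard input.

```shell
# Print "group thread member" for every ORA-00312 line, without duplicates.
# Reads alert log text on stdin.
failed_redo_members() {
  sed -n "s/^ORA-00312: online log \([0-9]*\) thread \([0-9]*\): '\(.*\)'.*/\1 \2 \3/p" |
    LC_ALL=C sort -u
}

# Typical use (the alert log file name is an assumption):
#   failed_redo_members < alert_trsendb1.log
```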
LGWR trace file
*** 2014-04-10 14:43:43.930=============> at 14:43:43, LGWR reports an I/O failure
Warning: log write time 820ms, size 2KB
WARNING: IO Failed.  au:107090 diskname:/dev/rdisk/asm5disk
     rq:9ffffffffd0018e8 buffer:c000000100640800 au_offset(bytes):189440 iosz:1024 operation:1
     status:2
WARNING: IO Failed.  au:106889 diskname:/dev/rdisk/asm10disk
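
The trace names the ASM disks on which the writes failed. The sketch below pulls out the distinct device names so they can be compared with the paths reported offline by the OS. The function name `failed_disks` is hypothetical, and it assumes the `WARNING: IO Failed. ... diskname:` line format shown above.

```shell
# Print the distinct disk names from "IO Failed" warnings.
# Reads trace file text on stdin.
failed_disks() {
  sed -n 's/.*IO Failed.*diskname:\([^ ]*\).*/\1/p' | LC_ALL=C sort -u
}

# Typical use, with the trace file named in the alert log:
#   failed_disks < /u01/app/oracle/product/admin/trsendb/bdump/trsendb1_lgwr_3473.trc
```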
ocssd log
[    CSSD]2014-04-10 14:45:26.070 [8] >WARNING: clssnmDiskPMT: voting device offline at 50% fatal, termination in 99613 ms, disk (0//dev/rdisk/votingdisk)==> the voting disk (disk heartbeat) has a problem
[    CSSD]2014-04-10 14:46:16.022 [8] >WARNING: clssnmDiskPMT: voting device offline at 75% fatal, termination in 49661 ms, disk (0//dev/rdisk/votingdisk)
[    CSSD]2014-04-10 14:46:46.590 [8] >WARNING: clssnmDiskPMT: voting device offline at 90% fatal, termination in 19093 ms, disk (0//dev/rdisk/votingdisk)
[    CSSD]2014-04-10 14:46:47.600 [8] >WARNING: clssnmDiskPMT: voting device offline at 90% fatal, termination in 18083 ms, disk (0//dev/rdisk/votingdisk)
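
Each clssnmDiskPMT warning carries a countdown to node termination. To follow that countdown without reading the raw log, the timestamp, percentage, and remaining milliseconds can be extracted as below. This is a sketch under the assumption that the messages have exactly the format shown above; `voting_disk_warnings` is a made-up name.

```shell
# Print "date time percent remaining_ms" for each voting-disk warning.
# Reads ocssd log text on stdin.
voting_disk_warnings() {
  sed -n 's/^\[ *CSSD\]\([0-9-]* [0-9:.]*\) .*voting device offline at \([0-9]*\)% fatal, termination in \([0-9]*\) ms.*/\1 \2 \3/p'
}

# Typical use (the log file name is an assumption):
#   voting_disk_warnings < ocssd.log
```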

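Putting the logs together:
=> 14:43:43.930: LGWR cannot write the redo log and hits I/O errors
=> 14:44:01: Instance terminated by LGWR
=> 14:45:19 to 14:45:28: the OS log records "class : tgtpath, instance x" target paths going offline
=> 14:45:26.070: the voting disk (disk heartbeat) is reported offline
=> Around 15:00: developers report that they cannot connect to the database

Based on this preliminary log analysis, the database could not access its storage, and that brought the instance down.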
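
The gaps between these events are easier to judge as numbers. The helper below is a small sketch that converts the timestamps to seconds and subtracts them. It assumes both timestamps fall on the same day and that the logs share one clock; the function names are made up for illustration.

```shell
# Convert HH:MM:SS (fractional seconds are truncated) to seconds since midnight.
hms_to_secs() {
  echo "$1" | awk -F: '{ printf "%d\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Seconds elapsed from the first timestamp to the second (same day assumed).
elapsed() {
  echo $(( $(hms_to_secs "$2") - $(hms_to_secs "$1") ))
}

elapsed 14:43:43 14:44:01   # LGWR I/O failure -> instance terminated: prints 18
elapsed 14:43:43 14:45:19   # LGWR I/O failure -> first path-offline message: prints 96
```

In other words, LGWR saw the failed writes about a minute and a half before the OS logged the first offline path.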

4. Resolution
Start the instance => srvctl start instance -d trsendb -i trsendb1
Relocate the service from trsendb2 back to trsendb1 => srvctl relocate service -d trsendb -s xxx -i trsendb2 -t trsendb1
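
The two recovery steps can be wrapped in a script for repeated use. The sketch below is not from the original session: `run` and `recover_instance` are hypothetical helpers, the added `srvctl status instance` check is an extra step, and the service names must be supplied by the caller because the original command elides them. It prints the commands by default and executes them only when DRY_RUN=0.

```shell
# Dry-run by default: print the commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# recover_instance DB RESTARTED_INSTANCE SURVIVING_INSTANCE SERVICE...
# Starts the failed instance, checks its status, then relocates each service
# from the surviving instance back to the restarted one.
recover_instance() {
  db=$1; inst=$2; other=$3; shift 3
  run srvctl start instance -d "$db" -i "$inst" || return 1
  run srvctl status instance -d "$db" -i "$inst" || return 1
  for svc in "$@"; do
    run srvctl relocate service -d "$db" -s "$svc" -i "$other" -t "$inst" || return 1
  done
}

# Example (demo_svc is a placeholder service name):
#   recover_instance trsendb trsendb1 trsendb2 demo_svc
```

Keeping the dry-run default makes it possible to review the exact srvctl commands before anything is changed on the cluster.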