Oracle two-node RAC: analysis of a CRS startup failure on one node caused by gipc

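In a two-node RAC, the cluster CRS stack on node 1 could not start. Analysis showed that the gipcd process on node 2 was in an abnormal state, which prevented the nodes from communicating properly; restarting gipcd.bin on node 2 resolved the problem. The visible symptom was that ora.crsd and ora.evmd could not start, while the other components were normal.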

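1.     Inspection and analysis

1.1.   Node 1 cluster alert log

The node 1 cluster alert log for the manual restart at 13:08 is shown below. The messages about olsnodes.log failing to be deleted have always been present in this environment and can be ignored here.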

 

2018-11-26 13:08:29.521:
[client(892)]CRS-0009:log file "/home/u01/app/grid/11.2.0/product/log/sxmms1/client/olsnodes.log" reopened

2018-11-26 13:08:29.521:
[client(892)]CRS-0019:file rotation terminated. log file: "/home/u01/app/grid/11.2.0/product/log/sxmms1/client/olsnodes.log"

2018-11-26 13:08:42.421:
[ohasd(903)]CRS-2112:The OLR service started on node sxmms1.

2018-11-26 13:08:42.433:
[ohasd(903)]CRS-1301:Oracle High Availability Service started on node sxmms1.

2018-11-26 13:08:42.433:
[ohasd(903)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred

2018-11-26 13:08:45.864:
[/home/u01/app/grid/11.2.0/product/bin/orarootagent.bin(948)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).

2018-11-26 13:08:51.238:
[gpnpd(1118)]CRS-2328:GPNPD started on node sxmms1.

2018-11-26 13:08:53.710:
[cssd(1184)]CRS-1713:CSSD daemon is started in clustered mode

2018-11-26 13:08:55.508:
[ohasd(903)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE

2018-11-26 13:08:55.509:
[ohasd(903)]CRS-2769:Unable to failover resource 'ora.diskmon'.

2018-11-26 13:09:03.406:
[cssd(1184)]CRS-1707:Lease acquisition for node sxmms1 number 1 completed

2018-11-26 13:09:04.658:
[cssd(1184)]CRS-1605:CSSD voting file is online: ORCL:OCR2; details in /home/u01/app/grid/11.2.0/product/log/sxmms1/cssd/ocssd.log.

2018-11-26 13:09:07.670:
[cssd(1184)]CRS-1601:CSSD Reconfiguration complete. Active nodes are sxmms1 sxmms2 .

2018-11-26 13:09:09.989:
[ctssd(1269)]CRS-2407:The new Cluster Time Synchronization Service reference node is host sxmms2.

2018-11-26 13:09:09.990:
[ctssd(1269)]CRS-2401:The Cluster Time Synchronization Service started on host sxmms1.

2018-11-26 13:09:11.701:
[ohasd(903)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE

2018-11-26 13:09:11.701:
[ohasd(903)]CRS-2769:Unable to failover resource 'ora.diskmon'.

2018-11-26 13:10:08.710:
[/home/u01/app/grid/11.2.0/product/bin/orarootagent.bin(1129)]CRS-5818:Aborted command 'start' for resource 'ora.ctssd'. Details at (:CRSAGF00113:) {0:0:2} in /home/u01/app/grid/11.2.0/product/log/sxmms1/agent/ohasd/orarootagent_root/orarootagent_root.log.

2018-11-26 13:10:12.714:
[ohasd(903)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.ctssd'. Details at (:CRSPE00111:) {0:0:2} in /home/u01/app/grid/11.2.0/product/log/sxmms1/ohasd/ohasd.log.
[client(1584)]CRS-10001:26-Nov-18 13:10 ACFS-9391: Checking for existing ADVM/ACFS installation.
[client(1589)]CRS-10001:26-Nov-18 13:10 ACFS-9392: Validating ADVM/ACFS installation files for operating system.
[client(1591)]CRS-10001:26-Nov-18 13:10 ACFS-9393: Verifying ASM Administrator setup.
[client(1594)]CRS-10001:26-Nov-18 13:10 ACFS-9308: Loading installed ADVM/ACFS drivers.
[client(1597)]CRS-10001:26-Nov-18 13:10 ACFS-9154: Loading 'oracleoks.ko' driver.
[client(1625)]CRS-10001:26-Nov-18 13:10 ACFS-9154: Loading 'oracleadvm.ko' driver.
[client(1653)]CRS-10001:26-Nov-18 13:10 ACFS-9154: Loading 'oracleacfs.ko' driver.
[client(1764)]CRS-10001:26-Nov-18 13:10 ACFS-9327: Verifying ADVM/ACFS devices.
[client(1773)]CRS-10001:26-Nov-18 13:10 ACFS-9156: Detecting control device '/dev/asm/.asm_ctl_spec'.
[client(1777)]CRS-10001:26-Nov-18 13:10 ACFS-9156: Detecting control device '/dev/ofsctl'.
[client(1782)]CRS-10001:26-Nov-18 13:10 ACFS-9322: completed

2018-11-26 13:10:14.067:
[ohasd(903)]CRS-2807:Resource 'ora.asm' failed to start automatically.

2018-11-26 13:10:14.067:
[ohasd(903)]CRS-2807:Resource 'ora.crsd' failed to start automatically.

2018-11-26 13:10:14.067:
[ohasd(903)]CRS-2807:Resource 'ora.evmd' failed to start automatically.

2018-11-26 13:11:42.738:
[ohasd(903)]CRS-2765:Resource 'ora.ctssd' has failed on server 'sxmms1'.

2018-11-26 13:11:45.381:
[ctssd(2151)]CRS-2407:The new Cluster Time Synchronization Service reference node is host sxmms2.

2018-11-26 13:11:45.382:
[ctssd(2151)]CRS-2401:The Cluster Time Synchronization Service started on host sxmms1.
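When an alert log is long, it can help to extract the resources that ohasd reported as failing to start automatically (message CRS-2807) instead of reading every entry. The following is a minimal Python sketch, assuming the layout seen in the excerpt above: a timestamp line followed by one or more `[process(pid)]CODE:text` lines. The helper names `parse_alert_log` and `failed_autostart` are hypothetical and are not part of any Oracle tool.

```python
import re
from datetime import datetime

# Timestamp line, e.g. "2018-11-26 13:10:14.067:"
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}\.\d{3}):?$")
# Message line, e.g. "[ohasd(903)]CRS-2807:Resource 'ora.asm' failed ..."
MSG_RE = re.compile(
    r"^\[(?P<proc>[^\]]+?)\((?P<pid>\d+)\)\](?P<code>[A-Z]+-\d+):(?P<text>.*)$"
)


def parse_alert_log(lines):
    """Yield (timestamp, process, code, text) for each message line.

    Assumption: a message belongs to the most recent timestamp line.
    """
    current_ts = None
    for raw in lines:
        line = " ".join(raw.split())  # collapse runs of whitespace
        if not line:
            continue
        ts = TS_RE.match(line)
        if ts:
            current_ts = datetime.strptime(
                f"{ts.group(1)} {ts.group(2)}", "%Y-%m-%d %H:%M:%S.%f"
            )
            continue
        msg = MSG_RE.match(line)
        if msg:
            yield current_ts, msg.group("proc"), msg.group("code"), msg.group("text").strip()


def failed_autostart(entries):
    """Return resource names from CRS-2807 entries, in log order."""
    names = []
    for _ts, _proc, code, text in entries:
        if code == "CRS-2807":
            found = re.search(r"'([^']+)'", text)
            if found:
                names.append(found.group(1))
    return names


# Sample lines copied from the alert log excerpt above.
SAMPLE = """\
2018-11-26 13:10:14.067:
[ohasd(903)]CRS-2807:Resource 'ora.asm' failed to start automatically.
2018-11-26 13:10:14.067:
[ohasd(903)]CRS-2807:Resource 'ora.crsd' failed to start automatically.
2018-11-26 13:10:14.067:
[ohasd(903)]CRS-2807:Resource 'ora.evmd' failed to start automatically.
2018-11-26 13:11:42.738:
[ohasd(903)]CRS-2765:Resource 'ora.ctssd' has failed on server 'sxmms1'.
"""

if __name__ == "__main__":
    print(failed_autostart(parse_alert_log(SAMPLE.splitlines())))
```

Run against the sample, the script lists ora.asm, ora.crsd and ora.evmd. The parser collapses whitespace first, so it should also tolerate logs pasted with extra spaces.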

 

1.2.   Node 1 agent analysis

Only an excerpt of the log is shown. It indicates that most components timed out during startup.

/home/u01/app/grid/11.2.0/product/log/sxmms1/agent/ohasd/orarootagent_root/orarootagent_root.log

2018-11-26 13:10:06.792: [ora.ctssd][2525660928]{0:0:2} [start] clsdmc_respget return: status=0, ecode=0, returnbuf=[0x7f51780ce0c0], buflen=8

2018-11-26 13:10:06.792: [ora.ctssd][2525660928]{0:0:2} [start] Start: Extended check return buffer: "? with length of 8

2018-11-26 13:10:06.792: [ora.ctssd][2525660928]{0:0:2} [start] translateReturnCodes, return = 0, state detail = Checkcb data [0x7f51780ce0c0]: mode[0xc0] offset[0 ms].

[    clsdmc][2525660928]CLSDMC.C returnbuflen=8, extraDataBuf=C0, returnbuf=7805FCE0

2018-11-26 13:10:07.793: [ora.ctssd][2525660928]{0:0:2} [start] clsdmc_respget return: status=0, ecode=0, returnbuf=[0x7f517805fce0], buflen=8

2018-11-26 13:10:07.793: [ora.ctssd][2525660928]{0:0:2} [start] Start: Extended check return buffer: "? with length of 8

2018-11-26 13:10:07.793: [ora.ctssd][2525660928]{0:0:2} [start] translateReturnCodes, return = 0, state detail = Checkcb data [0x7f517805fce0]: mode[0xc0] offset[0 ms].
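To see how long a `start` action kept checking a resource before it was aborted, the repeated `clsdmc_respget return` lines in the agent log can be collected and the gaps between them measured. The following is a rough Python sketch, assuming every relevant entry has the form `timestamp: [resource][thread]{id} [action] text` as in the excerpt above. The function name `poll_times` and its parameters are hypothetical.

```python
import re
from datetime import datetime

# Agent log entry, e.g.
# "2018-11-26 13:10:06.792: [ora.ctssd][2525660928]{0:0:2} [start] clsdmc_respget return: ..."
AGENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[(?P<res>[^\]]+)\]\[(?P<tid>\d+)\]\{[^}]*\} "
    r"\[(?P<action>\w+)\] (?P<text>.*)$"
)


def poll_times(lines, resource, action="start", marker="clsdmc_respget return"):
    """Return the timestamps of entries for `resource`/`action` whose text starts with `marker`."""
    times = []
    for raw in lines:
        line = " ".join(raw.split())  # collapse runs of whitespace
        m = AGENT_RE.match(line)
        if (
            m
            and m.group("res") == resource
            and m.group("action") == action
            and m.group("text").startswith(marker)
        ):
            times.append(datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f"))
    return times


# Sample lines copied from the orarootagent_root.log excerpt above.
SAMPLE = """\
2018-11-26 13:10:06.792: [ora.ctssd][2525660928]{0:0:2} [start] clsdmc_respget return: status=0, ecode=0, returnbuf=[0x7f51780ce0c0], buflen=8
2018-11-26 13:10:06.792: [ora.ctssd][2525660928]{0:0:2} [start] translateReturnCodes, return = 0, state detail = Checkcb data [0x7f51780ce0c0]: mode[0xc0] offset[0 ms].
[    clsdmc][2525660928]CLSDMC.C returnbuflen=8, extraDataBuf=C0, returnbuf=7805FCE0
2018-11-26 13:10:07.793: [ora.ctssd][2525660928]{0:0:2} [start] clsdmc_respget return: status=0, ecode=0, returnbuf=[0x7f517805fce0], buflen=8
"""

if __name__ == "__main__":
    times = poll_times(SAMPLE.splitlines(), "ora.ctssd")
    # Gap in seconds between consecutive checks.
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    print(len(times), gaps)
```

In the sample, the two checks are about one second apart. Over a full log, the first and last timestamps bracket how long the start action kept checking ora.ctssd.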
