Oracle 12c RAC startup failure with error codes ORA-27300, ORA-27301, ORA-27302

Hi everyone, I'm VIK. Today I'll walk through a troubleshooting case for a cluster startup failure.

On March 15 I received a case forwarded by a sales rep: a customer's cluster had failed, and we were asked to take a look.

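I phoned the customer first. They reported that the problem appeared around noon on March 2: one node was unavailable, although the business workload was still running normally. In the meantime both the operations vendor and the application developer had logged in to work on it, and their shared opinion was to reboot the hosts. We were asked to verify that recommendation. The analysis follows: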

1. Check the state of the servers and the cluster

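This was not an engineered system: Oracle 12.2 with no patches applied, on two x86 servers. The clusterware on node 1 was abnormal (the database instance was still up and still had sessions), while node 2 was working normally. Load was low, and no hardware or disk-space problems were found. Multipath paths had been dropping, yet multipath -ll reported them as normal. Of the two private network links, one was unreachable, and sysctl was configured for loose mode (presumably the reverse-path filtering setting, rp_filter).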
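The checks in this step could be scripted roughly as follows. This is a minimal sketch, not the procedure used in this case: it assumes the Grid Infrastructure tools (crsctl, olsnodes, oifcfg) and multipath are on PATH for the user running it, and the names CHECKS and run_checks are made up for illustration. All commands listed are read-only status queries.

```python
import shutil
import subprocess

# Read-only status commands. Assumption: the Grid Infrastructure bin
# directory is on PATH; adjust to the local environment otherwise.
CHECKS = [
    ["crsctl", "check", "crs"],                # state of the clusterware stack daemons
    ["crsctl", "stat", "res", "-t", "-init"],  # lower-stack resources (ora.crsd, ora.asm, ...)
    ["olsnodes", "-n", "-s"],                  # node numbers and node status
    ["oifcfg", "getif"],                       # configured public/private interfaces
    ["multipath", "-ll"],                      # multipath topology and path state
]


def run_checks(checks=CHECKS, timeout=60):
    """Run each command and return {command line: (returncode, output)}.

    A command whose executable is not found on PATH maps to None, so the
    function can be called safely on a host without the tools installed.
    """
    results = {}
    for cmd in checks:
        key = " ".join(cmd)
        if shutil.which(cmd[0]) is None:
            results[key] = None
            continue
        try:
            done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
            results[key] = (done.returncode, done.stdout + done.stderr)
        except (subprocess.TimeoutExpired, OSError) as exc:
            results[key] = (-1, str(exc))
    return results
```

Calling run_checks() on each node and comparing the two result dictionaries side by side is one way to spot where node 1 differs from node 2.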

2. Collect the logs for the failure window with TFA

The following errors appeared in the trace file:

2024-03-15 00:35:10.534 : USRTHRD:2381616896: {0:1:16} ORA-27300: OS system dependent operation:sslssunreghdlr failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: sskgpreset1
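When many trace files have to be searched, the ORA-27300/27301/27302 triple can be pulled out and grouped automatically. The following is a minimal sketch under the assumption that the three messages always use the wording shown above; the function name parse_ora_273xx and the dictionary keys are made up for illustration.

```python
import re

# Patterns mirror the wording of the three messages quoted above.
ORA_27300 = re.compile(
    r"ORA-27300: OS system dependent operation:(\S+) failed with status: (-?\d+)"
)
ORA_27301 = re.compile(r"ORA-27301: OS failure message: (.+)")
ORA_27302 = re.compile(r"ORA-27302: failure occurred at: (\S+)")


def parse_ora_273xx(text):
    """Return one dict per ORA-27300 occurrence found in the trace text.

    ORA-27301 and ORA-27302 lines are attached to the most recent
    ORA-27300 entry; lines seen before any ORA-27300 are ignored.
    """
    events = []
    current = None
    for line in text.splitlines():
        m = ORA_27300.search(line)
        if m:
            current = {
                "operation": m.group(1),
                "status": int(m.group(2)),
                "message": None,
                "location": None,
            }
            events.append(current)
            continue
        if current is None:
            continue
        m = ORA_27301.search(line)
        if m:
            current["message"] = m.group(1).strip()
            continue
        m = ORA_27302.search(line)
        if m:
            current["location"] = m.group(1)
    return events


# The three lines from the trace file in this case.
SAMPLE = """2024-03-15 00:35:10.534 : USRTHRD:2381616896: {0:1:16} ORA-27300: OS system dependent operation:sslssunreghdlr failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: sskgpreset1
"""
```

Running parse_ora_273xx(SAMPLE) yields a single entry whose operation is sslssunreghdlr and whose location is sskgpreset1, which are the two strings worth searching for in MOS.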

I searched the MOS notes and internal documentation and ruled out the issues noted above.

3. Try a forced restart of the cluster

2024-03-14 22:15:08.392 [CRSD(181793)]CRS-0804: Cluster Ready Service aborted due to Oracle Cluster Registry error [PROC-26: Error while accessing the physical storage Storage layer error [Insufficient quorum to open OCR devices] [0]]. Details at (:CRSD00111:) in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/crsd.trc.
2024-03-14 22:15:08.502 [CRSD(181853)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 181853
2024-03-14 22:37:11.374 [CRSD(192087)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 192087
2024-03-14 22:59:14.254 [CRSD(202502)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 202502
2024-03-14 23:21:17.227 [CRSD(212076)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 212076
2024-03-14 23:43:20.137 [CRSD(223090)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 223090
2024-03-15 00:05:23.012 [OHASD(13452)]CRS-2878: Failed to restart resource 'ora.crsd'
2024-03-15 00:05:23.516 [GPNPD(19901)]CRS-2329: GPNPD on node ecardv5-db1 shut down. 
2024-03-15 00:05:23.710 [OSYSMOND(32415)]CRS-8504: Oracle Clusterware OSYSMOND process with operating system process ID 32415 is exiting
2024-03-15 00:05:25.522 [MDNSD(19101)]CRS-5602: mDNS service stopping by request.
2024-03-15 00:05:25.755 [MDNSD(19101)]CRS-8504: Oracle Clusterware MDNSD process with operating system process ID 19101 is exiting
2024-03-15 00:05:45.441 [OCTSSD(32007)]CRS-2405: The Cluster Time Synchronization Service on host ecardv5-db1 is shutdown by user
2024-03-15 00:05:45.609 [OCTSSD(32007)]CRS-8504: Oracle Clusterware OCTSSD process with operating system process ID 32007 is exiting
2024-03-15 00:05:46.494 [OCSSD(21498)]CRS-1603: CSSD on node ecardv5-db1 has been shut down.
2024-03-15 00:05:54.110 [OCSSD(21498)]CRS-1660: The CSS daemon shutdown has completed
2024-03-15 00:05:54.110 [OCSSD(21498)]CRS-8504: Oracle Clusterware OCSSD process with operating system process ID 21498 is exiting
2024-03-15 00:05:55.613 [ORAROOTAGENT(18471)]CRS-5822: Agent '/u01/app/12.2.0/grid/bin/orarootagent_root' disconnected from server. Details at (:CRSAGF00117:) {0:1:6873} in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc.
2024-03-15 00:11:37.043 [OHASD(235850)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 235850
2024-03-15 00:11:37.052 [OHASD(235850)]CRS-0714: Oracle Clusterware Release 12.2.0.1.0.
2024-03-15 00:11:37.065 [OHASD(235850)]CRS-2112: The OLR service started on node ecardv5-db1.
2024-03-15 00:11:37.108 [OHASD(235850)]CRS-1301: Oracle High Availability Service started on node ecardv5-db1.
2024-03-15 00:11:37.108 [OHASD(235850)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2024-03-15 00:11:37.402 [CSSDAGENT(235929)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 235929
2024-03-15 00:11:37.412 [CSSDMONITOR(235933)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 235933
2024-03-15 00:11:37.420 [ORAROOTAGENT(235917)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 235917
2024-03-15 00:11:37.477 [ORAAGENT(235927)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 235927
2024-03-15 00:11:38.333 [ORAAGENT(236052)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 236052
2024-03-15 00:11:38.434 [MDNSD(236071)]CRS-8500: Oracle Clusterware MDNSD process is starting with operating system process ID 236071
2024-03-15 00:11:38.452 [EVMD(236073)]CRS-8500: Oracle Clusterware EVMD process is starting with operating system process ID 236073
2024-03-15 00:11:39.464 [GPNPD(236098)]CRS-8500: Oracle Clusterware GPNPD process is starting with operating system process ID 236098
2024-03-15 00:11:40.504 [GPNPD(236098)]CRS-2328: GPNPD started on node ecardv5-db1. 
2024-03-15 00:11:40.517 [GIPCD(236162)]CRS-8500: Oracle Clusterware GIPCD process is starting with operating system process ID 236162
2024-03-15 00:11:42.503 [CSSDMONITOR(236189)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 236189
2024-03-15 00:11:45.098 [CSSDAGENT(236207)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 236207
2024-03-15 00:11:45.366 [OCSSD(236222)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 236222
2024-03-15 00:11:46.432 [OCSSD(236222)]CRS-1713: CSSD daemon is started in hub mode
2024-03-15 00:12:46.541 [ORAROOTAGENT(235917)]CRS-5818: Aborted command 'action' for resource 'ora.driver.afd'. Details at (:CRSAGF00113:) {0:0:93} in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc.
2024-03-15 00:12:46.611 [ORAROOTAGENT(235917)]CRS-5014: Agent "ORAROOTAGENT" timed out starting process "/u01/app/12.2.0/grid/bin/afdroot" for action "action": details at "(:CLSN00009:)" in "/u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc"
2024-03-15 00:13:47.763 [OCSSD(236222)]CRS-1707: Lease acquisition for node ecardv5-db1 number 1 completed
2024-03-15 00:13:47.814 [ORAROOTAGENT(235917)]CRS-5818: Aborted command 'action' for resource 'ora.driver.afd'. Details at (:CRSAGF00113:) {0:0:166} in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc.
2024-03-15 00:13:47.899 [ORAROOTAGENT(235917)]CRS-5014: Agent "ORAROOTAGENT" timed out starting process "/u01/app/12.2.0/grid/bin/afdroot" for action "action": details at "(:CLSN00009:)" in "/u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc"
2024-03-15 00:13:48.910 [OCSSD(236222)]CRS-1605: CSSD voting file is online: AFD:CRS2; details in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ocssd.trc.
2024-03-15 00:13:48.919 [OCSSD(236222)]CRS-1605: CSSD voting file is online: AFD:CRS1; details in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ocssd.trc.
2024-03-15 00:13:48.929 [OCSSD(236222)]CRS-1605: CSSD voting file is online: AFD:CRS3; details in /u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ocssd.trc.
2024-03-15 00:13:50.051 [OCSSD(236222)]CRS-1601: CSSD Reconfiguration complete. Active nodes are ecardv5-db1 ecardv5-db2 .
2024-03-15 00:13:50.090 [ORAROOTAGENT(235917)]CRS-5021: Check of storage failed: details at "(:CLSN00117:)" in "/u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc"
2024-03-15 00:13:52.338 [OCSSD(236222)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2024-03-15 00:13:52.362 [OCTSSD(245098)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 245098
2024-03-15 00:13:53.305 [OCTSSD(245098)]CRS-2403: The Cluster Time Synchronization Service on host ecardv5-db1 is in observer mode.
2024-03-15 00:13:53.990 [ORAROOTAGENT(235917)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc".
2024-03-15 00:13:54.462 [OCTSSD(245098)]CRS-2407: The new Cluster Time Synchronization Service reference node is host ecardv5-db2.
2024-03-15 00:13:54.463 [OCTSSD(245098)]CRS-2401: The Cluster Time Synchronization Service started on host ecardv5-db1.
2024-03-15 00:14:05.238 [ORAROOTAGENT(235917)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/ecardv5-db1/crs/trace/ohasd_orarootagent_root.trc". ===> error message #2
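The alert log above shows CRSD being started again and again (CRS-8500) until OHASD gives up with CRS-2878. A restart loop like this is easier to see once the timestamps are extracted. The following is a minimal sketch that assumes every alert line has the "timestamp [PROCESS(pid)]CRS-nnnn: text" layout shown above; the names parse_alert and restart_intervals are made up for illustration.

```python
import re
from datetime import datetime

# One alert line: "YYYY-MM-DD HH:MM:SS.mmm [PROC(pid)]CRS-nnnn: message"
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(\w+)\((\d+)\)\](CRS-\d+): (.*)$"
)


def parse_alert(text):
    """Return (timestamp, process, pid, code, message) tuples for matching lines."""
    records = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            records.append((ts, m.group(2), int(m.group(3)), m.group(4), m.group(5)))
    return records


def restart_intervals(records, daemon="CRSD"):
    """Seconds between consecutive CRS-8500 (process starting) messages of one daemon."""
    starts = [ts for ts, proc, _pid, code, _msg in records
              if proc == daemon and code == "CRS-8500"]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(starts, starts[1:])]


# The CRSD start attempts and the final CRS-2878 from the log in this case.
SAMPLE = """2024-03-14 22:15:08.502 [CRSD(181853)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 181853
2024-03-14 22:37:11.374 [CRSD(192087)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 192087
2024-03-14 22:59:14.254 [CRSD(202502)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 202502
2024-03-14 23:21:17.227 [CRSD(212076)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 212076
2024-03-14 23:43:20.137 [CRSD(223090)]CRS-8500: Oracle Clusterware CRSD process is starting with operating system process ID 223090
2024-03-15 00:05:23.012 [OHASD(13452)]CRS-2878: Failed to restart resource 'ora.crsd'
"""
```

For these lines, restart_intervals(parse_alert(SAMPLE)) returns four values, each about 1323 seconds: CRSD was retried roughly every 22 minutes, and about 22 minutes after the last attempt OHASD reported CRS-2878.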

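While stopping the cluster, the ASM listener could not be stopped; after killing it manually, the stack stopped normally. After starting the cluster again, the trace file still reported the same error.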

4. I cleaned up the .ora* files under /tmp and /var and started the cluster again. It still reported the same error, which was frustrating.

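5. I went through the CRS alert log, the OHASD log, and the other cluster logs again, and searched the backend SRs once more using the symptoms and error messages. This turned up a similar issue: a 12.2 RAC with no patches applied reporting the same errors, where the resolution was to reboot both servers.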

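6. I advised the customer to take a backup first and then perform the restart. Instead, the customer's operations engineer rebooted both servers directly during business hours, and the problem was resolved.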

Afterwards I recommended that the customer install the patches and take back control of the root password.


 
