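ORA-600 [kfgFinalize_2] error

Background

Our data center was being relocated, and the server move had to be completed within a single night:

  1. Shut down the applications in order
  2. Shut down the database
  3. Shut down the storage
  4. Unrack the servers
  5. Transport them to the new data center
  6. Rack the servers
  7. Start the storage, database, applications, and so on

Everything went according to plan, although for an operation this disruptive a completely smooth run would have been a rarity. Sure enough, our RAC ran into a problem, and it turned out to be a bug. Great.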

Database information

Database version: 11.2.0.3 RAC on OEL 6.4

What happened

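  • Right after the servers were racked, RAC started without issues, and crsctl stat res -t showed the DB on every node in the Open state.

  • When I checked again a little later, the DB on node2 had shut down, and the ASM instance was no longer in a normal state either.

  • I tried to start ASM manually, without success. And if ASM cannot start, the DB cannot start either.

  • I fell back on the restart trick and rebooted node2. The cluster did not come up as expected, and neither did ASM or the DB. Not good.

  • So I looked at the ASM log and found the following errors:

  1. The following messages repeated endlessly: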

MON querying group 1 at 414 for pid 23, osid 4992
NOTE: cache opening disk 0 of grp 1: (already open) ARCH label:ARCH
NOTE: F1X0 found on disk 0 au 2 fcn 0.0
NOTE: cache mounting (not first) (retry) external redundancy group 1/0xC07844EE (ARCH)
NOTE: LGWR attempting to mount thread 2 for diskgroup 1 (ARCH)
NOTE: LGWR found thread 2 open ckpt=2732.3731 - signalling ORA-15133
NOTE: LGWR caught ORA-15133 while mounting diskgroup 1
WARNING: instance recovery required during mount

  2. Finally, ORA-00600 [kfgFinalize_2]:

Errors in file /u01/app/oracle/diag/asm/+asm/+ASM2/trace/+ASM2_ora_4992.trc (incident=739389):
ORA-00600: internal error code, arguments: [kfgFinalize_2], [], [], [], [], [], [], [], [], [], [], []
Incident details in: /u01/app/oracle/diag/asm/+asm/+ASM2/incident/incdir_739389/+ASM2_ora_4992_i739389.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Wed May 06 22:33:30 2020
Dumping diagnostic data in directory=[cdmp_20200506223330], requested by (instance=2, osid=4992), summary=[incident=739389].
ORA-00600: internal error code, arguments: [kfgFinalize_2], [], [], [], [], [], [], [], [], [], [], []
ERROR: ALTER DISKGROUP ALL MOUNT /* asm agent call crs // {0:0:2} */
NOTE: cache dismounting (clean) group 1/0xC07844EE (ARCH)
NOTE: messaging CKPT to quiesce pins Unix process pid: 4992, image: oracle@rbdb82 (TNS V1-V3)
NOTE: lgwr not being msg'd to dismount
Wed May 06 22:33:30 2020
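
To check quickly whether another alert log shows the same pattern, something like the following could be used. This is only a minimal sketch in Python: the function name matches_signature and the sample text are my own illustration, and the two patterns are copied from the log excerpts above rather than from any Oracle documentation.

```python
import re

# Both patterns come from the alert log excerpts in this post.
# \s+ is used so that messages wrapped across lines still match.
ORA_15133 = re.compile(
    r"LGWR\s+caught\s+ORA-15133\s+while\s+mounting\s+diskgroup\s+(\d+)"
)
ORA_600 = re.compile(
    r"ORA-00600:\s+internal\s+error\s+code,\s+arguments:\s*\[kfgFinalize_2\]"
)


def matches_signature(alert_text):
    """Return (retry_count, hit_ora600) for a chunk of ASM alert log text."""
    retries = len(ORA_15133.findall(alert_text))
    hit_ora600 = ORA_600.search(alert_text) is not None
    return retries, hit_ora600


# Hypothetical sample assembled from the lines shown above.
sample = """NOTE: LGWR found thread 2 open ckpt=2732.3731 - signalling ORA-15133
NOTE: LGWR caught ORA-15133 while mounting diskgroup 1
WARNING: instance recovery required during mount
ORA-00600: internal error code, arguments: [kfgFinalize_2], [], [], []
"""

retries, hit = matches_signature(sample)
print(f"ORA-15133 mount retries: {retries}, ORA-00600 [kfgFinalize_2]: {hit}")
```

A high retry count followed by the ORA-00600 hit is what I saw on node2; the sketch only counts occurrences and makes no claim about the root cause.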

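Searching MOS, I found a note that describes exactly the same problem: "Unable to mount ASM diskgroup due to LGWR caught ORA-15133 and ORA-00600 [KFGFINALIZE_2] (Doc ID 1548801.1)".
Everything matched our case, from the error messages to the database version.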

Solution 1:

This is fixed in 12.1 by unpublished Bug 13955826 : 12G_ASM_X64: ORA-15133 AND ORA-00600 [KFGFINALIZE_2] IN ASM.
Also this is fixed from 11.2.0.4.4 DB PSU onwards.
Either apply one off patches available for your version or upgrade to 12.1.

Hmm... an upgrade is not very realistic for us.

Solution 2:

A workaround is available in the interim: WORKAROUND INFORMATION

a. Shutdown the databases on the other nodes that are open and using the same diskgroup that is not mounting.
b. Then dismount this diskgroup where it is mounted on other nodes:
   sql> alter diskgroup DATA dismount;
c. After completion of the dismount on the other nodes, verified in the alert log file, try to mount this diskgroup on node 1 and wait for completion:
   sql> alter diskgroup DATA mount;
d. Then mount this diskgroup on other nodes one by one.
e. Then startup related databases normally.
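
Because the order of steps a to e matters, it can help to write the sequence down per node before touching anything. The following is a rough Python sketch that only prints the plan and executes nothing. The function name workaround_plan, the node names, and my reading that "node 1" is simply the node mounted first are all assumptions for illustration; the statements themselves are the ones quoted in the note and used later in this post.

```python
def workaround_plan(diskgroup, first_node, other_nodes):
    """Return the workaround as an ordered list of (node, statement) pairs.

    Nothing is executed here; the list is meant to be read and followed by hand.
    """
    plan = []
    # a. Shut down the databases on the other nodes.
    plan += [(n, "shutdown immediate") for n in other_nodes]
    # b. Dismount the disk group on the other nodes.
    plan += [(n, f"alter diskgroup {diskgroup} dismount;") for n in other_nodes]
    # c. Mount the disk group on the first node and wait for completion.
    plan.append((first_node, f"alter diskgroup {diskgroup} mount;"))
    # d. Mount the disk group on the other nodes, one by one.
    plan += [(n, f"alter diskgroup {diskgroup} mount;") for n in other_nodes]
    # e. Start the related databases normally.
    plan += [(n, "startup") for n in [first_node] + list(other_nodes)]
    return plan


# Hypothetical two-node example.
for node, statement in workaround_plan("DATA", "node1", ["node2"]):
    print(f"{node}: {statement}")
```

Steps a and e are run against the database instance, while steps b to d are run against the ASM instance; the sketch does not distinguish the two connections.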

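Oracle kindly provides a temporary workaround. It is gentle and low risk, so it is worth a try.

Attempting the fix

a. Shut down the databases that are already running on the other nodes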

[oracle@rbdb81 ~]$ sqlplus / as sysdba
SQL> shutdown immediate

b. Dismount the disk groups

[grid@rbdb81 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Wed May 6 22:19:32 2020

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select name , state from v$asm_diskgroup;

NAME			       STATE
------------------------------ -----------
ARCH			       MOUNTED
DATA01			       MOUNTED
DATA02			       MOUNTED
OCRVOTEMO1		       MOUNTED
OCRVOTE 		       MOUNTED

SQL> 
SQL> alter diskgroup DATA01 dismount;

Diskgroup altered.

SQL> alter diskgroup DATA02 dismount;

Diskgroup altered.

SQL> select name , state from v$asm_diskgroup;

NAME			       STATE
------------------------------ -----------
ARCH			       MOUNTED
DATA01			       DISMOUNTED
DATA02			       DISMOUNTED
OCRVOTEMO1		       MOUNTED
OCRVOTE 		       MOUNTED

SQL> 
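
When there are more disk groups than fit on one screen, the state check can be done programmatically. Below is a minimal Python sketch that parses the SQL*Plus output of the query above; the function name parse_diskgroup_states is hypothetical, and the parsing assumes the plain two-column layout shown in this post.

```python
def parse_diskgroup_states(output):
    """Map disk group name -> state from
    'select name, state from v$asm_diskgroup' output in SQL*Plus."""
    states = {}
    in_rows = False
    for line in output.splitlines():
        if line.startswith("---"):
            # The dashed separator marks the start of the data rows.
            in_rows = True
            continue
        if in_rows:
            parts = line.split()
            if len(parts) != 2:
                break  # a blank line or the next prompt ends the table
            states[parts[0]] = parts[1]
    return states


# Sample copied from the second query result above.
sample = """NAME                           STATE
------------------------------ -----------
ARCH                           MOUNTED
DATA01                         DISMOUNTED
DATA02                         DISMOUNTED
OCRVOTEMO1                     MOUNTED
OCRVOTE                        MOUNTED

SQL>
"""

states = parse_diskgroup_states(sample)
still_mounted = [g for g in ("DATA01", "DATA02") if states.get(g) != "DISMOUNTED"]
print("Not yet dismounted:", still_mounted)
```

An empty list means both data disk groups are dismounted on this node and step c can begin.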

c. Once the other nodes have dismounted successfully, watch the ASM alert log and try to mount the diskgroup on node1

sql> alter diskgroup DATA mount;

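d. Try to mount the diskgroup on the other nodes one by one.
And here is the catch: after the reboot, the cluster on my node2 had not come up at all. The cluster starts automatically at boot, so I stopped it manually and started it again.
It then simply hung.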

[root@rbdb82 ~]# /u01/app/grid/product/11.2.0/bin/crsctl stop has
[root@rbdb82 ~]# /u01/app/grid/product/11.2.0/bin/crsctl start has

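I checked cluster.log and found no errors there either.

An interesting discovery

  • So I also stopped the cluster on node1 manually, and the moment I pressed Enter, node2 came up. Amazing!

  • Now that node2 was up, what would happen if I manually started node1, which had been fine all along? It hung as well.

  • I manually stopped the cluster on node2, and node1 came up instantly. So the two nodes must have been contending for some resource.

  • Since node1 was the node that had been healthy all along, I brought node1 fully back to a normal state: started ASM, then started the DB.

1. The problem occurs in the early phase of ASM startup.
2. Only one node can be started at a time.
3. ASM uses ASMLib (the key point).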

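Final resolution

My initial judgment was that the information about the two nodes recorded in the ACD and COD in the ASM cache was inconsistent (or possibly that a handle had not been released). Since restarting the software did not help, I decided to start the two nodes separately, beginning with node2.
Unexpectedly, after a long 10-minute wait, both the cluster and ASM came up. I then started the DB manually and ran into another error: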

SQL> startup
ORACLE instance started.

Total System Global Area 2.6122E+10 bytes
Fixed Size		    2240624 bytes
Variable Size		 7314870160 bytes
Database Buffers	 1.8790E+10 bytes
Redo Buffers		   14823424 bytes
Database mounted.
ORA-00322: log 4 of thread 2 is not current copy
ORA-00312: online log 4 thread 2: '+DATA01/rbdbon8/onlinelog/group_4.257.831815533'

Fix:

SQL> alter database clear logfile '+DATA01/rbdbon8/onlinelog/group_4.257.831815533';

SQL> alter database  open;

Success

[grid@rbdb81 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ARCH.dg
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.DATA01.dg
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.DATA02.dg
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.LISTENER.lsnr
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.LISTENER_WORK.lsnr
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.OCRVOTE.dg
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.OCRVOTEMO1.dg
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.asm
               ONLINE  ONLINE       rbdb81                   Started             
               ONLINE  ONLINE       rbdb82                   Started             
ora.gsd
               OFFLINE OFFLINE      rbdb81                                       
               OFFLINE OFFLINE      rbdb82                                       
ora.net1.network
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
ora.ons
               ONLINE  ONLINE       rbdb81                                       
               ONLINE  ONLINE       rbdb82                                       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       rbdb81                                       
ora.cvu
      1        ONLINE  ONLINE       rbdb81                                       
ora.oc4j
      1        ONLINE  ONLINE       rbdb81                                       
ora.rbdb81.vip
      1        ONLINE  ONLINE       rbdb81                                       
ora.rbdb82.vip
      1        ONLINE  ONLINE       rbdb82                                       
ora.rbdbon8.db
      1        ONLINE  ONLINE       rbdb82                   Open                
      2        ONLINE  ONLINE       rbdb81                   Open                
ora.scan1.vip
      1        ONLINE  ONLINE       rbdb81                                       
[grid@rbdb81 ~]$ 

The end!
