ORA-29702


Instances Abort With ORA-29702 When The Server is rebooted or shut down [ID 752399.1]  

--------------------------------------------------------------------------------
 
  Modified 23-MAR-2009     Type PROBLEM     Status PUBLISHED  

In this Document
  Symptoms
  Cause
  Solution
  References

 

--------------------------------------------------------------------------------

 

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.4
Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.1.0.7
Linux x86-64

Symptoms
When the server is rebooted or shut down, all instances on the server abort with ORA-29702.

The last entries in the alert.log look like this:
Error: KGXGN aborts the instance (6)
Errors in file /HOME/oracle/admin/+ASM/bdump/+asm1_lmon_8981.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
System state dump is made for local instance
System State dumped to trace file /HOME/oracle/admin/+ASM/bdump/+asm1_diag_8977.trc
Trace dumping is performing id=[cdmp_20080929110159]


Cause
During a reboot or shutdown of a server, K96init.crs is called only after the operating system has stopped other services such as the network. The instances on the server therefore crash because they lose the private interconnect.
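
One way to see the ordering problem on a node is to compare the kill-link names in a runlevel directory, since kill links are processed in name order. The sketch below is an illustration only: the function name crs_stops_before_network is made up here, and it assumes the network service has a kill link named like K90network, as on Red Hat style systems. Check the actual link names on your server.

```shell
# Hypothetical helper: succeeds when the init.crs kill link sorts (and so runs)
# before the network kill link in the given runlevel directory.
crs_stops_before_network() {
    rcdir=${1:-/etc/rc.d/rc6.d}
    crs=$(ls "$rcdir" | grep -E '^K[0-9]+init\.crs$' | head -n 1)
    net=$(ls "$rcdir" | grep -E '^K[0-9]+network$' | head -n 1)   # assumed link name
    # Return 2 when either link is missing, so "unknown" differs from "wrong order".
    if [ -z "$crs" ] || [ -z "$net" ]; then
        return 2
    fi
    first=$(printf '%s\n%s\n' "$crs" "$net" | LC_ALL=C sort | head -n 1)
    [ "$first" = "$crs" ]
}
```

With a K96init.crs link and a K90network link in the same directory, the function fails, because K96 sorts after K90. With a K19init.crs link it succeeds.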

Solution
Replace the K96 links with K19 links.

Please run the following steps on all nodes in the cluster:


1. Shut down the Clusterware.
2. As root, run the script below to replace the K96 links with K19 links.
3. Start up the Clusterware.
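
The three steps can be sketched as a small wrapper, run as root on each node. This outline is not part of the original note and rests on assumptions: that crsctl stop crs and crsctl start crs are available in root's PATH, and that the script shown under "Script:" has been saved to a file. The default path /tmp/relink_init_crs.sh and the function names are hypothetical. Setting DRY_RUN=1 only prints the commands.

```shell
# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper for one node: stop the Clusterware, relink, start it again.
relink_crs_node() {
    script=${1:-/tmp/relink_init_crs.sh}   # hypothetical location of the saved script
    run crsctl stop crs || return 1
    run sh "$script" || return 1
    run crsctl start crs
}
```

For a rehearsal, DRY_RUN=1 relink_crs_node /path/to/script prints the three commands in order without touching the Clusterware.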

Script:
# Link prefixes: S96 is the start link, K19 the new kill link, K96 the old kill link.
RC_START=S96
RC_KILL=K19
RC_KILL_OLD=K96
# Runlevel directories for start links, for kill links, and all of them.
RCSDIR="/etc/rc.d/rc3.d /etc/rc.d/rc5.d"
RCKDIR="/etc/rc.d/rc0.d /etc/rc.d/rc1.d /etc/rc.d/rc2.d /etc/rc.d/rc4.d /etc/rc.d/rc6.d"
RCALLDIR="/etc/rc.d/rc0.d /etc/rc.d/rc1.d /etc/rc.d/rc2.d /etc/rc.d/rc3.d /etc/rc.d/rc4.d /etc/rc.d/rc5.d /etc/rc.d/rc6.d"
ID=/etc/init.d
# Use default commands unless the caller has already set them.
if [ -z "$RMF" ]; then RMF="/bin/rm -f"; fi
if [ -z "$LNS" ]; then LNS="/bin/ln -s"; fi
if [ -z "$ECHO" ]; then ECHO=/bin/echo; fi
# Clean up any old init.crs scripts
for rc in $RCALLDIR
do
    $RMF $rc/"$RC_START"init.crs
    $RMF $rc/"$RC_KILL"init.crs
    $RMF $rc/"$RC_KILL_OLD"init.crs
done
# Install new ones
for rc in $RCSDIR
do
    $LNS $ID/init.crs $rc/"$RC_START"init.crs || { $ECHO $?; exit 1; }
done
for rc in $RCKDIR
do
    $LNS $ID/init.crs $rc/"$RC_KILL"init.crs || { $ECHO $?; exit 1; }
done
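
After the script has run, the result can be checked before starting the Clusterware again. The function below is a sketch, not part of the original note: the name verify_crs_links and the optional root-prefix argument (useful for trying it against a scratch directory) are assumptions. It looks for the same links the script creates and reports any leftover K96 link.

```shell
# Hypothetical check of the init.crs links under <root>/etc/rc.d.
# With no argument it inspects the real filesystem.
verify_crs_links() {
    root=${1:-}
    status=0
    # Kill links are expected in runlevels 0, 1, 2, 4 and 6.
    for n in 0 1 2 4 6; do
        if [ ! -L "$root/etc/rc.d/rc$n.d/K19init.crs" ]; then
            echo "missing: rc$n.d/K19init.crs"
            status=1
        fi
    done
    # Start links are expected in runlevels 3 and 5.
    for n in 3 5; do
        if [ ! -L "$root/etc/rc.d/rc$n.d/S96init.crs" ]; then
            echo "missing: rc$n.d/S96init.crs"
            status=1
        fi
    done
    # No old K96 link should remain in any runlevel.
    for n in 0 1 2 3 4 5 6; do
        if [ -L "$root/etc/rc.d/rc$n.d/K96init.crs" ]; then
            echo "stale: rc$n.d/K96init.crs"
            status=1
        fi
    done
    return $status
}
```

The function prints one line per problem and returns non-zero if anything is missing or stale, so it can gate the Clusterware restart.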

References
BUG:7326677 - WHEN NODE IS REBOOTED, RAC ASM CRASHES DURING SHUTDOWN WITH ORA-29702
BUG:7496341 - FIX FOR 4587300 DOESN'T EXIST IN 11.1.0.7
//
One of the Instances Fails to Start After Reboot With ORA-29702 [ID 788455.1]  

--------------------------------------------------------------------------------
 
  Modified 26-JUN-2009     Type PROBLEM     Status REVIEWED  

In this Document
  Symptoms
  Cause
  Solution
  References

 

--------------------------------------------------------------------------------

 

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.4 to 10.2.0.4
This problem can occur on any platform.
Symptoms
After a node is rebooted, one of the instances fails to start automatically, but can be started manually.

Database Alert log reports:
==================
Thu Feb 5 15:19:07 2009
Error: KGXGN aborts the instance (6)
Thu Feb 5 15:19:07 2009
Errors in file /oneport/apps/oracle/admin/dbname/bdump/<instance>_lmon_14390.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
Thu Feb 5 15:19:07 2009
System state dump is made for local instance
System State dumped to trace file /oneport/apps/oracle/admin/dbname/bdump/<instance>_diag_14386.trc
Thu Feb 5 15:19:08 2009
Trace dumping is performing id=[cdmp_20090205151907]
Thu Feb 5 15:19:13 2009
Instance terminated by LMON, pid = 14390
Thu Feb 5 15:38:08 2009
Starting ORACLE instance (normal)

From <instance>_diag_14386.trc, we see:
===========================
*** 2009-02-05 15:19:07.218
2009-02-05 15:19:07.217: [ CSSCLNT]clsssRecvMsg: comm error received, comrc 11, con (1068dea10), msg (ffffffff7fffd9e8), msgl 144
2009-02-05 15:19:07.289: [ CSSCLNT]clssgsGGetStatus: communications failed (0/3/1)
2009-02-05 15:19:07.289: [ CSSCLNT]clssgsGGetStatus: returning 8
kgxgnpstat: received ABORT event from CLSS
CM problem, please abort
*** 2009-02-05 15:19:07.289
Node monitor becomes unavailable for service
2009-02-05 15:19:07.497: [ CSSCLNT]clsssRecvMsg: comm error received, comrc 11, con (1068dea10), msg (ffffffff7fffd9e8), msgl 144
2009-02-05 15:19:07.498: [ CSSCLNT]clssgsGGetStatus: communications failed (0/3/1)
2009-02-05 15:19:07.498: [ CSSCLNT]clssgsGGetStatus: returning 8
kgxgnpstat: received ABORT event from CLSS
CM problem, please abort


The crsd.log shows the database instances being started before the ASM instances.


Shortly after the reboot, ora.<dbname>.<instance>.inst.log reports:


startup
ORA-1565 error in identifying +<ASM disk>/../<spfile>


Cause
The instances were missing the ASM dependency.

 

crs_stat -p ora.<dbname>.<instance>.inst shows that REQUIRED_RESOURCES is empty. It should contain the name of the ASM resource to ensure that ASM is started before the database instance.
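
To check several instance resources quickly, the attribute can be filtered out of the crs_stat -p output. This is a sketch with made-up function names (required_resources, has_required_resources). It assumes the profile is printed as NAME=value lines and accepts the attribute spelled with or without the trailing S.

```shell
# Hypothetical filter: print the value of the required-resources attribute
# from `crs_stat -p` output read on stdin (an empty value prints an empty line).
required_resources() {
    sed -n -E 's/^REQUIRED_RESOURCES?=//p'
}

# Succeeds when the profile on stdin lists at least one required resource.
has_required_resources() {
    [ -n "$(required_resources)" ]
}
```

Typical use would be to pipe the output of crs_stat -p ora.<dbname>.<instance>.inst into has_required_resources and flag any instance for which it fails.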

Solution
Add the ASM dependency to each affected instance manually:

srvctl modify instance -d <database name> -i <instance name> -s <ASM instance name>
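
When several instances are affected, it can help to generate the commands first and review them before running anything. The helper below is only a sketch: the function name asm_dependency_commands and the instance:ASM pair format are assumptions, and it prints the srvctl command from the line above rather than executing it.

```shell
# Hypothetical helper: print one srvctl command per "instance:asm_instance" pair.
# The database name and all pairs are supplied by the caller.
asm_dependency_commands() {
    db=$1
    shift
    for pair in "$@"; do
        inst=${pair%%:*}   # text before the first colon
        asm=${pair#*:}     # text after the first colon
        echo "srvctl modify instance -d $db -i $inst -s $asm"
    done
}
```

For example, with placeholder names, asm_dependency_commands ORCL ORCL1:+ASM1 ORCL2:+ASM2 prints two srvctl modify instance lines, one per instance.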

References
NOTE:387217.1 - INSTANCE NOT STARTING AUTOMATICALLY BY CRS WHEN ASM IS USED
