"crsd.bin" STACK EXECUTION DISABLED

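An operating-system patch upgrade caused a series of problems with the Oracle database. This is a summary of how the CRS problem was resolved last night, kept here for future reference.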


Environment:
AIX5.3ML07
HACMP5.2
Oracle 10.2.0.2 RAC
Data files stored on raw devices


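Symptoms:
- ORA-29701: unable to connect to Cluster Manager (CRS would not start after the system was rebooted)
- After the OS patch level was upgraded from ML05 to ML07, the LV backing the database control file became unusable (the other files were not tested, but are presumably damaged as well)
- One day after the ML05 to ML07 upgrade, restarting CRS produced a large number of core files, and CRS could no longer start normally
Errors reported by the OS: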
C69F5C9B 0429201208 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
C69F5C9B 0429201208 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
C69F5C9B 0429201208 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
C69F5C9B 0429201208 P S SYSPROC SOFTWARE PROGRAM ABNORMALLY TERMINATED
errpt -aj C69F5C9B | more
Failure Causes
SOFTWARE PROGRAM

Recommended Actions
RERUN THE APPLICATION PROGRAM
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
11
USER'S PROCESS ID:
860306
FILE SYSTEM SERIAL NUMBER
13
INODE NUMBER
114802
CORE FILE NAME
/oracle/home/OraHome_1/log/pekax129/crsd/core
PROGRAM NAME
crsd.bin
STACK EXECUTION DISABLED
stdin
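When the same identifier repeats many times in the errpt summary, as C69F5C9B does above, a per-identifier tally makes the pattern easier to see. The following is a minimal sketch, not part of the original troubleshooting session: the function name `count_errpt_ids` is hypothetical, and it assumes the summary format shown above, with the identifier in the first column and an optional header line beginning with `IDENTIFIER`.

```shell
#!/bin/sh
# Tally errpt summary lines by error identifier (first column).
# Reads the summary on stdin, skips blank lines and the header line,
# and prints "<identifier> <count>", one per line, sorted by identifier.
# count_errpt_ids is a hypothetical helper name used for illustration.
count_errpt_ids() {
    awk '$1 != "IDENTIFIER" && NF > 0 { n[$1]++ }
         END { for (id in n) print id, n[id] }' | sort
}
```

On the affected node it could be fed directly from the error log, for example `errpt | count_errpt_ids`.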

CRS errors in the CRS home:
A large number of core files were generated under /oracle/home/OraHome_1/log/TESTSERVER/crsd, for example:
core.2008-04-29-19:51:56
crsd.log contains many entries like the following:
2008-04-29 20:12:11.105: [ CRSMAIN][1]32Checking the OCR device
2008-04-29 20:12:11.107: [ CRSMAIN][1]32Connecting to the CSS Daemon
2008-04-29 20:12:11.479: [ CRSD][1]32Daemon Version: 10.2.0.2.0 Active Version: 10.2.0.2.0
2008-04-29 20:12:11.479: [ CRSD][1]32Active Version and Software Version are same
2008-04-29 20:12:11.479: [ CRSMAIN][1]32Initializing OCR
2008-04-29 20:12:11.483: [ OCRRAW][1]proprioo: for disk 0 (/dev/rlv_ocrfile), id match (1), my id set (1040465895,1028247821) total id sets (1), 1st set (1040465895,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2008-04-29 20:12:11.562: [ OCRMAS][3344]th_master:12: I AM THE NEW OCR MASTER at incar 1. Node Number = 1
2008-04-29 20:12:11.562: [ OCRRAW][3344]proprioo: for disk 0 (/dev/rlv_ocrfile), id match (1), my id set (1040465895,1028247821) total id sets (1), 1st set (1040465895,1028247821), 2nd set (0,0) my votes (2), total votes (2)
2008-04-29 20:12:11.575: [ CRSD][1]32ENV Logging level for Module: allcomp 0
2008-04-29 20:12:11.576: [ CRSD][1]32ENV Logging level for Module: default 0
2008-04-29 20:12:11.576: [ CRSD][1]32ENV Logging level for Module: COMMCRS 0
2008-04-29 20:12:11.577: [ CRSD][1]32ENV Logging level for Module: COMMNS 0
2008-04-29 20:12:11.578: [ CRSD][1]32ENV Logging level for Module: CRSUI 0
2008-04-29 20:12:11.579: [ CRSD][1]32ENV Logging level for Module: CRSCOMM
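Because the main symptom is core files piling up in the crsd log directory, counting them before and after each fix attempt gives a quick check on whether crsd is still crashing. This is a minimal sketch under stated assumptions: the function name `count_core_files` is hypothetical, and it assumes core files are named `core` or `core.<timestamp>` as in the listing above.

```shell
#!/bin/sh
# Count core files under a directory.
# Assumes core files are named "core" or "core.<timestamp>".
# count_core_files is a hypothetical helper name used for illustration.
count_core_files() {
    dir=$1
    # tr strips the leading padding that some wc implementations emit
    find "$dir" -type f -name 'core*' | wc -l | tr -d ' '
}
```

For example, `count_core_files /oracle/home/OraHome_1/log/TESTSERVER/crsd` run a few minutes apart shows whether the number is still growing.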

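Resolution:
- First attempt: apply CRS bug patch 5467456 (the bug it fixes looks somewhat similar, but does not exactly match the symptoms seen here). The steps from the patch README were: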
# 1. Verify that the Oracle Inventory is properly configured.
#
# As the crs software owner;
# % opatch lsinventory -detail -oh
#
# As the rdbms server software owner;
# % opatch lsinventory -detail -oh
#
# This should list the components the list of nodes.
#
# If the Oracle inventory is not setup correctly this utility will
# fail.
###########################################################################
#
# 2. Unzip the PSE container file
#
# % unzip p5467456_10202_AIX64-5L.zip
#
###########################################################################
#
# 3. Shut down all Applications replying on the RDBMS/ASM instances, which
# rely on RDBMS and ASM instances dependent on the CRS daemons being shut
# down in step 4. Then shutdown the RDBMS and ASM instances dependent
# on the CRS daemons being shut down in step 4.
#
###########################################################################
#
# 4. In configuration A, shut down the CRS daemons on all nodes.
# In configuration B, shut down the CRS daemons on the local node.
#
# As root, issue the following command to stop the CRS daemons.
#
# Linux# /etc/init.d/init.crs stop
# SunOS# /etc/init.d/init.crs stop
# AIX# /etc/init.crs stop
# HP-UX# /sbin/init.d/init.crs stop
# OSF1# /sbin/init.d/init.crs stop
#
###########################################################################
#
# 5. Prior to applying this part of the fix, you must invoke this script
# as root to unlock protected files.
#
# is the software installer/owner for the CRS Home.
#
# # custom/scripts/prerootpatch.sh -crshome -crsuser
#
# Note: In configuration A, invoke this only on one node.
#
#
###########################################################################
#
# 6. Now invoke an additional script as the crs software installer/owner.
# This script will save important configuration settings.
#
# % custom/scripts/prepatch.sh -crshome
#
# Note: In configuration A, invoke this only on one node.
#
###########################################################################
#
# 7. Patch the Files
#
# 7.1 Patch the CRS home files
#
# After unlocking any protected files and saving configuration settings
# you are now ready to run opatch using the following command.
#
# As the crs software owner;
#
# % opatch apply -local -oh
#
# Note: In configuration A, invoke this only on one node.
#
#
# 7.2 Patch the RDBMS home files.
#
# Note: The RDBMS portion can only be applied to an RDBMS home that
# has been upgraded to *10.1.0.5*.
#
# For additional information please read Note.363254.1;
#
# Applying one-off Oracle Clusterware patches in a mixed version home
# environment
#
# As the RDBMS software owner;
#
# % opatch apply custom/server/5467456 -local -oh
#
# Note: In configuration A, invoke this only on one node.
#
#
###########################################################################
#
# 8. After opatch completes, some configuration settings need to be applied
# to the patched files. As the crs software owner execute the following;
#
# % custom/scripts/postpatch.sh -crshome
#
# Note: In configuration A, invoke this only on one node.
#
###########################################################################
#
# 9. Now security settings need to be restored on the CRS Home. This script
# will also restart the CRS daemons. Invoke this script as root.
#
# # custom/scripts/postrootpatch.sh -crshome
#
# Note: This script should only be invoked as part of the patch process.
#
# Note: In configuration A, invoke this on each node. Do not invoke this
# in parallel on two nodes.
#
###########################################################################
#
# 10. On success you can determine whether the patch has been installed by
# using the following command;
#
# % opatch lsinventory -detail -oh
#
# Installed Patch List:
# =====================
#
# Patch 5467456 : applied on Tue May 23 02:13:02 EDT 2006
# Created on 18 Jul 2006, 17:23:25 hrs US/Pacific
# Bugs fixed:
# ...
#
# % opatch lsinventory -detail -oh
#
# Installed Patch List:
# =====================
# Patch 5467456 : applied on Tue May 23 02:15:36 EDT 2006
# Created on 18 Jul 2006, 17:20:44 hrs US/Pacific
# Bugs fixed:
# ...
#
However, none of the steps above solved the problem: CRS still would not start, and core files kept being generated.

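- Second attempt: restore the CRS configuration (the OCR) from a backup taken while the system was still stable
Run all of the following as root:
ocrconfig -showbackup
/etc/init.crs stop (on both nodes)
ocrconfig -restore $ORA_CRS_HOME/cdata/backup00
/etc/init.crs start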

ps -ef | grep d.bin
oracle 655550 1 0 20:13:51 - 0:06 /oracle/home/OraHome_1/bin/evmd.bin
root 766136 1 0 20:13:37 - 1:16 /oracle/home/OraHome_1/bin/crsd.bin reboot
oracle 815206 995544 0 10:31:43 pts/1 0:00 grep d.bin
oracle 901314 847880 0 20:51:45 - 2:14 /oracle/home/OraHome_1/bin/ocssd.bin

After checking, CRS started normally and no more core files were being generated.
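The manual check above, looking for evmd.bin, crsd.bin and ocssd.bin in the process list, can be scripted. This is a minimal sketch, not the method used in the original session: the function name `check_crs_daemons` is hypothetical, and it assumes each daemon appears in `ps -ef` output with a path ending in `/bin/<name>`, as in the listing above.

```shell
#!/bin/sh
# Report which of the three CRS daemons are absent from a process listing.
# Reads `ps -ef` output on stdin. Prints a summary line and returns
# non-zero if any daemon is missing.
# check_crs_daemons is a hypothetical helper name used for illustration.
check_crs_daemons() {
    ps_out=$(cat)
    missing=""
    for d in crsd.bin evmd.bin ocssd.bin; do
        # Look for the daemon binary under a bin/ directory
        if ! printf '%s\n' "$ps_out" | grep -q "/bin/$d"; then
            missing="$missing $d"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all CRS daemons running"
}
```

Usage would be `ps -ef | check_crs_daemons`; the `grep d.bin` line from the manual check does not produce a false match because it contains no `/bin/` path.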

- As for the damaged control file, the only option was to delete the LV, recreate the raw device, and then recover the database.




Source: "ITPUB Blog", link: http://blog.itpub.net/7318139/viewspace-1003186/. Please credit the source when reposting; otherwise legal action may be taken.
