Oracle CRS voting disk corruption: a case study (ASM + RAC)

-- There is an old saying in the DBA trade: no corruption is frightening; what is frightening is having no valid backup.
-- This case is exactly that: a recovery with no backup available.
First, check the CRS processes:

$ ps -ef|grep crs
root 10469 1 0 16:52:16 ? 0:00 /bin/sh /etc/init.d/init.crsd run
oracle 14164 9725 0 17:05:48 pts/1 0:00 grep crs


At this point the CRS daemons need to be started as root (or with crsctl):

# /etc/init.d/init.crs start
Startup will be queued to init within 30 seconds.

# id
uid=0(root) gid=1(other)
# ps -ef|grep crs

-- Starting CRS with crsctl start crs fails with the following errors:
clsscfg_vhinit: unable(1) to open disk (/dev/rdsk/c1t16d12s5)
Internal Error Information:
Category: 1234
Operation: scls_block_open
Location: open
Other: open failed /dev/rdsk/c1t16d12s5
Dep: 9
Failure 1 checking the Cluster Synchronization Services voting disk '/dev/rdsk/c1t16d12s5'.
Not able to read adequate number of voting disks
PRKH-1010 : Unable to communicate with CRS services.

Meanwhile, files like the following appear under /tmp:
/tmp/crsctl.1177
/tmp/crsctl.1179


#crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.

-- Check the disk (/dev/rdsk/c1t16d12s5):
dd if=/dev/rdsk/c1t16d12s5 of=/opt/oracle/test
which reports:
dd: /dev/rdsk/c1t16d12s5: open: I/O error
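The same dd probe can be wrapped in a small helper for checking other candidate devices. A minimal sketch (the function name is mine; for this test any readable file behaves like a healthy device):

```shell
#!/bin/sh
# check_readable: quick read test for a disk device (or any regular file).
# An I/O error reading the first block is a strong hint the device itself
# is damaged, matching the dd failure shown above.
check_readable() {
    dev="$1"
    if dd if="$dev" of=/dev/null bs=8192 count=1 2>/dev/null; then
        echo "READABLE: $dev"
    else
        echo "IO-ERROR: $dev"
        return 1
    fi
}

# Example (the path from this case; run as a user with read access):
# check_readable /dev/rdsk/c1t16d12s5
```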

So the voting disk is corrupted.
With the voting disk gone, the only option is to re-create it. There are two ways:
1. If a backup of the voting disk exists, restore it from the backup.
2. If there is no backup, the only option is to reinstall the Clusterware.


Prepare the Clusterware installation media: /home/oracle/clusterware
We want to keep the original configuration:
Node 1: 10.253.20.168, node name ahniosdb1, instance name niosdb1, hostname AHNIOSDB1
Node 2: 10.253.20.169, node name ahniosdb2, instance name niosdb2, hostname AHNIOSDB2

Remove the old Clusterware software, making a backup first:
mv /opt/oracle/crs /opt/oracle/crsbak

Prepare two new raw devices, 300 MB each, for the CRS installation and the voting disk:
/dev/rdsk/c1t16d14s4 -- for the CRS installation
/dev/rdsk/c1t16d14s5 -- for the voting disk

During the installation the following error appeared:
The ONS configuration failed to create

-- If running root.sh produces the following message:
# ./root.sh
Checking to see if Oracle CRS stack is already configured Oracle
CRS stack is already configured and will be running under init(1M)
-- then the old CRS installation was not cleaned up completely.
-- Clean up the old configuration (run on every node):

rm -rf /etc/oracle/*
rm -rf /var/tmp/.oracle
Edit /etc/inittab and remove the following three lines:
h1:2:respawn:/etc/init.evmd run >/dev/null 2>&1
h2:2:respawn:/etc/init.cssd fatal >/dev/null 2>&1
h3:2:respawn:/etc/init.crsd run >/dev/null 2>&1

-- Afterwards, ps -ef|grep init.d still showed the following processes; kill them and reboot the machine:
/etc/init.d/init.crsd
/etc/init.d/init.evmd
/etc/init.d/init.cssd
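Stripping those respawn entries can be scripted. A hedged sketch (the function name is mine; on a real node the target is /etc/inittab, edited as root, followed by `init q` so init rereads it):

```shell
#!/bin/sh
# clean_inittab: remove the three CRS respawn entries from an
# inittab-style file. Sketch only; test it on a copy first.
clean_inittab() {
    f="$1"
    # Drop any line that launches the evmd/cssd/crsd init scripts.
    grep -v -e '/etc/init\.evmd' -e '/etc/init\.cssd' -e '/etc/init\.crsd' \
        "$f" > "$f.tmp" && mv "$f.tmp" "$f"
}
```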


Running root.sh again then succeeds:
# ./root.sh

-- crs_stat -t now shows gsd, ons, and vip all ONLINE:
# crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....db1.gsd application ONLINE ONLINE ahniosdb1
ora....db1.ons application ONLINE ONLINE ahniosdb1
ora....db1.vip application ONLINE ONLINE ahniosdb1
ora....db2.gsd application ONLINE ONLINE ahniosdb2
ora....db2.ons application ONLINE ONLINE ahniosdb2
ora....db2.vip application ONLINE ONLINE ahniosdb2

-- Now register the listener, ASM, and database instances with the OCR.
-- ASM: supply the node name, ASM instance name, and ORACLE_HOME:
srvctl add asm -n ahniosdb1 -i +ASM1 -o /opt/oracle/product/10gr2
srvctl add asm -n ahniosdb2 -i +ASM2 -o /opt/oracle/product/10gr2
-- Database:
srvctl add database -d niosdb -o /opt/oracle/product/10gr2
-- Instances:
srvctl add instance -d niosdb -i niosdb1 -n ahniosdb1
srvctl add instance -d niosdb -i niosdb2 -n ahniosdb2
-- Start ASM:
srvctl start asm -n ahniosdb1
srvctl start asm -n ahniosdb2
-- Start the instances:
srvctl start instance -d niosdb -i niosdb1
srvctl start instance -d niosdb -i niosdb2
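The srvctl sequence above follows a simple per-node pattern. This sketch (names and paths are the ones from this case) is a dry run: it only prints the commands so the sequence can be reviewed before being executed on a node that actually has srvctl:

```shell
#!/bin/sh
# Generate the srvctl re-registration sequence for this 2-node RAC.
# Dry-run sketch: it PRINTS the commands instead of executing them.
DB=niosdb
OH=/opt/oracle/product/10gr2
NODES="ahniosdb1 ahniosdb2"

gen_srvctl_cmds() {
    i=1
    for node in $NODES; do                       # register ASM per node
        echo "srvctl add asm -n $node -i +ASM$i -o $OH"
        i=$((i + 1))
    done
    echo "srvctl add database -d $DB -o $OH"     # register the database
    i=1
    for node in $NODES; do                       # register each instance
        echo "srvctl add instance -d $DB -i $DB$i -n $node"
        i=$((i + 1))
    done
    for node in $NODES; do                       # start ASM first
        echo "srvctl start asm -n $node"
    done
    i=1
    for node in $NODES; do                       # then start the instances
        echo "srvctl start instance -d $DB -i $DB$i"
        i=$((i + 1))
    done
}

gen_srvctl_cmds
```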
-- Check the state of each resource:
# crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....SM1.asm application ONLINE ONLINE ahniosdb1
ora....B1.lsnr application ONLINE ONLINE ahniosdb1
ora....db1.gsd application ONLINE ONLINE ahniosdb1
ora....db1.ons application ONLINE ONLINE ahniosdb1
ora....db1.vip application ONLINE ONLINE ahniosdb1
ora....SM2.asm application ONLINE ONLINE ahniosdb2
ora....B2.lsnr application ONLINE ONLINE ahniosdb2
ora....db2.gsd application ONLINE ONLINE ahniosdb2
ora....db2.ons application ONLINE ONLINE ahniosdb2
ora....db2.vip application ONLINE ONLINE ahniosdb2
ora.niosdb.db application ONLINE ONLINE ahniosdb2
ora....b1.inst application ONLINE ONLINE ahniosdb1
ora....b2.inst application ONLINE ONLINE ahniosdb2

Everything is ONLINE.
With that, the recovery of the crashed voting disk on this two-node Oracle RAC is complete.

Customer: "It's up! It's up! We've connected to it ourselves as well."

I said: test the application and make sure it can connect.

-- Supplementary notes from this case.
-- Voting disk backup with dd:
dd if=/dev/rdsk/c1t16d12s5 of=/opt/oracle/voting.bak

-- Clusterware installation media:
/home/oracle/clusterware

-- The /etc/hosts configuration:
# more /etc/hosts
127.0.0.1 localhost
10.253.20.168 AHNIOSDB1 loghost
10.253.20.173 AHNIOSDB1-VIP
10.11.0.11 AHNIOSDB1-PIV

10.253.20.169 AHNIOSDB2 loghost
10.253.20.174 AHNIOSDB2-VIP
10.11.0.12 AHNIOSDB2-PIV


Details (see full log at /opt/oracle/oraInventory/logs/installActions2010-03-25_07-58-57PM.log):

/opt/oracle/crs/install/onsconfig add_config AHNIOSDB1:6251 AHNIOSDB2:6251

/opt/oracle/crs/bin/oifcfg setif -global bge0/10.253.20.160:public bge1/10.11.0.0:cluster_interconnect sppp0/192.168.254.0:cluster_interconnect

/opt/oracle/crs/bin/cluvfy stage -post crsinst -n AHNIOSDB1,AHNIOSDB2



Another solution found online:
1) Clean up the old configuration (run on every node):

rm -rf /etc/oracle/*

rm -rf /var/tmp/.oracle

Edit /etc/inittab and remove the following three lines:

h1:2:respawn:/etc/init.evmd run >/dev/null 2>&1
h2:2:respawn:/etc/init.cssd fatal >/dev/null 2>&1
h3:2:respawn:/etc/init.crsd run >/dev/null 2>&1
Then run init q to make init reread the file.

If running root.sh produces the following message, the old configuration was not cleaned out completely:

Checking to see if Oracle CRS stack is already configured

Oracle CRS stack is already configured and will be running under init(1M)

2) Clean up stale state in memory (slibclean and genkld are AIX commands):
slibclean
Check with genkld | grep crs; if anything remains, run slibclean again until nothing is left.

3) Modify /CRS_HOME/install/rootconfig

Update the OCR and voting disk entries in it.

4) As the oracle user, touch new OCR and voting disk files (required); otherwise root.sh fails with an error like:

# /opt/oracle/10g/crs/root.sh
WARNING: directory '/opt/oracle/10g' is not owned by root
WARNING: directory '/opt/oracle' is not owned by root
"/ocr/vote" does not exist. Create it before proceeding.
Make sure that this file is shared across cluster nodes.

5) Run /CRS_HOME/root.sh on every node.

METHOD 2 - RE-INSTALL CRS
-------------------------

The only safe and sure way to re-create the voting disk in 10gR1 is to reinstall
CRS. The deinstallation procedure is the only way we have to undo the CSS fatal
mode, which in turn makes it safe to reinstall.

Only do this after consulting with Oracle Support Services and there is no reasonable
way to fix the inconsistency.

Once you re-install CRS, you can restore OCR from one of its automatic backups.
Then, you can back up the voting disk, and also back it up again after any node
addition or deletion operations.


1. Use Note 239998.1 to completely remove the CRS installation.
2. Re-install CRS
3. Run the CRS root.sh as prompted at the end of the CRS install.
4. Run the root.sh in the database $ORACLE_HOME to re-run VIPCA. This will
re-create the VIP, GSD, and ONS resources.
5. Use NETCA to re-add any listeners.
6. Add databases and instances with SRVCTL, syntax is in Note 259301.1

Backups in Oracle RAC
Oracle automatically backs up the OCR, which holds the CRS configuration.
List the automatic backups with ocrconfig -showbackup.

The voting disk can be backed up at the file level with dd.
List the voting disks with crsctl query css votedisk.
Backup:
dd if=/dev/votedisk of=/oraclebackup/vote_disk
Restore:
dd if=/oraclebackup/vote_disk of=/dev/votedisk
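Since dd is a plain byte copy, the backup/restore round trip can be sketched with ordinary files standing in for the raw device (the helper names are mine; on a real cluster CRS should be down before restoring):

```shell
#!/bin/sh
# Voting disk backup/restore with dd is a raw byte copy.
# Sketch with placeholder arguments; on a real system $1 would be the
# raw voting disk device and $2 the backup file on a filesystem.
backup_votedisk() {
    dd if="$1" of="$2" bs=8192 2>/dev/null   # device -> backup file
}
restore_votedisk() {
    dd if="$2" of="$1" bs=8192 2>/dev/null   # backup file -> device
}
```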

Backing up an ASM instance: backing up the ASM $ORACLE_HOME is sufficient.

The OCR can be backed up with the following commands:
ocrconfig -export myfile
ocrdump -backupfile myfile
And restored with:
crsctl stop crs
ocrconfig -import myfile


Managing the OCR and the voting disk
The OCR and the voting disk occupy very little space, but they are critically important to RAC.

After fixing the customer's fault I also made backups of the OCR and voting disk, and compiled the management and backup procedures for both; they are recorded here for readers' reference.

The voting disk records cluster membership information: which nodes are members, plus records of node additions and deletions. It is 20 MB in size.
Check the voting disk location with crsctl query css votedisk:
$ crsctl query css votedisk
0. 0 /dev/ocrbackup
If the CRS installation fails and must be redone, the voting disk has to be re-initialized first, either with dd or by re-creating the volume:
dd if=/dev/zero of=/dev/ocrbackup bs=8192 count=2560
Back up the voting disk: dd if=/dev/ocrbackup of=/tmp/votedisk.bak
Restore the voting disk: dd if=/tmp/votedisk.bak of=/dev/ocrbackup
Add a voting disk mirror:
crsctl add css votedisk /dev/ocrbackup -force
Remove a voting disk mirror:
crsctl delete css votedisk /dev/ocrbackup -force


The OCR
The OCR records the configuration of cluster resources such as databases, ASM, instances, listeners, and VIPs. It can be stored on a raw device or a cluster file system; the recommended size is 100 MB.
For raw storage, carve out a raw device, for example /dev/ocrbackup.
If the CRS installation fails and must be redone, the OCR device (raw) has to be re-initialized first, with dd or by re-creating the volume:
dd if=/dev/zero of=/dev/ocrbackup bs=8192 count=12800
Oracle takes an automatic OCR backup every four hours and keeps three versions, but only on one node:
$ ocrconfig -showbackup


Restore the OCR: ocrconfig -restore /u01/app/oracle/product/10.2.0/crs/cdata/crs/backup01.ocr

Manual OCR export: ocrconfig -export /tmp/ocr_bak
Manual OCR import: ocrconfig -import /tmp/ocr_bak

Adding an OCR mirror:
1. Stop CRS with crsctl stop crs.
2. Create the raw device for the mirror, for example /dev/rhdisk6.
3. Export the OCR contents with ocrconfig -export.
4. Edit /etc/oracle/ocr.loc and add the ocrmirrorconfig_loc line:
$ cat ocr.loc
ocrconfig_loc=/dev/ocrbackup
ocrmirrorconfig_loc=/dev/ocrmirror
local_only=FALSE
5. Import the OCR contents with ocrconfig -import.
6. Check the OCR configuration:
$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 103724
Used space (kbytes) : 3824
Available space (kbytes) : 99900
ID : 1086971606
Device/File Name : /dev/ocrbackup
Device/File integrity check succeeded
Device/File Name : /dev/ocrmirror
Device/File integrity check succeeded
Cluster registry integrity check succeeded
7. Finally, start CRS with crsctl start crs.
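Step 4's edit of ocr.loc can be scripted. A hedged sketch (the helper name is mine; on a real node the file is /etc/oracle/ocr.loc and CRS must already be stopped):

```shell
#!/bin/sh
# add_ocr_mirror: insert an ocrmirrorconfig_loc entry into an
# ocr.loc-style file, right after the primary ocrconfig_loc line.
# Sketch only; test it on a copy before touching /etc/oracle/ocr.loc.
add_ocr_mirror() {
    loc_file="$1"; mirror_dev="$2"
    if grep -q '^ocrmirrorconfig_loc=' "$loc_file"; then
        echo "mirror already configured" >&2
        return 1
    fi
    awk -v dev="$mirror_dev" '
        { print }
        /^ocrconfig_loc=/ { print "ocrmirrorconfig_loc=" dev }
    ' "$loc_file" > "$loc_file.tmp" && mv "$loc_file.tmp" "$loc_file"
}
```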



From the ITPUB blog: http://blog.itpub.net/70612/viewspace-1032370/. Please credit the source when reposting.
