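Suppose root.sh fails while installing CRS: how do we rerun it? Or suppose the OCR and all voting disks are corrupted and no backup exists: how do we recover? In both cases the simplest approach is to reconfigure the OCR and voting disks from scratch. The walkthrough below simulates the whole procedure: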
[root@rac1 oracle]# crs_stat -t
Name Type Target State Host
————————————————————
ora.orcl.db application ONLINE ONLINE rac2
ora….l1.inst application ONLINE ONLINE rac1
ora….l2.inst application ONLINE ONLINE rac2
ora….SM1.asm application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….SM2.asm application ONLINE ONLINE rac2
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
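This is a RAC environment that is already up and running. We will simulate the loss of the OCR and all voting disks.
Both the OCR and the voting disks are on raw devices: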
# cat ocr.loc
ocrconfig_loc=/dev/raw/raw1
ocrmirrorconfig_loc=/dev/raw/raw2
[root@rac1 oracle]# crsctl query css votedisk
0. 0 /dev/raw/raw3
1. 0 /dev/raw/raw4
2. 0 /dev/raw/raw5
located 3 votedisk(s).
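Before touching any of these devices it helps to have the full device list in one place. The following is a minimal sketch, not part of the original procedure: the function names `ocr_devices` and `vote_devices` are hypothetical, and the parsing assumes exactly the `ocr.loc` and `crsctl query css votedisk` formats shown above.

```shell
#!/bin/sh
# Hypothetical helpers: extract device paths from text in the formats shown above.

# Read ocr.loc content on stdin, print the OCR and OCR mirror paths.
ocr_devices() {
    awk -F= '/^ocr(mirror)?config_loc=/ { print $2 }'
}

# Read `crsctl query css votedisk` output on stdin, print the voting disk paths.
# A data line looks like " 0. 0 /dev/raw/raw3": index, status, path.
vote_devices() {
    awk '$1 ~ /^[0-9]+\.$/ && $3 ~ /^\// { print $3 }'
}
```

Typical use would be `ocr_devices < ocr.loc` and `crsctl query css votedisk | vote_devices`; the resulting list is what you would back up (or, in this simulation, destroy).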
Wipe all of these raw devices with dd:
[root@rac1 oracle]# dd if=/dev/zero of=/dev/raw/raw1 bs=8192 count=12800
12800+0 records in
12800+0 records out
[root@rac1 oracle]# dd if=/dev/zero of=/dev/raw/raw2 bs=8192 count=12800
12800+0 records in
12800+0 records out
[root@rac1 ~]# dd if=/dev/zero of=/dev/raw/raw3 bs=8192 count=12800
12800+0 records in
12800+0 records out
[root@rac1 ~]# dd if=/dev/zero of=/dev/raw/raw4 bs=8192 count=12800
12800+0 records in
12800+0 records out
[root@rac1 ~]# dd if=/dev/zero of=/dev/raw/raw5 bs=8192 count=12800
12800+0 records in
12800+0 records out
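Because this step is destructive, it may be worth rehearsing the same dd invocation on a scratch file first. The sketch below is illustrative only: `wipe_file` and `is_zeroed` are hypothetical names, and the target is assumed to be an ordinary temporary file rather than a raw device. Note that dd's output operand is spelled `of=`.

```shell
#!/bin/sh
# Overwrite TARGET with COUNT zero-filled 8 KB blocks, mirroring the dd
# commands above. TARGET is meant to be a scratch file, not a real device.
wipe_file() {
    target=$1
    count=$2
    dd if=/dev/zero of="$target" bs=8192 count="$count" 2>/dev/null
}

# Succeed only if the first COUNT 8 KB blocks of FILE contain nothing but zeros.
is_zeroed() {
    nonzero=$(dd if="$1" bs=8192 count="$2" 2>/dev/null | tr -d '\000' | wc -c)
    # Left unquoted on purpose: some wc implementations pad the number with spaces.
    [ $nonzero -eq 0 ]
}
```

For example, `f=$(mktemp); wipe_file "$f" 4; is_zeroed "$f" 4` exercises both helpers without going near shared storage.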
At this point the CRS stack can no longer start:
[root@rac1 oracle]# crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM
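A check like this is easy to script. The following is a rough sketch under stated assumptions: `crs_healthy` is a hypothetical helper, and it relies on the 10g message format in which a working stack prints "CSS appears healthy", "CRS appears healthy" and "EVM appears healthy" (as seen at the end of this walkthrough).

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl check crs` output on stdin and succeed only
# if CSS, CRS and EVM are all reported as healthy (10g message format).
crs_healthy() {
    awk '
        /^CSS appears healthy/ { css = 1 }
        /^CRS appears healthy/ { crs = 1 }
        /^EVM appears healthy/ { evm = 1 }
        END { exit (css && crs && evm) ? 0 : 1 }
    '
}
```

It would be used as `crsctl check crs | crs_healthy || echo "CRS stack is not healthy"`. Matching on the positive messages means any unexpected failure text is treated as unhealthy.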
Now reconfigure CRS.
Run $CRS_HOME/install/rootdelete.sh on every node:
[root@rac1 oracle]# cd $CRS_HOME
[root@rac1 crs]# cd install
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down…
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
On node 2:
[root@rac2 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down…
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
Then run $CRS_HOME/install/rootdeinstall.sh on any one node:
[root@rac1 install]# ./rootdeinstall.sh
Removing contents from OCR mirror device
2560+0 records in
2560+0 records out
Removing contents from OCR device
2560+0 records in
2560+0 records out
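The order of the cleanup matters: rootdelete.sh on every node first, then rootdeinstall.sh once. As a sketch of that ordering, the hypothetical function below only prints the commands rather than running them. It assumes root ssh access to each node and the same CRS home path on all nodes, neither of which is stated in the original.

```shell
#!/bin/sh
# Print (do not run) the cleanup commands in the order used above.
# Assumptions: root ssh access to every node, identical CRS home on all nodes.
# Arguments: CRS home path, then the node names.
print_deconfig_plan() {
    crs_home=$1
    shift
    first_node=$1
    # rootdelete.sh runs on every node.
    for node in "$@"; do
        echo "ssh root@$node $crs_home/install/rootdelete.sh"
    done
    # rootdeinstall.sh runs once, on any single node (here: the first).
    echo "ssh root@$first_node $crs_home/install/rootdeinstall.sh"
}
```

Calling `print_deconfig_plan "$CRS_HOME" rac1 rac2` lists three commands that can be reviewed before anything is executed.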
The OCR and voting disk configuration has now been removed, so root.sh can be rerun:
[root@rac1 crs]# ./root.sh
WARNING: directory '/oracle/app/product/10.2.0' is not owned by root
WARNING: directory '/oracle/app/product' is not owned by root
WARNING: directory '/oracle/app' is not owned by root
WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle/app/product/10.2.0' is not owned by root
WARNING: directory '/oracle/app/product' is not owned by root
WARNING: directory '/oracle/app' is not owned by root
WARNING: directory '/oracle' is not owned by root
assigning default hostname rac1 for node 1.
assigning default hostname rac2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw3
Now formatting voting device: /dev/raw/raw4
Now formatting voting device: /dev/raw/raw5
Format of 3 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac1
CSS is inactive on these nodes.
rac2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
[root@rac1 crs]# crsctl check crs
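Run $CRS_HOME/root.sh on node 2 as well.
Run vipca to reconfigure the VIPs.
At this point nothing else is registered in CRS. Use netca to reconfigure the listeners so that they are registered in CRS; ASM and the database must be registered as well.
Screenshots of the netca listener reconfiguration are omitted.
So far only the listeners and the nodeapps resources are registered in the OCR; ASM and the database still need to be added.
Register ASM first: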
[root@rac1 oracle]# srvctl add asm -n rac1 -i +ASM1 -o /oracle/app/product/db_1
null
[PRKS-1030 : Failed to add configuration for ASM instance "+ASM1" on node "rac1" in cluster registry, [PRKH-1001 : HASContext Internal Error]
[PRKH-1001 : HASContext Internal Error]]
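The add failed. After a long search the cause turned out to be simple: ASM resources must be registered as the oracle user. In hindsight this is easy to spot, since anything that involves the $ORACLE_HOME directory should be run as oracle, and the same applies to registering the database resources later. No detail can be overlooked.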
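One way to avoid repeating this mistake is to guard the registration commands with a user check. This is only a sketch: `require_user` is a hypothetical helper, not something shipped with Oracle.

```shell
#!/bin/sh
# Hypothetical guard: fail early unless the current user is the expected one.
require_user() {
    expected=$1
    current=$(id -un)
    if [ "$current" != "$expected" ]; then
        echo "must be run as $expected (current user: $current)" >&2
        return 1
    fi
}
```

For example, `require_user oracle && srvctl add asm -n rac1 -i +ASM1 -o /oracle/app/product/db_1` refuses to run srvctl from a root shell instead of failing with PRKS-1030.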
Registering the ASM resources as oracle succeeds:
[oracle@rac1 ~]$ srvctl add asm -n rac1 -i +ASM1 -o /oracle/app/product/db_1
[oracle@rac1 ~]$ srvctl add asm -n rac2 -i +ASM2 -o /oracle/app/product/db_1
[oracle@rac1 ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora….SM1.asm application OFFLINE OFFLINE
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….SM2.asm application OFFLINE OFFLINE
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
Try to start the ASM resources:
[root@rac1 oracle]# srvctl start asm -n rac1
[root@rac1 oracle]# srvctl start asm -n rac2
[root@rac1 oracle]# crs_stat -t
Name Type Target State Host
————————————————————
ora….SM1.asm application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….SM2.asm application ONLINE ONLINE rac2
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
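Fortunately ASM started. Sometimes, though, ASM fails at startup.
If startup reports ORA-27550, RAC cannot determine which network interface to use as the private interconnect. To fix this, add the following parameters to the pfile of the two ASM instances: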
+ASM1.cluster_interconnects='10.10.17.221'
+ASM2.cluster_interconnects='10.10.17.222'
Manually add the database to the OCR:
[oracle@rac1 ~]$ srvctl add database -d orcl -o /oracle/app/product/db_1
Manually add the instances:
[oracle@rac1 ~]$ srvctl add instance -d orcl -i orcl1 -n rac1
[oracle@rac1 ~]$ srvctl add instance -d orcl -i orcl2 -n rac2
[root@rac1 oracle]# crs_stat -t
Name Type Target State Host
————————————————————
ora.orcl.db application OFFLINE OFFLINE
ora….l1.inst application OFFLINE OFFLINE
ora….l2.inst application OFFLINE OFFLINE
ora….SM1.asm application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….SM2.asm application ONLINE ONLINE rac2
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
Set the dependency between each database instance and its ASM instance:
[oracle@rac1 ~]$ srvctl modify instance -d orcl -i orcl1 -s +ASM1
[oracle@rac1 ~]$ srvctl modify instance -d orcl -i orcl2 -s +ASM2
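With more than two nodes, typing these srvctl commands by hand becomes error-prone. The sketch below generates them from a list of node:instance:asm_instance triples. `print_register_plan` is a hypothetical name, it only prints the commands, and it interleaves the add and modify steps per instance rather than following the exact order above.

```shell
#!/bin/sh
# Print (do not run) the srvctl commands that re-register one database.
# Arguments: db name, Oracle home, then node:instance:asm_instance triples.
print_register_plan() {
    db=$1
    home=$2
    shift 2
    echo "srvctl add database -d $db -o $home"
    for triple in "$@"; do
        node=${triple%%:*}      # text before the first colon
        rest=${triple#*:}       # text after the first colon
        inst=${rest%%:*}
        asm=${rest#*:}
        echo "srvctl add instance -d $db -i $inst -n $node"
        echo "srvctl modify instance -d $db -i $inst -s $asm"
    done
}
```

For this environment the call would be `print_register_plan orcl /oracle/app/product/db_1 rac1:orcl1:+ASM1 rac2:orcl2:+ASM2`, with the output reviewed and then run as the oracle user.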
Start the database:
[oracle@rac1 ~]$ srvctl start database -d orcl
[oracle@rac1 ~]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.orcl.db application ONLINE ONLINE rac2
ora….l1.inst application ONLINE ONLINE rac1
ora….l2.inst application ONLINE ONLINE rac2
ora….SM1.asm application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….SM2.asm application ONLINE ONLINE rac2
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
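All instances started successfully.
If the database fails at startup with ORA-27550, as ASM sometimes does, the cause is the same: RAC cannot determine which network interface to use as the private interconnect. Set the parameter in the spfile and restart to resolve it. The sid values and addresses below are examples; substitute your own instance names (orcl1 and orcl2 in this environment) and private interconnect IPs: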
SQL>alter system set cluster_interconnects='10.10.17.221' scope=spfile sid='RACDB1';
SQL>alter system set cluster_interconnects='10.10.17.222' scope=spfile sid='RACDB2';
To be safe, restart CRS on both nodes:
[root@rac1 oracle]# crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[root@rac1 oracle]# crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
CRS started successfully:
[root@rac1 oracle]# crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[root@rac1 oracle]# crs_stat -t
Name Type Target State Host
————————————————————
ora.orcl.db application ONLINE ONLINE rac1
ora….l1.inst application ONLINE ONLINE rac1
ora….l2.inst application ONLINE ONLINE rac2
ora….SM1.asm application ONLINE ONLINE rac1
ora….C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora….SM2.asm application ONLINE ONLINE rac2
ora….C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
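Scanning a listing like this by eye gets tedious on larger clusters. As a rough sketch, the hypothetical `not_online` helper below prints every resource whose Target or State column is not ONLINE, assuming the `crs_stat -t` column layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `crs_stat -t` output on stdin and print the name of
# every resource whose Target or State column is not ONLINE.
# Matching on the Type column skips the header and separator lines.
not_online() {
    awk '$2 == "application" && ($3 != "ONLINE" || $4 != "ONLINE") { print $1 }'
}
```

Running `crs_stat -t | not_online` prints nothing when every resource is up, so an empty result is the success case.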
This completes the reconfiguration with root.sh.
Source: ITPUB博客 (ITPUB blog), link: http://blog.itpub.net/23732248/viewspace-709565/. If you repost this article, credit the source; otherwise legal liability will be pursued.