Suppose the OCR disk and the Votedisk are both completely destroyed and neither has a backup. How do you recover? The simplest approach in this case is to reinitialize the OCR and Votedisk. The procedure is as follows:
Reference: 《大话Oracle RAC》.
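Before destroying anything, it is worth recording where the OCR and voting disks currently live so the same devices can be reused later. A minimal check, run as root (commands only, output omitted):
# show the current OCR device and its state
ocrcheck
# list the configured voting disk(s)
crsctl query css votedisk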
Simulating the disk corruption:
[root@node1 ~]# crsctl stop crs
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw1 bs=102400 count=1200
dd: writing `/dev/raw/raw1': No space left on device
1045+0 records in
1044+0 records out
106938368 bytes (107 MB) copied, 6.68439 seconds, 16.0 MB/s
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw2 bs=102400 count=1200
dd: writing `/dev/raw/raw2': No space left on device
1045+0 records in
1044+0 records out
106938368 bytes (107 MB) copied, 7.62786 seconds, 14.0 MB/s
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw5 bs=102400 count=1200
dd: writing `/dev/raw/raw5': No space left on device
1045+0 records in
1044+0 records out
106938368 bytes (107 MB) copied, 8.75194 seconds, 12.2 MB/s
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw6 bs=102400 count=1200
dd: writing `/dev/raw/raw6': No space left on device
1045+0 records in
1044+0 records out
106938368 bytes (107 MB) copied, 6.50958 seconds, 16.4 MB/s
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw6 bs=102400 count=3000
dd: writing `/dev/raw/raw6': No space left on device
1045+0 records in
1044+0 records out
106938368 bytes (107 MB) copied, 6.61992 seconds, 16.2 MB/s
[root@node1 ~]# dd if=/dev/zero of=/dev/raw/raw7 bs=102400 count=3000
dd: writing `/dev/raw/raw7': No space left on device
2509+0 records in
2508+0 records out
256884736 bytes (257 MB) copied, 16.0283 seconds, 16.0 MB/s
1. Stop the Clusterware stack on all nodes:
crsctl stop crs
Then format (zero out) all of the OCR and Votedisk devices.
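In this walkthrough the devices were already wiped by the dd commands in the simulation above; a condensed sketch of what this step amounts to, run as root on every node (the raw device path is just one of the devices used here; repeat the dd for each OCR/Votedisk device):
# stop the Clusterware stack on this node
crsctl stop crs
# overwrite one OCR / voting disk device with zeros
dd if=/dev/zero of=/dev/raw/raw1 bs=1M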
2. On each node, run the $CRS_HOME/install/rootdelete.sh script as root.
[root@node1 ~]# $CRS_HOME/install/rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'
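The transcript above only shows node1; the same script also has to be run as root on the second node, e.g.:
# on node2, as root
$CRS_HOME/install/rootdelete.sh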
3. On any one node, run the $CRS_HOME/install/rootdeinstall.sh script as root.
[root@node1 ~]# $CRS_HOME/install/rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 2.36706 seconds, 4.4 MB/s
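A quick, optional way to confirm the OCR device really was wiped is to dump its first bytes; a sketch (substitute the raw device that actually holds the OCR in your setup):
# the start of a wiped OCR device should be all zero bytes
od -An -c /dev/raw/raw1 | head -3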
4. On the same node as in the previous step, run the $CRS_HOME/root.sh script as root.
[root@node1 ~]# $CRS_HOME/root.sh
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
assigning default hostname node1 for node 1.
assigning default hostname node2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: node1 node1-priv node1
node 2: node2 node2-priv node2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw1
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
node1
CSS is inactive on these nodes.
node2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.
5. On the remaining nodes, run the $CRS_HOME/root.sh script as root.
[root@node2 ~]# $CRS_HOME/root.sh
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/opt/ora10g/product/10.2.0' is not owned by root
WARNING: directory '/opt/ora10g/product' is not owned by root
WARNING: directory '/opt/ora10g' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
assigning default hostname node1 for node 1.
assigning default hostname node2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: node1 node1-priv node1
node 2: node2 node2-priv node2
clscfg: Arguments check out successfully.
NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
node1
node2
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
Error 0(Native: listNetInterfaces:[3])
[Error 0(Native: listNetInterfaces:[3])]
[root@node2 ~]# vipca
Error 0(Native: listNetInterfaces:[3])
[Error 0(Native: listNetInterfaces:[3])]
To resolve the error above:
[root@node1 ~]# oifcfg iflist
eth1 10.10.17.0
virbr0 192.168.122.0
eth0 192.168.1.0
[root@node1 ~]# oifcfg setif -global eth0/192.168.1.0:public
[root@node1 ~]# oifcfg setif -global eth1/10.10.17.0:cluster_interconnect
[root@node1 ~]# oifcfg iflist
eth1 10.10.17.0
virbr0 192.168.122.0
eth0 192.168.1.0
[root@node1 ~]# oifcfg getif
eth0 192.168.1.0 global public
eth1 10.10.17.0 global cluster_interconnect
Because of the error above, the ONS, GSD, and VIP resources were not created successfully, so vipca has to be run manually.
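With the public and cluster_interconnect interfaces now defined through oifcfg, vipca can be re-run by hand; a sketch (vipca is a GUI tool, so a usable X display is assumed):
# as root on one node, with DISPLAY set
vipca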
[root@node1 ~]# crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.node1.gsd application ONLINE ONLINE node1
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip application ONLINE ONLINE node1
ora.node2.gsd application ONLINE ONLINE node2
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip application ONLINE ONLINE node2
6. Reconfigure the listeners with netca and confirm that they are registered with Clusterware.
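A sketch of this step (netca is normally run interactively as the oracle user and registers the listeners with Clusterware when the wizard completes):
# as the oracle user; choose the cluster configuration and
# listener configuration options in the wizard
netca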
[root@node1 ~]# crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....E1.lsnr application ONLINE ONLINE node1
ora.node1.gsd application ONLINE ONLINE node1
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip application ONLINE ONLINE node1
ora....E2.lsnr application ONLINE ONLINE node2
ora.node2.gsd application ONLINE ONLINE node2
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip application ONLINE ONLINE node2
So far only the listeners, ONS, GSD, and VIP are registered in the OCR; ASM and the database still need to be registered as well.
7. Add ASM to the OCR (this must be done as the oracle user).
[root@node1 dbs]# srvctl add asm -n node1 -i +ASM1 -o /opt/ora10g/product/10.2.0/db_1
null
[PRKS-1030 : Failed to add configuration for ASM instance "+ASM1" on node "node1" in cluster registry, [PRKH-1001 : HASContext Internal Error]
[PRKH-1001 : HASContext Internal Error]]
[root@node1 dbs]# su - oracle
[oracle@node1 ~]$ srvctl add asm -n node1 -i +ASM1 -o /opt/ora10g/product/10.2.0/db_1
[oracle@node1 ~]$ srvctl add asm -n node2 -i +ASM2 -o /opt/ora10g/product/10.2.0/db_1
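Before starting anything, the registration can be verified with srvctl; a sketch:
# show the ASM configuration just recorded in the OCR for each node
srvctl config asm -n node1
srvctl config asm -n node2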
8. Start ASM.
[oracle@node1 ~]$ srvctl start asm -n node1
[oracle@node1 ~]$ srvctl start asm -n node2
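Whether both ASM instances actually came up can be checked with, for example:
# report the state of the ASM instance on each node
srvctl status asm -n node1
srvctl status asm -n node2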
If ORA-27550 is reported during startup, it is because RAC cannot determine which NIC to use as the private interconnect. The fix is to add the following parameters to the pfile of each ASM instance:
+ASM1.cluster_interconnects='10.10.17.221'
+ASM2.cluster_interconnects='10.10.17.222'
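The pfile location depends on the installation (typically something like $ORACLE_HOME/dbs/init+ASM1.ora on each node, but treat that path as an assumption). After adding the parameter, restart ASM so it takes effect, for example:
# bounce ASM on each node after editing its pfile
srvctl stop asm -n node1
srvctl start asm -n node1
srvctl stop asm -n node2
srvctl start asm -n node2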
9. Manually add the database object to the OCR.
[oracle@node1 ~]$ srvctl add database -d RACDB -o /opt/ora10g/product/10.2.0/db_1
10. Add the two instance objects.
[oracle@node1 ~]$ srvctl add instance -d RACDB -i RACDB1 -n node1
[oracle@node1 ~]$ srvctl add instance -d RACDB -i RACDB2 -n node2
11. Set the dependency of each database instance on its ASM instance.
[oracle@node1 ~]$ srvctl modify instance -d RACDB -i RACDB1 -s +ASM1
[oracle@node1 ~]$ srvctl modify instance -d RACDB -i RACDB2 -s +ASM2
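The database, instance and node mappings recorded in the OCR can now be double-checked; a sketch:
# list the instances and nodes registered for RACDB
srvctl config database -d RACDB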
12. Start the database.
[oracle@node1 ~]$ srvctl start database -d RACDB
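Whether both instances started can be checked with, for example:
# show the state of both RACDB instances
srvctl status database -d RACDB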
If ORA-27550 appears here as well, it is again because RAC cannot determine which NIC to use as the private interconnect; setting the parameter in the spfile and restarting the database resolves it:
SQL>alter system set cluster_interconnects='10.10.17.221' scope=spfile sid='RACDB1';
SQL>alter system set cluster_interconnects='10.10.17.222' scope=spfile sid='RACDB2';
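Because scope=spfile only takes effect at the next startup, restart the database afterwards; a sketch:
# restart so the new cluster_interconnects setting is picked up
srvctl stop database -d RACDB
srvctl start database -d RACDB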
[root@node1 ~]# crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....B1.inst application ONLINE ONLINE node1
ora....B2.inst application ONLINE ONLINE node2
ora.RACDB.db application ONLINE ONLINE node1
ora....SM1.asm application ONLINE ONLINE node1
ora....E1.lsnr application ONLINE ONLINE node1
ora.node1.gsd application ONLINE ONLINE node1
ora.node1.ons application ONLINE ONLINE node1
ora.node1.vip application ONLINE ONLINE node1
ora....SM2.asm application ONLINE ONLINE node2
ora....E2.lsnr application ONLINE ONLINE node2
ora.node2.gsd application ONLINE ONLINE node2
ora.node2.ons application ONLINE ONLINE node2
ora.node2.vip application ONLINE ONLINE node2