The OCR (Oracle Cluster Registry) stores the configuration information for the entire cluster. It resides on shared storage, and if it becomes corrupted, CRS will run into problems.
[root@racr1 ~]# su - oracle
racr1->
racr1-> ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 524024
Used space (kbytes) : 3832
Available space (kbytes) : 520192
ID : 801841162
Device/File Name : /dev/raw/raw1
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
racr1->
Normally, we can use the ocrcheck command to see where the OCR is stored and check its current status.
There is another command for maintaining the OCR device: ocrconfig.
racr1-> ocrconfig
Name:
ocrconfig - Configuration tool for Oracle Cluster Registry.
Synopsis:
ocrconfig [option]
option:
-export <filename> [-s online]
- Export cluster register contents to a file
-import <filename> - Import cluster registry contents from a file
-upgrade [<user> [<group>]]
- Upgrade cluster registry from previous version
-downgrade [-version <version string>]
- Downgrade cluster registry to the specified version
-backuploc <dirname> - Configure periodic backup location
-showbackup - Show backup information
-restore <filename> - Restore from physical backup
-replace ocr|ocrmirror [<filename>] - Add/replace/remove a OCR device/file
-overwrite - Overwrite OCR configuration on disk
-repair ocr|ocrmirror <filename> - Repair local OCR configuration
-help - Print out this help information
Note:
A log file will be created in
$ORACLE_HOME/log/<hostname>/client/ocrconfig_<pid>.log. Please ensure
you have file creation privileges in the above directory before
running this tool.
racr1->
As you can see, this command has export, backup, and restore options, so it can be used to back up and recover the OCR.
The following demonstrates how to recover after the OCR has been corrupted.
[root@racr1 ~]# cd /u01/app/oracle/product/10.2.0/crs_1/bin/
[root@racr1 bin]#
[root@racr1 bin]# ./ocrconfig -export /home/oracle/ocrexp.exp -s online
[root@racr1 bin]#
[root@racr1 bin]# cd /home/oracle/
[root@racr1 oracle]# ls
ocrexp.exp OracleHome.tar
[root@racr1 oracle]#
First take a backup of the OCR. By default, the clusterware also backs up the OCR automatically every 4 hours; you can list those automatic backups with ocrconfig -showbackup.
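The manual export above can be wrapped in a small date-stamped script for periodic use. This is only a sketch: the CRS home path is the one used in this demo, the backup directory is a hypothetical choice, and the -export / -s online flags are taken from the ocrconfig help output shown earlier.

```shell
#!/bin/sh
# Sketch: date-stamped logical export of the OCR (run as root).
# CRS_HOME matches this demo; BACKUP_DIR is a hypothetical location.
CRS_HOME=/u01/app/oracle/product/10.2.0/crs_1
BACKUP_DIR=/home/oracle/ocr_backups

mkdir -p "$BACKUP_DIR"
FILE="$BACKUP_DIR/ocrexp_$(date +%Y%m%d_%H%M%S).exp"

# Online export, as shown in the ocrconfig help above.
"$CRS_HOME/bin/ocrconfig" -export "$FILE" -s online \
  && echo "OCR exported to $FILE"
```

A script like this could be dropped into cron to supplement the automatic 4-hourly physical backups with logical exports.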
Now that the backup is done, let's corrupt the OCR.
[root@racr1 bin]# dd if=/dev/zero of=/dev/raw/raw1 bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 20.4541 seconds, 5.1 MB/s
[root@racr1 bin]#
[root@racr1 bin]#
This overwrites the OCR that the cluster is currently using. We can run ocrcheck to see the current state:
[root@racr1 oracle]# su - oracle
racr1->
racr1-> ocrcheck
PROT-601: Failed to initialize ocrcheck
The command can no longer run, which confirms that the OCR is corrupted. The cluvfy tool can also verify this:
racr1-> /opt/clusterware/cluvfy/runcluvfy.sh comp ocr -n all
Verifying OCR integrity
Unable to retrieve nodelist from Oracle clusterware.
Verification cannot proceed.
racr1->
Since it is definitely corrupted, let's restore it.
[root@racr1 bin]# ./ocrconfig -import /opt/ocrexp.exp
PROT-19: Cannot proceed while clusterware is running. Shutdown clusterware first
[root@racr1 bin]#
[root@racr1 bin]# ./crsctl stop crs
OCR initialization failed with invalid format: PROC-22: The OCR backend has an invalid format
During the restore I hit a problem: the import requires the clusterware to be shut down first, but shutting down the clusterware fails because the OCR itself is broken. At this point the only option is to kill the process. But which one?
[root@racr1 bin]# ps -ef | grep crs
root 17039 1841 0 23:51 pts/4 00:00:00 grep crs
root 17715 14950 0 22:53 ? 00:00:00 /bin/su -l oracle -c sh -c 'ulimit -c unlimited; cd /u01/app/oracle/product/10.2.0/crs_1/log/racr1/evmd; exec /u01/app/oracle/product/10.2.0/crs_1/bin/evmd '
oracle 17719 17715 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/evmd.bin
root 18171 17860 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/oprocd.bin run -t 1000 -m 500 -f
root 18218 17867 0 22:53 ? 00:00:00 /sbin/runuser -l oracle -c /bin/sh -c 'cd /u01/app/oracle/product/10.2.0/crs_1/log/racr1/cssd/oclsomon; ulimit -c unlimited; /u01/app/oracle/product/10.2.0/crs_1/bin/oclsomon || exit $?'
oracle 18219 18218 0 22:53 ? 00:00:00 /bin/sh -c cd /u01/app/oracle/product/10.2.0/crs_1/log/racr1/cssd/oclsomon; ulimit -c unlimited; /u01/app/oracle/product/10.2.0/crs_1/bin/oclsomon || exit $?
oracle 18243 18219 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/oclsomon.bin
oracle 18344 17920 0 22:53 ? 00:00:03 /u01/app/oracle/product/10.2.0/crs_1/bin/ocssd.bin
oracle 18495 17719 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/evmlogger.bin -o /u01/app/oracle/product/10.2.0/crs_1/evm/log/evmlogger.info -l /u01/app/oracle/product/10.2.0/crs_1/evm/log/evmlogger.log
oracle 18781 1 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/opmn/bin/ons -d
oracle 18782 18781 0 22:53 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/opmn/bin/ons -d
root 23225 1 0 23:03 ? 00:00:00 /bin/sh /etc/init.d/init.crsd run
root 24573 23225 0 23:06 ? 00:00:06 /u01/app/oracle/product/10.2.0/crs_1/bin/crsd.bin restart
[root@racr1 bin]#
[root@racr1 bin]# kill -9 24573
As you can see, the key is to kill the crsd.bin process spawned by init.crsd. After that, the OCR backup can be imported:
[root@racr1 bin]# ./ocrconfig -import /opt/ocrexp.exp
[root@racr1 bin]# ./crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
[root@racr1 bin]#
[root@racr1 bin]# ./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
The cluster stack is now up, but the resources are not yet usable:
[root@racr1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....D1.inst application ONLINE OFFLINE
ora....D2.inst application ONLINE OFFLINE
ora.PROD.db application ONLINE OFFLINE
ora....SM1.asm application ONLINE OFFLINE
ora....R1.lsnr application ONLINE OFFLINE
ora.racr1.gsd application ONLINE OFFLINE
ora.racr1.ons application ONLINE OFFLINE
ora.racr1.vip application ONLINE OFFLINE
ora....SM2.asm application ONLINE OFFLINE
ora....R2.lsnr application ONLINE OFFLINE
ora.racr2.gsd application ONLINE OFFLINE
ora.racr2.ons application ONLINE OFFLINE
ora.racr2.vip application ONLINE OFFLINE
[root@racr1 bin]#
[root@racr1 bin]# crs_start -all
-bash: crs_start: command not found
[root@racr1 bin]# ./crs_start -all
Attempting to start `ora.racr1.ASM1.asm` on member `racr1`
Attempting to start `ora.racr1.vip` on member `racr1`
Attempting to start `ora.racr2.vip` on member `racr2`
Attempting to start `ora.racr2.ASM2.asm` on member `racr2`
Start of `ora.racr1.vip` on member `racr1` succeeded.
Attempting to start `ora.racr1.LISTENER_RACR1.lsnr` on member `racr1`
Start of `ora.racr2.vip` on member `racr2` succeeded.
Attempting to start `ora.racr2.LISTENER_RACR2.lsnr` on member `racr2`
Start of `ora.racr1.LISTENER_RACR1.lsnr` on member `racr1` succeeded.
Start of `ora.racr2.LISTENER_RACR2.lsnr` on member `racr2` succeeded.
Start of `ora.racr1.ASM1.asm` on member `racr1` succeeded.
Attempting to start `ora.PROD.PROD1.inst` on member `racr1`
Start of `ora.racr2.ASM2.asm` on member `racr2` succeeded.
Attempting to start `ora.PROD.PROD2.inst` on member `racr2`
Start of `ora.PROD.PROD1.inst` on member `racr1` succeeded.
Start of `ora.PROD.PROD2.inst` on member `racr2` succeeded.
CRS-1002: Resource 'ora.racr1.ons' is already running on member 'racr1'
CRS-1002: Resource 'ora.racr2.ons' is already running on member 'racr2'
CRS-1002: Resource 'ora.PROD.db' is already running on member 'racr2'
Attempting to start `ora.racr1.gsd` on member `racr1`
Attempting to start `ora.racr2.gsd` on member `racr2`
Start of `ora.racr1.gsd` on member `racr1` succeeded.
Start of `ora.racr2.gsd` on member `racr2` succeeded.
CRS-0223: Resource 'ora.PROD.db' has placement error.
CRS-0223: Resource 'ora.racr1.ons' has placement error.
CRS-0223: Resource 'ora.racr2.ons' has placement error.
[root@racr1 bin]# ./crs_s
-bash: ./crs_s: No such file or directory
[root@racr1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora....D1.inst application ONLINE ONLINE racr1
ora....D2.inst application ONLINE ONLINE racr2
ora.PROD.db application ONLINE ONLINE racr2
ora....SM1.asm application ONLINE ONLINE racr1
ora....R1.lsnr application ONLINE ONLINE racr1
ora.racr1.gsd application ONLINE ONLINE racr1
ora.racr1.ons application ONLINE ONLINE racr1
ora.racr1.vip application ONLINE ONLINE racr1
ora....SM2.asm application ONLINE ONLINE racr2
ora....R2.lsnr application ONLINE ONLINE racr2
ora.racr2.gsd application ONLINE ONLINE racr2
ora.racr2.ons application ONLINE ONLINE racr2
ora.racr2.vip application ONLINE ONLINE racr2
[root@racr1 bin]#
Finally, verify the result:
[root@racr1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
Version : 2
Total space (kbytes) : 524024
Used space (kbytes) : 3832
Available space (kbytes) : 520192
ID : 673724156
Device/File Name : /dev/raw/raw1
Device/File integrity check succeeded
Device/File not configured
Cluster registry integrity check succeeded
[root@racr1 bin]#
Good, ocrcheck works again. Note that every time you run the ocrcheck command, a log file named ocrcheck_<pid>.log is created in the $CRS_HOME/log/<NODENAME>/client directory.
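Those client logs accumulate over time, so it can be handy to pull up the most recent one. A minimal sketch, assuming the CRS home path used throughout this demo and the log directory layout stated in the note above:

```shell
# Show the newest ocrcheck client log (path layout per the note above).
CRS_HOME=/u01/app/oracle/product/10.2.0/crs_1
LOGDIR="$CRS_HOME/log/$(hostname -s)/client"

# ls -t sorts newest first; head -1 keeps only the latest log file.
ls -t "$LOGDIR"/ocrcheck_*.log 2>/dev/null | head -1
```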