01_RAC_Cluster_Maintenance_votedisk_ocr_olr

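Clusterware disk file management
1. Clusterware uses three types of disk files: voting disks, the OCR, and the OLR. All three are critical to the stable operation of Clusterware. Routine work on these files includes backup, restore, add, delete, and upgrade operations.
2. Back up the OCR, voting disk, and OLR files before every Clusterware reconfiguration. If the reconfiguration goes wrong, the relevant files can then be restored, so that a failed change does not leave the Clusterware management tools unable to manage resources.
3. The four main management tools are CRSCTL, OCRCONFIG, OCRCHECK, and OCRDUMP.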

2. Viewing resources
1.
[grid@r2 ~]$ crs_stat
NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.FRA.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.LISTENER.lsnr
TYPE=ora.listener.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.LISTENER_SCAN1.lsnr
TYPE=ora.scan_listener.type
TARGET=ONLINE
STATE=ONLINE on r2

NAME=ora.OCR.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.asm
TYPE=ora.asm.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.cvu
TYPE=ora.cvu.type
TARGET=ONLINE
STATE=ONLINE on r2

NAME=ora.gsd
TYPE=ora.gsd.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.net1.network
TYPE=ora.network.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.oc4j
TYPE=ora.oc4j.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.ons
TYPE=ora.ons.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.orcl.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.r1.ASM1.asm
TYPE=application
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.r1.LISTENER_R1.lsnr
TYPE=application
TARGET=ONLINE
STATE=ONLINE on r1

2.
[grid@r2 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora…up.type    ONLINE    ONLINE    r1
ora.FRA.dg     ora…up.type    ONLINE    ONLINE    r1
ora…ER.lsnr    ora…er.type    ONLINE    ONLINE    r1
ora…N1.lsnr    ora…er.type    ONLINE    ONLINE    r1
ora.OCR.dg     ora…up.type    ONLINE    ONLINE    r1
ora.asm        ora.asm.type   ONLINE    ONLINE    r1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    r2
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora…network    ora…rk.type    ONLINE    ONLINE    r1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    r1
ora.ons        ora.ons.type   ONLINE    ONLINE    r1
ora.orcl.db    ora…se.type    ONLINE    ONLINE    r1
ora…SM1.asm    application    ONLINE    ONLINE    r1
ora…R1.lsnr    application    ONLINE    ONLINE    r1
ora.r1.gsd     application    OFFLINE   OFFLINE
ora.r1.ons     application    ONLINE    ONLINE    r1
ora.r1.vip     ora…t1.type    ONLINE    ONLINE    r1
ora…SM2.asm    application    ONLINE    ONLINE    r2
ora…R2.lsnr    application    ONLINE    ONLINE    r2
ora.r2.gsd     application    OFFLINE   OFFLINE
ora.r2.ons     application    ONLINE    ONLINE    r2
ora.r2.vip     ora…t1.type    ONLINE    ONLINE    r2
ora.scan1.vip  ora…ip.type    ONLINE    ONLINE    r1
[grid@r2 ~]$
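
The long-form crs_stat listing is a series of NAME=/TYPE=/TARGET=/STATE= records, which makes it easy to scan for resources that are supposed to be online but are not. The following is a minimal sketch, assuming the output is piped in as text; the function name find_down_resources and the sample records (including ora.demo.db) are made up for illustration. Note also that crs_stat is deprecated in 11gR2 in favor of crsctl status resource, whose output on multi-node resources is laid out differently, so this parser only targets the crs_stat format shown here.

```shell
#!/bin/sh
# Print resources whose TARGET is ONLINE but whose STATE is not ONLINE.
# Reads crs_stat-style records (NAME=/TYPE=/TARGET=/STATE=) from stdin.
find_down_resources() {
  awk -F= '
    $1 == "NAME"   { name = $2 }
    $1 == "TARGET" { target = $2 }
    $1 == "STATE"  {
      # STATE looks like "ONLINE on r1" or "OFFLINE"; keep the first word.
      split($2, parts, " ")
      if (target == "ONLINE" && parts[1] != "ONLINE") print name
    }
  '
}

# Hypothetical sample: ora.gsd is intentionally offline (TARGET=OFFLINE),
# while ora.demo.db should be online but is not.
sample='NAME=ora.DATA.dg
TYPE=ora.diskgroup.type
TARGET=ONLINE
STATE=ONLINE on r1

NAME=ora.gsd
TYPE=ora.gsd.type
TARGET=OFFLINE
STATE=OFFLINE

NAME=ora.demo.db
TYPE=ora.database.type
TARGET=ONLINE
STATE=OFFLINE'

printf '%s\n' "$sample" | find_down_resources
```

On a live cluster the same function could be fed directly, for example with crs_stat | find_down_resources. Resources such as ora.gsd are skipped because their target state is OFFLINE.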

3. Using the crsctl command
[grid@r2 ~]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[grid@r2 ~]$ crsctl check ctss
CRS-4701: The Cluster Time Synchronization Service is in Active mode.
CRS-4702: Offset (in msec): 0
[grid@r2 ~]$ crsctl check cluster -all

r1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

r2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[grid@r2 ~]$

4. Checking the ASM instance status
4.1 View the ASM-related processes
[grid@r2 ~]$ ps -ef|grep asm
grid 12918 1 0 Dec20 ? 00:01:12 asm_pmon_+ASM2
grid 12921 1 0 Dec20 ? 00:01:07 asm_psp0_+ASM2
grid 12923 1 3 Dec20 ? 02:03:46 asm_vktm_+ASM2
grid 12928 1 0 Dec20 ? 00:00:09 asm_gen0_+ASM2
grid 12930 1 0 Dec20 ? 00:05:29 asm_diag_+ASM2
grid 12932 1 0 Dec20 ? 00:00:59 asm_ping_+ASM2
grid 12934 1 0 Dec20 ? 00:12:21 asm_dia0_+ASM2
grid 12936 1 0 Dec20 ? 00:15:04 asm_lmon_+ASM2
grid 12938 1 0 Dec20 ? 00:10:20 asm_lmd0_+ASM2
grid 12940 1 0 Dec20 ? 00:23:01 asm_lms0_+ASM2
grid 12944 1 0 Dec20 ? 00:00:27 asm_lmhb_+ASM2
grid 12946 1 0 Dec20 ? 00:00:10 asm_mman_+ASM2
grid 12949 1 0 Dec20 ? 00:00:10 asm_dbw0_+ASM2
grid 12951 1 0 Dec20 ? 00:00:10 asm_lgwr_+ASM2
grid 12953 1 0 Dec20 ? 00:00:20 asm_ckpt_+ASM2
grid 12955 1 0 Dec20 ? 00:00:08 asm_smon_+ASM2
grid 12957 1 0 Dec20 ? 00:02:16 asm_rbal_+ASM2
grid 12959 1 0 Dec20 ? 00:00:33 asm_gmon_+ASM2
grid 12961 1 0 Dec20 ? 00:00:30 asm_mmon_+ASM2
grid 12963 1 0 Dec20 ? 00:01:25 asm_mmnl_+ASM2
grid 12966 1 0 Dec20 ? 00:00:47 asm_lck0_+ASM2
grid 13016 1 0 Dec20 ? 00:00:04 asm_asmb_+ASM2
grid 13024 1 0 Dec20 ? 00:00:26 oracle+ASM2_asmb_+asm2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
grid 91821 1 0 19:02 ? 00:00:00 asm_pz99_+ASM2
grid 92443 119595 0 19:06 pts/1 00:00:00 grep --color=auto asm
oracle 119186 1 0 Dec21 ? 00:00:03 ora_asmb_orcl2
grid 119190 1 0 Dec21 ? 00:00:16 oracle+ASM2_asmb_orcl2 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
[grid@r2 ~]$
4.2 Check the status of the ASM instance
SQL> select instance_name,status from v$instance;

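5. Managing voting files
Voting files store node membership information. Each node reads the voting files when it starts and tries to join the RAC cluster, in order to determine its membership. Because these files are so important, multiple voting files need to be created when Clusterware is installed. Each voting file has a unique ID, the File Universal Id (FUID).
1. If the voting files are stored in an ASM disk group, the redundancy of the disk group protects them.
2. Back up the voting files.
3. If necessary, you can add voting files, delete surplus ones, or migrate voting files from one storage location to another.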

5.1 View the number, state, FUID, disk name, and disk group name of each voting file

[grid@r2 ~]$ crsctl query css votedisk
 ## STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   5ab0e286c1984f61bfbde3f60d29550e (/dev/asm-disksdb1) [OCR]
Located 1 voting disk(s).
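
For monitoring, the listing can be reduced to one line per voting file plus a count of the ONLINE ones. This is a small sketch that parses the text format shown above; the function name summarize_votedisks is an assumption for illustration, and the sample simply repeats the output from this cluster.

```shell
#!/bin/sh
# Summarize "crsctl query css votedisk" output read from stdin:
# one line per voting file (state, FUID, path), then an ONLINE count.
summarize_votedisks() {
  awk '
    $1 ~ /^[0-9]+\.$/ {
      path = $4
      gsub(/[()]/, "", path)          # strip the parentheses around the path
      printf "%s %s %s\n", $2, $3, path
      if ($2 == "ONLINE") online++
    }
    END { printf "online=%d\n", online }
  '
}

sample=' ##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   5ab0e286c1984f61bfbde3f60d29550e (/dev/asm-disksdb1) [OCR]
Located 1 voting disk(s).'

printf '%s\n' "$sample" | summarize_votedisks
```

With a live cluster the input would come from crsctl query css votedisk | summarize_votedisks. The FUID printed in the second column is the value that crsctl delete css votedisk expects.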

5.2 Delete a damaged voting file, identified by its FUID.

[grid@r2 ~]$ crsctl delete css votedisk 5ab0e286c1984f61bfbde3f60d29550e

5.3 Add a voting file.

[grid@r2 ~]$ crsctl add css votedisk +OCR  # not supported for ASM disk groups
CRS-4671: This command is not supported for ASM diskgroups.
CRS-4000: Command Add failed, or completed with errors.
[grid@r2 ~]$ 

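Notes:
1. When voting files are stored in ASM, they cannot be added or deleted individually. Their number is determined by the redundancy level of the disk group (external, normal, or high).
For example, a normal redundancy disk group must have 3 failure groups and holds 3 voting files; a high redundancy disk group must have 5 failure groups and holds 5 voting files; an external redundancy disk group holds only 1 voting file.
2. As of 11g, backing up and restoring voting files with the dd command is no longer supported. Use the crsctl commands instead.
3. The number of voting files should be odd so that votes cannot tie. At most 15 voting files are allowed, but there is generally no need for more than 5.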
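
The arithmetic behind these notes can be written down in a few lines. This is an illustrative sketch only: the function names are made up, and the failure tolerance assumes the usual majority rule, under which a node must be able to access more than half of the voting files.

```shell
#!/bin/sh
# Number of voting files ASM keeps for a given disk group redundancy level.
votedisks_for_redundancy() {
  case "$1" in
    external) echo 1 ;;
    normal)   echo 3 ;;
    high)     echo 5 ;;
    *)        echo "unknown redundancy: $1" >&2; return 1 ;;
  esac
}

# With n voting files a node must reach more than half of them,
# so at most (n - 1) / 2 files can be lost (integer division).
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for level in external normal high; do
  n=$(votedisks_for_redundancy "$level")
  echo "$level: $n voting file(s), tolerates $(tolerated_failures "$n") failure(s)"
done
```

The loop shows why an even count adds nothing: going from 3 to 4 voting files would still tolerate only one failure.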

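5.4 Migrate voting files from one storage location to another.
[grid@r2 ~]$ crsctl replace votedisk +database

5.5 If all voting files are damaged, they must be restored from the OCR. Because the voting files are damaged, Clusterware cannot start normally, so start it in exclusive mode. In this mode, Clusterware does not read the voting files.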
[grid@r2 ~]$ su root
Password:
[root@r2 grid]# crsctl start crs -excl

[root@r2 grid]# crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@r2 grid]#

[root@r2 grid]# crsctl query css votedisk
[root@r2 grid]# crsctl replace votedisk +database  # restores the voting files; the previous voting files are deleted
[root@r2 grid]# crsctl stop crs
[root@r2 grid]# crsctl start crs
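
The recovery sequence above can be collected into a script so that the steps always run in the same order. The sketch below only uses the crsctl commands from the transcript; the helper names run and restore_votedisks and the DRY_RUN switch are assumptions for illustration. It defaults to a dry run that prints the commands, since the real sequence must be run as root on a cluster whose voting files are actually lost.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Restore the voting files into the given disk group (for example +database).
# Stops at the first failing step.
restore_votedisks() {
  run crsctl start crs -excl          || return 1   # exclusive mode
  run crsctl query css votedisk       || return 1   # show the current state
  run crsctl replace votedisk "$1"    || return 1   # re-create the voting files
  run crsctl stop crs                 || return 1
  run crsctl start crs                              # normal restart
}

restore_votedisks +database
```

Setting DRY_RUN=0 would execute the commands for real; review the printed sequence against your own environment before doing so.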

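6. Managing OCR files
The OCR stores configuration information for Clusterware and the databases. The Clusterware installation calls for creating multiple OCR files. If the OCR files are stored in an ASM disk group, the redundancy of the disk group protects them. OCR management mainly covers backup, restore, add, delete, and migration.
OCR backups are taken automatically. While Clusterware is running, the OCR is backed up every four hours and the last three backups are kept. One additional backup is produced and kept at the end of each day and at the end of each week. The OCR can also be backed up manually with the ocrconfig command, run as root.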

6.1 Back up the OCR manually
[root@r2 grid]# ocrconfig -manualbackup
r1 2019/12/22 22:41:08 /u01/app/grid/11.2.0/cdata/rac-cluster/backup_20191222_224108.ocr
Note: the command was run on node r2, but the backup was written on node r1.

6.2 List the OCR backups with ocrconfig
[root@r2 cdata]# ocrconfig -showbackup

r1 2019/12/22 22:26:19 /u01/app/grid/11.2.0/cdata/rac-cluster/backup00.ocr

r1 2019/12/22 18:26:18 /u01/app/grid/11.2.0/cdata/rac-cluster/backup01.ocr

r1 2019/12/22 14:26:17 /u01/app/grid/11.2.0/cdata/rac-cluster/backup02.ocr

r1 2019/12/21 02:26:05 /u01/app/grid/11.2.0/cdata/rac-cluster/day.ocr

r1 2019/12/20 06:25:54 /u01/app/grid/11.2.0/cdata/rac-cluster/week.ocr

r1 2019/12/22 22:41:08 /u01/app/grid/11.2.0/cdata/rac-cluster/backup_20191222_224108.ocr
[root@r2 cdata]#
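
When a restore is needed, the first question is usually which backup is the newest. The following sketch picks it out of the -showbackup listing; the function name latest_ocr_backup is an assumption for illustration, and the sample repeats the listing above. Keep in mind that the path is local to the node named in the first column (r1 here).

```shell
#!/bin/sh
# Print the path of the most recent backup listed by "ocrconfig -showbackup".
# Each data line is: <node> <YYYY/MM/DD> <HH:MM:SS> <path>
latest_ocr_backup() {
  awk '
    NF == 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ {
      stamp = $2 " " $3               # zero-padded, so string order = time order
      if (stamp > best) { best = stamp; path = $4 }
    }
    END { if (path != "") print path }
  '
}

sample='r1 2019/12/22 22:26:19 /u01/app/grid/11.2.0/cdata/rac-cluster/backup00.ocr
r1 2019/12/22 18:26:18 /u01/app/grid/11.2.0/cdata/rac-cluster/backup01.ocr
r1 2019/12/22 14:26:17 /u01/app/grid/11.2.0/cdata/rac-cluster/backup02.ocr
r1 2019/12/21 02:26:05 /u01/app/grid/11.2.0/cdata/rac-cluster/day.ocr
r1 2019/12/20 06:25:54 /u01/app/grid/11.2.0/cdata/rac-cluster/week.ocr
r1 2019/12/22 22:41:08 /u01/app/grid/11.2.0/cdata/rac-cluster/backup_20191222_224108.ocr'

printf '%s\n' "$sample" | latest_ocr_backup
```

On a live system the input would come from ocrconfig -showbackup | latest_ocr_backup, run as root.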

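6.3 If a resource in the RAC cluster fails to start, use the ocrcheck command to check whether the OCR is corrupted. If it is, restore it from an OCR backup. If the OCR was damaged because its ASM disk group was damaged, the ASM disk group must be re-created first.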

[grid@r1 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
	 Version                  :          3
	 Total space (kbytes)     :     262120
	 Used space (kbytes)      :       3308
	 Available space (kbytes) :     258812
	 ID                       : 2033750317
	 Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

	 Cluster registry integrity check succeeded

	 Logical corruption check bypassed due to non-privileged user

[grid@racn1 ~]$ 
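
The ocrcheck report can be condensed into two values that are easy to alert on: the percentage of OCR space in use and whether the registry integrity check succeeded. This is a rough sketch that parses the text layout shown above; the function name ocr_health and the output keys used_pct and integrity are made up for illustration.

```shell
#!/bin/sh
# Report OCR space usage and whether the integrity check passed,
# based on "ocrcheck" output read from stdin.
ocr_health() {
  awk -F: '
    /Total space \(kbytes\)/ { total = $2 + 0 }
    /Used space \(kbytes\)/  { used  = $2 + 0 }
    /Cluster registry integrity check succeeded/ { ok = 1 }
    END {
      if (total > 0) printf "used_pct=%.1f\n", used * 100 / total
      printf "integrity=%s\n", (ok ? "ok" : "FAILED")
    }
  '
}

sample='Status of Oracle Cluster Registry is as follows :
     Version                  :          3
     Total space (kbytes)     :     262120
     Used space (kbytes)      :       3308
     Available space (kbytes) :     258812
     ID                       : 2033750317
     Device/File Name         :   +OCRVOTE
     Cluster registry integrity check succeeded'

printf '%s\n' "$sample" | ocr_health
```

As the transcript shows, ocrcheck skips the logical corruption check for a non-privileged user, so run it as root when that check matters.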

Add an OCR file to an ASM disk group

[root@r1 ~]# cd /u01/app/grid/11.2.0/bin
[root@r1 bin]# ./ocrconfig -add +FRA

[root@r1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
	 Version                  :          3
	 Total space (kbytes)     :     262120
	 Used space (kbytes)      :       3308
	 Available space (kbytes) :     258812
	 ID                       : 2033750317
	 Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded
	 Device/File Name         :       +FRA
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

	 Cluster registry integrity check succeeded

	 Logical corruption check succeeded

Delete an unneeded OCR location

[root@r1 bin]# ./ocrconfig -delete +FRA
[root@r1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
	 Version                  :          3
	 Total space (kbytes)     :     262120
	 Used space (kbytes)      :       3308
	 Available space (kbytes) :     258812
	 ID                       : 2033750317
	 Device/File Name         :   +OCRVOTE
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

	 Cluster registry integrity check succeeded

	 Logical corruption check succeeded

Note: on a cluster file system, create at least two OCR files to avoid a single point of failure.

View the OCR location file

[root@racn1 bin]# cat /etc/oracle/ocr.loc
#Device/file +FRA being deleted 
ocrconfig_loc=+OCRVOTE
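
Scripts that need the current OCR location can read it from this file instead of parsing ocrcheck output. Below is a minimal sketch, assuming the simple key=value layout shown above with # starting a comment; the function name ocr_loc_value is hypothetical, and the demo works on a temporary copy rather than on /etc/oracle/ocr.loc itself.

```shell
#!/bin/sh
# Print the value of a key from an ocr.loc-style file (key=value lines,
# "#" starts a comment). Usage: ocr_loc_value <key> <file>
ocr_loc_value() {
  awk -F= -v key="$1" '
    /^[ \t]*#/ { next }               # skip comment lines
    $1 == key  { print substr($0, index($0, "=") + 1) }
  ' "$2"
}

# Demo on a temporary copy of the file content shown above.
tmp=$(mktemp)
printf '%s\n' '#Device/file +FRA being deleted ' 'ocrconfig_loc=+OCRVOTE' > "$tmp"
ocr_loc_value ocrconfig_loc "$tmp"
rm -f "$tmp"
```

On a cluster node the call would be ocr_loc_value ocrconfig_loc /etc/oracle/ocr.loc. The file should be treated as read-only here; changes to OCR locations belong to ocrconfig.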

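Replace an OCR file
./ocrconfig -replace <current_ocr_location> -replacement <new_ocr_location>

Managing the OLR
In Clusterware 11gR2, every node in the cluster has its own node-specific Oracle Local Registry (OLR). The OLR is installed and configured when Clusterware installs the OCR. Multiple processes on a node can read and write that node's OLR concurrently, regardless of whether Oracle Clusterware is running or fully functional. The OLR is the node-local counterpart of the OCR. It holds the information Clusterware needs to manage the node, including the dependencies between services, and Oracle High Availability Services uses this information. The OLR is stored on local storage on each node of the cluster.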

Check the OLR status

[root@racn1 bin]# ./ocrcheck -local
Status of Oracle Local Registry is as follows :
	 Version                  :          3
	 Total space (kbytes)     :     262120
	 Used space (kbytes)      :       2672
	 Available space (kbytes) :     259448
	 ID                       : 1041700445
	 Device/File Name         : /u01/app/grid/11.2.0/cdata/racn1.olr
                                    Device/File integrity check succeeded

	 Local registry integrity check succeeded

	 Logical corruption check succeeded

Back up the OLR manually

[root@racn1 bin]# ./ocrconfig -local -manualbackup

racn1     2020/03/03 23:52:28     /u01/app/grid/11.2.0/cdata/racn1/backup_20200303_235228.olr

racn1     2020/03/03 23:52:03     /u01/app/grid/11.2.0/cdata/racn1/backup_20200303_235203.olr

racn1     2020/02/25 07:09:17     /u01/app/grid/11.2.0/cdata/racn1/backup_20200225_070917.olr

racn1     2020/02/25 06:24:42     /u01/app/grid/11.2.0/cdata/racn1/backup_20200225_062442.olr
[root@racn1 bin]# 

List the manual backups

[root@racn1 bin]# ./ocrconfig -local -showbackup manual

racn1     2020/03/03 23:52:28     /u01/app/grid/11.2.0/cdata/racn1/backup_20200303_235228.olr

racn1     2020/03/03 23:52:03     /u01/app/grid/11.2.0/cdata/racn1/backup_20200303_235203.olr

racn1     2020/02/25 07:09:17     /u01/app/grid/11.2.0/cdata/racn1/backup_20200225_070917.olr

racn1     2020/02/25 06:24:42     /u01/app/grid/11.2.0/cdata/racn1/backup_20200225_062442.olr
[root@racn1 bin]# 

Change the OLR backup location

[root@racn1 bin]# ./ocrconfig -local -backuploc /u01/app/grid/11.2.0/cdata/racn1/test
[root@racn1 bin]# ./ocrconfig -local -manualbackup

racn1     2020/03/04 00:04:37     /u01/app/grid/11.2.0/cdata/racn1/test/backup_20200304_000437.olr

racn1     2020/03/03 23:52:28     /u01/app/grid/11.2.0/cdata/racn1/backup_20200303_235228.olr

racn1     2020/03/03 23:52:03     /u01/app/grid/11.2.0/cdata/racn1/backup_20200303_235203.olr

racn1     2020/02/25 07:09:17     /u01/app/grid/11.2.0/cdata/racn1/backup_20200225_070917.olr

racn1     2020/02/25 06:24:42     /u01/app/grid/11.2.0/cdata/racn1/backup_20200225_062442.olr
[root@racn1 bin]# 

Restore the OLR

[root@racn1 bin]# ./crsctl stop crs
[root@racn1 bin]# ./ocrconfig -local -restore /u01/app/grid/11.2.0/cdata/racn1/test/backup_20200304_000437.olr
[root@racn1 bin]# ./ocrcheck -local
Status of Oracle Local Registry is as follows :
	 Version                  :          3
	 Total space (kbytes)     :     262120
	 Used space (kbytes)      :       2720
	 Available space (kbytes) :     259400
	 ID                       : 1041700445
	 Device/File Name         : /u01/app/grid/11.2.0/cdata/racn1.olr
                                    Device/File integrity check succeeded

	 Local registry integrity check succeeded

	 Logical corruption check succeeded

[root@racn1 bin]# 
[root@racn1 bin]# ./crsctl start crs

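Note: the root.sh script backs up the OLR after Clusterware is installed or upgraded on a node, or after a node is added to the cluster. After that, the OLR must be backed up manually, because automatic OLR backups are not supported. When the OCR is migrated from ASM to another storage type, or from another storage type to ASM, create a new OLR backup.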
