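This article covers backup and recovery of ASM disk metadata and ASM disk group metadata, and also covers backup and recovery of the Voting Disk and the OCR, two important components of an Oracle RAC environment.
I. Backing up ASM disk headers
1. Query the disk paths (paths vary by platform; the output below is from Linux, whereas Solaris shows something like '/dev/rdsk/c0t60080E5000367E98000004A25373CF5Cd0s6')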
[grid@ractst01 ~]$ sqlplus / as sysasm
SQL> select g.group_number g_no, g.name, d.disk_number, d.mount_status, d.header_status, g.type, d.name, d.path from v$asm_disk d, v$asm_diskgroup g where g.group_number = d.group_number;
      G_NO NAME       DISK_NUMBER MOUNT_S HEADER_STATU TYPE   NAME       PATH
---------- ---------- ----------- ------- ------------ ------ ---------- ------------------------------
         1 DATA                 0 CACHED  MEMBER       EXTERN DISK1      ORCL:DISK1
         1 DATA                 1 CACHED  MEMBER       EXTERN DISK2      ORCL:DISK2
         1 DATA                 2 CACHED  MEMBER       EXTERN DISK3      ORCL:DISK3
2. Back up the ASM disk headers (dd or other tools also work, but kfed is recommended)
[grid@ractst01 ~]$ kfed read ORCL:DISK1 te=/home/grid/mdbackup/disk1.bak
[grid@ractst01 ~]$ more /home/grid/mdbackup/disk1.bak
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 2074855584 ; 0x00c: 0x7babc8a0
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISKDISK1 ; 0x000: length=13
kfdhdb.driver.reserved[0]: 1263749444 ; 0x008: 0x4b534944
kfdhdb.driver.reserved[1]: 49 ; 0x00c: 0x00000031
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 186646528 ; 0x020: 0x0b200000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: DISK1 ; 0x028: length=5
kfdhdb.grpname: DATA ; 0x048: length=4
kfdhdb.fgname: DISK1 ; 0x068: length=5
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 33025220 ; 0x0a8: HOUR=0x4 DAYS=0x6 MNTH=0xb YEAR=0x7df
kfdhdb.crestmp.lo: 2096627712 ; 0x0ac: USEC=0x0 MSEC=0x200 SECS=0xf MINS=0x1f
kfdhdb.mntstmp.hi: 33025221 ; 0x0b0: HOUR=0x5 DAYS=0x6 MNTH=0xb YEAR=0x7df
kfdhdb.mntstmp.lo: 3831974912 ; 0x0b4: USEC=0x0 MSEC=0x1d3 SECS=0x6 MINS=0x39
kfdhdb.secsize: 512 ; 0x0b8: 0x0200
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 8189 ; 0x0c4: 0x00001ffd
kfdhdb.pmcnt: 2 ; 0x0c8: 0x00000002
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 33025220 ; 0x0e4: HOUR=0x4 DAYS=0x6 MNTH=0xb YEAR=0x7df
kfdhdb.grpstmp.lo: 2096453632 ; 0x0e8: USEC=0x0 MSEC=0x156 SECS=0xf MINS=0x1f
kfdhdb.vfstart: 128 ; 0x0ec: 0x00000080
kfdhdb.vfend: 160 ; 0x0f0: 0x000000a0
kfdhdb.spfile: 0 ; 0x0f4: 0x00000000
kfdhdb.spfflg: 0 ; 0x0f8: 0x00000000
kfdhdb.ub4spare[0]: 0 ; 0x0fc: 0x00000000
kfdhdb.ub4spare[1]: 0 ; 0x100: 0x00000000
kfdhdb.ub4spare[2]: 0 ; 0x104: 0x00000000
kfdhdb.ub4spare[3]: 0 ; 0x108: 0x00000000
kfdhdb.ub4spare[4]: 0 ; 0x10c: 0x00000000
kfdhdb.ub4spare[5]: 0 ; 0x110: 0x00000000
kfdhdb.ub4spare[6]: 0 ; 0x114: 0x00000000
kfdhdb.ub4spare[7]: 0 ; 0x118: 0x00000000
kfdhdb.ub4spare[8]: 0 ; 0x11c: 0x00000000
kfdhdb.ub4spare[9]: 0 ; 0x120: 0x00000000
kfdhdb.ub4spare[10]: 0 ; 0x124: 0x00000000
kfdhdb.ub4spare[11]: 0 ; 0x128: 0x00000000
kfdhdb.ub4spare[12]: 0 ; 0x12c: 0x00000000
………………………………………………………………..
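[grid@ractst01 ~]$ kfed read ORCL:DISK2 te=/home/grid/mdbackup/disk2.bak
[grid@ractst01 ~]$ kfed read ORCL:DISK3 te=/home/grid/mdbackup/disk3.bak
3. One backup per disk is generally sufficient; take a new backup whenever the attributes of a disk or disk group change.
4. Upload the backup files to the central backup archive directory.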
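With more than a few disks, the per-disk kfed commands above can be generated rather than typed by hand. The following is a minimal sketch, not a definitive script: the helper names `backup_name` and `kfed_backup_cmd` are hypothetical, and the rule of lower-casing the disk label to get the file name is an assumption chosen to match the disk1.bak naming used above. It only prints the commands, so they can be reviewed before being run as the grid user.

```shell
#!/bin/sh
# Sketch: print one "kfed read" backup command per ASM disk path.
# Helper names and the file-naming rule are assumptions, not part of kfed.

# Turn a disk path such as ORCL:DISK1 or /dev/sdb1 into a backup file name.
backup_name() {
    local path="$1"
    local base="${path##*[:/]}"   # drop everything up to the last ':' or '/'
    printf '%s.bak\n' "$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')"
}

# Print the kfed command that dumps one disk header to a text file.
kfed_backup_cmd() {
    local path="$1" dir="$2"
    printf 'kfed read %s te=%s/%s\n' "$path" "$dir" "$(backup_name "$path")"
}

BACKUP_DIR=/home/grid/mdbackup
for disk in ORCL:DISK1 ORCL:DISK2 ORCL:DISK3; do
    kfed_backup_cmd "$disk" "$BACKUP_DIR"
done
```

The disk list is hard-coded here; in practice it could come from the PATH column of the v$asm_disk query in step 1.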
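II. Restoring ASM disk headers
1. Simulate header corruption by zeroing the first two 4 KB blocks of DISK2 (dd needs the underlying device path, /dev/sdb1 in this environment, rather than the ASMLib label ORCL:DISK2):
[grid@ractst01 ~]$ dd if=/dev/zero of=/dev/sdb1 bs=4096 count=2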
SQL> select NAME,MOUNT_STATUS,HEADER_STATUS from v$asm_disk;
NAME MOUNT_S HEADER_STATU
------------------------------ ------- ------------
CLOSED PROVISIONED
CLOSED PROVISIONED
CLOSED PROVISIONED
DISK1 CACHED MEMBER
DISK2 CACHED CANDIDATE
DISK3 CACHED MEMBER
SQL> shutdown abort
ASM instance shutdown
SQL> startup nomount;
ORA-03113: end-of-file on communication channel
2. Restore the damaged ASM disk header
[grid@ractst01 ~]# kfed repair /dev/sdb1 (kfed repair is covered in a separate document)
[grid@ractst01 ~]# kfed write /dev/sdb1 te=/home/grid/mdbackup/disk2.bak
SQL> /
NAME MOUNT_S HEADER_STATU
------------------------------ ------- ------------
CLOSED PROVISIONED
CLOSED PROVISIONED
CLOSED PROVISIONED
DISK1 CACHED MEMBER
DISK2 CACHED MEMBER
DISK3 CACHED MEMBER
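Besides querying v$asm_disk, the restored header can be checked directly from a kfed text dump. The sketch below assumes the dump uses the `name: value ; offset: description` layout shown in the backup listing earlier; the helper names `kfed_field`, `kfed_symbol` and `header_is_member` are hypothetical and are not kfed features.

```shell
#!/bin/sh
# Sketch: pull single fields out of a kfed text dump (file argument or stdin).

# Print the numeric/text value of a field, e.g. "3" for kfdhdb.hdrsts.
kfed_field() {
    awk -v f="$1:" '$1 == f { print $2; exit }' "${2:--}"
}

# Print the last token of a field's line, e.g. KFDHDR_MEMBER.
kfed_symbol() {
    awk -v f="$1:" '$1 == f { print $NF; exit }' "${2:--}"
}

# Succeed only if the dump says the disk header is in MEMBER state.
header_is_member() {
    [ "$(kfed_symbol kfdhdb.hdrsts "$1")" = "KFDHDR_MEMBER" ]
}

# Possible use after a restore (commented out because it needs a real device):
# kfed read /dev/sdb1 > /tmp/sdb1.txt && header_is_member /tmp/sdb1.txt && echo "header OK"
```

Comparing `kfed_field kfdhdb.dskname` and `kfed_field kfdhdb.grpname` against the saved backup file is a similar quick sanity check before the disk group is mounted again.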
III. Backing up ASM disk group metadata
1. Create a backup directory on the local host
[grid@ractst01 ~]$ mkdir mdbackup
2. Query the disk group information
[grid@ractst01 ~]$ asmcmd
ASMCMD>
ASMCMD> lsdg
State Type Rebal Sector Block AU Total_MB Free_MB Req_mir_free_MB Usable_file_MB Offline_disks Voting_files Name
MOUNTED EXTERN N 512 4096 1048576 24567 22432 0 22432 0 N DATA/ ---- the disk group name is DATA; there may be several disk groups
3. Back up the ASM disk group metadata; one backup file per disk group is recommended
ASMCMD> md_backup /home/grid/mdbackup/DATA.bak -G DATA
Disk group metadata to be backed up: DATA
Current alias directory path: RACTST/ONLINELOG
Current alias directory path: RACTST
Current alias directory path: RACTST/TEMPFILE
Current alias directory path: ractst-cluster/OCRFILE
Current alias directory path: RACTST/DATAFILE
Current alias directory path: ractst-cluster/ASMPARAMETERFILE
Current alias directory path: RACTST/PARAMETERFILE
Current alias directory path: ractst-cluster
Current alias directory path: RACTST/CONTROLFILE
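4. Backup policy: scheduled backups are generally unnecessary. Take a new backup after any change to a disk group, such as adding or dropping disks or modifying disk group attributes.
5. Upload the backup files to the central backup archive directory.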
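When there are several disk groups, the one-file-per-group recommendation can be scripted. This is a rough sketch under stated assumptions: it expects `asmcmd lsdg` output in the layout shown above (a header line starting with State, and the group name with a trailing slash as the last column), and the helper names `dg_names` and `md_backup_cmd` are hypothetical. It prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: derive one md_backup command per disk group from "asmcmd lsdg" output.

# Read lsdg output on stdin and print one disk group name per line.
dg_names() {
    awk '$1 != "State" && NF > 0 { name = $NF; sub(/\/$/, "", name); print name }'
}

# Print the md_backup command for one disk group.
md_backup_cmd() {
    local dg="$1" dir="$2"
    printf 'asmcmd md_backup %s/%s.bak -G %s\n' "$dir" "$dg" "$dg"
}

# Possible use on the ASM host (commented out because it needs asmcmd):
# asmcmd lsdg | dg_names | while read -r dg; do
#     md_backup_cmd "$dg" /home/grid/mdbackup
# done
```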
IV. Restoring ASM disk group metadata
If the ASM disk group metadata is damaged, it can be restored with md_restore:
ASMCMD> help md_restore
md_restore <backup_file> [--silent]
[--full|--nodg|--newdg] [-S <sql_script_file>]
[-G '<diskgroup_name>,<diskgroup_name>,...']
[-o '<old_diskgroup_name>:<new_diskgroup_name>,...']
Perform ASM Metadata restore for disk groups.
Read metadata information from <backup_file>.
--silent Ignore errors. Normally if md_restore encounters an error, it
will stop. Specifying this flag ignores that.
--full create disk group and restore metadata.
--nodg restore metadata only.
--newdg create disk group with a different name and restore
metadata; -o is required.
-S Write SQL commands to <sql_script_file> instead of executing them.
-G Select the disk groups to be restored. If no disk groups defined,
all of them will be restored.
-o Rename disk group <old_diskgroup_name> to <new_diskgroup_name>.
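As an illustration of how the options in the help text combine, the sketch below builds a md_restore command line for a full restore of one disk group. With a third argument it adds -S, so the SQL is written to a script for review instead of being executed. The helper name `md_restore_cmd` and the script path /tmp/restore_DATA.sql are hypothetical; the backup file is the one created in section III.

```shell
#!/bin/sh
# Sketch: build (but do not run) a md_restore command from the options in the help text.
# Usage: md_restore_cmd <backup_file> <diskgroup> [sql_script_file]
md_restore_cmd() {
    local backup="$1" dg="$2" script="${3:-}"
    if [ -n "$script" ]; then
        # Preview mode: -S writes the SQL to a file instead of executing it.
        printf "asmcmd md_restore %s --full -S %s -G '%s'\n" "$backup" "$script" "$dg"
    else
        # Direct mode: recreate the disk group and restore its metadata.
        printf "asmcmd md_restore %s --full -G '%s'\n" "$backup" "$dg"
    fi
}

md_restore_cmd /home/grid/mdbackup/DATA.bak DATA /tmp/restore_DATA.sql
md_restore_cmd /home/grid/mdbackup/DATA.bak DATA
```

Generating the script first is the more cautious route, because --full creates the disk group as part of the restore.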
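V. Backup and recovery of the Voting Disk and the OCR, two important components of a RAC environment
1. Voting Disk
The Voting Disk records node membership status. When a split-brain occurs, it decides which members keep control of the cluster; the remaining members must be evicted from the cluster.
Check the voting disk location:
[root@rac1 ~]# crsctl query css votedisk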
0. 0 /dev/raw/raw2
located 1 votedisk(s).
Back up the voting disk:
[root@raw1 bin]# dd if=/dev/raw/raw2 of=/home/grid/votebackup/voting_disk.bak
294912+0 records in
294912+0 records out
Restore the voting disk:
[root@raw1 bin]# dd if=/home/grid/votebackup/voting_disk.bak of=/dev/raw/raw2
294912+0 records in
294912+0 records out
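Note: starting with 11gR2, the voting disk no longer needs to be backed up manually. Changes to the voting disk are backed up automatically as part of the OCR backup, and the data is restored automatically to any voting disk file that is added.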
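The record counts in the dd output above can be turned into a size check on the backup file. A small worked example, assuming dd ran with its default 512-byte block size because no bs= was given; the helper names `dd_bytes` and `dd_mib` are hypothetical.

```shell
#!/bin/sh
# Sketch: convert a dd record count into bytes and MiB.
# Usage: dd_bytes <records> [block_size]   (block size defaults to 512)
dd_bytes() {
    local records="$1" bs="${2:-512}"
    echo $(( records * bs ))
}

dd_mib() {
    echo $(( $(dd_bytes "$1" "${2:-512}") / 1048576 ))
}

dd_bytes 294912   # prints 150994944
dd_mib 294912     # prints 144

# On Linux, the backup file size could then be compared with, for example:
# [ "$(stat -c %s /home/grid/votebackup/voting_disk.bak)" -eq "$(dd_bytes 294912)" ]
```

So the 294912 records reported for both the backup and the restore correspond to a 144 MiB copy, and a backup file of any other size would indicate a truncated copy.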
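2. OCR
The OCR records the configuration of cluster members, that is, the configuration of CRS resources such as databases, ASM, instances and VIPs. Oracle Clusterware keeps the cluster configuration on shared storage; it includes the list of cluster nodes, the mapping of cluster database instances to nodes, and the CRS application resource information. All of this is stored on the OCR disk. Only one node in the cluster, called the Master Node, may read and write the OCR disk. Every node keeps a copy of the OCR in memory, and an OCR process on each node reads from that copy. When the OCR content changes, the OCR process on the Master Node synchronizes the change to the OCR processes on the other nodes.
Oracle backs up the OCR every 4 hours and keeps the last 3 backups, plus the last backup of the previous day and of the previous week. The backup is performed by the CRSD process on the Master Node. The default location is the $CRS_HOME/crs/cdata/<cluster_name> directory, which can be changed with the ocrconfig -backuploc <directory_name> command. After each backup the backup files are renamed automatically to reflect their chronological order; the most recent backup is backup00.ocr.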
root@VPRAC01 # ocrconfig -showbackup
vprac02 2015/11/09 14:57:23 /oracle/app/11.2.0/grid/cdata/VPRAC-cluster/backup00.ocr
vprac02 2015/11/09 10:57:23 /oracle/app/11.2.0/grid/cdata/VPRAC-cluster/backup01.ocr
vprac02 2015/11/09 06:57:22 /oracle/app/11.2.0/grid/cdata/VPRAC-cluster/backup02.ocr
vprac02 2015/11/07 22:57:16 /oracle/app/11.2.0/grid/cdata/VPRAC-cluster/day.ocr
vprac02 2015/10/28 22:56:31 /oracle/app/11.2.0/grid/cdata/VPRAC-cluster/week.ocr
vprac01 2013/08/22 11:52:51 /oracle/app/11.2.0/grid/cdata/VPRAC-cluster/backup_20130822_115251.ocr
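When preparing a restore, it helps to pick the newest backup from this listing by timestamp rather than by file name, since manual backups carry their own names. A minimal sketch, assuming every backup line has the four-column form `node date time path` shown above; the helper name `latest_ocr_backup` is hypothetical.

```shell
#!/bin/sh
# Sketch: read "ocrconfig -showbackup" output on stdin and print the path of the newest backup.
# Lines that do not look like "<node> <YYYY/MM/DD> <HH:MM:SS> <path>" are ignored.
latest_ocr_backup() {
    awk 'NF == 4 && $2 ~ /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9]$/ { print $2, $3, $4 }' |
        LC_ALL=C sort | tail -n 1 | awk '{ print $3 }'
}

# Possible use as root on a cluster node (commented out because it needs ocrconfig):
# ocrconfig -showbackup | latest_ocr_backup
```

The zero-padded date and time sort correctly as plain text, which is why a simple sort is enough here.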
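Oracle recommends backing up the OCR before making changes to the cluster, for example before adding or removing nodes or changing the RAC IP addresses. The export option can be used to back it up to a specified file. After a replace or restore operation, Oracle recommends running cluvfy comp ocr -n all for a full check.
To restore the OCR, use the ocrconfig command: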
root@RACTSRT01 # ocrconfig
Name:
ocrconfig - Configuration tool for Oracle Cluster/Local Registry.
Synopsis:
ocrconfig [option]
option:
[-local] -export <filename>
- Export OCR/OLR contents to a file
[-local] -import <filename> - Import OCR/OLR contents from a file
[-local] -upgrade [<user> [<group>]]
- Upgrade OCR from previous version
-downgrade [-version <version string>]
- Downgrade OCR to the specified version
[-local] -backuploc <dirname> - Configure OCR/OLR backup location
[-local] -showbackup [auto|manual] - Show OCR/OLR backup information
[-local] -manualbackup - Perform OCR/OLR backup
[-local] -restore <filename> - Restore OCR/OLR from physical backup
-replace <current filename> -replacement <new filename>
- Replace a OCR device/file <filename1> with <filename2>
-add <filename> - Add a new OCR device/file
-delete <filename> - Remove a OCR device/file
-overwrite - Overwrite OCR configuration on disk
-repair -add <filename> | -delete <filename> | -replace <current filename> -replacement <new filename>
- Repair OCR configuration on the local node
-help - Print out this help information
Note:
* A log file will be created in
$ORACLE_HOME/log/<hostname>/client/ocrconfig_<pid>.log. Please ensure
you have file creation privileges in the above directory before
running this tool.
* Only -local -showbackup [manual] is supported.
* Use option '-local' to indicate that the operation is to be performed on the Oracle Local Registry.
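The OCR restore procedure is roughly as follows:
1) Stop Clusterware on all nodes
# crsctl stop crs
# crsctl stop crs -f
2) As root, start Clusterware in exclusive mode on one of the nodes
# crsctl start crs -excl -nocrs
Note: if crsd turns out to be running, stop it with the following command.
# crsctl stop resource ora.crsd -init
3) Create a new disk group for the OCR and the voting disk, using the same name as the original disk group (to use a different location, edit the /etc/oracle/ocr.loc file)
Note: if the disk group cannot be created, one troubleshooting approach is to drop the existing disk group as follows
SQL> drop diskgroup disk_group_name force including contents;
4) Restore the OCR and check it
# ocrconfig -restore file_name
# ocrcheck
5) Restore the voting disk and check it
# crsctl replace votedisk +asm_disk_group
# crsctl query css votedisk
6) Stop Clusterware running in exclusive mode
# crsctl stop crs -f
7) Start Clusterware normally on all nodes
# crsctl start crs
8) Use CVU to verify OCR integrity on all RAC nodes
$ cluvfy comp ocr -n all -verbose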