References: http://www.dbform.com/html/2012/1875.html
http://www.xifenfei.com/2012/06/%E4%BD%BF%E7%94%A8asm-disk-header-%E8%87%AA%E5%8A%A8%E5%A4%87%E4%BB%BD%E4%BF%A1%E6%81%AF%E6%81%A2%E5%A4%8D.html
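Before Oracle 10.2.0.5, the header block of an ASM disk had no backup of its own. Once the header block was damaged, it could not be restored with kfed merge unless a dump taken earlier with kfed read was available. If the header blocks of every disk in a disk group were damaged (for example, because someone mistakenly created PVs on them), recovering the ASM disk headers became very troublesome.
From Oracle 10.2.0.5 onward, however, the ASM disk header block is automatically backed up in another block. This is really a feature introduced in Oracle 11g, but testing shows that the backup also exists in Oracle 10.2.0.5.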
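For 10.2.0.5.0 and later, regardless of the AU size, the automatic backup of the ASM disk header is stored in the second-to-last block of the second AU.
Calculation: blknum = (AU_SIZE / block_size) * 2 - 2, where AU_SIZE / block_size is the number of blocks in one AU. Block numbering starts at 0, so the last block of the second AU is (AU_SIZE / block_size) * 2 - 1 and the backup sits one block before it. With a 4096-byte block size this gives:
1M AU: block 510
2M AU: block 1022
4M AU: block 2046
8M AU: block 4094
16M AU: block 8190
32M AU: block 16382
64M AU: block 32766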
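The calculation above can be sketched as a small shell function. The function name backup_blknum and the 4096-byte default block size are assumptions made for illustration; the formula itself is the one given above.

```shell
# Block number of the ASM disk header backup:
# (AU_SIZE / block_size) * 2 - 2, i.e. the second-to-last block of the second AU.
# backup_blknum is a hypothetical helper name, not part of any Oracle tool.
backup_blknum() {
    local au_size="$1"
    local block_size="${2:-4096}"   # assumed default block size in bytes
    echo $(( au_size / block_size * 2 - 2 ))
}

# Print the backup block number for the AU sizes listed above (in MB).
for mb in 1 2 4 8 16 32 64; do
    printf '%2dM AU: block %d\n' "$mb" "$(backup_blknum $(( mb * 1024 * 1024 )))"
done
```

For a 1M AU with 4096-byte blocks the function returns 510, which is the block number used in the kfed read examples in this article.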
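Because this backup exists, the kfed utility from Oracle 10.2.0.5 onward has a new repair command, which overwrites the disk header block directly with the backup block to complete the repair.
In Oracle 10.2.0.4, running kfed repair fails with an error saying that the command-line arguments are incorrect, which shows that the repair command does not exist: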
$ kfed repair
KFED-00101: LRM error [102] while parsing command line arguments
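In Oracle 10.2.0.5, however, running kfed repair reports that it cannot open the file '' (an empty name). This shows that the repair command exists; the error appears only because the disk to repair must be specified explicitly: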
$ kfed repair
KFED-00303: unable to open file ''
If the kfed executable has not been built in this Oracle home, it can be linked as follows:
cd $ORACLE_HOME/rdbms/lib
cp ins_rdbms.mk ins_rdbms.mk.bak
make -f ins_rdbms.mk ikfed
Database version
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
ASM disk attributes
SQL> set linesize 300
SQL> col PATH format a30
SQL> select group_number,DISK_NUMBER,PATH,HEADER_STATUS from v$asm_disk where group_number<>0;
GROUP_NUMBER DISK_NUMBER PATH                           HEADER_STATU
------------ ----------- ------------------------------ ------------
           4           2 /dev/mapper/cdpfra1            MEMBER
           4           3 /dev/mapper/cdpfra2            MEMBER
           3           7 /dev/mapper/cdpdata4           MEMBER
           3           6 /dev/mapper/cdpdata3           MEMBER
           3           4 /dev/mapper/cdpdata1           MEMBER
           3           5 /dev/mapper/cdpdata2           MEMBER
SQL> select group_number,name,BLOCK_SIZE,ALLOCATION_UNIT_SIZE from v$asm_diskgroup;
GROUP_NUMBER NAME                           BLOCK_SIZE ALLOCATION_UNIT_SIZE
------------ ------------------------------ ---------- --------------------
           4 FRA                                  4096              1048576
           3 DATA                                 4096              1048576
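$ kfed read /dev/mapper/cdpfra1 blknum=510 > /tmp/diskhead.510
$ kfed read /dev/mapper/cdpfra1 blknum=0 > /tmp/diskhead.0
$ ll /tmp/diskhead.*
-rw-r--r-- 1 grid oinstall 0 Nov 5 09:47 /tmp/diskhead.0
-rw-r--r-- 1 grid oinstall 0 Nov 5 09:46 /tmp/diskhead.510
$ diff /tmp/diskhead.0 /tmp/diskhead.510
-- diff returns no differing records, which indicates that block 510 holds the same content as block 0.
-- The 0-byte sizes in the listing come from the recorded run, where the redirection was typed as "|>". In bash that pipes the output into an empty command and leaves both files empty, so the dumps must be written with a plain ">" for the comparison to be meaningful.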
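The same check can be sketched at the raw block level with dd and cmp, without going through kfed text dumps. This is a minimal sketch under the assumption that the backup block is a byte-for-byte copy of block 0; the function name compare_header_backup and its arguments are hypothetical. It only reads from the device.

```shell
# compare_header_backup DEVICE AU_SIZE [BLOCK_SIZE]
# Returns 0 when block 0 and the header backup block are byte-identical,
# 1 when they differ or block 0 cannot be read, 2 when no temp dir is available.
compare_header_backup() {
    local dev="$1"
    local au_size="$2"
    local block_size="${3:-4096}"   # assumed default block size in bytes
    local backup_blk=$(( au_size / block_size * 2 - 2 ))
    local tmpdir rc
    tmpdir=$(mktemp -d) || return 2
    # Copy block 0 and the backup block into temporary files.
    dd if="$dev" of="$tmpdir/blk0" bs="$block_size" count=1 2>/dev/null
    dd if="$dev" of="$tmpdir/bak" bs="$block_size" skip="$backup_blk" count=1 2>/dev/null
    # Require a non-empty block 0 so that two empty files never count as a match.
    if [ -s "$tmpdir/blk0" ] && cmp -s "$tmpdir/blk0" "$tmpdir/bak"; then
        rc=0
    else
        rc=1
    fi
    rm -rf "$tmpdir"
    return $rc
}
```

For the disk group shown above (1M AU, 4096-byte blocks) it could be called as compare_header_backup /dev/mapper/cdpfra1 1048576. Requiring a non-empty block 0 guards against the situation in the listing above, where two empty files compare as equal.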
Corrupting the ASM disk header
[grid@hlwzf1 lib]$ kfed read /dev/mapper/cdpfra1
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483650 ; 0x008: disk=2
kfbh.check: 49794540 ; 0x00c: 0x02f7cdec
kfbh.fcn.base: 2493 ; 0x010: 0x000009bd
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
$ dd if=/dev/zero of=/dev/mapper/cdpfra1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00430849 s, 951 kB/s
$ kfed read /dev/mapper/cdpfra1 blknum=0
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
Disk attributes and operations after the corruption
SQL> set linesize 300
SQL> col PATH format a30
SQL> select group_number,DISK_NUMBER,PATH,HEADER_STATUS from v$asm_disk where group_number<>0;
GROUP_NUMBER DISK_NUMBER PATH                           HEADER_STATU
------------ ----------- ------------------------------ ------------
           4           2 /dev/mapper/cdpfra1            CANDIDATE
           4           3 /dev/mapper/cdpfra2            MEMBER
           3           7 /dev/mapper/cdpdata4           MEMBER
           3           6 /dev/mapper/cdpdata3           MEMBER
           3           4 /dev/mapper/cdpdata1           MEMBER
           3           5 /dev/mapper/cdpdata2           MEMBER
SQL> alter diskgroup fra dismount;
Diskgroup altered.
SQL> alter diskgroup fra mount;
alter diskgroup fra mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "2" is missing from group number "4"
Repairing the damaged ASM disk header with kfed repair
$ kfed repair '/dev/mapper/cdpfra1'
$ kfed read /dev/mapper/cdpfra1 blknum=0
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483650 ; 0x008: disk=2
kfbh.check: 49794540 ; 0x00c: 0x02f7cdec
kfbh.fcn.base: 2493 ; 0x010: 0x000009bd
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
SQL> alter diskgroup fra mount;
Diskgroup altered.
kfed read /dev/mapper/cdpfra1 blknum=0 > /tmp/diskhead.0
Recovery is also possible with kfed merge, using a header dump saved earlier with kfed read
kfed merge /dev/mapper/cdpfra1 text=/tmp/diskhead.0
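Since kfed merge depends on a dump taken while the header was still healthy, the dumps could be collected for several disks in one pass. The sketch below is only an illustration: the function name backup_headers, the output directory, and the file naming are assumptions, while the kfed read invocation is the one used in this article.

```shell
# backup_headers OUTDIR DEVICE...
# Saves a kfed text dump of block 0 for every device given.
# backup_headers is a hypothetical helper; kfed must be on the PATH.
backup_headers() {
    local outdir="$1"
    shift
    local dev out
    mkdir -p "$outdir" || return 1
    for dev in "$@"; do
        out="$outdir/$(basename "$dev").blk0.txt"
        # Plain '>' redirection; 'cmd |> file' in bash would leave an empty file.
        kfed read "$dev" blknum=0 > "$out" || return 1
        # An empty dump is useless for a later merge, so treat it as an error.
        [ -s "$out" ] || { echo "empty dump for $dev" >&2; return 1; }
    done
}
```

A possible invocation is backup_headers /tmp/asm_headers /dev/mapper/cdpfra1 /dev/mapper/cdpfra2, run as the grid user while the disks still show HEADER_STATUS MEMBER.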