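ASM disk group failure prevents the database from starting
Environment: two HP servers, HP P6000 storage, RAC 11.2.0.3
Fault description: adding a disk to the backup disk group left the disk group unable to mount, so the database could not start.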
cd /dev/mapper
[root@RAC-2 mapper]# ls -l backup*
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup -> ../dm-38
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup1 -> ../dm-42
lrwxrwxrwx 1 oracle oinstall 8 Feb 18 14:33 backup1p1 -> ../dm-55
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup2 -> ../dm-39
lrwxrwxrwx 1 oracle oinstall 8 Feb 18 14:34 backup2p1 -> ../dm-56
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup3 -> ../dm-27
lrwxrwxrwx 1 oracle oinstall 8 Feb 18 14:34 backup3p1 -> ../dm-57
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup4 -> ../dm-35
lrwxrwxrwx 1 oracle oinstall 8 Feb 18 14:34 backup4p1 -> ../dm-58
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup5 -> ../dm-32
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backup6 -> ../dm-31
lrwxrwxrwx 1 oracle oinstall 8 Jan 23 17:04 backupp1 -> ../dm-48
alter diskgroup backup add disk '/dev/mapper/backup1p1';
The statement failed and the disk group was dismounted.
Attempted to repair the disk header with the kfed tool.
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk ikfed
kfed read /dev/mapper/backup1p1    # check the ASM disk header information
Manual backup: kfed read /dev/mapper/backup1p1 text=/home/oracle/disk_header/backup1p1.txt
Restore: kfed merge /dev/mapper/backup1p1 text=/home/oracle/disk_header/backup1p1.txt
kfed repair /dev/mapper/backup1p1
Clear the disk headers:
dd if=/dev/zero of=/dev/mapper/backup1p1 bs=4096 count=1
[root@RAC-1 mapper]# dd if=/dev/zero of=/dev/mapper/backupp1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00426006 s, 961 kB/s
[root@RAC-1 mapper]# dd if=/dev/zero of=/dev/mapper/backup1p1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00449278 s, 912 kB/s
[root@RAC-1 mapper]# dd if=/dev/zero of=/dev/mapper/backup2p1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00409579 s, 1.0 MB/s
[root@RAC-1 mapper]# dd if=/dev/zero of=/dev/mapper/backup3p1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00398681 s, 1.0 MB/s
[root@RAC-1 mapper]# dd if=/dev/zero of=/dev/mapper/backup4p1 bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00416703 s, 983 kB/s
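Zeroing a header with dd cannot be undone, and the kfed text dump taken earlier covers only backup1p1. As a hedged sketch (not part of the original procedure), the raw first 4 KB of each device could be saved before it is overwritten. The function names backup_header, restore_header and header_is_blank and the backup directory are hypothetical; the 4096-byte size simply mirrors the dd commands above.

```shell
#!/bin/sh
# Hypothetical helpers; names and paths are illustrative, not from the original post.

# Save the first 4 KB of a device (or file) to <dir>/<basename>.hdr
backup_header() {
    dev=$1
    dir=$2
    mkdir -p "$dir" || return 1
    dd if="$dev" of="$dir/$(basename "$dev").hdr" bs=4096 count=1 2>/dev/null
}

# Write a saved 4 KB header back without truncating the target
restore_header() {
    hdr=$1
    dev=$2
    dd if="$hdr" of="$dev" bs=4096 count=1 conv=notrunc 2>/dev/null
}

# Succeed only if the first 4 KB contain nothing but zero bytes
header_is_blank() {
    nonzero=$(dd if="$1" bs=4096 count=1 2>/dev/null | tr -d '\000' | wc -c | tr -d ' ')
    [ "$nonzero" -eq 0 ]
}
```

For example, `backup_header /dev/mapper/backup1p1 /home/oracle/disk_header` would be run as root before the corresponding dd, and `header_is_blank /dev/mapper/backup1p1` afterwards to confirm the header was cleared.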
-----------------------
SQL> drop diskgroup backup force including contents;
drop diskgroup backup force including contents
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15024: discovered duplicately numbered ASM disk 0
SQL> drop diskgroup backup force including contents;
Diskgroup dropped.
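ORA-15024 indicates that ASM discovered more than one disk whose header claims the same disk number. One way to see which devices collide, assuming text dumps were saved with `kfed read <device> text=<file>` as in the backup step earlier, is to compare the kfdhdb.grpname and kfdhdb.dsknum fields across the dumps. The helper name find_duplicate_disknums is hypothetical, and the sketch assumes one .txt dump per device in a single directory.

```shell
#!/bin/sh
# Hypothetical helper: scan kfed text dumps (*.txt in one directory) and print
# every "group disk-number" pair that is claimed by more than one dump.
find_duplicate_disknums() {
    for f in "$1"/*.txt; do
        [ -f "$f" ] || continue
        # Emit "<group> <disk number> <dump name>" for each dump with a group name
        awk -v src="$(basename "$f" .txt)" '
            /^kfdhdb\.grpname:/ { grp = ($2 == ";") ? "" : $2 }
            /^kfdhdb\.dsknum:/  { num = $2 }
            END { if (grp != "" && num != "") print grp, num, src }
        ' "$f"
    done | sort | awk '
        { key = $1 " " $2; count[key]++; who[key] = who[key] " " $3 }
        END { for (k in count) if (count[k] > 1) print k ":" who[k] }
    '
}
```

A call such as `find_duplicate_disknums /home/oracle/disk_header` prints nothing when every disk number is unique within its group.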
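The backup disk group was recreated with the ASMCA tool. The disk group mounted and the disks were healthy, but the database still could not start.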
SUCCESS: diskgroup DATA was mounted
SUCCESS: diskgroup BACKUP was mounted
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+BACKUP/oralnx/controlfile/current.256.832531113'
ORA-17503: ksfdopn:2 Failed to open file +BACKUP/oralnx/controlfile/current.256.832531113
ORA-15012: ASM file '+BACKUP/oralnx/controlfile/current.256.832531113' does not exist
ORA-205 signalled during: ALTER DATABASE MOUNT /* db agent *//* {1:2211:560} */...
NOTE: dependency between database oralnx and diskgroup resource ora.DATA.dg is established
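Root cause: the control file in the backup disk group was gone, because the disk group had been dropped with "including contents" and then recreated.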
ASMCMD> cp +data/oralnx/controlfile/current.260.832531113 +BACKUP/oralnx/controlfile/current
copying +data/oralnx/controlfile/current.260.832531113 -> +BACKUP/oralnx/controlfile/current
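Remember: when copying a file into ASM, do not specify the numeric suffix after the file name; those numbers are what Oracle ASM itself uses to identify the file.
ls -l shows that ASM stores only an alias in the target directory; ASM places the real file in a different location.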
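The rule above can be expressed as a small helper: given a fully qualified ASM file name, drop the trailing file number and incarnation so that only the alias-style name is used as the cp target. The function name asm_alias_name is hypothetical; it is a sketch, not an Oracle utility.

```shell
#!/bin/sh
# Hypothetical helper: strip the ASM-generated ".<file#>.<incarnation#>" suffix
# so that the remaining name can be used as a copy target (alias).
asm_alias_name() {
    printf '%s\n' "$1" | sed -E 's/\.[0-9]+\.[0-9]+$//'
}
```

For example, `asm_alias_name '+BACKUP/oralnx/controlfile/current.256.832531113'` prints `+BACKUP/oralnx/controlfile/current`, which matches the target used in the cp command above.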
[grid@RAC-1 ~]$ srvctl config database -d oralnx
Database unique name: oralnx
Database name: oralnx
Oracle home: /oracle/app/oracle/product/11.2/db_1
Oracle user: oracle
Spfile: +DATA/oralnx/spfileoralnx.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: oralnx
Database instances: oralnx1,oralnx2
Disk Groups: DATA,BACKUP
Mount point paths:
Services:
Type: RAC
Database is administrator managed
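The database still would not start and reported a control file error. Checking showed that the control file path in the parameter file was wrong, so the parameter file was corrected.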
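To see which control file paths a pfile actually names before starting the instance, the control_files entry can be pulled out and listed one path per line, then compared with what exists in ASM. This is a minimal sketch that assumes the parameter is written on a single line; the function name pfile_control_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each path from the control_files parameter of a
# text pfile, one per line, with quotes, spaces and parentheses stripped.
pfile_control_files() {
    grep -i '^[^#]*control_files' "$1" \
        | sed -e 's/^[^=]*=//' \
        | tr ',' '\n' \
        | sed -e 's/^[ (]*//' -e 's/[ )]*$//' -e "s/^'//" -e "s/'\$//"
}
```

For example, `pfile_control_files /oracle/app/oracle/product/11.2/db_1/dbs/initoralnx1.ora` lists the paths that the startup nomount below will use; any +BACKUP entry should name a file that exists after the copy.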
SQL> startup nomount pfile='/oracle/app/oracle/product/11.2/db_1/dbs/initoralnx1.ora'
ORACLE instance started.
Total System Global Area 8.3645E+10 bytes
Fixed Size 2237488 bytes
Variable Size 5.3956E+10 bytes
Database Buffers 2.9528E+10 bytes
Redo Buffers 159531008 bytes
SQL> alter database mount;
Database altered.
SQL> alter database open;
Database altered.
SQL> create spfile='+DATA/oralnx/spfileoralnx.ora' from pfile='/oracle/app/oracle/product/11.2/db_1/dbs/initoralnx1.ora';
File created.
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
[oracle@RAC-1 ~]$ exit
logout
[root@RAC-1 ~]# su - grid
[grid@RAC-1 ~]$ srvctl start database -d oralnx
The database started normally and the problem was resolved.
Source: ITPUB blog, link: http://blog.itpub.net/22969361/viewspace-1102515/. Please credit the source when reposting; otherwise legal action may be taken.