200T 数据库非归档无备份恢复---惜分飞

一套近200T的,6个节点的RAC,由于存储管线链路不稳定,导致服务器经常性掉盘,引起asm 磁盘组频繁dismount/mount,数据库集群节点不停的重启,修复好链路问题之后,数据库启动报ORA-01113,ORA-01110
 

ORA-01113-ORA-01110


通过Oracle数据库异常恢复检查脚本(Oracle Database Recovery Check)脚本检测,发现有10个数据文件异常,无法正常恢复


该库比较大,有近200T,因此恢复需要各位谨慎(无法做现场备份,另外客户要求2天时间必须恢复好)
 

200t


由于数据库是非归档模式,该库无法通过应用归档日志来实现对这些文件进行恢复,对于这种情况,直接使用dbms_diskgroup把数据文件头拷贝到文件系统中,类似操作

SQL> @dbms_diskgroup_get_block.sql  +DATA/xifenfei.dbf 1 1 /tmp/xff/xifenfei.dbf.header

Parameter 1:

ASM_file_name (required)

Parameter 2:

block_to_extract (required)

Parameter 3

number_of_blocks_to_extract (required)

Parameter 4:

FileSystem_File_Name (required)

old  14:  v_AsmFilename := '&ASM_File_Name';

new  14:  v_AsmFilename := '+DATA/xifenfei.dbf';

old  15:  v_offstart := '&block_to_extract';

new  15:  v_offstart := '1';

old  16:  v_numblks := '&number_of_blocks_to_extract';

new  16:  v_numblks := '1';

old  17:  v_FsFilename := '&FileSystem_File_Name';

new  17:  v_FsFilename := '/tmp/xff/xifenfei.dbf.header';

File: +DATA/xifenfei.dbf

Type: 2 Data File

Size (in logical blocks): 3978880

Logical Block Size: 16384

Physical Block Size: 512

PL/SQL procedure successfully completed.

然后通过bbed修改相关scn

BBED> set filename 'xifenfei.dbf.header'

    FILENAME        xifenfei.dbf.header

BBED> set blocksize 16384

    BLOCKSIZE       16384

BBED> map

 File: xifenfei.dbf.header (0)

 Block: 1                                     Dba:0x00000000

------------------------------------------------------------

 Data File Header

 struct kcvfh, 860 bytes                    @0      

 ub4 tailchk                                @16380  

BBED> p kcvfh.kcvfhckp.kcvcpscn

struct kcvcpscn, 8 bytes                    @484    

   ub4 kscnbas                              @484      0xa8061324

   ub2 kscnwrp                              @488      0x0081

BBED> assign file 295 block 1 kcvfh.kcvfhckp.kcvcpscn = file 1 block 1 kcvfh.kcvfhckp.kcvcpscn;

struct kcvcpscn, 8 bytes                    @484    

   ub4 kscnbas                              @484      0xa8133e2b

   ub2 kscnwrp                              @488      0x0081

然后把修改的数据文件头写回到asm中

SQL> @dbms_diskgroup_cp_block_to_asm.sql  /tmp/xff/xifenfei.dbf.header  +DATA/xifenfei.dbf 1 1

Parameter 1:

v_FsFileName (required)

Parameter 2:

v_AsmFileName (required)

Parameter 3

v_offstart (required)

Parameter 4

v_numblks (required)

old  16: v_FsFileName := '&v_FsFileName';

new  16: v_FsFileName := '/tmp/xff/xifenfei.dbf.header';

old  17: v_AsmFileName := '&v_AsmFileName';

new  17: v_AsmFileName := '+DATA/xifenfei.dbf';

old  18: v_offstart := '&v_offstart';

new  18: v_offstart := '1';

old  19:  v_numblks := '&v_numblks';

new  19:  v_numblks := '1';

File: +DATA/xifenfei.dbf

Type: 2 Data File

Size (in logical blocks): 3978880

Logical Block Size: 16384

PL/SQL procedure successfully completed.

查询文件头是否修改成功

[oracle@xff1 xff]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Sat Aug 10 16:45:02 2024

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:

Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,

Data Mining and Real Application Testing options

SQL> set numw 16

SQL> select CHECKPOINT_CHANGE# from v$datafile_header where file# in (1,295);

CHECKPOINT_CHANGE#

------------------

      556870614571

      556870614571

SQL> recover datafile 295;

Media recovery complete.

通过上述操作,确认bbed修改文件头成功,后续类似方法对其他9个文件进行修改,并打开数据库

SQL> recover database;

Media recovery complete.

SQL> alter database open;

Database altered.

alert日志提示

Sat Aug 10 16:46:11 2024

ALTER DATABASE RECOVER  datafile 295 

Media Recovery Start

Serial Media Recovery started

WARNING! Recovering data file 295 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

Media Recovery Complete (xff1)

Completed: ALTER DATABASE RECOVER  datafile 295 

Sat Aug 10 16:46:39 2024

ALTER DATABASE RECOVER  database 

Media Recovery Start

 started logmerger process

Sat Aug 10 16:46:51 2024

WARNING! Recovering data file 1139 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 1140 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 1601 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 1803 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 1827 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 1931 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 2185 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 2473 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

WARNING! Recovering data file 2616 from a fuzzy backup. It might be an online

backup taken without entering the begin backup command.

Sat Aug 10 16:46:54 2024

Parallel Media Recovery started with 64 slaves

Media Recovery Complete (xff1)

Completed: ALTER DATABASE RECOVER  database 

Sat Aug 10 17:19:58 2024

alter database open

This instance was first to open

Sat Aug 10 17:19:58 2024

SUCCESS: diskgroup DATA was mounted

Sat Aug 10 17:19:58 2024

NOTE: dependency between database xff and diskgroup resource ora.DATA.dg is established

Sat Aug 10 17:20:10 2024

Picked broadcast on commit scheme to generate SCNs

Sat Aug 10 17:20:10 2024

SUCCESS: diskgroup REDO was mounted

Sat Aug 10 17:20:10 2024

NOTE: dependency between database xff and diskgroup resource ora.REDO.dg is established

Thread 1 opened at log sequence 124958

  Current log# 14 seq# 124958 mem# 0: +REDO/xff/log2.ora

Successful open of redo thread 1

MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set

Sat Aug 10 17:20:14 2024

SMON: enabling cache recovery

Instance recovery: looking for dead threads

Instance recovery: lock domain invalid but no dead threads

[33770] Successfully onlined Undo Tablespace 2.

Undo initialization finished serial:0 start:261099864 end:261100854 diff:990 (9 seconds)

Verifying file header compatibility for 11g tablespace encryption..

Verifying 11g file header compatibility for tablespace encryption completed

SMON: enabling tx recovery

Database Characterset is ZHS16GBK

Sat Aug 10 17:20:16 2024

minact-scn: Inst 1 is now the master inc#:2 mmon proc-id:33650 status:0x7

minact-scn status: grec-scn:0x0000.00000000 gmin-scn:0x0000.00000000 gcalc-scn:0x0000.00000000

Starting background process GTX0

Sat Aug 10 17:20:16 2024

GTX0 started with pid=45, OS id=34119

Starting background process RCBG

Sat Aug 10 17:20:16 2024

RCBG started with pid=46, OS id=34121

replication_dependency_tracking turned off (no async multimaster replication found)

Starting background process QMNC

Sat Aug 10 17:20:16 2024

QMNC started with pid=47, OS id=34134

Starting background process SMCO

Completed: alter database open

其他集群其他节点数据库,一切正常


检查数据字典一致性

SQL> @hcheck.sql

HCheck Version 07MAY18 on 10-AUG-2024 18:24:49

----------------------------------------------

Catalog Version 11.2.0.3.0 (1102000300)

db_name: XFF

                   Catalog   Fixed

Procedure Name             Version    Vs Release    Timestamp

Result

------------------------------ ... ---------- -- ---------- --------------

------

.- LobNotInObj             ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS

.- MissingOIDOnObjCol          ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS

.- SourceNotInObj          ... 1102000300 <=  *All Rel* 08/10 18:24:49 PASS

.- OversizedFiles          ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- PoorDefaultStorage          ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- PoorStorage             ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- TabPartCountMismatch        ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- OrphanedTabComPart          ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- MissingSum$             ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- MissingDir$             ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- DuplicateDataobj        ... 1102000300 <=  *All Rel* 08/10 18:24:50 PASS

.- ObjSynMissing           ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS

.- ObjSeqMissing           ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS

.- OrphanedUndo            ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS

.- OrphanedIndex           ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS

.- OrphanedIndexPartition      ... 1102000300 <=  *All Rel* 08/10 18:24:51 PASS

.- OrphanedIndexSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- OrphanedTable           ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- OrphanedTablePartition      ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- OrphanedTableSubPartition   ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- MissingPartCol          ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- OrphanedSeg$            ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- OrphanedIndPartObj#         ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- DuplicateBlockUse           ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- FetUet              ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- Uet0Check               ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- SeglessUET              ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- BadInd$             ... 1102000300 <=  *All Rel* 08/10 18:24:52 PASS

.- BadTab$             ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- BadIcolDepCnt           ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- ObjIndDobj              ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- TrgAfterUpgrade         ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- ObjType0            ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- BadOwner            ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- StmtAuditOnCommit           ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- BadPublicObjects        ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- BadSegFreelist          ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- BadDepends              ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- CheckDual               ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- ObjectNames             ... 1102000300 <=  *All Rel* 08/10 18:24:53 PASS

.- BadCboHiLo              ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- ChkIotTs            ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- NoSegmentIndex          ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- BadNextObject           ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- DroppedROTS             ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- FilBlkZero              ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- DbmsSchemaCopy          ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- OrphanedObjError        ... 1102000300 >  1102000000 08/10 18:24:54 PASS

.- ObjNotLob               ... 1102000300 <=  *All Rel* 08/10 18:24:54 PASS

.- MaxControlfSeq          ... 1102000300 <=  *All Rel* 08/10 18:24:55 PASS

.- SegNotInDeferredStg         ... 1102000300 >  1102000000 08/10 18:25:18 PASS

.- SystemNotRfile1         ... 1102000300 >   902000000 08/10 18:25:18 PASS

.- DictOwnNonDefaultSYSTEM     ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS

.- OrphanTrigger           ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS

.- ObjNotTrigger           ... 1102000300 <=  *All Rel* 08/10 18:25:18 PASS

---------------------------------------

10-AUG-2024 18:25:18  Elapsed: 29 secs

---------------------------------------

Found 0 potential problem(s) and 0 warning(s)

PL/SQL procedure successfully completed.

Statement processed.

Complete output is in trace file:

/u01/app/oracle/diag/rdbms/xff/xff1/trace/xff1_ora_71148_HCHECK.trc

运气不错,数据字典本身没有损坏,业务直接运行,一切正常(主要原因是在光纤链路不稳定的情况下,客户已经没有往库中写入数据)

  • 17
    点赞
  • 8
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值