参考:Data Recovery Advisor - RMAN command line example (文档 ID 762339.1)
数据恢复顾问(Data Recovery Advisor)是一款Oracle数据库工具。该工具会自动诊断数据故障,确定并提出相应修复方案,并执行客户要求的修复。以上修复是基于数据故障是一个硬盘上持久性数据的损坏或者丢失。
数据恢复顾问会在遇到问题时,自动采集数据故障信息,并且,协助执行对于故障的修复。你可以手动修复一个数据故障,或者要求数据恢复顾问来为你执行修复动作。
DRA可以通过企业管理器(EM)Grid Control和/或者恢复管理器(RMAN)来访问。本文将概述在RMAN中使用的命令,结合Oracle MOS和自己的测试。
在RMAN中,有四种DRA的命令:
- List Failure - 列出先前执行过的故障评估结果。可能的话,重新验证现有故障并关闭它们。
- Advise Failure - 提出手动和自动修复方案。
- Repair Failure - 通过运行由ADVISE FAILURE建议的最佳修复方案来自动修复故障。完成之后会重新验证现有故障。
- Change Failure - 使你可以改变故障的状态。
下面通过两个例子来说明DRA工具的用法
在测试之前使用RMAN对数据库进行全备
场景一、模拟控制文件丢失
关闭数据库,mv掉controlfile
[ora_tst@test rman]$ mv /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf.bak
启动数据库,报错
SQL> startup
ORACLE instance started.Total System Global Area 1068937216 bytes
Fixed Size 2166536 bytes
Variable Size 427819256 bytes
Database Buffers 624951296 bytes
Redo Buffers 14000128 bytes
ORA-00205: error in identifying control file, check alert log for more info告警日志中错误提示:
ALTER DATABASE MOUNT
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/oracle/TEST/db/apps_st/data/cntrl01.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory很容易可以发现是由于cntrl01.dbf丢失导致数据库无法mount
下面通过DRA来检测故障,并修复
[ora_tst@test ~]$ rman target /
Recovery Manager: Release 11.1.0.7.0 - Production on Mon Apr 21 13:35:09 2014
Copyright (c) 1982, 2007, Oracle. All rights reserved.
connected to target database: TEST (not mounted)
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
782 CRITICAL OPEN 21-APR-14 Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing通过list failure命令可以发现故障所在,使用list failure ### detail; ( where ### equlas the failure number)可以查看故障的详细信息。
RMAN> list failure 782 detail;
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
782 CRITICAL OPEN 21-APR-14 Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
Impact: Database cannot be mounted下面可以通过advise failure;命令让Oracle告诉我们遇到这个故障,应该怎么做
RMAN> advise failure;
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
782 CRITICAL OPEN 21-APR-14 Control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf is missing
Impact: Database cannot be mountedanalyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=383 device type=DISK
RMAN-06495: must explicitly specify DBID with SET DBID command
analyzing automatic repair options completeMandatory Manual Actions
========================
no manual actions availableOptional Manual Actions
=======================
no manual actions availableAutomated Repair Options
========================
Option Repair Description
------ ------------------
1 Use a multiplexed copy to restore control file /u01/oracle/TEST/db/apps_st/data/cntrl01.dbf
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hmadvise failure命令提示,我们可以通过拷贝冗余的controlfile来恢复出cntrl01.dbf,并且Oracle在/u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hm中给出具体的恢复脚本
恢复脚本,我们还可以通过repair failure preview命令来获得
RMAN> repair failure preview;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_2401635629.hmcontents of repair script:
# restore control file using multiplexed copy
restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
sql 'alter database mount';Oracle提示我们运行
restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
sql 'alter database mount';命令来恢复cntrl01.dbf
执行上述命令
RMAN> restore controlfile from '/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf';
Starting restore at 21-APR-14
using channel ORA_DISK_1channel ORA_DISK_1: copied control file copy
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl01.dbf
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl02.dbf
output file name=/u01/oracle/TEST/db/apps_st/data/cntrl03.dbf
Finished restore at 21-APR-14RMAN> sql 'alter database mount';
sql statement: alter database mount
released channel: ORA_DISK_1这里我是手工执行的脚本,也可以通过
RMAN> repair failure;
来自动修复故障。
数据库起到了mount状态,说明故障修复成功。
RMAN> sql 'alter database open';
sql statement: alter database open
场景二、数据文件丢失
关闭数据库
[ora_tst@test trace]$ mv /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf.bak
SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.Total System Global Area 1068937216 bytes
Fixed Size 2166536 bytes
Variable Size 427819256 bytes
Database Buffers 624951296 bytes
Redo Buffers 14000128 bytes
Database mounted.
ORA-01157: cannot identify/lock data file 401 - see DBWR trace file
ORA-01110: data file 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf'启动数据库时报错,查看告警日志,如下:
ALTER DATABASE OPEN
Errors in file /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/trace/TEST_dbw0_27581.trc:
ORA-01157: cannot identify/lock data file 401 - see DBWR trace file
ORA-01110: data file 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory下面通过DRA来查看故障,并解决
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
825 HIGH OPEN 21-APR-14 One or more non-system datafiles are missingRMAN> list failure 825 detail;
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
825 HIGH OPEN 21-APR-14 One or more non-system datafiles are missing
Impact: See impact for individual child failures
List of child failures for parent failure ID 825
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
828 HIGH OPEN 21-APR-14 Datafile 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf' is missing
Impact: Some objects in tablespace APPS_TS_TX_DATA might be unavailable通过命令很明显的可以发现故障所在。
RMAN> advise failure;
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
825 HIGH OPEN 21-APR-14 One or more non-system datafiles are missing
Impact: See impact for individual child failures
List of child failures for parent failure ID 825
Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
828 HIGH OPEN 21-APR-14 Datafile 401: '/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf' is missing
Impact: Some objects in tablespace APPS_TS_TX_DATA might be unavailableanalyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=382 device type=DISK
analyzing automatic repair options completeMandatory Manual Actions
========================
no manual actions availableOptional Manual Actions
=======================
1. If file /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf was unintentionally renamed or moved, restore it
2. If a standby database is available, then consider a Data Guard switchover or failoverAutomated Repair Options
========================
Option Repair Description
------ ------------------
1 Restore and recover datafile 401
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hmOracle告诉我们/u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf文件renamed或者moved需要恢复,Restore and recover datafile 401。
RMAN> repair failure preview;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hmcontents of repair script:
# restore and recover datafile
restore datafile 401;
recover datafile 401;可通过
restore datafile 401;
recover datafile 401;来恢复datafile 401。本次测试通过repair failure;命令来自动修改故障
RMAN> repair failure;
Strategy: The repair includes complete media recovery with no data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_556356707.hmcontents of repair script:
# restore and recover datafile
restore datafile 401;
recover datafile 401;Do you really want to execute the above repair (enter YES or NO)? YES
" YES" is an invalid response - please re-enter.Do you really want to execute the above repair (enter YES or NO)? YES
executing repair scriptStarting restore at 21-APR-14
using channel ORA_DISK_1channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00401 to /u01/oracle/TEST/db/apps_st/data/a_txn_data02.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1 tag=TAG20140421T110305
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:05:35
Finished restore at 21-APR-14Starting recover at 21-APR-14
using channel ORA_DISK_1starting media recovery
archived log for thread 1 with sequence 22 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc
channel ORA_DISK_1: starting archived log restore to default destination
channel ORA_DISK_1: restoring archived log
archived log thread=1 sequence=21
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0fp69laa_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0fp69laa_1_1 tag=TAG20140421T123001
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc thread=1 sequence=21
media recovery complete, elapsed time: 00:00:01
Finished recover at 21-APR-14
repair failure completeDo you want to open the database (enter YES or NO)? YES
database opened场景三、日志组丢失
关闭数据库
删除非当前日志组所有的日志文件
[ora_tst@test trace]$ rm -f /u01/oracle/TEST/db/apps_st/data/log01b.dbf
[ora_tst@test trace]$ rm -f /u01/oracle/TEST/db/apps_st/data/log01a.dbf
SQL> startup
ORA-32004: obsolete and/or deprecated parameter(s) specified
ORACLE instance started.Total System Global Area 1068937216 bytes
Fixed Size 2166536 bytes
Variable Size 427819256 bytes
Database Buffers 624951296 bytes
Redo Buffers 14000128 bytes
Database mounted.
ORA-00313: open failed for members of log group 1 of thread 1
ORA-00312: online log 1 thread 1: '/u01/oracle/TEST/db/apps_st/data/log01b.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3
ORA-00312: online log 1 thread 1: '/u01/oracle/TEST/db/apps_st/data/log01a.dbf'
ORA-27037: unable to obtain file status
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3启动数据库时报错
使用DRA修复上述故障
RMAN> list failure;
using target database control file instead of recovery catalog
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
999 CRITICAL OPEN 21-APR-14 Redo log group 1 is unavailable
1005 HIGH OPEN 21-APR-14 Redo log file /u01/oracle/TEST/db/apps_st/data/log01a.dbf is missing
1002 HIGH OPEN 21-APR-14 Redo log file /u01/oracle/TEST/db/apps_st/data/log01b.dbf is missingRMAN> advise failure;
List of Database Failures
=========================Failure ID Priority Status Time Detected Summary
---------- -------- --------- ------------- -------
999 CRITICAL OPEN 21-APR-14 Redo log group 1 is unavailable
1005 HIGH OPEN 21-APR-14 Redo log file /u01/oracle/TEST/db/apps_st/data/log01a.dbf is missing
1002 HIGH OPEN 21-APR-14 Redo log file /u01/oracle/TEST/db/apps_st/data/log01b.dbf is missinganalyzing automatic repair options; this may take some time
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=381 device type=DISK
analyzing automatic repair options completeMandatory Manual Actions
========================
no manual actions availableOptional Manual Actions
=======================
1. If file /u01/oracle/TEST/db/apps_st/data/log01a.dbf was unintentionally renamed or moved, restore it
2. If file /u01/oracle/TEST/db/apps_st/data/log01b.dbf was unintentionally renamed or moved, restore it
3. If a standby database is available, then consider a Data Guard switchover or failoverAutomated Repair Options
========================
Option Repair Description
------ ------------------
1 Perform incomplete database recovery to SCN 5965141836565
Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hmRMAN> repair failure preview;
Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hmcontents of repair script:
# database point-in-time recovery
restore database until scn 5965141836565;
recover database until scn 5965141836565;
alter database open resetlogs;RMAN> repair failure;
Strategy: The repair includes point-in-time recovery with some data loss
Repair script: /u01/oracle/TEST/db/tech_st/11.1.0/admin/TEST_test/diag/rdbms/test/TEST/hm/reco_3499717585.hmcontents of repair script:
# database point-in-time recovery
restore database until scn 5965141836565;
recover database until scn 5965141836565;
alter database open resetlogs;Do you really want to execute the above repair (enter YES or NO)? YES
executing repair scriptStarting restore at 21-APR-14
using channel ORA_DISK_1channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to /u01/oracle/TEST/db/apps_st/data/system01.dbf
channel ORA_DISK_1: restoring datafile 00002 to /u01/oracle/TEST/db/apps_st/data/system02.dbf
channel ORA_DISK_1: restoring datafile 00003 to /u01/oracle/TEST/db/apps_st/data/system03.dbf
channel ORA_DISK_1: restoring datafile 00004 to /u01/oracle/TEST/db/apps_st/data/system04.dbf
channel ORA_DISK_1: restoring datafile 00005 to /u01/oracle/TEST/db/apps_st/data/system05.dbf
channel ORA_DISK_1: restoring datafile 00006 to /u01/oracle/TEST/db/apps_st/data/ctxd01.dbf
channel ORA_DISK_1: restoring datafile 00007 to /u01/oracle/TEST/db/apps_st/data/owad01.dbf
channel ORA_DISK_1: restoring datafile 00008 to /u01/oracle/TEST/db/apps_st/data/a_queue02.dbf
channel ORA_DISK_1: restoring datafile 00009 to /u01/oracle/TEST/db/apps_st/data/odm.dbf................................
channel ORA_DISK_1: restoring datafile 00407 to /u01/oracle/TEST/db/apps_st/data/a_ref02.dbf
channel ORA_DISK_1: reading from backup piece /u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1
channel ORA_DISK_1: piece handle=/u01/oracle/TEST/db/tech_st/11.1.0/dbs/0ep69g7a_1_1 tag=TAG20140421T110305
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 01:16:04
Finished restore at 21-APR-14Starting recover at 21-APR-14
using channel ORA_DISK_1starting media recovery
archived log for thread 1 with sequence 21 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc
archived log for thread 1 with sequence 22 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc
archived log for thread 1 with sequence 23 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_23_825013351.arc
archived log for thread 1 with sequence 24 is already on disk as file /u01/PROD/db/apps_st/data/archive/ARC1_24_825013351.arc
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_21_825013351.arc thread=1 sequence=21
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_22_825013351.arc thread=1 sequence=22
archived log file name=/u01/PROD/db/apps_st/data/archive/ARC1_23_825013351.arc thread=1 sequence=23
media recovery complete, elapsed time: 00:00:19
Finished recover at 21-APR-14database opened
repair failure complete