如何确认database 已经正常恢复 rman v$datafile_header fuzzy

jnrjian

已于 2024-08-12 17:12:39 修改

阅读量436

点赞数 5

文章标签：数据库 oracle

于 2024-08-12 16:34:35 首次发布

本文链接：https://blog.csdn.net/jnrjian/article/details/141135527

版权

select status, checkpoint_change#,
to_char(checkpoint_time, 'DD-MON-YYYY HH24:MI:SS')
as checkpoint_time, count(*) from v$datafile_header
group by status, checkpoint_change#, checkpoint_time
order by status, checkpoint_change#, checkpoint_time;

select count(*),status,checkpoint_change#,fuzzy from v$datafile_header group by status,checkpoint_change#,fuzzy ;

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 12.2.0.1 [Release 10.2 to 12.2]
Information in this document applies to any platform.

SYMPTOMS

- Partial restore of the database to new server, skipping tablepspaces failing with error:

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/<path>/<filename>'

- Required datafiles have been recovered till the time required and files are non Fuzzy.
- Non CDB environment.

CAUSE

There is issue with the Controlfile

SOLUTION

- Ensure the database has been restored and recovery till the Until time or the SCN specified in the PITR script

- In the below example PITR is being done until time 24-DEC-2018 15:00:00.

- Only few Selected tablespaces have been restored (Out of 45 datafiles only 15 datafiles from system, sysaux, undo, and required tablespace have been restored

Run the below query to find the checkpoint time and SCN in the restored datafiles

STATUS         CHECKPOINT_CHANGE#     CHECKPOINT_TIME         COUNT(*)
------- ------------------ -------------------- -----------------------------------
OFFLINE              0                     30              # Datafiles belong to the skipped tablespaces
ONLINE         592225477425             24-DEC-2018 15:00:00       15             # 15 datafile belong to the needed tablespaces

Check for individual files"

Select status,name,file#,ts#,CHECKPOINT_CHANGE# from v$datafile_header where ts# in(Select ts# from v$tablespace where name in ('<replace with tablespace name seperated by comma including system/sysuax/undo>'));

STATUS NAME                              FILE#        TS#      checkpoint_change#
------- ------------------------------------------------------------------- ---------- ----------
ONLINE /<path>/<filename>              1          0         592225477425
ONLINE  /<path>/<filename>               2          1         592225477425
ONLINE  /<path>/<filename>               3          2         592225477425
ONLINE  /<path>/<filename>              4          4          592225477425
ONLINE  /<path>/<filename>              5          5         592225477425
ONLINE  /<path>/<filename>              6          4        592225477425
.
.

15 rows selected.

All the 15 datafiles are at Same Checkpoint time, Checkpoint change#. These datafiles belong to the tablespaces which have been included in the restore and recovery command
Ensure the needed datafiles are not Fuzzy:

select count(*),status,checkpoint_change#,fuzzy from v$datafile_header group by status,checkpoint_change#,fuzzy ;

COUNT(*) STATUS CHECKPOINT_CHANGE# FUZ
---------- ------- ------------------ ---
30 OFFLINE 0 NO
15 ONLINE 592225477425 NO --------------------------------> All 15 datafiles are at same checkpoint change# and are not fuzzy

The above confirms the 15 datafiles are all not fuzzy and at having the same SCN.

To resolve the issue try Option 1 first. If option 1 does not work, try Option 2

Option 1 : Fake recovery

SQL> Recover database using backup controlfile until cancel ;

Type Cancel

If no errors,then open the database with resetlogs:

SQL> Alter database open resetlogs ;

Option 2 : Recreate controlfile

- If Option 1 still gives the same error recreate the controlfile.

- Save the current database information:
Spool /<path>/<LogfileName>.log
Select name,file#,status,ts# from v$datafile :
Select member,group# from v$logfile ;
Select * from v$log ;
Show parameter control
Spool off

- Create controlfile dump to recreate the controlfile:

Alter session set tracefile_identifier='create' ;

alter database backup controlfile to trace resetlogs ;

- Shut down the restored database

- Edit the controlfile dump file which should be created under the trace directory. Remove the datafiles which are not needed. Keeping only 15 datafiles which were restored and needed

SQL>Startup nomount ;

SQL>@/<path>/<controlfile_dump>

SQL>Alter database mount ;

SQL> Recover database using backup controlfile until cancel ;

Type cancel ;

SQL>Alter database open resetlogs ;

------------------------有 archived log在时追赶某个文件！！！

Cannot proceed with recovery due to ORA-16433:

Sun Aug 17 14:04:07 2014
Media Recovery failed with error 16433
Slave exiting with ORA-283 exception
Errors in file /oradump/prod/<PATH>/logs/diag/diag/rdbms/<db_unique_name/<SID>/trace/<SID>_pr00_32964892.trc:
ORA-00283: recovery session canceled due to errors
ORA-16433: The database must be opened in read/write mode.
Recovery Slave PR00 previously exited with exception 283
ORA-283 signalled during: ALTER DATABASE RECOVER database using backup controlfile until cancel ...

Normally, the solution is to recreate the controlfile. However, the following error is produced when trying to recreate the controlfile:

CREATE CONTROLFILE REUSE DATABASE "<SID>" RESETLOGS ARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01189: file is from a different RESETLOGS than previous files
ORA-01110: data file 2: '/ora/prod/<db_unique_name/data/<SID>/datafile/o1_mf_undotbs1_334h0kxm_.dbf’

CHANGES

CAUSE

In order to recreate the controlfile, all datafiles must be at the same incarnation, or resetlogs time.
We cannot allow datafiles to be at different incarnations.

Using the current controlfile, which was used and failed with recovery before, we can see that all datafiles, except datafile# 1, have a RESETLOGS_TIME of 17-AUG-2014 13:28:19:

SQL> select status,checkpoint_change#,checkpoint_time, resetlogs_change#,
2 resetlogs_time, count(*), fuzzy from v$datafile_header
3 group by status,checkpoint_change#,checkpoint_time, resetlogs_change#,
4 resetlogs_time, fuzzy;

STATUS     CHECKPOINT_CHANGE# CHECKPOINT_TIME      RESETLOGS_CHANGE# RESETLOGS_TIME              COUNT(*) FUZ
---------- ------------------ -------------------- ----------------- ------------------------- ---------- ---
ONLINE          3645505893814 17-AUG-2014 11:37:33                 1 27-APR-2007 14:07:53               1 NO
ONLINE          3645505893818 17-AUG-2014 13:34:48     3645505897826 17-AUG-2014 13:28:19             156 YES

SQL> select file#, status, checkpoint_change#, checkpoint_time, resetlogs_change#, resetlogs_time, fuzzy from v$datafile_header;

FILE# STATUS     CHECKPOINT_CHANGE# CHECKPOINT_TIME      RESETLOGS_CHANGE# RESETLOGS_TIME            FUZ
------- ---------- ------------------ -------------------- ----------------- ------------------------- ---
      1 ONLINE          3645505893814 17-AUG-2014 11:37:33                 1 27-APR-2007 14:07:53      NO
      2 ONLINE          3645505893818 17-AUG-2014 13:34:48     3645505897826 17-AUG-2014 13:28:19      YES
...
    157 ONLINE          3645505893818 17-AUG-2014 13:34:48     3645505897826 17-AUG-2014 13:28:19      YES

157 rows selected.

SOLUTION

Suggestion

Before recreating the controlfile as described below, take a operating system backup of the current controlfile to be on safe side.

Steps to solve this issue:

The below will only work if you have the redo logs (archived and/or online) to recover the database.

1) Recreate the controlfile with just the datafiles that has lower incarnation, or resetlogs time.
   This list must include the system datafile.

   From the previous example, the CREATE CONTROLFILE command should have the datafile with a RESETLOGS TIME of 27-APR-2007 14:07:53.
   In this case it will be datafile# 1:

CREATE CONTROLFILE REUSE DATABASE "<SID>" RESETLOGS ARCHIVELOG
    MAXLOGFILES 40
    MAXLOGMEMBERS 3
    MAXDATAFILES 200
    MAXINSTANCES 8
    MAXLOGHISTORY 37744
LOGFILE
GROUP 1 (
    '/ora/prod/<db_unique_name/data/<SID>/onlinelog/o1_mf_1_1PN15JjCD_.log',
    '/ora/prod/<db_unique_name/index/<SID>/onlinelog/o1_mf_1_1PN15ZLTY_.log'
) SIZE 1000M BLOCKSIZE 512,
...

GROUP 12 (
'/ora/prod/<db_unique_name/data/<SID>/onlinelog/o1_mf_12_1PN0wPCV4_.log',
'/ora/prod/<db_unique_name/index/<SID>/onlinelog/o1_mf_12_1PN0wdYFq_.log'
) SIZE 1000M BLOCKSIZE 512
-- STANDBY LOGFILE
DATAFILE
'/ora/prod/<db_unique_name/data/<SID>/datafile/o1_mf_system_334h00q2_.dbf'
CHARACTER SET WE8MSWIN1252
;

2) Once the controlfile is recreated, perform recovery to bring it / rollforward to new incarnation:

SQL> recover database using backup controlfile until cancel;

Apply the redo log(s) requested until it gets to the new incarnation

3) Now recreate the controfile again to include "ALL Datafiles". And finally perform recovery of all the datafiles:

SQL> recover database using backup controlfile until cancel;

Apply the redo log(s) requested to get to the desired point in time

4) Open the database with resetlogs:

SQL> alter database open resetlogs;

---------FUZZY 的含义

APPLIES TO:

Oracle Database - Enterprise Edition - Version 19.14.0.0.0 and later
Information in this document applies to any platform.

SYMPTOMS

On Standby, ACTIVATE Fails with ORA-01194 and OPEN READ ONLY also fails with ORA-01194.

SQL> ALTER DATABASE OPEN READ ONLY;
ALTER DATABASE OPEN READ ONLY
*
ERROR at line 1:
ORA-10458: standby database requires recovery
ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '+DATA_DG/XCTRSBPD/DATAFILE/system.315.990754891'

CAUSE

Check the datafile status from the below query:

select HXFIL File_num,substr(HXFNM,1,70) File_name,FHTYP Type,HXERR Validity,FHSCN SCN, FHTNM TABLESPACE_NAME,FHSTA status ,FHRBA_SEQ Sequence from X$KCVFH;
select hxfil FileNo,FHSTA status from x$kcvfhall;
select file#,STATUS, FUZZY from v$datafile_header;

The error ORA-01194 occur when datafiles are in infinite fuzzy . This situation happen when there is difference in SCN between datafile header and the datafile.

SOLUTION

Recreate the standby controlfile. Recreating standby controlfile take care of the SCN change becasue the change is recorded in the REDO hence the mismatch will be taken care on further recovery.

Steps to recreate the standby controlfile from SQLPLUS:

Steps to recreate a Physical Standby Controlfile (Doc ID 459411.1)

Steps to recreate the standby controlfile Using RMAN on ASM and on ASM environment:

Step By Step Guide On How To Recreate Standby Control File When Datafiles Are On ASM And Using Oracle Managed Files (Doc ID 734862.1)

----------------------

APPLIES TO:

Oracle Database - Enterprise Edition - Version 19.15.1.0.0 and later
Information in this document applies to any platform.

SYMPTOMS

Production DB crashed with ORA-00742: Log read detects lost write in thread n sequence xxxxxx block #
ORA-00312: online log <log#> thread <thread#>: '+<Diskgroup>/<DBName>/redologfile.log'.

Database has been recovered using all the archived and available online redo log files and made sure that no datafile is in fuzzy state in v$datafile_header

Recover database is successful.

RMAN> recover database;

Starting recover at 2022-09-14 23:27:56
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=6 instance=<Instance> device type=DISK

starting media recovery
media recovery complete, elapsed time: 00:00:03

Finished recover at 2022-09-14 23:28:18
RMAN>

SQL> select * from v$recover_file;

no rows selected

But while trying to open the database, it fails with below error

SQL> alter database open;
alter database open
*
ERROR at line 1:
ORA-00742: Log read detects lost write in thread <thread#> sequence <seq#> block <Block#>
ORA-00312: online log <log#> thread <thread#>: '+<Diskgroup>/<DBName>/redologfile.log'

While trying to perform clear unarchived logfile as mentioned in document Doc ID 2064718.1, it fails with below error.

SQL> alter database clear unarchived logfile group <group#>;
alter database clear unarchived logfile group <group#>
*
ERROR at line 1:
ORA-01624: log <group#> needed for crash recovery of instance <Instance> (thread <thread#>)
ORA-00312: online log <log#> thread <thread#>: '+<Diskgroup>/<DBName>/redologfile.log'