Dataguard gap修复
https://blog.csdn.net/kiral07/article/details/87191787
https://www.2cto.com/database/201605/507685.html
1. 首先来模拟Gap的产生
1.1. 备库关闭:
SYS@dgtest_s>shutdown immediate;
1.2. 主库切换日志
SYS@dgtest>select SEQUENCE#,ARCHIVED,STATUS from v$log;
SEQUENCE# ARC STATUS
---------- --- ----------------
61 YES ACTIVE
62 YES ACTIVE
63 NO CURRENT
SYS@dgtest>alter system archive log current;
System altered.
SYS@dgtest>select SEQUENCE#,ARCHIVED,STATUS from v$log;
SEQUENCE# ARC STATUS
---------- --- ----------------
64 NO CURRENT
62 YES ACTIVE
63 YES ACTIVE
刚才current的日志已经归档
1.3. 删除归档,产生UNRESOLVABLE GAP
现在删除63号归档
[oracle@primary arch]$ mv 1_63_909786801.dbf 1_63_909786801.dbf.bak
2. 查看报错
2.1. 启动备库
SYS@dgtest_s>startup
2.2. 查看备库的alert
Media Recovery Log /u01/app/oracle/arch/1_62_909786801.dbf
Media Recovery Waiting for thread 1 sequence 63
Fetching gap sequence in thread 1, gap sequence 63-63
Fri May 06 05:28:09 2016
FAL[client]: Failed to request gap sequence
GAP - thread 1 sequence 63-63
DBID 3866310445 branch 909786801
FAL[client]: All defined FAL servers have been attempted.
------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
------------------------------------------------------------
2.3. 主库查询SWITCHOVER_STATUS
SYS@dgtest>SELECT SWITCHOVER_STATUS FROM V$DATABASE;
SWITCHOVER_STATUS
--------------------
UNRESOLVABLE GAP
为UNRESOLVABLE GAP,说明此时的GAP需要我们自己手工去修复,无法自动修复,可自动修复的GAP显示为RESOLVABLE GAP
3. 基于SCM的增量备份修复GAP
3.1. 在备库上查询current scn号
SYS@dgtest_s>select current_scn from v$database;
CURRENT_SCN
-----------
2567388
此处:scn起点
|
当查询v$datafile scn,database scn,datafile_header.scn发现均不一致,这个时候不能使用CURRENT_SCN来作为scn增量备份的起点,这个时候需要根据缺少日志最下sequence#找出scn,然后作为增量备份的起点.所以使用gap中提示的sequence#找出起始scn比CURRENT_SCN 更加靠谱(出现scn不一致情况,一般人工不正常干预dg导致
3.2. 到主库去进行基于此SCN的增量备份
RMAN> BACKUP INCREMENTAL FROM SCN 2567388 DATABASE FORMAT '/u01/app/oracle/oradata/tmp/ora11_scn_%U' tag 'For Standby Gap';
3.3. 传输到备库:
[oracle@primary tmp]$ scp * standby:/u01/app/oracle/oradata/tmp
oracle@standby's password:
ora11_scn_0kr54hvk_1_1 100% 125MB 125.2MB/s 00:01
ora11_scn_0lr54l99_1_1 100% 9664KB 9.4MB/s 00:00
3.4. 备库重新启动到mount,并取消日志应用
SYS@dgtest_s>shutdown immediate;
SYS@dgtest_s>startup mount;
SYS@dgtest_s>alter database recover managed standby database cancel;
3.5. 注册刚才传输过来的备份集
RMAN> CATALOG START WITH '/u01/app/oracle/oradata/tmp';
3.6. recover备库
RMAN> recover database noredo;
恢复完毕,这时我们可以观察备库的alert日志:
Incremental restore complete of datafile 4 /u01/app/oracle/oradata/dgtest_s/users01.dbf
checkpoint is 2893208
last deallocation scn is 3
Incremental restore complete of datafile 3 /u01/app/oracle/oradata/dgtest_s/undotbs01.dbf
checkpoint is 2893208
last deallocation scn is 973300
Incremental restore complete of datafile 5 /u01/app/oracle/oradata/dgtest_s/example01.dbf
checkpoint is 2893208
last deallocation scn is 942056
Mon May 09 05:20:25 2016
Incremental restore complete of datafile 2 /u01/app/oracle/oradata/dgtest_s/sysaux01.dbf
checkpoint is 2893208
last deallocation scn is 956093
Incremental restore complete of datafile 1 /u01/app/oracle/oradata/dgtest_s/system01.dbf
checkpoint is 2893208
last deallocation scn is 955346
发现数据文件的scn号都已经重新刷新,但是此时还不能重新起库,需要重新从主库生成一个standby controlfile。
3.7. 主库备份控制文件
RMAN> BACKUP CURRENT CONTROLFILE FOR STANDBY FORMAT '/u01/app/oracle/oradata/tmp/ctl.bak';
3.8. 传输standby控制文件到备库
oracle@standby's password:
ctl.bak 100% 9664KB 9.4MB/s 00:00
3.9. 备库恢复standby控制文件
备库库起到nomount阶段:
SYS@dgtest_s>shutdown immediate
SYS@dgtest_s>startup nomount;
rman恢复控制文件
RMAN> RESTORE STANDBY CONTROLFILE FROM '/u01/app/oracle/oradata/tmp/ctl.bak';
3.10. mount备库,并取消日志应用
SYS@dgtest_s> alter database mount;
SYS@dgtest_s>alter database recover managed standby database cancel;
3.11. 清空备库日志组
SYS@dgtest_s>ALTER DATABASE CLEAR LOGFILE GROUP 1;
Database altered.
注:如果采用了standby log模式,不需要清空,如果清空会出现
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
ALTER DATABASE CLEAR LOGFILE GROUP 1
*
ERROR at line 1:
ORA-19527: physical standby redo log must be renamed
ORA-00312: online log 1 thread 1: '/u01/oradata/badly9/redo01.log'
说明:如果没有采用standby log模式,有几组需要清空几组
3.12. 备库重设flashback
SYS@dgtest_s>ALTER DATABASE FLASHBACK OFF;
SYS@dgtest_s>ALTER DATABASE FLASHBACK ON;
3.13. 备库开始应用日志
SYS@dgtest_s>ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
4. 确认修复成功
在主库中执行
SYS@dgtest>alter system switch logfile;
分别主备库中执行select max(sequence#) from v$archived_log;如果一致标示修复成功
SYS@dgtest>select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
81
SYS@dgtest_s>select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
81
至此GAP修复完毕。
====================================
一、RMAN增量方式恢复
二、手动注册归档日志
在备库出现归档文件序列号不一致以及出现gap可通过2种方式修复
一、RMAN增量方式恢复
1、主库上的归档日志
-rw-r----- 1 oracle oinstall 269K Jan 2 09:46 1_29_996353475.arc
-rw-r----- 1 oracle oinstall 48K Jan 2 09:48 1_30_996353475.arc
-rw-r----- 1 oracle oinstall 31K Jan 2 09:49 1_31_996353475.arc
-rw-r----- 1 oracle oinstall 187K Jan 2 09:54 1_32_996353475.arc
-rw-r----- 1 oracle oinstall 1.5K Jan 2 09:54 1_33_996353475.arc
-rw-r----- 1 oracle oinstall 3.5K Jan 2 09:54 1_34_996353475.arc
-rw-r----- 1 oracle oinstall 1.0K Jan 2 09:54 1_35_996353475.arc
-rw-r----- 1 oracle oinstall 1.5K Jan 2 09:54 1_36_996353475.arc
-rw-r----- 1 oracle oinstall 1.0K Jan 2 09:54 1_37_996353475.arc
-rw-r----- 1 oracle oinstall 1.5K Jan 2 09:54 1_38_996353475.arc
-rw-r----- 1 oracle oinstall 1.5K Jan 2 09:54 1_39_996353475.arc
-rw-r----- 1 oracle oinstall 1.5K Jan 2 09:54 1_40_996353475.arc
-rw-r----- 1 oracle oinstall 1.5K Jan 2 09:54 1_41_996353475.arc
1
2
3
4
5
6
7
8
9
10
11
12
13
2、强制删除主库上的归档日志,备库上的告警日志在等待日志序号32recover
MRP0: Background Managed Standby Recovery process started (orcl)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Recovery of Online Redo Log: Thread 1 Group 4 Seq 32 Reading mem 0--->等待序号32的日志recover
Mem# 0: /u01/app/oracle/oradata/std/orcl/st4.log
Completed: alter database recover managed standby database using current logfile disconnect
Wed Jan 02 10:19:08 2019
1
2
3
4
5
6
7
8
9
3、备库上查询gap情况
SQL> select * from V$ARCHIVE_GAP;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 31 32
1
2
3
4
5
备库已出现gap
4、停止备库的日志应用
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
1
5、查看备库最小的scn
col MIN(CHECKPOINT_CHANGE#) for 999999999999999999
col CURRENT_SCN for 99999999999999999
1
2
(1)
SQL> select min(checkpoint_change#) from v$datafile_header;
MIN(CHECKPOINT_CHANGE#)
-----------------------
1160385
1
2
3
4
5
(2)
SQL> select min(fhscn) from x$kcvfh;
MIN(FHSCN)
------------------------------------------------
1160385
1
2
3
4
5
(3)
SQL> SELECT CURRENT_SCN FROM V$DATABASE;
CURRENT_SCN
------------------
1160384
1
2
3
4
5
6、确定主库是否添加数据文件
select FILE#,name from v$datafile where CREATION_CHANGE#> =1160384;
FILE# NAME
----- -----
5 /u01/app/oracle/oradata/orcl/admin.dbf
1
2
3
4
主库在scn之后产生新的数据文件,需要做RMAN数据文件备份
7、做RMAN数据文件、增量、控制文件备份
#!/bin/bash
export ORACLE_SID=orcl
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
export ORACLE_BASE=/u01/app/oracle
export NLS_LANG="AMERICAN_AMERICA.ZHS16GBK"
backtime=`date +"20%y%m%d%H%M%S"`
$ORACLE_HOME/bin/rman target / log=/home/rman/db_$backtime.log <<EOF
run{
allocate channel c1 device type disk;
allocate channel c2 device type disk;
allocate channel c3 device type disk;
allocate channel c4 device type disk;
backup as compressed backupset datafile 5 format '/u01/rman/datafile_%U.bak' tag 'datafile';
backup as compressed backupset INCREMENTAL from scn 1160384 database format '/u01/rman/zengliang_%u.bak' tag 'zengliang';
backup current controlfile for standby format '/u01/rman/ctlforstd.bak' tag 'std controlfile';
release channel c1;
release channel c2;
release channel c3;
release channel c4;
}
exit
EOF
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
8、拷贝主库RMAN备份到备库
9、备库恢复
RMAN> shutdown;
RMAN> startup nomount;
RMAN> restore standby controlfile from '/u01/rman/ctlforstd.bak';
RMAN> alter database mount;
RMAN> CATALOG START WITH '/u01/rman';
1
2
3
4
5
还原备库缺失的数据文件
RMAN>restore datafile 5;
1
10、使用增量备份恢复备库,noredo表示不应用redo or 归档
RMAN> RECOVER DATABASE NOREDO;
1
11、清理所有的standby redolog
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 1;
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 2;
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 3;
1
2
3
12、检查备库的监听是否开启
lsnrctl status
13、备库重新接收并日志应用
alter database recover managed standby database using current logfile disconnect;
1
14、备库重新开启ADG
SQL> alter database RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> alter database open;
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE using current logfile DISCONNECT FROM SESSION;
1
2
3
4
5
15、验证备库修复情况
(1)查看MRP进程运行
(2)归档日志序列号
(3)手动建用户
(4)查看备库告警日志
Attempt to start background Managed Standby Recovery process (orcl)
Wed Jan 02 12:33:06 2019
MRP0 started with pid=25, OS id=3087
MRP0: Background Managed Standby Recovery process started (orcl)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Waiting for thread 1 sequence 58
Completed: alter database recover managed standby database using current logfile disconnect
Wed Jan 02 12:34:45 2019
alter database open
ORA-10456 signalled during: alter database open...
Wed Jan 02 12:37:04 2019
RFS[1]: Assigned to RFS process 3104
RFS[1]: Opened log for thread 1 sequence 60 dbid 1524134211 branch 996353475
Wed Jan 02 12:37:04 2019
RFS[2]: Assigned to RFS process 3106
RFS[2]: Opened log for thread 1 sequence 58 dbid 1524134211 branch 996353475
Wed Jan 02 12:37:04 2019
RFS[3]: Assigned to RFS process 3108
RFS[3]: Opened log for thread 1 sequence 59 dbid 1524134211 branch 996353475
Archived Log entry 1 added for thread 1 sequence 59 rlc 996353475 ID 0x5ad90943 dest 2:
RFS[3]: Opened log for thread 1 sequence 61 dbid 1524134211 branch 996353475
Archived Log entry 2 added for thread 1 sequence 58 rlc 996353475 ID 0x5ad90943 dest 2:
Archived Log entry 3 added for thread 1 sequence 61 rlc 996353475 ID 0x5ad90943 dest 2:
RFS[2]: Opened log for thread 1 sequence 62 dbid 1524134211 branch 996353475
Archived Log entry 4 added for thread 1 sequence 60 rlc 996353475 ID 0x5ad90943 dest 2:
Archived Log entry 5 added for thread 1 sequence 62 rlc 996353475 ID 0x5ad90943 dest 2:
Wed Jan 02 12:37:05 2019
Media Recovery Log /u01/arch/1_58_996353475.arc
Media Recovery Log /u01/arch/1_59_996353475.arc
Media Recovery Log /u01/arch/1_60_996353475.arc
RFS[3]: Selected log 4 for thread 1 sequence 63 dbid 1524134211 branch 996353475
Wed Jan 02 12:37:05 2019
Archived Log entry 6 added for thread 1 sequence 63 ID 0x5ad90943 dest 1:
Wed Jan 02 12:37:05 2019
Primary database is in MAXIMUM PERFORMANCE mode
RFS[4]: Assigned to RFS process 3110
RFS[4]: Selected log 4 for thread 1 sequence 64 dbid 1524134211 branch 996353475
Media Recovery Log /u01/arch/1_61_996353475.arc
Media Recovery Log /u01/arch/1_62_996353475.arc
Media Recovery Log /u01/arch/1_63_996353475.arc
Media Recovery Waiting for thread 1 sequence 64 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 64 Reading mem 0
Mem# 0: /u01/app/oracle/oradata/std/orcl/st4.log
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
附备库没有接收日志应用直接open数据库报错告警日志输出
Managed Standby Recovery starting Real Time Apply
Media Recovery Waiting for thread 1 sequence 58
Wed Jan 02 12:32:40 2019
Standby crash recovery need archive log for thread 1 sequence 58 to continue.
Please verify that primary database is transporting redo logs to the standby database.
Wait timeout: thread 1 sequence 58
Standby Crash Recovery aborted due to error 16016.
Errors in file /u01/app/oracle/diag/rdbms/std/orcl/trace/orcl_ora_2859.trc:
ORA-16016: archived log for thread 1 sequence# 58 unavailable
Recovery interrupted!
Completed Standby Crash Recovery.
Signalling error 1152 for datafile 1!
Errors in file /u01/app/oracle/diag/rdbms/std/orcl/trace/orcl_ora_2859.trc:
ORA-10458: standby database requires recovery
ORA-01152: file 1 was not restored from a sufficiently old backup
ORA-01110: data file 1: '/u01/app/oracle/oradata/std/orcl/system01.dbf'
ORA-10458 signalled during: alter database open...
Wed Jan 02 12:33:06 2019
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
二、手动注册归档日志
1、当备库告警日志出现GAP警告
Sun Jan 06 12:46:45 2019
alter database recover managed standby database using current logfile disconnect
Attempt to start background Managed Standby Recovery process (orcl)
Sun Jan 06 12:46:45 2019
MRP0 started with pid=26, OS id=2645
MRP0: Background Managed Standby Recovery process started (orcl)
Serial Media Recovery started
Managed Standby Recovery starting Real Time Apply
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /u01/arch/1_74_996353475.arc
Error opening /u01/arch/1_74_996353475.arc
Attempting refetch
Media Recovery Waiting for thread 1 sequence 74
Fetching gap sequence in thread 1, gap sequence 74-74
Completed: alter database recover managed standby database using current logfile disconnect
Sun Jan 06 12:48:40 2019
FAL[client]: Failed to request gap sequence--------->出现GAP
GAP - thread 1 sequence 74-74
DBID 1524134211 branch 996353475
FAL[client]: All defined FAL servers have been attempted.
------------------------------------------------------------
Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization
parameter is defined to a value that's sufficiently large
enough to maintain adequate log switch information to resolve
archivelog gaps.
------------------------------------------------------------
Sun Jan 06 12:48:53 2019
Archived Log entry 39 added for thread 1 sequence 95 ID 0x5ad90943 dest 1:
Sun Jan 06 12:48:53 2019
Primary database is in MAXIMUM PERFORMANCE mode
RFS[11]: Assigned to RFS process 2650
RFS[11]: Selected log 4 for thread 1 sequence 96 dbid 1524134211 branch 996353475
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
2、手动从主库拷贝缺失的归档日志到备库
3、手动注册
SQL> alter database register logfile '/u01/arch/old/1_94_996353475.arc';
Database altered.
1
2
3
生成自动注册归档日志脚本
#!/bin/bash
echo "" > /u01/arch/apply.sql
for i in {75..94}
do
echo "alter database register logfile '/u01/arch/old/1_${i}_996353475.arc';" >> /u01/arch/apply.sql
done