问题现象
当主库出现问题,准备进行adg failover切换的时候,会有如下报错:
SQL> alter database recover managed standby database finish force;
alter database recover managed standby database finish force
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER…FINISH not allowed due to gap for thr 1, seq 64-66
尝试解决
初步遇到改报错,尝试过以下修复方法,没能成功:
SQL> alter database recover managed standby database finish force;
alter database recover managed standby database finish force
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 64-66
SQL> alter database commit to switchover to primary;
alter database commit to switchover to primary
*
ERROR at line 1:
ORA-16139: media recovery required
------主备切换的时候,提示需要recover,所以standby数据库尝试进行recover,由于日志找不到,所以指定cancel。
SQL> recover standby database until cancel;
ORA-00279: change 789548 generated at 05/09/2020 16:09:10 needed for thread 1
ORA-00289: suggestion :
/u01/app/oracle/oradata/arch/zhuodg/ZHUODG/archivelog/2020_05_09/o1_mf_1_64_%u_.
arc
ORA-00280: change 789548 for thread 1 is in sequence #64
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
cancel
Media recovery cancelled.
---不完全恢复完成后,提示还是报错,所以不完全恢复不能修复此故障。
SQL> alter database commit to switchover to primary;
alter database commit to switchover to primary
*
ERROR at line 1:
ORA-16139: media recovery required
SQL> alter database recover managed standby database finish force;
alter database recover managed standby database finish force
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 64-66
问题原因
由于某种原因,主备库之间不同步,备库缺失不分归档,导致failover的时候报错。
下面演示以下3种情况下的failover能否成功,顺便解决此问题。
正常切换
备库查询
1、查询角色
SQL> SELECT database_role role, name, db_unique_name, platform_id,
2 open_mode, log_mode, flashback_on, protection_mode, protection_level
3 FROM v$database;
ROLE NAME DB_UNIQUE_NAME PLATFORM_ID OPEN_MODE LOG_MODE FLASHBACK_ON PROTECTION_MODE PROTECTION_LEVEL
---------------- --------- ------------------------------ ----------- -------------------- ------------ ------------------ -------------------- --------------------
PHYSICAL STANDBY ZHUO zhuodg 13 READ ONLY WITH APPLY ARCHIVELOG NO MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
1 row selected.
2、查看主备同步状态(可以有多种方法)
SQL> SELECT * FROM v$dataguard_stats WHERE name LIKE '%lag%';
NAME VALUE UNIT TIME_COMPUTED DATUM_TIME
-------------------------------- ---------------------------------------------------------------- ------------------------------ ------------------------------ ------------------------------
transport lag +00 00:00:00 day(2) to second(0) interval 05/09/2020 16:10:37 05/09/2020 16:10:36
apply lag +00 00:00:00 day(2) to second(0) interval 05/09/2020 16:10:37 05/09/2020 16:10:36
2 rows selected.
SQL> !date -R
Sat, 09 May 2020 16:10:45 +0800
3、检查归档日志是否连续
查询待转换standby数据库的V
a
r
c
h
i
v
e
g
a
p
视
图
,
确
认
归
档
文
件
是
否
连
续
:
S
Q
L
>
s
e
l
e
c
t
∗
f
r
o
m
v
archive_gap视图,确认归档文件是否连续: SQL> select * from v
archivegap视图,确认归档文件是否连续:SQL>select∗fromvarchive_gap;
no rows selected
如果有返回记录,按照列出的记录号复制对应的归档文件到待转换的standby服务器。这一步非常重要,必须确保所有已生成的归档文件均已存在于standby服务器,不然可能会因为数据不一致造成转换时报错。
具体报错和解决方法下面会指出。
4、检查归档文件是否完整
SQL> SELECT al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
2 FROM (select thread# thrd, MAX(sequence#) almax
3 FROM v$archived_log
4 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) al,
5 (SELECT thread# thrd, MAX(sequence#) lhmax
6 FROM v$log_history
7 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) lh
8 WHERE al.thrd = lh.thrd;
Thread Last Seq Received Last Seq Applied
---------- ----------------- ----------------
1 66 66
用该语句取得当前数据库各线程已归档文件最大序列号,如果primary和standby最大序列号不相同,必须将多出的序号对应的归档日志复制到待转换的standby服务器。以上相同,不做处理。
5、启动failover
SQL> alter database recover managed standby database finish force;
Database altered.
force关键字将会停止当前活动的RFS进程,以便立即执行failover
6、切换standby为primary
SQL> alter database commit to switchover to primary;
Database altered.
7、启动新的primary数据库
SQL> alter database open;
Database altered.
8、查看状态
SQL> SELECT database_role role, name, db_unique_name, platform_id,
2 open_mode, log_mode, flashback_on, protection_mode, protection_level
3 FROM v$database;
ROLE NAME DB_UNIQUE_NAME PLATFORM_ID OPEN_MODE LOG_MODE FLASHBACK_ON PROTECTION_MODE PROTECTION_LEVEL
---------------- --------- ------------------------------ ----------- -------------------- ------------ ------------------ -------------------- --------------------
PRIMARY ZHUO zhuodg 13 MOUNTED ARCHIVELOG NO MAXIMUM PERFORMANCE UNPROTECTED
1 row selected.
有gap的切换
1、查询是否有日志缺失
SQL> select * from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
-------- ------------- --------------
1 67 68
SQL> SELECT al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
2 FROM (select thread# thrd, MAX(sequence#) almax
3 FROM v$archived_log
4 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) al,
5 (SELECT thread# thrd, MAX(sequence#) lhmax
6 FROM v$log_history
7 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) lh
8 WHERE al.thrd = lh.thrd;
Thread Last Seq Received Last Seq Applied
---------- ----------------- ----------------
1 69 66
1 row selected.
有gap,启动failover报错:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 67-68
以上standby端有日志缺失,丢失了67-68号归档日志。所以failover就会报错ORA-16171.
可以解决gap
1、从其他地方拷贝缺失的归档到备库,然后进行注册
SQL> alter database register logfile '/home/oracle/o1_mf_1_67_hcdshnhm_.arc';
Database altered.
SQL> alter database register logfile '/home/oracle/o1_mf_1_68_hcdsj7r9_.arc';
Database altered.
SQL> SELECT al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
2 FROM (select thread# thrd, MAX(sequence#) almax
3 FROM v$archived_log
4 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) al,
5 (SELECT thread# thrd, MAX(sequence#) lhmax
6 FROM v$log_history
7 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) lh
8 WHERE al.thrd = lh.thrd;
Thread Last Seq Received Last Seq Applied
---------- ----------------- ----------------
1 69 66
1 row selected.
由于上面执行过一次failover,报错,mrp进程异常终止,所以重启mrp进程,应用下日志。
2、启动mrp进程,检查日志是否应用完成
SQL> recover managed standby database using current logfile disconnect;
Media recovery complete.
SQL> SELECT al.thrd "Thread", almax "Last Seq Received", lhmax "Last Seq Applied"
2 FROM (select thread# thrd, MAX(sequence#) almax
3 FROM v$archived_log
4 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) al,
5 (SELECT thread# thrd, MAX(sequence#) lhmax
6 FROM v$log_history
7 WHERE resetlogs_change#=(SELECT resetlogs_change# FROM v$database) GROUP BY thread#) lh
8 WHERE al.thrd = lh.thrd;
Thread Last Seq Received Last Seq Applied
---------- ----------------- ----------------
1 69 69
1 row selected.
SQL> select * from v$archive_gap;
no rows selected
3、执行切换
SQL> alter database recover managed standby database finish force;
Database altered.
SQL> alter database commit to switchover to primary;
Database altered.
SQL> alter database open;
Database altered.
解决不了gap
1、取消mrp进程
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
Database altered.
2、启用Activate standby database功能
SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;
Database altered.
3、open数据库
SQL> alter database open;
Database altered.
SQL> SELECT database_role role, name, db_unique_name, platform_id,
2 open_mode, log_mode, flashback_on, protection_mode, protection_level
3 FROM v$database;
ROLE NAME DB_UNIQUE_NAME PLATFORM_ID OPEN_MODE LOG_MODE FLASHBACK_ON PROTECTION_MODE PROTECTION_LEVEL
---------------- --------- ------------------------------ ----------- -------------------- ------------ ------------------ -------------------- --------------------
PRIMARY ZHUO zhuodg 13 READ WRITE ARCHIVELOG NO MAXIMUM PERFORMANCE MAXIMUM PERFORMANCE
1 row selected
总结
参考
How to Perform Failover When GAP on Standby (Doc ID 846087.1) To BottomTo Bottom
In this Document
Goal
Solution
APPLIES TO:
Oracle Database - Enterprise Edition - Version 10.2.0.1 to 11.2.0.3 [Release 10.2 to 11.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Information in this document applies to any platform.
Checked for relevance on 27-Sep-2012
Checked for relevance on 5-Aug-2014
Checked for relevance on 10-Dec-2015
GOAL
How to Perform Failover When GAP on Standby?
If GAP is present then failover will fail with:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE
*
ERROR at line 1:
ORA-00283: recovery session canceled due to errors
ORA-16171: RECOVER…FINISH not allowed due to gap for thr 1, seq 4398-4398
SOLUTION
There are two ways to do this.
-
Resolve the GAP
-
Without resolving GAP
-
Resolve the GAP
a. If primary server is up then copy the missing log files to standby and then register them.
SQL> ALTER DATABASE REGISTER LOGFILE ‘’;
example:
SQL> ALTER DATABASE REGISTER LOGFILE ‘/tmp/arc_1652321.arc’;
b. Finish the recovery
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH FORCE;
c. Perform the failover
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
d. If the physical standby database has been opened read-only since the instance was started, perform this step. Otherwise, skip to next step.
SQL> STARTUP MOUNT FORCE;
e. Set the protection mode to maximum performance and open the database for read/write access:
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;
SQL> ALTER DATABASE OPEN;
- Without resolving GAP.
a. Cancel managed recovery if it is working:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
If managed recovery is not working then this command will fails with:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL
*
ERROR at line 1:
ORA-16136: Managed Standby Recovery not active
b. Activate standby database:
SQL> ALTER DATABASE ACTIVATE STANDBY DATABASE;
Please be aware that you may have to recreate the standby database after activating the former standby database!
For more details please see
Oracle® Data Guard
Concepts and Administration
10g Release 2 (10.2)
8.4 Recovering Through the OPEN RESETLOGS Statement
c. If the physical standby database has been opened read-only since the instance was started, perform this step. Otherwise, skip to next step.
SQL> STARTUP MOUNT FORCE;
d. Set the protection mode to maximum performance and open the database for read/write access:
SQL> ALTER DATABASE SET STANDBY DATABASE TO MAXIMIZE PERFORMANCE;
SQL> ALTER DATABASE OPEN;