oracle data guard Failover切换(官方文档方案)

oracle data guard Failover切换(官方文档方案)

步骤1:将主库中所有未发送的归档日志和redo日志刷新到目标备库。

如果可以mount主库,则可以将主库中任何未发送的存档日志和current redo日志刷新到备库。
如果此操作成功,即使主库没有使用零数据丢失数据保护模式,也可以进行零数据丢失故障转移。
确保Redo Apply在目标备库中处于活动状态。
mount,但不要open主库。
如果无法mount主库,请转到步骤2。

在主库中执行以下SQL语句:
SQL> ALTER SYSTEM FLUSH REDO TO ‘target_db_name’;
(注:target_db_name备库的db_unique_name,需要用引号引起。)

此SQL语句将所有未发送的redo日志从主库刷新到备库,并等待该redo应用到备库。
如果此语句完成时没有错误,请转到步骤5。
如果此语句完成时出现错误,或者由于无法再等待语句完成而必须停止该语句,请继续执行步骤2。

步骤2:验证备库是否有每个主库redo线程最新的存档日志文件。

查询备库上的VKaTeX parse error: Expected 'EOF', got '#' at position 62: …T UNIQUE THREAD#̲ AS THREAD, MAX…ARCHIVED_LOG;

从主库复制归档日志到备库,并注册
SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE ‘归档日志’;

步骤3:检查,并解决存在的归档日志GAP。

查询备库上的V$ARCHIVE_GAP视图,以检查备库上是否存在归档日志GAP。
SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;

如果存在归档日志GAP,则从主库复制归档日志到备库,并注册
SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE ‘归档日志’;

步骤4:重复步骤3,直到备库上所有的归档日志GAP都消除。

在解决归档日志GAP后,必须重复查询,直到返回no rows selected。(no rows selected,表示没有归档日志GAP。)
如果在执行步骤2~步骤4之后,无法解决归档日志GAP(例如,因为您无权访问失败的主数据库的系统),则在failover切换期间将发生某些数据丢失。

步骤5:停止redo日志应用。

在备库中执行以下SQL语句:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

步骤6:完成应用所有接收到的redo日志数据。

在备库中执行以下SQL语句:
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH;

如果此语句完成时没有任何错误,请继续执行步骤7。
如果发生错误,则是备库没有应用某些接收到的redo数据。在继续下一步骤之前,请尝试解决错误并重新执行该语句。

请注意,如果在步骤3和步骤4中存在没有解决的归档日志GAP问题,您将收到一个错误,提示存在归档日志GAP。
如果该错误无法解决,仍可以通过在备库上执行以下SQL语句来进行failover故障转移(但是会丢失一些数据):
SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;
当ACTIVATE语句完成时,请继续执行步骤9。

步骤7:验证备库是否已准备好成为主库。

查询备库上V$DATABASE视图中的SWITCHOVER_STATUS。
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

如果SWITCHOVER_STATUS的值为TO PRIMARY或SESSIONS ACTIVE,表示备库已准备好切换到主角色。
如果这两个值都没有返回,请验证备库中Redo Apply是否处于活动状态,并继续查询该视图,直到返回TO PRIMARY或SESSIONS ACTIVE。

步骤8:将备库切换到主角色。

在备库中执行以下SQL语句:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;

注:如果在步骤7中执行的SWITCHOVER_STATUS查询返回值为TO PRIMARY,则可以省略switchover语句中的WITH SESSION SHUTDOWN。

步骤9:打开新的主数据库。

SQL> ALTER DATABASE OPEN;

步骤10:备份新的主数据库。

对新的主数据库进行完全备份。

--------------------------以下为在vmware测试步骤----------------------------------

实践步骤:

1、初始状态

master库:
SYS@master> select database_role,switchover_status,protection_mode from v$database;

DATABASE_ROLE	 SWITCHOVER_STATUS    PROTECTION_MODE
---------------- -------------------- --------------------
PRIMARY 	 TO STANDBY	      MAXIMUM PERFORMANCE


初始数据:
SYS@master> select * from wen;

NAME	   TIME
---------- ---------
king	   16-MAR-20

slave库:
SYS@slave> select database_role,switchover_status,protection_mode from v$database;

DATABASE_ROLE	 SWITCHOVER_STATUS    PROTECTION_MODE
---------------- -------------------- --------------------
PHYSICAL STANDBY NOT ALLOWED	      MAXIMUM PERFORMANCE


初始数据:
SYS@slave> select * from wen;

NAME	   TIME
---------- ---------
king	   16-MAR-20

2、制造数据差异

2.1、关闭slave网卡
[root@slave ~]# systemctl stop network
[root@slave ~]# ping master
connect: Network is unreachable

master的日志报错:
Mon Mar 16 02:49:36 2020
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Error 16198 for archive log file 3 to 'slave'
ORA-16198: LGWR received timedout error from KSR
LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_2 standby host 'slave'  ----网络已断开
2.2、master库插入数据,并切redo日志,最后shutdown abort
SYS@master> insert into wen values('before-1',sysdate);

1 row created.

SYS@master> commit;

Commit complete.

SYS@master> alter system switch logfile;

System altered.

SYS@master> insert into wen values('before-2',sysdate);

1 row created.

SYS@master> commit;

Commit complete.

SYS@master> alter system switch logfile;

System altered.

SYS@master> insert into wen values('after-1',sysdate);

1 row created.

SYS@master> commit;

Commit complete.

SYS@master> shu abort;
ORACLE instance shut down.

master库的alter日志:
Mon Mar 16 02:53:08 2020
Shutting down instance (abort)
License high water mark = 9
USER (ospid: 49190): terminating the instance
Instance terminated by USER, pid = 49190
Mon Mar 16 02:53:09 2020
Instance shutdown complete   ----master库已shutdown abort
2.3、开启slave的网卡,并确认表 wen 的数据
--开启网卡
[root@slave ~]# systemctl start network
[root@slave ~]# ping master
PING master (192.168.189.39) 56(84) bytes of data.
64 bytes from master (192.168.189.39): icmp_seq=1 ttl=64 time=1.24 ms
64 bytes from master (192.168.189.39): icmp_seq=2 ttl=64 time=0.664 ms
^C
--- master ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.664/0.954/1.245/0.292 ms
--确认表wen的数据
SYS@slave> select * from wen;

NAME	   TIME
---------- ---------
king	   16-MAR-20

----slave库和master库相差数据;
----相差的三条数据:两条数据在归档日志,一条数据在current redo日志。

3、failover切换,并使用flush功能

3.1、使用flush将未发送的归档日志、redo日志同步到slave库
--master开启mount状态
SYS@master> startup mount;
ORACLE instance started.

Total System Global Area  605450240 bytes
Fixed Size		    2255632 bytes
Variable Size		  398460144 bytes
Database Buffers	  197132288 bytes
Redo Buffers		    7602176 bytes
Database mounted.

--使用flush
SYS@master> alter system flush redo to 'slave';

System altered.

master的alert日志:

Completed: ALTER DATABASE   MOUNT
Mon Mar 16 02:59:57 2020
ALTER SYSTEM FLUSH REDO TO 'slave' CONFIRM APPLY
ALTER SYSTEM FLUSH REDO TO slave CONFIRM APPLY [Process Id: 50824] (master)
ARCH: STARTING ARCH PROCESSES
Mon Mar 16 02:59:57 2020
ARC0 started with pid=22, OS id=50860 
ARC0: Archival started
ARCH: STARTING ARCH PROCESSES COMPLETE
ARC0: STARTING ARCH PROCESSES
Flush redo: No wait for non-current ORLs to be archived
Waiting for all FAL entries to be archived...
All FAL entries have been archived.
Waiting for dest_id 2 to become synchronized...
Mon Mar 16 02:59:58 2020
ARC1 started with pid=23, OS id=50863 
Mon Mar 16 02:59:58 2020
ARC2 started with pid=24, OS id=50865 
Mon Mar 16 02:59:58 2020
ARC3 started with pid=25, OS id=50867 
ARC1: Archival started
ARC2: Archival started
ARC1: Becoming the 'no FAL' ARCH
ARC1: Becoming the 'no SRL' ARCH
ARC2: Becoming the heartbeat ARCH
ARC3: Archival started
ARC0: STARTING ARCH PROCESSES COMPLETE
ARC0: Standby redo logfile selected for thread 1 sequence 41 for destination LOG_ARCHIVE_DEST_2
Active, synchronized flush redo target has been identified
Managed Real Time Apply recovery running at physical standby 'LOG_ARCHIVE_DEST_2'
Flush End-Of-Redo Log thread 1 sequence 43 has been fixed
Flush Redo: Primary highest seen SCN set to 0x0.0x3472a9
ARCH: Noswitch archival of thread 1, sequence 43
ARCH: End-Of-Redo Branch archival of thread 1 sequence 43
ARCH: LGWR is scheduled to archive destination LOG_ARCHIVE_DEST_2 after log switch
ARCH: Standby redo logfile selected for thread 1 sequence 43 for destination LOG_ARCHIVE_DEST_2
Flush End-Of-Redo Log thread 1 sequence 43
Archived Log entry 64 added for thread 1 sequence 43 ID 0xb887939c dest 1:
ARCH: Archiving is disabled due to current logfile archival
Primary will wait for slave standby to have applied all redo
Final check for a target standby that has recovered all redo. Check will be made a few times.
LOG_ARCHIVE_DEST_2 is a potential flush redo target
LOG_ARCHIVE_DEST_2 is a potential flush redo target
LOG_ARCHIVE_DEST_2 has also applied all redo from primary
Active, synchronized target has been identified that has applied all the redo from the primary.   ----slave库已应用所有redo日志
Flush Redo: Primary redo moved to standby
Mon Mar 16 03:00:58 2020
ARC1: Archiving disabled



slave的alert日志:
Archived Log entry 103 added for thread 1 sequence 40 ID 0xb887939c dest 1:
Mon Mar 16 02:41:32 2020
Media Recovery Log /u01/app/arch/slave/1_39_1034744850.dbf
Media Recovery Log /u01/app/arch/slave/1_40_1034744850.dbf
Media Recovery Waiting for thread 1 sequence 41 (in transit)
Recovery of Online Redo Log: Thread 1 Group 4 Seq 41 Reading mem 0
  Mem# 0: /u01/app/oracle/oradata/slavelog/sla_redo04.log
Mon Mar 16 02:59:59 2020
Killing 1 processes with pids 57050 (idle RFS by thread/sequence) in order to allow the in-transit gap to be queued. Requested by OS process 58635
Mon Mar 16 03:00:01 2020
RFS[4]: Assigned to RFS process 58664
RFS[4]: Selected log 4 for thread 1 sequence 41 dbid -1211978201 branch 1034744850
Mon Mar 16 03:00:01 2020
RFS[5]: Assigned to RFS process 58662
RFS[5]: Opened log for thread 1 sequence 42 dbid -1211978201 branch 1034744850
Archived Log entry 104 added for thread 1 sequence 42 rlc 1034744850 ID 0xb887939c dest 2:
Mon Mar 16 03:00:02 2020
Archived Log entry 105 added for thread 1 sequence 41 ID 0xb887939c dest 1:
Mon Mar 16 03:00:02 2020
Media Recovery Log /u01/app/arch/slave/1_42_1034744850.dbf
Media Recovery Waiting for thread 1 sequence 43
Mon Mar 16 03:00:03 2020
RFS[6]: Assigned to RFS process 58667
RFS[6]: Selected log 4 for thread 1 sequence 43 dbid -1211978201 branch 1034744850
Mon Mar 16 03:00:03 2020
Archived Log entry 106 added for thread 1 sequence 43 ID 0xb887939c dest 1:
Mon Mar 16 03:00:03 2020
Standby switchover readiness check: Checking whether recoveryapplied all redo..
Database not available for switchover
  End-Of-REDO archived log file has not been recovered
  Incomplete recovery SCN:0:3417156 archive SCN:0:3437225
Physical Standby did not apply all the redo from the primary.
Media Recovery Log /u01/app/arch/slave/1_43_1034744850.dbf
Identified End-Of-Redo (move redo) for thread 1 sequence 43 at SCN 0x0.3472a9
Resetting standby activation ID 3095892892 (0xb887939c)
Media Recovery Waiting for thread 1 sequence 44
Mon Mar 16 03:00:04 2020
Standby switchover readiness check: Checking whether recoveryapplied all redo..
Physical Standby applied all the redo from the primary.    ----slave库已应用所有redo日志
3.2、slave库停止redo日志应用
SYS@slave> alter database recover managed standby database cancel;

Database altered.
3.3、slave库完成所有redo日志应用
SYS@slave> alter database recover managed standby database finish;

Database altered.
3.4、验证slave库是否已准备好成为主库
SYS@slave> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
TO PRIMARY
3.5、将slave库切换到主角色
SYS@slave> alter database commit to switchover to primary;

Database altered.
3.6、打开新的主数据库
SYS@slave> alter database open;

Database altered.

4、确认状态及数据

--确认状态
SYS@slave> select database_role,switchover_status,protection_mode from v$database;

DATABASE_ROLE	 SWITCHOVER_STATUS    PROTECTION_MODE
---------------- -------------------- --------------------
PRIMARY 	 FAILED DESTINATION   MAXIMUM PERFORMANCE

--确认数据
SYS@slave> select * from wen;

NAME	   TIME
---------- ---------
king	   16-MAR-20
before-1   16-MAR-20
before-2   16-MAR-20
after-1    16-MAR-20

–使用flush功能的failover切换完成–

总结

1、官方文档对failover切换操作描述很细致,能应对日常的数据库维护。
2、缺少对主库中redo数据的操作描述,(虽然有flush功能,但主库需要mount)
3、后续将讨论在failover切换时,如何正确处理主库中的redo数据。

谨记:心存敬畏,行有所止。

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值