oracle data guard failover 切换(手动处理redo)

22 篇文章 0 订阅

oracle data guard failover 切换(手动处理redo)

1 初始状态

master库:
select database_role,switchover_status,protection_mode from v$database;

DATABASE_ROLE	 SWITCHOVER_STATUS    PROTECTION_MODE
---------------- -------------------- --------------------
PRIMARY 	 TO STANDBY	      MAXIMUM PERFORMANCE

初始数据
SYS@master> select * from wen;

NAME	   TIME
---------- ---------
king	   09-MAR-20


slave库:
SYS@slave> select database_role,switchover_status,protection_mode from v$database;

DATABASE_ROLE	 SWITCHOVER_STATUS    PROTECTION_MODE
---------------- -------------------- --------------------
PHYSICAL STANDBY NOT ALLOWED	      MAXIMUM PERFORMANCE

初始数据
SYS@slave> select * from wen;

NAME	   TIME
---------- ---------
king	   09-MAR-20

2 制造数据差异

2.1 关闭slave网卡

[root@slave ~]# systemctl stop network

[root@slave ~]# ping master
connect: Network is unreachable

master的日志报错:
Wed Mar 11 04:54:11 2020
ORA-16198: LGWR received timedout error from KSR
LGWR: Attempting destination LOG_ARCHIVE_DEST_2 network reconnect (16198)
LGWR: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Error 16198 for archive log file 3 to 'slave'
ORA-16198: LGWR received timedout error from KSR
LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_2 standby host 'slave'  ----网络已断开
Destination LOG_ARCHIVE_DEST_2 is UNSYNCHRONIZED
LGWR: Failed to archive log 3 thread 1 sequence 10 (16198)
Thread 1 advanced to log sequence 11 (LGWR switch)
  Current log# 1 seq# 11 mem# 0: /u01/app/oracle/oradata/master/redo01.log
Wed Mar 11 04:54:11 2020
Archived Log entry 9 added for thread 1 sequence 10 ID 0xb880a537 dest 1:

2.2 master库插入数据,并切redo日志,最后shutdown abort

SYS@master> insert into wen values('leo',sysdate);

1 row created.

SYS@master> commit;

Commit complete.

SYS@master> alter system switch logfile;

System altered.

SYS@master> insert into wen values('white',sysdate);

1 row created.

SYS@master> commit;

Commit complete.

SYS@master> alter system switch logfile;

System altered.

SYS@master> insert into wen values('neo',sysdate);

1 row created.

SYS@master> commit;

Commit complete.

SYS@master> select * from wen;

NAME	   TIME
---------- ---------
king	   09-MAR-20
leo	       09-MAR-20
white	   09-MAR-20
neo	       09-MAR-20

SYS@master> shu abort
ORACLE instance shut down.

2.3 开启slave的网卡,并确认表 wen 的数据。

--开网卡
[root@slave ~]# systemctl start network
[root@slave ~]# ping master
PING master (192.168.189.39) 56(84) bytes of data.
64 bytes from master (192.168.189.39): icmp_seq=1 ttl=64 time=1.07 ms
64 bytes from master (192.168.189.39): icmp_seq=2 ttl=64 time=0.648 ms
^C
--- master ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.648/0.863/1.078/0.215 ms

--确认数据
SYS@slave> select * from wen;

NAME	   TIME
---------- ---------
king	   09-MAR-20

slave库当前缺少数据。

3 手动注册归档,并使用redo做不完全恢复

3.1 查看slave的归档信息

SYS@slave> select max(sequence#) from v$archived_log;

MAX(SEQUENCE#)
--------------
	    14           --slave已经归档到14

SYS@slave> archive log list;
Database log mode	       Archive Mode
Automatic archival	       Enabled
Archive destination	       /u01/app/arch/slave
Oldest online log sequence     0
Next log sequence to archive   0
Current log sequence	       15  --slave当前redo为15

3.2 查看master的alert日志,确认master和slave相差的归档个数,及master库发生异常时的current redo

LGWR: Destination LOG_ARCHIVE_DEST_2 network reconnect abandoned
Error 16198 for archive log file 3 to 'slave'
ORA-16198: LGWR received timedout error from KSR
LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_2 standby host 'slave'
Mon Mar 09 23:17:09 2020
LGWR: Failed to archive log 3 thread 1 sequence 15 (16198)
Thread 1 advanced to log sequence 16 (LGWR switch)
  Current log# 1 seq# 16 mem# 0: /u01/app/oracle/oradata/master/redo01.log
Mon Mar 09 23:17:09 2020
Archived Log entry 15 added for thread 1 sequence 15 ID 0xb87f4fff dest 1:
Mon Mar 09 23:17:40 2020
Thread 1 advanced to log sequence 17 (LGWR switch)
  Current log# 2 seq# 17 mem# 0: /u01/app/oracle/oradata/master/redo02.log  --master的current redo为redo02.log
Mon Mar 09 23:17:40 2020
Archived Log entry 16 added for thread 1 sequence 16 ID 0xb87f4fff dest 1:  --master已经归档到16
Mon Mar 09 23:21:27 2020
WARN: ARC0: Terminating pid 118500 hung on an I/O operation
Mon Mar 09 23:21:28 2020
WARN: ARC3: Terminating pid 118500 hung on an I/O operation


master和slave相差1516号归档日志,master库异常关闭时的current redo为redo02.log

4 将相差的归档日志和current redo复制到slave库

--复制归档日志
[oracle@master master]$ scp 1_15_1034631553.dbf oracle@192.168.189.40:/home/oracle/wen
oracle@192.168.189.40's password: 
1_15_1034631553.dbf                           100%  675KB  54.0MB/s   00:00    
[oracle@master master]$ scp 1_16_1034631553.dbf oracle@192.168.189.40:/home/oracle/wen
oracle@192.168.189.40's password: 
1_16_1034631553.dbf                           100%   16KB   2.9MB/s   00:00    

--复制redo日志
[oracle@master master]$ cd /u01/app/oracle/oradata/master
[oracle@master master]$ scp redo02.log oracle@192.168.189.40:/home/oracle/wen
oracle@192.168.189.40's password: 
redo02.log                                    100%   50MB  50.0MB/s   00:01 

5 slave库注册归档日志(slave的状态为mount,否则归档日志不会应用)

5.1 注册归档日志

SYS@slave> alter database register logfile '/home/oracle/wen/1_15_1034631553.dbf';

Database altered.

SYS@slave> alter database register logfile '/home/oracle/wen/1_16_1034631553.dbf';

Database altered.

SYS@slave> alter database recover managed standby database cancel;

Database altered.

SYS@slave> alter database recover managed standby database disconnect from session;

Database altered.

注:
1、slave库的状态为mount; 
2、使用alter database recover managed standby database disconnect from session应用日志。


slave库的alert日志:

Tue Mar 10 01:44:43 2020
alter database recover managed standby database cancel
Tue Mar 10 01:44:43 2020
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/slave/slave/trace/slave_pr00_129684.trc:
ORA-16037: user requested cancel of managed recovery operation
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovered data files to a consistent state at change 2659224
Tue Mar 10 01:44:43 2020
MRP0: Background Media Recovery process shutdown (slave)
Managed Standby Recovery Canceled (slave)
Completed: alter database recover managed standby database cancel
Tue Mar 10 01:44:57 2020
alter database recover managed standby database disconnect from session
Attempt to start background Managed Standby Recovery process (slave)
Tue Mar 10 01:44:57 2020
MRP0 started with pid=25, OS id=10934 
MRP0: Background Managed Standby Recovery process started (slave)
 started logmerger process
Tue Mar 10 01:45:02 2020
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 4 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Waiting for thread 1 sequence 15
Completed: alter database recover managed standby database disconnect from session
Error 1034 received logging on to the standby
FAL[client, USER]: Error 1034 connecting to master for fetching gap sequence
Tue Mar 10 01:45:23 2020
alter database register logfile '/home/oracle/wen/1_15_1034631553.dbf'
There are 1 logfiles specified.
ALTER DATABASE REGISTER [PHYSICAL] LOGFILE
Completed: alter database register logfile '/home/oracle/wen/1_15_1034631553.dbf'
Tue Mar 10 01:45:33 2020
Media Recovery Log /home/oracle/wen/1_15_1034631553.dbf
Media Recovery Waiting for thread 1 sequence 16
Tue Mar 10 01:45:40 2020
alter database register logfile '/home/oracle/wen/1_16_1034631553.dbf'
There are 1 logfiles specified.
ALTER DATABASE REGISTER [PHYSICAL] LOGFILE
Resynchronizing thread 1 from sequence 15 to 16
Completed: alter database register logfile '/home/oracle/wen/1_16_1034631553.dbf'
Tue Mar 10 01:45:44 2020
Media Recovery Log /home/oracle/wen/1_16_1034631553.dbf
Media Recovery Waiting for thread 1 sequence 17
Tue Mar 10 01:45:57 2020
alter database recover managed standby database cancel
Tue Mar 10 01:45:57 2020
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/slave/slave/trace/slave_pr00_10940.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery interrupted!
Tue Mar 10 01:45:58 2020
MRP0: Background Media Recovery process shutdown (slave)
Managed Standby Recovery Canceled (slave)
Completed: alter database recover managed standby database cancel
Tue Mar 10 01:46:11 2020
alter database recover managed standby database disconnect from session
Attempt to start background Managed Standby Recovery process (slave)
Tue Mar 10 01:46:11 2020
MRP0 started with pid=25, OS id=11035 
MRP0: Background Managed Standby Recovery process started (slave)
 started logmerger process
Tue Mar 10 01:46:16 2020
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 4 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Waiting for thread 1 sequence 17
Completed: alter database recover managed standby database disconnect from session

5.2 查看slave的归档信息

SYS@slave> select max(sequence#) from v$archived_log;

MAX(SEQUENCE#)
--------------
	    16                        --slave库已经注册到16号归档

SYS@slave> archive log list;
Database log mode	       Archive Mode
Automatic archival	       Enabled
Archive destination	       /u01/app/arch/slave
Oldest online log sequence     0
Next log sequence to archive   0
Current log sequence	       16

6 使用master库的current redo进行不完全恢复(slave的状态为mount)

--slave库取消日志应用
SYS@slave> alter database recover managed standby database cancel;

Database altered.

--使用master库的current redo 进行不完全恢复
SYS@slave> recover standby database until cancel;
ORA-00279: change 2659662 generated at 03/09/2020 23:17:40 needed for thread 1
ORA-00289: suggestion : /u01/app/arch/slave/1_17_1034631553.dbf
ORA-00280: change 2659662 for thread 1 is in sequence #17


Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/home/oracle/wen/redo02.log                  --为从master库复制过来的current redo
Log applied.
Media recovery complete.

--进行failover故障转移
SYS@slave> alter database activate standby database;

Database altered.

--开库
SYS@slave> alter database open;

Database altered.

7 状态和数据确认

SYS@slave> select database_role,switchover_status,protection_mode from v$database;

DATABASE_ROLE	 SWITCHOVER_STATUS    PROTECTION_MODE
---------------- -------------------- --------------------
PRIMARY 	 FAILED DESTINATION   MAXIMUM PERFORMANCE


SYS@slave> select * from wen;

NAME	   TIME
---------- ---------
king	   09-MAR-20
leo	       09-MAR-20
white	   09-MAR-20
neo	       09-MAR-20

—failover切换完成—

总结:
1、slave库应用归档时要注意数据库状态是否为mount,且不能使用using current logfile方式应用日志;
2、failover切换后,通过删除master库来重建dg架构。(使用flash方式时,新备库不能应用日志,后续确认是否是个人原因)
3、参考了https://www.oschina.net/question/565065_126200。

谨记:心存敬畏,行有所止。

  • 1
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 2
    评论
评论 2
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值