DG_broker已经配置完了,可是还是需要我们进行手动切换,如果我们想让数据库自动切换,那么还需要配置FSFO了。
FSFO 将闪回数据库用作将故障主数据库恢复为备用数据库流程的一部分,所以我们需要启用数据库的闪回功能:
Flashback Database 闪回数据库功能测试
1.使用Flashback Database的前提条件
1)启用了flashback database
2)必须打开flash recovery area,若为RAC,flash recovery area必须位于共享存储中。
3)必须处于archivelog模式,开启FORCE LOGGING
一.开启数据库的闪回功能
(1)开启主库的闪回功能
SQL> ALTER SYSTEM SET db_recovery_file_dest_size=2g SCOPE=BOTH sid='*';
System altered.
SQL> ALTER SYSTEM SET db_recovery_file_dest='+arch' SCOPE=BOTH sid='*';
System altered.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
[oracle@rac1 ~]$ srvctl stop database -d orcl
[oracle@rac1 ~]$ srvctl start database -d orcl -o mount
[oracle@rac1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 18 10:18:13 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> alter database flashback on;
Database altered.
SQL> alter database open;
Database altered.
(2)开启备库的闪回功能
从数据库在启用数据库闪回之前,需要关闭日志应用功能
DGMGRL> edit database standby set state=APPLY-OFF;
Succeeded.
DGMGRL> exit
SQL> alter system set db_recovery_file_dest='+arch02' scope=both sid='*';
System altered.
SQL> ALTER SYSTEM SET db_recovery_file_dest_size=2g SCOPE=BOTH sid='*';
System altered.
[oracle@1 ~]$ srvctl stop database -d orcl
[oracle@1 ~]$ srvctl start database -d orcl -o mount
[oracle@1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 18 10:18:13 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> alter database flashback on;
Database altered.
SQL> alter database open;
Database altered.
开启从库日志应用功能
[oracle@rac1 ~]$ dgmgrl sys/oracle@orcl1
DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
DGMGRL> edit database standby set state=APPLY-ON;
Succeeded.
二.设置故障切换目标
启用了主备库的闪回功能后,我们就可以启动FSFO了,登录dgmgrl主机连接主库:
(1).orcl的故障转移目标为standby,反过来也需要设置,可以反复的故障转移。
DGMGRL> edit database orcl set property FastStartFailoverTarget='standby';
Property "faststartfailovertarget" updated
DGMGRL> edit database standby set property FastStartFailoverTarget='orcl';
Property "faststartfailovertarget" updated
DGMGRL> enable FAST_START FAILOVER
Enabled.
DGMGRL> SHOW CONFIGURATION
Configuration - htz
Protection Mode: MaxAvailability
Databases:
orcl - Primary database
Warning: ORA-16819: fast-start failover observer not started
standby - (*) Physical standby database
Warning: ORA-16819: fast-start failover observer not started
Fast-Start Failover: ENABLED
Configuration Status:
WARNING
三.启动观察期observer
可是我们发现DGMGRL告警了,那是因为我们没有启动观察器(observer)的原因,那我们接下来就启动观察器吧!(由于observer的启动会一直占用session 窗口的,所以建议写成脚本挂后台)
登录dgmgrl主机执行以下命令:
编写脚本,内容如下
Vi observer.sh
nohup dgmgrl sys/oracle@ogg2 "start observer file=FSFO.dat" >>fsfo.log 2>&1 &
Chmod u+x observer.sh
./observer.sh
启动observer后,我们再看一下配置状态
DGMGRL> show configuration
Configuration - htz
Protection Mode: MaxAvailability
Databases:
orcl - Primary database
standby - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
四.修改延迟参数
注意:
DGMGRL> edit database orcl set property DelayMins=0;
DGMGRL> edit database standby set property DelayMins=0;
这两个的值要为0,不能为1,否者导致主备不能实时同步,、
fast-start failover,observer都会出现问题
这样我们的FSFO就配置完成了,下面我们模拟主库宕机后,FSFO的切换
五.模拟主库断电或者故障
[oracle@rac2 ~]$ srvctl stop database -d orcl -o abort
查看观察器的日志
[oracle@rac1 ~]$ more fsfo.log
DGMGRL for Linux: Version 11.2.0.4.0 - 64bit Production
Copyright (c) 2000, 2009, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected.
Observer started
11:16:00.81 Thursday, October 18, 2018
Initiating Fast-Start Failover to database "standby"...
Performing failover NOW, please wait...
Failover succeeded, new primary is "standby"
11:16:41.95 Thursday, October 18, 2018
(1)查看standby端数据库的状态,会发现已经变成主库
SQL> select db_unique_name,database_role,switchover_status,open_mode from gv$database;
DB_UNIQUE_NAME DATABASE_ROLE SWITCHOVER_STATUS OPEN_MODE
------------------------------ ---------------- -------------------- --------------------
standby PRIMARY TO STANDBY READ WRITE
standby PRIMARY SESSIONS ACTIVE READ WRITE
(2)我们重新正常启动主库orcl,会发现orcl的状态是备库,standby备库成为主库
[oracle@rac2 ~]$ srvctl start database -d orcl
查看观察器observer的日志
11:20:48.51 Thursday, October 18, 2018
Initiating reinstatement for database "orcl"...
Reinstating database "orcl", please wait...
Operation requires shutdown of instance "orcl2" on database "orcl"
Shutting down instance "orcl2"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "orcl2" on database "orcl"
Starting instance "orcl2"...
ORACLE instance started.
Database mounted.
Continuing to reinstate database "orcl" ...
Reinstatement of database "orcl" succeeded
[oracle@rac2 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 18 11:20:42 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select db_unique_name,database_role,switchover_status,open_mode,inst_id from gv$database;
DB_UNIQUE_NAME DATABASE_ROLE SWITCHOVER_STATUS
------------------------------ ---------------- --------------------
OPEN_MODE INST_ID
-------------------- ----------
orcl PHYSICAL STANDBY NOT ALLOWED
READ ONLY 2
orcl PHYSICAL STANDBY NOT ALLOWED
READ ONLY 1
从dgmgrl中我们也能看到问题,日志应用已经停止
DGMGRL> show configuration
Configuration - htz
Protection Mode: MaxAvailability
Databases:
standby - Primary database
orcl - (*) Physical standby database
Error: ORA-16766: Redo Apply is stopped
Fast-Start Failover: ENABLED
Configuration Status:
ERROR
解决.Error: ORA-16766: Redo Apply is stopped
数据库alert的日志
Errors in file /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/orcl1_mrp0_26206.trc:
ORA-01237: cannot extend datafile 5
ORA-01110: data file 5: '+DATA/orcl/datafile/users.259.986827373'
ORA-17505: ksfdrsz:1 Failed to resize file to size 3040 blocks
ORA-15041: diskgroup "DATA" space exhausted
Managed Standby Recovery not using Real Time Apply
(1)Step 1 : DGMGRL> disable configuration;
(2)从主库备份控制文件
[oracle@standby2 dbs]$ rman target /
Recovery Manager: Release 11.2.0.4.0 - Production on Thu Oct 18 12:47:53 2018
Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.
connected to target database: ORCL (DBID=1514608050)
RMAN> backup current controlfile for standby format '/u01/app/oracle/stdbyctl.bkp';
Starting backup at 18-OCT-18
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=78 instance=orcl2 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
including standby control file in backup set
channel ORA_DISK_1: starting piece 1 at 18-OCT-18
channel ORA_DISK_1: finished piece 1 at 18-OCT-18
piece handle=/u01/app/oracle/stdbyctl.bkp tag=TAG20181018T124840 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 18-OCT-18
(3)第二步备份user表空间
SQL> alter tablespace users begin backup;
Tablespace altered.
ASMCMD> cp users.dbf /home/grid
copying +data02/standby/datafile/users.dbf -> /home/grid/users.dbf
SQL> alter tablespace users end backup;
Tablespace altered.
(4)将备份的控制文件和users复制到备库
(5)在备库恢复控制文件和数据文件
将数据库关闭,在一个节点上将实例启动到nomount状态
restore standby controlfile to '+DATA/orcl/controlfile/control01.ctl' from '/u01/app/oracle/stdbyctl.bkp';
restore standby controlfile to '+DATA//controlfile/control01.ctl' from '/u0'/u01/app/oracle/stdbyctl.bkp'
恢复控制到文件后,将实例启动到mount状态下
rman>report schema;
1 0 SYSTEM *** '+DATA/orcl/datafile/system.262.987071533
2 0 SYSAUX *** +DATA/orcl/datafile/sysaux.260.987071533
3 0 UNDOTBS1 *** +DATA/orcl/datafile/undotbs1.265.987071533
4 0 UNDOTBS2 *** +DATA/orcl/datafile/undotbs2.264.987071535
5 0 USERS *** +DATA/orcl/datafile/users.270.987071535
List of Temporary Files
=======================
File Size(MB) Tablespace Maxsize(MB) Tempfile Name
---- -------- -------------------- ----------- --------------------
1 63 TEMP 32767
注意:这里面的数据文件的名称是“现在”主库的数据库文件名称,是因为当时配置adg的时候没有将备库和主库的数据文件名称设置一样,导致,所以还要做一步数据文件名称的rename
SQL> alter system set standby_file_management=manual scope=both sid='*';
System altered.
SQL> alter database rename file '+DATA/orcl/datafile/system.262.987071533' to '+DATA/orcl/datafile/SYSAUX.268.986827339';
Database altered.
SQL> alter database rename file '+DATA/orcl/datafile/sysaux.260.987071533' to '+DATA/orcl/datafile/SYSTEM.262.986827323';
Database altered.
SQL> alter database rename file '+DATA/orcl/datafile/undotbs1.265.987071533' to '+DATA/orcl/datafile/UNDOTBS1.263.986827351';
Database altered.
SQL> alter database rename file '+DATA/orcl/datafile/undotbs2.264.987071535' to '+DATA/orcl/datafile/UNDOTBS2.266.986827367';
Database altered.
SQL> alter database rename file '+DATA/orcl/datafile/users.270.987071535' to '+DATA/orcl/datafile/USERS.259.986827373';
Database altered.
SQL> alter system set standby_file_management= scope=both sid='*';
System altered.
RMAN> catalog datafilecopy
'+DATA/orcl/datafile/SYSAUX.268.986827339';
RMAN> catalog datafilecopy
'+DATA/orcl/datafile/SYSTEM.262.986827323';
RMAN> catalog datafilecopy
'+DATA/orcl/datafile/UNDOTBS1.263.986827351';
RMAN> catalog datafilecopy
'+DATA/orcl/datafile/UNDOTBS2.266.986827367';
'+DATA/orcl/datafile/USERS.259.986827373';
RMAN> switch database to copy;
datafile 1 switched to datafile copy '+DATA/orcl/datafile/SYSAUX.268.986827339';
datafile 2 switched to datafile copy '+DATA/orcl/datafile/SYSTEM.262.986827323';
datafile 3 switched to datafile copy '+DATA/orcl/datafile/UNDOTBS1.263.986827351';
datafile 4 switched to datafile copy '+DATA/orcl/datafile/UNDOTBS2.266.986827367';
datafile 5 switched to datafile copy '+DATA/orcl/datafile/USERS.259.986827373';
在orcl库上清理所有的日志组
SQL> select group# from v$log;
GROUP#
----------
1
2
3
SQL> alter database clear logfile group 1;
Database altered.
SQL> alter database clear logfile group 2;
Database altered.
SQL> alter database clear logfile group 3;
Database altered.
J. Query v$standby_log and clear all standby redo logs
SQL> select group# from v$standby_log;
GROUP#
----------
4
5
6
SQL> alter database clear logfile group 4;
Database altered.
SQL> alter database clear logfile group 5;
Database altered.
SQL> alter database clear logfile group 6;
Database altered.
SQL> select group# from v$standby_log;
no row selected
SQL> alter database add standby logfile group 4 size 50m;
Database altered.
SQL> alter database add standby logfile group 5 size 50m;
Database altered.
SQL> alter database add standby logfile group 6 size 50m;
Database altered.
开启恢复模式,并检查orcl数据库的状态,可以看出orcl作为备库重新应用日志
SQL> Alter database recover managed standby database using current logfile disconnect from session;
Database altered.
SQL> select inst_id,db_unique_name,database_role,open_mode,switchover_status from gv$database;
INST_ID DB_UNIQUE_NAME DATABASE_ROLE OPEN_MODE
---------- ------------------------------ ---------------- --------------------
SWITCHOVER_STATUS
--------------------
1 orcl PHYSICAL STANDBY READ ONLY WITH APPLY
NOT ALLOWED
2 orcl PHYSICAL STANDBY READ ONLY WITH APPLY
NOT ALLOWED
在dgmgrl工具检查
DGMGRL> show configuration
Configuration - htz
Protection Mode: MaxAvailability
Databases:
standby - Primary database
orcl - (*) Physical standby database
Fast-Start Failover: ENABLED
Configuration Status:
SUCCESS
报错:ORA-16766: Redo Apply is stopped也不在了
查看主库和备库两边的归档情况
主库
[oracle@standby1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 18 14:47:21 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select * from v$archive_gap;
no rows selected
备库
[oracle@rac1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Thu Oct 18 15:04:41 2018
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> select * from v$archive_gap;
no rows selected