10.2 Data Guard Physical Standby Switchover [ID 751600.1]

		Modified 15-DEC-2009 Type REFERENCE Status PUBLISHED

In this Document
Purpose
10.2 Data Guard Physical Standby Switchover
1.0 Prerequisites / Preparation
2.0 Pre-Switchover Checks
3.0 Switchover
4.0 Post-Switchover Steps
5.0 Create a Guaranteed Restore Point on Each Switchover Database
6. References

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.4
Information in this document applies to any platform.

Purpose

This note is intended as an accessory to the following resources:

This document is intended to be used as a basis for developing your own robust switchover procedure.

Oracle strongly recommends you apply the latest patchset/bundle patch for your version prior to proceeding.

10.2 Data Guard Physical Standby Switchover

1.0 Prerequisites / Preparation

These are items that should only have to be done once during configuration and setup.

1.1. Apply Latest Patch Bundle

Review the Document 466181.1 "10g Upgrade Companion", and make sure to check the “Patches Recommended” tab.
See Document 756671.1 for the latest available patches or patchset updates.
1.2. Setup Service Relocation for a Local Standby (optional)

See Data Guard Switchover and Failover MAA paper.

1.3. Review the MAA Best Practice Papers

See the References section.

1.4. Review MAA Data Guard Best Practices

In the High Availability Best Practices 10g Release 2 (10.2) guide, see Section 2.4, "Configuring Oracle Database 10g with Data Guard".

1.5. Verify the Setup

1.5.1. With Broker

1. Review Prerequisites for First Use
2. Enable Broker to restart instances

To enable DGMGRL to restart instances during the course of broker operations, a service with a specific name must be statically registered with the listener of each instance. The value for the GLOBAL_DBNAME attribute must be set to a concatenation of db_unique_name_DGMGRL.db_domain. For example, in the LISTENER.ORA file:

LISTENER = (DESCRIPTION =
  (ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=))))
SID_LIST_LISTENER=(SID_LIST=(SID_DESC=
  (SID_NAME=)
  (GLOBAL_DBNAME=db_unique_name_DGMGRL.db_domain)
  (ORACLE_HOME=)))
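As a small illustration of the naming rule above, the following Python sketch builds the GLOBAL_DBNAME value from a db_unique_name and db_domain. The function name and the sample values ("chicago", "example.com") are hypothetical, and the handling of an empty db_domain is an assumption rather than something this note states.

```python
def broker_global_dbname(db_unique_name: str, db_domain: str = "") -> str:
    """Return db_unique_name_DGMGRL.db_domain, the static service name the broker expects."""
    name = f"{db_unique_name}_DGMGRL"
    # Assumption: when db_domain is empty, the domain suffix is simply omitted.
    return f"{name}.{db_domain}" if db_domain else name


if __name__ == "__main__":
    # "chicago" and "example.com" are hypothetical values for illustration only.
    print(broker_global_dbname("chicago", "example.com"))
```

The resulting string is what would be placed in the GLOBAL_DBNAME attribute of the SID_DESC entry for that instance.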

1.5.2. Without Broker

Follow the steps at Verify the Physical Standby Database Is Performing Properly
1.6. Understand and test fallback options

See the 11.1 guide, "Failed Switchovers to Physical Standby Databases". It applies to 10.2 as well.

Check DG Admin troubleshooting guide, Problems Switching Over to a Standby Database.

2.0 Pre-Switchover Checks

These steps should be completed before the planned switchover maintenance window begins. It is recommended that they be done a couple of days in advance.

2.1. Verify Configuration Health

2.1.1 With Broker

2.1.1.1 Verify Data Guard Environment Health

CLI - see Monitoring a Data Guard Configuration

GUI - see "Verifying a Broker Configuration"

The broker health check performs the following:

Shows the current data protection mode setting, including the current redo transport service settings for each database and whether or not the standby redo log files are configured properly. If standby redo log files are needed for any database, the Verify results will allow you to automatically configure them.

Validates each database for the current status.
Performs a log switch on the primary database and then verifies that the log file was applied on each standby database.
Shows the results of the Verify operation, including errors, if any. The Verify operation completes successfully if there are no errors and an online redo log file was successfully applied to at least one standby database.
Shows any databases or RAC instances that are not discovered. Undiscovered databases and instances could prevent a failover or switchover from completing successfully.
Detects inconsistencies between database properties and their corresponding values in the database itself. It also provides a mechanism for resolving these inconsistencies.
2.1.1.2. Cancel apply delay for the target standby using CLI or GUI

Note: if flashback database is not enabled as part of normal operations then canceling any apply delay should be done just prior to a switchover to maintain standby delay protection for any possible primary database issues.

On the standby Capture the current value

DGMGRL> SHOW DATABASE DELAYMINS;

On the standby turn off any delay

CLI – DGMGRL> EDIT DATABASE '' SET PROPERTY 'DELAYMINS'='0';

GUI – See Changing the Properties of a Database
2.1.2. Without Broker

2.1.2.1. Verify Managed Recovery is Running (non-broker) on the standby

SQL> SELECT PROCESS
FROM V$MANAGED_STANDBY
WHERE PROCESS LIKE 'MRP%';

2.1.2.2. Cancel apply delay for the target standby using SQL

On the target physical standby database capture the current delay value

SQL> SELECT DELAY_MINS
FROM V$MANAGED_STANDBY
WHERE PROCESS = 'MRP0';

On the target physical standby database turn off delay if > 0

SQL> RECOVER MANAGED STANDBY DATABASE CANCEL

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY USING CURRENT LOGFILE DISCONNECT FROM SESSION;

2.2. Ensure Online Redo Log Files on the Target Physical Standby have been cleared

Online redo logs on the target physical standby need to be cleared before that standby database can become a primary database. Although this will automatically happen as part of the SWITCHOVER TO PRIMARY command, it is recommended that the logs are cleared prior to the switchover. If you have set the LOG_FILE_NAME_CONVERT parameter in the spfile, online redo logs will be automatically cleared the first time managed recovery is started on the standby.

Oracle recommends setting LOG_FILE_NAME_CONVERT to automatically clear online redo logs on the physical standby database. In the event the primary database and the physical standby database have the exact same directory path to the online redo logs, it is acceptable to set LOG_FILE_NAME_CONVERT such that the entry pairs have the same value.

As an example, if the online redo logs are stored in /oradata/order_db/redo for both the primary and physical standby databases on their respective servers, you can set the parameter value as

LOG_FILE_NAME_CONVERT='/oradata/order_db/redo/','/oradata/order_db/redo/'

This will initiate automatic clearing of the online redo logs on the physical standby database when managed recovery is started.

Clearing online redo logs as part of the SWITCHOVER TO PRIMARY command can make the switchover command susceptible to termination by another process that is waiting on access to the CONTROLFILE. The CONTROLFILE waiter will attempt to kill the switchover after a timeout of 15 minutes.

If you have not set your environment to automatically clear the online redo logs you should manually clear them at some point prior to the switchover. This can be done at any time.

On the target physical standby run the following query to determine if the online redo logs have not been cleared:

SQL> SELECT DISTINCT L.GROUP#
FROM V$LOG L, V$LOGFILE LF
WHERE L.GROUP# = LF.GROUP#
AND L.STATUS NOT IN ('UNUSED', 'CLEARING', 'CLEARING_CURRENT');

If the above query returns rows, on the target physical standby issue the following statement for each GROUP# returned:

SQL> ALTER DATABASE CLEAR LOGFILE GROUP <group#>;
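When the query returns several groups, it can help to generate the statements rather than type them one by one. The following Python sketch is only an illustration: the function name is hypothetical, and it assumes you have already collected the GROUP# values from the query above by whatever means you normally use.

```python
from typing import Iterable, List


def clear_logfile_statements(group_numbers: Iterable[int]) -> List[str]:
    """Build one ALTER DATABASE CLEAR LOGFILE GROUP statement per GROUP#."""
    # De-duplicate and sort so the generated script is stable and easy to review.
    groups = sorted({int(g) for g in group_numbers})
    return [f"ALTER DATABASE CLEAR LOGFILE GROUP {g};" for g in groups]


if __name__ == "__main__":
    # Hypothetical GROUP# values returned by the query on the target standby.
    for statement in clear_logfile_statements([3, 1, 3]):
        print(statement)
```

The generated statements are meant to be reviewed and then run in SQL*Plus on the target physical standby.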

If a switchover performed using SQL*Plus is terminated by a CONTROLFILE waiter timeout, just re-issue the SWITCHOVER TO PRIMARY command until it completes successfully. If you encounter the timeout while attempting a switchover using Data Guard Broker, you must connect to the target physical standby database with SQL*Plus and re-issue the SWITCHOVER TO PRIMARY command until it completes successfully. You must then drop and recreate your Broker configuration.

In both cases, you should monitor your alert log to ensure your online redo logs are being cleared and you are not experiencing some other issue.

2.3. Check for Previously Disabled Redo Threads

This check evaluates whether you are vulnerable to Bug 6266023 (fixed in the 10.2.0.4.2 patchset), which will cause a switchover to fail.

To determine if this situation exists, on your primary database, first run the following query to determine if there are any threads with a SEQUENCE# greater than 0:

SQL> SELECT THREAD#
FROM V$THREAD
WHERE SEQUENCE# > 0;

On the primary database, determine the current database redo branch:

SQL> SELECT TO_CHAR(RESETLOGS_CHANGE#)
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT';

A thread identified by the first query is a previously disabled thread if the control file of the target physical standby database has no archived log or log history records for that thread on the primary's current branch of redo.

To determine this, substitute the resetlogs_change# from the primary database (found in the second query) into the query below and execute it on the target physical standby database for each thread reported from the first query above.

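SQL> SELECT 'CONDITION MET'
FROM SYS.DUAL
WHERE NOT EXISTS (SELECT 1
FROM V$ARCHIVED_LOG
WHERE THREAD# = <thread# from the first query>
AND RESETLOGS_CHANGE# = <resetlogs_change# from the primary>)
AND NOT EXISTS (SELECT 1
FROM V$LOG_HISTORY
WHERE THREAD# = <thread# from the first query>
AND RESETLOGS_CHANGE# = <resetlogs_change# from the primary>);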

If this query returns a row for any of the threads, you have a disabled thread with a non-zero SEQUENCE# that can prevent a switchover from the primary database to the physical standby database. In this case, you must apply the latest patchset or use one of the following workarounds:

Workaround 1 – Does not require primary database downtime

Briefly enable the previously disabled thread(s) and switch logs on the primary database to send logs and populate entries into V$ARCHIVED_LOG and V$LOG_HISTORY on the physical standby database. This workaround does not require downtime on the primary database, but it is not a permanent workaround. Until either the 10.2.0.4.2 or higher patchset is applied or the second workaround listed is performed, you would need to perform these steps prior to every switchover. Log shipping and managed recovery should remain on during this process.

1. At primary enable the disabled threads and switch logs. The multiple disable/enable and switch logfile commands are required to ensure all manner of disabled threads (internally disabled and externally disabled) are handled correctly.

On the primary database, enable each of the disabled threads:
SQL> ALTER DATABASE DISABLE THREAD <thread#>;
SQL> ALTER DATABASE ENABLE THREAD <thread#>;
SQL> ALTER DATABASE DISABLE THREAD <thread#>;
SQL> ALTER DATABASE ENABLE THREAD <thread#>;

Perform the following 4 times on the primary database:

SQL> ALTER SYSTEM SWITCH LOGFILE;

Do the following 1 time on the primary database:

SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;

2. Verify that the primary has archived a log for the thread(s);

On the primary database issue the following;
SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$LOG_HISTORY
WHERE RESETLOGS_CHANGE# =
(SELECT RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT')
GROUP BY THREAD#;

SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$ARCHIVED_LOG
WHERE RESETLOGS_CHANGE# =
(SELECT RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT')
GROUP BY THREAD#;

3. Ensure that redo apply has progressed through the enabled thread logs:

Connect to the primary database and issue;
SQL> SELECT TO_CHAR(RESETLOGS_CHANGE#)
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT';

Connect to the target physical standby database and issue:
SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$LOG_HISTORY
WHERE RESETLOGS_CHANGE# = <resetlogs_change# from the primary>
GROUP BY THREAD#;

SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$ARCHIVED_LOG
WHERE RESETLOGS_CHANGE# = <resetlogs_change# from the primary>
GROUP BY THREAD#;

NOTE: Both these queries should return values for the threads enabled in step 1.

4. On the primary database, disable the threads enabled in step 1 (run the DISABLE for each thread);
SQL> ALTER DATABASE DISABLE THREAD <thread#>;

SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;

5. Verify, by repeating step 2, the primary has archived the logs of the disabled thread(s). The last query may need to be run a few times to give time for the final archival to complete. The output from the two queries should match once the archiving is caught up.

6. Repeat step 3 to ensure that the standby has received and applied through all the logs from the disabled threads.

7. Repeat Section 2.1, "Verify Configuration Health" and proceed to Section 2.4, "Check if the standby has ever been open read-only"

Workaround 2 – Requires primary database downtime

Reset the SEQUENCE# to 0 for the disabled threads by opening the primary database with the RESETLOGS option. This workaround is a permanent fix, however it requires downtime on the primary database. Log shipping should remain on during this process.

1. If using the Data Guard Broker, use dgmgrl to connect to the primary database and disable the configuration.
DGMGRL> DISABLE CONFIGURATION;

2. At each physical standby database, stop the managed recovery process.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Ensure managed recovery has been stopped.
SQL> SELECT *
FROM GV$MANAGED_STANDBY
WHERE PROCESS = 'MRP0';

3. At the primary database, switch logfiles 2 times to ensure the primary database has advanced beyond the point managed recovery has recovered to on the physical standby database(s).
SQL> ALTER SYSTEM SWITCH LOGFILE;
SQL> ALTER SYSTEM SWITCH LOGFILE;

4. This process requires a clean shutdown of the primary database. Shut down each instance of the primary database cleanly.
SQL> SHUTDOWN IMMEDIATE

5. Start one instance of the primary database in mount mode.
SQL> STARTUP MOUNT

6. Start recovery on this primary database instance. No actual recovery will be performed; this is to prepare for opening the database with RESETLOGS.
SQL> RECOVER DATABASE UNTIL CANCEL;

Media recovery complete.

7. Open the primary database with the RESETLOGS option.
SQL> ALTER DATABASE OPEN RESETLOGS;

8. Generate an archive log file from the primary database:
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;

When the standby database recognizes the new redo branch, media recovery will return an ORA-19906 error, this is expected. The alert log for the standby database will show something similar to:

MRP0: Incarnation has changed! Retry recovery...
Wed Dec 2 06:47:52 2009
Errors in file /ade/mgirkar_103/oracle/rdbms/log/x1032_mrp0_13389.trc:
ORA-19906: recovery target incarnation changed during recovery
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Wed Dec 2 06:47:54 2009
Errors in file /ade/mgirkar_103/oracle/rdbms/log/x1032_mrp0_13389.trc:
ORA-19906: recovery target incarnation changed during recovery
Wed Dec 2 06:48:14 2009
Managed Standby Recovery starting Real Time Apply
Media Recovery apply resetlogs offline range for datafile 1, incarnation : 0
Media Recovery apply resetlogs offline range for datafile 2, incarnation : 0
Media Recovery apply resetlogs offline range for datafile 3, incarnation : 0
Media Recovery apply resetlogs offline range for datafile 4, incarnation : 0
Media Recovery apply resetlogs offline range for datafile 5, incarnation : 0
parallel recovery started with 2 processes
Media Recovery Log /ade/mgirkar_103/oracle/work/arc_save/db2r_602567e5_1_1_704530059.arc
Media Recovery Waiting for thread 1 sequence 2 (in transit)

9. Ensure both the primary and the physical standby database are on the same redo branch.

On both the primary and physical standby database, issue the following query:

SQL> SELECT TO_CHAR(RESETLOGS_CHANGE#), RESETLOGS_TIME
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT';

The new branch of redo should be clearly identified by the RESETLOGS_TIME. If the new branch does not appear on the physical standby, do not continue, instead investigate why the new redo branch has not registered at the physical standby database.

NOTE: It may take a few moments for the physical standby to receive the logs from the primary database and recognize the change in branch.

10. Restart the managed recovery process.

a. If using Data Guard Broker, use dgmgrl to connect to the primary database and re-enable the configuration disabled as part of the first step of this workaround. Enabling the configuration will also start the managed recovery process.
DGMGRL> ENABLE CONFIGURATION;
b. If managing the configuration using SQL*Plus, connect to the physical standby database(s) and restart the managed recovery process.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE NODELAY USING CURRENT LOGFILE DISCONNECT FROM SESSION;

11. Repeat Section 2.1, "Verify Configuration Health" and proceed to Section 2.4, "Check if the standby has ever been open read-only"

2.4. Check if the standby has ever been open read-only

On the target physical standby database run this query:

SQL> SELECT VALUE
FROM V$DATAGUARD_STATS
WHERE NAME='standby has been open';

If the target physical standby was open read-only then restart the standby

SQL> SHUTDOWN IMMEDIATE

SQL> STARTUP MOUNT

2.5. Verify there are no large GAPS.

Identify the current sequence number for each thread on the primary database

SQL> SELECT THREAD#, SEQUENCE#
FROM V$THREAD;

Verify that the target physical standby database has applied up to, but not including, the logs from the primary query. On the standby, the sequence number returned for each thread by the following query should be no more than 1-2 less than the primary query result.

SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$ARCHIVED_LOG
WHERE APPLIED = 'YES'
AND RESETLOGS_CHANGE# = (SELECT RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT')
GROUP BY THREAD#;

If large gaps exist (more than 3 logs) then consult the Oracle® Data Guard Concepts and Administration, 10g Release 2 (10.2) guide: Section 5.8 Section 12.11 “Resolving Archive Gaps Manually”. If the gap is not resolved by Data Guard automatically then consult, “ Manually Determining and Resolving Archive Gaps”.

If a large redo apply lag (greater than 2 logs) persists then review the MAA best practice paper, “ Data Guard Redo Apply & Media Recovery” and also consult the Oracle® Data Guard Concepts and Administration 11g Release 1 (11.1) guide to monitor in more detail, 9.5 Monitoring Primary, Physical Standby.
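The comparison described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the Oracle procedure: the function name and the dictionary inputs are hypothetical, and the dictionaries stand for the per-thread results of the primary V$THREAD query and the standby V$ARCHIVED_LOG query above. The default threshold of 3 logs follows the "more than 3 logs" wording in this section.

```python
from typing import Dict, Optional


def threads_with_large_gap(
    primary_seq: Dict[int, int],
    standby_applied: Dict[int, int],
    max_gap: int = 3,
) -> Dict[int, Optional[int]]:
    """Return {thread#: gap} for threads whose apply gap exceeds max_gap.

    A gap of None means the standby query returned no row for that thread.
    """
    flagged: Dict[int, Optional[int]] = {}
    for thread, current in sorted(primary_seq.items()):
        if current == 0:
            # Assumption: a thread with SEQUENCE# 0 has never generated redo, so skip it.
            continue
        applied = standby_applied.get(thread)
        gap = None if applied is None else current - applied
        if gap is None or gap > max_gap:
            flagged[thread] = gap
    return flagged


if __name__ == "__main__":
    # Hypothetical query results: {THREAD#: SEQUENCE#}.
    primary = {1: 120, 2: 88, 3: 10, 4: 0}
    standby = {1: 119, 2: 80}
    print(threads_with_large_gap(primary, standby))
```

Any thread reported by such a check would be a candidate for the gap resolution steps referenced above.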

2.6. Use “THROUGH ALL SWITCHOVER” on Bystander Standbys

Redo apply should be started with the “THROUGH ALL SWITCHOVER” clause at each standby database in the configuration. The Broker starts managed recovery with the “THROUGH ALL SWITCHOVER” clause.
See Managing Data Guard Configurations Having Multiple Standby Databases - Best Practices for details.

2.7. Verify Primary and Standby TEMP Files Match

On the standby for each temporary tablespace, verify that temporary files associated with that tablespace on the primary database also exist on the standby database. Temp files added after initial standby creation are not propagated to the standby. Run this query on both the primary and target physical standby databases and verify that they match.

SQL> SELECT TMP.NAME FILENAME, BYTES, TS.NAME TABLESPACE
FROM V$TEMPFILE TMP, V$TABLESPACE TS
WHERE TMP.TS#=TS.TS#;

If the queries do not match then you can correct the mismatch now or immediately after the open of the new primary.

To correct now: adding or deleting a tempfile now requires managed recovery to be stopped and the standby to be opened read-only. Opening the standby read-only will require a database close and open before it becomes the new primary; see “Open the new primary database”.

To correct post-primary-open: see “Correct any tempfile mismatch” step of Switchover

2.8. Verify that there is no issue with V$LOG_HISTORY on the Standby

( Bug 6010833, 10.2.0.3 patch available on Linux 32-bit, this is included in the 6081547 patch bundle ( Document 6081547.8) listed above under “Apply Latest Patch Bundle”. It is assumed any potential issues with “Check for Previously Disabled Redo Threads” have already been resolved.)

Determine threads that have been active at some point on the primary database:

SQL> SELECT THREAD#, SEQUENCE#
FROM V$THREAD
WHERE SEQUENCE# > 0;

Get the RESETLOGS_CHANGE# from the primary database:
SQL> SELECT RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT';

On the target physical standby database, get the maximum sequence numbers for each thread from V$LOG_HISTORY:
SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$LOG_HISTORY
WHERE RESETLOGS_CHANGE#=< resetlogs_change# from the primary V$DATABASE_INCARNATION.RESETLOGS_CHANGE# >
GROUP BY THREAD#;

The last SEQUENCE# for each THREAD# from V$LOG_HISTORY on the target physical standby database should be close (the difference in log sequences < 3) to the SEQUENCE# for each THREAD# from V$THREAD on the primary database. If the difference in log sequences is greater than 3 or no row is returned for the thread, you have encountered this problem and should recreate the standby controlfile. See Note 459411.1. If backups are being done on the standby without an RMAN Catalog then backup history will be lost. It is highly recommended to use an RMAN Catalog for all backups.

2.9. Verify no old partial Standby Redo Logs on the Standby

( Bug 7159505, fixed in 10.2.0.5 and 11.1.0.7; 10.2.0.3 patch available on Solaris Sparc64 and can be requested for other platforms. This patch conflicts with the 6081547 patch bundle ( Document 6081547.8) and would require a patch merge request if you want to apply this on top of the 6081547 patch bundle ( Document 6081547.8) .)

Get the RESETLOGS_CHANGE# from the primary database:
SQL> SELECT RESETLOGS_CHANGE#
FROM V$DATABASE_INCARNATION
WHERE STATUS = 'CURRENT';

On the target physical standby database, identify any active standby redo logs (SRL’s)
SQL> SELECT GROUP#, THREAD#, SEQUENCE#
FROM V$STANDBY_LOG
WHERE STATUS = 'ACTIVE'
ORDER BY THREAD#,SEQUENCE#;

On the target physical standby database, identify maximum applied sequence number(s).
SQL> SELECT THREAD#, MAX(SEQUENCE#)
FROM V$LOG_HISTORY
WHERE RESETLOGS_CHANGE#=< resetlogs_change# from the primary V$DATABASE_INCARNATION.RESETLOGS_CHANGE# >
GROUP BY THREAD#;

If there are any active SRL's that have a thread#/sequence# less than the thread#/sequence# returned from the V$LOG_HISTORY query (meaning the recovery has progressed beyond the active SRL), clear them on the target physical standby.

SQL> RECOVER MANAGED STANDBY DATABASE CANCEL

SQL> ALTER DATABASE CLEAR LOGFILE GROUP <group#>;
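The comparison between the active SRLs and the V$LOG_HISTORY result can be sketched in Python as below. It is a hedged illustration only: the function name and input shapes are hypothetical, with each SRL row given as (GROUP#, THREAD#, SEQUENCE#) and the log history given as {THREAD#: MAX(SEQUENCE#)}, both assumed to be copied from the two standby queries above.

```python
from typing import Dict, Iterable, List, Tuple


def stale_srl_groups(
    active_srls: Iterable[Tuple[int, int, int]],
    max_history_seq: Dict[int, int],
) -> List[int]:
    """Return GROUP# values of active SRLs that recovery has already progressed beyond."""
    stale = []
    for group, thread, sequence in active_srls:
        max_seq = max_history_seq.get(thread)
        # Only flag an SRL when its sequence is strictly below the thread's maximum
        # sequence in V$LOG_HISTORY, as described in the text above.
        if max_seq is not None and sequence < max_seq:
            stale.append(group)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical rows: (GROUP#, THREAD#, SEQUENCE#) and {THREAD#: MAX(SEQUENCE#)}.
    srls = [(11, 1, 95), (12, 1, 101), (21, 2, 40)]
    history = {1: 100, 2: 40}
    print(stale_srl_groups(srls, history))
```

Each group number returned would then be cleared on the target physical standby with the CLEAR LOGFILE statement shown above, after managed recovery has been cancelled.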

3.0 Switchover

3.1. Clear Potential Blocking Parameters & Jobs

Capture current job state on the primary

SQL> SELECT *
FROM DBA_JOBS_RUNNING;

Depending on what the running jobs are, be ready to terminate them.

SQL> SELECT OWNER, JOB_NAME, START_DATE, END_DATE, ENABLED
FROM DBA_SCHEDULER_JOBS
WHERE ENABLED='TRUE'
AND OWNER <> 'SYS';

SQL> SHOW PARAMETER job_queue_processes

Note: cron job candidates to be disabled among others: oracle text sync and optimizer, RMAN backups, application garbage collectors, application background agents.

Block further job submission

SQL> ALTER SYSTEM SET job_queue_processes=0 SCOPE=BOTH SID='*';

SQL> EXECUTE DBMS_SCHEDULER.DISABLE( );

Disable any cron jobs that may interfere.

3.2. Shutdown all mid-tiers (optional)

This can be done in parallel to the switchover.

$ opmnctl stopall

Note: If using a local standby with an application that is following the “ Client Failover in Data Guard Configurations for Highly Available Oracle Databases” paper recommendations to utilize a database startup trigger that ensures the application database service is only active on the primary, this step can be skipped.

3.3. Monitor Switchover

3.3.1. With Broker

3.3.1.1. Turn on Data Guard tracing on primary and standby

Capture the current value for each instance

DGMGRL> SHOW INSTANCE LogArchiveTrace;

Set Data Guard trace level to 8191 for each instance

DGMGRL> EDIT INSTANCE SET PROPERTY LogArchiveTrace=8191;

Trace output will appear under the destination pointed to by the database parameter BACKGROUND_DUMP_DEST with “mrp” in the file name.

3.3.1.2. Tail Broker Logs (optional) on all instances

Locate Broker logs by showing database parameter background_dump_dest

SQL> SHOW PARAMETER background_dump_dest

Tail the broker logs

> tail -f <background_dump_dest>/dr*
3.3.2. Without Broker

3.3.2.1. Turn on Data Guard tracing on primary and standby

Tracing is turned on to have diagnostic information available in case any issues arise. Turning on tracing does not have any noticeable impact on switchover time but does require space for the trace output.

Capture the current value on both the primary and the target physical standby databases

SQL> SHOW PARAMETER log_archive_trace

Set Data Guard trace level to 8191 on both the primary and the target physical standby databases

SQL> ALTER SYSTEM SET log_archive_trace=8191;

Trace output will appear under the destination pointed to by the database parameter BACKGROUND_DUMP_DEST with “mrp” in the file name.
3.3.3. Tail Primary and Standby alert logs on all instances

Locate alert logs by showing database parameter background_dump_dest on both the primary and the target physical standby databases

SQL> SHOW PARAMETER background_dump_dest

Tail the alert log on both the primary and the target physical standby databases

> tail -f <background_dump_dest>/al*
3.4. Create Guaranteed Restore Points (optional)

The standard switchover fallback options should suffice for successfully backing out of a switchover. However, if you want an additional fallback option then you can create a guaranteed restore point on the primary and standby database participating in the switchover. If you want to do this see “ Create a Guaranteed Restore Point on Each Switchover Database” for details. If a guaranteed restore point is created, make sure it is dropped post-switchover.

3.5. Switchover

3.5.1. With Broker

3.5.1.1. Data Guard Broker command line utility

See Performing a Switchover Operation

Connect to the primary database using the DGMGRL command line utility as sys using the same password as the sys user on the primary and standby databases

Issue the switchover to command:

DGMGRL> SWITCHOVER TO ;

3.5.1.2. EM switchover

To start a switchover using Enterprise Manager, select the standby database that you want to change to the primary role and click Switchover.

Note: Following the open of the new primary there will be an increase in I/O while the new primary’s standby redo logs are cleared.
3.5.2. Without Broker

3.5.2.0. Verify that the primary database can be switched to the standby role

Query the SWITCHOVER_STATUS column of the V$DATABASE view on the primary database. For example:
SQL> SELECT SWITCHOVER_STATUS
FROM V$DATABASE;

SWITCHOVER_STATUS
-----------------
TO STANDBY

A value of TO STANDBY or SESSIONS ACTIVE (which requires the WITH SESSION SHUTDOWN clause on the switchover command) indicates that the primary database can be switched to the standby role. If neither of these values is returned, a switchover is not possible because redo transport is either misconfigured or not functioning properly. See Chapter 5 of the "Oracle® Data Guard Concepts and Administration, 10g Release 2 (10.2)" guide for information about configuring and monitoring redo transport.
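The decision described above can be expressed as a small Python sketch. The function name is hypothetical and the code only encodes the rule stated in this step; it does not query the database, so the SWITCHOVER_STATUS value is assumed to be passed in from the V$DATABASE query.

```python
def primary_switchover_statement(switchover_status: str) -> str:
    """Map V$DATABASE.SWITCHOVER_STATUS on the primary to the switchover statement."""
    status = switchover_status.strip().upper()
    base = "ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY"
    if status == "TO STANDBY":
        return base + ";"
    if status == "SESSIONS ACTIVE":
        # Active sessions require the WITH SESSION SHUTDOWN clause.
        return base + " WITH SESSION SHUTDOWN;"
    # Any other value means the switchover is not possible yet; check redo transport.
    raise ValueError(f"switchover not possible, SWITCHOVER_STATUS is {status!r}")


if __name__ == "__main__":
    print(primary_switchover_statement("SESSIONS ACTIVE"))
```

Note that the procedure in 3.5.2.2 always includes WITH SESSION SHUTDOWN; the sketch simply shows when the clause is strictly required.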

3.5.2.1. If RAC, then shutdown all secondary primary instances

A normal shutdown can be done, but to expedite the shutdown issue a SHUTDOWN ABORT on secondary RAC instances on the primary only

3.5.2.2. Switchover the primary to a standby database

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN;

If an ORA-16139 is encountered, you can proceed as long as V$DATABASE.DATABASE_ROLE='PHYSICAL STANDBY'. A common case where this can occur is when there is a large number of data files (greater than 1,000) and the apply of the EOR log times out. Once managed recovery is started on the new standby, it will recover.

If the role was not changed then you need to cancel the switchover and review the alert logs and trace files further.

3.5.2.3. Verify the standby has received the end-of-redo (EOR) log(s)

In the primary alert log you should see messages like this:

Mon Nov 3 06:53:13 2008
ARCH: Noswitch archival of thread 1, sequence 21
ARCH: End-Of-Redo Branch archival of thread 1 sequence 21
ARCH: Archiving is disabled due to current logfile archival
Clearing standby activation ID 2821924805 (0xa83327c5)
The primary database controlfile was created using the
'MAXLOGFILES 192' clause.
There is space for up to 188 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 524288000;
…
Archivelog for thread 1 sequence 21 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
MRP0 started with pid=18, OS id=32583
Mon Nov 3 06:53:15 2008
MRP0: Background Managed Standby Recovery process started (sfs_stby1)
Mon Nov 3 06:53:20 2008
Managed Standby Recovery not using Real Time Apply
Mon Nov 3 06:53:20 2008
parallel recovery started with 3 processes
Online logfile pre-clearing operation disabled by switchover
Media Recovery Log +REGR/sfs_stby/archivelog/2008_11_03/thread_1_seq_21.258.66997593
Identified End-Of-Redo for thread 1 sequence 21
Mon Nov 3 06:53:21 2008
Media Recovery End-Of-Redo indicator encountered
Mon Nov 3 06:53:21 2008
Media Recovery Applied until change 8338654
Mon Nov 3 06:53:21 2008
MRP0: Media Recovery Complete: End-Of-REDO (sfs_stby1)
Resetting standby activation ID 2821924805 (0xa83327c5)
Mon Nov 3 06:53:21 2008
MRP0: Background Media Recovery process shutdown (sfs_stby1)
Mon Nov 3 06:53:22 2008
SUCCESS: diskgroup REGR was dismounted
Mon Nov 3 06:53:22 2008
Switchover: Complete - Database shutdown required (sfs_stby1)
Mon Nov 3 06:53:22 2008
Completed: ALTER DATABASE COMMIT TO SWITCHOVER TO STANDBY WITH SESSION SHUTDOWN

And correspondingly in the standby alert log file you should see messages like this:

Mon Nov 3 06:53:17 2008
Media Recovery Log +REGR2/sfs/archivelog/2008_11_03/thread_1_seq_21.3819.669797593
Identified End-Of-Redo for thread 1 sequence 21
Mon Nov 3 06:53:17 2008
Media Recovery End-Of-Redo indicator encountered
Mon Nov 3 06:53:17 2008
Media Recovery Applied until change 8338654
Mon Nov 3 06:53:17 2008
MRP0: Media Recovery Complete: End-Of-REDO (sfs1)
Resetting standby activation ID 2821924805 (0xa83327c5)
Mon Nov 3 06:53:19 2008
MRP0: Background Media Recovery process shutdown (sfs1)

3.5.2.4. If the standby is a RAC configuration, then shutdown all secondary standby instances

A normal shutdown can be done, but to expedite this operation, issue a SHUTDOWN ABORT on the secondary non-apply RAC instances.

3.5.2.5. Verify that the standby database can be switched to the primary role

Query the SWITCHOVER_STATUS column of the V$DATABASE view on the standby database. For example:
SQL> SELECT SWITCHOVER_STATUS
FROM V$DATABASE;

SWITCHOVER_STATUS
-----------------
TO PRIMARY

A value of TO PRIMARY or SESSIONS ACTIVE indicates that the standby database is ready to be switched to the primary role. If neither of these values is returned, verify that redo apply is active and that redo transport is configured and working properly. Continue to query this column until the value returned is either TO PRIMARY or SESSIONS ACTIVE.

3.5.2.6. Check if the standby has ever been open read-only

If the target physical standby has been open read-only (found in Pre-Switchover check 2.4) and you have not bounced the target physical standby, do so now.

3.5.2.7. Switchover the standby database to a primary

SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

3.5.2.8. Open the new primary database:

SQL> ALTER DATABASE OPEN;

Note: Beginning with Oracle Database 10g Release 2, you can open the new production database directly from the mount state if the standby database was not opened read-only since the last time the database was started. If the database has been opened read-only, it will need to be restarted.

Note: There will be an increase in I/O while the new primary’s standby redo logs are cleared.

3.5.2.9. Correct any tempfile mismatch

If there was a tempfile mismatch in Pre-switchover check, “Verify Primary and Standby TEMP Files Match” that was not corrected then correct it now on the new primary.

3.5.2.10. Restart the new standby

On the new standby database (old production database), bring it to the mount state and start managed recovery. This can be done in parallel with the new primary open.

SQL> SHUTDOWN IMMEDIATE;

SQL> STARTUP MOUNT;

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;

3.5.2.11. If the production and standby databases are configured in a RAC, then start all instances on primary and standby

3.6. Contingency or Fallback

See the 11.1 guide, "Failed Switchovers to Physical Standby Databases". It applies to 10.2 as well.

Check DG Admin troubleshooting guide, Problems Switching Over to a Standby Database

4.0 Post-Switchover Steps

4.1. Set Trace to Prior Value

4.1.1. With Broker

For every instance: DGMGRL> EDIT INSTANCE SET PROPERTY LogArchiveTrace=<value captured in 3.3.1.1>;

4.1.2. Without Broker

For each database:

SQL> ALTER SYSTEM SET log_archive_trace=<value captured in 3.3.2.1>;
4.2. Reset Jobs

SQL> ALTER SYSTEM SET job_queue_processes=<value captured in 3.1> SCOPE=BOTH SID='*';

SQL> EXECUTE DBMS_SCHEDULER.ENABLE();

Enable any cron jobs that were disabled in 3.1.

4.3. Schedule and conduct the incremental backup, roll-forward, and tape backups for 10.2

Retain/move backup schedule to the standby

4.4. Reset apply delay for the target standby

Reverse steps in 2.1.1.2 or 2.1.2.2

4.5. Drop any Switchover Guaranteed Restore Points

SQL> DROP RESTORE POINT SWITCHOVER_START_GRP ;

5.0 Create a Guaranteed Restore Point on Each Switchover Database

5.1. Review Prerequisites & Best Practices

About Flashback Database
Guaranteed Restore Points and Flash Recovery Area Space Usage
Logging for Flashback Database With Guaranteed Restore Points Defined
Flashback Database Best Practices & Performance Document 565535.1.
5.2. Create a guaranteed restore point on the primary

5.2.1. Verify if flashback database is on or a guaranteed restore point already exists

SQL> SELECT FLASHBACK_ON FROM V$DATABASE;

If this query returns “YES” (flashback database is on) or “RESTORE POINT ONLY” (Flashback is on but one can only flashback to an existing guaranteed restore point) then proceed to creating the guaranteed restore point.

NOTE: Unless you have a backport for Bug 7568556, “ACTIVE APPLY RATE SEEN FROM 63MB/S TO 544KB/S AFTER RESTORE POINT ENABLED”, do not rely on a guaranteed restore point alone (V$DATABASE.FLASHBACK_ON='RESTORE POINT ONLY'). Ensure that flashback database is also on (V$DATABASE.FLASHBACK_ON='YES') when creating a guaranteed restore point.

If this query returns “NO” then you need to turn on flashback database before creating the guaranteed restore point. This requires the database to be mounted.

See Enabling Logging for Flashback Database for those steps.
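The three possible outcomes of the FLASHBACK_ON check can be summarized in a short Python sketch. The enum and function names are hypothetical, and the code only restates the guidance in this step, including the Bug 7568556 note; it is not an Oracle-provided check.

```python
from enum import Enum


class GrpAction(Enum):
    CREATE = "create the guaranteed restore point"
    CREATE_AFTER_BUG_REVIEW = "review the Bug 7568556 note, then create the guaranteed restore point"
    ENABLE_FLASHBACK_FIRST = "turn on flashback database (database mounted) before creating it"


def next_action(flashback_on: str) -> GrpAction:
    """Map V$DATABASE.FLASHBACK_ON to the next step described in this section."""
    value = flashback_on.strip().upper()
    if value == "YES":
        return GrpAction.CREATE
    if value == "RESTORE POINT ONLY":
        return GrpAction.CREATE_AFTER_BUG_REVIEW
    if value == "NO":
        return GrpAction.ENABLE_FLASHBACK_FIRST
    raise ValueError(f"unexpected FLASHBACK_ON value: {flashback_on!r}")


if __name__ == "__main__":
    print(next_action("NO").value)
```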

5.2.2. Create the guaranteed restore point

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
SQL> CREATE RESTORE POINT SWITCHOVER_START_GRP GUARANTEE FLASHBACK DATABASE;

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT;
5.3. Create a guaranteed restore point on the standby

5.3.1. Verify if flashback database is on or a guaranteed restore point already exists

SQL> SELECT FLASHBACK_ON FROM V$DATABASE;

If this query returns “YES” (flashback database is on) or “RESTORE POINT ONLY” (Flashback is on but one can only flashback to an existing guaranteed restore point) then proceed to creating the guaranteed restore point.

If this query returns “NO” then you need to turn on flashback database before creating the guaranteed restore point. This requires being in the MOUNT state.

See Enabling Logging for Flashback Database for those steps.

5.3.2. Create the guaranteed restore point

SQL> CREATE RESTORE POINT SWITCHOVER_START_GRP GUARANTEE FLASHBACK DATABASE;