srvctl service -failback no relocate save state

Quick reference (details in the notes below):

  • In 11.2, specify the -f option if you want the services to fail over to an available instance when stopping an instance with srvctl:
    $ srvctl stop instance -d orcl -i orcl1 -f

  • In 12c and later, to stop an instance and fail its services over to another instance, use both '-failover' and '-force':
    $ srvctl stop instance -d orcl -i orcl1 -force -failover

  • To stop both the instance AND the services running on it, use only the '-force' option.

  • A planned relocation of a PDB service does not close the PDB on the old instance:
    $ srvctl relocate service -db orcl -service pdbtest -oldinst orcl1 -newinst orcl2
    If the user wants the PDB on node#1 to be closed, it must be closed manually, e.g. execute "alter pluggable database pdb1 close" on node#1.

Goal

NOTE: In the images, examples and document that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

Sometimes a user might want to open some of the PDBs on only one node, for the following reasons:

  1. The user wants to reduce the load by not opening the PDBs on both nodes.

  2. The user wants some of the PDBs to be activated on another node if the node on which they are running crashes.
 

Goal

When we start up a RAC database, the database service does not start on its preferred instance.

Solution

When we use automatic services (the default) in an administrator-managed database, during a planned database start-up, services may start on the first instances to start rather than on their preferred instances. (When the database comes up, the service starts on whichever instance starts first, not on the preferred instance.)

An automatically managed service will be started on the very first instance that starts (preferred or available), and this is what is happening, so this is working as expected and documented.

If we want the services to start on a specific instance first, then we need to modify the service management policy from AUTOMATIC to MANUAL, but in that case the service will not be started unless someone starts it manually.
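
For example, a minimal sketch of changing the management policy (this assumes the 12c-style -policy option; in 11.2 the equivalent option is -y, and <db_unique_name>/<service_name> are placeholders):

$ srvctl modify service -db <db_unique_name> -service <service_name> -policy MANUAL
$ srvctl config service -db <db_unique_name> -service <service_name> | grep -i "Management policy"
Management policy: MANUAL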

 

srvctl config service -d <RADDB> -s <SERVICE_NAME>
Service name: <SERVICE_NAME>
Service is enabled
Server pool: <SERVICE_POOL>
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false

Reference:  Introduction to Automatic Workload Management

Real Application Clusters Administration and Deployment Guide
5 Introduction to Automatic Workload Management

Instance Preference

When you define a service for an administrator-managed database, you define which instances normally support that service using SRVCTL with the -r option. These are known as the preferred instances. You can also define other instances to support a service if the service's preferred instance fails using SRVCTL with the -a option. These are known as available instances.

When you specify preferred instances, you are specifying the number of instances on which a service normally runs. Oracle Clusterware attempts to ensure that the service always runs on the number of instances for which you have configured the service. Afterwards, due to either instance failure or planned service relocations, a service may be running on an available instance. You cannot control which available instance Oracle Clusterware relocates the services to if there are multiple instances in the list.

When a service moves to an available instance, Oracle Database does not move the service back to the preferred instance when the preferred instance restarts because:

  • The service is running on the desired number of instances.

  • Maintaining the service on the current instance provides a higher level of service availability.

  • Not moving the service back to the initial preferred instance prevents a second outage. -- (The failover option is not used here, which avoids a second outage. Alternatively, during a maintenance window you can restart the service and then announce that the database is available.)
    srvctl stop instance -d orcl -i orcl1 -force -failover;

  • srvctl stop service -d xx -f ; srvctl start service -d xx

You can, however, easily automate fail back to the preferred instance by using FAN callouts.
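
For illustration only, a FAN callout is an executable script placed in $GRID_HOME/racg/usrco/ that Oracle Clusterware invokes for each FAN event, passing the event details as name=value arguments on the command line. A minimal sketch, assuming the instance-up event carries database=, instance= and status= fields in this order and reusing this note's orcl/pdbtest example names:

#!/bin/sh
# Illustrative FAN callout sketch -- not a tested implementation.
# Placed in $GRID_HOME/racg/usrco/, it is executed by Oracle Clusterware for each
# FAN event, with the event details passed as name=value arguments.
# Environment setup (ORACLE_HOME, PATH for srvctl) is omitted for brevity, and the
# database/instance/service names are this note's example values.

EVENT="$*"

case "$EVENT" in
  *database=orcl*instance=orcl1*status=up*)
    # The preferred instance orcl1 is up again: if the pdbtest service is
    # currently running on orcl2, move it back to orcl1.
    if srvctl status service -db orcl -service pdbtest | grep -q orcl2; then
      srvctl relocate service -db orcl -service pdbtest -oldinst orcl2 -newinst orcl1
    fi
    ;;
esac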

Solution

A user can create a service for a particular PDB, let it run on a preferred CDB instance, and allow it to fail over to an available CDB instance.

For example, there is a RAC system with two nodes: on node#1 the CDB instance is orcl1, and on node#2 the CDB instance is orcl2. There is a pluggable database pdb1 in the CDB.

The user can create a service for this PDB so that it runs (open) on the orcl1 instance but not on the orcl2 instance (mounted):

$ srvctl add service -d orcl -s pdbtest -pdb pdb1 -preferred orcl1 -available orcl2

Here the service name is pdbtest. When this service is started, pdb1 will be opened only on orcl1. If the orcl1 CDB instance crashes, the service will fail over to the other node; in other words, pdb1 will be opened on the CDB instance orcl2 on the other node.
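
To verify the configuration, something like the following can be used (output abbreviated; the field names follow the srvctl config service output shown later in this note):

$ srvctl config service -db orcl -service pdbtest
Service name: pdbtest
...
Pluggable database name: pdb1
Preferred instances: orcl1
Available instances: orcl2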

Start Service

The following is the command to start this service:

$ srvctl start service -db orcl -service pdbtest

When a client application accesses the service via the SCAN listener, the connect request will be directed only to the preferred node.
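
For illustration, a client could connect through the SCAN with an Easy Connect string such as the following (the SCAN name, port, and credentials are placeholders):

$ sqlplus <user>/<password>@//<scan-name>:1521/pdbtest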

Starting the service opens the PDB:

Before starting the service:

At node#1:
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED

At node#2:
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED

After starting the service:

At node#1:
$ srvctl start service -d orcl -s pdbtest

At node#1:
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO <-------- Changed
SQL>

At node#2:
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED <-------- Not changed
SQL>

Service Failover

There are two types of failover: planned failover and unplanned failover.

Planned failover

If the user wants to perform a planned failover (relocation), a relocate operation can be done as follows:

$ srvctl relocate service -db orcl -service pdbtest  -oldinst orcl1 -newinst orcl2

This will failover the pdb service pdbtest from orcl1 cdb instance on node#1 to orcl2 cdb instance on node#2.

The test result is as follows:

At node#1:
SQL> show pdbs;
   CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
        2 PDB$SEED                       READ ONLY  NO
        3 PDB1                           READ WRITE NO
SQL>
At node#2:
SQL> show pdbs;
   CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
        2 PDB$SEED                       READ ONLY  NO
        3 PDB1                           MOUNTED
SQL>

At node#1:
$ srvctl relocate service -db orcl -service pdbtest  -oldinst orcl1 -newinst orcl2

At node#1:
SQL> show pdbs;
   CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
        2 PDB$SEED                        READ ONLY  NO
        3 PDB1                            READ WRITE NO <==== still alive
SQL>

At node#2:
SQL> show pdbs;
   CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
        2 PDB$SEED                       READ ONLY  NO
        3 PDB1                           READ WRITE NO <---- Changed
SQL>

The fact that the relocate operation does not close the PDB on the original node is expected behavior.

If the user wants the PDB on node#1 to be closed, the user needs to close the PDB on node#1 manually, e.g. execute "alter pluggable database pdb1 close" on node#1 (see the example below).
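
For example (connecting locally to the orcl1 instance on node#1; the optional IMMEDIATE clause avoids waiting for active sessions to disconnect):

At node#1:
$ sqlplus / as sysdba
SQL> alter pluggable database pdb1 close immediate;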

After the relocate operation, we can see that the service shifts to the other node (the available node):

Service status before relocate:

$ crsctl stat res -t

ora.orcl.pdbtest.svc
     1        ONLINE  ONLINE       node1                     STABLE

 Service status after relocate:

$ crsctl stat res -t

ora.orcl.pdbtest.svc
     1        ONLINE  ONLINE       node2  <<<<               STABLE

Now, when a client application accesses the service via the SCAN listener, the connect request will be directed only to the available instance (here, orcl2).

Unplanned Failover

If the node on which this PDB is running crashes, the service will fail over to the other node (the available node), and the PDB will also be opened on that available node. We can simulate this by killing the PMON process to see the effect:

At Node#1:
SQL> show pdbs;
    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO

At Node#2:
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED
SQL>

At Node#1:
$ crsctl stat res -t

ora.orcl.pdbtest.svc
      1        ONLINE  ONLINE       node1                    STABLE

At Node#1:
# ps -ef | grep pmon
oracle   1269214       1  0 00:43 ?        00:00:00 ora_pmon_orcl1
# kill -9 1269214  <---- killing the pmon process to make orcl1 instance crash

At Node#2:
$ crsctl stat res -t
ora.orcl.pdbtest.svc
      1        ONLINE  ONLINE       node2  <<<               STABLE <---- service shifted to node#2

At Node#2:
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           READ WRITE NO <-------- failover occurred
SQL>

After a while, let us check node#1 :

At Node#1:
SQL> show pdbs;

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         2 PDB$SEED                       READ ONLY  NO
         3 PDB1                           MOUNTED <=== CDB restarted by cluster, but pdb not opened.
SQL>

Please note that each of the following operations is regarded as a planned shutdown, so failover of the service/PDB will not occur.

SQL> alter pluggable database pdb close abort;

$ srvctl stop instance -d orcl -i orcl1 -f

However, the user can use "srvctl stop instance -d orcl -i orcl1 -force -failover" to make the service fail over to the available node.

-------

Symptoms

'srvctl stop instance' brings the instance down, but no service fails over to the available instance.
In pre-11.2, stopping an instance through srvctl/sqlplus triggered service failover to the available instance.


1) Create a service called "<RAC>_test01" on the <RAC> DB with <SID1> as preferred and <SID2> as available.
srvctl add service -d <RAC> -s <RAC>_test01 -r <SID1> -a <SID2>


2) Check service created
srvctl config service -d <RAC> -a
==>
Service name: <RAC>_test01
Service is enabled
Server pool: <RAC>_test01
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Preferred instances: <SID1>
Available instances: <SID2>
Service name: <RAC>_test01
Service is enabled
Server pool: <RAC>_test01
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: false
Failover type: NONE
Failover method: NONE
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Preferred instances: <SID1>
Available instances:


3) start <RAC>_test01 service
srvctl start service -d <RAC> -s <RAC>_test01

4) Confirm the service is running on <SID1>
srvctl status service -d <RAC>
==>
Service <RAC>_test01 is running on instance(s) <SID1>

5) Stop the instance using srvctl with immediate (default) or abort
srvctl stop instance -d <RAC> -i <SID1>
OR
srvctl stop instance -d <RAC> -i <SID1> -o abort

srvctl status service -d <RAC>
==>
Service <RAC>_test01 is not running


The service does not fail over but stops.

Changes

Upgrade to 11.2

Cause

This is expected behavior in 11.2.

If the instance is shut down using sqlplus:

In pre-11.2.0.4 (11.2.0.2), the service will fail over to an available instance.

In 11.2.0.4 or above (12.1.*), the service will not fail over but will stop.

If the instance is shut down using srvctl:
In pre-11.2, the service will fail over to an available instance.
In 11.2, the service will not fail over but will stop.

In 11.2, you need to specify the -f option if you want the service to fail over (start on another instance) to the available instance when stopping the instance using srvctl (i.e. srvctl stop instance -d xxx -i xxx1 -f).

Alternatively, you may relocate the service to another instance before stopping the instance through srvctl:
srvctl relocate service -d <RAC> -s <RAC>_test01 -i <SID1> -t <SID2>

Solution

Use the -f option with srvctl to have the services on the instance being stopped fail over to an available instance.

srvctl stop instance -d <RAC> -i <SID1> -f
srvctl status service -d <RAC>
==>
Service <RAC>_test01 is running on instance(s) <SID2>

In 12c, the syntax/behaviour is changed.

1) If you stop an instance without the -force or -failover option while a service is running on that instance, errors (PRCD-1315, PRCR-1014, PRCR-1065, CRS-2529) are reported

ie)
$srvctl config service -d <RAC> -s <RAC>_test01

Service name: <RAC>_test01
...
....
Preferred instances: <SID1>
Available instances: <SID2>

$srvctl status service -d <RAC>
Service <RAC>_test01 is running on instance(s) <SID1>

$ srvctl stop instance -d <RAC> -i <SID1>
PRCD-1315 : failed to stop instances for database <RAC>
PRCR-1014 : Failed to stop resource ora.<RAC>.db
PRCR-1065 : Failed to stop resource ora.<RAC>.db
CRS-2529: Unable to act on 'ora.<RAC>.db' because that would require stopping or relocating 'ora.<RAC>.<RAC>_test01.svc', but the force option was not specified

2) If you want to stop the instance and fail the services over to another instance, you need to use both the '-failover' and '-force' options

ie)
$ srvctl status service -d <RAC>
Service <RAC>_test01 is running on instance(s) <SID1>
$ srvctl stop instance -d <RAC> -i <SID1> -failover -f
$ srvctl status service -d <RAC>
Service <RAC>_test01 is running on instance(s) <SID2>

3) If you want to stop both the instance AND the services running on the instance, use the '-force' option

srvctl status service -d <RAC> -s <RAC>_test01
Service <RAC>_test01 is running on instance(s) <SID1>
$ srvctl stop instance -d <RAC> -i <SID1> -force
$ srvctl status service -d <RAC> -s <RAC>_test01
Service <RAC>_test01 is not running.




Reference :


Oracle® Real Application Clusters Administration and Deployment Guide
11g Release 2 (11.2)
Part Number E16795-08

-----------------------------
srvctl stop instance

Stops instances and STOPS ANY SERVICES RUNNING ON SPECIFIED INSTANCES, UNLESS YOU SPECIFY THE -F OPTION. If you specify -f, then the services fail over to an available instance when the instance stops.
Syntax and Options

Use the srvctl stop instance command with the following syntax:

srvctl stop instance -d db_unique_name {[-n node_name]|[-i "instance_name_list"]}
[-o stop_options] [-f]


-f This option fails the running services over to another instance

Service Failover to Node2 (Available Instance) From Node1 (Preferred Instance) After JDK Upgrade (Doc ID 3085853.1)

Applies to:

Oracle Database - Enterprise Edition - Version 19.23.0.0.0 and later
Information in this document applies to any platform.

Symptoms

  • The service (ora.<Database apps>.svc) related to application operations was initially running on Node 1 (the preferred instance).
  • During the JDK upgrade on Node 1, the service failed over to Node 2 (the available instance).
  • After completing the upgrade on Node 1, the service returned to Node 1 as expected.
  • However, when the upgrade on Node 2 was completed and its cluster resources came online, the service unexpectedly failed over to Node 2 again.
  • This behavior contradicts the requirement for the service to remain on Node 1.

srvctl config service -s <service> -d <DBname>
Service name: servicename
Server pool:
Cardinality: 1
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Global: false
Commit Outcome: false
Failover type: SELECT
Failover method: BASIC
Failover retries: 20
Failover delay: 2
Failover restore: NONE
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: NONE
Edition:
Pluggable database name:
Hub service:
Maximum lag time: ANY
SQL Translation Profile:
Retention: 86400 seconds
Replay Initiation Time: 300 seconds
Drain timeout:
Stop option:
Session State Consistency: DYNAMIC
GSM Flags: 0
Service is enabled
Preferred instances: instance1
Available instances: instance2
CSS critical: no
Service uses Java: false

Changes

Cause

The service failed over from the preferred instance to the available instance.

  • The service(ora.<Database apps>.svc) failed over to Node 2 due to the default "failback" setting in Oracle Database 19c.
  • The "failback" parameter, by default set to "No", indicates that the service should return to its original or a new primary database after a failover.
  • In this case, the service failed back to Node 2, which was the original primary database, despite the intention for it to remain on Node 1.


 

Solution

To prevent the service from failing back to Node 2 after the JDK upgrade, follow these steps:

  1. Review the OSwatcher logs on Node 1 to investigate any potential issues from the OS perspective.
  2. If no OS-related issues are found, proceed with the following configuration change:
  3. Set the "failback" parameter to "Yes" to enable the service to remain on the preferred instance (Node 1) after a failover (see the example below).

By enabling the "failback" feature and ensuring the service remains on the preferred instance, you can prevent unexpected failovers and maintain the desired service configuration.
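
For illustration, the change can be made with srvctl as shown below (the database and service names are placeholders; the same modify command appears in the note that follows):

$ srvctl modify service -d <db_unique_name> -s <service_name> -failback yes
$ srvctl config service -d <db_unique_name> -s <service_name> | grep -i failback
Failback : true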

Symptoms

 srvctl config service does not report state of failback when it is "NO"

$ srvctl modify service -d <db_name> -s <service_name> -failback yes

$ srvctl config service -d orcl -s test02 | grep -i fail
Failover type:
Failover method:
Failover retries:
Failover delay:
Failover restore: NONE
Failback : true  <--- Here.

$ srvctl modify service -d <db_name> -s <service_name> -failback no

$ srvctl config service -d orcl -s test02 | grep -i fail
Failover type:
Failover method:
Failover retries:
Failover delay:
Failover restore: NONE

We cannot check the state of failback.

Cause

 The issue is analyzed and discussed in internal / unpublished Bug 29891936.

Solution

The fix for Bug 29891936 is included in 21c.
If you do not see the state of failback, please consider it set to No.

----- The Oracle service does not match the expected configuration

Symptoms

NOTE: In the images, examples and document that follow, user details, cluster names, hostnames, directory paths, filenames, etc. represent a fictitious sample (and are used to provide an illustrative example only). Any similarity to actual persons, or entities, living or dead, is purely coincidental and not intended in any manner.

The service is configured with cardinality one and is expected to run on a single instance (preferred or available). But it was started on two instances, and CRS is unaware that the service is running on more than one instance.
This leads to:

  • Server side load balancing issues.
    • As the service(s) starts on multiple instances, this could overload the nodes.
       
  • Performance degradation.
    • As users get connected to multiple instances, this could lead to cluster waits because multiple instances need to access the same set of data, depending on the workload.

srvctl config service -service MY_SVC1_PRD -db RACDB

Service name: MY_SVC1_PRD 
Cardinality: 1

..

Edition:
Pluggable database name: PDB1
...

Preferred instances: RACDB1
Available instances: RACDB2,RACDB3 

Note: The above output is truncated to show only the values of interest.


CONFIGURATION

  • 3 node RAC Database.
  • Pluggable database PDB1 runs on all the instances.
  • Database Configuration:
    DB UNIQUE NAME : RACDB
    DB NAME        : RACDB
    SERVICE_NAMES  : RACDB.<domain_name>
    INSTANCE_NAMES : RACDB1, RACDB2, RACDB3
    HOST NAMES     : RACDB1, RACDB2, RACDB3
    PDB            : PDB1
  • Service Configuration:
    SERVICE NAME    PREFERRED INSTANCES    AVAILABLE INSTANCES    PDB
    MY_SVC1_PRD     RACDB1                 RACDB2, RACDB3         PDB1
    MY_SVC2_PRD     RACDB2                 RACDB1, RACDB3         PDB1
    MY_SVC3_PRD     RACDB2                 RACDB1, RACDB3         PDB1

  •  Services are running on the preferred instances as per srvctl  and crsctl  

srvctl status service -db RACDB

Service MY_SVC1_PRD is running on instance(s) RACDB1
Service MY_SVC2_PRD is running on instance(s) RACDB2
Service MY_SVC3_PRD is running on instance(s) RACDB2 

crsctl stat res -t 

...

ora.RACDB.MY_SVC1_PRD.svc
           1     ONLINE ONLINE     RACDB1     STABLE

ora.RACDB.MY_SVC2_PRD.svc
           2     ONLINE ONLINE     RACDB2     STABLE

ora.RACDB.MY_SVC3_PRD.svc
           2     ONLINE ONLINE     RACDB2     STABLE 


  •  But users are able to connect to multiple instances. 

E.g., MY_SVC1_PRD_USER is expected to connect only to instance RACDB1, as the service is running on RACDB1. But MY_SVC1_PRD_USER is able to make new connections to both RACDB1 and RACDB2.

SQL> select INST_ID, USERNAME, SERVICE_NAME, count(*) from gv$session where username like 'MY_SVC1_PRD_USER%' group by INST_ID, USERNAME, SERVICE_NAME order by INST_ID, USERNAME, SERVICE_NAME;

INST_ID     USERNAME         SERVICE_NAME    COUNT(*)
---------- ----------------  --------------- ---------
1          MY_SVC1_PRD_USER  MY_SVC1_PRD     51
2          MY_SVC1_PRD_USER  MY_SVC1_PRD     53

  

Changes

 Instance termination and service failover.

Cause

  • The local listeners on both the nodes as well as SCAN listeners are aware of the service MY_SVC1_PRD.
  • Hence the user MY_SVC1_PRD_USER is able to connect to both instances, and the connections are load balanced across both of them.

INSTANCE 1 : [ Hostname - RACDB1.<domain_name> , Instance name  RACDB1 ]

RACDB1> lsnrctl status LISTENER_SCAN1

Service "MY_SVC1_PRD.<domain_name>" has 2 instance(s).
      Instance "RACDB1", status READY, has 1 handler(s) for this service...
      Instance "RACDB2", status READY, has 1 handler(s) for this service...

RACDB1> lsnrctl status LISTENER

Service "MY_SVC1_PRD.<domain_name>" has 1 instance(s).
      Instance "RACDB1", status READY, has 1 handler(s) for this service... 

INSTANCE 2 : [ Hostname - RACDB2.<domain_name> , Instance name : RACDB2 ]

RACDB2> lsnrctl status LISTENER_SCAN2

Service "MY_SVC1_PRD.<domain_name>" has 2 instance(s).
      Instance "RACDB1", status READY, has 1 handler(s) for this service...
      Instance "RACDB2", status READY, has 1 handler(s) for this service...

RACDB2> lsnrctl status LISTENER

Service "MY_SVC1_PRD.<domain_name>" has 1 instance(s).
       Instance "RACDB2", status READY, has 1 handler(s) for this service...

  • The database is aware of the service, which is running on the instances RACDB1 and RACDB2.

SYS:RACDB1>select inst_id, service_id, name, con_name, creation_date from gv$active_services where name='MY_SVC1_PRD';

INST_ID    SERVICE_ID NAME           CON_NAME     CREATION_DATE
---------- ---------- -------------- ------------ -------------
1           1         MY_SVC1_PRD    PDB1         05-MAR-21
2           1         MY_SVC1_PRD    PDB1         05-MAR-21 

  • The reason for this behavior is that the PDB state was saved in the database.

SYS:RACDB1>select * from dba_pdb_saved_states;

CON_ID     CON_NAME   INSTANCE_NAME  CON_UID     GUID                            STATE
---------- ---------  -------------  ---------- -------------------------------- -----
3          PDB1       RACDB1         925456677  A01B66341CC01148E0539648EC0A28D0 OPEN
3          PDB1       RACDB2         595314999  A839B67599FF7448E0539648EC0A2842 OPEN

PDB SAVE STATE

  • You can preserve the open mode of one or more PDBs when the CDB restarts by using the ALTER PLUGGABLE DATABASE SQL statement with a pdb_save_or_discard_state clause (the basic syntax is illustrated after this list).
  • You can do this in the following way: 
    • Specify SAVE STATE to preserve the PDBs' mode when the CDB is restarted.
      • For example,
        • If a PDB is in open read/write mode before the CDB is restarted, then the PDB is in open read/write mode after the CDB is restarted.
        • If a PDB is in mounted mode before the CDB is restarted, then the PDB is in mounted mode after the CDB is restarted.
    • Specify DISCARD STATE to ignore the PDBs' open mode when the CDB is restarted.
      • When DISCARD STATE is specified for a PDB, the PDB is always mounted after the CDB is restarted.
         
  • Refer note Document 1933511.1 How to Preserve Open Mode of PDBs When the CDB Restarts for more details.
  • When the state of PDBs is saved, the details of the PDBs and their states are stored in DBA_PDB_SAVED_STATES, and the running service details of the respective PDBs are recorded in pdb_svc_state$. Because the PDB/service state was saved on multiple instances, it overrides the CRS service configuration and starts the service on all the saved instances. (NOTE: even if "srvctl disable service" was performed on the service at the CRS level, the PDB saved state would override this as well.)
  • Because the service was started automatically due to the PDB saved state and not through the cluster, CRS is not aware that the service is running on multiple instances and reports the service as running on only one instance.
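
For illustration only, the basic SAVE STATE / DISCARD STATE syntax looks like the following (pdb1 is an example name; an INSTANCES clause, as used in the Immediate solution below, can restrict the scope on RAC):

SQL> alter pluggable database pdb1 save state;
SQL> alter pluggable database pdb1 discard state;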

Solution

  • In RAC, it is not recommended to save the state of PDBs. RAC will open a PDB on a node if services are defined for that PDB.
  • It is no longer necessary to save the state in RAC environments.
  • Saving the state leads to opening the service/PDB on nodes where it is not intended, and performance may be affected adversely.
  • An additional check is introduced in orachk to give a warning about the saved state. (screen shot attached)


Immediate solution:

  1. Discard the saved state of the PDBs from all the instances.
    • SQL> alter pluggable database all discard state instances=all; -- Discard the saved state on all instances 
      SQL> select * from dba_pdb_saved_states; -- Check if the state is cleared  

  2. Find the status of the service.
    • SQL> select inst_id, name,con_name from gv$active_services where name='<SERVICE_NAME>';

  3. Change the container to the specific PDB where the service is running (con_name in the above output gives the PDB name).
    • SQL> alter session set container=<PDB_NAME>;

        
  4. Stop the service using dbms_service procedure on the specific instance where the service should not be running
    • SQL> exec dbms_service.stop_service('<SERVICE_NAME>','<INSTANCE_NAME>');

        
  5. Verify and make sure the service is running on one expected node.
    • SQL> select inst_id, name,con_name from gv$active_services where name='<SERVICE_NAME>';

        
  • EXAMPLE
    • SYS:RACDB2> alter pluggable database all discard state instances=all;

      Pluggable database altered.
       

      SYS:RACDB2> select * from dba_pdb_saved_states;

      no rows selected.

      SYS:RACDB2> select inst_id, name,con_name from gv$active_services where name='MY_SVC1_PRD';

      INST_ID    SERVICE_ID NAME          CON_NAME
      ---------- ---------- ------------- -----------
      1          3          MY_SVC1_PRD   PDB1
      2          3          MY_SVC1_PRD   PDB1

      SYS:RACDB2> alter session set container=PDB1;

      Session altered.
       

      SYS:PCD70452>exec dbms_service.stop_service('MY_SVC1_PRD','RACDB2');

      PL/SQL procedure successfully completed.

      SQL> select inst_id, name,con_name from gv$active_services where name='MY_SVC1_PRD';

      INST_ID    SERVICE_ID  NAME         CON_NAME
      ---------- ---------- ------------- -----------
      1          3          MY_SVC1_PRD   PDB1 

      Note: The above steps will not disconnect the existing connections on instance 2, but they will prevent new connections to instance 2. Existing connections should be terminated explicitly from the application or the database (see the sketch below).
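
      A possible way to identify and terminate the leftover sessions on instance 2 (a sketch only; the sid/serial# placeholders must be taken from the query output, and '@2' uses the RAC instance qualifier of ALTER SYSTEM KILL SESSION):

      SQL> select inst_id, sid, serial# from gv$session where service_name='MY_SVC1_PRD' and inst_id=2;
      SQL> alter system kill session '<sid>,<serial#>,@2' immediate;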

Long term Solution:

  • Avoid saving the state of PDBs on RAC databases, as RAC will open a PDB on a node if services are defined for that PDB.
  • Introduce monitoring on DBA_PDB_SAVED_STATES to make sure that states are not saved accidentally (a minimal query sketch follows).
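
For example, a monitoring job could simply alert whenever this view returns any rows (a minimal sketch; scheduling and alerting are left to the environment's existing tooling):

SQL> select count(*) from dba_pdb_saved_states;
-- any result greater than 0 means a PDB state has been saved and should be investigated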