Dataguard搭建问题小记-CSDN博客

由于希望测试一个DG问题，所以在Vmware上搭建测试环境，primary和standby放在同一台主机上。
搭建过程就不赘述，但是在最后，遇到了一些问题，这里记录一下，以备后续查阅。

----搭建过程中的一些命令-----

 
     
          1 
          . backup database format  
          '/tmp/bk_%U' 
          ; 
         
 
          2 
          . backup current controlfile  
          for  
          standby format  
          '/tmp/stdbyctl.bkp' 
          ; 
         
 
          3 
          . catalog start  
          with  
          '/tmp/' 
          ; 
         
 
          4 
          .  
          set  
          controlfile autobackup format  
          for  
          device type disk to  
          '/tmp/%F' 
          ; 
         
 
              
          restore standby controlfile from  
          '/tmp/stdbyctl.bkp' 
          ;  
         
 
          5 
          . RMAN> CONNECT TARGET SYS/oracle@db; 
         
 
          RMAN> CONNECT AUXILIARY SYS/oracle@stddb; 
         
 
   

6. 最会std的pfile文件内容

 
          db_file_name_convert=  
          '+data/db/' 
          , 
          '+reco/stddb/'  
         
          log_file_name_convert=  
          '+RECO/db/archivelog/' 
          , 
          '+RECO/stddb/archivelog_std/'  
         
          log_archive_format=%t_%s_%r.arc  
         
          standby_file_management=auto  
         
          compatible= 
          '11.2.0.0.0'  
         
          control_files= 
          '+RECO/stddb/controlfile/current.260.834947597'     
          <<<<<<在恢复control之前，先设定为control_files= 
          '+RECO' 
          恢复之后，根据实际位置更改  
         
          db_block_size= 
          8192  
         
          processes= 
          150  
         
          remote_login_passwordfile= 
          'EXCLUSIVE'  
         
          undo_tablespace= 
          'UNDOTBS1'  
         
          undo_management=auto  
         
          log_archive_config= 
          'dg_config=(primdb,stbdb)'  
         
          log_archive_dest_1= 
          'location=+RECO/stddb/archivelog_std/  valid_for=(all_logfiles,all_roles)  db_unique_name=stbdb'

问题一：

在设置listener的过程中，由于设定静态监听，在standby启动之后，同时又会注册一个动态监听，目前有两个监听，其中动态监听状态为blocked。
遇到的问题就是，在duplicate连接的时候，就会发生ORA-12528错误，而不能正常连接到standby。

 
     
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
         
 
          [oracle@OEL ~]$ lsnrctl status 
         
 
          LSNRCTL  
          for  
          Linux: Version  
          11.2 
          . 
          0.1 
          . 
          0  
          - Production on  
          30 
          -DEC- 
          2013  
          10 
          : 
          46 
          : 
          03 
         
 
          Copyright (c)  
          1991 
          ,  
          2009 
          , Oracle.  All rights reserved. 
         
 
          Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT= 
          1521 
          )) 
         
 
          STATUS of the LISTENER  
         
 
          ------------------------  
         
 
          Alias                     LISTENER  
         
 
          Version                   TNSLSNR  
          for  
          Linux: Version  
          11.2 
          . 
          0.1 
          . 
          0  
          - Production  
         
 
          Start  
          Date                 
          27 
          -NOV- 
          2013  
          12 
          : 
          40 
          : 
          26  
         
 
          Uptime                     
          32  
          days  
          22  
          hr.  
          5  
          min.  
          37  
          sec  
         
 
          Trace Level               off  
         
 
          Security                  ON: Local OS Authentication  
         
 
          SNMP                      OFF  
         
 
          Listener Parameter File   /u01/app/ 
          11.2 
          . 
          0 
          /grid/network/admin/listener.ora  
         
 
          Listener Log File         /u01/app/oracle/diag/tnslsnr/OEL/listener/alert/log.xml  
         
 
          Listening Endpoints Summary...  
         
 
             
          (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))  
         
 
             
          (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=OEL.localdomain)(PORT= 
          1521 
          )))  
         
 
          Services Summary...  
         
 
          Service  
          "+ASM"  
          has  
          1  
          instance(s).  
         
 
             
          Instance  
          "+ASM" 
          , status READY, has  
          1  
          handler(s)  
          for  
          this  
          service...  
         
 
          Service  
          "PRIMDB"  
          has  
          1  
          instance(s).  
         
 
             
          Instance  
          "db" 
          , status READY, has  
          1  
          handler(s)  
          for  
          this  
          service...  
         
 
          Service  
          "dbXDB"  
          has  
          1  
          instance(s).  
         
 
             
          Instance  
          "db" 
          , status READY, has  
          1  
          handler(s)  
          for  
          this  
          service...  
         
 
          Service  
          "stbdb"  
          has  
          1  
          instance(s).  
         
 
             
          Instance  
          "stbdb" 
          , status UNKNOWN, has  
          1  
          handler(s)  
          for  
          this  
          service...    <<<<<<<<<<<<<<<<<<<<<<  
         
 
          Service  
          "stddb"  
          has  
          1  
          instance(s).  
         
 
             
          Instance  
          "stddb" 
          , status READY, has  
          1  
          handler(s)  
          for  
          this  
          service...     <<<<<<<<<<<<<<<<<<<<<<  
         
 
          The command completed successfully  
         
 
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
         
 
   

通过设置tnsnames.ora，可以解决这个问题。

 
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  
         
          stddb =  
         
          (DESCRIPTION =  
         
          (ADDRESS_LIST =  
         
          (ADDRESS = (PROTOCOL = TCP)(HOST = OEL.localdomain)(PORT =  
          1521 
          ))  
         
          )  
         
          (CONNECT_DATA = (SERVICE_NAME = stddb)(UR=A)) <--------------------In order to avoid error ORA- 
          12528  
          ) 
         
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

问题二：

在执行下面命令之后，直接primary就crash掉，经过查询alert发现，是redo broken导致的
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

查询standby alert，发现如下信息：
（在ALTER DATABASE RECOVER...开始之后，第一件事情就是clear redo log，这样就直接导致primary crash）

 
     
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  
         
 
          # tail -f alert_stddb.log  
         
 
          Clearing online redo logfile  
          1  
          complete  
         
 
          Clearing online redo logfile  
          2  
          +DATA/db/redo02.log  
         
 
          Clearing online log  
          2  
          of thread  
          1  
          sequence number  
          5  
         
 
          Clearing online redo logfile  
          2  
          complete  
         
 
          Clearing online redo logfile  
          3  
          +DATA/db/redo03.log  
         
 
          Clearing online log  
          3  
          of thread  
          1  
          sequence number  
          3  
         
 
          Tue Dec  
          24  
          14 
          : 
          32 
          : 
          49  
          2013  
         
 
          Clearing online redo logfile  
          3  
          complete  
         
 
          Tue Dec  
          24  
          14 
          : 
          32 
          : 
          49  
          2013  
         
 
          Media Recovery Waiting  
          for  
          thread  
          1  
          sequence  
          4  
         
 
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
         
 
   

在重建几次之后，问题依然。
由于datafile和archivelog都使用convert参数转换了，但是redo是无法转换的，如何解决这个问题呢？
经过分析，发现在duplicate的最后，有一些警告信息：

 
          ~~~~~~~~~duplicate the standby databsae~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  
         
          RMAN> CONNECT TARGET SYS/oracle@db;  
         
          RMAN> CONNECT AUXILIARY SYS/oracle@stddb; 
         
          connected to auxiliary database: DB (not mounted) 
         
          RMAN> duplicate target database  
          for  
          standby; 
         
          Starting Duplicate Db at  
          24 
          -DEC- 
          13 
         
          allocated channel: ORA_AUX_DISK_1  
         
          channel ORA_AUX_DISK_1: SID= 
          13  
          device type=DISK 
         
          contents of Memory Script: 
         
          {  
         
          restore clone standby controlfile;  
         
          }  
         
          executing Memory Script 
         
          Starting restore at  
          24 
          -DEC- 
          13 
         
          using channel ORA_AUX_DISK_1 
         
          channel ORA_AUX_DISK_1: starting datafile backup  
          set  
          restore 
         
          channel ORA_AUX_DISK_1: restoring control file  
         
          channel ORA_AUX_DISK_1: reading from backup piece /tmp/stdbyctl.bkp  
         
          channel ORA_AUX_DISK_1: piece handle=/tmp/stdbyctl.bkp tag=TAG20131224T133449  
         
          channel ORA_AUX_DISK_1: restored backup piece  
          1  
         
          channel ORA_AUX_DISK_1: restore complete, elapsed time:  
          00 
          : 
          00 
          : 
          08  
         
          output file name=+RECO/stddb/controlfile/current. 
          259.835018533  
         
          Finished restore at  
          24 
          -DEC- 
          13 
         
          contents of Memory Script: 
         
          {  
         
          sql clone  
          'alter database mount standby database' 
          ;  
         
          }  
         
          executing Memory Script 
         
          sql statement: alter database mount standby database 
         
          contents of Memory Script: 
         
          {  
         
          set  
          newname  
          for  
          tempfile   
          1  
          to  
         
          "+RECO/stddb/temp01.dbf" 
          ;  
         
          switch  
          clone tempfile all;  
         
          set  
          newname  
          for  
          datafile   
          1  
          to  
         
          "+RECO/stddb/system01.dbf" 
          ;  
         
          set  
          newname  
          for  
          datafile   
          2  
          to  
         
          "+RECO/stddb/sysaux01.dbf" 
          ;  
         
          set  
          newname  
          for  
          datafile   
          3  
          to  
         
          "+RECO/stddb/undotbs01.dbf" 
          ;  
         
          set  
          newname  
          for  
          datafile   
          4  
          to  
         
          "+RECO/stddb/users01.dbf" 
          ;  
         
          restore  
         
          clone database  
         
          ;  
         
          }  
         
          executing Memory Script 
         
          executing command: SET NEWNAME 
         
          renamed tempfile  
          1  
          to +RECO/stddb/temp01.dbf  
          in  
          control file 
         
          executing command: SET NEWNAME 
         
          executing command: SET NEWNAME 
         
          executing command: SET NEWNAME 
         
          executing command: SET NEWNAME 
         
          Starting restore at  
          24 
          -DEC- 
          13 
         
          using channel ORA_AUX_DISK_1 
         
          channel ORA_AUX_DISK_1: starting datafile backup  
          set  
          restore 
         
          channel ORA_AUX_DISK_1: specifying datafile(s) to restore from backup  
          set  
         
          channel ORA_AUX_DISK_1: restoring datafile  
          00001  
          to +RECO/stddb/system01.dbf  
         
          channel ORA_AUX_DISK_1: restoring datafile  
          00002  
          to +RECO/stddb/sysaux01.dbf  
         
          channel ORA_AUX_DISK_1: restoring datafile  
          00003  
          to +RECO/stddb/undotbs01.dbf  
         
          channel ORA_AUX_DISK_1: restoring datafile  
          00004  
          to +RECO/stddb/users01.dbf  
         
          channel ORA_AUX_DISK_1: reading from backup piece /tmp/bk_01osanei_1_1  
         
          channel ORA_AUX_DISK_1: piece handle=/tmp/bk_01osanei_1_1 tag=TAG20131224T132953  
         
          channel ORA_AUX_DISK_1: restored backup piece  
          1  
         
          channel ORA_AUX_DISK_1: restore complete, elapsed time:  
          00 
          : 
          04 
          : 
          27  
         
          Finished restore at  
          24 
          -DEC- 
          13 
         
          contents of Memory Script: 
         
          {  
         
          switch  
          clone datafile all;  
         
          }  
         
          executing Memory Script 
         
          datafile  
          1  
          switched to datafile copy 
         
          input datafile copy RECID= 
          1  
          STAMP= 
          835019033  
          file name=+RECO/stddb/system01.dbf  
         
          datafile  
          2  
          switched to datafile copy  
         
          input datafile copy RECID= 
          2  
          STAMP= 
          835019033  
          file name=+RECO/stddb/sysaux01.dbf  
         
          datafile  
          3  
          switched to datafile copy  
         
          input datafile copy RECID= 
          3  
          STAMP= 
          835019033  
          file name=+RECO/stddb/undotbs01.dbf  
         
          datafile  
          4  
          switched to datafile copy  
         
          input datafile copy RECID= 
          4  
          STAMP= 
          835019033  
          file name=+RECO/stddb/users01.dbf  
         
          ORACLE error from auxiliary database: ORA- 
          01511 
          : error  
          in  
          renaming log/data files      <<<<<<<<<<<<<<<<<<<<<<<<  
         
          ORA- 
          01275 
          : Operation RENAME  
          is  
          not allowed  
          if  
          standby file management  
          is  
          automatic.    <<<<<<<<<<<<<<<<<<<<<<<< 
         
          RMAN- 
          05535 
          : WARNING: All redo log files were not defined properly. 
         
          ORACLE error from auxiliary database: ORA- 
          01511 
          : error  
          in  
          renaming log/data files  
         
          ORA- 
          01275 
          : Operation RENAME  
          is  
          not allowed  
          if  
          standby file management  
          is  
          automatic. 
         
          RMAN- 
          05535 
          : WARNING: All redo log files were not defined properly. 
         
          ORACLE error from auxiliary database: ORA- 
          01511 
          : error  
          in  
          renaming log/data files  
         
          ORA- 
          01275 
          : Operation RENAME  
          is  
          not allowed  
          if  
          standby file management  
          is  
          automatic. 
         
          RMAN- 
          05535 
          : WARNING: All redo log files were not defined properly. 
         
          Finished Duplicate Db at  
          24 
          -DEC- 
          13  
         
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

后手动修改当standby_file_management为MANUAL模式，可以duplicate并且rename成功，并rename正broken0~2.
（这个是我在pfile中有这个standby_file_management=auto）

 
          checkpoint  
          is  
          977553 
         
          last deallocation scn  
          is  
          959057  
         
          Undo Optimization current scn  
          is  
          974128  
         
          Tue Dec  
          24  
          15 
          : 
          16 
          : 
          57  
          2013  
         
          Switch of datafile  
          1  
          complete to datafile copy  
         
          checkpoint  
          is  
          977553  
         
          Switch of datafile  
          2  
          complete to datafile copy  
         
          checkpoint  
          is  
          977553  
         
          Switch of datafile  
          3  
          complete to datafile copy  
         
          checkpoint  
          is  
          977553  
         
          Switch of datafile  
          4  
          complete to datafile copy  
         
          checkpoint  
          is  
          977553  
         
          alter database rename file  
          '+DATA/db/redo01.log'  
          to  
          'broken0'  
         
          Completed: alter database rename file  
          '+DATA/db/redo01.log'  
          to  
          'broken0'    
          <<<<<<<<  
         
          alter database rename file  
          '+DATA/db/redo02.log'  
          to  
          'broken1'  
         
          Completed: alter database rename file  
          '+DATA/db/redo02.log'  
          to  
          'broken1'  
         
          alter database rename file  
          '+DATA/db/redo03.log'  
          to  
          'broken2'  
         
          Completed: alter database rename file  
          '+DATA/db/redo03.log'  
          to  
          'broken2'  
         
          RFS connections are allowed

在执行ALTER DATABASE RECOVER...的时候，依然需要去清空redo，只不过，应经将redo rename了，所以不会影响主库的redo log了

 
     
          ~~~~~~~~~~~~~~~~~~~~~ 
         
 
          Tue Dec  
          24  
          15 
          : 
          45 
          : 
          37  
          2013  
         
 
          ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION  
         
 
          Attempt to start background Managed Standby Recovery process (stddb)  
         
 
          Tue Dec  
          24  
          15 
          : 
          45 
          : 
          37  
          2013  
         
 
          MRP0 started  
          with  
          pid= 
          21 
          , OS id= 
          21343  
         
 
          MRP0: Background Managed Standby Recovery process started (stddb)  
         
 
            
          started logmerger process  
         
 
          Tue Dec  
          24  
          15 
          : 
          45 
          : 
          43  
          2013  
         
 
          Managed Standby Recovery not using Real Time Apply  
         
 
          Parallel Media Recovery started  
          with  
          2  
          slaves  
         
 
          Waiting  
          for  
          all non-current ORLs to be archived...  
         
 
          All non-current ORLs have been archived.  
         
 
          Errors  
          in  
          file /u01/app/oracle/product/ 
          11.2 
          . 
          0 
          /dbhome_1/log/diag/rdbms/stddb/stddb/ 
          trace 
          /stddb_mrp0_21343.trc:  
         
 
          ORA- 
          00313 
          : open failed  
          for  
          members of log group  
          1  
          of thread  
          1  
         
 
          ORA- 
          00312 
          : online log  
          1  
          thread  
          1 
          :  
          '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken0'      
          <<<<<<<<<<<<<<<<  
         
 
          ORA- 
          27037 
          : unable to obtain file status  
         
 
          Linux-x86_64 Error:  
          2 
          : No such file or directory  
         
 
          Additional information:  
          3  
         
 
          ......  
         
 
          Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION  
         
 
          Clearing online redo logfile  
          1  
          complete  
         
 
          Errors  
          in  
          file /u01/app/oracle/product/ 
          11.2 
          . 
          0 
          /dbhome_1/log/diag/rdbms/stddb/stddb/ 
          trace 
          /stddb_mrp0_21343.trc:  
         
 
          ORA- 
          00313 
          : open failed  
          for  
          members of log group  
          2  
          of thread  
          1  
         
 
          ORA- 
          00312 
          : online log  
          2  
          thread  
          1 
          :  
          '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken1'  
         
 
          ORA- 
          27037 
          : unable to obtain file status  
         
 
          Linux-x86_64 Error:  
          2 
          : No such file or directory  
         
 
          Additional information:  
          3  
         
 
          ......  
         
 
          Tue Dec  
          24  
          15 
          : 
          45 
          : 
          48  
          2013  
         
 
          Clearing online redo logfile  
          2  
          complete  
         
 
          Errors  
          in  
          file /u01/app/oracle/product/ 
          11.2 
          . 
          0 
          /dbhome_1/log/diag/rdbms/stddb/stddb/ 
          trace 
          /stddb_mrp0_21343.trc:  
         
 
          ORA- 
          00313 
          : open failed  
          for  
          members of log group  
          3  
          of thread  
          1  
         
 
          ORA- 
          00312 
          : online log  
          3  
          thread  
          1 
          :  
          '/u01/app/oracle/product/11.2.0/dbhome_1/dbs/broken2'  
         
 
          ORA- 
          27037 
          : unable to obtain file status  
         
 
          Linux-x86_64 Error:  
          2 
          : No such file or directory  
         
 
          Additional information:  
          3  
         
 
          ......  
         
 
          Clearing online redo logfile  
          3  
          complete  
         
 
          Media Recovery Waiting  
          for  
          thread  
          1  
          sequence  
          4 
         
 
   

总结：由于之前的DG搭建，都是在两个不同的主机上实施，也没有关心过这个问题，因为standby清空primary redo的动作一定是失败的。但是现在是在一个主机上，standby能发现原primary的redo，并且由于rename失败，所以就直接clear primary redo导致primary 无法启动。
而root cause就是我在编辑pfile的时候，多写了standby_file_management=auto，其实不写的话，默认值是manual的，就不会出现这个问题。