Oracle adg进行switchover主备切换后,ogg如何恢复?

8 篇文章 0 订阅
3 篇文章 0 订阅
作者介绍:老苏,10余年DBA工作运维经验,擅长Oracle、MySQL、PG、Mongodb数据库运维(如安装迁移,性能优化、故障应急处理等)
公众号:老苏畅谈运维
欢迎关注本人公众号,更多精彩与您分享。

Oracle主备进行switchover切换时,原先在主库运行的OGG进程是否会异常,如何进行恢复?

环境说明

主机名IP地址实例名数据库角色OGG角色数据库版本OGG版本
ora11g10.10.10.160ora11g主库源端11.2.0.419.1.0.0.4
ora11gdg10.10.10.41dgora11g备库/11.2.0.4/
11gbbed10.10.10.44dbbbed/目标端11.2.0.419.1.0.0.4

说明:10.10.10.160为主库,当前为OGG源端,10.10.10.41是10.10.10.160的备库,10.10.10.44为OGG的目标端。

switchover切换后OGG状态

发生switchover切换前 10.10.10.160为主库,当前为OGG源端,上面有进程抽取和投递进程:EXTCS、DPCS。

发生switchover切换后10.10.10.160为备库,10.10.10.41变为主库,那么10.10.10.160上面的EXTCS、DPCS能正常工作吗?关于如何进行switchover切换,可以参考之前的文档,下面我们切换完后看一下他们进程状态:

--有一张测试表test_job,后台存储过程每个1分钟自动插入数据。
sys@ora11g> select max(C_DATE) from szr.test_job;

MAX(C_DATE)
-------------------
2024-08-19 15:34:24

--切换完进程状态
GGSCI (ora11g) 12> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DPCS        00:00:00      00:00:02    
EXTRACT     ABENDED     EXTCS       00:00:01      00:10:14    

发生主备switchover切换后,在主库的ogg源端extract进程会abend,报错如下:

2024-08-19T15:48:19.056+0800  INFO    OGG-06604  Oracle GoldenGate Capture for Oracle, extcs.prm:  Database ORA11G CPU info: CPU Count 2, CPU Core Count 2, CPU Socket Count 2.
2024-08-19T15:48:19.056+0800  WARNING OGG-00727  Oracle GoldenGate Capture for Oracle, extcs.prm:  Switch extract to archived log only mode on physical standby database.
2024-08-19T15:48:19.056+0800  WARNING OGG-01830  Oracle GoldenGate Capture for Oracle, extcs.prm:  LOGRETENTION is disabled by default in ARCHIVEDLOGONLY mode.
2024-08-19T15:48:19.154+0800  ERROR   OGG-00060  Oracle GoldenGate Capture for Oracle, extcs.prm:  Extract requires a value specified for parameter ALTARCHIVELOGDEST when in archived log only mode.
2024-08-19T15:48:19.159+0800  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, extcs.prm:  PROCESS ABENDING.
2024-08-19T15:48:20.973+0800  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): info all.

恢复方式1

默认无法在ADG环境抽取redo,解决方案是在抽取进程中添加参数:

TRANLOGOPTIONS MINEFROMACTIVEDG

--参考MOS文档,注意OGG版本需在 12.1.2.1.0 及以后才有这个参数
Does GoldenGate Support Extracting Redo from Active Data Guard? (Doc ID 1299805.1)
 Does GoldenGate support extract reading redo from generated from an active data guard?
 Yes, this is supported in 12.1.2.1.0 (but not previous version) GoldenGate classic extract, with parameter:
 TRANLOGOPTIONS MINEFROMACTIVEDG
 For versions older than 12.1.2.1.0 the below error can occur:
 ERROR   OGG-00303  Unrecognized option (MINEFROMACTIVEDG) for TRANLOGOPTIONS.
 So Goldengate must be upgraded to 12.1.2.1.0 or later versions to resolve it.
GGSCI (ora11g) 15> edit param extcs

EXTRACT extcs
userid ggadmin,password ggadmin123
TRANLOGOPTIONS  dblogreader
TRANLOGOPTIONS MINEFROMACTIVEDG
LOGALLSUPCOLS
UPDATERECORDFORMAT compact
GETTRUNCATES
GETUPDATEBEFORES
NOCOMPRESSDELETES
NOCOMPRESSUPDATES
DBOPTIONS ALLOWUNUSEDCOLUMN
TRANLOGOPTIONS EXCLUDEUSER ggadmin
FETCHOPTIONS NOUSESNAPSHOT
EXTTRAIL ./dirdat/cs
DISCARDFILE ./dirrpt/extcs.dsc, APPEND, MEGABYTES 1024
discardrollover at 3:00
warnlongtrans 1h,checkinterval 5m
CACHEMGR CACHESIZE 1024MB, CACHEDIRECTORY ./dirtmp
REPORTCOUNT EVERY 60 SECONDS, RATE
DDL INCLUDE MAPPED
--DDLOPTIONS ADDTRANDATA
DDLOPTIONS REPORT
TABLE SZR.*;

DDLOPTIONS ADDTRANDATA 需要注释掉,否则会报错:

DDLOPTIONS ADDTRANDATA is not supported on read only database

上述参数改完后,还会遇到下列错误:

2024-08-19T16:01:23.948+0800  INFO    OGG-01517  Oracle GoldenGate Capture for Oracle, extcs.prm:  Position of first record processed Sequence 342, RBA 625168, SCN 0.3730950 (3730950), 2024 M08 19 15:35:00.
2024-08-19T16:01:23.949+0800  ERROR   OGG-02803  Oracle GoldenGate Capture for Oracle, extcs.prm:  Encountered a Data Guard role transition. Alter Extract to SCN 3,751,013 and restart Extract, or recreate Extract with the correct number of threads at SCN 3,751,013.
2024-08-19T16:01:23.949+0800  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, extcs.prm:  PROCESS ABENDING.
2024-08-19T16:01:24.913+0800  INFO    OGG-02232  Oracle GoldenGate Capture for Oracle, dpcs.prm:  Switching to next trail file /ogg/dirdat/cs000000003 at 2024-08-19 16:01:24.909392 due to EOF. with current RBA 1,793,635.
2024-08-19T16:01:25.014+0800  INFO    OGG-00987  Oracle GoldenGate Command Interpreter for Oracle:  GGSCI command (oracle): info all.

3751013是备库变成主库时的scn,可以通过原备库(新主库)的alert日志看到SwitchOver after complete recovery through change 3751013,或者通过v$database视图的列STANDBY_BECAME_PRIMARY_SCN获取。

解决方法:

--修改extract的捕获scn为报错显示的scn
GGSCI (ora11g) 22> ALTER EXTRACT extcs  scn 3751013;
EXTRACT altered.

--重新启动
GGSCI (ora11g) 23> start extcs

Sending START request to MANAGER ...
EXTRACT EXTCS starting


GGSCI (ora11g) 24> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DPCS        00:00:00      00:00:08    
EXTRACT     RUNNING     EXTCS       00:00:00      00:00:00    

查看目标端复制进程:

GGSCI (11gbbed) 24> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     REPCS       00:00:00      00:00:02   

SQL> select max(C_DATE) from szr.test_job;

MAX(C_DATE)
-------------------
2024-08-19 16:16:18

复制进程正常,测试表的数据也发生变化了。

恢复方式2

上面的恢复方法是修改原有的OGG配置,那么能不能把OGG拷贝到切换的后的主库上去,正常跑起来呢?也就是主备切换之后的新主库进行抽取,在新主库上配置新的抽取进程,看一下具体操作过程:

–关闭源端所有ogg进程

GGSCI (ora11g) 35> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DPCS        00:00:00      00:00:08    
EXTRACT     ABENDED     EXTCS       00:00:00      00:02:16    


GGSCI (ora11g) 36> stop *

Sending STOP request to EXTRACT DPCS ...
Request processed.
EXTRACT EXTCS is already stopped.


GGSCI (ora11g) 37> stop mgr
Manager process is required by other GGS processes.
Are you sure you want to stop it (y/n)?y

Sending STOP request to MANAGER ...
Request processed.
Manager stopped.

GGSCI (ora11g) 38> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           
EXTRACT     STOPPED     DPCS        00:00:00      00:00:45    
EXTRACT     ABENDED     EXTCS       00:00:00      00:03:10  

sys@ora11g> select max(C_DATE) from szr.test_job;

MAX(C_DATE)
-------------------
2024-08-19 16:33:37

–安装目录拷贝到新主库(10.10.10.41)

--拷贝到相同路径下
[oracle@ora11g:/ogg]$ scp -r * 10.10.10.41:/ogg
PS:注意属主,权限

–新主库添加环境变量

$ vi .bash_profile
###OGG_HOME
export OGG_HOME=/ogg
export PATH=$ORACLE_HOME/bin:$PATH:$ORACLE_HOME/OPatch:$OGG_HOME

–如果两边的数据库安装目录,实例名配置不一样,则修改配置文件里面的配置ORACLE_HOME,ORACLE_SID
–启动进程

GGSCI (ora11gdg) 4> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     STOPPED                                           
EXTRACT     STOPPED     DPCS        00:00:00      00:07:40    
EXTRACT     ABENDED     EXTCS       00:00:00      00:10:05    


GGSCI (ora11gdg) 5> start mgr
Manager started.


GGSCI (ora11gdg) 6> start *

Sending START request to MANAGER ...
EXTRACT DPCS starting

Sending START request to MANAGER ...
EXTRACT EXTCS starting
GGSCI (ora11gdg) 9> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DPCS        00:00:00      00:00:09    
EXTRACT     ABENDED     EXTCS       00:00:00      00:10:38  

EXTCS启动不起来,报错如下:

2024-08-19 16:46:50  ERROR   OGG-02803  Encountered a Data Guard role transition. Alter Extract to SCN 3,793,698 and restart Extract, or recreate Extract with the correct number of threads at SCN 3,793,698.

***********************************************************************
*                   ** Run Time Statistics **                         *
***********************************************************************

按之前的方法进行处理:

--修改extract的捕获scn为报错显示的scn
GGSCI (ora11gdg) 15>  ALTER EXTRACT extcs  scn 3793698;
EXTRACT altered.

 
GGSCI (ora11gdg) 16> start extcs

Sending START request to MANAGER ...
EXTRACT EXTCS starting


GGSCI (ora11gdg) 17> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
EXTRACT     RUNNING     DPCS        00:00:00      00:00:05    
EXTRACT     RUNNING     EXTCS       00:00:00      00:00:05    

在这里插入图片描述

抽取进程启动正常后,看一下复制进程和测试表的数据变化

GGSCI (11gbbed) 26> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING                                           
REPLICAT    RUNNING     REPCS       00:00:04      00:00:06    

SQL> select sysdate,max(C_DATE) from szr.test_job;

SYSDATE 	    MAX(C_DATE)
------------------- -------------------
2024-08-19 16:50:14 2024-08-19 16:49:32

可以看到,数据同步正常。

总结:switchover切换,OGG抽取进程进程需要进行处理,才能使其恢复正常工作。

关注我,学习更多的数据库知识!
请添加图片描述

  • 15
    点赞
  • 11
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

老苏畅谈运维

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值