3 RAC to Single-Instance Replication Configuration

3.1 Environment Overview

Role     IP                              OS                     Oracle version
Source   10.123.112.201/10.123.112.202   Linux RHEL 5, 64-bit   10.2.0.1
Target   10.123.112.235                  Linux RHEL 5, 32-bit   10.2.0.1


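3.2 Installing the OCFS2 Cluster File System on the Source

For high availability in a RAC environment, OGG must be installed on a cluster file system so that it can be reached from every RAC node. This test uses OCFS2.

Download the OCFS2 RPMs that match the Linux kernel from http://oss.oracle.com.

On Linux, run uname -r to check the kernel version, for example:

[oracle@node2 ocfs]$ uname -r
2.6.18-92.el5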


Install the OCFS2 RPMs as root:

[root@node1 ocfs]# rpm -ivh ocfs2-tools-1.2.7-1.el5.x86_64.rpm \
    ocfs2console-1.2.7-1.el5.x86_64.rpm \
    ocfs2-2.6.18-92.el5-1.2.9-1.el5.x86_64.rpm
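The kernel-module package has to match the running kernel exactly, so it can help to build the expected file name before downloading. The sketch below only follows the naming pattern of the package installed above; the function name ocfs2_module_rpm and the module version passed to it are illustrative assumptions, not something the OCFS2 tools provide.

```shell
#!/bin/sh
# Build the file name of the OCFS2 kernel-module RPM for a given kernel.
# Pattern taken from the package installed above:
#   ocfs2-<kernel release>-<module version>.<arch>.rpm
# ocfs2_module_rpm is a hypothetical helper name.
ocfs2_module_rpm() {
    kernel_release=$1   # e.g. the output of `uname -r`
    module_version=$2   # e.g. 1.2.9-1.el5
    arch=$3             # e.g. x86_64
    printf 'ocfs2-%s-%s.%s.rpm\n' "$kernel_release" "$module_version" "$arch"
}

# File name to look for on this host; the module version is an assumed example.
ocfs2_module_rpm "$(uname -r)" "1.2.9-1.el5" "$(uname -m)"
```

For the kernel shown above (2.6.18-92.el5 on x86_64) this prints ocfs2-2.6.18-92.el5-1.2.9-1.el5.x86_64.rpm, the package used in the rpm command.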

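Open the OCFS2 console:

[root@node1 ~]# ocfs2console

In the window that appears, choose [Cluster]-[Configure Nodes]. In the "Node Configuration" dialog, enter the node name, IP address, and port number of each of the two private-interconnect nodes, then choose [Cluster]-[Propagate Cluster Configuration] and wait for the "Finished" message.

After configuration, the console lists the configured nodes.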


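Run the following commands as root on every node in the cluster:
   export PATH=$PATH:/sbin:/usr/sbin
   /etc/init.d/o2cb enable

Create the OCFS2 file system. The -N option sets the maximum number of nodes that may use the file system at the same time:

# mkfs -t ocfs2 -N 2 /dev/sdh1

Mount the partition:

# mount /dev/sdh1 /ggate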


Configure automatic loading at boot (all nodes):
   export PATH=$PATH:/sbin:/usr/sbin
   chkconfig --add o2cb
   /etc/init.d/o2cb configure

Add the following to /etc/rc.local:

chown -R oracle:dba /ggate
chmod -R 775 /ggate
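Before installing GoldenGate it is worth confirming on each node that /ggate is really the OCFS2 mount and not an empty local directory. The following is a minimal sketch that reads the kernel mount table; the function name is_ocfs2_mount and its optional second argument are assumptions made for illustration.

```shell
#!/bin/sh
# Return 0 if the given mount point is mounted with file system type ocfs2.
# The optional second argument (a file in /proc/mounts format) is a
# hypothetical hook for testing; it defaults to /proc/mounts.
is_ocfs2_mount() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    [ -r "$mounts_file" ] || return 1
    awk -v mp="$mount_point" '
        $2 == mp && $3 == "ocfs2" { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$mounts_file"
}

if is_ocfs2_mount /ggate; then
    echo "/ggate is mounted as ocfs2"
else
    echo "/ggate is NOT mounted as ocfs2" >&2
fi
```

Running the same check on both RAC nodes gives a quick indication that the shared directory is visible cluster-wide.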


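3.3 Installing GoldenGate on the Source

Unpack the installation files in the GoldenGate installation directory (the OCFS2 directory /ggate):

unzip ogg112101_fbo_ggs_Linux_x64_ora10g_64bit.zip
tar -xvf fbo_ggs_Linux_x64_ora10g_64bit.tar

Set the environment variables

Add the following to the user's profile file:

export GGATE_HOME=/ggate
export LD_LIBRARY_PATH=$GGATE_HOME:$ORACLE_HOME/lib

Note: after adding these lines, reload the profile so that the settings take effect.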


Install GoldenGate

From the installation directory, run ./ggsci to enter the OGG console, then create the OGG working directories.

Run the create subdirs command; the output looks like this:

GGSCI (node1) 1> create subdirs

Creating subdirectories under current directory /ggate

Parameter files                /ggate/dirprm: already exists
Report files                   /ggate/dirrpt: created
Checkpoint files               /ggate/dirchk: created
Process status files           /ggate/dirpcs: created
SQL script files               /ggate/dirsql: created
Database definitions files     /ggate/dirdef: created
Extract data files             /ggate/dirdat: created
Temporary files                /ggate/dirtmp: created
Stdout files                   /ggate/dirout: created
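Because the installation lives on shared storage, each node should see the same set of working directories. The check below is a small sketch that lists any directory from the output above that is missing; missing_ogg_dirs is a hypothetical helper name, and the fallback path /ggate is the installation directory used in this walkthrough.

```shell
#!/bin/sh
# Print every GoldenGate working directory that is missing under the given
# installation directory. The list mirrors the "create subdirs" output.
# missing_ogg_dirs is a hypothetical helper name.
missing_ogg_dirs() {
    ogg_home=$1
    for d in dirprm dirrpt dirchk dirpcs dirsql dirdef dirdat dirtmp dirout; do
        [ -d "$ogg_home/$d" ] || echo "$ogg_home/$d"
    done
}

# GGATE_HOME is the variable set in the profile; /ggate is the fallback.
missing=$(missing_ogg_dirs "${GGATE_HOME:-/ggate}")
if [ -z "$missing" ]; then
    echo "all working directories are present"
else
    echo "missing working directories:"
    echo "$missing"
fi
```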


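3.4 Installing GoldenGate on the Target

The procedure is the same as in section 3.3; only the installation location differs, so the steps are omitted here. Make sure the installation package matches the target platform (32-bit Linux in this environment).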


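3.5 Configuring the Source Database

Database mode

The source database must run in archive log mode. The statement must be issued while the database is mounted but not open:

ALTER DATABASE ARCHIVELOG;

Enable minimal supplemental logging:

ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

To check whether minimal supplemental logging is enabled, run:

SELECT SUPPLEMENTAL_LOG_DATA_MIN FROM V$DATABASE;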


Create the GoldenGate database user on the source and grant its privileges (ogg is used here as an example; any other name works):

create user ogg identified by oracle default tablespace DATA_OL;
grant connect, resource, unlimited tablespace to ogg;
grant execute on utl_file to ogg;
grant select any dictionary, select any table to ogg;
grant alter any table to ogg;
grant flashback any table to ogg;
grant execute on DBMS_FLASHBACK to ogg;

Add table-level trandata

GGSCI (node1) 1> dblogin userid ogg, password oracle
Successfully logged into database.

GGSCI (node1) 2> add trandata SCOTT.DEPT
Logging of supplemental redo data enabled for table SCOTT.DEPT.

GGSCI (node1) 3> add trandata SCOTT.EMP
Logging of supplemental redo data enabled for table SCOTT.EMP.


3.6 Configuring the Source Process Group

Configure the manager process (mgr)

GGSCI (node1) 1> edit param mgr

(Paste the following configuration)

PORT 7839
DYNAMICPORTLIST 7840-7939
--AUTOSTART ER *
AUTORESTART EXTRACT *, RETRIES 5, WAITMINUTES 3
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPDAYS 3
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45

The parameters have the same meaning as in the single-instance configuration; see section 3.5.

Start the manager process:

GGSCI (node1) 2> start mgr
Manager started.

GGSCI (node1) 3> info all

Program     Status     Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING


Configure the extract process:

GGSCI (node1) 6> add extract extnd, tranlog, begin now, threads 2
EXTRACT added.

GGSCI (node1) 7> add exttrail ./dirdat/nd, extract extnd, megabytes 100
EXTTRAIL added.

GGSCI (node1) 8> edit params extnd

(Paste the following configuration)

EXTRACT extnd
SETENV (NLS_LANG = "AMERICAN_AMERICA.UTF8")
SETENV (ORACLE_HOME = "/u01/app/oracle/product/10.2.0/db_1")
USERID ogg@RAC, PASSWORD oracle
--GETTRUNCATES
REPORTCOUNT EVERY 1 MINUTES, RATE
DISCARDFILE ./dirrpt/extnd.dsc, APPEND, MEGABYTES 1024
--THREADOPTIONS MAXCOMMITPROPAGATIONDELAY 60000 IOLATENCY 60000
DBOPTIONS ALLOWUNUSEDCOLUMN
WARNLONGTRANS 2h, CHECKINTERVAL 3m
EXTTRAIL ./dirdat/nd
--TRANLOGOPTIONS EXCLUDEUSER USERNAME
FETCHOPTIONS NOUSESNAPSHOT
TRANLOGOPTIONS CONVERTUCS2CLOBS
TABLE scott.dept;
TABLE scott.emp;

Note: set THREADS to the number of RAC nodes. In a RAC environment ORACLE_SID is no longer set; connect with USERID ogg@RAC instead, and make sure the database can be reached through this alias from both nodes.
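The THREADS value can be derived from the cluster membership instead of being typed by hand. The sketch below assumes that every cluster node runs one database instance (one redo thread per node) and that the node names arrive one per line, which is the format printed by the Clusterware olsnodes utility; add_extract_command is a hypothetical helper name.

```shell
#!/bin/sh
# Build the ADD EXTRACT command with THREADS equal to the number of cluster
# nodes. Node names are read from stdin, one per line; blank lines are
# ignored. Assumption: one database instance (redo thread) per node.
# add_extract_command is a hypothetical helper name.
add_extract_command() {
    extract_name=$1
    node_count=$(grep -c '[^[:space:]]' || true)
    echo "add extract $extract_name, tranlog, begin now, threads $node_count"
}

# On a RAC node one would run:  olsnodes | add_extract_command extnd
# Here the two node names of this environment are supplied directly.
printf 'node1\nnode2\n' | add_extract_command extnd
```

With the two nodes of this environment the function prints the same command that was entered in GGSCI above, with threads 2.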


Add the data pump process and configure its parameters

GGSCI (node1) 2> add extract dpend, exttrailsource ./dirdat/nd
EXTRACT added.

GGSCI (node1) 3> add rmttrail /uo1/app/ogg/dirdat/nd, EXTRACT DPEND
RMTTRAIL added.

GGSCI (node1) 4> edit params dpend

(Paste the following configuration)

EXTRACT dpend
SETENV (NLS_LANG = AMERICAN_AMERICA.UTF8)
USERID ogg@RAC, PASSWORD oracle
PASSTHRU
RMTHOST 10.123.112.235, MGRPORT 7839, COMPRESS
RMTTRAIL /uo1/app/ogg/dirdat/nd
TABLE scott.dept;
TABLE scott.emp;


3.7 Configuring the Target Database

Create the GoldenGate database user on the target and grant its privileges:

create user ogg identified by oracle default tablespace USERS;
grant connect, resource, unlimited tablespace to ogg;
grant execute on utl_file to ogg;
grant select any dictionary, select any table to ogg;
grant alter any table to ogg;
grant flashback any table to ogg;
grant execute on DBMS_FLASHBACK to ogg;
grant insert any table to ogg;
grant delete any table to ogg;
grant update any table to ogg;


Add the checkpoint table

GGSCI (sun.linux) 2> edit params GLOBALS

Then enter the following in the parameter file:

GGSCHEMA ogg
CHECKPOINTTABLE ogg.checkpoint

GGSCI (sun.linux) 4> dblogin userid ogg, password oracle
Successfully logged into database.

GGSCI (sun.linux) 5> add checkpointtable ogg.checkpoint
Successfully created checkpoint table ogg.checkpoint.


3.8 Configuring the Target Process Group

Configure the MGR parameters

GGSCI (sun.linux) 6> edit params mgr

(Paste the following configuration)

PORT 7839
DYNAMICPORTLIST 7840-7939
--AUTOSTART ER *
AUTORESTART EXTRACT *, RETRIES 5, WAITMINUTES 3
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPDAYS 3
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45


Configure the replicat process

GGSCI (sun.linux) 8> add replicat repnd, exttrail /uo1/app/ogg/dirdat/nd, checkpointtable ogg.checkpoint
REPLICAT added.

GGSCI (sun.linux) 10> edit params repnd

(Paste the following configuration)

REPLICAT repnd
SETENV (NLS_LANG = AMERICAN_AMERICA.UTF8)
USERID ogg, PASSWORD oracle
ASSUMETARGETDEFS
REPERROR DEFAULT, DISCARD
DISCARDFILE ./dirrpt/repnd.dsc, APPEND, MEGABYTES 50
MAP scott.*, TARGET pmsbi.*;


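3.9 Starting the Processes and Synchronizing Data

Start the source process group

Start the extract process and the data pump process:

start extnd
start dpend

After starting them, check the process status with info all. The status should be RUNNING, as shown below:

GGSCI (node1) 19> info all

Program     Status     Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING    DPEND       00:00:00      00:00:09
EXTRACT     RUNNING    EXTND       00:00:00      00:00:04

Start the target process

start repnd

The status is shown below:

GGSCI (sun.linux) 2> info all

Program     Status     Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING    REPND       00:00:00      00:00:03

This completes the installation and configuration of OGG replication from RAC to a single instance; data synchronization can now be tested.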
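The status check can be scripted so that it is easy to repeat during testing. The sketch below filters info all output for processes that are not RUNNING; the function name not_running is an assumption, and the ABENDED line in the sample input is a hypothetical status added for illustration, not output from this environment.

```shell
#!/bin/sh
# Read GGSCI "info all" output on stdin and print "<program> <group> <status>"
# for every process that is not RUNNING. The exit status is 0 only when all
# listed processes are RUNNING. not_running is a hypothetical helper name.
not_running() {
    awk '
        $1 == "MANAGER" && $2 != "RUNNING" { print $1, "-", $2; bad = 1 }
        ($1 == "EXTRACT" || $1 == "REPLICAT") && $2 != "RUNNING" { print $1, $3, $2; bad = 1 }
        END { exit(bad ? 1 : 0) }
    '
}

# Typical use from the GoldenGate home (path assumed to be /ggate):
#   echo "info all" | /ggate/ggsci | not_running
# Sample input with a hypothetical ABENDED extract:
not_running <<'EOF' || echo "at least one process needs attention"
Program     Status     Group       Lag at Chkpt  Time Since Chkpt
MANAGER     RUNNING
EXTRACT     RUNNING    DPEND       00:00:00      00:00:09
EXTRACT     ABENDED    EXTND       00:00:00      00:00:04
EOF
```

Using the exit status rather than only the printed text makes the function easy to call from a monitoring script.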


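4 HA Configuration for RAC to Single-Instance Replication

The RAC to single-instance configuration in section 3 only provides data replication and does not include high availability. If the node running GoldenGate fails, replication stops. There are two ways to keep replication available:

4.1 Handling a Node Failure Manually

Because GoldenGate is installed in a shared directory, any node can access that directory and start GGSCI. If one node fails and the GoldenGate processes stop, simply start the process group manually on another node.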


4.2 GoldenGate HA Configuration

CRS can manage GoldenGate as a resource group, with GoldenGate reached through a VIP on the RAC cluster. If one database node goes down, Oracle Clusterware automatically fails the resources over to another available node.

Add an application VIP resource

Create a profile for the GoldenGate VIP resource:

[oracle@node1 ggate]$ cd $ORA_CRS_HOME/bin
[oracle@node1 bin]$ pwd
/u01/app/oracle/product/10.2.0/crs_1/bin
[oracle@node1 bin]$ crs_profile -create ggvip -t application \
    -a /u01/app/oracle/product/10.2.0/crs_1 \
    -o oi=eth0,ov=192.168.73.203,on=255.255.255.0

Here ggvip is the name of the application VIP being created.

Register the resource with CRS:

[oracle@node1 bin]$ crs_register ggvip

Give ownership of the VIP to root. As root, run:

[root@node1 bin]# ./crs_setperm ggvip -o root

Grant the oracle user permission to start the resource:

[root@node1 bin]# ./crs_setperm ggvip -u user:oracle:r-x

Start the resource as the oracle user:

[oracle@node1 bin]$ crs_start ggvip
Attempting to start `ggvip` on member `node1`
Start of `ggvip` on member `node1` succeeded.

Check the resource status:

[oracle@node1 bin]$ crs_stat ggvip -t
Name           Type           Target    State     Host
------------------------------------------------------------
ggvip          application    ONLINE    ONLINE    node1

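Create an action program

Here the action program is placed on the shared disk. It must accept at least three arguments: start, stop, and check.

start, stop: return 0 on success, 1 on failure;
check: return 0 if GoldenGate is running, 1 if it is not.

The sample program gg_action.scr is shown below: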


#!/bin/sh
#set the Oracle GoldenGate installation directory
export GGS_HOME=/ggate
#set the oracle home to the database to ensure GoldenGate will get the
#right environment settings to be able to connect to the database
export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
#specify delay after start before checking for successful start
start_delay_secs=5
#Include the GoldenGate home in the library path to start GGSCI
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:${GGS_HOME}:${LD_LIBRARY_PATH}

#check_process validates that a manager process is running at the PID
#that GoldenGate specifies.
check_process () {
if [ -f "${GGS_HOME}/dirpcs/MGR.pcm" ]
then
 pid=`cut -f8 "${GGS_HOME}/dirpcs/MGR.pcm"`
 #ps pads short PIDs with a leading space, so the PID may be field 2 or field 1
 if [ "${pid}" = "$(ps -e | grep "${pid}" | grep mgr | cut -d " " -f2)" ]
 then
   #manager process is running on the PID, exit success
   exit 0
 else
  if [ "${pid}" = "$(ps -e | grep "${pid}" | grep mgr | cut -d " " -f1)" ]
  then
    #manager process is running on the PID, exit success
    exit 0
  else
    #manager process is not running on the PID
    exit 1
  fi
 fi
else
 #manager is not running because there is no PID file
 exit 1
fi
}

#call_ggsci is a generic routine that executes a ggsci command
#the here-document terminator must start in column 1
call_ggsci () {
 ggsci_command=$1
 ggsci_output=`${GGS_HOME}/ggsci << EOF
${ggsci_command}
exit
EOF
`
}

case $1 in
'start')
 #start manager
 call_ggsci 'start manager'
 #there is a small delay between issuing the start manager command
 #and the process being spawned on the OS. wait before checking
 sleep ${start_delay_secs}
 #check whether manager is running and exit accordingly
 check_process
 ;;
'stop')
 #attempt a clean stop for all non-manager processes
 #call_ggsci 'stop er *'
 #ensure everything is stopped
 call_ggsci 'stop er *!'
 #call_ggsci 'kill er *'
 #stop manager without (y/n) confirmation
 call_ggsci 'stop manager!'
 #exit success
 exit 0
 ;;
'check')
 check_process
 ;;
'clean')
 #attempt a clean stop for all non-manager processes
 #call_ggsci 'stop er *'
 #ensure everything is stopped
 #call_ggsci 'stop er *!'
 #in case there are lingering processes
 call_ggsci 'kill er *'
 #stop manager without (y/n) confirmation
 call_ggsci 'stop manager!'
 #exit success
 exit 0
 ;;
'abort')
 #ensure everything is stopped
 call_ggsci 'stop er *!'
 #in case there are lingering processes
 call_ggsci 'kill er *'
 #stop manager without (y/n) confirmation
 call_ggsci 'stop manager!'
 #exit success
 exit 0
 ;;
esac

Add an application profile

[oracle@node1 ggate]$ cd $ORA_CRS_HOME/bin
[oracle@node1 bin]$ pwd
/u01/app/oracle/product/10.2.0/crs_1/bin
[oracle@node1 bin]$ crs_profile -create GG_app -t application \
    -r ggvip -a /ggate/gg_action.scr -o ci=10

Where:
-r ggvip means that ggvip must be running before GoldenGate starts;
-a /ggate/gg_action.scr gives the location of the action script, which must be available on every node;
-o ci=10 sets the check interval to 10 seconds.

Register the resource with CRS:

[oracle@node1 bin]$ crs_register GG_app


Give ownership of the GG_app resource to the oracle user. As root, run:

[root@node1 bin]# ./crs_setperm GG_app -o oracle

Grant the oracle user permission to start the resource:

[root@node1 bin]# ./crs_setperm GG_app -u user:oracle:r-x

Start the resource as the oracle user:

[oracle@node1 bin]$ crs_start GG_app
Attempting to start `GG_app` on member `node1`
Start of `GG_app` on member `node1` succeeded.

Check the resource status:

[oracle@node1 bin]$ crs_stat GG_app -t
Name           Type           Target    State     Host
------------------------------------------------------------
GG_app         application    ONLINE    ONLINE    node1


Test node relocation

In a test environment, crs_relocate -f GG_app forces the resource to move to another node. The process looks like this:

[oracle@node1 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
GG_app         application    ONLINE    ONLINE    node1
ggvip          application    ONLINE    ONLINE    node1
ora....AC1.srv application    ONLINE    ONLINE    node1
ora....AC2.srv application    ONLINE    ONLINE    node2
ora.RAC.RAC.cs application    ONLINE    ONLINE    node2
ora....C1.inst application    ONLINE    ONLINE    node1
ora....C2.inst application    ONLINE    ONLINE    node2
ora.RAC.db     application    ONLINE    ONLINE    node1
ora....E1.lsnr application    ONLINE    ONLINE    node1
ora.node1.gsd  application    ONLINE    ONLINE    node1
ora.node1.ons  application    ONLINE    ONLINE    node1
ora.node1.vip  application    ONLINE    ONLINE    node1
ora....E2.lsnr application    ONLINE    ONLINE    node2
ora.node2.gsd  application    ONLINE    ONLINE    node2
ora.node2.ons  application    ONLINE    ONLINE    node2
ora.node2.vip  application    ONLINE    ONLINE    node2

[oracle@node1 ~]$ crs_relocate -f GG_app
Attempting to stop `GG_app` on member `node1`
Stop of `GG_app` on member `node1` succeeded.
Attempting to stop `ggvip` on member `node1`
Stop of `ggvip` on member `node1` succeeded.
Attempting to start `ggvip` on member `node2`
Start of `ggvip` on member `node2` succeeded.
Attempting to start `GG_app` on member `node2`
Start of `GG_app` on member `node2` succeeded.

[oracle@node1 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
GG_app         application    ONLINE    ONLINE    node2
ggvip          application    ONLINE    ONLINE    node2
ora....AC1.srv application    ONLINE    ONLINE    node1
ora....AC2.srv application    ONLINE    ONLINE    node2
ora.RAC.RAC.cs application    ONLINE    ONLINE    node2
ora....C1.inst application    ONLINE    ONLINE    node1
ora....C2.inst application    ONLINE    ONLINE    node2
ora.RAC.db     application    ONLINE    ONLINE    node1
ora....E1.lsnr application    ONLINE    ONLINE    node1
ora.node1.gsd  application    ONLINE    ONLINE    node1
ora.node1.ons  application    ONLINE    ONLINE    node1
ora.node1.vip  application    ONLINE    ONLINE    node1
ora....E2.lsnr application    ONLINE    ONLINE    node2
ora.node2.gsd  application    ONLINE    ONLINE    node2
ora.node2.ons  application    ONLINE    ONLINE    node2
ora.node2.vip  application    ONLINE    ONLINE    node2

GoldenGate has been relocated successfully and now runs on node 2.
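When repeating the relocation test, reading the full crs_stat listing by eye is error-prone. A short sketch such as the following pulls out the host of a single resource from the five-column layout shown above (Name, Type, Target, State, Host); resource_host is a hypothetical helper name.

```shell
#!/bin/sh
# Read "crs_stat -t" output on stdin and print the host on which the named
# resource is currently ONLINE (prints nothing if it is not online).
# Columns assumed: Name Type Target State Host.
# resource_host is a hypothetical helper name.
resource_host() {
    awk -v name="$1" '$1 == name && $4 == "ONLINE" { print $5 }'
}

# After a relocation both resources are expected on the same node:
#   crs_stat -t | resource_host GG_app
#   crs_stat -t | resource_host ggvip
```

Comparing the two results before and after crs_relocate confirms that the VIP and the GoldenGate application moved together.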


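5 Common Errors and Solutions

5.1 OGG-00446

When the source extract process extnd is started, ggserr.log shows the following error:

2012-08-17 11:11:38  ERROR   OGG-00446  Oracle GoldenGate Capture for Oracle, extnd.prm:  Could not find archived log for sequence 45835 thread 1 under default destinations SQL , error retrieving redo file name for sequence 45835, archived = 1, use_alternate = 0 Not able to establish initial position for begin time 2012-08-15 17:28:28.

Cause: older archived logs were deleted or moved to backup, so the archived log file cannot be found.

Resolution:

Restore the backed-up archived logs to the archive log directory; this resolves the error.

On a test database, the extract process can instead be told to start reading from a given point in time, skipping the deleted archived logs: alter extract extnd, begin 2012-08-16 16:38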
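To know which archived log has to be restored, the sequence and thread numbers can be pulled out of the OGG-00446 message. This is a minimal sketch based only on the message wording shown above; missing_archive is a hypothetical helper name, and /ggate/ggserr.log assumes the installation directory used in this walkthrough.

```shell
#!/bin/sh
# Read OGG-00446 messages on stdin and print "<sequence> <thread>" of the
# archived log that could not be found.
# missing_archive is a hypothetical helper name.
missing_archive() {
    sed -n 's/.*archived log for sequence \([0-9][0-9]*\) thread \([0-9][0-9]*\).*/\1 \2/p'
}

# Typical use against the error log in the GoldenGate home:
#   grep OGG-00446 /ggate/ggserr.log | missing_archive
```

For the message above the function prints 45835 1, that is, sequence 45835 of redo thread 1.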


5.2 OGG-01223

When the source data pump process DPEND is started, ggserr.log shows the following error:

2012-08-17 11:43:50  WARNING OGG-01223  Oracle GoldenGate Capture for Oracle, dpend.prm:  TCP/IP error 79 (Connection refused).
2012-08-17 11:45:01  WARNING OGG-01223  Oracle GoldenGate Capture for Oracle, dpend.prm:  TCP/IP error 79 (Connection refused).

Cause: the MGR process on the target (host 110) was not running.

Resolution:

Run start mgr on the target, then start the source data pump process DPEND again. The error disappears and the trail files are transferred.

A normal log looks like this:

2012-08-17 14:31:51  INFO    OGG-00993  Oracle GoldenGate Capture for Oracle, dpend.prm:  EXTRACT DPEND started.
2012-08-17 14:33:13  INFO    OGG-01226  Oracle GoldenGate Capture for Oracle, dpend.prm:  Socket buffer size set to 27985 (flush size 27985).
2012-08-17 14:33:26  INFO    OGG-01052  Oracle GoldenGate Capture for Oracle, dpend.prm:  No recovery is required for target file F:\ogg\dirdat\nd000000, at RBA 0 (file not opened).
2012-08-17 14:33:26  INFO    OGG-01478  Oracle GoldenGate Capture for Oracle, dpend.prm:  Output file F:\ogg\dirdat\nd is using format RELEASE 11.2.


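5.3 OGG-01224

When the source data pump process DPEND is started, ggserr.log shows the following error:

2012-08-22 05:33:10  ERROR   OGG-01224  Oracle GoldenGate Capture for Oracle, dpend.prm:  TCP/IP error 113 (No route to host).
2012-08-22 05:33:10  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, dpend.prm:  PROCESS ABENDING.

Cause: the firewall on the target (host 235) was still enabled.

Resolution:

Turn off the firewall on the target machine, then start the source data pump process DPEND again. The error disappears and the trail files are transferred.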


5.4 OGG-01031

When the source data pump process DPEND is started, ggserr.log shows the following error:

2012-08-28 15:09:39  ERROR   OGG-01031  Oracle GoldenGate Capture for Oracle, dpend.prm:  There is a problem in network communication, a remote file problem, encryption keys for target and source do not match (if using ENCRYPT) or an unknown error. (Reply received is Unable to open file "/uo1/app/ogg/dirdat/nd000004" (error 2, No such file or directory)).
2012-08-28 15:09:41  ERROR   OGG-01668  Oracle GoldenGate Capture for Oracle, dpend.prm:  PROCESS ABENDING.

The target ggserr.log shows the following:

2012-08-28 15:06:30  WARNING OGG-01223  Oracle GoldenGate Collector for Oracle:  Unable to lock file "/uo1/app/ogg/dirdat/nd000004" (error 11, Resource temporarily unavailable).  Lock currently held by process id (PID) 13854.
2012-08-28 15:06:30  WARNING OGG-01223  Oracle GoldenGate Collector for Oracle:  Unable to open file "/uo1/app/ogg/dirdat/nd000004" (error 2, No such file or directory).

Cause: most likely a network failure cut the connection between the source Data Pump process and the target while the server process that the target mgr had started for it kept running. When the data pump restarts, the target mgr tries to start another server process, and the two processes compete for the same trail file.

Resolution:

1. Stop all data pumps on the source. On the target, use ps -ef | grep server (or grep for the OGG installation directory) to see whether an OGG server process is still running, and kill it if so. Make sure that every source data pump is stopped and that only the server process is killed, not extract, replicat, mgr, or any other process. Then restart the source data pump.

2. The trail file on the target may be damaged. Roll over to a new trail file:

SEND EXTRACT xxx ETROLLOVER

or: alter extract xxx etrollover

where xxx is the name of the data pump.
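Because step 1 must not touch extract, replicat, or mgr, it is safer to filter the process list by executable name than to scan grep output by eye. The sketch below assumes Linux ps -ef output, where the command is the eighth column, and assumes the collector executable is named server; collector_pids is a hypothetical helper name. Any unrelated program that is also called server would match, so each PID should still be checked before it is killed.

```shell
#!/bin/sh
# Read "ps -ef" output on stdin and print the PIDs of processes whose
# executable is named "server", for example "./server" or
# "/uo1/app/ogg/server". Processes such as extract, replicat and mgr are
# never printed. Assumes the command is the 8th column (Linux ps -ef).
# collector_pids is a hypothetical helper name.
collector_pids() {
    awk '{ n = split($8, parts, "/"); if (parts[n] == "server") print $2 }'
}

# Typical use on the target, after all source data pumps are stopped:
#   ps -ef | collector_pids
```

The function only lists candidates; the kill itself is left as a deliberate manual step.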


5.5 OGG-01154

Error message:

2011-03-29 15:53:57  WARNING OGG-01154  Oracle GoldenGate Delivery for Oracle, repya.prm:  SQL error 14402 mapping EPMA.D_METER to EPMA.D_METER OCI Error ORA-14402: updating partition key column would cause a partition change (status = 14402), SQL .

Cause: the source updated a partition key column, but row movement is not enabled on the target table, so the update fails.

Resolution: enable row movement on the target table:

SQL> alter table SCHEMA.TABLENAME enable row movement;