在前面的示例:
Oracle Golden Gate 系列六 --11gR2 Ora2Ora 单向复制GG 示例
http://blog.csdn.net/tianlesoftware/article/details/6950018
中简单的描述了一下GG 的同步,就是先配置好GG,然后进行同步,但实际情况下,是Source 端的数据已经存在,这种情况下就要先进行initial data load,使Source 和 Target 的同步对象上Data 一致,然后在进行GG的online change synchronization。
在GG的官方文档上对initial data load 有一章节进行说明,鉴于我之前的几篇Blog都是基于理论的多,所以以后的blog会偏重于理论加示例的方式进行。理论的知识还是要先看,没有理论,其他都是空谈。
一.官网有关数据装载的说明
详细内容可以参考Administrator'sGuide 中的Running aninitial data load 小节。
You can use Oracle GoldenGate to:
(1)Perform a standalone batch loadto populate database tables for migration or other purposes.
(2)Load data into database tablesas part of an initial synchronization run in preparation for changesynchronization with Oracle GoldenGate.
The initial loadcan be performed from an active source database. Users and applications canaccess and update data while the load is running. You can perform initial loadfrom a quiesced source database if you delay access to the source tables untilthe target load is completed.
1.1 Using parallel processing in an initial load
For all initialload methods except those performed with a database utility, you can load largedatabases more quickly by using parallel Oracle GoldenGate processes.
To use parallel processing
1. Follow the directions in this chapterfor creating an initial-load Extract and an initialload Replicat for each setof parallel processes that you want to use.
2. With the TABLE and MAP parameters,specify a different set of tables for each pair of Extract-Replicat processes,or you can use the SQLPREDICATE option of TABLE to partition the rows of largetables among the different Extract processes.
1.2 Prerequisites for initial load
1.2.1 Disable DDL processing
Before executingan initial load, disable DDL extraction and replication. DDL processing iscontrolled by the DDL parameter in the Extract and Replicat parameter files.
之前整理的Blog中的第四节有说明:
Oracle Golden Gate 系列五 --GG 使用配置 说明
http://blog.csdn.net/tianlesoftware/article/details/6947973
SQL> @ddl_disable.sql; --> disable the DDL trigger.
SQL> @ddl_enable.sql; --> enable the DDL trigger.
1.2.2 Prepare the target tables
The followingare suggestions that can make the load go faster and help you to avoid errors.
--以下建议可以增加load 速度和避免错误。
(1)Data: Make certain that the targettables are empty. Otherwise, there may be duplicate-row errors or conflictsbetween existing rows and rows that are being loaded.
--确保target 表是空的,否则可能会出现重复行或者行冲突。
(2)Constraints: Disable foreign-keyconstraints and check constraints. Foreign-key constraints can cause errors,and check constraints can slow down the loading process. Constraints can bereactivated after the load concludes successfully.
--禁用外键约束和check 约束,外键约束可能导致错误,而check 约束会到时load速度变慢。 约束可以待我们数据load 完成之后在激活。
(3) Indexes: Remove indexes from thetarget tables. Indexes are not necessary for inserts. They will slow down theloading process significantly. For each row that is inserted into a table, thedatabase will update every index on that table. You can add back the indexesafter the load is finished.
--移除target 表上的索引。 Target 表上的索引会增加load的时间。 如果有索引,在维护表的同时还需要维护索引的信息。 待数据load 完毕之后在重建索引,重建的速度要比维护快很多。
NOTE:
A primary indexis required for all applications that access DB2 for z/OS target tables. Youcan delete all other indexes from the target tables, except for the primaryindex.
--DB2上的primary 索引是必须的。
(4)Keys: To use the HANDLECOLLISIONSfunction to reconcile incremental data changes with the load, each target tablemust have a primary or unique key. If you cannot create a key through yourapplication, use the KEYCOLS option of the TABLE and MAP parameters to specifycolumns as a substitute key for Oracle GoldenGate’s purposes. A key helps identifywhich row to process. If you cannot create keys, the source database must be quiescedfor the load.
--使用HANDLECOLLISIONS 参数可以保证load的一致性,但是该函数要求target table 必须有primary 或者unique key。 如果没有,可以使用KEYCOLS 选项来指定. 有关该参数的具体作用,在后面讲direct load 时在说明。
1.2.3 Configure theManager process
On the sourceand target systems, configure and start a Manager process. One Manager can beused for the initial-load processes and the change-synchronization processes.
Oracle Golden Gate 系列七 -- 配置 GGManager process
http://blog.csdn.net/tianlesoftware/article/details/6953745
1.2.4 Create adata-definitions file
Adata-definitions file is required if the source and target databases havedissimilar definitions. Oracle GoldenGate uses this file to convert the data tothe format required by the target database.
--当source 和 target database 同步的表结构不一样时,可使用配置data-definitions file,GG 使用该文件转换data到target database 需要的格式。
1.2.5 Create change-synchronizationgroups
NOTE:
If the load isperformed from a quiet source database and will not be followed by continuouschange synchronization, you can omit these groups.
--如果使用quiet load,则可以忽略一下内容。但是对于direct load,还是需要仔细的看一下。
To prepare forthe capture and replication of transactional changes during the initial load, createonline Extract and Replicat groups. You will start these groups during the loadprocedure. See the instructions in this documentation that are appropriate forthe type of replication configuration that you will be using.
Do not start theExtract or Replicat groups until instructed to do so in the initial-load instructions.Change synchronization keeps track of transactional changes while the load isbeing applied, and then the target tables are reconciled with those changes.
--在我们load 之前,先创建Extract 和Replicat 进程,当我们进行load 时,也需要运行这2个进程。
Extrace 进程会在我们load 期间抓取变化的内容,等我们load 结束,在进行同步。
NOTE:
The first timethat Extract starts in a new Oracle GoldenGate configuration, any opentransactions will be skipped. Only transactions that begin after Extract startsare captured.
If the sourcedatabase will remain active during the initial load, include the HANDLECOLLISIONSparameter in the Replicat parameter file; otherwise do not use it.
--如果我们Source database 在load 期间还是激活的,那么必须在Replicat 参数里添加HANDLECOLLISIONS 参数。 否则不要使用。
HANDLECOLLISIONSaccounts for collisions that occur during the overlap of time between the initialload and the ongoing change replication. It reconciles insert operations forwhich the row already exists, and it reconciles update and delete operationsfor which the row does not exist.
--HANDLECOLLISIONS 控制initial load 和 changereplication 之间的一致性,如insert 时,记录已经存在,update 和delete 时 记录已经不存在。
It can be used in these ways:
(1)globally for all tables in aparameter file
(2)as an on/off toggle for groupsof tables
(3)within MAP statements to enableor disable the error handling for specific table pairs.
1.2.6 Sharing parameters between process groups
Some of theparameters that you use in a change-synchronization parameter file also are requiredin an initial-load Extract and initial-load Replicat parameter file. You cancopy those parameters from one parameter file to another, or you can store themin a central file and use the OBEY parameter in each parameter file to retrievethem. Alternatively, you can create an Oracle GoldenGate macro for the sharedparameters and then call the macro from each parameter file with the MACROparameter.
二.Loading data with an Oracle GoldenGate direct load
官网提供的Load 方法有如下几种:
(1)Loading data with a databaseutility
(2)Loading data from file to Replicat
(3)Loading data from file to databaseutility
(4)Loading data with an OracleGoldenGate direct load
(5)Loading data with a direct bulkload to SQL*Loader
(6)Loading data with Teradata loadutilities
这里只看direct load,其他的参考官方文档。
To use an OracleGoldenGate direct load, you run an Oracle GoldenGate initial-load Extract toextract the source records and send them directly to an initial-load Replicattask.
--为了使用direct load,必须先运行initial-load Extract进程,将source 段的记录抽取出来,然后将记录发送到target 端的initial-load Replicattask.
A task isstarted dynamically by the Manager process and does not require the use of a Collectorprocess or file. The initial-load Replicat task delivers the load in largeblocks to the target database. Transformation and mapping can be done byExtract, Replicat, or both. During the load, the change-synchronization groupsextract and replicate incremental changes, which are then reconciled with theresults of the load.
--这里要注意的事,initial-load Replicat task 是由ManagerProcess 控制动态启动的。 不需要人工的启动这个进程。 控制参数在Source 端的Extract 参数里配置,参数是:rmttask。
NOTE:
This method doesnot support extraction of LOB or LONG data. As an alternative, see “Loadingdata from file to Replicat” or “Loading data from file to database utility”.
--Direct Load 不支持LOB和LONG data,如果有这两种类型,可以考虑使用file toReplicat 或者 File to database utility。
You can controlwhich port is used by Replicat by specifying the DYNAMICPORTLIST parameter inthe Manager parameter file. When starting a process such as Replicat, Managerfirst looks for a port defined with DYNAMICPORTLIST. If no ports are listed,Manager chooses a port number by incrementing from its own port number until aport is available.
--可以通过DYNAMICPORTLIST参数控制Replicat 使用的端口。如:
DYNAMICPORTLIST 7802-7820
--注意这里如果分配的端口端少于extract-replicat进程对的话,会导致部分进程因通讯失败而出错。
OracleGoldenGate direct load does not support tables that have columns that contain LOBs,LONGs, user-defined types (UDT), or any other large data type that is greaterthan 4 KB in size.
To load data with anOracle GoldenGate direct load:
1. Make certain to satisfy “Prerequisitesfor initial load” 。
2. On the source and target systems, runGGSCI and start Manager.
START MANAGER
NOTE:
In a Windows cluster, start the Manager resource from the Cluster Administrator.
3. On the source, issue the followingcommand to create the initial-load Extract.
ADD EXTRACT<initial-load Extract name>, SOURCEISTABLE
Where:
(1) <initial-load Extract name> is the name of the initial-loadExtract, up to eight characters. –Extract 名字最多8个字符。
(2) SOURCEISTABLE designates Extract as an initial-load process thatreads complete records directly from the source tables. Do not use any of theother ADD EXTRACT service options or datasource arguments.
--标记该Extract进程是一个initial-load进程,不要和其他的参数一起使用。
4. On the source system, issue thefollowing command to create an initial-load Extract parameter file.
EDIT PARAMS<initial-load Extract name>
5. Enter the parameters listed in Table 33in the order shown, starting a new line for each parameter statement.
Table 33 Initial-load Extract parametersfor Oracle GoldenGate direct load
6. Enter any appropriate optional Extractparameters listed in the Oracle GoldenGate Windows and UNIX Reference Guide.
7. Save and close the file.
---Target System
8. On the target system, issue thefollowing command to create the initial-load Replicat task.
ADD REPLICAT<initial-load Replicat name>, SPECIALRUN
Where:
(1) <initial-load Replicat name>is the name of the initial-load Replicat task.
(2) SPECIALRUN identifies theinitial-load Replicat as a one-time run, not a continuous process.
–注意这个参数作用,表示一次性加载完成。
9. On the target system, issue thefollowing command to create an initial-load Replicat parameter file.
EDIT PARAMS<initial-load Replicat name>
10. Enter the parameters listed in Table 34in the order shown, starting a new line for each parameter statement.
Table 34 Initial-load Replicat parametersfor Oracle GoldenGate direct load
11. Enter any appropriate optional Replicatparameters listed in the Oracle GoldenGate Windows and UNIX Reference Guide.
12. Save and close the parameter file.
13. On the source system, start changeextraction.
START EXTRACT<Extract group name>
14. (Oracle, if replicating sequences)Issue the DBLOGIN command as the user who has EXECUTE privilege onupdate.Sequence.
GGSCI>DBLOGIN USERID DBLOGINuser, PASSWORD password
15. (Oracle, if replicating sequences)Issue the following command to update each source sequence and generate redo.From the redo, Replicat performs initial synchronization of the sequences onthe target. You can use an asterisk wildcard for any or all characters in thename of a sequence (but not the owner).
FLUSH SEQUENCE<owner.sequence>
16. On the source system, start theinitial-load Extract.
START EXTRACT<initial-load Extract name>
NOTE:
Do not start theinitial-load Replicat. The Manager process starts it automatically
and terminates it when the load is finished.
--注意这里,不要在Target 端启动initial-load replicat 进程,它是自动启动,并在load 完成时自动中断。
17. On the target system, issue thefollowing command to find out if the load is finished. Wait until the load isfinished before going to the next step.
VIEW REPORT<initial-load Extract name>
--这里要注意,等load 完成之后才可以进行其他的操作。
18. On the target system, start changereplication.
START REPLICAT<Replicat group name>
19. On the target system, issue thefollowing command to verify the status of change
replication.
INFO REPLICAT<Replicat group name>
20. Continue to issue the INFO REPLICATcommand until you have verified that Replicat posted all of the change datathat was generated during the initial load. For example, if the initial-loadExtract stopped at 12:05, make sure Replicat posted data up to that point.
21. On the target system, issue thefollowing command to turn off the HANDLECOLLISIONS parameter and disable theinitial-load error handling.
SEND REPLICAT<Replicat group name>, NOHANDLECOLLISIONS
--关闭该参数,该参数的作用,前面已经说过。
22. On the target system, edit the Replicatparameter file to remove the HANDLECOLLISIONS parameter. This preventsHANDLECOLLISIONS from being enabled again the next time Replicat starts.
EDIT PARAMS<Replicat group name>
23. Save and close the parameter file. Fromthis point forward, Oracle GoldenGate continues to synchronize data changes.
三.Load 说明
在第二节中看了direct load,因为相对与其他的方法而言,这是比较直观的一种方法,其他几种官方介绍的初始化方式要么需要借助其他数据库工具(如extract->SQL*Loader),要么中间走了完全没必要的步骤导致性能很差(如extract->file->replicat方式),都不算纯正的GoldenGate方式。
Direct Load架构:
上图中,显示了初始化加载启用了两条同步路线:
(1)上面一条是真正的initial load,负责将源数据端的数据一次性发送到目标数据库;
(2)下面一条,就是普通的GoldenGate同步进程,负责抓取初始化加载时源端数据库进行的在线数据变化。
在实际应用中,往往需要在生产库(源数据库)不停机的状态下,将数据加载到备用数据库(目标数据库)中并应用实时同步,在数据初始化的过程中,生产库将继续进行正常的事务操作,所以此时需要有抓取进程在初始化时开始将这些变化捕获,以免数据丢失。
实际部署时需要注意正确的执行顺序,大致可以分为以下几步:
(1) 源端和目标端创建配置各个同步进程。
(2) 开启源端同步抓取进程(图上的Change Extract),开始捕获变化。
(3) 开启初始化进程(图上的Initial-Load Extract),开始数据初始化加载。
(4) 等初始化加载结束,开启目标端复制应用进程(图上的ChangeReplicat),开始实时同步应用。
在目标端复制应用进程(图上的Change Replicat)中,需要在参数文件中配置HANDLECOLLISIONS参数,以避免重复应用第2和第3步之间的数据变化,因为这部分数据已经包含在初始化加载中传到目标数据库中了。
该参数在前面的准备工作中有说明:
To use theHANDLECOLLISIONS function to reconcile incremental data changes with the load,each target table must have a primary or unique key.
注意:
GoldenGate的初始化同步不会也不需要去初始化目标端的SCN号。GoldenGate的同步与Streams不同,它不需要依赖两端数据库保持一致的SCN来应用同步,实际上它只在抓取时可能会与数据库的SCN有关联(抓取时可以指定源数据库的特定SCN号开始解析日志),在trail传输以及目标端应用时,都和源端数据库的SCN毫无关系。它的实质是通过自己的一套队列文件检查点机制来实现队列数据的管理,在目标端则通过数据的唯一键来定位数据行,trail文件最终解析成SQL语句在目标端数据库执行而实现数据的应用。
所以这里的初始化加载,完全可以使用其他数据库工具来实现,比如说exp/imp、SQL*Loader、RMAN复制数据库等。
也正式因为如此,所以在实际使用中,尽量使用其他高效的数据库传输工具来完成初始化加载,而不要用GoldenGate提高的初始化功能。
网上朋友的一个测试:对600万的表进行初始化,使用direct load 需要30分钟,而impdp 仅1分30秒。
exp/imp 与 expdp/impdp 对比 及使用中的一些优化事项
http://blog.csdn.net/tianlesoftware/article/details/6093973
http://blog.csdn.net/tianlesoftware/article/details/6260138
http://blog.csdn.net/tianlesoftware/article/details/4698642
四.DirectLoad 示例
4.1 准备工作
4.1.1 测试数据是一张270万的分区表,信息如下:
SQL> select count(1) from pdba;
COUNT(1)
----------
2713235
SQL> desc pdba
NameNull? Type
------------------------------------------------- ----------------------------
ID NOT NULL NUMBER
TIME NOT NULLDATE
SQL> select table_name,partition_name from user_tab_partitions where table_name='PDBA';
TABLE_NAME PARTITION_NAME
------------------------------ ------------------------------
PDBA P1
PDBA P2
PDBA P3
PDBA P4
4.1.2 禁用DDL,因为我之前的测试环境已经启用了DDL
gg1:/home/oracle> cd /u01/ggate/
--一定要进入GG的根目录
gg1:/u01/ggate> sqlplus / as sysdba;
SQL*Plus: Release 11.2.0.3.0 Production onTue Nov 15 19:59:41 2011
SQL> @ddl_disable.sql
Trigger altered.
4.1.3 在Target端创建目标表
这里我直接使用Toad 获取表的SQL 之后,在target database上执行了一下。
其他的准备工作在上次实验中已经完成。
4.2 使用GG Direct Load 方式初始化
这里的一些准备工作就跳过,如果归档设置等,不做说明,可以参考:
Oracle Golden Gate 系列六 --11gR2 Ora2Ora 单向复制GG 示例
http://blog.csdn.net/tianlesoftware/article/details/6950018
4.2.1 Source 端操作:
(1) 添加 initial-load Extract 提取进程
GGSCI (gg1) 5> info all
ProgramStatus Group Lag Time Since Chkpt
MANAGERSTOPPED
EXTRACTABENDED EXT1 00:00:00 143:13:06
GGSCI (gg1) 6> start manager
Manager started.
GGSCI (gg1) 2> add extract ext2,sourceistable
EXTRACT added.
--没有tranlog,意味着不是通过日志方式;没有begin XXX,表示还未启动;使用sourceistable参数不会使用检查点机制
GGSCI (gg1) 3> info all
ProgramStatus Group Lag Time Since Chkpt
MANAGERSTOPPED
EXTRACTABENDED EXT1 00:00:00 143:10:18
GGSCI (gg1) 11> info extractext2
EXTRACTEXT2 Initialized 2011-11-15 20:09 Status STOPPED
Checkpoint LagNot Available
Log Read CheckpointNot Available
First Record Record 0
TaskSOURCEISTABLE
修改ext2的参数如下:
GGSCI (gg1) 12> view params ext2
extract ext2
userid ggate@gg1, password ggate
rmthost 192.168.3.200, mgrport 7809
rmttask replicat, group rept2
--注意是rmttask,指定目标复制进程名,直接管理target 端的进程自动启动。
table dave.pdba;
(2)增加一个抽取online log的Extract
这里我直接使用之前的ext1,相关的创建命令如下:
GGSCI (gg1) 11>add extractext1,tranlog, begin now
GGSCI (gg1) 12>add exttrail/u01/ggate/dirdat/lt, extract ext1
GGSCI (gg1) 14> view params ext1
extract ext1
userid ggate@gg1, password ggate
rmthost gg2, mgrport 7809
rmttrail /u01/ggate/dirdat/lt
table dave.pdba;
4.2.2 Target 端操作:
(1)添加direct load 对应的Replicat 进程
GGSCI (gg2) 4> add replicat rept2,specialrun
REPLICAT added.
-- specialrun 参数表示一次性加载。
修改rept2参数如下:
GGSCI (gg2) 10> view params rept2
replicat rept2
ASSUMETARGETDEFS
userid ggate@gg2,password ggate
reperror default, discard
discardfile/u01/ggate/dirdat/rep2_discard.txt, append, megabytes 100
BATCHSQL
INSERTAPPEND
MAP dave.pdba, TARGET dave.pdba;
注意:
这里的extract和replicat进程添加完后在info all中看不到这个进程,但是view report和 info extract name/info replicat name可以跟踪到。
(2)添加online change的Replicat 进程及checkpoint table
这里还是使用之前的rep1 进程,相关命令如下:
--在Target 端添加checkpoint表:
GGSCI (gg2) 6> edit params ./GLOBAL
GGSCHEMA ggate
CHECKPOINTTABLE ggate.checkpoint
GGSCI (gg2) 12> dblogin useridggate@gg2,password ggate
Successfully logged into database.
GGSCI (gg2) 13> add checkpointtableggate.checkpoint
Successfully created checkpointtableGGATE.CHECKPOINT.
--创建同步队列
GGSCI (gg2) 14> add replicatrep1,exttrail /u01/ggate/dirdat/lt, checkpointtable ggate.checkpoint
REPLICAT added.
修改相关的参数:
GGSCI (gg2) 11> view params rep1
replicat rep1
ASSUMETARGETDEFS
userid ggate@gg2,password ggate
discardfile/u01/ggate/dirdat/rep1_discard.txt, append, megabytes 10
HANDLECOLLISIONS --当目标端已有数据时,略过重复数据错误
map dave.pdba, target dave.pdba;
4.2.3 开始初始化
(1)On the source system, start changeextraction.
GGSCI (gg1) 17> START EXTRACT ext1
Sending START request to MANAGER ...
EXTRACT EXT1 starting
(2)On the source system, start theinitial-load Extract.
GGSCI (gg1) 21> START EXTRACText2
Sending START request to MANAGER ...
EXTRACT EXT2 starting
(3)On the target system, issue thefollowing command to find out if the load is finished. Waituntil the load is finished before going to the next step.
VIEW REPORT<initial-load Extract name>
GGSCI (gg1) 34> info extract ext2
EXTRACTEXT2 LastStarted 2011-11-16 11:10 StatusRUNNING
Checkpoint Lag Not Available
Log Read Checkpoint Table DAVE.PDBA
2011-11-1611:14:18 Record 1610801
Task SOURCEISTABLE
GGSCI (gg1) 36> info extract ext2
EXTRACTEXT2 Last Started 2011-11-1611:10 Status RUNNING
Checkpoint Lag Not Available
Log Read Checkpoint Table DAVE.PDBA
2011-11-16 11:15:18 Record 2035567
Task SOURCEISTABLE
通过以上2个对比,1分钟近40w条数据。
GGSCI (gg1) 37> view report ext2
2011-11-16 11:09:43 INFOOGG-01017 Wildcard resolution setto IMMEDIATE because
SOURCEISTABLE is used.
***********************************************************************
Oracle GoldenGate Capture forOracle
Version 11.1.1.1 OGGCORE_11.1.1_PLATFORMS_110421.2040
Linux, x64, 64bit (optimized), Oracle 11g on Apr 30 2011 18:52:51
Copyright (C) 1995, 2011, Oracle and/or itsaffiliates. All rights reserved.
Starting at 2011-11-1611:09:43
***********************************************************************
Operating System Version:
Linux
Version #1 SMP Tue Aug 18 15:59:52 EDT2009, Release 2.6.18-164.el5xen
Node: gg1
Machine: x86_64
soft limit hard limit
Address Space Size :unlimited unlimited
Heap Size :unlimited unlimited
File Size :unlimited unlimited
CPU Time : unlimitedunlimited
Process id: 4804
Description:
***********************************************************************
** Running with the followingparameters **
***********************************************************************
extract ext2
userid ggate@gg1, password *****
rmthost 192.168.3.200, mgrport 7809
rmttask replicat, group rept2
table dave.pdba;
2011-11-16 11:09:55 WARNING OGG-00869 No unique key is defined for table PDBA. All
viable columns will be used to represent thekey, but may not guarantee uniqueness.
KEYCOLS may be used to define the key.
Using the following key columns for sourcetable DAVE.PDBA: ID, TIME.
CACHEMGR virtual memory values (may havebeen adjusted)
CACHEBUFFERSIZE: 64K
CACHESIZE: 8G
CACHEBUFFERSIZE (soft max): 4M
CACHEPAGEOUTSIZE (normal): 4M
PROCESS VM AVAIL FROM OS (min): 16G
CACHESIZEMAX (strict force to disk): 13.99G
Database Version:
Oracle Database 11g Enterprise EditionRelease 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 -Production
NLSRTL Version 11.2.0.3.0 - Production
Database Language and Character Set:
NLS_LANG ="AMERICAN_AMERICA.zhs16gbk"
NLS_LANGUAGE = "AMERICAN"
NLS_TERRITORY = "AMERICA"
NLS_CHARACTERSET = "ZHS16GBK"
Processing table DAVE.PDBA
确认操作结束:
GGSCI (gg2) 25> info replicat rept2
REPLICATREPT2 Initialized 2011-11-15 20:26 Status STOPPED
Checkpoint Lag 00:00:00 (updated 14:50:41 ago)
Log Read Checkpoint Not Available
Task SPECIALRUN
GGSCI (gg1) 39> view report ext2
2011-11-16 11:09:43 INFO OGG-01017Wildcard resolution set to IMMEDIATE because
SOURCEISTABLE is used.
***********************************************************************
Oracle GoldenGate Capture forOracle
Version 11.1.1.1 OGGCORE_11.1.1_PLATFORMS_110421.2040
Linux, x64, 64bit (optimized), Oracle 11g on Apr 30 2011 18:52:51
Copyright (C) 1995, 2011, Oracle and/or itsaffiliates. All rights reserved.
Starting at 2011-11-1611:09:43
***********************************************************************
Operating System Version:
Linux
Version #1 SMP Tue Aug 18 15:59:52 EDT2009, Release 2.6.18-164.el5xen
Node: gg1
Machine: x86_64
soft limit hard limit
Address Space Size :unlimited unlimited
Heap Size :unlimited unlimited
File Size :unlimited unlimited
CPU Time : unlimitedunlimited
Process id: 4804
Description:
***********************************************************************
** Running with the followingparameters **
***********************************************************************
extract ext2
userid ggate@gg1, password *****
rmthost 192.168.3.200, mgrport 7809
rmttask replicat, group rept2
table dave.pdba;
2011-11-16 11:09:55 WARNING OGG-00869 No unique key is defined for table PDBA. All
viable columns will be used to represent thekey, but may not guarantee uniqueness.
KEYCOLS may be used to define the key.
Using the following key columns for sourcetable DAVE.PDBA: ID, TIME.
CACHEMGR virtual memory values (may havebeen adjusted)
CACHEBUFFERSIZE: 64K
CACHESIZE: 8G
CACHEBUFFERSIZE (soft max): 4M
CACHEPAGEOUTSIZE (normal): 4M
PROCESS VM AVAIL FROM OS (min): 16G
CACHESIZEMAX (strict force to disk): 13.99G
Database Version:
Oracle Database 11g Enterprise EditionRelease 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE11.2.0.3.0 Production
TNS for Linux: Version 11.2.0.3.0 -Production
NLSRTL Version 11.2.0.3.0 - Production
Database Language and Character Set:
NLS_LANG ="AMERICAN_AMERICA.zhs16gbk"
NLS_LANGUAGE = "AMERICAN"
NLS_TERRITORY = "AMERICA"
NLS_CHARACTERSET = "ZHS16GBK"
Processing table DAVE.PDBA
***********************************************************************
* ** Run Time Statistics** *
***********************************************************************
Report at 2011-11-16 11:17:01 (activitysince 2011-11-16 11:09:55)
Output to rept2:
From Table DAVE.PDBA:
# inserts: 2713235
# updates: 0
# deletes: 0
# discards: 0
REDO Log Statistics
Bytes parsed 0
Bytes output 279463205
初始化的工作已经完成,几分钟就完成了同步工作,比之前网友测试的数据快,不过我这里的测试数据比较特殊,分区表数据虽多,但是只有2列,这个也是影响数据的一个重要因素。
(4) On the target system, start changereplication.
GGSCI (gg2) 26> start replicat rep1
Sending START request to MANAGER ...
REPLICAT REP1 starting
(5)On the target system, issue thefollowing command to verify the status of change replication.
GGSCI(gg2) 28> info replicat rep1
REPLICAT REP1Last Started 2011-11-16 11:23Status RUNNING
Checkpoint Lag 00:00:00 (updated 158:24:45 ago)
Log Read Checkpoint File /u01/ggate/dirdat/lt000001
2011-11-0910:39:45.497561 RBA 1009
(6). On the target system, issue thefollowing command to turn off the HANDLECOLLISIONS parameter and disable theinitial-load error handling.
取消HANDLECOLLISIONS参数
GGSCI (gg2) 30> send replicatrep1,NOHANDLECOLLISIONS
Sending NOHANDLECOLLISIONS requestto REPLICAT REP1 ...
REP1 No tables found matchingGGATE.* to set NOHANDLECOLLISIONS.
(7) On the target system, edit theReplicat parameter file to remove the HANDLECOLLISIONS parameter. This preventsHANDLECOLLISIONS from being enabled again the next time Replicat starts.
GGSCI (gg2) 32> view params rep1
replicat rep1
ASSUMETARGETDEFS
userid ggate@gg2,password ggate
discardfile/u01/ggate/dirdat/rep1_discard.txt, append, megabytes 10
--HANDLECOLLISIONS
map dave.pdba, target dave.pdba;
至此,使用DirectLoad 方式初始的测试完成。
4.3 使用expdp/impdp 进行初始化
在前面演示了使用Direct Load 的方式进行初始化,当我们的表很大时,这种方法可能会需要较长的时间,这种情况下,我们可以考虑使用dblink 直接从Source db 将数据同步到target db。
这个步骤大致如下:
(1) 配置同步的Change Extract 和 Change Replicat 进程
(2) 启动Change Extract 进程,捕捉改变的数据。
(3) 使用expdp/impdp 迁移数据
(4) 完成迁移后启动Change replicat,完成数据同步。
4.3.1 准备工作
--先启用我们的DDL支持
SQL> @ddl_enable.sql
Trigger altered.
--truncate target db 上的pdba 表
SQL> truncate table pdba;
Table truncated.
4.3.2 添加change Extract 和 change Replicat 进程
--change Extract 进程
GGSCI (gg1) 11>add extractext1,tranlog, begin now
GGSCI (gg1) 12>add exttrail/u01/ggate/dirdat/lt, extract ext1
GGSCI (gg1) 14> view params ext1
extract ext1
userid ggate@gg1, password ggate
rmthost gg2, mgrport 7809
rmttrail /u01/ggate/dirdat/lt
table dave.pdba;
--Change Replicat 进程
--在Target 端添加checkpoint表:
GGSCI (gg2) 6> edit params ./GLOBAL
GGSCHEMA ggate
CHECKPOINTTABLE ggate.checkpoint
GGSCI (gg2) 12> dblogin useridggate@gg2,password ggate
Successfully logged into database.
GGSCI (gg2) 13> add checkpointtableggate.checkpoint
Successfully created checkpointtableGGATE.CHECKPOINT.
--创建同步队列
GGSCI (gg2) 14> add replicatrep1,exttrail /u01/ggate/dirdat/lt, checkpointtable ggate.checkpoint
REPLICAT added.
修改相关的参数:
GGSCI (gg2) 11> view params rep1
replicat rep1
ASSUMETARGETDEFS
userid ggate@gg2,password ggate
discardfile/u01/ggate/dirdat/rep1_discard.txt, append, megabytes 10
HANDLECOLLISIONS --当目标端已有数据时,略过重复数据错误
map dave.pdba, target dave.pdba;
4.3.3 在Source 端启动changeExtract进程:ext1
启动Extract 进程之后就会捕捉Source 端的变化。
GGSCI (gg1) 43> view params ext1
extract ext1
userid ggate@gg1, password ggate
rmthost gg2, mgrport 7809
rmttrail /u01/ggate/dirdat/lt
table dave.pdba;
GGSCI (gg1) 44> info all
ProgramStatus Group Lag Time Since Chkpt
MANAGERRUNNING
EXTRACTRUNNING EXT1 00:00:00 00:00:02
4.3.4 使用expdp 导出导入数据
http://blog.csdn.net/tianlesoftware/article/details/6260138
gg1:/home/oracle> expdp dave/dave directory=backup dumpfile=pdba.dmp logfile=table.log tables=pdba;
Export: Release 11.2.0.3.0 - Production onWed Nov 16 12:21:32 2011
Copyright (c) 1982, 2011, Oracle and/or itsaffiliates. All rights reserved.
Connected to: Oracle Database 11gEnterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining andReal Application Testing options
Starting"DAVE"."SYS_EXPORT_TABLE_02": dave/******** directory=backupdumpfile=pdba.dmp logfile=table.log tables=pdba
Estimate in progress using BLOCKS method...
Processing object typeTABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 60 MB
Processing object typeTABLE_EXPORT/TABLE/TABLE
. . exported"DAVE"."PDBA" 49.14 MB 2713235rows
Master table"DAVE"."SYS_EXPORT_TABLE_02" successfully loaded/unloaded
******************************************************************************
Dump file set for DAVE.SYS_EXPORT_TABLE_02is:
/u01/backup/pdba.dmp
Job"DAVE"."SYS_EXPORT_TABLE_02" successfully completed at12:22:16
在Source DB上删除一部分数据,已体现变化。
SQL> select count(1) from pdba;
COUNT(1)
----------
2713235
SQL> select count(1) from pdba wheretrunc(time)=to_date('2010/1/11','yyyy/mm/dd');
COUNT(1)
----------
6613
删除这6613条记录:
SQL> Delete from pdba wheretrunc(time)=to_date('2010/1/11','yyyy/mm/dd');
6613 rows deleted.
SQL> commit;
Commit complete.
SQL> select count(1) from pdba;
COUNT(1)
----------
2706622
将dump 传到备库,然后impdp进去:
gg1:/u01/backup> scp pdba.dmp 192.168.3.200:/u01/backup
The authenticity of host '192.168.3.200(192.168.3.200)' can't be established.
RSA key fingerprint is04:39:b6:d7:61:44:18:42:80:4b:37:1a:31:5d:a7:55.
Are you sure you want to continueconnecting (yes/no)? yes
Warning: Permanently added '192.168.3.200'(RSA) to the list of known hosts.
oracle@192.168.3.200's password:
pdba.dmp100% 49MB 3.5MB/s00:14
gg2:/u01/backup> impdp dave/davedirectory=backup dumpfile=pdba.dmp logfile=table.log tables=pdba table_exists_action=replace;
Import: Release 11.2.0.3.0 - Production onWed Nov 16 12:39:40 2011
Copyright (c) 1982, 2011, Oracle and/or itsaffiliates. All rights reserved.
Connected to: Oracle Database 11gEnterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Miningand Real Application Testing options
Master table "DAVE"."SYS_IMPORT_TABLE_01"successfully loaded/unloaded
Starting"DAVE"."SYS_IMPORT_TABLE_01": dave/******** directory=backupdumpfile=pdba.dmp logfile=table.log tables=pdba table_exists_action=replace
Processing object typeTABLE_EXPORT/TABLE/TABLE
Processing object typeTABLE_EXPORT/TABLE/TABLE_DATA
. . imported"DAVE"."PDBA" 49.14 MB 2713235rows
Job"DAVE"."SYS_IMPORT_TABLE_01" successfully completed at12:40:16
4.3.5 启动Target 的Replicat 进程
SQL> conn dave/dave;
Connected.
SQL> select count(1) from pdba;
COUNT(1)
----------
2713235
GGSCI (gg2) 37> start replicat rep1
Sending START request to MANAGER ...
REPLICAT REP1 starting
GGSCI (gg2) 38> info all
ProgramStatus Group Lag Time Since Chkpt
MANAGERRUNNING
REPLICATRUNNING REP1 00:00:00 00:28:17
SQL> select count(1) from pdba;
COUNT(1)
----------
2706622
在次查看,我们之前delete的数据,在Target 端也被delete 了。
清除Replicat 进程上的HANDLECOLLISIONS参数:
GGSCI (gg2) 40> send replicatrep1,NOHANDLECOLLISIONS
Sending NOHANDLECOLLISIONS request toREPLICAT REP1 ...
REP1 No tables found matching GGATE.* toset NOHANDLECOLLISIONS.
该参数是修改当前进程,使其生效,这样就不用重启replicat 进程。
修改参数,下次重启时生效:
GGSCI (gg2) 41> edit params rep1
GGSCI (gg2) 42> view params rep1
replicat rep1
ASSUMETARGETDEFS
userid ggate@gg2,password ggate
discardfile /u01/ggate/dirdat/rep1_discard.txt,append, megabytes 10
--HANDLECOLLISIONS
map dave.pdba, target dave.pdba;
至此,有关GG 数据初始化的内容就全部结束。
-------------------------------------------------------------------------------------------------------
版权所有,文章允许转载,但必须以链接方式注明源地址,否则追究法律责任!
Blog: http://blog.csdn.net/tianlesoftware
Weibo: http://weibo.com/tianlesoftware
Email: tianlesoftware@gmail.com
Skype: tianlesoftware
-------加群需要在备注说明Oracle表空间和数据文件的关系,否则拒绝申请----
DBA1 群:62697716(满); DBA2 群:62697977(满)DBA3 群:62697850(满)
DBA 超级群:63306533(满); DBA4 群:83829929(满) DBA5群: 142216823(满)
DBA6 群:158654907(满) DBA7 群:69087192(满)DBA8 群:172855474
DBA 超级群2:151508914 DBA9群:102954821 聊天 群:40132017(满)