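The Extract process captures data from the source. It comes in three types:
- 1) Initial load: loads the data of an entire table; this is a batch load.
- 2) Redo log / transaction log: restores database table data from the logs.
- 3) Capture mode: monitors the log files in real time and captures new data as soon as it appears.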
Data pump
An optional helper component of Extract. If no Data pump is configured, Extract sends the captured data directly to the Collector process on the target machine.
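08-[Understand]-OGG Data Synchronization: Topologies and Supported Environments
The OGG architecture divides work finely across its processes. This part covers the topologies and supported environments commonly seen when OGG is used in real projects; use it as a reference when considering OGG for real-time data synchronization.
An Oracle database deployed as a cluster is called Oracle RAC (Real Application Clusters).
As this shows, GoldenGate TDM replication modes are very flexible: users can choose the replication method that fits their needs and extend replication as the system grows.
The source and target operating systems and databases can be combined freely.
At present, enterprise projects that use OGG for data synchronization usually have an Oracle database as the source (SRC) and an Oracle database or a Kafka message queue as the target (DST).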
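09-[Master]-OGG Data Synchronization: Preparing the Test Environment
Configuring OGG for real-time synchronization to Kafka is fairly tedious. You do not need to master the steps; leave them to the DBA.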
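In the provided virtual machine [node1.itcast.cn], the Docker container [myoracle] already has OGG installed (both source and target), so you only need to start the OGG source (SRC) and target (DST) services.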
- 1) Source (SRC)
  - Manager (mgr), Extract process, Local Trail, Pump process
- 2) Target (DST)
  - Manager (mgr), Remote Trail, Replicat process
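Because OGG captures Oracle table data (from the log files) in real time and synchronizes it to the Kafka message queue, start the Kafka service first (after starting the ZooKeeper service). Open the provided [node2.itcast.cn] and start ZK and Kafka from the CM interface.
Then start the OGG services on the source and target sides. See the provided file [启动命令:Oracle数据库和OGG服务.txt] for the exact commands: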
# ============= Switch to the oracle account and start the Oracle database =============
# Step 1: start the source mgr process
[root@node1 ~]# docker exec -it myoracle /bin/bash
[root@server01 oracle]# su - oracle
Last login: Mon Mar 15 02:06:07 UTC 2021 on pts/1
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US): No such file or directory
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US): No such file or directory
[oracle@server01 ~]$ cd $OGG_SRC_HOME
[oracle@server01 src]$ ./ggsci
Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.3 14400833 OGGCORE_11.2.1.0.3_PLATFORMS_120823.1258_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Aug 23 2012 20:20:21
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.
GGSCI (server01) 1> start mgr
Manager started.
GGSCI (server01) 2> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT ABENDED EXTKAFKA 00:00:00 4598:05:35
EXTRACT ABENDED PUKAFKA 00:00:00 4598:05:31
# Step 2: start the target mgr process
[root@node1 ~]# docker exec -it myoracle /bin/bash
[root@server01 oracle]# su - oracle
Last login: Mon Mar 15 03:48:01 UTC 2021 on pts/0
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US): No such file or directory
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US): No such file or directory
[oracle@server01 ~]$ cd $OGG_TGR_HOME
[oracle@server01 tgr]$ ./ggsci
start mgrOracle GoldenGate for Big Data
Version 12.3.1.1.1
Oracle GoldenGate Command Interpreter
Version 12.3.0.1.0 OGGCORE_OGGADP.12.3.0.1.0GA_PLATFORMS_170828.1608
Linux, x64, 64bit (optimized), Generic on Aug 28 2017 17:13:45
Operating system character set identified as US-ASCII.
Copyright (C) 1995, 2017, Oracle and/or its affiliates. All rights reserved.
GGSCI (server01) 2> start mgr
Manager started.
GGSCI (server01) 3> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT ABENDED REKAFKA 00:00:00 4598:06:12
# Step 3: start the source extract process
GGSCI (server01) 3> start EXTKAFKA
Sending START request to MANAGER ...
EXTRACT EXTKAFKA starting
# Step 4: start the source pump process
GGSCI (server01) 4> start PUKAFKA
Sending START request to MANAGER ...
EXTRACT PUKAFKA starting
GGSCI (server01) 5> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT RUNNING EXTKAFKA 4598:08:10 00:00:00
EXTRACT RUNNING PUKAFKA 00:00:00 4598:07:38
# Step 5: start the target replicat process
GGSCI (server01) 4> start REKAFKA
Sending START request to MANAGER ...
REPLICAT REKAFKA starting
GGSCI (server01) 7> info all
Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
REPLICAT RUNNING REKAFKA 00:00:00 00:00:08
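The `info all` listings above are easy to check by eye, but a small script can flag any process that is not `RUNNING`. The following is a minimal Python sketch, assuming the output has been saved as plain text in the whitespace-separated layout shown above; the helper names `parse_info_all` and `not_running` are hypothetical and not part of OGG.

```python
def parse_info_all(text):
    """Parse GGSCI `info all` output into a list of process records."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the "Program Status Group ..." header row.
        if not parts or parts[0] not in ("MANAGER", "EXTRACT", "REPLICAT"):
            continue
        record = {"program": parts[0], "status": parts[1], "group": None}
        if len(parts) >= 3:
            record["group"] = parts[2]  # MANAGER rows carry no group name
        records.append(record)
    return records


def not_running(records):
    """Return the names of processes whose status is not RUNNING."""
    return [r["group"] or r["program"] for r in records if r["status"] != "RUNNING"]


# Sample copied from the source-side listing taken before the processes were started.
sample = """Program Status Group Lag at Chkpt Time Since Chkpt
MANAGER RUNNING
EXTRACT ABENDED EXTKAFKA 00:00:00 4598:05:35
EXTRACT ABENDED PUKAFKA 00:00:00 4598:05:31"""

print(not_running(parse_info_all(sample)))
```

For the sample above this reports the two abended Extract groups; once every process has been started, the list should be empty.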
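Use KafkaTool to connect to the Kafka cluster and inspect the topic information and data.
10-[Master]-OGG Data Synchronization: Syncing Logistics Data to Kafka
The OGG middleware synchronizes Oracle table data to the Kafka message queue in real time:
- 1) Source: Oracle database [itcast]
- 2) Capture tool: OGG, split into SRC and DST
- 3) Target: Kafka message queue [logistics]
Test: run insert, update, and delete operations against a table in the Oracle database, then check the data in the Kafka topic.
With OGG 11 the latency is relatively high, around 2 s, and requires reasonable parameter tuning; OGG 12 is much faster.
- 1) Insert test
-- Insert data (INSERT)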
INSERT INTO ITCAST."tbl_company"("id", "company_name", "city_id", "company_number", "company_addr", "company_addr_gis", "company_tel", "is_sub_company", "state", "cdt", "udt", "remark")VALUES(11, '广州码农速递邮箱公司', 440100, NULL, '广州校区', '117.28177895734918_31.842711680531399', NULL, 1, 1, TO_DATE('2020-06-13 15:24:51','yyyy-mm-dd hh24:mi:ss'), TO_DATE('2020-06-13 15:24:51','yyyy-mm-dd hh24:mi:ss'), NULL);
JSON data synchronized to the Kafka topic:
{
  "table": "ITCAST.tbl_company",
  "op_type": "I",
  "op_ts": "2021-03-15 03:57:07.000306",
  "current_ts": "2021-03-15T03:57:20.578000",
  "pos": "00000000150000001245",
  "after": {
    "id": 11,
    "company_name": "广州码农速递邮箱公司",
    "city_id": 440100,
    "company_number": null,
    "company_addr": "广州校区",
    "company_addr_gis": "117.28177895734918_31.842711680531399",
    "company_tel": null,
    "is_sub_company": 1,
    "state": 1,
    "cdt": "2020-06-13 15:24:51",
    "udt": "2020-06-13 15:24:51",
    "remark": null
  }
}
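A consumer of this topic usually needs the table name, the operation type, and the row image from each message. The following Python sketch decodes a trimmed copy of the insert message with the standard `json` module and measures the gap between `op_ts` and `current_ts`. It assumes that both timestamps are in the same time zone and that `id` is the table's key; the function name `summarize` and the `OP_NAMES` mapping are made up for illustration.

```python
import json
from datetime import datetime

# Trimmed copy of the insert message above (only a few "after" columns kept).
raw_message = """
{
  "table": "ITCAST.tbl_company",
  "op_type": "I",
  "op_ts": "2021-03-15 03:57:07.000306",
  "current_ts": "2021-03-15T03:57:20.578000",
  "pos": "00000000150000001245",
  "after": {"id": 11, "city_id": 440100, "state": 1}
}
"""

# Operation codes as they appear in the three test messages of this section.
OP_NAMES = {"I": "insert", "U": "update", "D": "delete"}


def summarize(raw):
    """Decode one message and return its table, operation, key and timestamp gap."""
    event = json.loads(raw)
    # The two timestamps use slightly different layouts (space vs. "T" separator).
    op_ts = datetime.strptime(event["op_ts"], "%Y-%m-%d %H:%M:%S.%f")
    current_ts = datetime.strptime(event["current_ts"], "%Y-%m-%dT%H:%M:%S.%f")
    # Inserts and updates carry "after"; deletes carry only "before".
    row = event.get("after") or event.get("before")
    return {
        "table": event["table"],
        "operation": OP_NAMES[event["op_type"]],
        "key": row["id"],  # assumption: "id" is the primary key
        "gap_seconds": (current_ts - op_ts).total_seconds(),
    }


summary = summarize(raw_message)
print(summary)
```

The gap is only a rough end-to-end indicator, since it depends on the clocks of both sides agreeing.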
- 2) Update test
-- Update data (UPDATE)
UPDATE ITCAST."tbl_company" SET "company_name"='广州码农速递有限公司-1' WHERE "id"=11;
JSON data synchronized to the Kafka topic:
{
  "table": "ITCAST.tbl_company",
  "op_type": "U",
  "op_ts": "2021-03-15 03:59:28.000248",
  "current_ts": "2021-03-15T03:59:40.378000",
  "pos": "00000000150000001980",
  "before": {
    "id": 11,
    "company_name": "广州码农速递邮箱公司",
    "city_id": 440100,
    "company_number": null,
    "company_addr": "广州校区",
    "company_addr_gis": "117.28177895734918_31.842711680531399",
    "company_tel": null,
    "is_sub_company": 1,
    "state": 1,
    "cdt": "2020-06-13 15:24:51",
    "udt": "2020-06-13 15:24:51",
    "remark": null
  },
  "after": {
    "id": 11,
    "company_name": "广州码农速递有限公司-1",
    "city_id": 440100,
    "company_number": null,
    "company_addr": "广州校区",
    "company_addr_gis": "117.28177895734918_31.842711680531399",
    "company_tel": null,
    "is_sub_company": 1,
    "state": 1,
    "cdt": "2020-06-13 15:24:51",
    "udt": "2020-06-13 15:24:51",
    "remark": null
  }
}
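An update message carries the full row in both `before` and `after`, so the consumer has to compare the two images to find out which columns actually changed. A minimal Python sketch of that comparison follows; the helper name `changed_columns` is hypothetical, and the event is a trimmed copy of the message above.

```python
def changed_columns(event):
    """Return {column: (old_value, new_value)} for columns that differ."""
    before = event.get("before", {})
    after = event.get("after", {})
    return {
        col: (before.get(col), after[col])
        for col in after
        if before.get(col) != after[col]
    }


# Trimmed copy of the update message above, already decoded into a dict.
update_event = {
    "table": "ITCAST.tbl_company",
    "op_type": "U",
    "before": {"id": 11, "company_name": "广州码农速递邮箱公司", "state": 1},
    "after": {"id": 11, "company_name": "广州码农速递有限公司-1", "state": 1},
}

diff = changed_columns(update_event)
print(diff)
```

For this event only `company_name` differs, which matches the single column set by the UPDATE statement.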
- 3) Delete test
-- Delete data (DELETE)
DELETE ITCAST."tbl_company" WHERE "id"=11;
JSON data synchronized to the Kafka topic:
{
  "table": "ITCAST.tbl_company",
  "op_type": "D",
  "op_ts": "2021-03-15 04:01:16.000109",
  "current_ts": "2021-03-15T04:01:30.756000",
  "pos": "00000000150000002328",
  "before": {
    "id": 11,
    "company_name": "广州码农速递有限公司-1",
    "city_id": 440100,
    "company_number": null,
    "company_addr": "广州校区",
    "company_addr_gis": "117.28177895734918_31.842711680531399",
    "company_tel": null,
    "is_sub_company": 1,
    "state": 1,
    "cdt": "2020-06-13 15:24:51",
    "udt": "2020-06-13 15:24:51",
    "remark": null
  }
}
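Taken together, the three messages form a change stream: `I` and `U` carry the new row in `after`, and `D` carries the removed row in `before`. The Python sketch below replays such a stream into an in-memory dict to show how a downstream consumer could rebuild the table state. It assumes `id` is the key and that events arrive in commit order; `apply_event` is a hypothetical helper, and the events are trimmed copies of the messages above.

```python
def apply_event(rows, event, key="id"):
    """Apply one change event to `rows`, a dict mapping key value -> row."""
    op = event["op_type"]
    if op == "I":
        rows[event["after"][key]] = event["after"]
    elif op == "U":
        # Drop the old image first, in case the update changed the key itself.
        rows.pop(event["before"][key], None)
        rows[event["after"][key]] = event["after"]
    elif op == "D":
        rows.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported op_type: {op!r}")
    return rows


# Trimmed copies of the insert, update and delete messages from this section.
events = [
    {"op_type": "I", "after": {"id": 11, "company_name": "广州码农速递邮箱公司"}},
    {"op_type": "U",
     "before": {"id": 11, "company_name": "广州码农速递邮箱公司"},
     "after": {"id": 11, "company_name": "广州码农速递有限公司-1"}},
    {"op_type": "D", "before": {"id": 11, "company_name": "广州码农速递有限公司-1"}},
]

table = {}
history = []
for event in events:
    apply_event(table, event)
    history.append(dict(table))  # snapshot of the state after each event

print(history)
```

After the insert and the update the dict holds one row for `id` 11, and after the delete it is empty again, mirroring what happened in the Oracle table.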
You can inspect where the Oracle database stores its logs with the following commands:
[root@node1 ~]# docker exec -it myoracle /bin/bash
[root@server01 oracle]# su - oracle
[oracle@server01 ~]$ source ~/.bash_profile
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US): No such file or directory
-bash: warning: setlocale: LC_ALL: cannot change locale (en_US): No such file or directory
[oracle@server01 ~]$
[oracle@server01 ~]$ cd $ORACLE_BASE/flash_recovery_area/ORCL