Background
The company has recently been building out a big-data platform and needs to sync Oracle data into the data warehouse. Quite a few approaches were evaluated along the way.
OGG (Oracle GoldenGate): provided by Oracle, commercial, and tedious to configure. Documentation is scarce and almost all of it targets Linux; on Windows it throws a pile of errors, and Google turned up no solutions.
Databus, canal, otter, DataX, Kettle, and kafka-connect-oracle compare as follows:
Databus: open-sourced by LinkedIn and reportedly mature and performant, but I have never used it, material on it is scarce, and the learning curve is steep.
canal and otter: open-sourced by Alibaba, currently supporting MySQL. Oracle support is said to exist internally, but there is no documentation for it and no telling whether it will ever be released.
DataX and Kettle: since this is hot-data synchronization, repeatedly scanning a busy production database carries real risk; these two are better suited to one-off data migration.
Confluent Platform (partly free): the official site states it only supports Linux. I found a compatibility build, confluent-windows-5.0.1, on a forum, but after much fiddling it still errored out. Interested readers can look at https://github.com/mduhan/confluent-windows-5.0.1
kafka-connect-oracle: an open-source project from abroad that reads Oracle redo logs and sends the changes to Kafka. There is plenty of material both in Chinese and English, but again all of it targets Linux; neither Google nor Baidu turned up a Windows fix for the error shown below.
After much fiddling the problem was eventually solved. The setup is as follows:
1、Database privilege setup
create role logmnr_role;
grant create session to logmnr_role;
grant execute_catalog_role, select any transaction, select any dictionary to logmnr_role;
create tablespace ogg_ts datafile 'G:\data\test_date01.dbf' size 256M autoextend on next 64M;
create user kafka identified by kafka default tablespace ogg_ts;
grant logmnr_role to kafka;
alter user kafka quota unlimited on ogg_ts;
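One thing worth double-checking before going further: LogMiner-based capture generally requires the database to run in archivelog mode with supplemental logging enabled, otherwise the connector will see no change rows. A sketch, run as SYSDBA (verify the exact requirements against the connector's README):

```sql
-- Run as SYSDBA. Skip the first four statements if archivelog mode is already on.
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
-- Record full before/after column images so LogMiner can reconstruct each row change
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
```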
2、Building kafka-connect-oracle
Download address:
Build kafka-connect-oracle with Maven, then copy the resulting kafka-connect-oracle-1.0.jar, together with oracle.jar (the Oracle JDBC driver), into Kafka's libs directory.
Copy OracleSourceConnector.properties into the config directory.
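The build-and-copy steps above look roughly like this; the paths are placeholders for wherever the connector source and Kafka were unpacked (a sketch, assuming JDK 8 and Maven are installed):

```shell
cd kafka-connect-oracle             # the downloaded connector source
mvn clean package                   # produces target/kafka-connect-oracle-1.0.jar
cp target/kafka-connect-oracle-1.0.jar /path/to/kafka/libs/
cp lib/oracle.jar /path/to/kafka/libs/    # Oracle JDBC driver; its location in the source tree may differ
cp config/OracleSourceConnector.properties /path/to/kafka/config/
```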
OracleSourceConnector.properties is configured as follows:
name=oracle-logminer-connector
connector.class=com.ecer.kafka.connect.oracle.OracleSourceConnector
db.name.alias=
tasks.max=1
topic=ogg_test
db.name=temps
db.hostname=192.168.1.2
db.port=1522
db.user=ogg
db.user.password=ogg
db.fetch.size=1
table.whitelist=OGG.TEST_OGG
table.blacklist=
parse.dml.data=true
reset.offset=true
start.scn=
multitenant=false
connect-standalone.properties is configured as follows:
bootstrap.servers=192.168.1.109:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
server.properties is configured as follows:
broker.id=0
listeners=PLAINTEXT://192.168.1.109:9092
host.name=192.168.1.109
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
delete.topic.enable=true
auto.create.topics.enable=true
Starting on Windows reports the following error:
F:\kafka_2.13-2.6.0\bin>windows\connect-standalone.bat ..\config\connect-standalone.properties ..\config\OracleSourceConnector.properties
[2020-11-06 17:48:36,503] WARN could not get type for name org.easymock.IArgumentMatcher from any class loader (org.reflections.Reflections)
org.reflections.ReflectionsException: could not get type for name org.easymock.IArgumentMatcher
at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:312)
at org.reflections.Reflections.expandSuperTypes(Reflections.java:382)
at org.reflections.Reflections.<init>(Reflections.java:140)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader$InternalReflections.<init>(DelegatingClassLoader.java:444)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanPluginPath(DelegatingClassLoader.java:334)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.scanUrlsAndAddPlugins(DelegatingClassLoader.java:268)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initPluginLoader(DelegatingClassLoader.java:216)
at org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader.initLoaders(DelegatingClassLoader.java:209)
at org.apache.kafka.connect.runtime.isolation.Plugins.<init>(Plugins.java:61)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:79)
Caused by: java.lang.ClassNotFoundException: org.easymock.IArgumentMatcher
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.reflections.ReflectionUtils.forName(ReflectionUtils.java:310)
... 9 more
In the end I chose to emulate a Linux environment with Cygwin.
setup-x86_64.exe download address:
https://www.cygwin.com/setup-x86_64.exe
For the package mirror, pick one inside China (unless you have a proxy):
https://mirrors.aliyun.com/cygwin/
https://mirrors.ustc.edu.cn/cygwin/
Once installed, configure a Linux-style JDK environment:
JAVA_HOME=/cygdrive/d/tools/jdk1.8/jdk1.8.0_111
JRE_HOME=/cygdrive/d/tools/jdk1.8/jre
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JRE_HOME CLASSPATH PATH
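Note that Kafka 2.6 still depends on a running ZooKeeper (server.properties above points at localhost:2181). Assuming the stock configuration shipped with Kafka, start it first from the bin directory:

```shell
./zookeeper-server-start.sh ../config/zookeeper.properties
```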
Start the Kafka broker:
./kafka-server-start.sh ../config/server.properties
Start a standalone worker, which creates the connector:
./connect-standalone.sh ../config/connect-standalone.properties ../config/OracleSourceConnector.properties
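Once the worker is up, the Kafka Connect REST interface (port 8083 by default in standalone mode) can confirm that the connector loaded; the connector name here comes from the name= entry in OracleSourceConnector.properties:

```shell
curl http://localhost:8083/connectors
curl http://localhost:8083/connectors/oracle-logminer-connector/status
```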
Check the messages on the topic:
./kafka-console-consumer.sh --bootstrap-server 192.168.1.109:9092 --from-beginning --topic ogg_test
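With JsonConverter and schemas.enable=true, each record on the topic is a JSON envelope with schema and payload parts. Below is a minimal sketch of unpacking one message in Python; the sample field names (OPERATION, SQL_REDO, data) are modeled on kafka-connect-oracle's output and should be checked against what your consumer actually prints:

```python
import json

# Hypothetical message, modeled on what the console consumer prints
raw = """{"schema": {"type": "struct", "name": "row"},
          "payload": {"SCN": 1446599, "SEG_OWNER": "OGG",
                      "TABLE_NAME": "TEST_OGG", "OPERATION": "INSERT",
                      "SQL_REDO": "insert into OGG.TEST_OGG ...",
                      "data": {"ID": 1, "NAME": "hello"}}}"""

payload = json.loads(raw)["payload"]
# The operation type, table, and changed row are what a downstream sink usually needs
print(payload["OPERATION"], payload["TABLE_NAME"], payload["data"])
```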
After startup, perform DML operations (insert/update/delete) on the watched table and you can see the changes arrive in the Kafka topic. Problem solved. As for clustered setups, configure them the same way you normally would; that is not covered here.