Notes on Using Debezium 1.6.1 with MySQL 5.7 under CDH 6.3.2

Contents

1. Environment and Versions

2. Background

2.1 Kafka Connect

3. Installing Debezium 1.6.1 under CDH 6.3.2

4. Usage

5. Problems Encountered

References



1. Environment and Versions


Cluster: CDH 6.3.2

Debezium: debezium-connector-mysql-1.6.1.Final-plugin

MySQL: 5.7.32


2. Background

2.1 Kafka Connect

        Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems; connectors make it easy to move large volumes of data into or out of Kafka. Kafka Connect currently supports two modes, standalone and distributed. Standalone is mainly for getting started and for testing, so here we set up distributed mode.

        In distributed mode, Kafka Connect automatically balances the workload, allows dynamic scale-out and scale-in, and provides fault tolerance for active tasks as well as for configuration and committed offsets. Startup is very similar to standalone mode; the biggest differences are the launcher class and the configuration parameters, which determine how the Kafka Connect process stores offsets and distributes work. In distributed mode, Kafka Connect stores offsets, configuration, and task status in Kafka topics. It is recommended to create these topics manually with explicit partition counts, although they can also be auto-created via configuration parameters.

        Debezium is used as a Kafka Connect plugin integrated with Kafka, and it depends on Kafka, so before installing Debezium you need ZooKeeper, Kafka, and Kafka Connect already in place.

3. Installing Debezium 1.6.1 under CDH 6.3.2

        The installation is straightforward: first, extract the jars into the Kafka libs directory; second, configure; third, scp the configuration file to the other nodes. Done.

Step 1: extract the jars into the Kafka libs directory

tar -zxvf debezium-connector-mysql-1.6.1.Final-plugin.tar.gz

cd debezium-connector-mysql

mv *.jar /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/libs/

Step 2: configuration

cd /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/etc/kafka/conf.dist/

cp connect-distributed.properties /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/config/
cp connect-log4j.properties /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/config/

### rename the copied file
cd /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/config/
mv connect-distributed.properties cdc-mysql-connect-distributed1.properties

vi cdc-mysql-connect-distributed1.properties

 The contents look roughly like this:

bootstrap.servers=kk1:9092,kk2:9092,kk3:9092

# To emit plain JSON rather than a schema-wrapped format, simply set
# key.converter.schemas.enable and value.converter.schemas.enable to false.
key.converter.schemas.enable=true
value.converter.schemas.enable=true

### Kafka Connect group id
group.id=cdc-connect-mysql

# offset.storage.topic: stores offsets; this topic should have multiple
# partitions and replicas.
offset.storage.topic=cdc-connect-mysql-offsets
offset.storage.replication.factor=3
offset.storage.partitions=50

# config.storage.topic: stores connector and task configurations; note that
# this must be a single-partition, replicated topic.
config.storage.topic=cdc-connect-mysql-configs
config.storage.replication.factor=3
config.storage.partitions=1

# status.storage.topic: stores status; this topic may have multiple
# partitions and replicas.
status.storage.topic=cdc-connect-mysql-status
status.storage.replication.factor=3
status.storage.partitions=10

plugin.path=/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/libs/
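The two `schemas.enable` switches change the shape of every message. With `true`, the JsonConverter wraps each record in a `{"schema": ..., "payload": ...}` envelope; with `false`, only the bare payload is sent. A minimal sketch (the sample record is invented for illustration):

```python
import json

# A record as the JsonConverter emits it when schemas.enable=true:
# the value is wrapped in a schema/payload envelope.
with_schema = {
    "schema": {"type": "struct", "fields": [{"field": "id", "type": "int32"}]},
    "payload": {"id": 42},
}

def unwrap(record):
    """Return what the same record looks like with schemas.enable=false:
    just the payload, no schema envelope."""
    return record["payload"]

print(json.dumps(unwrap(with_schema)))
```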

Step 3: distribute the configuration file (the Debezium jars must be present on every Kafka node too; details omitted)

scp cdc-mysql-connect-distributed1.properties  your_host:/opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/config

4. Usage

# run on every Kafka node
# go to the Kafka directory
cd /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/
# start the Connect worker
./bin/connect-distributed.sh config/cdc-mysql-connect-distributed1.properties

Register the Debezium connector:

curl -H "Content-Type: application/json" -X POST -i 'http://xx.xx.xx.x:8083/connectors'   -d  '{
	"name": "debezium-test-5017",
	"config": {
		"connector.class": "io.debezium.connector.mysql.MySqlConnector",
		"database.hostname": "xxx.0.xxx.xxx",
		"database.port": "3306",
		"database.user": "root",
		"database.password": "xxxxxxx",
		"database.include.list": "datax_web",
		"database.dbname": "datax_web",
		"database.serverTimezone": "UTC",
		"database.server.id": "316545017",
		"database.server.name": "debezium_mysql_5017",
		"database.history.kafka.bootstrap.servers": "xx:9092,xx:9092,xx:9092",
		"database.history.kafka.topic": "debezium_test"
	}
  }' 

If registration succeeds, the REST API responds with HTTP 201 Created and echoes the connector configuration.
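For scripting, the same registration request can be issued from the Python standard library. This is a sketch, not part of the original walkthrough; the helper names are mine, and the host names and password are placeholders just as in the curl call:

```python
import json
import urllib.request

def build_connector_payload(name, mysql_host, mysql_password, kafka_servers):
    """Assemble the same registration body as the curl call above.
    mysql_host, mysql_password, and kafka_servers are placeholders."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.mysql.MySqlConnector",
            "database.hostname": mysql_host,
            "database.port": "3306",
            "database.user": "root",
            "database.password": mysql_password,
            "database.include.list": "datax_web",
            "database.dbname": "datax_web",
            "database.serverTimezone": "UTC",
            "database.server.id": "316545017",
            "database.server.name": "debezium_mysql_5017",
            "database.history.kafka.bootstrap.servers": kafka_servers,
            "database.history.kafka.topic": "debezium_test",
        },
    }

def register(connect_url, payload):
    """POST the payload to the Connect REST API; returns the HTTP response."""
    req = urllib.request.Request(
        connect_url.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```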

Then open the URL:

http://xxx:8083/connectors/debezium-test-5017/status

If the response shows state RUNNING, continue to the next step.
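The status endpoint returns JSON shaped like `{"name": ..., "connector": {"state": ...}, "tasks": [{"state": ...}, ...]}`. A small helper (my own naming) that decides whether the connector and all of its tasks are RUNNING:

```python
def connector_running(status):
    """True only when the connector and every task report state RUNNING."""
    if status.get("connector", {}).get("state") != "RUNNING":
        return False
    tasks = status.get("tasks", [])
    return bool(tasks) and all(t.get("state") == "RUNNING" for t in tasks)

# Example status document, shaped like the Connect REST response.
example = {
    "name": "debezium-test-5017",
    "connector": {"state": "RUNNING", "worker_id": "kk1:8083"},
    "tasks": [{"id": 0, "state": "RUNNING", "worker_id": "kk1:8083"}],
}
print(connector_running(example))
```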

Check whether Kafka now has new topics corresponding to the monitored MySQL tables:

# go to the Kafka directory
cd /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/kafka/

bin/kafka-topics.sh --bootstrap-server k1:9092 --list

 The topics for the tables we want to monitor have been created.
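Debezium derives the per-table topic name as serverName.databaseName.tableName, which is where the topic used below comes from. A one-line sketch of the convention:

```python
def table_topic(server_name, database, table):
    """Debezium's per-table topic naming: serverName.databaseName.tableName."""
    return f"{server_name}.{database}.{table}"

print(table_topic("debezium_mysql_5017", "datax_web", "job_group"))
```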

Open a Kafka console consumer on the debezium_mysql_5017.datax_web.job_group topic and watch for changes:

bin/kafka-console-consumer.sh --bootstrap-server xx:9092 --topic debezium_mysql_5017.datax_web.job_group --from-beginning

Modify data in the job_group table in MySQL; the change events appear in the console consumer.
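Each message on the table topic is a Debezium change-event envelope: the payload carries the `before` and `after` row images plus an `op` code (`c` create, `u` update, `d` delete, `r` snapshot read). A sketch that summarizes such an event; the sample row values are invented:

```python
import json

def summarize_change(raw):
    """Extract the op code and row images from a Debezium change event."""
    payload = json.loads(raw)["payload"]
    return payload["op"], payload["before"], payload["after"]

# Invented update event for the job_group table.
event = json.dumps({
    "schema": {},
    "payload": {
        "before": {"id": 1, "title": "old group"},
        "after": {"id": 1, "title": "new group"},
        "source": {"db": "datax_web", "table": "job_group"},
        "op": "u",
        "ts_ms": 1628045000000,
    },
})
op, before, after = summarize_change(event)
print(op, before["title"], "->", after["title"])
```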


5. Problems Encountered

[2021-08-04 10:52:23,499] INFO Added alias 'ActivateTracingSpan' to plugin 'io.debezium.transforms.tracing.ActivateTracingSpan' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:383)
[2021-08-04 10:52:23,500] INFO Added alias 'RegexRouter' to plugin 'org.apache.kafka.connect.transforms.RegexRouter' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:383)
[2021-08-04 10:52:23,501] INFO Added alias 'TimestampRouter' to plugin 'org.apache.kafka.connect.transforms.TimestampRouter' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:383)
[2021-08-04 10:52:23,501] INFO Added alias 'ValueToKey' to plugin 'org.apache.kafka.connect.transforms.ValueToKey' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:383)
[2021-08-04 10:52:23,501] INFO Added alias 'BasicAuthSecurityRestExtension' to plugin 'org.apache.kafka.connect.rest.basic.auth.extension.BasicAuthSecurityRestExtension' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:383)
[2021-08-04 10:52:23,518] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed:82)
org.apache.kafka.common.config.ConfigException: Missing required configuration "key.converter" which has no default value.
	at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:474)
	at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:464)
	at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:62)
	at org.apache.kafka.common.config.AbstractConfig.<init>(AbstractConfig.java:75)
	at org.apache.kafka.connect.runtime.WorkerConfig.<init>(WorkerConfig.java:364)
	at org.apache.kafka.connect.runtime.distributed.DistributedConfig.<init>(DistributedConfig.java:277)
	at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:91)
	at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:76)

Fix

Add the following to cdc-mysql-connect-distributed1.properties:

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true

After restarting, the following problem appeared:

[2021-08-04 10:41:38,827] WARN The configuration 'config.storage.topic' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'group.id' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'status.storage.topic' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'plugin.path' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'status.storage.partitions' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'config.storage.partitions' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'config.storage.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'offset.storage.partitions' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'status.storage.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'value.converter.schemas.enable' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,828] WARN The configuration 'offset.storage.replication.factor' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,829] WARN The configuration 'offset.storage.topic' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,829] WARN The configuration 'value.converter' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,829] WARN The configuration 'key.converter' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2021-08-04 10:41:38,831] INFO Kafka version: 2.2.1-cdh6.3.2 (org.apache.kafka.common.utils.AppInfoParser:109)
[2021-08-04 10:41:38,831] INFO Kafka commitId: unknown (org.apache.kafka.common.utils.AppInfoParser:110)
[2021-08-04 10:41:38,871] WARN [AdminClient clientId=adminclient-1] Connection to node -2 (kk2/1.1.1.145:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:38,877] WARN [AdminClient clientId=adminclient-1] Connection to node -3 (kk2/1.1.1.146:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:38,882] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (kk3/1.1.1.87:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:38,984] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (kk3/1.1.1.87:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:38,986] WARN [AdminClient clientId=adminclient-1] Connection to node -3 (kk2/1.1.1.146:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:38,987] WARN [AdminClient clientId=adminclient-1] Connection to node -2 (kk2/1.1.1.145:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:39,090] WARN [AdminClient clientId=adminclient-1] Connection to node -3 (kk2/1.1.1.146:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:39,091] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (kk3/1.1.1.87:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:39,194] WARN [AdminClient clientId=adminclient-1] Connection to node -2 (kk2/1.1.1.145:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:39,396] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (kk3/1.1.1.87:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:39,397] WARN [AdminClient clientId=adminclient-1] Connection to node -3 (kk2/1.1.1.146:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)
[2021-08-04 10:41:39,499] WARN [AdminClient clientId=adminclient-1] Connection to node -2 (kk2/1.1.1.145:7091) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient:746)

 Investigation showed that the firewall was enabled. (The "isn't a known config" WARN lines above are harmless; the AdminClient simply logs worker-level settings it does not recognize. The real symptom is the repeated failed connections to port 7091.)

systemctl status iptables

systemctl stop iptables

Retrying still produced the error, so test the port directly:

telnet xxxx 7091

After reading reference [3], it seemed this might be a configuration problem.

 The conclusion: start from the stock connect-distributed.properties and modify it rather than writing a file from scratch; on CDH the broker port is 9092, not 7091; and the firewall must be turned off.
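The telnet probe can also be scripted; a minimal sketch that attempts a TCP connection the same way (host and port are whatever you are diagnosing):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds,
    equivalent to a successful `telnet host port`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("kk1", 9092) should be True once the broker is reachable.
```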


 References:

[1] "Debezium同步MySQL变更到kafka集群及REST API使用方法汇总", 皛皛, CSDN blog.

[2] "大数据技术之Debezium", 菜鸟Coders, CSDN blog.

[3] "kafka连接生产者报错Connection to node -1 could not be established. Broker may not be available.", 千淘万漉, CSDN blog.
