Syncing Kafka Data to a File with Kafka Connect

1. Background

Kafka Connect is primarily used to stream data into and out of Kafka. Through its various Source Connector implementations it imports data from third-party systems into Kafka brokers, and through its Sink Connector implementations it exports data from Kafka brokers to third-party systems.

Official documentation: How to use Kafka Connect - Getting Started | Confluent Documentation

2. Development Environment

Middleware versions:
- ZooKeeper 3.7.0
- Kafka 0.10.2.1 (kafka_2.10-0.10.2.1, i.e. the Scala 2.10 build)

3. Configuring Kafka Connect

1. Go to Kafka's config directory and edit connect-standalone.properties:

```properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# These are defaults. This file just demonstrates how to override some settings.
bootstrap.servers=kafka-0:9092

# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=false
value.converter.schemas.enable=false

# The internal converter used for offsets and config data is configurable and must be specified, but most users will
# always want to use the built-in default. Offset and config data is never visible outside of Kafka Connect in this format.
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
# Key setting: the directory Connect scans for plugin JARs (see the note below)
#plugin.path=/home/kafka/plugins
```

Note: Early versions of Kafka Connect (including the 0.10.x release used here) do not support the plugin.path setting; you must instead point the CLASSPATH at the plugin location:

```bash
vi /etc/profile
# add this line to the file, then reload the profile:
export CLASSPATH=/home/kafka/*
source /etc/profile
```
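For this walkthrough the CLASSPATH step only matters for third-party connectors: the FileStreamSource/FileStreamSink connectors used below ship with Kafka itself and are already on the classpath. A quick way to confirm that, assuming Kafka is installed under /home/kafka (adjust the path to your installation):

```bash
# The FileStream connector classes live in this bundled JAR,
# which bin/kafka-run-class.sh adds to the classpath automatically
ls /home/kafka/libs/connect-file-*.jar
```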

2. Edit connect-file-sink.properties:

```properties
# (Apache License header omitted; identical to the one in connect-standalone.properties)

name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
# Local file the sink appends records to
file=/home/kafka/disposefile/data_t1.txt
# Kafka topic the sink consumes from
topics=sink_t1
```
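Once the worker is running (see section 4 below), you can smoke-test the sink by producing a few records to sink_t1 and watching the output file grow. A sketch using the console tools that ship with Kafka 0.10.x, assuming ZooKeeper at localhost:2181 and the broker at kafka-0:9092 (adjust to your environment); because the worker uses JsonConverter, the values you type should be valid JSON:

```bash
# Create the topic if it does not exist yet (0.10.x manages topics via ZooKeeper)
bin/kafka-topics.sh --create --zookeeper localhost:2181 \
  --replication-factor 1 --partitions 1 --topic sink_t1

# Type records such as {"msg":"hello"}, one per line
bin/kafka-console-producer.sh --broker-list kafka-0:9092 --topic sink_t1

# In another terminal: each record should be appended to the sink file
tail -f /home/kafka/disposefile/data_t1.txt
```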

3. Edit connect-file-source.properties:

```properties
# (Apache License header omitted; identical to the one in connect-standalone.properties)

name=local-file-source
connector.class=FileStreamSource
tasks.max=1
# Local file the source tails for new lines
file=/home/kafka/disposefile/data_t1.txt
# Kafka topic that each line of the file is published to
topic=source_t1
```
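As with the sink, once Connect is running you can verify the source by appending a line to the input file and consuming source_t1. A sketch, assuming the broker at kafka-0:9092:

```bash
# Each line appended to the file is published to source_t1 as one record
echo '{"msg":"hello from file"}' >> /home/kafka/disposefile/data_t1.txt

# Read the topic from the beginning to see the records
bin/kafka-console-consumer.sh --bootstrap-server kafka-0:9092 \
  --topic source_t1 --from-beginning
```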

4. Startup Command

Start a single standalone worker that runs both connectors:

```bash
nohup bin/connect-standalone.sh config/connect-standalone.properties \
  config/connect-file-source.properties config/connect-file-sink.properties \
  > /home/kafka/connectlog/connect-file.log 2>&1 &
```
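The standalone worker also exposes the Kafka Connect REST interface (port 8083 by default). A quick way to confirm that both connectors loaded and their tasks are running, assuming the default port:

```bash
# List the connectors registered with this worker
curl http://localhost:8083/connectors

# Show the state of each connector and its tasks
curl http://localhost:8083/connectors/local-file-source/status
curl http://localhost:8083/connectors/local-file-sink/status
```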

5. Going Further: Syncing Kafka and InfluxDB

Apache Kafka is a distributed streaming platform, and Kafka Connect is a tool for integrating Kafka with external systems that simplifies moving data between Kafka and those systems. InfluxDB is an open-source time-series database built for storing and analyzing time-series data.

To sync data between Kafka and InfluxDB, Kafka Connect connectors can move data in either direction. A typical setup involves the following steps:

1. Configure the InfluxDB connector: specify the target database, host, credentials, and other parameters. The configuration can live in a connector properties file or be submitted dynamically through the REST API (see the sketch below).
2. Start Kafka Connect: once configured, start the Connect service and make sure the InfluxDB connector has loaded and started.
3. Sync data: once the connector is running, it copies data from Kafka into InfluxDB, or imports data from InfluxDB into Kafka topics, according to its configuration.

Below is a simple example configuration for a connector that syncs data from Kafka into InfluxDB:

```properties
name=InfluxDBSinkConnector
connector.class=io.confluent.connect.influxdb.InfluxDBSinkConnector
tasks.max=1
topics=my_kafka_topic
influxdb.url=http://localhost:8086
influxdb.topic.map=my_kafka_topic:my_influxdb_database.my_influxdb_measurement
influxdb.username=root
influxdb.password=influxdbpassword
```

Note that this configuration must be adapted to your environment: the Kafka and InfluxDB addresses, credentials, topics, database name, and measurement name.
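As mentioned above, a connector configuration can also be submitted dynamically through the Connect REST API instead of a properties file (the usual approach when Connect runs in distributed mode). A minimal sketch, assuming the default REST port 8083 and the same illustrative names as the properties example:

```bash
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "InfluxDBSinkConnector",
    "config": {
      "connector.class": "io.confluent.connect.influxdb.InfluxDBSinkConnector",
      "tasks.max": "1",
      "topics": "my_kafka_topic",
      "influxdb.url": "http://localhost:8086"
    }
  }'
```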