Kafka Connect has two important concepts: the task and the worker.
task: responsible for moving data into or out of Kafka.
worker: the container for connectors and their tasks; it manages connector configuration, starts connectors and connector tasks, and exposes a REST API.
Converter: converts data between Kafka Connect and the external storage systems that data is sent to or received from.
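To make the converter's role concrete, here is a rough Python sketch (not Connect's actual code, string values only) of how a JSON converter with schemas.enable=true wraps each record value in a schema/payload envelope on its way into Kafka, and unwraps it on the way out:

```python
import json

def to_connect_json(value):
    # With schemas.enable=true, JsonConverter writes the value wrapped
    # in a schema/payload envelope (simplified sketch).
    envelope = {"schema": {"type": "string", "optional": False},
                "payload": value}
    return json.dumps(envelope).encode("utf-8")

def from_connect_json(raw):
    # On the way out of Kafka, the envelope is stripped again.
    return json.loads(raw.decode("utf-8"))["payload"]

raw = to_connect_json("hello,sink")
print(raw.decode("utf-8"))
print(from_connect_json(raw))  # hello,sink
```

This envelope shape is exactly what shows up later in the sink file when the schema is not stripped.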
The worker process uses the configuration file connect-standalone.properties:
[root@localhost config]# ls
connect-console-sink.properties connect-distributed.properties connect-file-source.properties connect-standalone.properties log4j.properties server.properties trogdor.conf
connect-console-source.properties connect-file-sink.properties connect-log4j.properties consumer.properties producer.properties tools-log4j.properties zookeeper.properties
[root@localhost config]# pwd
/opt/kafka/kafka_2.12-2.2.1/config
[root@localhost config]# vi connect-standalone.properties
# These are defaults. This file just demonstrates how to override some settings.
# Kafka cluster bootstrap address
bootstrap.servers=localhost:9092
# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
# Converter classes for record keys and values
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
# Whether JSON messages include a schema
key.converter.schemas.enable=true
value.converter.schemas.enable=true
# File path for storing offsets
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
# How often to commit offsets
offset.flush.interval.ms=10000
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
#plugin.path=
The configuration file used by the source connector is connect-file-source.properties:
[root@localhost config]# vi connect-file-source.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Connector name
name=local-file-source
# Connector class: the fully qualified name, or just the class name
connector.class=FileStreamSource
# Maximum number of tasks
tasks.max=1
# Path of the source data file
file=/tmp/source.txt
# Target topic name
topic=heilu
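A FileStreamSource task essentially tails the configured file and emits each newly appended line as a record for the topic, remembering how far it has read so the worker can persist that position (offset.storage.file.filename). A simplified Python sketch of one poll cycle (`poll_new_lines` is a hypothetical helper, not the connector's real code):

```python
import os
import tempfile

def poll_new_lines(path, offset):
    """Return (new_records, new_offset): all complete lines appended
    since `offset`. A simplified sketch of a file-source poll cycle."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith("\n"):   # partial line or EOF: wait for next poll
                break
            records.append(line.rstrip("\n"))
            offset = f.tell()
    return records, offset

# Demo: append to a file between polls, as `echo ... >> /tmp/source.txt` would.
path = os.path.join(tempfile.mkdtemp(), "source.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("hello,sink\n")
records, offset = poll_new_lines(path, 0)
print(records)                            # ['hello,sink']
with open(path, "a", encoding="utf-8") as f:
    f.write("second line\n")
print(poll_new_lines(path, offset)[0])    # ['second line']
```

Because the offset is persisted between poll cycles, a restarted worker resumes from where it left off instead of re-reading the whole file.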
The configuration file used by the sink connector is connect-file-sink.properties:
[root@localhost config]# vi connect-file-sink.properties
# Connector name
name=local-file-sink
# Connector class
connector.class=FileStreamSink
# Maximum number of tasks
tasks.max=1
# Path of the output file
file=/tmp/sink.txt
# Topic(s) to consume from
topics=heilu
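FileStreamSink is the mirror image of the source: it consumes records from the topic(s) listed in `topics` (plural, since a sink may subscribe to several) and appends each record to the output file. A rough sketch of that put step, assuming records arrive as plain strings (`sink_put` is a hypothetical helper):

```python
import os
import tempfile

def sink_put(path, records):
    # Append each consumed record to the sink file, one per line --
    # roughly what the FileStreamSink task does with each batch.
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(record + "\n")

sink_path = os.path.join(tempfile.mkdtemp(), "sink.txt")
sink_put(sink_path, ["hello,sink"])
with open(sink_path, encoding="utf-8") as f:
    print(f.read())            # hello,sink
```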
With Kafka already running, start the source connector:
[root@localhost kafka_2.12-2.2.1]# bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties // start command
...
[2019-12-02 19:10:54,424] INFO Kafka version: 2.2.1 (org.apache.kafka.common.utils.AppInfoParser:109)
[2019-12-02 19:10:54,425] INFO Kafka commitId: 55783d3133a5a49a (org.apache.kafka.common.utils.AppInfoParser:110)
[2019-12-02 19:10:54,441] INFO Created connector local-file-source (org.apache.kafka.connect.cli.ConnectStandalone:106) // started successfully
Start the sink connector:
[root@localhost kafka_2.12-2.2.1]# bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-sink.properties // start command
...
[2019-12-02 19:17:27,331] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectStandalone:124)
org.apache.kafka.connect.errors.ConnectException: Unable to initialize REST server
at org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:177)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:85)
Caused by: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:8083
To fix the above error, modify:
config/connect-standalone.properties
# These are defaults. This file just demonstrates how to override some settings.
bootstrap.servers=192.168.131.130:9092
rest.port=8084 // added parameter
# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
#key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter=org.apache.kafka.connect.storage.StringConverter // changed
#value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.storage.StringConverter // changed
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true
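The bind failure happens because every standalone worker embeds its own REST server, and both workers default to port 8083: the source worker already holds the port, so the second worker cannot bind and shuts down. Setting rest.port=8084 gives the sink worker its own port. The underlying conflict is easy to reproduce with plain sockets (Python sketch, arbitrary demo port standing in for 8083):

```python
import socket

def try_bind(port):
    # Attempt to bind a listening socket, as the worker's embedded
    # REST server does on rest.port (default 8083).
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind(("0.0.0.0", port))
        return s
    except OSError:            # e.g. EADDRINUSE -> "Failed to bind"
        s.close()
        return None

PORT = 18083                   # arbitrary free port for the demo
first = try_bind(PORT)         # the first worker grabs the port...
second = try_bind(PORT)        # ...so a second worker on the same port fails
print(first is not None, second is None)   # True True
if first:
    first.close()
```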
Write a line of text to the source file:
[root@localhost kafka_2.12-2.2.1]# echo "hello,sink">>/tmp/source.txt
[root@localhost kafka_2.12-2.2.1]# cat /tmp/source.txt
hello,sink
Check the sink file:
[root@localhost kafka_2.12-2.2.1]# cat /tmp/sink.txt
{"schema":{"type":"string","optional":false},"payload":"hello,sink"}
[root@localhost kafka_2.12-2.2.1]#
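The sink file contains the full JSON envelope rather than just `hello,sink`, presumably because the source worker was started before the converter change and is still serializing with JsonConverter (schemas.enable=true), while the sink's StringConverter passes those bytes through as an opaque string. The payload can be recovered from the envelope like so:

```python
import json

# One line from /tmp/sink.txt, copied from the output above.
line = '{"schema":{"type":"string","optional":false},"payload":"hello,sink"}'
record = json.loads(line)
print(record["payload"])       # hello,sink
```

Restarting the source worker with the StringConverter config (or setting schemas.enable=false on the JsonConverter) would make the sink file contain the bare string instead.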