Kafka 2.7 standalone startup steps and Kafka Connect configuration
I had half forgotten this, so I'm writing down the standalone Kafka startup steps again. First download the kafka .tar.gz package and extract it; no configuration at all is needed before running the steps below.
1. Kafka 2.7 standalone startup steps
The overall flow: start ZooKeeper first, then the Kafka server; after that you can start producers, consumers, and so on.
1.1 Start ZooKeeper
From the Kafka installation directory, run:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
To run it in the background, add the -daemon flag:
sh zookeeper-server-start.sh -daemon /root/apps/kafka_2.13-2.7.0/config/zookeeper.properties
1.2 Start the Kafka server
In another terminal window, run:
$ bin/kafka-server-start.sh config/server.properties
To run it in the background, add the -daemon flag:
sh kafka-server-start.sh -daemon /root/apps/kafka_2.13-2.7.0/config/server.properties
1.3 Create a topic
$ bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092
1.4 Start a console producer
$ bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
1.5 Start a console consumer
Square brackets mark an optional flag:
$ bin/kafka-console-consumer.sh --topic quickstart-events [--from-beginning] --bootstrap-server localhost:9092
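What --from-beginning changes can be shown with a minimal in-memory sketch (plain Python, not the real Kafka client; the Topic class here just models an append-only log): without the flag, a console consumer only sees records produced after it attaches; with it, the topic is replayed from offset 0.

```python
class Topic:
    """A toy stand-in for a Kafka topic: an append-only record log."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def produce(self, value):
        self.log.append(value)

    def consume(self, from_beginning=False, attached_at=0):
        # attached_at is the offset at which the consumer joined.
        start = 0 if from_beginning else attached_at
        return self.log[start:]

topic = Topic("quickstart-events")
topic.produce("first event")
topic.produce("second event")

joined = len(topic.log)        # a consumer attaches at this point
topic.produce("third event")

print(topic.consume(attached_at=joined))     # only the record produced after joining
print(topic.consume(from_beginning=True))    # the full log, as with --from-beginning
```
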
2. Kafka Connect configuration
Find the Kafka Connect JDBC connector archive on Confluent, download it, and upload it to the server.
[root@iZuf6fp7azbufaq6loxytsZ apps]# ls
confluentinc-kafka-connect-jdbc jdk1.8.0_281 jdk-8u281-linux-x64.tar.gz kafka_2.13-2.7.0
After uploading and extracting, the directory looks like the listing above.
Next, only two files need to be configured: source-quickstart-sqlite.properties and connect-standalone.properties.
2.1 source-quickstart-sqlite.properties
This file lives under …/confluentinc-kafka-connect-jdbc/etc, i.e. the etc directory inside confluentinc-kafka-connect-jdbc. Try not to edit the original sample file; cp a copy and work on that. Below is my finished file, with the key settings annotated (it is messy and mostly comments, so feel free to skip ahead to the condensed block that follows):
#
# Copyright 2018 Confluent Inc.
#
# Licensed under the Confluent Community License (the "License"); you may not use
# this file except in compliance with the License. You may obtain a copy of the
# License at
#
# http://www.confluent.io/confluent-community-license
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OF ANY KIND, either express or implied. See the License for the
# specific language governing permissions and limitations under the License.
#
# A simple example that copies all tables from a SQLite database. The first few settings are
# required for all connectors: a name, the connector class to run, and the maximum number of
# tasks to create:
name=test-source-dm-jdbc
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
# The remaining configs are specific to the JDBC source connector. In this example, we connect to a
# SQLite database stored in the file test.db, use an auto-incrementing column called 'id' to
# detect new rows as they are added, and output to topics prefixed with 'test-sqlite-jdbc-', e.g.
# a table called 'users' will be written to the topic 'test-sqlite-jdbc-users'.
connection.url=jdbc:dm://<database IP>:5236/<database>?user=<username>&password=<password>&characterEncoding=utf-8
#table.whitelist=kmeans
table.whitelist=household3
mode=incrementing
incrementing.column.name=id
topic.prefix=test-
# Define when identifiers should be quoted in DDL and DML statements.
# The default is 'always' to maintain backward compatibility with prior versions.
# Set this to 'never' to avoid quoting fully-qualified or simple table and column names.
#quote.sql.identifiers=always
Most of that is comments; the settings that actually need configuring are these:
name=test-source-dm-jdbc // the name of this connector
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector // the connector class to use; may need changing for other data sources such as Oracle
tasks.max=1 // the number of tasks; increase it for a cluster deployment or higher throughput
// The sample file shows how the url should be written:
#connection.url=jdbc:mysql://10.129.225.254:3306/sparkdb?user=root&password=root&useUnicode=true&characterEncoding=utf-8&serverTimezone=UTC
#connection.url=jdbc:mysql://192.168.120.1:3306/sparkdb?user=root&password=root&useUnicode=true&characterEncoding=utf-8&serverTimezone=UTC
connection.url=jdbc:dm://<database IP>:5236/<database>?user=<username>&password=<password>&characterEncoding=utf-8
table.whitelist=household3 // the table to monitor
mode=incrementing // the mode: incrementing, timestamp, or timestamp+incrementing
incrementing.column.name=id // the column used to detect new rows
topic.prefix=test- // the topic prefix; the resulting topic here is test-household3
That is the most fiddly part done; it really just configures the JDBC connection and what to monitor on the table.
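What mode=incrementing does can be sketched roughly like this (a simulation, not the connector's real code, assuming the connector remembers the largest incrementing.column.name value it has seen and emits only rows above it; the table rows here are made up for illustration):

```python
TOPIC_PREFIX = "test-"
TABLE = "household3"
INCREMENTING_COLUMN = "id"

def poll(table_rows, last_seen_id):
    """One poll cycle: return (new_records, updated_last_seen_id)."""
    new_rows = [r for r in table_rows if r[INCREMENTING_COLUMN] > last_seen_id]
    if new_rows:
        last_seen_id = max(r[INCREMENTING_COLUMN] for r in new_rows)
    return new_rows, last_seen_id

# topic.prefix + table name gives the output topic:
topic = TOPIC_PREFIX + TABLE                       # "test-household3"

rows = [{"id": 1, "power": 0.5}, {"id": 2, "power": 0.7}]
batch1, last_id = poll(rows, last_seen_id=-1)      # first poll: both rows are new

rows.append({"id": 3, "power": 0.9})               # a new row is inserted
batch2, last_id = poll(rows, last_id)              # second poll: only id=3 is emitted

print(topic, [r["id"] for r in batch1], [r["id"] for r in batch2])
```

This is also why the column in incrementing.column.name must be strictly increasing: rows with ids at or below the stored watermark are never picked up again.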
2.2 connect-standalone.properties
This file is at …/kafka_2.13-2.7.0/config, the config subdirectory of the Kafka folder. Again, I suggest working on a cp of the file. There is much less to configure here; the full file and then a condensed version follow:
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# These are defaults. This file just demonstrates how to override some settings.
bootstrap.servers=localhost:9092
# The converters specify the format of data in Kafka and how to translate it into Connect data. Every Connect user will
# need to configure these based on the format they want their data in when loaded from or stored into Kafka
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
# Converter-specific settings can be passed in by prefixing the Converter's setting with the converter we want to apply
# it to
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
# Flush much faster than normal, which is useful for testing/debugging
offset.flush.interval.ms=10000
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/root/apps/confluentinc-kafka-connect-jdbc
And the condensed version:
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
// Only the last line needs changing. It is the path from which connector plugins are loaded; the JDBC driver jar for the third-party data source you want to connect to also needs to go in there. I'd actually suggest pointing it a couple of directory levels higher to widen the scan scope.
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
plugin.path=/root/apps/confluentinc-kafka-connect-jdbc
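With JsonConverter and schemas.enable=true, each record on the topic is a JSON envelope carrying a schema alongside the payload. The sketch below shows that envelope shape; the field names (id, power for household3) and types are illustrative assumptions, since the real schema is derived from the source table:

```python
import json

# What a JDBC source record roughly looks like on the wire when
# key/value.converter=JsonConverter and schemas.enable=true:
record = {
    "schema": {
        "type": "struct",
        "fields": [
            {"field": "id", "type": "int32", "optional": False},
            {"field": "power", "type": "double", "optional": True},
        ],
        "name": "household3",
    },
    "payload": {"id": 1, "power": 0.5},
}

wire_bytes = json.dumps(record).encode("utf-8")   # the bytes written to Kafka

# A consumer or sink decodes the envelope and reads the payload against the schema:
decoded = json.loads(wire_bytes)
print(decoded["schema"]["name"], decoded["payload"]["id"])
```

Setting schemas.enable=false would strip the envelope and ship only the payload object, at the cost of sinks no longer knowing the field types.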
3. Start Kafka Connect
All that's left is to start Kafka Connect from Kafka's bin directory:
[root@iZuf6fp7azbufaq6loxytsZ bin]# sh connect-standalone.sh /root/apps/kafka_2.13-2.7.0/config/connect-standalone.properties /root/apps/confluentinc-kafka-connect-jdbc/etc/source-dmjdbc-dm.properties
My command is shown above; adjust the two file paths to match your own setup. After a successful start you can see the generated SQL statements in the log.
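One reason a restarted standalone worker does not re-read the whole table: source offsets are persisted to offset.storage.file.filename (/tmp/connect.offsets above) on the cadence set by offset.flush.interval.ms. The real file is a binary format, not JSON; this sketch only models the resume-on-restart idea:

```python
import json
import os
import tempfile

# A throwaway stand-in for /tmp/connect.offsets:
OFFSET_FILE = os.path.join(tempfile.mkdtemp(), "connect.offsets")

def load_offset():
    """What a (re)starting connector does: read the last committed offset."""
    if not os.path.exists(OFFSET_FILE):
        return -1                      # no offsets yet: start from scratch
    with open(OFFSET_FILE) as f:
        return json.load(f)["household3"]["id"]

def store_offset(last_id):
    """What an offset flush does: persist the watermark per source partition."""
    with open(OFFSET_FILE, "w") as f:
        json.dump({"household3": {"id": last_id}}, f)

first_start = load_offset()            # -1: fresh start, read the table from the top
store_offset(42)                       # a flush after reading rows up to id=42
after_restart = load_offset()          # 42: the "restarted" worker resumes from here
print(first_start, after_restart)
```

Deleting the offsets file is therefore a crude way to force a standalone connector to re-ingest everything from the beginning.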