Download and extract
$ wget https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.3.0/kafka_2.12-2.3.0.tgz
$ tar -xzf kafka_2.12-2.3.0.tgz
$ cd kafka_2.12-2.3.0
Start the services
Kafka uses ZooKeeper, so start ZooKeeper first;
for a single-node setup, see the Kafka single-node installation guide
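For reference, the single-node startup sequence looks roughly like this, using the scripts and default config files that ship with the distribution (run from the extracted Kafka directory):

```shell
# Start ZooKeeper first (Kafka 2.3 still depends on it)
bin/zookeeper-server-start.sh config/zookeeper.properties &

# Then start the Kafka broker
bin/kafka-server-start.sh config/server.properties &
```

Both scripts run in the foreground by default, so backgrounding with `&` (or using separate terminals) is needed when starting them from one shell.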
# Create a topic
$ bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test
# List topics
$ bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Brokers can also be configured to create topics automatically;
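Auto-creation is controlled by a broker setting in config/server.properties (in Kafka 2.3 it defaults to true):

```properties
# When true, producing to or fetching from a nonexistent topic creates it
auto.create.topics.enable=true
```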
Produce and consume messages
# Run the producer
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
# Run the consumer
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning
Set up multiple brokers
# Copy the broker config file for each additional broker
$ cp config/server.properties config/server-1.properties
$ cp config/server.properties config/server-2.properties
# Edit the configs
$ vi config/server-1.properties
broker.id=1
listeners=PLAINTEXT://:9093
log.dirs=/tmp/kafka-logs-1
$ vi config/server-2.properties
broker.id=2
listeners=PLAINTEXT://:9094
log.dirs=/tmp/kafka-logs-2
broker.id is the node's name in the cluster, so it must be unique;
because all brokers run on the same machine, each one also needs its own listener port and log directory;
# Start the new brokers
$ bin/kafka-server-start.sh config/server-1.properties &
$ bin/kafka-server-start.sh config/server-2.properties &
# Create a topic with a replication factor of 3
$ bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 3 --partitions 1 --topic my-replicated-topic
# See which brokers host the topic
$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic my-replicated-topic
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
The first line is a summary of all partitions; each subsequent line describes one partition. Since this topic has only one partition, there is only one such line.
"leader": the node that handles all reads and writes for the partition;
"replicas": the list of nodes that hold a replica of the partition;
"isr": the list of replica nodes currently in sync with the leader;
Here node 1 is the leader of the topic's single partition.
# Describe the original single-replica topic
$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic test
Topic:test PartitionCount:1 ReplicationFactor:1 Configs:
Topic: test Partition: 0 Leader: 0 Replicas: 0 Isr: 0
The test topic has no extra replicas and lives on node 0.
# Produce messages to the replicated topic
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
my test message 1
my test message 2
# Consume the messages
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic
Failover
# Broker 1 is the leader; kill its process
$ ps aux | grep server-1.properties
$ kill -9 7564
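Rather than reading the PID off the `ps` output by hand, a small pipeline can extract it (the `[s]` bracket trick keeps grep from matching its own command line; the variable name `pid` here is just illustrative):

```shell
# ps aux prints the PID in the second column; awk pulls it out.
# grep '[s]erver-1.properties' matches the broker process but not
# the grep process itself, whose command line contains the brackets.
pid=$(ps aux | grep '[s]erver-1.properties' | awk '{print $2}')
echo "$pid"
# then: kill -9 "$pid"
```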
# Check the cluster state
$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic my-replicated-topic
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 2 Replicas: 1,2,0 Isr: 2,0
Node 1 is no longer in the in-sync replica list, and leadership has moved to node 2.
Importing and exporting data with Kafka Connect
Kafka Connect is Kafka's tool for importing data into and exporting data out of Kafka.
# Create a source file
$ echo -e "foo\nbar" > test.txt
# Connect worker configuration
$ vi config/connect-standalone.properties
# the Kafka brokers to connect to
bootstrap.servers=localhost:9092
# serialization format for the data
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
# Source connector configuration
$ vi config/connect-file-source.properties
# connector name
name=local-file-source
# connector class
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test
# Sink connector configuration
$ vi config/connect-file-sink.properties
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test
# Start Connect in standalone mode
$ bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
Connect reads records from the source file, writes them into Kafka in the configured format, and the sink connector exports them back out to a file.
# Check the sink file
$ more test.sink.txt
# Check the data in Kafka
$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
# Append to the source file
$ echo Another line >> test.txt
Processing data with Kafka Streams
Kafka Streams is a client library for building mission-critical real-time applications and microservices.
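As a quick taste, the Kafka distribution ships with a WordCount demo that can be launched from the shell; per the Kafka 2.3 quickstart it reads from a topic named streams-plaintext-input, so that topic must exist and hold some text first:

```shell
# Run the bundled Kafka Streams WordCount example application
bin/kafka-run-class.sh org.apache.kafka.streams.examples.wordcount.WordCountDemo
```

The demo continuously counts words from the input topic and writes running counts to an output topic, which can then be read with kafka-console-consumer.sh.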