Following up on the previous flume + springboot article, this post walks through how to integrate Kafka.
1. First, add a configuration file named kafka_flume.conf under flume/conf on slave1 or slave2:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 10.1.18.202
a1.sources.r1.port = 44444
# This sink type would simply print events to the log (useful for debugging)
#a1.sinks.k1.type = logger
# Sink configuration: use an avro sink to forward the data downstream
a1.sinks.k1.type = avro
# hostname is the host name or IP address of the destination agent
a1.sinks.k1.hostname = 10.1.18.201
a1.sinks.k1.port = 44444
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
As before, the web server acts as the data source, and master:44444 is the sink target.
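For reference, here is a minimal sketch of how the Spring Boot (javaweb) app from the previous article could ship its logs into this avro source, using Flume's log4j appender (from the flume-ng-log4jappender module, which must be on the app's classpath along with flume-ng-sdk). The logger layout and names are assumptions for illustration, not taken from the original project:
# log4j.properties in the Spring Boot app (hypothetical sketch)
log4j.rootLogger=INFO, stdout, flume
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] %m%n
# Flume's log4j appender sends each log event to the avro source on slave1
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname=10.1.18.202
log4j.appender.flume.Port=44444
# don't let logging throw if the Flume agent is down
log4j.appender.flume.UnsafeMode=true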
2. Then add a kafka_flume.conf file under flume/conf on master as well; this is the file master's agent will start with:
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = 10.1.18.201
a1.sources.r1.port = 44444
# Newer property names for the Kafka sink (Flume 1.7+):
#a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
#a1.sinks.k1.kafka.topic = testtopics
#a1.sinks.k1.kafka.bootstrap.servers = 10.1.18.201:9092
#a1.sinks.k1.kafka.flumeBatchSize = 5
#a1.sinks.k1.kafka.producer.acks = 1
# Legacy property names (pre-1.7), used here:
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.brokerList = 10.1.18.201:9092
a1.sinks.k1.topic = testtopics
a1.sinks.k1.batchSize = 5
a1.sinks.k1.requiredAcks = 1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
The sink section is the obvious difference from the previous article: instead of forwarding to another agent, this sink writes to a Kafka topic. My topic name here is testtopics.
A quick note on how to create a topic in Kafka:
bin/kafka-topics.sh --create --zookeeper master:2181,slave1:2181,slave2:2181 --replication-factor 2 --partitions 2 --topic testtopics
After creating the topic, list topics to confirm it exists:
bin/kafka-topics.sh --list --zookeeper master:2181,slave1:2181,slave2:2181
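If you want more detail than --list gives, kafka-topics.sh also supports --describe, which shows the partition and replica assignment for the topic (same ZooKeeper connect string as above):
bin/kafka-topics.sh --describe --zookeeper master:2181,slave1:2181,slave2:2181 --topic testtopics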
3. The last step is to start everything and run it.
(1) Start the javaweb project
root@slave1:/opt/software# nohup java -jar logs_flume-0.0.1-SNAPSHOT.jar &
(2) Start ZooKeeper on every node in the cluster (this setup does not use the ZooKeeper bundled with Kafka)
root@slave1:/opt/zookeeper# bin/zkServer.sh start
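Once a quorum is up, each node should report itself as leader or follower; you can check with:
root@slave1:/opt/zookeeper# bin/zkServer.sh status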
(3) Start Kafka on every node in the cluster
root@slave1:/opt/kafka_2.11-2.1.1# bin/kafka-server-start.sh config/server.properties &
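Instead of backgrounding with &, Kafka's start script also takes a -daemon option, which detaches the broker and sends output to files under logs/ rather than the console; either way works here:
root@slave1:/opt/kafka_2.11-2.1.1# bin/kafka-server-start.sh -daemon config/server.properties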
(4) Check that everything started successfully
(5) Start the Flume agents (start Flume on the slave and on master separately; note that each uses its own config file!)
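A quick way to verify is jps on each node. Roughly, you should see the ZooKeeper and Kafka JVMs everywhere, plus the jar on slave1 (the PIDs below are illustrative):
root@slave1:/opt# jps
2381 QuorumPeerMain                    # ZooKeeper
2755 Kafka                             # Kafka broker
3102 logs_flume-0.0.1-SNAPSHOT.jar     # the javaweb app
3344 Jps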
./bin/flume-ng agent --conf conf --conf-file conf/kafka_flume.conf --name a1 -Dflume.root.logger=INFO,console
(6) The last step: create a consumer
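Order matters here: the slave's avro sink connects to master's avro source, so start the master agent first, then the slave agent, each with its own kafka_flume.conf (the /opt/flume install path below is an assumption; adjust to your layout):
# on master (avro source -> Kafka sink)
root@master:/opt/flume# ./bin/flume-ng agent --conf conf --conf-file conf/kafka_flume.conf --name a1 -Dflume.root.logger=INFO,console
# on slave1 (avro source -> avro sink pointing at master)
root@slave1:/opt/flume# ./bin/flume-ng agent --conf conf --conf-file conf/kafka_flume.conf --name a1 -Dflume.root.logger=INFO,console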
bin/kafka-console-consumer.sh --bootstrap-server master:9092 --from-beginning --topic testtopics
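If nothing shows up in the consumer, you can rule Flume out first by pushing a test message straight into the topic with the console producer (type a line, then check the consumer window):
bin/kafka-console-producer.sh --broker-list master:9092 --topic testtopics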
Exercise the javaweb app from a client and check that it runs and that the consumer prints the output correctly.
In my run, the logs came through as expected.
That's it. The next step is persisting the Kafka data to HDFS.