Log Collection: Integrating Flume and Kafka

To collect log files, we combine Flume with Kafka: Flume aggregates the logs, while Kafka buffers the data stream and smooths out traffic peaks.
(Figure: Flume and Kafka integration architecture)
As the figure above shows, Flume can act either as a Kafka producer or as a Kafka consumer.
1 Flume as a Producer
When Flume acts as a producer, its internal pipeline can take one of two forms: source-channel or source-channel-sink.
(1) The source-channel architecture
In the source-channel architecture, the channel is a KafkaChannel. Requirements analysis:

(Figure: requirements analysis)

Write the configuration file kafka_channel.conf:

# netcat source
a1.sources = r1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666

# kafka channel
a1.channels = c1
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.kafka.bootstrap.servers = hadoop102:9092
a1.channels.c1.kafka.topic = flume01
a1.channels.c1.kafka.consumer.group.id = flume-consume

# bind the source to the channel
a1.sources.r1.channels = c1

Start the ZooKeeper cluster and then the Kafka cluster, and list the topics in Kafka:

[jl@hadoop102 job]$  kafka-topics.sh --list --bootstrap-server hadoop102:9092
__consumer_offsets
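
Only the internal __consumer_offsets topic exists at this point. If the brokers have auto.create.topics.enable turned on (the Kafka default), flume01 will be created automatically the first time the channel writes to it; otherwise you can create it up front. A minimal sketch, with partition and replication counts that are placeholders to adapt to your cluster:

[jl@hadoop102 job]$ kafka-topics.sh --create --bootstrap-server hadoop102:9092 --topic flume01 --partitions 1 --replication-factor 1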

Start Flume, then send data with netcat:

[jl@hadoop102 flume]$ bin/flume-ng agent -n a1 -c conf/ -f job/kafka_channel.conf -Dflume.root.logger=INFO,console
[jl@hadoop102 ~]$ nc localhost 6666
hello      
OK

List the topics again:

[jl@hadoop102 job]$  kafka-topics.sh --list --bootstrap-server hadoop102:9092
__consumer_offsets
flume01
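
The flume01 topic now exists. To check the partition and replication settings it was created with, describe it:

[jl@hadoop102 job]$ kafka-topics.sh --describe --bootstrap-server hadoop102:9092 --topic flume01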

Start a consumer:

[jl@hadoop102 job]$ kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic flume01
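
Only records produced after the consumer starts will be shown; to replay everything already written to the topic, add --from-beginning:

[jl@hadoop102 job]$ kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic flume01 --from-beginning

Note that a KafkaChannel stores events as Avro-serialized FlumeEvent records by default (parseAsFlumeEvent defaults to true), so the console output may include a few serialization bytes around the message body.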

(Screenshot: the consumer receives the message sent via netcat)

(2) The source-channel-sink architecture
In the source-channel-sink architecture, the source is a netcat source, the channel is a memory channel, and the sink is a Kafka sink. Requirements analysis:

(Figure: requirements analysis)

Prepare the configuration file kafka_sink.conf:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# netcat source
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 6666

#  channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# kafka sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = hadoop102:9092
a1.sinks.k1.kafka.topic = flume02

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
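
The sink works as configured above; for throughput tuning, the KafkaSink example in the Flume user guide additionally sets pass-through producer properties. An optional sketch with illustrative values:

# optional producer tuning (illustrative values)
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.sinks.k1.kafka.producer.compression.type = snappy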

Start the ZooKeeper cluster and then the Kafka cluster, and list the topics in Kafka:

[jl@hadoop102 job]$  kafka-topics.sh --list --bootstrap-server hadoop102:9092
__consumer_offsets
flume01

Start Flume, then send data with netcat:

[jl@hadoop102 flume]$ bin/flume-ng agent -n a1 -c conf/ -f job/kafka_sink.conf -Dflume.root.logger=INFO,console
[jl@hadoop102 ~]$ nc localhost 6666
hello      
OK

List the topics again:

[jl@hadoop102 job]$  kafka-topics.sh --list --bootstrap-server hadoop102:9092
__consumer_offsets
flume01
flume02

Start a consumer:

[jl@hadoop102 job]$ kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic flume02

(Screenshot: the consumer receives the message sent via netcat)

2 Flume as a Consumer
When Flume acts as a consumer, its internal pipeline can take one of two forms: channel-sink or source-channel-sink.
(1) The channel-sink architecture
Requirements analysis:

(Figure: requirements analysis)

Prepare the configuration file kafka_channel.conf:

# Name the components on this agent
a1.sinks = k1
a1.channels = c1

# kafka channel
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.kafka.bootstrap.servers = hadoop102:9092
a1.channels.c1.kafka.topic = flume01
a1.channels.c1.kafka.consumer.group.id = custom.g.id


# logger sink
a1.sinks.k1.type = logger

# bind sink to the channel
a1.sinks.k1.channel = c1
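
One caveat: in this topology the records are written by a plain Kafka producer rather than by a Flume source, so the channel should not try to parse them as Avro FlumeEvent datums. Per the Flume user guide, set parseAsFlumeEvent to false in that case:

# messages come from kafka-console-producer.sh, not a Flume source,
# so treat them as raw bytes rather than Avro FlumeEvent records
a1.channels.c1.parseAsFlumeEvent = false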

Start the ZooKeeper cluster and then the Kafka cluster, then start Flume:

[jl@hadoop102 flume]$ bin/flume-ng agent -n a1 -c conf/ -f job/kafka_channel.conf -Dflume.root.logger=INFO,console

Start a Kafka producer and send some data:

[jl@hadoop102 flume]$ kafka-console-producer.sh --broker-list hadoop102:9092 --topic flume01

(Screenshot: the logger sink prints the records sent by the producer)

(2) The source-channel-sink architecture
Requirements analysis:

(Figure: requirements analysis)

Prepare the configuration file kafka_source.conf:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# kafka source
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.kafka.bootstrap.servers = hadoop102:9092
a1.sources.r1.kafka.topics = flume01
a1.sources.r1.kafka.consumer.group.id = custom.g.id
a1.sources.r1.batchSize = 100


# memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# logger sink
a1.sinks.k1.type = logger

# bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
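
Besides batchSize, the Kafka source accepts a bound on how long it waits to fill a batch, plus any Kafka consumer property prefixed with kafka.consumer.; a sketch with assumed values:

# wait at most one second to fill a batch
a1.sources.r1.batchDurationMillis = 1000
# start from the earliest offset when the group has none committed
a1.sources.r1.kafka.consumer.auto.offset.reset = earliest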

Start the ZooKeeper cluster and then the Kafka cluster, then start Flume:

[jl@hadoop102 flume]$ bin/flume-ng agent -n a1 -c conf/ -f job/kafka_source.conf -Dflume.root.logger=INFO,console

Start a Kafka producer and send some data:

[jl@hadoop102 flume]$ kafka-console-producer.sh --broker-list hadoop102:9092 --topic flume01

(Screenshot: the logger sink prints the records sent by the producer)
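
To confirm that the Flume source is actually consuming, you can also inspect the consumer group's offsets and lag:

[jl@hadoop102 job]$ kafka-consumer-groups.sh --bootstrap-server hadoop102:9092 --describe --group custom.g.id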
