Flume version 1.7, Kafka version 0.9.0.0
1. Technology selection
Agent1:exec source + memory channel + avro sink
Agent2:avro source + memory channel + kafka sink
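Before wiring the two agents together, the file that Agent1's exec source will tail must exist. A minimal preparation sketch, using the path that appears in the Agent1 config below:

```shell
# Create the log file that Agent1's exec source will tail
mkdir -p /home/hadoop/data
touch /home/hadoop/data/data.log
```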
2. Agent1 configuration
exec-memory-avro.conf
exec-memory-avro.sources = r1
exec-memory-avro.sinks = k1
exec-memory-avro.channels = c1
# Describe/configure the source
exec-memory-avro.sources.r1.type = exec
exec-memory-avro.sources.r1.command = tail -f /home/hadoop/data/data.log
exec-memory-avro.sources.r1.shell = /bin/sh -c
# Describe the sink
exec-memory-avro.sinks.k1.type = avro
exec-memory-avro.sinks.k1.hostname = localhost
exec-memory-avro.sinks.k1.port = 44444
# Use a channel which buffers events in memory
exec-memory-avro.channels.c1.type = memory
exec-memory-avro.channels.c1.capacity = 1000
exec-memory-avro.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
exec-memory-avro.sources.r1.channels = c1
exec-memory-avro.sinks.k1.channel = c1
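Once the file is saved, Agent1 can be launched with `flume-ng`; start it after Agent2, so its avro sink has a listening avro source to connect to. A sketch, assuming the config file was placed in `$FLUME_HOME/conf`:

```shell
# Start Agent1 (run this AFTER Agent2 is up)
flume-ng agent \
  --name exec-memory-avro \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/exec-memory-avro.conf \
  -Dflume.root.logger=INFO,console
```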
3. Agent2 configuration
avro-memory-kafka.conf
avro-memory-kafka.sources = r1
avro-memory-kafka.sinks = k1
avro-memory-kafka.channels = c1
# Describe/configure the source
avro-memory-kafka.sources.r1.type = avro
avro-memory-kafka.sources.r1.bind = localhost
avro-memory-kafka.sources.r1.port = 44444
# Describe the Kafka sink
avro-memory-kafka.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
avro-memory-kafka.sinks.k1.kafka.bootstrap.servers = localhost:9092
avro-memory-kafka.sinks.k1.kafka.topic = kafka_topic
avro-memory-kafka.sinks.k1.kafka.flumeBatchSize = 5
# Use a channel which buffers events in memory
avro-memory-kafka.channels.c1.type = memory
avro-memory-kafka.channels.c1.capacity = 1000
avro-memory-kafka.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
avro-memory-kafka.sources.r1.channels = c1
avro-memory-kafka.sinks.k1.channel = c1
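Agent2 is started the same way, and it should come up first so its avro source on port 44444 is listening before Agent1's avro sink connects. Again assuming the config file lives in `$FLUME_HOME/conf`:

```shell
# Start Agent2 (run this BEFORE Agent1)
flume-ng agent \
  --name avro-memory-kafka \
  --conf $FLUME_HOME/conf \
  --conf-file $FLUME_HOME/conf/avro-memory-kafka.conf \
  -Dflume.root.logger=INFO,console
```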
4. Start Kafka following the single-node, single-broker Kafka deployment article, then create a topic named kafka_topic
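In Kafka 0.9, topic administration goes through ZooKeeper. A sketch of the creation command, assuming ZooKeeper is running on localhost:2181:

```shell
# Create the topic the Kafka sink writes to
kafka-topics.sh --create \
  --zookeeper localhost:2181 \
  --replication-factor 1 \
  --partitions 1 \
  --topic kafka_topic
```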
5. Start the pieces in order: Kafka first, then the avro-memory-kafka agent, then the exec-memory-avro agent, so the avro source is already listening when the avro sink tries to connect
6. Testing
Append data to data.log:
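For example, appending a line; the exec source's `tail -f` picks up new lines as they arrive:

```shell
echo "hello kafka" >> /home/hadoop/data/data.log
```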
Start a Kafka consumer to read the data:
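With Kafka 0.9, the console consumer can read via ZooKeeper; lines appended to data.log should show up here:

```shell
# Consume from the topic the Flume Kafka sink writes to
kafka-console-consumer.sh \
  --zookeeper localhost:2181 \
  --topic kafka_topic \
  --from-beginning
```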
END!!!