Why use Flume to feed Kafka
In a production environment, most incremental data arrives as log files, so Flume is a good fit for collecting it in near real time.
Adding Kafka in between decouples collection from consumption: data can be classified and sent to different topics for multiple business lines, and new business lines can be added dynamically without creating extra copies of the data.
Case 1: a simple pipeline using a netcat source and a Kafka sink
1. Flume configuration file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
#source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
#channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
#sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = first
a1.sinks.k1.kafka.bootstrap.servers = 192.168.56.20:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
#bind
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
2. Start the Flume agent
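A minimal sketch of the launch and verification commands, assuming the configuration above is saved as conf/netcat-kafka.conf and that the commands are run from the Flume and Kafka installation directories respectively (both file paths are assumptions, not from the original):

```shell
# Start the Flume agent named a1 (the agent name must match the config prefix).
bin/flume-ng agent \
  --conf conf \
  --conf-file conf/netcat-kafka.conf \
  --name a1 \
  -Dflume.root.logger=INFO,console

# In a second terminal, consume the target topic to verify delivery
# (console consumer ships with Kafka; broker address taken from the config):
bin/kafka-console-consumer.sh \
  --bootstrap-server 192.168.56.20:9092 \
  --topic first

# In a third terminal, send test lines to the netcat source:
nc localhost 44444
```

Lines typed into nc should show up in the console consumer almost immediately, since the sink batches at most 20 events (flumeBatchSize) and lingers only 1 ms before sending.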