Flume Integration with HBase and Kafka: Configuration
Two Flume agents first collect application service logs and push the data to a third Flume agent, which merges and pre-processes the logs. The merged stream is then fanned out through two channels, one feeding HBase and one feeding Kafka. For Flume fundamentals, refer to the Flume overview.
Three nodes are configured in total: Flume runs on the agent2 and agent3 nodes to collect data from the application services and forward it to the agent1 node. Taking agent2 as the example, edit the configuration file as follows.
agent2.sources = r1
agent2.channels = c1
agent2.sinks = k1
agent2.sources.r1.type = exec
agent2.sources.r1.command = tail -F /opt/datas/weblogs.log
agent2.sources.r1.channels = c1
agent2.channels.c1.type = memory
agent2.channels.c1.capacity = 10000
agent2.channels.c1.transactionCapacity = 10000
agent2.channels.c1.keep-alive = 5
agent2.sinks.k1.type = avro
agent2.sinks.k1.channel = c1
agent2.sinks.k1.hostname = bigdata-pro01
agent2.sinks.k1.port = 5555
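With the configuration above saved to a properties file, the agent can be launched with the `flume-ng` command-line tool. A minimal sketch, assuming an install path of `/opt/modules/flume` and a config file named `flume-conf.properties` (both are assumptions, not from the original):

```shell
# Start the agent2 agent in the foreground; install path and file name are assumptions
/opt/modules/flume/bin/flume-ng agent \
  --conf /opt/modules/flume/conf \
  --conf-file /opt/modules/flume/conf/flume-conf.properties \
  --name agent2 \
  -Dflume.root.logger=INFO,console
```

Note that `--name` must match the component prefix used in the properties file (`agent2` here); if it does not, Flume starts but loads no sources, channels, or sinks.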
Flume Integration with HBase
agent1.sources = r1
agent1.channels = kafkaC hbaseC
agent1.sinks = kafkaSink hbaseSink
agent1.sources.r1.type = avro
agent1.sources.r1.channels = kafkaC hbaseC
agent1.sources.r1.bind = bigdata-pro01
agent1.sources.r1.port = 5555
agent1.sources.r1.threads = 5
# Define a memory channel called hbaseC on agent1
agent1
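The hbaseC channel and the HBase sink on agent1 can be declared analogously to the agent2 channel. A minimal sketch, where the channel settings mirror agent2 and the sink uses Flume's built-in AsyncHBaseSink; the table name `weblogs` and column family `info` are assumptions:

```properties
# Memory channel feeding the HBase sink (settings mirror the agent2 channel)
agent1.channels.hbaseC.type = memory
agent1.channels.hbaseC.capacity = 10000
agent1.channels.hbaseC.transactionCapacity = 10000
agent1.channels.hbaseC.keep-alive = 5

# Async HBase sink; table and columnFamily values are assumptions
agent1.sinks.hbaseSink.type = asynchbase
agent1.sinks.hbaseSink.table = weblogs
agent1.sinks.hbaseSink.columnFamily = info
agent1.sinks.hbaseSink.channel = hbaseC
```

The target table and column family must already exist in HBase before the agent starts; the sink does not create them.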