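Each upstream Flume agent is configured with two avro sinks that belong to one sink group. The group uses the failover sink processor: when the active avro sink fails, the group switches to the other one.
The downstream tier (the collectors) runs two Flume agents, one primary and one standby. The upstream primary avro sink sends to the primary avro source, and the standby avro sink sends to the standby avro source. If the primary avro source on collector 1 goes down, the upstream agent automatically switches from the primary avro sink to the standby avro sink, and the standby avro source takes over.
This is how Flume achieves HA through configuration alone. HA in this sense only applies to a cascaded (multi-tier) topology.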
Upstream: tier 1
# Cascaded HA configuration, tier 1
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1
a1.sources.r1.channels = c1
a1.sources.r1.type = TAILDIR
a1.sources.r1.filegroups = g1
a1.sources.r1.filegroups.g1 = /logdata/a.*
a1.sources.r1.fileHeader = false
a1.channels.c1.type = memory
a1.channels.c1.capacity = 2000
a1.channels.c1.transactionCapacity = 1000
a1.sinks.k1.channel = c1
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = doitedu02
a1.sinks.k1.port = 4444
a1.sinks.k2.channel = c1
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = doitedu03
a1.sinks.k2.port = 4444
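# Sink group definition
a1.sinkgroups = g1
# One group containing both sinks
a1.sinkgroups.g1.sinks = k1 k2
# Group policy: failover
a1.sinkgroups.g1.processor.type = failover
# Sink priorities: the sink with the higher value is preferred (k1 is primary, k2 is standby)
a1.sinkgroups.g1.processor.priority.k1 = 200
a1.sinkgroups.g1.processor.priority.k2 = 100
# Maximum backoff (penalty) period for a failed sink, in milliseconds
a1.sinkgroups.g1.processor.maxpenalty = 5000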
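To make the priority and maxpenalty settings concrete, here is a minimal Python sketch of how a failover group could pick a sink. It is a simplified model, not Flume's actual implementation. The class names, the 1000 ms base penalty, and the doubling backoff are assumptions made for illustration; the Flume documentation only states that the cool-down grows with consecutive failures and is capped by maxpenalty.

```python
from dataclasses import dataclass


@dataclass
class Sink:
    name: str
    priority: int
    healthy: bool = True   # toggled by the simulation to mimic a dead collector
    failures: int = 0      # consecutive failures so far
    retry_at_ms: int = 0   # earliest time this sink may be tried again


class FailoverGroup:
    """Simplified model of a failover sink group (hypothetical, not Flume code)."""

    def __init__(self, sinks, max_penalty_ms=5000, base_penalty_ms=1000):
        # Highest priority first, mirroring processor.priority.<sink>.
        self.sinks = sorted(sinks, key=lambda s: s.priority, reverse=True)
        self.max_penalty_ms = max_penalty_ms
        self.base_penalty_ms = base_penalty_ms  # assumed value

    def deliver(self, now_ms):
        """Return the name of the sink that took the batch, or None."""
        for sink in self.sinks:
            if now_ms < sink.retry_at_ms:
                continue  # still cooling down after an earlier failure
            if sink.healthy:
                sink.failures = 0
                return sink.name
            # Delivery failed: back off, doubling per failure up to the cap.
            sink.failures += 1
            penalty = min(self.max_penalty_ms,
                          self.base_penalty_ms * 2 ** sink.failures)
            sink.retry_at_ms = now_ms + penalty
        return None


k1 = Sink("k1", priority=200)
k2 = Sink("k2", priority=100)
group = FailoverGroup([k1, k2], max_penalty_ms=5000)

print(group.deliver(0))      # k1 (primary is healthy)
k1.healthy = False
print(group.deliver(1000))   # k2 (k1 failed and is penalised until t=3000)
k1.healthy = True
print(group.deliver(2500))   # k2 (k1 is still cooling down)
print(group.deliver(3000))   # k1 (penalty expired, primary takes over again)
```

The point of the model is that traffic returns to k1 once its penalty expires, because k1 has the higher priority. A larger maxpenalty delays that return after repeated failures.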
Downstream: tier 2
# Cascaded HA configuration, tier 2 (node 1)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = doitedu02
a1.sources.r1.port = 4444
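# Note: batchSize is not a documented avro source property and is likely ignored; batching is controlled by the upstream avro sink's batch-size
a1.sources.r1.batchSize = 100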
a1.channels.c1.type = memory
a1.channels.c1.capacity = 2000
a1.channels.c1.transactionCapacity = 1000
a1.sinks.k1.channel = c1
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = doitedu01:9092,doitedu02:9092,doitedu03:9092
a1.sinks.k1.kafka.topic = failover
a1.sinks.k1.kafka.producer.acks = 1
# Cascaded HA configuration, tier 2 (node 2)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sources.r1.type = avro
a1.sources.r1.bind = doitedu03
a1.sources.r1.port = 4444
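# Note: batchSize is not a documented avro source property and is likely ignored; batching is controlled by the upstream avro sink's batch-size
a1.sources.r1.batchSize = 100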
a1.channels.c1.type = memory
a1.channels.c1.capacity = 2000
a1.channels.c1.transactionCapacity = 1000
a1.sinks.k1.channel = c1
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = doitedu01:9092,doitedu02:9092,doitedu03:9092
a1.sinks.k1.kafka.topic = failover
a1.sinks.k1.kafka.producer.acks = 1
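When testing failover (for example, by stopping the agent on node 1), it helps to confirm which collectors are actually listening on port 4444. The following is a small Python sketch for that check. The helper name collector_is_up is hypothetical, and a successful TCP connection only shows that the port is open, not that the avro source is processing events correctly.

```python
import socket


def collector_is_up(host: str, port: int = 4444, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example (hostnames taken from the configs above):
#   collector_is_up("doitedu02")
#   collector_is_up("doitedu03")
```

With node 1 stopped, the check for doitedu02 should report False while doitedu03 stays True, and new events should keep arriving in the failover Kafka topic through the standby collector.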