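In a typical agent, data flows from Source -> Channel -> Sink. The previous chapter used only a single Source, a single Channel, and a single Sink, but there is actually one more layer between these three components.
Between the Source and the Channel sits a selector, called a Channel Selector, which decides how events from the Source flow into the Channels. There are three kinds. Replicating copies every event to each of the specified Channels. Multiplexing routes events to different Channels according to business needs, based on a value in the event header. Custom selectors require writing code.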
Replicating mode
a1.sources = r1
a1.channels = c1 c2 c3
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2 c3
Multiplexing mode
a1.sources = r1
a1.channels = c1 c2 c3 c4
a1.sources.r1.selector.type = multiplexing
# Name of the event header to inspect
a1.sources.r1.selector.header = state
# Events with state=CZ go to c1
a1.sources.r1.selector.mapping.CZ = c1
# Events with state=US go to c2 and c3
a1.sources.r1.selector.mapping.US = c2 c3
# Everything else goes to c4
a1.sources.r1.selector.default = c4
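The routing decision made by the two built-in selectors can be sketched in a few lines of Python. This is only an illustration of the idea, not Flume's implementation: the function names `replicating` and `multiplexing` and the dictionary-based event are assumptions made for the example, while the mapping mirrors the multiplexing configuration above.

```python
def replicating(channels, event):
    """Replicating selector: every event is copied to all configured channels."""
    return list(channels)


def multiplexing(header, mapping, default, event):
    """Multiplexing selector: route by the value of a single event header."""
    value = event["headers"].get(header)
    # Fall back to the default channels when the header is missing or unmapped.
    return list(mapping.get(value, default))


# Hypothetical events; the mapping mirrors the configuration above.
mapping = {"CZ": ["c1"], "US": ["c2", "c3"]}
default = ["c4"]

cz_event = {"headers": {"state": "CZ"}, "body": b"log line"}
us_event = {"headers": {"state": "US"}, "body": b"log line"}
other_event = {"headers": {}, "body": b"log line"}

print(multiplexing("state", mapping, default, cz_event))     # ['c1']
print(multiplexing("state", mapping, default, us_event))     # ['c2', 'c3']
print(multiplexing("state", mapping, default, other_event))  # ['c4']
print(replicating(["c1", "c2", "c3"], cz_event))             # ['c1', 'c2', 'c3']
```

An event whose `state` header is missing or has an unmapped value falls through to the default channel, which is why `c4` is worth configuring.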
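Between the Channel and the Sink there is also a processor, called a Sink Processor. Sinks here are organized into groups. processor.type can be one of: default, failover, or load_balance. default is the plain single-sink case; it does not need to be written out, and the configuration follows the earlier source - channel - sink layout. failover maintains a priority list defined in the configuration and guarantees that every valid event is processed as long as at least one Sink is available; the Sink with the larger priority value is activated first. load_balance distributes load across multiple Sinks, using either round_robin or random selection.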
Failover configuration
#Name the components on this agent
a1.sources= r1
a1.sinks= k1 k2
a1.channels= c1 c2
a1.sinkgroups= g1
a1.sinkgroups.g1.sinks= k1 k2
a1.sinkgroups.g1.processor.type= failover
a1.sinkgroups.g1.processor.priority.k1= 5
a1.sinkgroups.g1.processor.priority.k2= 10
a1.sinkgroups.g1.processor.maxpenalty= 10000
#Describe/configure the source
a1.sources.r1.type= syslogtcp
a1.sources.r1.port= 50000
a1.sources.r1.host= 192.168.233.128
a1.sources.r1.channels= c1 c2
#Describe the sink
a1.sinks.k1.type= avro
a1.sinks.k1.channel= c1
a1.sinks.k1.hostname= 192.168.233.129
a1.sinks.k1.port= 50000
a1.sinks.k2.type= avro
a1.sinks.k2.channel= c2
a1.sinks.k2.hostname= 192.168.233.130
a1.sinks.k2.port= 50000
# Use channels which buffer events in memory
a1.channels.c1.type= memory
a1.channels.c1.capacity= 1000
a1.channels.c1.transactionCapacity= 100
a1.channels.c2.type= memory
a1.channels.c2.capacity= 1000
a1.channels.c2.transactionCapacity= 100
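To make the priority rule concrete, here is a minimal Python sketch of how a failover processor could choose the active Sink. It is a simplification under stated assumptions: the helper `pick_active` and the `failed` set are hypothetical, and the sketch deliberately ignores the penalty timer that `maxpenalty` bounds in a real agent.

```python
def pick_active(priorities, failed):
    """Return the live sink with the largest priority value, or None if all failed."""
    live = {name: prio for name, prio in priorities.items() if name not in failed}
    if not live:
        return None
    return max(live, key=live.get)


# Priorities taken from the failover configuration above.
priorities = {"k1": 5, "k2": 10}

print(pick_active(priorities, failed=set()))         # k2
print(pick_active(priorities, failed={"k2"}))        # k1
print(pick_active(priorities, failed={"k1", "k2"}))  # None
```

With the priorities from the configuration, k2 is the active Sink, and k1 only takes over while k2 is marked as failed.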
Load-balancing configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin
# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 50000
a1.sources.r1.host = 192.168.233.128
a1.sources.r1.channels = c1
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = 192.168.233.129
a1.sinks.k1.port = 50000
a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = 192.168.233.130
a1.sinks.k2.port = 50000
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
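The round_robin behavior can be illustrated with a short Python sketch. This is not Flume's code: the class `RoundRobinSelector`, the helper `deliver`, and the `is_up` callback are hypothetical names for the example, and the backoff of failed Sinks enabled by `processor.backoff` is not modeled.

```python
class RoundRobinSelector:
    """Hands out sinks in rotating order, one starting position per batch."""

    def __init__(self, sinks):
        self.sinks = list(sinks)
        self.next_index = 0

    def order(self):
        """Return the sinks in the order to try for this batch, then advance."""
        n = len(self.sinks)
        ordered = [self.sinks[(self.next_index + i) % n] for i in range(n)]
        self.next_index = (self.next_index + 1) % n
        return ordered


def deliver(selector, is_up):
    """Try sinks in selector order; return the first one that accepts the batch."""
    for sink in selector.order():
        if is_up(sink):
            return sink
    return None


selector = RoundRobinSelector(["k1", "k2"])
print([deliver(selector, lambda s: True) for _ in range(4)])
# ['k1', 'k2', 'k1', 'k2']

selector = RoundRobinSelector(["k1", "k2"])
print([deliver(selector, lambda s: s != "k1") for _ in range(2)])
# ['k2', 'k2']
```

When both Sinks are healthy the batches alternate between k1 and k2; when k1 is down, each batch falls through to k2.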