Contents
4. agent4: processor-load_balance
6. agent6: interceptor-regex matching
7. agent7: selector-interceptor
8. Configuration of the other two servers: tdh001 (agent1), tdh003 (agent1)
Setup: three hosts are used: tdh001, tdh002, tdh003.
tdh002 is the node that collects external data.
1. agent1: fan-out
source(http) tdh001(agent1) tdh003(agent1)
HTTP command: curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "hello~http~flume~"}]' http://11.11.192.13:33333
Flume start command (run from the conf directory): ../bin/flume-ng agent --conf . --conf-file ./agent1 --name agent1 -Dflume.root.logger=INFO,console
The same pattern applies to the sections below.
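The curl call above posts a JSON array of events, each with a `headers` map and a `body` string. A minimal Python sketch of the same request follows; the helper names `build_payload` and `post_events` are illustrative and not part of Flume, and the URL is simply the one used in the curl example.

```python
import json
import urllib.request

def build_payload(events):
    """Serialize (headers, body) pairs into the JSON array the curl example sends."""
    return json.dumps([{"headers": h, "body": b} for h, b in events])

def post_events(url, events):
    """POST the events to the HTTP source and return the HTTP status code."""
    data = build_payload(events).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Same event as the curl command; uncomment only when the agent is running.
# post_events("http://11.11.192.13:33333",
#             [({"a": "a1", "b": "b1"}, "hello~http~flume~")])
```

Only `build_payload` can be exercised offline; `post_events` needs the agent on tdh002 to be listening.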
--------------------------------------------------------------------------------------
agent1.sources=s1
agent1.channels=c1 c2
agent1.sinks=k1 k2
agent1.sources.s1.type=http
agent1.sources.s1.port=33333
agent1.channels.c1.type=memory
agent1.channels.c1.capacity=1000
agent1.channels.c1.transactionCapacity=100
agent1.channels.c2.type=memory
agent1.channels.c2.capacity=1000
agent1.channels.c2.transactionCapacity=100
agent1.sinks.k1.type=avro
agent1.sinks.k1.hostname=11.11.192.1
agent1.sinks.k1.port=33333
agent1.sinks.k2.type=avro
agent1.sinks.k2.hostname=11.11.192.25
agent1.sinks.k2.port=33333
agent1.sources.s1.channels=c1 c2
agent1.sinks.k1.channel=c1
agent1.sinks.k2.channel=c2
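Because no selector is configured here, the source falls back to Flume's default replicating selector, so every event is written to both c1 and c2. A toy Python model of that behaviour is sketched below; the `replicate` function and the list-based channels are illustrative stand-ins, not Flume APIs.

```python
def replicate(event, channels):
    """Append a copy of the event to every channel, as a replicating selector does."""
    for queue in channels.values():
        queue.append(dict(event))

# Two in-memory "channels", mirroring c1 and c2 in the config above.
channels = {"c1": [], "c2": []}
replicate({"headers": {"a": "a1", "b": "b1"}, "body": "hello~http~flume~"}, channels)

# Both channels now hold one copy, so both k1 and k2 forward the event.
print({name: len(queue) for name, queue in channels.items()})
```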
2. agent2: fan-out with multiplexing
source(http) tdh001(agent1) tdh003(agent1)
curl -X POST -d '[{ "headers" :{"flag" : "a"},"body" : "aaaaaaaaa~"}]' http://11.11.192.13:33333
../bin/flume-ng agent --conf . --conf-file ./agent2 --name agent2 -Dflume.root.logger=INFO,console
--------------------------------------------------------------------------------------
agent2.sources=s1
agent2.channels=c1 c2
agent2.sinks=k1 k2
agent2.sources.s1.type=http
agent2.sources.s1.bind=0.0.0.0
agent2.sources.s1.port=33333
agent2.sources.s1.selector.type=multiplexing
agent2.sources.s1.selector.header=flag
agent2.sources.s1.selector.mapping.a=c1
agent2.sources.s1.selector.mapping.b=c2
agent2.sources.s1.selector.default=c1
agent2.channels.c1.type=memory
agent2.channels.c1.capacity=1000
agent2.channels.c1.transactionCapacity=100
agent2.channels.c2.type=memory
agent2.channels.c2.capacity=1000
agent2.channels.c2.transactionCapacity=100
agent2.sinks.k1.type=avro
agent2.sinks.k1.hostname=tdh001
agent2.sinks.k1.port=33333
agent2.sinks.k2.type=avro
agent2.sinks.k2.hostname=tdh003
agent2.sinks.k2.port=33333
agent2.sources.s1.channels=c1 c2
agent2.sinks.k1.channel=c1
agent2.sinks.k2.channel=c2
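The multiplexing selector looks at the `flag` header and picks a channel from the mapping, falling back to the default when the header is missing or unmapped. A small Python sketch of that lookup follows; `select_channels` and the two constants are illustrative names, not Flume APIs.

```python
MAPPING = {"a": ["c1"], "b": ["c2"]}  # selector.mapping.a / selector.mapping.b
DEFAULT = ["c1"]                      # selector.default

def select_channels(headers, header="flag"):
    """Return the channel names for an event based on one header value."""
    return MAPPING.get(headers.get(header), DEFAULT)

print(select_channels({"flag": "a"}))  # mapped to c1
print(select_channels({"flag": "b"}))  # mapped to c2
print(select_channels({}))             # no flag header, falls back to the default
```

With the curl command above (`"flag" : "a"`), the event would therefore reach only tdh001 via k1.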
3. agent3: processor-failover
source(http) tdh001(agent1) tdh003(agent1)
curl -X POST -d '[{"headers":{"flag":"a"},"body":"aaaaaa"}]' http://11.11.192.13:33333
../bin/flume-ng agent --conf . --conf-file ./agent3 --name a3 -Dflume.root.logger=INFO,console
--------------------------------------------------------------------------------------
a3.sources=s1
a3.channels=c1
a3.sinks=k1 k2
a3.sources.s1.type=http
a3.sources.s1.port=33333
a3.channels.c1.type=memory
a3.channels.c1.capacity=1000
a3.channels.c1.transactionCapacity=100
a3.sinkgroups=g1
a3.sinkgroups.g1.sinks=k1 k2
a3.sinkgroups.g1.processor.type=failover
a3.sinkgroups.g1.processor.priority.k1=5
a3.sinkgroups.g1.processor.priority.k2=10
a3.sinkgroups.g1.processor.maxpenalty=10000
a3.sinks.k1.type=avro
a3.sinks.k1.hostname=tdh001
a3.sinks.k1.port=33333
a3.sinks.k2.type=avro
a3.sinks.k2.hostname=tdh003
a3.sinks.k2.port=33333
a3.sources.s1.channels=c1
a3.sinks.k1.channel=c1
a3.sinks.k2.channel=c1
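With the failover processor, the sink with the larger priority value is used first, and the other takes over only while it is failing. A simplified Python sketch of that selection, using the intended priorities k1=5 and k2=10, is shown below; `active_sink` is a hypothetical helper, and the penalty timing governed by `maxpenalty` is deliberately left out.

```python
PRIORITY = {"k1": 5, "k2": 10}  # processor.priority.k1 / processor.priority.k2

def active_sink(failed=frozenset()):
    """Pick the live sink with the highest priority, or None if all have failed."""
    live = [sink for sink in PRIORITY if sink not in failed]
    return max(live, key=PRIORITY.get) if live else None

print(active_sink())               # k2 (tdh003) is preferred while healthy
print(active_sink({"k2"}))         # k1 (tdh001) takes over when k2 is down
print(active_sink({"k1", "k2"}))   # no sink is available
```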
4. agent4: processor-load_balance
source(http) tdh001(agent1) tdh003(agent1)
curl -X POST -d '[{"headers":{"flag":"a"},"body":"aaaaaa"}]' http://11.11.192.13:33333
../bin/flume-ng agent --conf . --conf-file ./agent4 --name a4 -Dflume.root.logger=INFO,console
--------------------------------------------------------------------------------------
a4.sources=s1
a4.channels=c1
a4.sinks=k1 k2
a4.sources.s1.type=http
a4.sources.s1.port=33333
a4.channels.c1.type=memory
a4.channels.c1.capacity=1000
a4.channels.c1.transactionCapacity=100
a4.sinkgroups=g1
a4.sinkgroups.g1.sinks=k1 k2
a4.sinkgroups.g1.processor.type=load_balance
a4.sinkgroups.g1.processor.selector=round_robin
a4.sinkgroups.g1.processor.backoff=true
a4.sinks.k1.type=avro
a4.sinks.k1.hostname=tdh001
a4.sinks.k1.port=33333
a4.sinks.k2.type=avro
a4.sinks.k2.hostname=tdh003
a4.sinks.k2.port=33333
a4.sources.s1.channels=c1
a4.sinks.k1.channel=c1
a4.sinks.k2.channel=c1
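The `round_robin` selector rotates through the sinks in the group, and a sink that is currently failing is skipped. The Python sketch below models this as a simplification: `round_robin` and its `down` parameter are illustrative names, each step stands for one delivery attempt rather than one event, and the timed backoff enabled by `backoff=true` is not modelled.

```python
from itertools import cycle

def round_robin(sinks, attempts, down=frozenset()):
    """Return the sink chosen for each delivery attempt, skipping failed sinks."""
    order = cycle(sinks)
    chosen = []
    for _ in range(attempts):
        # Try each sink at most once per attempt, in rotation order.
        for _ in range(len(sinks)):
            sink = next(order)
            if sink not in down:
                chosen.append(sink)
                break
    return chosen

print(round_robin(["k1", "k2"], 4))               # alternates between k1 and k2
print(round_robin(["k1", "k2"], 3, down={"k1"}))  # only k2 is used while k1 is down
```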
5. agent5: interceptor
source(http) tdh001(agent1)
curl -X POST -d '[{"headers":{"":""},"body":"A"}]' http://11.11.192.13:33333
../bin/flume-ng agent --conf . --conf-file ./agent5 --name a5 -Dflume.root.logger=INFO,console
--------------------------------------------------------------------------------------
a5.sources=s1
a5.channels=c1
a5.sinks=k1
a5.sources.s1.type=http
a5.sources.s1.port=33333
a5.sources.s1.interceptors=i1 i2 i3 i4
## timestamp
a5.sources.s1.interceptors.i1.type=timestamp
a5.sources.s1.interceptors.i1.preserveExisting=false
##host
a5.sources.s1.interceptors.i2.type=host
a5.sources.s1.interceptors.i2.useIP=false
a5.sources.s1.interceptors.i2.hostHeader=a5host
a5.sources.s1.interceptors.i2.preserveExisting=false
## static header; key and value are user-defined
a5.sources.s1.interceptors.i3.type=static
a5.sources.s1.interceptors.i3.key=static_key
a5.sources.s1.interceptors.i3.value=static_value
a5.sources.s1.interceptors.i3.preserveExisting=false
##UUID
a5.sources.s1.interceptors.i4.type=org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
a5.sources.s1.interceptors.i4.headerName=uuid
a5.sources.s1.interceptors.i4.prefix=su
a5.sources.s1.interceptors.i4.preserveExisting=false
a5.channels.c1.type=memory
a5.channels.c1.capacity=1000
a5.channels.c1.transactionCapacity=100
a5.sinks.k1.type=avro
a5.sinks.k1.hostname=tdh001
a5.sinks.k1.port=33333
a5.sources.s1.channels=c1
a5.sinks.k1.channel=c1
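Each of the four interceptors adds one header to the event before it enters the channel. The Python sketch below approximates the resulting header map; `apply_interceptors` is a hypothetical helper, and the exact value formats (for example the UUID layout) are assumptions rather than a reproduction of Flume's output.

```python
import socket
import time
import uuid

def apply_interceptors(headers):
    """Return a copy of the headers with the four a5 interceptor headers added."""
    result = dict(headers)
    # preserveExisting=false everywhere, so existing values are overwritten.
    result["timestamp"] = str(int(time.time() * 1000))  # i1: epoch milliseconds
    result["a5host"] = socket.gethostname()             # i2: hostname, since useIP=false
    result["static_key"] = "static_value"               # i3: fixed key/value pair
    result["uuid"] = "su" + str(uuid.uuid4())           # i4: prefix + generated UUID
    return result

# The curl example sends a single empty header, which is kept alongside the new ones.
print(sorted(apply_interceptors({"": ""}).keys()))
```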
6. agent6: interceptor-regex matching
source(http) tdh001(agent1)
curl -X POST -d '[{"headers":{"":""},"body":"111aa:bb"}]' http://11.11.192.13:33333
../bin/flume-ng agent --conf . --conf-file ./agent6 --name a6 -Dflume.root.logger=INFO,console
--------------------------------------------------------------------------------------
a6.sources=s1
a6.channels=c1
a6.sinks=k1
a6.sources.s1.type=http
a6.sources.s1.port=33333
a6.sources.s1.interceptors=i1 i2 i3
## search replace
a6.sources.s1.interceptors.i1.type=search_replace
a6.sources.s1.interceptors.i1.searchPattern=[1-9]+
a6.sources.s1.interceptors.i1.replaceString=0
a6.sources.s1.interceptors.i1.charset=UTF-8
## regex filter
a6.sources.s1.interceptors.i2.type=regex_filter
a6.sources.s1.interceptors.i2.regex=^flume.*
a6.sources.s1.interceptors.i2.excludeEvents=true
## regex extractor: each () denotes one capture group
a6.sources.s1.interceptors.i3.type=regex_extractor
a6.sources.s1.interceptors.i3.regex=([a-z]+):([a-z]+)
a6.sources.s1.interceptors.i3.serializers=s1 s2
a6.sources.s1.interceptors.i3.serializers.s1.name=one
a6.sources.s1.interceptors.i3.serializers.s2.name=two
a6.channels.c1.type=memory
a6.channels.c1.capacity=1000
a6.channels.c1.transactionCapacity=100
a6.sinks.k1.type=avro
a6.sinks.k1.hostname=tdh001
a6.sinks.k1.port=33333
a6.sources.s1.channels=c1
a6.sinks.k1.channel=c1
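The three interceptors run in order: search-and-replace on the body, then the exclusion filter, then header extraction from the capture groups. The Python sketch below traces the sample body `111aa:bb` through that chain. Python's `re` module stands in for Java regular expressions (the two agree for these simple patterns), and `process` is an illustrative name.

```python
import re

def process(body):
    """Apply the a6 interceptor chain; return (headers, body) or None if dropped."""
    # i1 search_replace: every run of digits 1-9 becomes a single "0".
    body = re.sub(r"[1-9]+", "0", body)
    # i2 regex_filter with excludeEvents=true: drop events that match.
    if re.search(r"^flume.*", body):
        return None
    # i3 regex_extractor: capture groups 1 and 2 become headers "one" and "two".
    headers = {}
    match = re.search(r"([a-z]+):([a-z]+)", body)
    if match:
        headers["one"] = match.group(1)
        headers["two"] = match.group(2)
    return headers, body

print(process("111aa:bb"))   # ({'one': 'aa', 'two': 'bb'}, '0aa:bb')
print(process("flume:logs")) # None, because the body starts with "flume"
```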
7. agent7: selector-interceptor
source(http) tdh001(agent1) tdh003(agent1)
curl -X POST -d '[{"headers":{"":""},"body":"1aa:bb"}]' http://11.11.192.13:33333
curl -X POST -d '[{"headers":{"":""},"body":"2aa:bb"}]' http://11.11.192.13:33333
../bin/flume-ng agent --conf . --conf-file ./agent7 --name a7 -Dflume.root.logger=INFO,console
a7.sources=s1
a7.channels=c1 c2
a7.sinks=k1 k2
a7.sources.s1.type=http
a7.sources.s1.port=33333
a7.sources.s1.interceptors=i1 i2
a7.sources.s1.interceptors.i1.type=regex_extractor
a7.sources.s1.interceptors.i1.regex=(^1).*
a7.sources.s1.interceptors.i1.serializers=se1
a7.sources.s1.interceptors.i1.serializers.se1.name=one
a7.sources.s1.interceptors.i2.type=regex_extractor
a7.sources.s1.interceptors.i2.regex=(^2).*
a7.sources.s1.interceptors.i2.serializers=se1
a7.sources.s1.interceptors.i2.serializers.se1.name=two
## a source has exactly one channel selector, so routing uses a single header;
## events without a "two" header (including those starting with 1) fall through to the default c1
a7.sources.s1.selector.type=multiplexing
a7.sources.s1.selector.header=two
a7.sources.s1.selector.mapping.2=c2
a7.sources.s1.selector.default=c1
a7.channels.c1.type=memory
a7.channels.c1.capacity=1000
a7.channels.c1.transactionCapacity=100
a7.channels.c2.type=memory
a7.channels.c2.capacity=1000
a7.channels.c2.transactionCapacity=100
a7.sinks.k1.type=avro
a7.sinks.k1.hostname=tdh001
a7.sinks.k1.port=33333
a7.sinks.k2.type=avro
a7.sinks.k2.hostname=tdh003
a7.sinks.k2.port=33333
a7.sources.s1.channels=c1 c2
a7.sinks.k1.channel=c1
a7.sinks.k2.channel=c2
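This agent combines the two ideas: the regex_extractor interceptors turn the first character of the body into a header, and a multiplexing selector routes on it. The Python sketch below shows where the two sample bodies would end up, assuming the selector is keyed on the `two` header with c1 as the default; `route` is a hypothetical helper, and Python's `re` stands in for Java regex.

```python
import re

def route(body):
    """Return the channel an event body would be routed to."""
    headers = {}
    # i1: header "one" is set only when the body starts with 1.
    first = re.search(r"(^1).*", body)
    if first:
        headers["one"] = first.group(1)
    # i2: header "two" is set only when the body starts with 2.
    second = re.search(r"(^2).*", body)
    if second:
        headers["two"] = second.group(1)
    # Selector: mapping.2=c2 on header "two", everything else goes to default c1.
    return "c2" if headers.get("two") == "2" else "c1"

print(route("1aa:bb"))  # c1, delivered to tdh001 through k1
print(route("2aa:bb"))  # c2, delivered to tdh003 through k2
```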
8. Configuration of the other two servers: tdh001 (agent1), tdh003 (agent1)
agent1.sources=s1
agent1.channels=c1
agent1.sinks=k1
agent1.sources.s1.type=avro
agent1.sources.s1.bind=0.0.0.0
agent1.sources.s1.port=33333
agent1.channels.c1.type=memory
agent1.channels.c1.capacity=1000
agent1.channels.c1.transactionCapacity=100
agent1.sinks.k1.type=logger
agent1.sources.s1.channels=c1
agent1.sinks.k1.channel=c1
9. flume+kafka
agent.sources=s1
agent.channels=c1
agent.sinks=k1
agent.sources.s1.type=exec
agent.sources.s1.command=tail -F /log
agent.channels.c1.type=memory
agent.channels.c1.capacity=1000
agent.channels.c1.transactionCapacity=100
agent.sinks.k1.type=org.apache.flume.sink.kafka.KafkaSink
agent.sinks.k1.brokerList=11.11.192.13:33333
agent.sinks.k1.topic=flume_kafka
agent.sinks.k1.serializer.class=kafka.serializer.StringEncoder
agent.sources.s1.channels=c1
agent.sinks.k1.channel=c1
Run the following commands from the Kafka installation directory:
Start ZooKeeper: bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
Start Kafka: bin/kafka-server-start.sh -daemon config/server.properties
Create the topic: bin/kafka-topics.sh --create --zookeeper localhost:33333 --replication-factor 1 --partitions 1 --topic flume_kafka
Consume the Kafka data: bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic flume_kafka --from-beginning