Flume Examples

Contents

 

一、agent1: Fan-out

二、agent2: Fan-out with multiplexing

三、agent3: processor-failover

四、agent4: processor-load_balance

五、agent5: interceptors

六、agent6: interceptors with regex matching

七、agent7: selector + interceptors

八、Configuration on the other two servers: tdh001 (agent1) and tdh003 (agent1)

九、flume+kafka

 

 

 

 

 

Setup: three hosts are used: tdh001, tdh002, tdh003.

tdh002 is the node that collects the external data.

 

一、agent1: Fan-out

Topology: source (http) on tdh002 → tdh001 (agent1) / tdh003 (agent1)

       HTTP test command: curl -X POST -d '[{ "headers" :{"a" : "a1","b" : "b1"},"body" : "hello~http~flume~"}]' http://11.11.192.13:33333

      Flume start command (run from the conf directory): ../bin/flume-ng agent --conf . --conf-file ./agent1 --name agent1 -Dflume.root.logger=INFO,console

The same pattern of commands applies to the sections below.

--------------------------------------------------------------------------------------

agent1.sources=s1

agent1.channels=c1 c2

agent1.sinks=k1 k2

 

agent1.sources.s1.type=http

agent1.sources.s1.port=33333

 

agent1.channels.c1.type=memory

agent1.channels.c1.capacity=1000

agent1.channels.c1.transactionCapacity=100

 

agent1.channels.c2.type=memory

agent1.channels.c2.capacity=1000

agent1.channels.c2.transactionCapacity=100

 

agent1.sinks.k1.type=avro

agent1.sinks.k1.hostname=11.11.192.1

agent1.sinks.k1.port=33333

 

agent1.sinks.k2.type=avro

agent1.sinks.k2.hostname=11.11.192.25

agent1.sinks.k2.port=33333

 

agent1.sources.s1.channels=c1 c2

agent1.sinks.k1.channel=c1

agent1.sinks.k2.channel=c2
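
A quick fan-out check (a sketch; it assumes the receiving agents from section 八 are already running on tdh001 and tdh003):

       # send one event to the HTTP source on tdh002
       curl -X POST -d '[{ "headers" :{"a" : "a1"},"body" : "fan-out-test"}]' http://11.11.192.13:33333
       # the default replicating selector copies the event into both c1 and c2,
       # so the logger sinks on tdh001 and tdh003 should each print one copy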

 

二、agent2: Fan-out with multiplexing

Topology: source (http) on tdh002 → tdh001 (agent1) / tdh003 (agent1)

       curl -X POST -d '[{ "headers" :{"flag" : "a"},"body" : "aaaaaaaaa~"}]' http://11.11.192.13:33333

       ../bin/flume-ng agent --conf . --conf-file ./agent2 --name agent2 -Dflume.root.logger=INFO,console

--------------------------------------------------------------------------------------

agent2.sources=s1

agent2.channels=c1 c2

agent2.sinks=k1 k2

 

agent2.sources.s1.type=http

agent2.sources.s1.bind=0.0.0.0

agent2.sources.s1.port=33333

agent2.sources.s1.selector.type=multiplexing

agent2.sources.s1.selector.header=flag

agent2.sources.s1.selector.mapping.a=c1

agent2.sources.s1.selector.mapping.b=c2

agent2.sources.s1.selector.default=c1

 

agent2.channels.c1.type=memory

agent2.channels.c1.capacity=1000

agent2.channels.c1.transactionCapacity=100

 

agent2.channels.c2.type=memory

agent2.channels.c2.capacity=1000

agent2.channels.c2.transactionCapacity=100

 

agent2.sinks.k1.type=avro

agent2.sinks.k1.hostname=tdh001

agent2.sinks.k1.port=33333

 

agent2.sinks.k2.type=avro

agent2.sinks.k2.hostname=tdh003

agent2.sinks.k2.port=33333

 

agent2.sources.s1.channels=c1 c2

agent2.sinks.k1.channel=c1

agent2.sinks.k2.channel=c2
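
A minimal routing test for the multiplexing selector (assuming the receiving agents on tdh001 and tdh003 are running); routing is driven by the value of the flag header:

       # flag=a matches mapping.a -> channel c1 -> logged on tdh001
       curl -X POST -d '[{ "headers" :{"flag" : "a"},"body" : "to tdh001"}]' http://11.11.192.13:33333
       # flag=b matches mapping.b -> channel c2 -> logged on tdh003
       curl -X POST -d '[{ "headers" :{"flag" : "b"},"body" : "to tdh003"}]' http://11.11.192.13:33333
       # any other value falls back to selector.default (c1)
       curl -X POST -d '[{ "headers" :{"flag" : "x"},"body" : "to the default channel"}]' http://11.11.192.13:33333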

三、agent3: processor-failover

Topology: source (http) on tdh002 → tdh001 (agent1) / tdh003 (agent1)

       curl -X POST -d '[{"headers":{"flag":"a"},"body":"aaaaaa"}]' http://11.11.192.13:33333

       ../bin/flume-ng agent --conf . --conf-file ./agent3 --name a3 -Dflume.root.logger=INFO,console

--------------------------------------------------------------------------------------

a3.sources=s1

a3.channels=c1

a3.sinks=k1 k2

 

a3.sources.s1.type=http

a3.sources.s1.port=33333

 

a3.channels.c1.type=memory

a3.channels.c1.capacity=1000

a3.channels.c1.transactionCapacity=100

 

a3.sinkgroups=g1

a3.sinkgroups.g1.sinks=k1 k2

a3.sinkgroups.g1.processor.type=failover

a3.sinkgroups.g1.processor.priority.k1=5

a3.sinkgroups.g1.processor.priority.k2=10

a3.sinkgroups.g1.processor.maxpenalty=10000

 

a3.sinks.k1.type=avro

a3.sinks.k1.hostname=tdh001

a3.sinks.k1.port=33333

 

a3.sinks.k2.type=avro

a3.sinks.k2.hostname=tdh003

a3.sinks.k2.port=33333

 

a3.sources.s1.channels=c1

a3.sinks.k1.channel=c1

a3.sinks.k2.channel=c1
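
A rough failover test (a sketch, assuming both receiving agents start out running). With the failover processor, the sink with the larger priority value, k2 (tdh003, priority 10), takes the traffic; k1 (tdh001, priority 5) only takes over when k2 fails:

       # with both receivers up, the event should be logged on tdh003
       curl -X POST -d '[{"headers":{"flag":"a"},"body":"before failover"}]' http://11.11.192.13:33333
       # stop the flume agent on tdh003 (e.g. Ctrl-C in its console), then resend
       curl -X POST -d '[{"headers":{"flag":"a"},"body":"after failover"}]' http://11.11.192.13:33333
       # this event should now be logged on tdh001; after tdh003 comes back,
       # traffic returns to k2 once its backoff penalty (capped by maxpenalty=10000 ms) expires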

四、agent4: processor-load_balance

Topology: source (http) on tdh002 → tdh001 (agent1) / tdh003 (agent1)

       curl -X POST -d '[{"headers":{"flag":"a"},"body":"aaaaaa"}]' http://11.11.192.13:33333

       ../bin/flume-ng agent --conf . --conf-file ./agent4 --name a4 -Dflume.root.logger=INFO,console

--------------------------------------------------------------------------------------

a4.sources=s1

a4.channels=c1

a4.sinks=k1 k2

 

a4.sources.s1.type=http

a4.sources.s1.port=33333

 

a4.channels.c1.type=memory

a4.channels.c1.capacity=1000

a4.channels.c1.transactionCapacity=100

 

a4.sinkgroups=g1

a4.sinkgroups.g1.sinks=k1 k2

a4.sinkgroups.g1.processor.type=load_balance

a4.sinkgroups.g1.processor.selector=round_robin

a4.sinkgroups.g1.processor.backoff=true

 

a4.sinks.k1.type=avro

a4.sinks.k1.hostname=tdh001

a4.sinks.k1.port=33333

 

a4.sinks.k2.type=avro

a4.sinks.k2.hostname=tdh003

a4.sinks.k2.port=33333

 

a4.sources.s1.channels=c1

a4.sinks.k1.channel=c1

a4.sinks.k2.channel=c1
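
A quick load-balancing check (a sketch, assuming both receivers are running): with the round_robin selector, successive events should be spread across the two avro sinks:

       # send a few events in a row
       for i in 1 2 3 4; do
         curl -X POST -d '[{"headers":{"flag":"a"},"body":"event-'$i'"}]' http://11.11.192.13:33333
       done
       # the logger output should show the events split between tdh001 and tdh003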

五、agent5: interceptors

Topology: source (http) on tdh002 → tdh001 (agent1)

       curl -X POST -d '[{"headers":{"":""},"body":"A"}]' http://11.11.192.13:33333

       ../bin/flume-ng agent --conf . --conf-file ./agent5 --name a5 -Dflume.root.logger=INFO,console

--------------------------------------------------------------------------------------

a5.sources=s1

a5.channels=c1

a5.sinks=k1

 

a5.sources.s1.type=http

a5.sources.s1.port=33333

 

a5.sources.s1.interceptors=i1 i2 i3 i4

## timestamp

a5.sources.s1.interceptors.i1.type=timestamp

a5.sources.s1.interceptors.i1.preserveExisting=false

##host

a5.sources.s1.interceptors.i2.type=host

a5.sources.s1.interceptors.i2.useIP=false

a5.sources.s1.interceptors.i2.hostHeader=a5host

a5.sources.s1.interceptors.i2.preserveExisting=false

## static header; the key and value are user-defined

a5.sources.s1.interceptors.i3.type=static

a5.sources.s1.interceptors.i3.key=static_key

a5.sources.s1.interceptors.i3.value=static_value

a5.sources.s1.interceptors.i3.preserveExisting=false

##UUID

a5.sources.s1.interceptors.i4.type=org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder

a5.sources.s1.interceptors.i4.headerName=uuid

a5.sources.s1.interceptors.i4.prefix=su

a5.sources.s1.interceptors.i4.preserveExisting=false

 

a5.channels.c1.type=memory

a5.channels.c1.capacity=1000

a5.channels.c1.transactionCapacity=100

 

a5.sinks.k1.type=avro

a5.sinks.k1.hostname=tdh001

a5.sinks.k1.port=33333

 

a5.sources.s1.channels=c1

a5.sinks.k1.channel=c1
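
To see what the interceptor chain adds, send an event and look at the headers printed by the logger sink on tdh001 (a sketch; the exact values depend on when and where the event is processed):

       curl -X POST -d '[{"headers":{"":""},"body":"A"}]' http://11.11.192.13:33333
       # the logged event should carry roughly these headers:
       #   timestamp  - epoch milliseconds, added by the timestamp interceptor (i1)
       #   a5host     - the hostname of the collecting node, added by the host interceptor (i2, useIP=false)
       #   static_key - static_value, added by the static interceptor (i3)
       #   uuid       - a UUID prefixed with "su", added by the UUID interceptor (i4)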

       

六、agent6: interceptors with regex matching

Topology: source (http) on tdh002 → tdh001 (agent1)

       curl -X POST -d '[{"headers":{"":""},"body":"111aa:bb"}]' http://11.11.192.13:33333

       ../bin/flume-ng agent --conf . --conf-file ./agent6 --name a6 -Dflume.root.logger=INFO,console

--------------------------------------------------------------------------------------

a6.sources=s1

a6.channels=c1

a6.sinks=k1

 

a6.sources.s1.type=http

a6.sources.s1.port=33333

 

a6.sources.s1.interceptors=i1 i2 i3

## search replace

a6.sources.s1.interceptors.i1.type=search_replace

a6.sources.s1.interceptors.i1.searchPattern=[1-9]+

a6.sources.s1.interceptors.i1.replaceString=0

a6.sources.s1.interceptors.i1.charset=UTF-8

## regex filter

a6.sources.s1.interceptors.i2.type=regex_filter

a6.sources.s1.interceptors.i2.regex=^flume.*

a6.sources.s1.interceptors.i2.excludeEvents=true

## regex extractor; each () is one capture group

a6.sources.s1.interceptors.i3.type=regex_extractor

a6.sources.s1.interceptors.i3.regex=([a-z]+):([a-z]+)

a6.sources.s1.interceptors.i3.serializers=s1 s2

a6.sources.s1.interceptors.i3.serializers.s1.name=one

a6.sources.s1.interceptors.i3.serializers.s2.name=two

 

a6.channels.c1.type=memory

a6.channels.c1.capacity=1000

a6.channels.c1.transactionCapacity=100

 

a6.sinks.k1.type=avro

a6.sinks.k1.hostname=tdh001

a6.sinks.k1.port=33333

 

a6.sources.s1.channels=c1

a6.sinks.k1.channel=c1           
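
Tracing the sample body "111aa:bb" through the three interceptors (a sketch of the expected behaviour):

       curl -X POST -d '[{"headers":{"":""},"body":"111aa:bb"}]' http://11.11.192.13:33333
       # i1 (search_replace): "111" matches [1-9]+ and becomes "0" -> body is now "0aa:bb"
       # i2 (regex_filter): excludeEvents=true drops bodies matching ^flume.*; "0aa:bb" does not match, so it passes
       # i3 (regex_extractor): ([a-z]+):([a-z]+) captures "aa" and "bb" -> headers one=aa, two=bb
       # the logger sink on tdh001 should show body "0aa:bb" with headers one and two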

 

七、agent7: selector + interceptors

Topology: source (http) on tdh002 → tdh001 (agent1) / tdh003 (agent1)

       curl -X POST -d '[{"headers":{"":""},"body":"1aa:bb"}]' http://11.11.192.13:33333

curl -X POST -d '[{"headers":{"":""},"body":"2aa:bb"}]' http://11.11.192.13:33333

       ../bin/flume-ng agent --conf . --conf-file ./agent7 --name a7 -Dflume.root.logger=INFO,console

--------------------------------------------------------------------------------------

a7.sources=s1

a7.channels=c1 c2

a7.sinks=k1 k2

 

a7.sources.s1.type=http

a7.sources.s1.port=33333

 

a7.sources.s1.interceptors=i1 i2

a7.sources.s1.interceptors.i1.type=regex_extractor

a7.sources.s1.interceptors.i1.regex=(^1).*

a7.sources.s1.interceptors.i1.serializers=se1

a7.sources.s1.interceptors.i1.serializers.se1.name=one

a7.sources.s1.interceptors.i2.type=regex_extractor

a7.sources.s1.interceptors.i2.regex=(^2).*

a7.sources.s1.interceptors.i2.serializers=se1

a7.sources.s1.interceptors.i2.serializers.se1.name=two

 

a7.sources.s1.selector.type=multiplexing

a7.sources.s1.selector.header=one

a7.sources.s1.selector.mapping.1=c1

a7.sources.s1.selector.default=c1

 

# Note: these keys override the selector definition above (a source has only one
# selector, and later key=value pairs in the properties file replace earlier ones).
# The effective selector keys off header "two": bodies starting with "2" get two=2
# and are routed to c2; everything else, including bodies starting with "1", falls
# back to the default channel c1.
a7.sources.s1.selector.type=multiplexing

a7.sources.s1.selector.header=two

a7.sources.s1.selector.mapping.2=c2

a7.sources.s1.selector.default=c1

 

a7.channels.c1.type=memory

a7.channels.c1.capacity=1000

a7.channels.c1.transactionCapacity=100

 

a7.channels.c2.type=memory

a7.channels.c2.capacity=1000

a7.channels.c2.transactionCapacity=100

 

a7.sinks.k1.type=avro

a7.sinks.k1.hostname=tdh001

a7.sinks.k1.port=33333

 

a7.sinks.k2.type=avro

a7.sinks.k2.hostname=tdh003

a7.sinks.k2.port=33333

 

a7.sources.s1.channels=c1 c2

a7.sinks.k1.channel=c1

a7.sinks.k2.channel=c2

 

八、Configuration on the other two servers: tdh001 (agent1) and tdh003 (agent1)

agent1.sources=s1

agent1.channels=c1

agent1.sinks=k1

 

agent1.sources.s1.type=avro

agent1.sources.s1.bind=0.0.0.0

agent1.sources.s1.port=33333

 

agent1.channels.c1.type=memory

agent1.channels.c1.capacity=1000

agent1.channels.c1.transactionCapacity=100

 

agent1.sinks.k1.type=logger

 

agent1.sources.s1.channels=c1

agent1.sinks.k1.channel=c1
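
This receiving agent is started the same way as in section 一, on both tdh001 and tdh003 (from the conf directory):

       ../bin/flume-ng agent --conf . --conf-file ./agent1 --name agent1 -Dflume.root.logger=INFO,console
       # the logger sink prints every event received on the avro source (port 33333) to the console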

 

九、flume+kafka

                                                                                          

agent.sources=s1

agent.channels=c1

agent.sinks=k1

 

agent.sources.s1.type=exec

agent.sources.s1.command=tail -F /log

 

agent.channels.c1.type=memory

agent.channels.c1.capacity=1000

agent.channels.c1.transactionCapacity=100

 

agent.sinks.k1.type=org.apache.flume.sink.kafka.KafkaSink

agent.sinks.k1.brokerList=11.11.192.13:9092

agent.sinks.k1.topic=flume_kafka

agent.sinks.k1.serializer.class=kafka.serializer.StringEncoder

 

agent.sources.s1.channels=c1

agent.sinks.k1.channel=c1

 

 

Run the following commands from the Kafka installation directory:

Start ZooKeeper: bin/zookeeper-server-start.sh -daemon config/zookeeper.properties &

Start Kafka: bin/kafka-server-start.sh -daemon config/server.properties &

Create the topic: bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic flume_kafka

Consume the data: bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic flume_kafka --from-beginning
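
To push some test data through the pipeline (a sketch; /log is the file tailed by the exec source on the agent host):

       # append a line to the tailed file
       echo "hello flume+kafka $(date)" >> /log
       # the line should show up in the kafka-console-consumer output for topic flume_kafka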
