[Flume] Installation, Deployment, and Use Cases

1. Official Site

Welcome to Apache Flume — Apache Flume

2. Download

Download — Apache Flume

3. Installation

3.1 Extract the downloaded Flume package into the /opt directory

3.2 Create the flume-env.sh configuration file

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume-env.sh.template flume-env.sh

3.3 Edit the flume-env.sh configuration file, mainly to set the JAVA_HOME variable

 
  # Licensed to the Apache Software Foundation (ASF) under one
  # or more contributor license agreements.  See the NOTICE file
  # distributed with this work for additional information
  # regarding copyright ownership.  The ASF licenses this file
  # to you under the Apache License, Version 2.0 (the
  # "License"); you may not use this file except in compliance
  # with the License.  You may obtain a copy of the License at
  #
  #     http://www.apache.org/licenses/LICENSE-2.0
  #
  # Unless required by applicable law or agreed to in writing, software
  # distributed under the License is distributed on an "AS IS" BASIS,
  # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  # See the License for the specific language governing permissions and
  # limitations under the License.
  # If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
  # during Flume startup.
  # Enviroment variables can be set here.
  # export JAVA_HOME=/usr/lib/jvm/java-6-sun
  export JAVA_HOME=/opt/jdk1.8.0_91
  # Give Flume more memory and pre-allocate, enable remote monitoring via JMX
  # export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote"
  # Note that the Flume conf directory is always included in the classpath.
  #FLUME_CLASSPATH=""

3.4 Verify the installation

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng version
  Flume 1.6.0
  Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
  Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
  Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
  From source with checksum b29e416802ce9ece3269d34233baf43f

If you see the output above, the installation succeeded.

4. Use Cases

4.1 Case 1: Avro

An Avro client can send a given file to Flume; the Avro source receives it via the Avro RPC mechanism.

4.1.1 Create the agent configuration file

Create the configuration file from the template:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume-conf.properties.template flume.conf

4.1.2 Configure the agent

When you run an agent, you tell Flume which configuration file to use with the -f option. Let's look at a basic example: copy the following and paste it into conf/flume.conf.

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = avro
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.bind = 0.0.0.0
  agent1.sources.avro-source1.port = 4141
  # sink
  agent1.sinks.logger-sink1.type = logger
  agent1.sinks.logger-sink1.channel = ch1
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100

4.1.3 Start the Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume.conf -n agent1 -Dflume.root.logger=INFO,console

4.1.4 Create the file to send

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ sudo touch log.00
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ sudo vim log.00

4.1.5 Send the file with avro-client

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng avro-client -c . -H 0.0.0.0 -p 4141 -F ../log.00

4.1.6 Check the output

In the console window where the agent was started, you should see output like the following; note the last line:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  ...
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 10:29:27 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 10:29:27 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume.conf
  16/09/19 10:29:27 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 10:29:27 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 10:29:27 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 10:29:27 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 10:29:27 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 10:29:27 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 10:29:27 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 10:29:27 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type avro
  16/09/19 10:29:27 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: logger
  16/09/19 10:29:27 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 10:29:27 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:Avro source avro-source1: { bindAddress: 0.0.0.0, port: 4141 } }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@453e9d5e counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 10:29:27 INFO node.Application: Starting Channel ch1
  16/09/19 10:29:27 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 10:29:27 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 10:29:27 INFO node.Application: Starting Sink logger-sink1
  16/09/19 10:29:27 INFO node.Application: Starting Source avro-source1
  16/09/19 10:29:27 INFO source.AvroSource: Starting Avro source avro-source1: { bindAddress: 0.0.0.0, port: 4141 }...
  16/09/19 10:29:27 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: avro-source1: Successfully registered new MBean.
  16/09/19 10:29:27 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: avro-source1 started
  16/09/19 10:29:27 INFO source.AvroSource: Avro source avro-source1 started.
  16/09/19 10:36:32 INFO ipc.NettyServer: [id: 0x072a068a, /127.0.0.1:42708 => /127.0.0.1:4141] OPEN
  16/09/19 10:36:32 INFO ipc.NettyServer: [id: 0x072a068a, /127.0.0.1:42708 => /127.0.0.1:4141] BOUND: /127.0.0.1:4141
  16/09/19 10:36:32 INFO ipc.NettyServer: [id: 0x072a068a, /127.0.0.1:42708 => /127.0.0.1:4141] CONNECTED: /127.0.0.1:42708
  16/09/19 10:36:33 INFO ipc.NettyServer: [id: 0x072a068a, /127.0.0.1:42708 :> /127.0.0.1:4141] DISCONNECTED
  16/09/19 10:36:33 INFO ipc.NettyServer: [id: 0x072a068a, /127.0.0.1:42708 :> /127.0.0.1:4141] UNBOUND
  16/09/19 10:36:33 INFO ipc.NettyServer: [id: 0x072a068a, /127.0.0.1:42708 :> /127.0.0.1:4141] CLOSED
  16/09/19 10:36:33 INFO ipc.NettyServer: Connection to /127.0.0.1:42708 disconnected.
  16/09/19 10:36:37 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65                Hello Flume }

4.2 Case 2: Spool

The spooldir source monitors a configured directory for new files and reads the data out of them. Two caveats:

(1) A file copied into the spool directory must not be opened and edited afterwards.

(2) The spool directory must not contain subdirectories.
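Because of caveat (1), a safe pattern is to write the file somewhere else first and then move it into the spool directory; a rename within one filesystem is atomic, so the source never sees a half-written file. A minimal sketch (the temporary directories here are stand-ins for the /home/xiaosi/logs spool directory used below):

```shell
# Stage the file outside the spool directory, then mv it in; mv within
# one filesystem is a rename, so the spooldir source never reads a
# partially written file.
SPOOL_DIR=$(mktemp -d)    # stand-in for /home/xiaosi/logs
STAGING_DIR=$(mktemp -d)
echo "Hello Flume spool" > "$STAGING_DIR/flume-log-1.log"
mv "$STAGING_DIR/flume-log-1.log" "$SPOOL_DIR/flume-log-1.log"
```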

4.2.1 Create the configuration file flume-spool.conf

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume.conf flume-spool.conf

Configure it as follows:

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = spooldir
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.spoolDir = /home/xiaosi/logs/
  agent1.sources.avro-source1.fileHeader = true
  agent1.sources.avro-source1.bind = 0.0.0.0
  agent1.sources.avro-source1.port = 4141
  # sink
  agent1.sinks.logger-sink1.type = logger
  agent1.sinks.logger-sink1.channel = ch1
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100

This monitors the /home/xiaosi/logs directory. (The bind and port settings are left over from the Avro example; the spooldir source does not use them.)

4.2.2 Start the Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-spool.conf -n agent1 -Dflume.root.logger=INFO,console

4.2.3 Add files to the monitored directory

 
  xiaosi@Qunar:~$ echo "Hello Flume first" > /home/xiaosi/logs/flume-log-1.log
  xiaosi@Qunar:~$ echo "Hello Flume second" > /home/xiaosi/logs/flume-log-2.log

4.2.4 Check the output

In the console window where the agent was started, you should see output like the following; note the last two lines:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-spool.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
  Info: Including Hive libraries found via (/opt/apache-hive-2.0.0-bin) for Hive access
  ...
  org.apache.flume.node.Application -f ../conf/flume-spool.conf -n agent1
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 11:29:52 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 11:29:52 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume-spool.conf
  16/09/19 11:29:52 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 11:29:52 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 11:29:52 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 11:29:52 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 11:29:52 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 11:29:52 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 11:29:52 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 11:29:52 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type spooldir
  16/09/19 11:29:52 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: logger
  16/09/19 11:29:52 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 11:29:52 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:Spool Directory source avro-source1: { spoolDir: /home/xiaosi/logs/ } }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@4f5f731e counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 11:29:52 INFO node.Application: Starting Channel ch1
  16/09/19 11:29:52 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 11:29:52 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 11:29:52 INFO node.Application: Starting Sink logger-sink1
  16/09/19 11:29:52 INFO node.Application: Starting Source avro-source1
  16/09/19 11:29:52 INFO source.SpoolDirectorySource: SpoolDirectorySource source starting with directory: /home/xiaosi/logs/
  16/09/19 11:29:52 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: avro-source1: Successfully registered new MBean.
  16/09/19 11:29:52 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: avro-source1 started
  16/09/19 11:30:06 INFO avro.ReliableSpoolingFileEventReader: Last read took us just up to a file boundary. Rolling to the next file, if there is one.
  16/09/19 11:30:06 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /home/xiaosi/logs/flume-log-1.log to /home/xiaosi/logs/flume-log-1.log.COMPLETED
  16/09/19 11:30:07 INFO sink.LoggerSink: Event: { headers:{file=/home/xiaosi/logs/flume-log-1.log} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 66 69 72 73 Hello Flume firs }
  16/09/19 11:30:21 INFO avro.ReliableSpoolingFileEventReader: Last read took us just up to a file boundary. Rolling to the next file, if there is one.
  16/09/19 11:30:21 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /home/xiaosi/logs/flume-log-2.log to /home/xiaosi/logs/flume-log-2.log.COMPLETED
  16/09/19 11:30:22 INFO sink.LoggerSink: Event: { headers:{file=/home/xiaosi/logs/flume-log-2.log} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 73 65 63 6F Hello Flume seco }

4.3 Case 3: Exec

The exec source runs a given command and uses its output as the data source. If you use the tail command, make sure enough data is appended to the file; otherwise you will not see any output.

4.3.1 Create the configuration file flume-exec.conf

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume.conf flume-exec.conf

Modify it as follows:

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = exec
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.command = tail -F /home/xiaosi/logs/flume-log-exec.log
  agent1.sources.avro-source1.bind = 0.0.0.0
  agent1.sources.avro-source1.port = 4141
  # sink
  agent1.sinks.logger-sink1.type = logger
  agent1.sinks.logger-sink1.channel = ch1
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100

4.3.2 Start the Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-exec.conf -n agent1 -Dflume.root.logger=INFO,console

4.3.3 Run the tail command

Append data to the file, generating enough lines:

 
  #!/bin/bash
  # {1..100} is a bash brace expansion, so run this with bash rather than sh
  for index in {1..100}
  do
     echo "Hello Flume $index" >> /home/xiaosi/logs/flume-log-exec.log
  done

At the same time, tail the file:

 
  xiaosi@Qunar:~$ tail -F /home/xiaosi/logs/flume-log-exec.log
  Hello Flume 1
  Hello Flume 2
  Hello Flume 3
  Hello Flume 4
  ...

4.3.4 Check the output

In the console window where the agent was started, you should see output like the following:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-exec.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
  ...
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 12:01:28 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 12:01:28 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume-exec.conf
  16/09/19 12:01:28 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 12:01:28 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 12:01:28 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 12:01:28 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 12:01:28 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 12:01:28 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 12:01:28 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 12:01:28 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type exec
  16/09/19 12:01:28 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: logger
  16/09/19 12:01:28 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 12:01:28 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:org.apache.flume.source.ExecSource{name:avro-source1,state:IDLE} }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@242d6c8b counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 12:01:28 INFO node.Application: Starting Channel ch1
  16/09/19 12:01:28 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 12:01:28 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 12:01:28 INFO node.Application: Starting Sink logger-sink1
  16/09/19 12:01:28 INFO node.Application: Starting Source avro-source1
  16/09/19 12:01:28 INFO source.ExecSource: Exec source starting with command:tail -F /home/xiaosi/logs/flume-log-exec.log
  16/09/19 12:01:28 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: avro-source1: Successfully registered new MBean.
  16/09/19 12:01:28 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: avro-source1 started
  16/09/19 12:01:58 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 31          Hello Flume 1 }
  16/09/19 12:01:58 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 32          Hello Flume 2 }
  16/09/19 12:01:58 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 33          Hello Flume 3 }
  ...
  16/09/19 12:01:58 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 39 38       Hello Flume 98 }
  16/09/19 12:01:58 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 39 39       Hello Flume 99 }
  16/09/19 12:01:58 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 46 6C 75 6D 65 20 31 30 30    Hello Flume 100 }

4.4 Case 4: Syslogtcp

The syslogtcp source listens on a TCP port as the data source.

4.4.1 Create the configuration file flume-tcp.conf

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume.conf flume-tcp.conf

Modify it as follows:

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = syslogtcp
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.host = localhost
  #agent1.sources.avro-source1.bind = 0.0.0.0
  agent1.sources.avro-source1.port = 5140
  # sink
  agent1.sinks.logger-sink1.type = logger
  agent1.sinks.logger-sink1.channel = ch1
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100

4.4.2 Start the Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-tcp.conf -n agent1 -Dflume.root.logger=INFO,console

4.4.3 Generate a test syslog message

 
  xiaosi@Qunar:~$ echo "hello flume tcp" | nc localhost 5140
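A bare string is not valid syslog, which is why the agent log in 4.4.4 tags the event with flume.syslog.status=Invalid. Prefixing an RFC 3164 style "<priority>" header produces a clean event; a minimal sketch, assuming the agent from 4.4.2 is listening on localhost:5140:

```shell
# Build an RFC 3164 style message: "<priority>" + timestamp + host + text.
MSG="<13>$(date '+%b %d %H:%M:%S') localhost test: hello flume tcp"
# Send it; ignore the failure if no agent is currently listening.
echo "$MSG" | nc localhost 5140 || true
```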

4.4.4 Check the output

In the console window where the agent was started, you should see output like the following:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-tcp.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
  ...
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 12:10:15 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 12:10:15 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume-tcp.conf
  16/09/19 12:10:15 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 12:10:15 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 12:10:15 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 12:10:15 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 12:10:15 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 12:10:15 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 12:10:15 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 12:10:15 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type syslogtcp
  16/09/19 12:10:15 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: logger
  16/09/19 12:10:15 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 12:10:15 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:org.apache.flume.source.SyslogTcpSource{name:avro-source1,state:IDLE} }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@38aab021 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 12:10:15 INFO node.Application: Starting Channel ch1
  16/09/19 12:10:16 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 12:10:16 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 12:10:16 INFO node.Application: Starting Sink logger-sink1
  16/09/19 12:10:16 INFO node.Application: Starting Source avro-source1
  16/09/19 12:10:16 INFO source.SyslogTcpSource: Syslog TCP Source starting...
  16/09/19 12:10:50 WARN source.SyslogUtils: Event created from Invalid Syslog data.
  16/09/19 12:10:54 INFO sink.LoggerSink: Event: { headers:{Severity=0, Facility=0, flume.syslog.status=Invalid} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 20 74 63 70    hello flume tcp }

4.5 Case 5: JSONHandler

4.5.1 Create the configuration file flume-json.conf

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume.conf flume-json.conf

Modify it as follows:

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = org.apache.flume.source.http.HTTPSource
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.port = 8888
  # sink
  agent1.sinks.logger-sink1.type = logger
  agent1.sinks.logger-sink1.channel = ch1
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100

4.5.2 Start the Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-json.conf -n agent1 -Dflume.root.logger=INFO,console

4.5.3 Generate a JSON-formatted POST request

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ curl -X POST -d '[{ "headers" :{"a":"a1", "b":"b1"}, "body":"flume_json_boy"}]' http://localhost:8888
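The default JSONHandler expects a JSON array, so a single POST can carry several events, each with optional headers and a body. A minimal sketch, assuming the agent from 4.5.2 is listening on localhost:8888:

```shell
# Two events in one request; each array element becomes one Flume event.
PAYLOAD='[{"headers":{"batch":"1"},"body":"event one"},
          {"headers":{"batch":"1"},"body":"event two"}]'
# Send it; ignore the failure if no agent is currently listening.
curl -sS -X POST -H 'Content-Type: application/json' \
  -d "$PAYLOAD" http://localhost:8888 || true
```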

4.5.4 Check the output

In the console window where the agent was started, you should see output like the following:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-json.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
  ...
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 13:21:28 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 13:21:28 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume-json.conf
  16/09/19 13:21:28 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:21:28 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:21:28 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 13:21:28 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 13:21:28 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 13:21:28 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 13:21:28 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 13:21:28 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type org.apache.flume.source.http.HTTPSource
  16/09/19 13:21:28 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: logger
  16/09/19 13:21:28 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 13:21:28 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:org.apache.flume.source.http.HTTPSource{name:avro-source1,state:IDLE} }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@136bcdd0 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 13:21:28 INFO node.Application: Starting Channel ch1
  16/09/19 13:21:28 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 13:21:28 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 13:21:28 INFO node.Application: Starting Sink logger-sink1
  16/09/19 13:21:28 INFO node.Application: Starting Source avro-source1
  16/09/19 13:21:28 INFO mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
  16/09/19 13:21:28 INFO mortbay.log: jetty-6.1.26
  16/09/19 13:21:28 INFO mortbay.log: Started SelectChannelConnector@0.0.0.0:8888
  16/09/19 13:21:28 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SOURCE, name: avro-source1: Successfully registered new MBean.
  16/09/19 13:21:28 INFO instrumentation.MonitoredCounterGroup: Component type: SOURCE, name: avro-source1 started
  16/09/19 13:21:32 INFO sink.LoggerSink: Event: { headers:{a=a1, b=b1} body: 66 6C 75 6D 65 5F 6A 73 6F 6E 5F 62 6F 79       flume_json_boy }

4.6 Case 6: Hadoop Sink

The syslogtcp source listens on a TCP port as the data source, and the collected data is written to HDFS.

4.6.1 Create the configuration file flume-hadoop.conf

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume.conf flume-hadoop.conf

Modify it as follows:

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = syslogtcp
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.host = localhost
  agent1.sources.avro-source1.port = 5140
  # sink
  agent1.sinks.logger-sink1.type = hdfs
  agent1.sinks.logger-sink1.channel = ch1
  agent1.sinks.logger-sink1.hdfs.path = hdfs://localhost:9000/user/xiaosi/data
  agent1.sinks.logger-sink1.hdfs.filePrefix = SysLog
  agent1.sinks.logger-sink1.hdfs.round = true
  agent1.sinks.logger-sink1.hdfs.roundValue = 10
  agent1.sinks.logger-sink1.hdfs.roundUnit = minute
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100
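
Note that hdfs.round, hdfs.roundValue, and hdfs.roundUnit only affect how timestamp escape sequences in hdfs.path (or hdfs.filePrefix) are resolved; with the literal path used above they have no effect. A sketch of a variant that actually exercises them (the directory layout here is illustrative, not from the original config):

```
# bucket events into directories rounded down to 10-minute boundaries
agent1.sinks.logger-sink1.hdfs.path = hdfs://localhost:9000/user/xiaosi/data/%y-%m-%d/%H%M
# use the agent's clock instead of requiring a timestamp header on each event
agent1.sinks.logger-sink1.hdfs.useLocalTimeStamp = true
agent1.sinks.logger-sink1.hdfs.round = true
agent1.sinks.logger-sink1.hdfs.roundValue = 10
agent1.sinks.logger-sink1.hdfs.roundUnit = minute
```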

4.6.2 Start Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-hadoop.conf -n agent1 -Dflume.root.logger=INFO,console

4.6.3 Generate a test syslog message

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ echo "Hello Flume -> Hadoop  one" | nc localhost 5140

4.6.4 Check the output

In the console window where the agent was started, you should see the following output:

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-hadoop.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
  ...
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 13:34:58 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 13:34:58 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume-hadoop.conf
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 13:34:58 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 13:34:58 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 13:34:58 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 13:34:58 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 13:34:58 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type syslogtcp
  16/09/19 13:34:58 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: hdfs
  16/09/19 13:34:58 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 13:34:58 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:org.apache.flume.source.SyslogTcpSource{name:avro-source1,state:IDLE} }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@569671b3 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 13:34:58 INFO node.Application: Starting Channel ch1
  16/09/19 13:34:58 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 13:34:58 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 13:34:58 INFO node.Application: Starting Sink logger-sink1
  16/09/19 13:34:58 INFO node.Application: Starting Source avro-source1
  16/09/19 13:34:58 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: logger-sink1: Successfully registered new MBean.
  16/09/19 13:34:58 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: logger-sink1 started
  16/09/19 13:34:58 INFO source.SyslogTcpSource: Syslog TCP Source starting...
  16/09/19 13:35:06 WARN source.SyslogUtils: Event created from Invalid Syslog data.
  16/09/19 13:35:07 INFO hdfs.HDFSSequenceFile: writeFormat = Writable, UseRawLocalFileSystem = false
  16/09/19 13:35:07 INFO hdfs.BucketWriter: Creating hdfs://localhost:9000/user/xiaosi/data/SysLog.1474263307767.tmp
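
The "Event created from Invalid Syslog data" warning appears because the string piped through nc has no syslog header; Flume still wraps the raw bytes in an event, so the test works anyway. A sketch of building a well-formed RFC 3164 message instead (the flume-test tag is made up for illustration):

```shell
# PRI = facility * 8 + severity; user-level (1) at notice (5) -> <13>
facility=1
severity=5
pri=$(( facility * 8 + severity ))
msg="<${pri}>$(date '+%b %e %H:%M:%S') $(hostname) flume-test: Hello Flume -> Hadoop"
echo "$msg"
# with the agent running:
# echo "$msg" | nc localhost 5140
```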

4.6.5 Check HDFS

 
  xiaosi@Qunar:/opt/hadoop-2.7.2/sbin$ hadoop fs -ls /user/xiaosi/data
  Found 3 items
  -rw-r--r--   1 xiaosi supergroup        141 2016-09-19 13:35 /user/xiaosi/data/SysLog.1474263307767
  -rw-r--r--   1 xiaosi supergroup       1350 2016-07-28 14:10 /user/xiaosi/data/mysql-result.txt
  -rw-r--r--   3 xiaosi supergroup         26 2016-07-30 22:47 /user/xiaosi/data/num.txt
  xiaosi@Qunar:/opt/hadoop-2.7.2/sbin$ hadoop fs -text /user/xiaosi/data/SysLog.1474263307767
  1474263309104 48 65 6c 6c 6f 20 46 6c 75 6d 65 20 2d 3e 20 48 61 64 6f 6f 70 20 20 6f 6e 65
  xiaosi@Qunar:/opt/hadoop-2.7.2/sbin$ hadoop fs -cat /user/xiaosi/data/SysLog.1474263307767
  SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable?7��7ξ1�sv���nW@��0Hello Flume -> Hadoop  one
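
In the -text output, the value after the LongWritable timestamp key is the event body rendered as hex bytes; it decodes back to the original message. A quick one-off, assuming bash's printf understands \xHH escapes:

```shell
# hex bytes copied from the hadoop fs -text output above
hex="48 65 6c 6c 6f 20 46 6c 75 6d 65 20 2d 3e 20 48 61 64 6f 6f 70 20 20 6f 6e 65"
decoded=$(for b in $hex; do printf "\\x$b"; done)
echo "$decoded"   # prints: Hello Flume -> Hadoop  one
```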

4.7 Case 7: File Roll Sink

A syslogtcp source listens on a TCP port as the data source, and the collected data is written to local files, rolling to a new file at a fixed interval.

4.7.1 Create the configuration file flume-file-roll.conf

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/conf$ sudo cp flume.conf flume-file-roll.conf

Make the following changes:

 
  # The configuration file needs to define the sources,
  # the channels and the sinks.
  # Sources, channels and sinks are defined per agent,
  # in this case called 'agent'
  agent1.sources = avro-source1
  agent1.channels = ch1
  agent1.sinks = logger-sink1
  # sources
  agent1.sources.avro-source1.type = syslogtcp
  agent1.sources.avro-source1.channels = ch1
  agent1.sources.avro-source1.bind = 0.0.0.0
  agent1.sources.avro-source1.host = localhost
  agent1.sources.avro-source1.port = 5555
  # sink
  agent1.sinks.logger-sink1.type = file_roll
  agent1.sinks.logger-sink1.sink.directory = /home/xiaosi/logs/flume
  agent1.sinks.logger-sink1.channel = ch1
  # channel
  agent1.channels.ch1.type = memory
  agent1.channels.ch1.capacity = 1000
  agent1.channels.ch1.transactionCapacity = 100
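
The file_roll sink's rolling period is controlled by sink.rollInterval, which defaults to 30 seconds; it is not set above, which is why new files appear every 30 seconds later in this case. A sketch of overriding it (the 300-second value is just an example):

```
# roll a new file every 5 minutes; set to 0 to disable time-based rolling
agent1.sinks.logger-sink1.sink.rollInterval = 300
```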

Note:

 
  agent1.sinks.logger-sink1.sink.directory = /home/xiaosi/logs/flume

The property key must include the sink. prefix before directory; without it the following error is reported:

 
 
  16/09/19 14:16:12 ERROR node.AbstractConfigurationProvider: Sink logger-sink1 has been removed due to an error during configuration
  java.lang.IllegalArgumentException: Directory may not be null
      at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
      at org.apache.flume.sink.RollingFileSink.configure(RollingFileSink.java:84)
      at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
      at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:413)
      at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:98)
      at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
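
Side by side, the failing and working forms of the key (the wrong line is reconstructed for illustration):

```
# wrong: without the sink. prefix the directory property is never seen,
# so RollingFileSink fails its "Directory may not be null" check
# agent1.sinks.logger-sink1.directory = /home/xiaosi/logs/flume

# right:
agent1.sinks.logger-sink1.sink.directory = /home/xiaosi/logs/flume
```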

4.7.2 Start Flume agent agent1

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-file-roll.conf -n agent1 -Dflume.root.logger=INFO,console

4.7.3 Generate test syslog messages

 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ echo "Hello Flume File Roll One" | nc localhost 5555
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ echo "Hello Flume File Roll Two" | nc localhost 5555
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin$ echo "Hello Flume File Roll Three" | nc localhost 5555

4.7.4 Check the output

In the console window where the agent was started, you should see the following output:

 
 
  xiaosi@Qunar:/opt/apache-flume-1.6.0-bin/bin$ flume-ng agent -c . -f ../conf/flume-file-roll.conf -n agent1 -Dflume.root.logger=INFO,console
  Info: Including Hadoop libraries found via (/opt/hadoop-2.7.2/bin/hadoop) for HDFS access
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-api-1.7.10.jar from classpath
  Info: Excluding /opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar from classpath
  Info: Including Hive libraries found via (/opt/apache-hive-2.0.0-bin) for Hive access
  ...
  SLF4J: Class path contains multiple SLF4J bindings.
  SLF4J: Found binding in [jar:file:/opt/apache-flume-1.6.0-bin/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: Found binding in [jar:file:/opt/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
  SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
  16/09/19 14:18:58 INFO node.PollingPropertiesFileConfigurationProvider: Configuration provider starting
  16/09/19 14:18:58 INFO node.PollingPropertiesFileConfigurationProvider: Reloading configuration file:../conf/flume-file-roll.conf
  16/09/19 14:18:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 14:18:58 INFO conf.FlumeConfiguration: Added sinks: logger-sink1 Agent: agent1
  16/09/19 14:18:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 14:18:58 INFO conf.FlumeConfiguration: Processing:logger-sink1
  16/09/19 14:18:59 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [agent1]
  16/09/19 14:18:59 INFO node.AbstractConfigurationProvider: Creating channels
  16/09/19 14:18:59 INFO channel.DefaultChannelFactory: Creating instance of channel ch1 type memory
  16/09/19 14:18:59 INFO node.AbstractConfigurationProvider: Created channel ch1
  16/09/19 14:18:59 INFO source.DefaultSourceFactory: Creating instance of source avro-source1, type syslogtcp
  16/09/19 14:18:59 INFO sink.DefaultSinkFactory: Creating instance of sink: logger-sink1, type: file_roll
  16/09/19 14:18:59 INFO node.AbstractConfigurationProvider: Channel ch1 connected to [avro-source1, logger-sink1]
  16/09/19 14:18:59 INFO node.Application: Starting new configuration:{ sourceRunners:{avro-source1=EventDrivenSourceRunner: { source:org.apache.flume.source.SyslogTcpSource{name:avro-source1,state:IDLE} }} sinkRunners:{logger-sink1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@52548256 counterGroup:{ name:null counters:{} } }} channels:{ch1=org.apache.flume.channel.MemoryChannel{name: ch1}} }
  16/09/19 14:18:59 INFO node.Application: Starting Channel ch1
  16/09/19 14:18:59 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: CHANNEL, name: ch1: Successfully registered new MBean.
  16/09/19 14:18:59 INFO instrumentation.MonitoredCounterGroup: Component type: CHANNEL, name: ch1 started
  16/09/19 14:18:59 INFO node.Application: Starting Sink logger-sink1
  16/09/19 14:18:59 INFO sink.RollingFileSink: Starting org.apache.flume.sink.RollingFileSink{name:logger-sink1, channel:ch1}...
  16/09/19 14:18:59 INFO node.Application: Starting Source avro-source1
  16/09/19 14:18:59 INFO instrumentation.MonitoredCounterGroup: Monitored counter group for type: SINK, name: logger-sink1: Successfully registered new MBean.
  16/09/19 14:18:59 INFO instrumentation.MonitoredCounterGroup: Component type: SINK, name: logger-sink1 started
  16/09/19 14:18:59 INFO sink.RollingFileSink: RollingFileSink logger-sink1 started.
  16/09/19 14:18:59 INFO source.SyslogTcpSource: Syslog TCP Source starting...
  16/09/19 14:19:07 WARN source.SyslogUtils: Event created from Invalid Syslog data.
  16/09/19 14:19:13 WARN source.SyslogUtils: Event created from Invalid Syslog data.
  16/09/19 14:19:37 WARN source.SyslogUtils: Event created from Invalid Syslog data.

4.7.5 Check the generated log files

Check whether files have been generated under /home/xiaosi/logs/flume; by default a new file is rolled every 30 seconds.

 
 
  xiaosi@Qunar:~$ ll /home/xiaosi/logs/flume/
  total 16
  drwxrwxr-x 2 xiaosi xiaosi 4096 Sep 19 14:19 ./
  drwxrwxr-x 6 xiaosi xiaosi 4096 Sep 19 14:09 ../
  -rw-rw-r-- 1 xiaosi xiaosi 52 Sep 19 14:19 1474265939053-1
  -rw-rw-r-- 1 xiaosi xiaosi 28 Sep 19 14:19 1474265939053-2
  xiaosi@Qunar:~$ cat /home/xiaosi/logs/flume/1474265939053-1
  Hello Flume File Roll One
  Hello Flume File Roll Two
  xiaosi@Qunar:~$ cat /home/xiaosi/logs/flume/1474265939053-2
  Hello Flume File Roll Three
