Flume installation and a simple test

爱笑的T_T

Reposted from: http://www.aboutyun.com/thread-8917-1-1.html

Unpack the archive

tar -zxvf apache-flume-1.7.0-bin.tar.gz

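Edit the flume-env.sh configuration file in the conf directory (if it does not exist yet, copy it from flume-env.sh.template). The main thing to set is the JAVA_HOME variable.

# Environment variables can be set here.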
export JAVA_HOME=/usr/java/jdk1.8.0_91
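
Before moving on, it can help to confirm that the JAVA_HOME path really contains a Java binary. The sketch below is not part of Flume: check_java_home is a hypothetical helper name, and the path in the usage note is simply the one from this article, so adjust it for your machine.

```shell
# Hypothetical helper (not part of Flume): report whether a directory
# looks like a usable JAVA_HOME, i.e. contains an executable bin/java.
check_java_home() {
  if [ -x "$1/bin/java" ]; then
    echo "ok: $1/bin/java"
  else
    echo "missing: $1/bin/java"
    return 1
  fi
}
```

For example, `check_java_home /usr/java/jdk1.8.0_91` prints an "ok:" line when the JDK is where flume-env.sh says it is.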

Verify the installation

$ ./bin/flume-ng version
Flume 1.7.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 511d868555dd4d16e6ce4fedc72c2d1454546707
Compiled by bessbd on Wed Oct 12 20:51:10 CEST 2016
From source with checksum 0d21b3ffdc55a07e1d08875872c00523

If you see output like the above, the installation succeeded.

Case 1: Getting started (single-node configuration)

Create the agent configuration file

# File name: case1_example.conf
# Contents:
# case1_example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
 
# Describe the sink
a1.sinks.k1.type = logger
 
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
 
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

# Start command

./bin/flume-ng agent -c conf -f conf/case1_example.conf -n a1 -Dflume.root.logger=INFO,console

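# Startup parameters

-c conf sets the configuration directory to conf
-f conf/case1_example.conf sets the configuration file to conf/case1_example.conf
-n a1 sets the agent name to a1; it must match the name used in case1_example.conf
-Dflume.root.logger=INFO,console sets the root logger to INFO level with output to the console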
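
Because the value passed to -n has to match the agent name used as the property prefix in the configuration file, a quick look at which names a file defines can save a confusing startup. A minimal sketch, assuming the file follows the "<agent>.sources = ..." layout used in this article; list_agent_names is a hypothetical helper, not a Flume command.

```shell
# Hypothetical helper (not a Flume command): print the distinct agent names
# declared in a Flume properties file, based on the top-level
# "<agent>.sources", "<agent>.sinks" and "<agent>.channels" lines.
list_agent_names() {
  awk -F'[.=]' '
    /^[ \t]*[A-Za-z0-9_-]+\.(sources|sinks|channels)[ \t]*=/ {
      gsub(/[ \t]/, "", $1)   # strip any indentation around the name
      print $1
    }' "$1" | sort -u
}
```

Running `list_agent_names conf/case1_example.conf` against the file above should print a1, which is the value to pass to -n.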

# Test from another terminal (install telnet with: yum install -y telnet)

# telnet 127.0.0.1 44444
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
hello world!
OK

# Check the console output in the terminal where the agent was started

2017-02-09 11:34:36,369 (lifecycleSupervisor-1-4) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:169)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
2017-02-09 11:40:50,462 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D          hello world!. }
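
The logger sink prints each event body as hex bytes followed by a printable rendering in which non-printable bytes show up as a dot. The trailing 0D here is the carriage return that telnet sends at the end of the line. To read such a hex dump back as text, a small shell function is enough. This is only a sketch: decode_hex_body is a hypothetical helper name, and it assumes the input is space-separated two-digit hex bytes as printed by the sink.

```shell
# Hypothetical helper: turn a space-separated hex dump into text.
decode_hex_body() {
  for byte in $1; do
    # Convert each hex byte to an octal escape, which POSIX printf understands.
    printf "\\$(printf '%03o' "0x$byte")"
  done
  printf '\n'
}

decode_hex_body "68 65 6C 6C 6F 20 77 6F 72 6C 64 21"   # prints: hello world!
```

Note that the logger sink only dumps the first bytes of the body (16 by default, which as far as I know can be changed through the sink's maxBytesToLog property), so longer bodies in the later cases look cut off.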

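Case 2: Avro

The Avro client can send a given file to Flume; the Avro source receives the events over the Avro RPC mechanism.

Create the agent configuration file

# File name: case2_avro.conf
# Contents: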
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
 
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Start Flume agent a1

./bin/flume-ng agent -c . -f conf/case2_avro.conf -n a1 -Dflume.root.logger=INFO,console

# Create the file to send

echo "hello world" > log.10

# Send the file with avro-client

./bin/flume-ng avro-client -c . -H localhost -p 4141 -F log.10

# Check the console output in the terminal where the agent was started

sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64                hello world }

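Case 3: Exec

The Exec source runs a given command and uses its output as the data source. If you use the tail command, the file has to be large enough before any output shows up.

Create the agent configuration file

Test Exec Source

# File name: case3_exec.conf
# Contents: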
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /home/hadoop/apache-flume-1.7.0-bin/log.10
a1.sources.r1.channels = c1
 
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Start Flume agent a1

./bin/flume-ng agent -c . -f conf/case3_exec.conf -n a1 -Dflume.root.logger=INFO,console

# Write enough content into the file

for i in {1..100};do echo "exec test$i" >> log.10;echo $i;done

# Check the console output in the terminal where the agent was started

17/02/09 14:30:33 INFO sink.LoggerSink: Event: { headers:{} body: 65 78 65 63 20 74 65 73 74 37 34                exec test74 }

...
...
...
                                                                                                          
17/02/09 14:30:35 INFO sink.LoggerSink: Event: { headers:{} body: 65 78 65 63 20 74 65 73 74 31 30 30             exec test100 }

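Case 4: Spool

The Spool source watches the configured directory for newly added files and reads the data out of them. Two things to note:
    1) A file copied into the spool directory must not be opened and edited afterwards.
    2) The spool directory must not contain subdirectories.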
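
One common way to respect the first point is to finish writing the file somewhere else and only then move it into the spool directory, so the source never sees a half-written file. A minimal sketch, assuming the staging location is on the same filesystem as the spool directory (so that mv is a plain rename); deliver_to_spool is a hypothetical helper name, not part of Flume.

```shell
# Hypothetical helper: move a finished file into the spool directory.
# Assumes source and spool directory are on the same filesystem, so that
# mv is a rename and the file appears in one step.
deliver_to_spool() {
  src=$1
  spool_dir=$2
  dest="$spool_dir/$(basename "$src")"
  # Refuse to reuse a name: the spooling source expects unique file names.
  if [ -e "$dest" ] || [ -e "$dest.COMPLETED" ]; then
    echo "refusing to overwrite $dest" >&2
    return 1
  fi
  mv "$src" "$dest"
}
```

With the paths from this case, a call could look like `deliver_to_spool /home/hadoop/logs/staging/spool_text.log /home/hadoop/logs/flumeSpool`, where the staging directory is an assumed location of your choosing.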

Create the agent configuration file

# File name: case4_spool.conf
# Contents:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/hadoop/logs/flumeSpool
a1.sources.r1.fileHeader = true
a1.sources.r1.channels = c1
 
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Start Flume agent a1

./bin/flume-ng agent -c . -f conf/case4_spool.conf -n a1 -Dflume.root.logger=INFO,console

# Add a file to the /home/hadoop/logs/flumeSpool directory

echo "spool test1" > /home/hadoop/logs/flumeSpool/spool_text.log

# Check the console output in the terminal where the agent was started

17/02/09 14:55:31 INFO sink.LoggerSink: Event: { headers:{file=/home/hadoop/logs/flumeSpool/spool_text.log} body: 73 70 6F 6F 6C 20 74 65 73 74 31                spool test1 }
17/02/09 14:55:31 INFO avro.ReliableSpoolingFileEventReader: Last read took us just up to a file boundary. Rolling to the next file, if there is one.
17/02/09 14:55:31 INFO avro.ReliableSpoolingFileEventReader: Preparing to move file /home/hadoop/logs/flumeSpool/spool_text.log to /home/hadoop/logs/flumeSpool/spool_text.log.COMPLETED

Once the data in spool_text.log has been read, the file is renamed to spool_text.log.COMPLETED.

Case 5: Syslogtcp

The Syslogtcp source listens on a TCP port and uses it as the data source.

Create the agent configuration file

Case 5: Test Syslog tcp source
# File name: case5_syslog.conf
# Contents:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1
 
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Start Flume agent a1

./bin/flume-ng agent -c . -f conf/case5_syslog.conf -n a1 -Dflume.root.logger=INFO,console

# Generate a test syslog message (install nc with: yum install -y nc)

echo "hello idoall.org syslog" | nc localhost 5140

# Check the console output in the terminal where the agent was started

17/02/09 15:20:11 WARN source.SyslogUtils: Event created from Invalid Syslog data.
17/02/09 15:20:16 INFO sink.LoggerSink: Event: { headers:{Severity=0, Facility=0, flume.syslog.status=Invalid} body: 68 65 6C 6C 6F 20 69 64 6F 61 6C 6C 2E 6F 72 67 hello idoall.org }
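
The warning and the flume.syslog.status=Invalid header most likely appear because the test message carries no syslog header: a syslog line is expected to start with a priority value in angle brackets, computed as facility * 8 + severity. A small sketch of building such a prefix; syslog_pri is a hypothetical helper name, and the facility and severity numbers follow the standard syslog numbering.

```shell
# Hypothetical helper: compute the syslog priority value (PRI)
# from a numeric facility and severity: PRI = facility * 8 + severity.
syslog_pri() {
  echo $(( $1 * 8 + $2 ))
}

# Facility 4 with severity 5 gives 37.
msg="<$(syslog_pri 4 5)>hello idoall.org syslog"
echo "$msg"   # prints: <37>hello idoall.org syslog
```

Piping this message to `nc localhost 5140` instead of the bare text should let the source parse the priority, in the same way as the <37> message used in case 6.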

Case 6: Syslogudp

Create the agent configuration file

Case 6: Test Syslog udp source
# File name: case6_syslogudp.conf
# Contents:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = syslogudp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1
 
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Start Flume agent a1

./bin/flume-ng agent -c . -f conf/case6_syslogudp.conf -n a1 -Dflume.root.logger=INFO,console

# Generate a test syslog message

echo "<37>hello via syslogudp" | nc -u localhost 5140

# Check the console output in the terminal where the agent was started

2013-05-27 23:39:10,755 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{Severity=5, Facility=4} body: 68 65 6C 6C 6F 20 76 69 61 20 73 79 73 6C 6F 67 hello via syslog }
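
The Facility and Severity headers come from the <37> prefix of the test message: the facility is the priority divided by 8 and the severity is the remainder. The arithmetic can be checked with a short sketch; decode_pri is a hypothetical helper name.

```shell
# Hypothetical helper: split a syslog priority value into facility and severity.
decode_pri() {
  echo "Facility=$(( $1 / 8 )) Severity=$(( $1 % 8 ))"
}

decode_pri 37   # prints: Facility=4 Severity=5
```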

Case 7: HTTP source JSONHandler

Create the agent configuration file

# File name: case7_httppost.conf
# Contents:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
 
# Describe/configure the source
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1
 
# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Start Flume agent a1

./bin/flume-ng agent -c . -f conf/case7_httppost.conf -n a1 -Dflume.root.logger=INFO,console

# Send a POST request in JSON format

curl -X POST -d '[{ "headers" :{"namenode" : "namenode.example.com","datanode" : "random_datanode.example.com"},"body" : "really_random_body"}]' http://localhost:5140

# Check the console output in the terminal where the agent was started

17/02/09 17:16:51 INFO sink.LoggerSink: Event: { headers:{namenode=namenode.example.com, datanode=random_datanode.example.com} body: 72 65 61 6C 6C 79 5F 72 61 6E 64 6F 6D 5F 62 6F really_random_bo }
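
The JSON handler expects an array of events, each with a headers object and a body string. When testing with different values it can be convenient to build the payload with a small function. A sketch only: make_event_json is a hypothetical helper, the header and body values are made up for illustration, and the function assumes its arguments contain no double quotes or backslashes because it does no JSON escaping.

```shell
# Hypothetical helper: build a one-event JSON payload for the HTTP source.
make_event_json() {
  # $1 = header name, $2 = header value, $3 = event body
  printf '[{"headers":{"%s":"%s"},"body":"%s"}]\n' "$1" "$2" "$3"
}

make_event_json host web01 "short body"
# prints: [{"headers":{"host":"web01"},"body":"short body"}]
```

The output can be piped straight into curl, for example `make_event_json host web01 "short body" | curl -X POST -d @- http://localhost:5140`, where `-d @-` tells curl to read the request body from standard input.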
