1. Flume installation prerequisites
(1) Java Runtime Environment: Java 1.7 or later
(2) Memory: sufficient memory for the configurations used by sources, channels, or sinks
(3) Disk space: sufficient disk space for the configurations used by channels or sinks
(4) Directory permissions: read/write permissions for the directories used by the agent
2. Installing Flume
(1) Download: http://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.7.0.tar.gz
(listed under http://archive.cloudera.com/cdh5/cdh/5/)
(2) After extraction, the files are in: /www/instl/flume/apache-flume-1.6.0-cdh5.7.0-bin
(3) Set the system environment variables in /etc/profile:
export FLUME_HOME=/www/instl/flume/apache-flume-1.6.0-cdh5.7.0-bin
export PATH=$FLUME_HOME/bin:$PATH
(4) Reload the profile so the settings take effect: source /etc/profile
(5) Configure flume-env.sh (in $FLUME_HOME/conf):
export JAVA_HOME=/www/instl/jdk/jdk1.8.0_171
(6) Verify the installation: flume-ng version
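3. Basic Flume usage
Using Flume mostly comes down to writing a configuration file. Below is a simple example (input arrives over the network on hadoop000:6789 and is printed to the local Flume console):
Configuration steps:
A) Configure the source
B) Configure the channel
C) Configure the sink
D) Wire the three components together
Configuration: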
# Name the components on this agent
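# a1: name of the agent
# r1: name of the source
# k1: name of the sink
# c1: name of the channel
# The port set on the source is the port that the Flume host listens on; the bind address is the IP (or hostname) of that host.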
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop000
a1.sources.r1.port = 6789
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
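Step D is easy to get wrong, because a source takes `channels` (plural) while a sink takes a single `channel`. The Python sketch below is not part of Flume: `parse_properties` and `check_wiring` are hypothetical helper names introduced here for illustration. It assumes a simple `key = value` file with `#` comments and no line continuations, and it only reports sources or sinks that point at a channel the agent never declared.

```python
# Hypothetical helper, not a Flume API: checks that every source and sink
# of an agent refers to a channel declared in "<agent>.channels".

EXAMPLE_CONF = """\
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop000
a1.sources.r1.port = 6789
a1.sinks.k1.type = logger
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
"""


def parse_properties(text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent):
    """Return a list of human-readable wiring problems (empty if none found)."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for source in props.get(f"{agent}.sources", "").split():
        used = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not used:
            problems.append(f"source {source} is not bound to any channel")
        for channel in used:
            if channel not in declared:
                problems.append(f"source {source} uses undeclared channel {channel!r}")
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel", "")
        if channel not in declared:
            problems.append(f"sink {sink} uses undeclared channel {channel!r}")
    return problems
```

For the configuration above, `check_wiring(parse_properties(EXAMPLE_CONF), "a1")` returns an empty list; changing the last line to `a1.sinks.k1.channel = c2` would produce one reported problem for `k1`.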
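4. Starting Flume
Start the agent. The main parameters:
a. --name: the name of the agent to run (it must match the agent name used in the configuration file, a1 here)
b. --conf: the configuration directory, $FLUME_HOME/conf
c. --conf-file: path to the configuration file; it does not have to be under $FLUME_HOME/conf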
flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/example.conf \
-Dflume.root.logger=INFO,console
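If the agent is launched from a script rather than typed by hand, the same flags can be assembled programmatically. The sketch below is only an illustration: `build_agent_command` and `format_command` are hypothetical names, the paths in the usage note are placeholders, and the flags are exactly those shown in the command above.

```python
import shlex


def build_agent_command(agent_name, conf_dir, conf_file, root_logger="INFO,console"):
    """Build the argv list for 'flume-ng agent' using the flags shown above."""
    return [
        "flume-ng", "agent",
        "--name", agent_name,            # must match the agent name in the config
        "--conf", conf_dir,              # directory holding flume-env.sh etc.
        "--conf-file", conf_file,        # the agent configuration file
        f"-Dflume.root.logger={root_logger}",
    ]


def format_command(argv):
    """Render an argv list as a single shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The returned list can be handed to `subprocess.run(...)` on a machine where `flume-ng` is on the `PATH`; for example, `build_agent_command("a1", "/opt/flume/conf", "/opt/flume/conf/example.conf")` uses placeholder paths that would need to be replaced with the real `$FLUME_HOME/conf` location.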
5. Testing
Test with telnet: telnet hadoop000 6789
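Where telnet is not installed, a few lines of Python can send the same test input. This is a minimal sketch, assuming the agent from the configuration above is running and reachable; `send_line` is a hypothetical helper name. With default settings the netcat source is expected to acknowledge each line with `OK`, which the function returns.

```python
import socket


def send_line(host, port, text, timeout=5.0):
    """Send one newline-terminated line over TCP and return the one-line reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        reply = b""
        # Read until the peer finishes its reply line or closes the connection.
        while not reply.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:
                break
            reply += chunk
    return reply.decode("utf-8").strip()
```

Calling `send_line("hadoop000", 6789, "hello")` should make the event appear on the agent's console. Unlike telnet, this client sends only `\n` as the line ending, so the logged body would likely not include a trailing carriage return byte.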
6. Structure of the transferred data
Event: { headers:{} body: 68 65 6C 6C 6F 0D hello. }
An Event is the basic unit of data transfer in Flume.
Event = optional headers + byte array
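To see that the body really is a byte array, the hex dump in the logged line can be turned back into bytes. The sketch below is an illustration, not a Flume utility: `decode_logged_body` is a hypothetical name, and the regular expression assumes the log format shown above (uppercase hex pairs separated by whitespace after `body:`). It can misread a body whose printable text happens to begin with something that looks like a hex pair.

```python
import re

# Matches the run of two-digit uppercase hex values that follows "body:".
_BODY_HEX = re.compile(r"body:\s*((?:[0-9A-F]{2}\s+)+)")


def decode_logged_body(line):
    """Extract the hex dump from a logged Event line and return it as bytes."""
    match = _BODY_HEX.search(line)
    if match is None:
        return b""
    hex_digits = "".join(match.group(1).split())  # drop the separating spaces
    return bytes.fromhex(hex_digits)
```

For the line above, `decode_logged_body("Event: { headers:{} body: 68 65 6C 6C 6F 0D hello. }")` returns `b"hello\r"`: the five letters of `hello` followed by `0D`, a carriage return, which is consistent with telnet ending each line with `\r\n`.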