3、Host Interceptor
4、Static Interceptor
5、Remove Header Interceptor
6、UUID Interceptor
7、Morphline Interceptor
8、Search and Replace Interceptor
9、Regex Filtering Interceptor
10、Regex Extractor Interceptor
十、Flume Configuration
1、Environment Variable Config Filter
2、External Process Config Filter
3、Hadoop Credential Store Config Filter
4、Log4J Appender
5、Load Balancing Log4J Appender
一、Environment Preparation
Flume official documentation: Documentation — Apache Flume
1、Download the installation packages
JDK 1.8: Java Downloads | Oracle
Flume 1.9.0: Download — Apache Flume
2、Install Flume
tar zxvf apache-flume-1.9.0-bin.tar.gz -C /usr/local/
ln -s /usr/local/apache-flume-1.9.0-bin /usr/local/flume
3、Edit the configuration files
cd /usr/local/flume/conf
cp flume-conf.properties.template flume-conf.properties
cp flume-env.ps1.template flume-env.ps1
cp flume-env.sh.template flume-env.sh
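To double-check the steps above, a small check like the following can confirm that the launcher script and the copied configuration files are where the later commands expect them. This is only a sketch: check_flume_layout is a hypothetical helper name, and the default path assumes the /usr/local/flume symlink used in this article.

```shell
# Hypothetical helper: counts the expected Flume files that are absent under a directory.
# The count is left in the global variable "missing".
check_flume_layout() {
  flume_dir="$1"
  missing=0
  for f in bin/flume-ng conf/flume-env.sh conf/flume-conf.properties; do
    if [ -e "$flume_dir/$f" ]; then
      echo "ok       $flume_dir/$f"
    else
      echo "missing  $flume_dir/$f"
      missing=$((missing + 1))
    fi
  done
  echo "$missing file(s) missing"
}

# Assumed install location from this article; pass another directory if yours differs.
check_flume_layout /usr/local/flume
```

If any line is reported as missing, re-run the tar, ln, or cp step that should have created it before continuing.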
二、Environment Variable Configuration
1、Configure the Java environment variables
export JAVA_HOME=/usr/java/jdk1.8.0_241-amd64
export PATH=$PATH:$JAVA_HOME/bin
2、Configure the Flume environment variables
export FLUME_HOME=/usr/local/flume
export PATH=$PATH:$FLUME_HOME/bin
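As a quick sanity check, the exports can be applied in the current shell and followed by a lookup of both commands. The sketch below combines the Java and Flume exports into one PATH line for brevity; the two directory paths are the ones used in this article and are assumptions about your machine.

```shell
# Paths taken from this article; adjust them to your own JDK and Flume locations.
export JAVA_HOME=/usr/java/jdk1.8.0_241-amd64
export FLUME_HOME=/usr/local/flume
export PATH=$PATH:$JAVA_HOME/bin:$FLUME_HOME/bin

# Report where each command resolves, or note that it is not reachable yet.
for tool in java flume-ng; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool -> $(command -v "$tool")"
  else
    echo "$tool is not on PATH yet" >&2
  fi
done
```

Exports typed into a terminal last only for that session. To keep them across logins, the same export lines are commonly appended to a shell profile such as /etc/profile or ~/.bashrc.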
三、Flume source
1、netcat source
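Create an example.conf file in the /usr/local/flume directory with the following contents.
Here the source listens on a port (netcat), the sink writes events to the log (logger), and the channel is held in memory. The channel stores at most 1000 events, and each transaction, whether the source is putting events in or the sink is taking them out, carries at most 100 events.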
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
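The last two lines of the configuration are the easiest to get wrong: a source uses the plural key channels, a sink uses the singular key channel, and both must name a channel listed in a1.channels. The following sketch checks that wiring before the agent is started. check_channel_bindings is a hypothetical helper written for this article, not part of Flume, and it only looks at the binding lines.

```shell
# Hypothetical helper: report source/sink bindings that name an undeclared channel.
# Usage: check_channel_bindings CONFIG_FILE AGENT_NAME
check_channel_bindings() {
  awk -F'[ \t]*=[ \t]*' -v agent="$2" '
    { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }     # trim the line
    /^(#|$)/ { next }                              # skip comments and blank lines
    $1 == agent ".channels" {                      # channels declared for this agent
      n = split($2, names, /[ \t]+/)
      for (i = 1; i <= n; i++) if (names[i] != "") declared[names[i]] = 1
      next
    }
    $1 ~ ("^" agent "\\.(sources|sinks)\\.[^.]+\\.channels?$") {
      n = split($2, names, /[ \t]+/)               # channels a source or sink binds to
      for (i = 1; i <= n; i++) if (names[i] != "") used[names[i]] = $1
    }
    END {
      bad = 0
      for (c in used) if (!(c in declared)) {
        printf "%s refers to undeclared channel %s\n", used[c], c
        bad++
      }
      if (bad == 0) print "all channel bindings are declared"
    }
  ' "$1"
}

# Example, assuming the file location used in this article:
#   check_channel_bindings /usr/local/flume/example.conf a1
```

For the example.conf above, the helper should report that all bindings are declared, since both r1 and k1 point at c1.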
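Start the Flume agent with example.conf as the configuration file and a1 as the agent name, printing the events it receives to the console as log output. The paths in this command are relative, so run it from /usr/local/flume.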
flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
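The short options work as well: -c specifies the Flume configuration directory, -f specifies the configuration file that defines the components, -n specifies the name of the agent defined in that file, and -Dflume.root.logger=INFO,console sends Flume's runtime log to the console at INFO level.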
flume-ng agent -c $FLUME_HOME/conf -f $FLUME_HOME/example.conf -n a1 -Dflume.root.logger=INFO,console
telnet localhost 44444
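As the figure shows, the netcat source listens on local port 44444. Sending messages to that port with telnet simulates a client feeding the source, and the logger sink prints each received event to the console log.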
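When telnet is not installed, or the test needs to be scripted, the same check can be made with nc. The sketch below assumes an nc binary is available; send_event is a hypothetical helper name. With the source's default settings, each accepted line should be answered with OK.

```shell
# Hypothetical helper: send one line of text to a netcat source and print any reply.
# Usage: send_event HOST PORT TEXT...
send_event() {
  host="$1"
  port="$2"
  shift 2
  if ! command -v nc >/dev/null 2>&1; then
    echo "nc is not installed" >&2
    return 2
  fi
  # -w 2 limits how long nc waits; the exact timeout semantics differ between nc variants.
  printf '%s\n' "$*" | nc -w 2 "$host" "$port"
}

# Example, with the agent from example.conf running:
#   send_event localhost 44444 hello flume
```

The agent's console should then show a logger line for the event, just as it does for text typed into telnet.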
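Flume also supports environment variables in the configuration file, but only in values, not in keys. The variables can also be set in conf/flume-env.sh.
In example.conf, change the port that the source listens on to: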
a1.sources.r1.port = ${BIND_PORT}
This requires the additional parameter -DpropertiesImplementation=org.apache.flume.node.EnvVarResolverProperties
BIND_PORT=44444 flume-ng agent -c $FLUME_HOME/conf -f $FLUME_HOME/example.conf -n a1 -Dflume.root.logger=INFO,console -DpropertiesImplementation=org.apache.flume.node.EnvVarResolverProperties
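The effect of the substitution can be previewed without starting an agent. The sketch below is not Flume's code. It is a hypothetical shell helper, resolve_env_vars, that mimics the behavior under two assumptions: each ${NAME} in a value is replaced by the environment variable of that name, and an unset variable becomes an empty string. Keys are left untouched, matching the values-only rule above.

```shell
# Hypothetical helper: print a properties file with ${NAME} placeholders in values
# replaced by the matching environment variables (unset variables become empty).
# Usage: resolve_env_vars CONFIG_FILE
resolve_env_vars() {
  awk '
    {
      eq = index($0, "=")
      if (eq == 0 || $0 ~ /^[ \t]*#/) { print; next }   # not a key=value line
      key = substr($0, 1, eq)                            # key, including the "="
      val = substr($0, eq + 1)
      out = ""
      while (match(val, /[$][{][A-Za-z0-9_]+[}]/)) {
        name = substr(val, RSTART + 2, RLENGTH - 3)      # text between "${" and "}"
        out = out substr(val, 1, RSTART - 1) ENVIRON[name]
        val = substr(val, RSTART + RLENGTH)
      }
      print key out val
    }
  ' "$1"
}

# Example, assuming the file location used in this article:
#   export BIND_PORT=44444
#   resolve_env_vars /usr/local/flume/example.conf
```

The variable has to be exported, or set on the same command line as flume-ng as shown above, so that the child process can see it.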