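Cluster configuration
Three nodes: server1, server2, server3
Scenario
server1 and server2 generate logs. A Flume agent on each of them collects the logs and forwards them to a Flume agent on server3, which aggregates them and writes them to HDFS
Install Flume
- First download the binary package from the Flume website: https://flume.apache.org/download.html
- Upload it to the /home/nick directory on server1
- Copy it to the same directory on server2 and server3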
scp /home/nick/apache-flume-1.9.0-bin.tar.gz server2:/home/nick/
scp /home/nick/apache-flume-1.9.0-bin.tar.gz server3:/home/nick/
- Extract the archive on each node
tar -zxvf apache-flume-1.9.0-bin.tar.gz
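
The copy and extract steps above can also be scripted. The following is a minimal sketch, assuming passwordless SSH from server1 to the other nodes (not stated in the original); the function name deploy_commands is hypothetical. It only prints the commands, as a dry run, so they can be reviewed first.

```shell
#!/bin/bash
# Sketch: print the commands that copy the Flume tarball to a node and extract it there.
# Assumption: passwordless SSH from server1 to server2 and server3 is already set up.
TARBALL=/home/nick/apache-flume-1.9.0-bin.tar.gz
DEST_DIR=/home/nick

# Print the scp and remote tar commands for one node (nothing is executed here).
deploy_commands() {
  local node="$1"
  echo "scp $TARBALL $node:$DEST_DIR/"
  echo "ssh $node tar -zxf $DEST_DIR/$(basename "$TARBALL") -C $DEST_DIR"
}

for node in server2 server3; do
  deploy_commands "$node"
done
```

To actually run the printed commands, pipe the script's output to bash. server1 still needs the local tar command shown above.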
Write a script that simulates log output
generate.sh
#!/bin/bash
# Append the current date and time to the log file every 2 seconds
while true; do
date >> /home/nick/data.log
sleep 2
done
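
For a quick trial it can be convenient to write a fixed number of lines instead of looping forever. The following is a sketch of such a bounded variant; the function name write_timestamps and its arguments are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Append COUNT timestamp lines to FILE, pausing INTERVAL seconds after each line.
write_timestamps() {
  local file="$1" count="$2" interval="$3"
  local i=0
  while [ "$i" -lt "$count" ]; do
    date >> "$file"
    sleep "$interval"
    i=$((i + 1))
  done
}
```

For example, `write_timestamps /home/nick/data.log 5 2` appends five lines, two seconds apart, and then returns.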
Configure the source nodes
server1 and server2 are both source nodes, so they use the same configuration file
source.conf
a.sources = s1
a.sinks = k1
a.channels = c1
a.sources.s1.inputCharset = GBK
a.sources.s1.type = exec
# tail -F keeps following the file by name if it is rotated or recreated
a.sources.s1.command = tail -F /home/nick/data.log
a.sources.s1.channels = c1
a.channels.c1.type = memory
a.channels.c1.capacity = 1000
a.channels.c1.transactionCapacity = 100
a.channels.c1.keep-alive = 20
a.sinks.k1.type = avro
a.sinks.k1.hostname = server3
a.sinks.k1.port = 4444
a.sinks.k1.channel = c1
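
Flume configuration files follow the Java properties format, in which `#` starts a comment only when it is the first non-blank character on a line. A `#` after a value becomes part of that value. The following sketch lists lines that look like they carry such a trailing comment; the function name find_inline_comments is hypothetical.

```shell
#!/bin/bash
# Print (with line numbers) property lines that contain a '#' after the '=' sign.
# Lines whose first non-blank character is '#' or '!' are real comments and are skipped.
find_inline_comments() {
  grep -nE '^[[:space:]]*[^#![:space:]][^=]*=.*#' "$1"
}
```

For example, `find_inline_comments /home/nick/test/source.conf` prints nothing when the file has no trailing comments.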
Configure the aggregation node
Create sink.conf on server3
# Name the agent's components
a2.sources = r1
a2.channels = c1
a2.sinks = k1
#set channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 1000
# Avro source: receives events from the Avro sinks on server1 and server2
a2.sources.r1.type = avro
a2.sources.r1.bind = server3
a2.sources.r1.port = 4444
a2.sources.r1.channels = c1
#set sink to hdfs
a2.sinks.k1.type=hdfs
a2.sinks.k1.hdfs.fileType=DataStream
a2.sinks.k1.hdfs.writeFormat=Text
a2.sinks.k1.hdfs.rollInterval=1
a2.sinks.k1.channel=c1
a2.sinks.k1.hdfs.path = hdfs://server1:9000/test_dir/events/%y-%m-%d/%H
a2.sinks.k1.hdfs.filePrefix = events-
a2.sinks.k1.hdfs.round = true
a2.sinks.k1.hdfs.roundValue = 1
a2.sinks.k1.hdfs.roundUnit = hour
a2.sinks.k1.hdfs.useLocalTimeStamp = true
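
With useLocalTimeStamp enabled, the %y-%m-%d/%H escapes in hdfs.path are filled in from the local time of the agent on server3. The following sketch expands the same pattern with the date command to predict which directory a given moment maps to. It assumes GNU date (for the `-d @EPOCH` form); the function name hdfs_dir_for is hypothetical.

```shell
#!/bin/bash
# Expand the escape sequences used in hdfs.path for a given Unix epoch time.
# Assumption: GNU date, which accepts -d "@EPOCH"; the result uses the local time zone.
hdfs_dir_for() {
  local epoch="$1"
  date -d "@$epoch" +"/test_dir/events/%y-%m-%d/%H"
}
```

Once the agents are running, something like `hdfs dfs -ls "hdfs://server1:9000$(hdfs_dir_for "$(date +%s)")"` should list the files for the current hour.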
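Startup
Start the Flume agent on the aggregation node first, so that its Avro source is listening on port 4444 before the Avro sinks on server1 and server2 try to connect
# Run this first on the aggregation node, server3 (the /home/nick/logs directory must already exist)
/home/nick/apache-flume-1.9.0-bin/bin/flume-ng agent \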
--conf /home/nick/apache-flume-1.9.0-bin/conf \
--conf-file /home/nick/test/sink.conf \
--name a2 -Dflume.root.logger=INFO,console > /home/nick/logs/flume-server.log 2>&1 &
# Then run this on server1 and server2 (the /home/nick/logs directory must already exist)
/home/nick/apache-flume-1.9.0-bin/bin/flume-ng agent \
--conf /home/nick/apache-flume-1.9.0-bin/conf \
--conf-file /home/nick/test/source.conf \
--name a -Dflume.root.logger=INFO,console > /home/nick/logs/flume-server.log 2>&1 &
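
Because the Avro sinks on server1 and server2 connect to server3 on port 4444, it can help to confirm that the port accepts connections, either before starting the source agents or when diagnosing connection errors in their logs. The following is a sketch that relies on bash's /dev/tcp feature; the function name wait_for_port and its retry count are hypothetical.

```shell
#!/bin/bash
# Return 0 once HOST:PORT accepts a TCP connection, or 1 after TRIES failed attempts.
wait_for_port() {
  local host="$1" port="$2" tries="${3:-10}"
  local i=0
  while [ "$i" -lt "$tries" ]; do
    # /dev/tcp is a bash feature; the subshell closes the socket when it exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_port server3 4444 30 && echo "aggregation node is ready"` retries once per second for up to 30 attempts.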