The architecture of the highly available Flume NG cluster is shown in the diagram below:

As the figure shows, data from Agent1 flows into both Collector1 and Collector2; Flume NG itself provides a failover mechanism that can switch over and recover automatically. In the diagram there are two log-producing servers located in different server rooms, and all of their logs must be collected into one cluster for storage. Below we develop and configure the Flume NG cluster.
1. Role assignment

The Flume Agent and Collector roles are distributed as shown in the table below:
| Name | Host | Role |
| --- | --- | --- |
| Agent1 | node01 | Web Server |
| Collector1 | node02 | AgentMstr1 |
| Collector2 | node03 | AgentMstr2 |
2. Install and configure Flume on node01, and copy the files over

Copy the existing Flume installation directory and the two file-generation directories (shells and taillogs) from node03 to node01.

Run the following commands on node03:
cd /xsluo/install
scp -r apache-flume-1.6.0-cdh5.14.2-bin/ node01:$PWD
scp -r shells/ taillogs/ node01:$PWD
Modify the agent configuration file on node01:
cd /xsluo/install/apache-flume-1.6.0-cdh5.14.2-bin/conf
vim agent.conf
The contents are as follows:
#agent1 name
agent1.channels = c1
agent1.sources = r1
agent1.sinks = k1 k2
##set group
agent1.sinkgroups = g1
##set channel
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
agent1.channels.c1.transactionCapacity = 100
# configure the source
agent1.sources.r1.channels = c1
agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -F /xsluo/install/taillogs/access_log
# interceptors work with the source to modify or drop events
agent1.sources.r1.interceptors = i1 i2
# the static interceptor adds one key/value pair to every event's header:
# the key comes from the "key" property, the value from the "value" property
agent1.sources.r1.interceptors.i1.type = static
# name of the header to create
agent1.sources.r1.interceptors.i1.key = Type
# the static value paired with that key
agent1.sources.r1.interceptors.i1.value = LOGIN
# the timestamp interceptor adds a key/value pair to the event header:
# the key is "timestamp", the value is the corresponding epoch timestamp
agent1.sources.r1.interceptors.i2.type = timestamp
## set sink1
agent1.sinks.k1.channel = c1
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = node02
agent1.sinks.k1.port = 52020
## set sink2
agent1.sinks.k2.channel = c1
agent1.sinks.k2.type = avro
agent1.sinks.k2.hostname = node03
agent1.sinks.k2.port = 52020
##set sink group
agent1.sinkgroups.g1.sinks = k1 k2
## the sink processor handles load balancing or failover across the group's sinks
agent1.sinkgroups.g1.processor.type = failover
# the sink with the higher priority is preferred; each priority must be unique
agent1.sinkgroups.g1.processor.priority.k1 = 10
agent1.sinkgroups.g1.processor.priority.k2 = 1
# maxpenalty: maximum blacklist (backoff) time for a failed sink, in milliseconds
agent1.sinkgroups.g1.processor.maxpenalty = 10000
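As a quick sanity check of the configuration above, the sketch below greps agent.conf to confirm that every member of the sink group is also a declared sink. This is a hypothetical grep helper, not part of Flume; the demo runs against a small sample config so it is self-contained:

```shell
#!/bin/bash
# Sketch: verify that every sink listed in agent1.sinkgroups.g1.sinks
# is also declared in agent1.sinks. Not a Flume tool; just a grep helper.
check_sinks() {
  local conf=$1
  local declared grouped s
  declared=$(grep -E '^agent1\.sinks[[:space:]]*=' "$conf" | cut -d= -f2)
  grouped=$(grep -E '^agent1\.sinkgroups\.g1\.sinks[[:space:]]*=' "$conf" | cut -d= -f2)
  for s in $grouped; do
    if echo " $declared " | grep -q " $s "; then
      echo "OK: $s is declared"
    else
      echo "MISSING: $s is in the sink group but not declared"
    fi
  done
}

# Self-contained demo against a two-line sample config.
conf=$(mktemp)
printf 'agent1.sinks = k1 k2\nagent1.sinkgroups.g1.sinks = k1 k2\n' > "$conf"
check_sinks "$conf"
```

Running it against the real agent.conf (`check_sinks conf/agent.conf`) catches the common mistake of adding a sink to the group but forgetting to declare it.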
3. Configure the Flume collectors on node02 and node03

Install Flume on node02 by copying the entire Flume directory over from node03.

Then modify the configuration file on node02 and node03; the contents are identical on both machines:
cd /xsluo/install/apache-flume-1.6.0-cdh5.14.2-bin/conf
vim collector.conf
The contents are as follows:
#set agent name
a1.sources = r1
a1.channels = c1
a1.sinks = k1
##set channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
## set source
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 52020
a1.sources.r1.channels = c1
# interceptor
a1.sources.r1.interceptors = i1
#a1.sources.r1.interceptors.i1.type = static
#a1.sources.r1.interceptors.i1.key = Collector
#a1.sources.r1.interceptors.i1.value = node02
# the host interceptor adds a key/value pair to the header; the key defaults
# to "host", overridden here to "hostname" via hostHeader
a1.sources.r1.interceptors.i1.type = host
a1.sources.r1.interceptors.i1.hostHeader = hostname
##set sink to hdfs
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://node01:8020/flume/failover/%{hostname}
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = TEXT
a1.sinks.k1.hdfs.rollInterval = 10
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.filePrefix = %Y-%m-%d
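The `%{hostname}` escape in `hdfs.path` is expanded by Flume from the event header that the host interceptor populates, so each collector writes to its own HDFS directory. Purely as an illustration of that expansion (the substitution below is done by hand; Flume performs it internally, and `node02` is an assumed header value):

```shell
#!/bin/bash
# Illustration only: mimic how Flume expands the %{header} escape in
# hdfs.path. "node02" is an assumed value of the "hostname" header set
# by the host interceptor (hostHeader=hostname).
path_template='hdfs://node01:8020/flume/failover/%{hostname}'
header_hostname='node02'

resolved=$(printf '%s\n' "$path_template" | sed "s/%{hostname}/$header_hostname/")
echo "$resolved"
```

So events arriving through the node02 collector land under `/flume/failover/node02`, and those through node03 under `/flume/failover/node03`.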
4. Start the services in order

Start Flume on node03:
cd /xsluo/install/apache-flume-1.6.0-cdh5.14.2-bin
bin/flume-ng agent -n a1 -c conf -f conf/collector.conf -Dflume.root.logger=DEBUG,console
Start Flume on node02:
cd /xsluo/install/apache-flume-1.6.0-cdh5.14.2-bin
bin/flume-ng agent -n a1 -c conf -f conf/collector.conf -Dflume.root.logger=DEBUG,console
Start Flume on node01:
cd /xsluo/install/apache-flume-1.6.0-cdh5.14.2-bin
bin/flume-ng agent -n agent1 -c conf -f conf/agent.conf -Dflume.root.logger=DEBUG,console
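The three start commands above (collectors first, then the agent, so port 52020 is listening before the avro sinks connect) can be wrapped in a small helper. This sketch only prints each remote command; replace `echo` with `ssh "$host"` to actually run them. Passwordless SSH and the install path used above are assumptions:

```shell
#!/bin/bash
# Sketch: build the start command per node, collectors before the agent.
# Dry-run: prints each command instead of executing it over SSH.
FLUME_HOME=/xsluo/install/apache-flume-1.6.0-cdh5.14.2-bin

start_flume() {
  local host=$1 name=$2 conf=$3
  echo "ssh $host 'cd $FLUME_HOME && nohup bin/flume-ng agent -n $name -c conf -f conf/$conf > /tmp/flume-$name.log 2>&1 &'"
}

start_flume node03 a1 collector.conf
start_flume node02 a1 collector.conf
start_flume node01 agent1 agent.conf
```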
Write a shell script on node01 that appends to the log file periodically:
cd /xsluo/install/shells/
vim tail-file.sh
The contents are as follows:
#!/bin/bash
while true
do
date >> /xsluo/install/taillogs/access_log;
sleep 0.5;
done
Then start the file-producing script on node01:
cd /xsluo/install/shells
sh tail-file.sh
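A slightly more flexible variant of tail-file.sh is sketched below. The ISO-8601 timestamp format and the optional line count are additions for easier testing, not part of the original script:

```shell
#!/bin/bash
# Sketch: like tail-file.sh, but writes ISO-8601 timestamps and accepts
# an optional line count so it can terminate (the original loops forever).
gen_log() {
  local file=$1
  local count=${2:--1}    # -1 = run forever, like the original script
  local i=0
  while [ "$count" -lt 0 ] || [ "$i" -lt "$count" ]; do
    date '+%Y-%m-%dT%H:%M:%S' >> "$file"
    i=$((i + 1))
    sleep 0.5
  done
}

# Demo: write 3 timestamped lines to a temporary file.
tmp=$(mktemp)
gen_log "$tmp" 3
wc -l < "$tmp"
```

For the cluster test, point it at the real log: `gen_log /xsluo/install/taillogs/access_log`.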
5. Test the failover

- Check the generated files in HDFS.
- Stop the agent on node02; collection automatically fails over to the agent on node03.
- Restart the agent on node02; because node02 has the higher priority, collection automatically switches back to it:
  - agent1.sinkgroups.g1.processor.priority.k1 = 10
  - agent1.sinkgroups.g1.processor.priority.k2 = 1

Failover scenario: we upload files on Agent1. Because Collector1 is configured with a higher priority than Collector2, Collector1 collects the logs first and uploads them to the storage system. We then kill Collector1, and Collector2 takes over log collection and upload. Afterwards we manually restore the Flume service on the Collector1 node and upload files on Agent1 again; Collector1 resumes collection at its higher priority.
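The behaviour tested above can be reduced to a toy model: pick the live sink with the highest priority. This is only a sketch of the idea; Flume's real failover processor additionally keeps failed sinks in a penalty box (the `maxpenalty` backoff), which is omitted here:

```shell
#!/bin/bash
# Toy model of failover sink selection. Priorities mirror agent.conf:
# k1 (node02) = 10, k2 (node03) = 1. Flume's real processor also applies
# a backoff penalty to failed sinks, omitted in this sketch.
pick_sink() {
  local live=$1 best="" best_prio=-1 prio s
  for s in $live; do
    case $s in
      k1) prio=10 ;;
      k2) prio=1 ;;
      *)  prio=0 ;;
    esac
    if [ "$prio" -gt "$best_prio" ]; then
      best=$s
      best_prio=$prio
    fi
  done
  echo "$best"
}

pick_sink "k1 k2"   # both collectors up  -> k1 (node02)
pick_sink "k2"      # node02 down        -> k2 (node03)
```

This mirrors the observed test sequence: traffic goes to node02 while it is alive, falls over to node03 when node02 is killed, and returns to node02 once it is restarted.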