Apache Flume Failover High Availability

With a single-node Flume NG in place, the next step is to build a highly available Flume NG cluster. The architecture is shown below:

[Architecture diagram: one agent fanning out to two collectors, which write to storage]

As the figure shows, Flume supports several storage back ends; only HDFS and Kafka are listed here (for example: store the most recent week of logs, and provide a real-time log stream to a Storm system).

Role assignment

The Flume agent and collectors are distributed as shown in the following table:

Name        HOST      Role
Agent1      hadoop01  Web Server
Collector1  hadoop02  AgentMstr1
Collector2  hadoop03  AgentMstr2

As the diagram shows, data from Agent1 flows to both Collector1 and Collector2. Flume NG itself provides a failover mechanism that switches over and recovers automatically.

The diagram shows three log-producing servers spread across different machine rooms, whose logs all need to be collected into a single cluster for storage; for simplicity, only one agent (hadoop01) is configured below. Next we set up the Flume NG cluster configuration.

Configuration on hadoop01

Create the agent configuration file on the hadoop01 machine:

cd /export/servers/apache-flume-1.8.0-bin/tmpconf
vim agent.conf
# agent1 components
agent1.channels = c1
agent1.sources = r1
agent1.sinks = k1 k2

## set sink group: both sinks belong to one group for failover
agent1.sinkgroups = g1
agent1.sinkgroups.g1.sinks = k1 k2

## set source: tail the application log
agent1.sources.r1.type = exec
agent1.sources.r1.command = tail -F /home/taillogs/test.log

## set channel
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
agent1.channels.c1.transactionCapacity = 100
## set sink1
agent1.sinks.k1.type = avro
agent1.sinks.k1.hostname = hadoop02
agent1.sinks.k1.port = 52020
## set sink2
agent1.sinks.k2.type = avro
agent1.sinks.k2.hostname = hadoop03
agent1.sinks.k2.port = 52020
## set failover: events always go to the live sink with the highest priority (k1);
## if it fails, the processor falls back to the next priority (k2)
agent1.sinkgroups.g1.processor.type = failover
agent1.sinkgroups.g1.processor.priority.k1 = 2
agent1.sinkgroups.g1.processor.priority.k2 = 1
# maxpenalty: maximum backoff period (in ms) applied to a failed sink
agent1.sinkgroups.g1.processor.maxpenalty = 10000
## bind source and sinks to the channel
agent1.sources.r1.channels = c1
agent1.sinks.k1.channel = c1
agent1.sinks.k2.channel = c1
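
A small preparatory step, implied but not shown in the original: tail -F keeps retrying until the file appears, but it is simplest to create the tailed file up front on hadoop01:

# create the directory and file that the exec source will tail
mkdir -p /home/taillogs
touch /home/taillogs/test.log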

Configure the Flume collectors on hadoop02 and hadoop03

Edit the configuration file on the hadoop02 machine:

cd /export/servers/apache-flume-1.8.0-bin/tmpconf
vim collector.conf
# set agent components
a1.sources = r1
a1.channels = c1
a1.sinks = k1

## set source: avro source receiving events from the agent
a1.sources.r1.type = avro
a1.sources.r1.bind = hadoop02
a1.sources.r1.port = 52020

## set channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

## set sink to hdfs
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hadoop01:8020/flume/failover/

## bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
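
As written, the collector relies on the HDFS sink's defaults, which roll a new file every 10 events (you can see "rollCount: 10" in the logs later). For fewer, larger files, the standard hdfs sink roll properties can be set explicitly; the values below are illustrative, not part of the original setup:

## roll a new file every 60 s or 128 MB, never by event count (illustrative values)
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0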

Edit the configuration file on the hadoop03 machine (identical to hadoop02 except that the source binds to hadoop03):

cd /export/servers/apache-flume-1.8.0-bin/tmpconf
vim collector.conf
# set agent components
a1.sources = r1
a1.channels = c1
a1.sinks = k1

## set source: avro source receiving events from the agent
a1.sources.r1.type = avro
a1.sources.r1.bind = hadoop03
a1.sources.r1.port = 52020

## set channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

## set sink to hdfs
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hadoop01:8020/flume/failover/

## bind source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Start-up order (start the receivers first, then the sender)

Start Flume on the hadoop02 machine:

cd /export/servers/apache-flume-1.8.0-bin/
bin/flume-ng agent -n a1 -c conf -f tmpconf/collector.conf -Dflume.root.logger=DEBUG,console

Start Flume on the hadoop03 machine:

cd /export/servers/apache-flume-1.8.0-bin/
bin/flume-ng agent -n a1 -c conf -f tmpconf/collector.conf -Dflume.root.logger=DEBUG,console

Start Flume on the hadoop01 machine:

cd /export/servers/apache-flume-1.8.0-bin/
bin/flume-ng agent -n agent1 -c conf -f tmpconf/agent.conf -Dflume.root.logger=DEBUG,console
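
Running each process in the foreground with a console logger is convenient for this demo. For longer-running tests you may prefer to background the process; standard nohup usage (not in the original) would look like:

# run the agent in the background, capturing its output to a log file
nohup bin/flume-ng agent -n agent1 -c conf -f tmpconf/agent.conf > flume-agent.log 2>&1 &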

Start the log-generating script on hadoop01:

cd /home
sh tail-file.sh
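
The script itself is not shown in the original; a minimal sketch of what tail-file.sh might contain, assuming it simply appends timestamped lines to the file the agent is tailing:

#!/bin/bash
# append one timestamped line per second to the file tailed by the exec source
while true; do
  echo "$(date '+%Y-%m-%d %H:%M:%S') test log line" >> /home/taillogs/test.log
  sleep 1
done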

Now inspect the output of each Flume process.
hadoop03 receives no data; its log shows only the periodic configuration-file polling:

2019-12-05 15:58:48,727 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:127)] Checking file:tmpconf/collector.conf for changes
2019-12-05 15:59:18,727 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:127)] Checking file:tmpconf/collector.conf for changes

while hadoop02 keeps receiving data:

2019-12-05 16:00:23,278 (New I/O worker #1) [DEBUG - org.apache.flume.source.AvroSource.appendBatch(AvroSource.java:377)] Avro source r1: Received avro event batch of 6 events.
2019-12-05 16:00:23,279 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.sink.hdfs.BucketWriter.shouldRotate(BucketWriter.java:618)] rolling: rollCount: 10, events: 10
2019-12-05 16:00:23,965 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:393)] Closing hdfs://hadoop01:8020/flume/failover//FlumeData.1575532774557.tmp
2019-12-05 16:00:23,974 (hdfs-k1-call-runner-0) [INFO - org.apache.flume.sink.hdfs.BucketWriter$8.call(BucketWriter.java:655)] Renaming hdfs://hadoop01:8020/flume/failover/FlumeData.1575532774557.tmp to hdfs://hadoop01:8020/flume/failover/FlumeData.1575532774557
2019-12-05 16:00:23,994 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:251)] Creating hdfs://hadoop01:8020/flume/failover//FlumeData.1575532774558.tmp
2019-12-05 16:00:24,015 (hdfs-k1-call-runner-9) [DEBUG - org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetNumCurrentReplicas(AbstractHDFSWriter.java:200)] Using getNumCurrentReplicas--HDFS-826
2019-12-05 16:00:24,015 (hdfs-k1-call-runner-9) [DEBUG - org.apache.flume.sink.hdfs.AbstractHDFSWriter.reflectGetDefaultReplication(AbstractHDFSWriter.java:228)] Using FileSystem.getDefaultReplication(Path) from HADOOP-8014

hadoop01 likewise prints no event output:

2019-12-05 16:00:46,364 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:127)] Checking file:tmpconf/agent.conf for changes
2019-12-05 16:01:16,365 (conf-file-poller-0) [DEBUG - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:127)] Checking file:tmpconf/agent.conf for changes

This comes from the priorities in the agent configuration:
the sink pointing at hadoop02 (k1) has a higher priority number than the one pointing at hadoop03 (k2), and the failover processor always routes events to the highest-priority live sink. So everything is written to hadoop02 unless it goes down,
in which case writes fail over to hadoop03.

agent1.sinkgroups.g1.processor.priority.k1 = 2
agent1.sinkgroups.g1.processor.priority.k2 = 1
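
You can also confirm that the events actually landed in HDFS with a standard listing, run on any node with a configured Hadoop client:

# list the files the HDFS sink has written
hdfs dfs -ls /flume/failover/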

Failover test

Now test the high availability (failover) of the Flume NG cluster.

With the log-writing script running, hadoop02 receives the events.
Kill the Flume process on hadoop02, and hadoop03 starts receiving.
Restart Flume on hadoop02, and hadoop03 stops receiving while hadoop02 takes over again, since its sink has the higher priority and the failover processor switches back once it recovers.
The two collectors back each other up: that is failover high availability.
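
One way to "kill" the collector for this test (the process lookup here is an assumption; Flume runs as org.apache.flume.node.Application, so jps -m lists it as "Application" together with its -f argument):

# on hadoop02: find the Flume collector's PID and kill it
jps -m | grep collector.conf
kill -9 <pid>

# restarting it uses the same command as before
cd /export/servers/apache-flume-1.8.0-bin/
bin/flume-ng agent -n a1 -c conf -f tmpconf/collector.conf -Dflume.root.logger=DEBUG,console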
