Flume Load Balancing and Failover
1. Failover
1.1 Requirements Analysis
Use Flume1 to monitor a port; the two sinks in its sink group connect to Flume2 and Flume3 respectively. Use the Failover Sink Processor to implement failover.
# Create a group3 directory under /opt/module/flume/job: mkdir group3
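A minimal shell sketch of this setup step, assuming the Flume installation lives at /opt/module/flume as in this tutorial (using vim is just one way to create the three config files described next):
cd /opt/module/flume/job
mkdir group3
cd group3
# create the three agent configuration files listed below
vim flume1.conf
vim flume2.conf
vim flume3.conf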
- flume1.conf (on hadoop)
source -- netcat
channel -- memory
sinks -- avro (k1, k2)
a1.sinkgroups.g1.processor.type = failover
- flume2.conf (on hadoop)
source -- avro
channel -- memory
sink -- logger
- flume3.conf (on hadoop)
source -- avro
channel -- memory
sink -- logger
1.2 Configuration Files
flume1.conf
# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinkgroups = g1
a1.sinks = k1 k2
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop
a1.sources.r1.port = 44444
# Configure failover
a1.sinkgroups.g1.processor.type = failover
# Priority values: the larger the absolute value, the higher the priority; the higher-priority sink is activated first.
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
# Maximum backoff period for a failed sink (in milliseconds)
a1.sinkgroups.g1.processor.maxpenalty = 10000
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop
a1.sinks.k1.port = 4141
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop
a1.sinks.k2.port = 4142
# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
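For reference, a failover sink group is not limited to two sinks; each additional sink only needs its own priority entry. A hypothetical sketch of adding a third backup sink k3 on port 4143 (both the name and the port are made up here and would need a matching downstream agent):
a1.sinks = k1 k2 k3
a1.sinks.k3.type = avro
a1.sinks.k3.hostname = hadoop
a1.sinks.k3.port = 4143
a1.sinks.k3.channel = c1
a1.sinkgroups.g1.sinks = k1 k2 k3
a1.sinkgroups.g1.processor.priority.k3 = 1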
flume2.conf
#Name
a2.sources = r1
a2.sinks = k1
a2.channels = c1
#Sources
a2.sources.r1.type = avro
a2.sources.r1.bind = hadoop
a2.sources.r1.port = 4141
#Sink
a2.sinks.k1.type = logger
#Channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100
#Bind
a2.sinks.k1.channel = c1
a2.sources.r1.channels = c1
flume3.conf
#Name
a3.sources = r1
a3.sinks = k1
a3.channels = c1
#Sources
a3.sources.r1.type = avro
a3.sources.r1.bind = hadoop
a3.sources.r1.port = 4142
#Sink
a3.sinks.k1.type = logger
#Channel
a3.channels.c1.type = memory
a3.channels.c1.capacity = 1000
a3.channels.c1.transactionCapacity = 100
#Bind
a3.sinks.k1.channel = c1
a3.sources.r1.channels = c1
1.3 Testing
Start the downstream agents (a3 and a2) first, then a1, so that a1's avro sinks can connect without errors:
bin/flume-ng agent -c conf/ -n a3 -f job/group3/flume3.conf -Dflume.root.logger=INFO,console
bin/flume-ng agent -c conf/ -n a2 -f job/group3/flume2.conf -Dflume.root.logger=INFO,console
bin/flume-ng agent -c conf/ -n a1 -f job/group3/flume1.conf -Dflume.root.logger=INFO,console
# On hadoop, send some data to port 44444 with netcat
nc hadoop 44444
hello
OK
world
OK
lala
OK
# Notice that all of the data goes through Flume3 (k2 has the higher priority)
# Kill Flume3 with Ctrl+C, then keep typing in the nc session
shazi
OK
haha
OK
hahha
OK
hahah
OK
# Now the data is printed to the console by Flume2 instead
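To confirm the switch-back behavior, you can restart Flume3 with the same command as before; because k2 has the higher priority (10 vs. 5), new events should flow through Flume3 again once its agent is back up. A sketch reusing the command from above:
bin/flume-ng agent -c conf/ -n a3 -f job/group3/flume3.conf -Dflume.root.logger=INFO,console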
2. Load Balancing
Use Flume1 to monitor a port; the two sinks in its sink group connect to Flume2 and Flume3 respectively. Use the Load Balancing Sink Processor to implement load balancing. Modify the configuration above as follows.
flume1.conf
# Name the components on this agent
a1.sources = r1
a1.channels = c1
a1.sinkgroups = g1
a1.sinks = k1 k2
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop
a1.sources.r1.port = 44444
# Configure load balancing
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = random
# Maximum backoff period for a failed sink (in milliseconds)
a1.sinkgroups.g1.processor.maxpenalty = 10000
# Describe the sink
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = hadoop
a1.sinks.k1.port = 4141
a1.sinks.k2.type = avro
a1.sinks.k2.hostname = hadoop
a1.sinks.k2.port = 4142
# Describe the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c1
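The load_balance processor also supports a round-robin selector if a deterministic rotation between Flume2 and Flume3 is preferred over random selection; a minimal sketch of just the processor settings:
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin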
Testing is the same as above…