Flume: Avro Port Transport, Load Balancing, and Failover

1. Avro

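An Avro client can send a specified file to Flume. On the receiving side, a Flume source of type avro listens on a configured IP address and port and picks up whatever the client sends.
The Avro Source is also Flume's main RPC source. It communicates over the Netty-Avro inter-process communication (IPC) protocol, so data can be sent to it from Java or any other JVM language. Its configuration has three main parameters:
type: the alias of the Avro source is avro; the fully qualified class name, org.apache.flume.source.AvroSource, also works.
bind: the IP address or hostname to bind to. Use 0.0.0.0 to bind to all network interfaces on the machine.
port: the port to listen on.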

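Example: server A (log file test.log) --avro--> server B, which receives the Avro data and sinks it to the console
Note: when servers pass log files to each other, the fileToavro.conf configuration file must be identical on every node.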

(On server A)

linux>vi fileToavro.conf
# Name the components on this agent
# Source s1, sink k1, channel c1. Comments go on their own lines:
# in a properties file, a trailing "#..." becomes part of the value.
test.sources = s1
test.sinks = k1
test.channels = c1

# Describe/configure the source
# The exec source runs the given command and reads its output
test.sources.s1.type = exec
test.sources.s1.command = tail -F /root/logs/test.log

# Describe the sink
# The avro sink is the sending side (an Avro client). It points at server B's address, not at this machine (server A).
test.sinks.k1.type = avro
# Address of host B
test.sinks.k1.hostname = 192.168.58.201
# Port of the avro source on B
test.sinks.k1.port = 1234
# Number of events sent per batch
test.sinks.k1.batch-size = 2

# Use a channel which buffers events in memory
test.channels.c1.type = memory
# Maximum number of events held in the channel
test.channels.c1.capacity = 1000
# Maximum number of events per transaction
test.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
test.sources.s1.channels = c1
test.sinks.k1.channel = c1

(On server B)

linux>vi avroToConsole.conf

# Name the components on this agent
test.sources = s1
test.sinks = k1
test.channels = c1

# Describe/configure the source
# The avro source is the receiving service; it binds to this machine
test.sources.s1.type = avro
test.sources.s1.bind = 0.0.0.0
test.sources.s1.port = 1234

# Describe the sink
test.sinks.k1.type = logger

# Use a channel which buffers events in memory
test.channels.c1.type = memory
test.channels.c1.capacity = 1000
test.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
test.sources.s1.channels = c1
test.sinks.k1.channel = c1

Run on server A (start this one second):

linux>bin/flume-ng agent --conf conf --conf-file fileToavro.conf --name test -Dflume.root.logger=INFO,console

Run on server B (start this one first):

linux>bin/flume-ng agent --conf conf --conf-file avroToConsole.conf --name test -Dflume.root.logger=INFO,console

Result: server B receives data on its Avro port. The log lines collected on server A flow through the channel and arrive at server B.
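
Because the receiving agent on B is meant to be up before A starts sending, it can help to confirm that B's port is accepting connections first. The following is a minimal sketch, not part of Flume: wait_for_port is a hypothetical helper that only checks TCP reachability and knows nothing about the Avro protocol itself.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or unreachable: retry until the deadline passes
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the addresses used in this example, calling wait_for_port("192.168.58.201", 1234) on server A before launching its agent would report whether B's avro source is already listening.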

2. load_balance (load balancing: events are spread across all sinks, so every sink is used)

Configuration on server A

linux>vi load_balance_avro.conf

#test name
test.channels = c1
test.sources = s1
test.sinks = k1 k2 k3

#set group
# Declare the sink group
test.sinkgroups = g1

#set channel
test.channels.c1.type = memory
test.channels.c1.capacity = 1000
test.channels.c1.transactionCapacity = 100

test.sources.s1.channels = c1
test.sources.s1.type = exec
test.sources.s1.command = tail -F /root/logs/test.log

# set sink1
test.sinks.k1.channel = c1
test.sinks.k1.type = avro
test.sinks.k1.hostname = 192.168.58.201
test.sinks.k1.port = 12345

# set sink2
test.sinks.k2.channel = c1
test.sinks.k2.type = avro
test.sinks.k2.hostname = 192.168.58.202
test.sinks.k2.port = 12345

# set sink3
test.sinks.k3.channel = c1
test.sinks.k3.type = avro
test.sinks.k3.hostname = 192.168.58.203
test.sinks.k3.port = 12345

#set sink group
# Bind the sinks to the group
test.sinkgroups.g1.sinks = k1 k2 k3

#set loadbalance
# Processor type: load balancing
test.sinkgroups.g1.processor.type = load_balance
# If enabled, failed sinks are temporarily blacklisted
test.sinkgroups.g1.processor.backoff = true
# Selection mechanism: round-robin
test.sinkgroups.g1.processor.selector = round_robin
# Upper limit (ms) on the blacklist timeout; the timeout grows exponentially while a sink keeps failing
test.sinkgroups.g1.processor.selector.maxTimeOut = 10000
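
To make the interplay of round_robin, backoff, and maxTimeOut easier to picture, here is a toy model of the selection logic. It is an illustrative sketch only and not Flume's implementation: the class name, the deliver helper, and base_backoff_ms (the initial penalty) are assumptions made for the example.

```python
class RoundRobinSelector:
    """Toy model of round-robin sink selection with exponential backoff."""

    def __init__(self, sinks, base_backoff_ms=1000, max_timeout_ms=10000):
        self.sinks = list(sinks)
        self.base_backoff_ms = base_backoff_ms  # hypothetical initial penalty
        self.max_timeout_ms = max_timeout_ms    # plays the role of selector.maxTimeOut
        self.next_index = 0
        self.blocked_until = {}                 # sink -> time (ms) when it may be retried
        self.failures = {}                      # sink -> consecutive failure count

    def candidates(self, now_ms):
        """Sinks in round-robin order, skipping those still blacklisted."""
        n = len(self.sinks)
        order = [self.sinks[(self.next_index + i) % n] for i in range(n)]
        self.next_index = (self.next_index + 1) % n
        return [s for s in order if self.blocked_until.get(s, 0) <= now_ms]

    def report_failure(self, sink, now_ms):
        """Blacklist a sink; the penalty doubles per consecutive failure, up to the cap."""
        count = self.failures.get(sink, 0) + 1
        self.failures[sink] = count
        penalty = min(self.base_backoff_ms * 2 ** (count - 1), self.max_timeout_ms)
        self.blocked_until[sink] = now_ms + penalty
        return penalty

    def report_success(self, sink):
        """A successful delivery clears the sink's failure history."""
        self.failures.pop(sink, None)
        self.blocked_until.pop(sink, None)


def deliver(selector, is_up, now_ms):
    """Try sinks in order until one accepts the batch; return its name or None."""
    for sink in selector.candidates(now_ms):
        if is_up(sink):
            selector.report_success(sink)
            return sink
        selector.report_failure(sink, now_ms)
    return None
```

In this model, a healthy group of k1, k2, k3 receives batches in turn, while a sink that keeps failing is skipped for 1, 2, 4, 8 and then at most 10 seconds.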

The three receiving nodes (B, C, D) use the same configuration:

linux>vi avro.conf

# Name the components on this agent
test.sources = s1
test.sinks = k1
test.channels = c1

# Describe/configure the source
test.sources.s1.type = avro
test.sources.s1.bind = 0.0.0.0
test.sources.s1.port = 12345

# Describe the sink
test.sinks.k1.type = logger

# Use a channel which buffers events in memory
test.channels.c1.type = memory
test.channels.c1.capacity = 1000
test.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
test.sources.s1.channels = c1
test.sinks.k1.channel = c1

# Distribute the file to the other nodes
linux>scp avro.conf 192.168.58.202:`pwd`
linux>scp avro.conf 192.168.58.203:`pwd`

Start the three receiving nodes (B, C, D) first:

linux>bin/flume-ng agent --conf conf --conf-file avro.conf --name test -Dflume.root.logger=INFO,console

Then start the log node (A):

linux>bin/flume-ng agent --conf conf --conf-file load_balance_avro.conf --name test -Dflume.root.logger=INFO,console

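If the agent reports an error, check the xxx.conf configuration file for stray whitespace.
Result: the log lines are sent to the other three nodes in turn (round-robin load balancing).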
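
Stray whitespace is easy to miss by eye. Flume agent files are read as Java properties files, where trailing whitespace is kept as part of the value and a `#` starts a comment only at the beginning of a line. The sketch below flags both cases; lint_properties is a hypothetical helper, and it assumes every setting uses `=` as the separator, as the configs here do.

```python
import re


def lint_properties(text: str):
    """Return (line_number, problem) pairs for risky lines in a properties-style config."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Blank lines and whole-line comments are fine
        if not stripped or stripped.startswith(("#", "!")):
            continue
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        # Everything after the first '=' is the value, including any '#'
        _, sep, value = stripped.partition("=")
        if sep and re.search(r"\s#", value):
            problems.append((number, "inline comment is part of the value"))
    return problems
```

Running it over a file's contents, for example lint_properties(open("avro.conf").read()), returns an empty list when neither problem is present.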

3. Failover (a backup sink is used only when a failure occurs)

linux>vi failover.conf

#test name
test.channels = c1
test.sources = s1
test.sinks = k1 k2 k3

#set group
test.sinkgroups = g1

#set channel
test.channels.c1.type = memory
test.channels.c1.capacity = 1000
test.channels.c1.transactionCapacity = 100

test.sources.s1.channels = c1
test.sources.s1.type = exec
test.sources.s1.command = tail -F /root/logs/test.log

# set sink1
test.sinks.k1.channel = c1
test.sinks.k1.type = avro
test.sinks.k1.hostname = 192.168.58.201
test.sinks.k1.port = 12345

# set sink2
test.sinks.k2.channel = c1
test.sinks.k2.type = avro
test.sinks.k2.hostname = 192.168.58.202
test.sinks.k2.port = 12345

# set sink3
test.sinks.k3.channel = c1
test.sinks.k3.type = avro
test.sinks.k3.hostname = 192.168.58.203
test.sinks.k3.port = 12345

#set sink group
test.sinkgroups.g1.sinks = k1 k2 k3

#set failover
# Processor type: failover
test.sinkgroups.g1.processor.type = failover
# Priority values; a larger absolute value means a higher priority
test.sinkgroups.g1.processor.priority.k1 = 1
test.sinkgroups.g1.processor.priority.k2 = 5
test.sinkgroups.g1.processor.priority.k3 = 9
# Maximum backoff period for a failed sink (ms)
test.sinkgroups.g1.processor.maxpenalty = 20000

Start the three receiving nodes (B, C, D) first:

linux>bin/flume-ng agent --conf conf --conf-file avro.conf --name test -Dflume.root.logger=INFO,console

Then start the log node (A):

linux>bin/flume-ng agent --conf conf --conf-file failover.conf --name test -Dflume.root.logger=INFO,console

Test data:

linux>while true;do echo 'access log ...' >> /root/logs/test.log; sleep 0.5;done

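Result: events go to the sink with the highest priority (here k3, 192.168.58.203). If that server goes down, they are sent to the server with the next-highest priority.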
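
The failover rule can be sketched in a few lines. This is an illustrative model rather than Flume's code: pick_sink is a hypothetical function, it assumes positive priorities like those in failover.conf, and it does not model the maxpenalty cool-down before a failed sink is retried.

```python
def pick_sink(priorities, alive):
    """Return the live sink with the highest priority, or None if all are down."""
    live = [name for name in priorities if name in alive]
    if not live:
        return None
    return max(live, key=lambda name: priorities[name])


# Priorities taken from failover.conf
priorities = {"k1": 1, "k2": 5, "k3": 9}
```

With all three sinks alive the function picks k3. Removing k3 from the live set makes it fall back to k2, and then to k1.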
