Flume: one source fanned out to multiple channels and multiple sinks

Original article: http://www.tuicool.com/articles/Z73UZf6


Data collected on hadoop2 and hadoop3 is sent to hadoop1, and hadoop1 then forwards it to multiple different destinations.



I. Overview

1. There are three machines, Hadoop1, Hadoop2, and Hadoop3, with Hadoop1 acting as the log consolidation node.

2. While consolidating, Hadoop1 writes the data out to multiple destinations at the same time.

3. In Flume, one source fanned out to multiple channels and multiple sinks is configured in the consolidation-accepter.conf file.

II. Deploying Flume to collect and consolidate logs

1. Run on Hadoop1:

flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console

The configuration file (consolidation-accepter.conf) is as follows:

# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1 ch2
agent1.sources = source1
agent1.sinks = hdfssink1 sink2
agent1.sources.source1.selector.type = replicating

# Define a memory channel called ch1 on agent1
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000000
agent1.channels.ch1.transactionCapacity = 1000000
agent1.channels.ch1.keep-alive = 10

agent1.channels.ch2.type = memory
agent1.channels.ch2.capacity = 1000000
agent1.channels.ch2.transactionCapacity = 100000
agent1.channels.ch2.keep-alive = 10

# Define an Avro source called source1 on agent1, listening on
# port 44444, and fan its events out to both ch1 and ch2.
agent1.sources.source1.channels = ch1 ch2
agent1.sources.source1.type = avro
# bind address of the consolidation host (0.0.0.0 would listen on all interfaces)
agent1.sources.source1.bind = con
agent1.sources.source1.port = 44444
agent1.sources.source1.threads = 5

# Define an HDFS sink that writes events from ch1 to HDFS,
# rolling files by time (rollInterval) and size (rollSize).
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.hdfssink1.hdfs.path = hdfs://mycluster/flume/%Y-%m-%d/%H%M
agent1.sinks.hdfssink1.hdfs.filePrefix = S1PA124-consolidation-accesslog-%H-%M-%S
agent1.sinks.hdfssink1.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfssink1.hdfs.writeFormat = Text
agent1.sinks.hdfssink1.hdfs.fileType = DataStream
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
agent1.sinks.hdfssink1.hdfs.batchSize = 10000
agent1.sinks.hdfssink1.hdfs.rollCount = 0
agent1.sinks.hdfssink1.hdfs.round = true
agent1.sinks.hdfssink1.hdfs.roundValue = 60
agent1.sinks.hdfssink1.hdfs.roundUnit = minute


# sink2 rolls events from ch2 into files under a local directory.
# Note: the sink.* options below belong to the file_roll sink (a
# logger sink would ignore them), so file_roll is the intended type.
agent1.sinks.sink2.type = file_roll
agent1.sinks.sink2.sink.batchSize = 10000
agent1.sinks.sink2.sink.batchTimeout = 600000
agent1.sinks.sink2.sink.rollInterval = 1000
agent1.sinks.sink2.sink.directory = /root/data/flume-logs/
agent1.sinks.sink2.sink.fileName = accesslog
agent1.sinks.sink2.channel = ch2
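With a replicating selector, a slow or full channel blocks the source for all channels. As a hedged refinement (not in the original article), Flume's replicating selector can mark a channel as optional, so that a failure to write to it does not stall the other branch:

```properties
# Sketch: make ch2 (the local-file branch) best-effort, so a full ch2
# cannot block deliveries into ch1 (the HDFS branch)
agent1.sources.source1.selector.type = replicating
agent1.sources.source1.selector.optional = ch2
```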

2. Run the following command on both Hadoop2 and Hadoop3:

flume-ng agent --conf ./  --conf-file collect-send.conf --name agent2

The contents of the Flume sender configuration file, collect-send.conf, are as follows:

agent2.sources = source2
agent2.sinks = sink1
agent2.channels = ch2
agent2.sources.source2.type = exec
agent2.sources.source2.command = tail -F /root/data/flume.log
agent2.sources.source2.channels = ch2
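The exec source with `tail -F` is simple, but it offers no delivery guarantee: lines read after the last commit are lost if agent2 dies. A more robust alternative, sketched here under the assumption that logs are rotated into a dedicated directory (the path below is hypothetical), is the spooling directory source:

```properties
# Sketch: ingest completed log files dropped into /root/data/flume-spool;
# Flume renames each file with a .COMPLETED suffix once it is ingested
agent2.sources.source2.type = spooldir
agent2.sources.source2.spoolDir = /root/data/flume-spool
agent2.sources.source2.channels = ch2
```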

#channels configuration
agent2.channels.ch2.type = memory
agent2.channels.ch2.capacity = 10000
agent2.channels.ch2.transactionCapacity = 10000
agent2.channels.ch2.keep-alive = 3

#sinks configuration
agent2.sinks.sink1.type = avro
# replace consolidationIpAddress with the address of the Hadoop1 agent
agent2.sinks.sink1.hostname = consolidationIpAddress
agent2.sinks.sink1.port = 44444
agent2.sinks.sink1.channel = ch2
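With a single avro sink, agent2 has nowhere to deliver events while Hadoop1 is down. If a standby consolidation host exists (an assumption; `sink2` and `backupIpAddress` below are hypothetical), a failover sink group would let agent2 switch over automatically:

```properties
# Sketch: prefer sink1 (higher priority), fail over to sink2 on error
agent2.sinks = sink1 sink2
agent2.sinks.sink2.type = avro
agent2.sinks.sink2.hostname = backupIpAddress
agent2.sinks.sink2.port = 44444
agent2.sinks.sink2.channel = ch2
agent2.sinkgroups = g1
agent2.sinkgroups.g1.sinks = sink1 sink2
agent2.sinkgroups.g1.processor.type = failover
agent2.sinkgroups.g1.processor.priority.sink1 = 10
agent2.sinkgroups.g1.processor.priority.sink2 = 5
```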
III. Startup and parameter notes

1. Start the Flume consolidation process:
  flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console
2. Start the Flume collection processes:
  flume-ng agent --conf ./  --conf-file collect-send.conf --name agent2
3. Notes on the roll parameters (the two conditions below are OR-ed: a roll is triggered as soon as either one is met):
(1) Every half hour (1800 s), flush the channel's data to the sink and start a new file:
    agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
(2) When the current file reaches 5073741824 bytes (about 4.7 GiB), start a new file:
    agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824

Installation reference: http://blog.csdn.net/panguoyuan/article/details/39555239

User guide: http://flume.apache.org/FlumeUserGuide.html


Flume is a distributed log collection system: it gathers data from different sources and delivers it to target systems. A MySQL sink is a custom sink type that writes events into a MySQL database. Below is an example of a MySQL sink written in Java:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;

public class MySQLSink extends AbstractSink implements Configurable {
    private String driver;
    private String url;
    private String username;
    private String password;
    private String tableName;
    private Connection connection;
    private PreparedStatement statement;

    @Override
    public void configure(Context context) {
        // Read the database settings from the agent configuration file
        driver = context.getString("driver");
        url = context.getString("url");
        username = context.getString("username");
        password = context.getString("password");
        tableName = context.getString("tableName");
    }

    @Override
    public void start() {
        try {
            Class.forName(driver);
            connection = DriverManager.getConnection(url, username, password);
            statement = connection.prepareStatement(
                    "INSERT INTO " + tableName + " (message) VALUES (?)");
            super.start();
        } catch (ClassNotFoundException | SQLException e) {
            throw new RuntimeException("Could not open MySQL connection", e);
        }
    }

    @Override
    public void stop() {
        try {
            if (statement != null) {
                statement.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (SQLException e) {
            // best effort on shutdown; log and continue
            e.printStackTrace();
        }
        super.stop();
    }

    @Override
    public Status process() throws EventDeliveryException {
        // Events must be taken inside a channel transaction
        Channel channel = getChannel();
        Transaction txn = channel.getTransaction();
        txn.begin();
        try {
            Event event = channel.take();
            if (event == null) {
                // Channel empty: commit the empty transaction and back off
                txn.commit();
                return Status.BACKOFF;
            }
            statement.setString(1, new String(event.getBody()));
            statement.executeUpdate();
            txn.commit();
            return Status.READY;
        } catch (Exception e) {
            // Roll back so the event stays in the channel and is retried
            txn.rollback();
            return Status.BACKOFF;
        } finally {
            txn.close();
        }
    }
}
```
In this example we extend Flume's `AbstractSink` class and implement its `configure`, `start`, `stop`, and `process` methods. In `configure` we read the MySQL connection settings from the agent's configuration file. In `start` we open a connection to the MySQL database and prepare a `PreparedStatement` for inserts. In `stop` we close the statement and the connection. In `process` we take an event from the Flume channel, convert its body to a string, and insert it into the database. On success we return `Status.READY`, meaning more events can be processed; otherwise we return `Status.BACKOFF`, telling Flume to pause before retrying.
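To actually use such a custom sink, it would be referenced in the agent configuration by its fully-qualified class name, with its jar (and the MySQL JDBC driver) on Flume's classpath. A sketch, where the package name `com.example.flume` and all connection values are hypothetical:

```properties
# Hypothetical wiring for the custom sink above; the property keys
# match what its configure() method reads
agent1.sinks.mysqlsink.type = com.example.flume.MySQLSink
agent1.sinks.mysqlsink.driver = com.mysql.jdbc.Driver
agent1.sinks.mysqlsink.url = jdbc:mysql://dbhost:3306/logs
agent1.sinks.mysqlsink.username = flume
agent1.sinks.mysqlsink.password = secret
agent1.sinks.mysqlsink.tableName = access_log
agent1.sinks.mysqlsink.channel = ch2
```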
