Flume: Writing Data to HDFS and Kafka Simultaneously

Preface

This article describes in detail how to configure a Flume agent with multiple flows (one source fanned out to two channels and sinks) to write data to both Kafka and HDFS. The Hadoop, Zookeeper, and Kafka deployments involved are all pseudo-distributed (single-node).

Environment

Component Versions

Component Name    Component Version
Flume             flume-ng-1.6.0-cdh5.7.0
Zookeeper         zookeeper-3.4.5-cdh5.7.0
Kafka             kafka_2.11-0.10.0.0

Installation and Deployment

Installing Flume

The detailed steps are skipped here.
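For completeness, a minimal install sketch follows, assuming the same ~/soft and ~/app layout used for Kafka below and the CDH flume-ng-1.6.0-cdh5.7.0 tarball (adjust the file and directory names to your actual download):

#Extract
[hadoop@hadoop001 soft]$ cd ~/soft
[hadoop@hadoop001 soft]$ tar -zxvf flume-ng-1.6.0-cdh5.7.0.tar.gz -C ~/app/

#Add environment variables
[hadoop@hadoop001 soft]$ vim ~/.bash_profile
export FLUME_HOME=/home/hadoop/app/apache-flume-1.6.0-cdh5.7.0-bin
export PATH=$FLUME_HOME/bin:$PATH
[hadoop@hadoop001 soft]$ source ~/.bash_profile
[hadoop@hadoop001 soft]$ flume-ng version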

Installing Standalone Zookeeper

See the article "ZooKeeper Introduction and Basic Operations".
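For reference, a minimal standalone setup looks roughly like this (the dataDir path is an assumption; see the linked article for details):

#Create the config from the shipped sample and point dataDir somewhere persistent
[hadoop@hadoop001 ~]$ cd ~/app/zookeeper-3.4.5-cdh5.7.0
[hadoop@hadoop001 zookeeper-3.4.5-cdh5.7.0]$ cp conf/zoo_sample.cfg conf/zoo.cfg
[hadoop@hadoop001 zookeeper-3.4.5-cdh5.7.0]$ vim conf/zoo.cfg
dataDir=/home/hadoop/app/zookeeper-3.4.5-cdh5.7.0/data
clientPort=2181

#Start and check the server
[hadoop@hadoop001 zookeeper-3.4.5-cdh5.7.0]$ bin/zkServer.sh start
[hadoop@hadoop001 zookeeper-3.4.5-cdh5.7.0]$ bin/zkServer.sh status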

Installing Standalone Kafka

#Extract
[hadoop@hadoop001 soft]$ cd ~/soft
[hadoop@hadoop001 soft]$ tar -zxvf kafka_2.11-0.10.0.0.tgz -C ~/app/

#Change the data storage location
[hadoop@hadoop001 soft]$ cd ~/app/kafka_2.11-0.10.0.0/
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ mkdir -p ~/app/kafka_2.11-0.10.0.0/datalogdir
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ vim config/server.properties 
log.dirs=/home/hadoop/app/kafka_2.11-0.10.0.0/datalogdir

#Add environment variables
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ vim ~/.bash_profile 
export KAFKA_HOME=/home/hadoop/app/kafka_2.11-0.10.0.0
export PATH=$KAFKA_HOME/bin:$PATH
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ source ~/.bash_profile 
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ which kafka-topics.sh
~/app/kafka_2.11-0.10.0.0/bin/kafka-topics.sh

#Start
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-server-start.sh config/server.properties 

#Test: create a topic
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic leo
#Test: list topics
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-topics.sh --list --zookeeper localhost:2181
#Test: console producer
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic leo
#Test: console consumer
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic leo --from-beginning
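The foreground start above ties up the terminal; kafka-server-start.sh also accepts a -daemon flag to run the broker in the background:

#Optional: start the broker as a background daemon instead
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-server-start.sh -daemon config/server.properties
#Verify the Kafka process is up
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ jps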

Configuring the Flume Job

Use Flume's Taildir Source to collect data and send it to both Kafka and HDFS. The full agent configuration is as follows:

Taildir-HdfsAndKafka-Agent.sources = taildir-source
Taildir-HdfsAndKafka-Agent.channels = c1 c2
Taildir-HdfsAndKafka-Agent.sinks = hdfs-sink kafka-sink

Taildir-HdfsAndKafka-Agent.sources.taildir-source.type = TAILDIR
Taildir-HdfsAndKafka-Agent.sources.taildir-source.filegroups = f1
Taildir-HdfsAndKafka-Agent.sources.taildir-source.filegroups.f1 = /home/hadoop/data/flume/HdfsAndKafka/input/.*
Taildir-HdfsAndKafka-Agent.sources.taildir-source.positionFile = /home/hadoop/data/flume/HdfsAndKafka/taildir_position/taildir_position.json
Taildir-HdfsAndKafka-Agent.sources.taildir-source.selector.type = replicating

Taildir-HdfsAndKafka-Agent.channels.c1.type = memory
Taildir-HdfsAndKafka-Agent.channels.c2.type = memory

Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.type = hdfs
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.path = hdfs://hadoop001:9000/flume/HdfsAndKafka/%Y%m%d%H%M
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.filePrefix = leo
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.rollInterval = 10
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.rollSize = 100000000
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.rollCount = 0
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.fileType = DataStream
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.hdfs.writeFormat = Text

Taildir-HdfsAndKafka-Agent.sinks.kafka-sink.type = org.apache.flume.sink.kafka.KafkaSink
Taildir-HdfsAndKafka-Agent.sinks.kafka-sink.brokerList = localhost:9092
Taildir-HdfsAndKafka-Agent.sinks.kafka-sink.topic = leo

Taildir-HdfsAndKafka-Agent.sources.taildir-source.channels = c1 c2
Taildir-HdfsAndKafka-Agent.sinks.hdfs-sink.channel = c1
Taildir-HdfsAndKafka-Agent.sinks.kafka-sink.channel = c2
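Note that the two memory channels above run with Flume's defaults (capacity = 100, transactionCapacity = 100), which are easy to overflow once the HDFS sink slows down. They can be raised explicitly; the numbers below are illustrative assumptions rather than tuned values:

Taildir-HdfsAndKafka-Agent.channels.c1.capacity = 10000
Taildir-HdfsAndKafka-Agent.channels.c1.transactionCapacity = 1000
Taildir-HdfsAndKafka-Agent.channels.c2.capacity = 10000
Taildir-HdfsAndKafka-Agent.channels.c2.transactionCapacity = 1000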

Startup command:

flume-ng agent \
--name Taildir-HdfsAndKafka-Agent \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/Taildir-HdfsAndKafka-Agent.conf \
-Dflume.root.logger=INFO,console
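
Once the agent is running, the fan-out can be verified end to end by appending data to a file under the monitored directory and checking both sinks (test.log is just an example file name):

#Append test data to the monitored directory
[hadoop@hadoop001 ~]$ echo "hello flume" >> /home/hadoop/data/flume/HdfsAndKafka/input/test.log

#Kafka side: the console consumer started earlier should print the new line
[hadoop@hadoop001 kafka_2.11-0.10.0.0]$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic leo --from-beginning

#HDFS side: files prefixed with leo should appear under the time-partitioned path
[hadoop@hadoop001 ~]$ hdfs dfs -ls hdfs://hadoop001:9000/flume/HdfsAndKafka/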