Flume Deployment and Usage

Software Download and Installation

Flume download: http://archive.apache.org/dist/flume/

-> Extract the archive
  tar -zxf /opt/softwares/flume-ng-1.6.0-cdh5.10.2.tar.gz
-> Configure flume-env.sh
  export JAVA_HOME=/opt/apps/jdk1.7.0_67
-> Verify the installation
  bin/flume-ng version
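
Putting the steps together, a minimal install sequence might look like this (the extracted directory name is an assumption based on the archive name; adjust the paths to your layout — conf/flume-env.sh.template ships with the Flume distribution):

  tar -zxf /opt/softwares/flume-ng-1.6.0-cdh5.10.2.tar.gz -C /opt/apps
  cd /opt/apps/apache-flume-1.6.0-cdh5.10.2-bin   # assumed extracted directory name
  cp conf/flume-env.sh.template conf/flume-env.sh
  echo 'export JAVA_HOME=/opt/apps/jdk1.7.0_67' >> conf/flume-env.sh
  bin/flume-ng version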

The flume-ng Command

Usage: bin/flume-ng <command> [options]...

commands:
  agent                     run a Flume agent
  avro-client               run an avro Flume client

global options:
  --conf,-c <conf>          use configs in <conf> directory

agent options:
  --name,-n <name>          the name of this agent (required)
  --conf-file,-f <file>     specify a config file (required if -z missing)

avro-client options:
  --rpcProps,-P <file>   RPC client properties file with server connection params
  --host,-H <host>       hostname to which events will be sent
  --port,-p <port>       port of the avro source
  --dirname <dir>        directory to stream to avro source
  --filename,-F <file>   text file to stream to avro source (default: std input)
  --headerFile,-R <file> File containing event headers as key/value pairs on each new line

Commands to submit a job:

bin/flume-ng agent --conf conf --name agent --conf-file conf/test.properties  
bin/flume-ng agent -c conf -n agent -f conf/test.properties
bin/flume-ng avro-client --conf conf --host ibeifeng.class --port 8080
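
The avro-client mode is handy for smoke-testing an Avro source. For example, to stream a local file to it using the short-form options listed above (the file path is hypothetical):

  bin/flume-ng avro-client -c conf -H ibeifeng.class -p 8080 -F /tmp/test.log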

Choosing the configuration for your deployment:

Flume installed on a node inside the Hadoop cluster:
- Set JAVA_HOME: export JAVA_HOME=/opt/apps/jdk1.7.0_67

Flume installed inside the Hadoop cluster, with HDFS HA enabled:
- The HDFS access entry point changes (use the HA nameservice instead of a single NameNode address)
- Set JAVA_HOME: export JAVA_HOME=/opt/apps/jdk1.7.0_67
- Also copy Hadoop's core-site.xml and hdfs-site.xml into Flume's conf directory

Flume installed outside the Hadoop cluster:
- Set JAVA_HOME: export JAVA_HOME=/opt/apps/jdk1.7.0_67
- Also copy Hadoop's core-site.xml and hdfs-site.xml into Flume's conf directory
- Copy the required Hadoop jars into Flume's lib directory, matching the cluster's Hadoop version (see the sketch after this list)
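
A sketch of the copy steps for the last two cases, assuming HADOOP_HOME and FLUME_HOME point at the two installations and a standard Apache Hadoop 2.x layout (the jar list is what the HDFS sink commonly needs; exact names vary by Hadoop version):

  cp $HADOOP_HOME/etc/hadoop/core-site.xml $FLUME_HOME/conf/
  cp $HADOOP_HOME/etc/hadoop/hdfs-site.xml $FLUME_HOME/conf/
  # only needed when Flume runs outside the cluster:
  cp $HADOOP_HOME/share/hadoop/common/hadoop-common-*.jar $FLUME_HOME/lib/
  cp $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-*.jar $FLUME_HOME/lib/
  cp $HADOOP_HOME/share/hadoop/common/lib/hadoop-auth-*.jar $FLUME_HOME/lib/
  cp $HADOOP_HOME/share/hadoop/common/lib/commons-configuration-*.jar $FLUME_HOME/lib/
  cp $HADOOP_HOME/share/hadoop/common/lib/htrace-core*.jar $FLUME_HOME/lib/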

Running an Agent

bin/flume-ng agent --conf conf --conf-file conf/flume-agent.properties --name agent -Dflume.root.logger=INFO,console

The agent is configured in conf/flume-agent.properties; note that the value passed to --name must match the property prefix used in the file ("agent" below). As an example, the following configuration consumes data from Kafka and writes it to HDFS:

# Name the agent's source, channel, and sink
agent.sources = r1
agent.channels = c1
agent.sinks = k1

# Define the source
# Source type: Kafka
agent.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
# ZooKeeper quorum of the Kafka cluster (used by the older, Flume <= 1.6 Kafka source)
agent.sources.r1.zookeeperConnect = dbtest1:2181,dbtest2:2182,dbtest3:2183
# Kafka brokers (used by the newer, Flume 1.7+ Kafka source)
agent.sources.r1.kafka.bootstrap.servers = dbtest1:9092,dbtest2:9093,dbtest3:9094
# brokerList is a Kafka sink property; the Kafka source ignores it
#agent.sources.r1.brokerList = dbtest1:9092,dbtest2:9093,dbtest3:9094
# Kafka topic to consume (the Flume 1.7+ source reads kafka.topics instead)
agent.sources.r1.topic = my-replicated-topic5
#agent.sources.r1.kafka.consumer.timeout.ms = 100
# Consumer group id
agent.sources.r1.kafka.consumer.group.id = flume
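
# For reference only: on Flume 1.7+ (new Kafka consumer API) the same source
# could be defined with just the kafka.* properties; a sketch, not part of the
# original config:
#agent.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
#agent.sources.r1.kafka.bootstrap.servers = dbtest1:9092,dbtest2:9093,dbtest3:9094
#agent.sources.r1.kafka.topics = my-replicated-topic5
#agent.sources.r1.kafka.consumer.group.id = flume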

# Optional custom interceptor (FormatInterceptor is a user-supplied class)
#agent.sources.r1.interceptors=i1
#agent.sources.r1.interceptors.i1.type=com.hadoop.flume.FormatInterceptor$Builder

# Define the channel
# Channel type: in-memory
agent.channels.c1.type = memory
# Maximum number of events held in the channel
agent.channels.c1.capacity = 10000
# Maximum number of events per transaction
agent.channels.c1.transactionCapacity = 100

# Define the sink: write to HDFS, bucketing files by day
agent.sinks.k1.type = hdfs
agent.sinks.k1.hdfs.path = hdfs://dbtest1:8020/test/%Y%m%d
# Write events as plain text rather than SequenceFile
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.writeFormat = Text
# Roll files every 3 seconds or every ~1 MB; rollCount = 0 disables count-based rolling
agent.sinks.k1.hdfs.rollInterval = 3
agent.sinks.k1.hdfs.rollSize = 1024000
agent.sinks.k1.hdfs.rollCount = 0

# File name prefix and suffix
agent.sinks.k1.hdfs.fileSuffix = .data
agent.sinks.k1.hdfs.filePrefix = localhost-%Y-%m-%d

# Use local time (not an event timestamp header) to resolve the %Y%m%d escapes
agent.sinks.k1.hdfs.useLocalTimeStamp = true
# Close files after 60 seconds without writes
agent.sinks.k1.hdfs.idleTimeout = 60

# Mark files still being written with a prefix so downstream jobs skip them
#agent.sinks.k1.hdfs.inUsePrefix = _
#agent.sinks.k1.hdfs.inUseSuffix =

# Wire the source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1
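
To smoke-test the pipeline, one option is to push a few messages into the topic with Kafka's console producer and then check HDFS (hostnames, topic, and path are taken from the config above; the kafka-console-producer.sh location depends on your Kafka installation):

  kafka-console-producer.sh --broker-list dbtest1:9092 --topic my-replicated-topic5
  hdfs dfs -ls /test/$(date +%Y%m%d)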