Flume introduction
Flume is a log data collector.
Flume usage steps
- Define the source, the channel (buffer), and the sink (the destination the data is forwarded to)
- Start the agent
- Once data arrives, the agent begins receiving and forwarding it
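The three steps above can be sketched as a minimal agent definition (the names a1/r1/c1/k1 are arbitrary, and the NetCat source and logger sink here are just placeholders for illustration):

```properties
# Name the components of agent a1
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: listen for text lines on a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Channel: buffer events in memory
a1.channels.c1.type = memory

# Sink: log events to the console
a1.sinks.k1.type = logger

# Wire source -> channel -> sink
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```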
How Flume works
Flume type overview
- source type
  - Avro, Exec, JMS, Spooling Directory, NetCat, HTTP, Syslog, Thrift, Twitter, etc.
  - Advanced: you can also write your own source type
- channel type
  - Events can be staged in memory, JDBC, or a file channel
- sink type
  - HDFS, HBase, or Spark Streaming; the destination may also be another sink (agent)
Flume demo
Install Flume by unpacking the archive, then go to its conf directory:
/home/hadoop/opt/apache-flume-1.8.0-bin/conf
vi spooldir.conf
========================================================
spooldir.sources=sa
spooldir.channels=ma
spooldir.sinks=ha
spooldir.sources.sa.type=spooldir
spooldir.sources.sa.spoolDir=/home/hadoop/firstdemo/flume_spider
spooldir.sources.sa.fileHeader = true
spooldir.channels.ma.type=memory
spooldir.channels.ma.capacity=10000
# transactionCapacity must not exceed the channel capacity
spooldir.channels.ma.transactionCapacity=10000
#spooldir.sinks.ha.type=logger
spooldir.sinks.ha.type=hdfs
spooldir.sinks.ha.hdfs.fileType=DataStream
spooldir.sinks.ha.hdfs.path=/user/hadoop/spider
spooldir.sinks.ha.hdfs.writeFormat=Text
spooldir.sinks.ha.hdfs.batchSize=10000
spooldir.sinks.ha.hdfs.rollCount=1000
spooldir.sinks.ha.hdfs.fileSuffix=.csv
spooldir.sinks.ha.hdfs.filePrefix=test
spooldir.sinks.ha.hdfs.rollSize=0
spooldir.sinks.ha.hdfs.rollInterval=0
spooldir.sources.sa.channels=ma
spooldir.sinks.ha.channel=ma
========================================================
Start the Flume agent:
./bin/flume-ng agent -n spooldir -c conf -f conf/spooldir.conf
Or start it with console logging for debugging:
./bin/flume-ng agent -n spooldir -c conf -f conf/spooldir.conf -Dflume.root.logger=INFO,console
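With the agent running, you can exercise it by dropping a file into the spooling directory. A quick sketch (the spoolDir path is from the demo config; SPOOL_DIR below defaults to a local stand-in directory so the commands work on any machine):

```shell
# Stand-in for /home/hadoop/firstdemo/flume_spider from spooldir.conf
SPOOL_DIR=${SPOOL_DIR:-/tmp/flume_spider_demo}
mkdir -p "$SPOOL_DIR"

# Create a sample CSV and drop it into the spooling directory.
printf 'id,name\n1,alice\n' > /tmp/sample.csv
cp /tmp/sample.csv "$SPOOL_DIR"/

# With the agent watching this directory, the spooldir source renames
# each ingested file to <name>.COMPLETED, and the events are rolled
# into HDFS under /user/hadoop/spider as test*.csv
# (verify with: hdfs dfs -ls /user/hadoop/spider).
ls "$SPOOL_DIR"
```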