I. Flume
JDK version: 1.8.0_211
Flume version: 1.8.0
Download: omitted
Configuration:
- System environment variables:
- export FLUME_HOME=/usr/local/flume/apache-flume-1.8.0-bin
- export FLUME_CONF_DIR=$FLUME_HOME/conf
- Append $FLUME_HOME/bin to PATH
- Set JAVA_HOME in conf/flume-env.sh
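With the environment variables in place, a quick sanity check (assuming `flume-ng` is now on the PATH and JAVA_HOME is set correctly):

```shell
# Should print the Flume version banner if the install and PATH are correct
flume-ng version
```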
1. Using Flume to receive data from an Avro source
- conf/avro.conf
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- Start the agent with console logging:
/usr/local/flume/apache-flume-1.8.0-bin/bin/flume-ng agent -c . -f /usr/local/flume/apache-flume-1.8.0-bin/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
- In another shell, create a file under the Flume home directory:
sh -c 'echo "hello, world" > /usr/local/flume/apache-flume-1.8.0-bin/log.00'
- Send the file with the avro-client:
./bin/flume-ng avro-client --conf conf -H localhost -p 4141 -F /usr/local/flume/apache-flume-1.8.0-bin/log.00
- The event appears in the first shell's console output.
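As a side note, the avro-client can also read events from standard input when `-F` is omitted (worth verifying on your Flume version), so data can be piped straight in without a temporary file:

```shell
# Pipe an event from stdin instead of reading a file (run from $FLUME_HOME)
echo "another event" | ./bin/flume-ng avro-client --conf conf -H localhost -p 4141
```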
2. Using Flume to receive data from a netcat source
- conf/example.conf
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- Start the agent with console logging:
/usr/local/flume/apache-flume-1.8.0-bin/bin/flume-ng agent --conf ./conf --conf-file ./conf/example.conf --name a1 -Dflume.root.logger=INFO,console
- In another shell, run:
telnet localhost 44444
- Anything typed in that shell is echoed in the main shell by the logger sink.
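telnet is interactive; for scripted tests netcat works as well (assuming `nc` is installed). The netcat source acknowledges each accepted line with "OK":

```shell
# Send a single line to the netcat source non-interactively
echo "hello from nc" | nc localhost 44444
```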
3. Using Flume to collect a local file (exec source to HDFS)
- conf/exec1.conf
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# For each source, the type is defined
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /usr/local/hadoop/hadoop-2.7.7/logs/hadoop-root-datanode-bigdata.log
# Locate bash with `whereis bash`
a1.sources.r1.shell = /usr/bin/bash -c

# Each sink's type must be defined
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://bigdata:9000/flume/%y%m%d/%H
a1.sinks.k1.hdfs.filePrefix = logs-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 1
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.useLocalTimeStamp = true
a1.sinks.k1.hdfs.batchSize = 100
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollInterval = 30
a1.sinks.k1.hdfs.rollSize = 134217700
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.minBlockReplicas = 1

# Specify the channel the source and sink should use
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- Start the agent with console logging:
/usr/local/flume/apache-flume-1.8.0-bin/bin/flume-ng agent --conf ./conf --conf-file ./conf/exec1.conf --name a1 -Dflume.root.logger=INFO,console
- Check the results in HDFS.
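A sketch of how to inspect the sink's output from the command line. The directory layout follows the `%y%m%d/%H` pattern in the config; the `<yymmdd>/<HH>` path below is a placeholder to substitute with a real date/hour from the listing:

```shell
# List the date/hour directories the HDFS sink created
hdfs dfs -ls -R /flume
# View the contents of a rolled file (replace the placeholder path)
hdfs dfs -cat /flume/<yymmdd>/<HH>/logs-*
```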
4. Using Flume to collect a local directory (spooling directory source to HDFS)
- conf/spooldir1.conf
# Names of the agent's source, channel, and sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /usr/local/flume/apache-flume-1.8.0-bin/logs

# Configure the channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Configure the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://bigdata:9000/flume/%Y%m%d
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Do not roll files based on event count
a1.sinks.k1.hdfs.rollCount = 0
# Roll the HDFS file when it reaches roughly 128 MB
a1.sinks.k1.hdfs.rollSize = 134217700
# Roll the HDFS file every 30 seconds
a1.sinks.k1.hdfs.rollInterval = 30

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
- Start the agent with console logging:
/usr/local/flume/apache-flume-1.8.0-bin/bin/flume-ng agent --conf ./conf --conf-file ./conf/spooldir1.conf --name a1 -Dflume.root.logger=INFO,console
- Check the results in HDFS.
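To exercise the spooling directory source, drop a file into spoolDir; Flume renames each ingested file with a `.COMPLETED` suffix (the source's default `fileSuffix`):

```shell
# Create a test file in the spooled directory
echo "spool test" > /usr/local/flume/apache-flume-1.8.0-bin/logs/test.log
# After ingestion the file should be renamed to test.log.COMPLETED
ls /usr/local/flume/apache-flume-1.8.0-bin/logs
# Then check the date-stamped output directory in HDFS
hdfs dfs -ls /flume
```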