Single-node Flume setup

This article walks through deploying Flume for log collection on Linux: installing and configuring Flume, setting the agent parameters, receiving data from a netcat source and writing it to the log, and using an exec source to monitor a file and write its updates to HDFS.

1. Installation
Upload the archive to root's /tmp directory, then unpack it as root:
[root@h101 tmp]# tar -zxvf flume-ng-1.2.0-cdh3u5.tar.gz -C /usr/local/
Grant ownership: chown -R hadoop.hadoop /usr/local/flume-ng-1.2.0-cdh3u5
Switch user: su - hadoop
[hadoop@h101 ~]$ vi .bash_profile
Add:
export FLUME_HOME=/usr/local/flume-ng-1.2.0-cdh3u5
export FLUME_CONF_DIR=$FLUME_HOME/conf
2. Configuration
[hadoop@h101 ~]$ cd flume-ng-1.2.0-cdh3u5/conf/

[hadoop@h101 conf]$ cp flume-conf.properties.template flume-conf.properties

[hadoop@h101 conf]$ vi flume-conf.properties
# Name the components of agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
# source type: netcat (listens on a TCP port)
a1.sources.r1.type = netcat
# host to bind to
a1.sources.r1.bind = h101
# port to listen on
a1.sources.r1.port = 44444

# Describe the sink
# sink type: logger (writes events to the log)
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
# channel type: memory
a1.channels.c1.type = memory
# maximum number of events the channel can hold (1000; can be raised)
a1.channels.c1.capacity = 1000
# maximum number of events per transaction (events are delivered in batches of up to 100)
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
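The netcat source configured above simply listens on a TCP port, treats each newline-terminated line as one event, and acknowledges it with "OK". A minimal Python sketch of that behavior (this is illustrative, not Flume code; it uses localhost and an ephemeral port as stand-ins for h101:44444):

```python
# Sketch of what the netcat source does: accept a connection, read
# newline-delimited lines as events, and ack each one with "OK".
import socket
import threading

def netcat_like_server(sock, received):
    conn, _ = sock.accept()
    with conn, conn.makefile("rb") as f:
        for line in f:
            received.append(line.rstrip(b"\r\n").decode())
            conn.sendall(b"OK\r\n")   # the netcat source acks each event

server = socket.socket()
server.bind(("127.0.0.1", 0))         # ephemeral port stands in for 44444
server.listen(1)
port = server.getsockname()[1]
received = []
t = threading.Thread(target=netcat_like_server, args=(server, received))
t.start()

# Client side, playing the role of `telnet h101 44444`
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"hello world!\r\n")
ack = client.recv(16)
client.close()
t.join()
server.close()

print(received[0])          # hello world!
print(ack.decode().strip()) # OK
```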


3. Run Flume
[hadoop@h101 flume-ng-1.2.0-cdh3u5]$ bin/flume-ng agent --conf /usr/local/flume-ng-1.2.0-cdh3u5/conf/ --conf-file conf/flume-conf.properties --name a1 -Dflume.root.logger=INFO,console

***********
-Dflume.root.logger=INFO,console is for debugging only; do not copy it blindly into production, or large volumes of log output will be written back to the terminal.
INFO-level messages sent to the console add up quickly; at production data volumes this is far too much output.
-c/--conf takes the configuration directory, -f/--conf-file takes the specific configuration file, and -n/--name sets the agent name.
***********

4. In another terminal:
[hadoop@h101 ~]$ telnet h101 44444
Type: hello world!
Press Enter.

5. The result appears in the terminal running bin/flume-ng:
…NetcatSource.start(NetcatSource.java:…)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/192.168.8.101:44444]

2015-10-14 12:12:08,114 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D hello world!. }
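The logger sink prints each event body as a row of hex bytes followed by a text preview. Decoding the hex from the line above recovers exactly what was typed into telnet (the trailing 0D is the carriage return telnet appends):

```python
# Decode the event body printed by the logger sink back into text.
hex_body = "68 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D"
body = bytes.fromhex(hex_body.replace(" ", ""))
print(repr(body.decode("ascii")))  # 'hello world!\r'
```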

===================================================
[hadoop@h101 flume-ng-1.2.0-cdh3u5]$ cd conf
[hadoop@h101 conf]$ vi aa.conf

Single node, writing to HDFS

agent1.channels = ch1
agent1.sources = avro-source1
agent1.sinks = log-sink1
# Define a memory channel called ch1 on agent1
# channel type: memory
agent1.channels.ch1.type = memory
# maximum number of events the channel can hold
agent1.channels.ch1.capacity = 1000000
# maximum number of events per transaction
agent1.channels.ch1.transactionCapacity = 1000000
# seconds to wait on a full/empty channel before timing out (30 s)
agent1.channels.ch1.keep-alive = 30
# define source: monitor a file
# source type: exec (runs a command and reads its output)
agent1.sources.avro-source1.type = exec
# run the command through a bash shell
agent1.sources.avro-source1.shell = /bin/bash -c
# the file to tail: /etc/httpd/logs/access_log
agent1.sources.avro-source1.command = tail -n +0 -F /etc/httpd/logs/access_log
# attach the source to channel ch1
agent1.sources.avro-source1.channels = ch1
# number of threads for the source
agent1.sources.avro-source1.threads = 5
# Define an HDFS sink and connect it
# to the other end of the same channel.
agent1.sinks.log-sink1.channel = ch1
# sink type: hdfs
agent1.sinks.log-sink1.type = hdfs
# destination path in HDFS
agent1.sinks.log-sink1.hdfs.path = hdfs://192.168.8.101:9000/user/hadoop/flumeTest
# write format: plain text
agent1.sinks.log-sink1.hdfs.writeFormat = Text
# file type: data stream (uncompressed)
agent1.sinks.log-sink1.hdfs.fileType = DataStream
# roll to a new file after this many seconds; 0 disables time-based rolling
agent1.sinks.log-sink1.hdfs.rollInterval = 0
# roll based on file size: 1000000 bytes
agent1.sinks.log-sink1.hdfs.rollSize = 1000000
# roll after this many events; 0 disables count-based rolling
agent1.sinks.log-sink1.hdfs.rollCount = 0
# number of events written to HDFS per batch
agent1.sinks.log-sink1.hdfs.batchSize = 1000
# maximum number of events per transaction
agent1.sinks.log-sink1.hdfs.txnEventMax = 1000
# timeout for HDFS operations, in milliseconds (60000 ms = 60 s)
agent1.sinks.log-sink1.hdfs.callTimeout = 60000
# timeout for append operations, in milliseconds
agent1.sinks.log-sink1.hdfs.appendTimeout = 60000
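The three roll* settings above interact as "roll when any non-zero threshold is reached; 0 disables that trigger", so with this config only rollSize (1000000 bytes) is active. A minimal Python sketch of that decision logic (illustrative only, not Flume's actual implementation):

```python
# Sketch of the HDFS sink's roll decision: a file is rolled when any
# enabled (non-zero) threshold is reached. Defaults mirror the config above.
def should_roll(bytes_written, event_count, seconds_open,
                roll_size=1000000, roll_count=0, roll_interval=0):
    if roll_size and bytes_written >= roll_size:
        return True
    if roll_count and event_count >= roll_count:
        return True
    if roll_interval and seconds_open >= roll_interval:
        return True
    return False

print(should_roll(999999, 50000, 3600))  # False: only rollSize is enabled, not yet reached
print(should_roll(1000000, 1, 1))        # True: rollSize threshold reached
```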

[Note:]
[hadoop@h101 conf]$ cat /etc/httpd/logs/access_log
If this fails with a permission error, grant access:
[root@h101 ~]# chmod -R 777 /etc/httpd
[root@h101 ~]# ls -l /etc/httpd/logs/
-rw-r--r-- 1 root root 42833 May 9 20:50 access_log
-rw-r--r-- 1 root root 5213 May 9 20:50 error_log

If the files still show the old permissions, the chmod did not take effect:
[root@h101 httpd]# cd ..
[root@h101 etc]# ls -l httpd
lrwxrwxrwx 1 root root 19 May 9 17:44 logs -> ../../var/log/httpd
logs is a symbolic link, so its target directory must be granted access as well:
[root@h101 ~]# chmod -R 777 /var/log/httpd
[root@h101 ~]# ls -l /etc/httpd/logs
-rwxrwxrwx 1 root root 13557 May 9 19:35 aaa.txt
-rwxrwxrwx 1 root root 42833 May 9 20:50 access_log
-rwxrwxrwx 1 root root 5213 May 9 20:50 error_log
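The reason the first chmod missed the log files is that recursive tools do not descend into symlinked directories, so the real /var/log/httpd target must be chmod-ed separately. A small Python demonstration of this behavior, using temp-directory stand-ins for the real paths:

```python
# Demonstrate why `chmod -R 777 /etc/httpd` leaves the real log files
# untouched when logs is a symlink: recursion does not follow symlinks.
import os
import stat
import tempfile

base = tempfile.mkdtemp()
real_logs = os.path.join(base, "var_log_httpd")   # stands in for /var/log/httpd
httpd_dir = os.path.join(base, "etc_httpd")       # stands in for /etc/httpd
os.makedirs(real_logs)
os.makedirs(httpd_dir)
log_file = os.path.join(real_logs, "access_log")
open(log_file, "w").close()
os.chmod(log_file, 0o600)                         # initially owner-only
os.symlink(real_logs, os.path.join(httpd_dir, "logs"))

def chmod_recursive(path, mode):
    """Like chmod -R: walk the tree, ignoring symlinks it encounters."""
    for root, dirs, files in os.walk(path):       # os.walk does not follow symlinks
        for name in dirs + files:
            p = os.path.join(root, name)
            if not os.path.islink(p):
                os.chmod(p, mode)

chmod_recursive(httpd_dir, 0o777)                 # like: chmod -R 777 /etc/httpd
mode_after = stat.S_IMODE(os.stat(log_file).st_mode)
print(oct(mode_after))                            # 0o600 -- symlink target untouched

chmod_recursive(real_logs, 0o777)                 # like: chmod -R 777 /var/log/httpd
mode_fixed = stat.S_IMODE(os.stat(log_file).st_mode)
print(oct(mode_fixed))                            # 0o777
```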

[Start]
bin/flume-ng agent --conf /usr/local/flume-ng-1.2.0-cdh3u5/conf/ -f conf/aa.conf -n agent1 -Dflume.root.logger=INFO,console


