Flume conf files

Source introduction from the official Flume website

spooldir

This source will watch the specified directory for new files, and will parse events out of new files as they appear.

After a given file has been fully read into the channel, it is renamed to indicate completion (or optionally deleted).

Unlike the Exec source, this source is reliable and will not miss data, even if Flume is restarted or killed. In exchange for this reliability, only immutable, uniquely-named files must be dropped into the spooling directory. Flume tries to detect these problem conditions and will fail loudly if they are violated:

  1. If a file is written to after being placed into the spooling directory, Flume will print an error to its log file and stop processing.
  2. If a file name is reused at a later time, Flume will print an error to its log file and stop processing.

To avoid the above issues, it may be useful to add a unique identifier (such as a timestamp) to log file names when they are moved into the spooling directory.

Despite the reliability guarantees of this source, there are still cases in which events may be duplicated if certain downstream failures occur. This is consistent with the guarantees offered by other Flume components.

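spooldir watches a directory. When a new file appears, its contents are parsed into events and put into the channel. Once a file has been fully read into the channel, it is either renamed or deleted.

spooldir is very reliable: no data is lost even if Flume is restarted or killed.

Files placed in the watched directory must be immutable and uniquely named. Otherwise Flume logs an error and stops processing, for example when a file is written to after being placed in the directory, or when a file name is reused.

If the flow is interrupted by a downstream failure, some events may be delivered more than once.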
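
The unique-name rule is easiest to satisfy at the point where files are handed over to Flume. Below is a minimal Python sketch, not part of Flume itself: the helper name `spool_file` and the naming scheme (a timestamp plus a short random suffix) are illustrative assumptions. It renames a finished file into the spooling directory in a single step, so the file should never be visible there in a half-written state.

```python
import os
import time
import uuid
from pathlib import Path


def spool_file(src: Path, spool_dir: Path) -> Path:
    """Move a finished file into spool_dir under a unique, timestamped name.

    Hypothetical helper for illustration; it is not a Flume API.
    """
    stamp = time.strftime("%Y%m%d%H%M%S")
    # A short random suffix keeps names distinct even within the same second.
    token = uuid.uuid4().hex[:8]
    target = spool_dir / f"{src.stem}.{stamp}.{token}{src.suffix}"
    # os.replace is a rename; it may fail if src and target are on
    # different filesystems, so keep both on the same one.
    os.replace(src, target)
    return target
```

Calling `spool_file(Path("app.log"), Path("/data/disk/flume"))` would move `app.log` to a name of the form `app.<timestamp>.<random>.log` inside the watched directory. The file must be completely written before the call, because it must not change once Flume can see it.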

Some commonly used parameters:

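# Directory to watch
a1.sources.r1.spoolDir = /data/disk/flume
# Files whose names match this regular expression are skipped and never put into the channel
a1.sources.r1.ignorePattern = <regular expression>
# Files whose names match this regular expression are put into the channel;
# a file that matches both ignorePattern and includePattern is skipped
a1.sources.r1.includePattern = <regular expression>
# What to do when a file contains a character that cannot be decoded.
# FAIL: throw an exception; REPLACE: substitute the replacement character (typically U+FFFD); IGNORE: drop it
a1.sources.r1.decodeErrorPolicy = IGNORE
# Character set used to decode the input files; default is UTF-8
a1.sources.r1.inputCharset = latin1
# Number of events transferred to the channel per batch; default is 100
a1.sources.r1.batchSize = 5000
# Whether to delete a file once it has been fully transferred. never: keep it; immediate: delete it
a1.sources.r1.deletePolicy = immediate
# Suffix appended to the name of a fully transferred file
a1.sources.r1.fileSuffix = .COMPLETED
# Upper limit (in milliseconds) on the wait between attempts to write to a full channel.
# The wait starts low and grows exponentially each time the channel throws an exception, up to this limit
a1.sources.r1.maxBackoff = 10000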
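
The interaction between includePattern and ignorePattern is easy to get wrong, so here is a small Python sketch of the rule described above: a file is picked up only if it matches includePattern and does not match ignorePattern. The function name `is_ingested` is hypothetical. The default patterns, and the choice to match against the whole file name, are assumptions about how Flume behaves, and Java's regex dialect differs from Python's in some details, so treat this as an illustration rather than a reproduction of Flume's logic.

```python
import re


def is_ingested(
    file_name: str,
    include_pattern: str = r"^.*$",  # assumed default: include everything
    ignore_pattern: str = r"^$",     # assumed default: ignore nothing
) -> bool:
    """Return True if a file with this name would be put into the channel."""
    included = re.fullmatch(include_pattern, file_name) is not None
    ignored = re.fullmatch(ignore_pattern, file_name) is not None
    # ignorePattern wins when a name matches both patterns.
    return included and not ignored
```

With `include_pattern=r".*\.log"` and `ignore_pattern=r"debug.*"`, the name `app.log` is accepted, while `debug.log` matches both patterns and is therefore skipped.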

Sink introduction from the official Flume website
