Flume 1.7.0: Taildir Source & Logger Sink

Flume 1.7.0 TAILDIR Source

Introduction to Taildir Source

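Taildir Source watches the specified files and tails them in near real time as soon as new lines are appended to any of them. If new lines are still being written, the source waits for the write to complete and then retries reading them.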

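Taildir Source is reliable: it does not lose data even when the tailed files are rotated. It periodically writes the last read position of each file to a position file in JSON format, so if Flume is stopped or goes down, it resumes from the recorded position of each file after a restart.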
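To make the resume behaviour concrete, here is a minimal Python sketch of how a reader could look up where to continue. It is not Flume's implementation. It assumes the position file is a JSON array of objects with `inode`, `pos`, and `file` keys; the helper names `load_positions` and `resume_offset` are hypothetical.

```python
import json
from pathlib import Path


def load_positions(position_file):
    """Return {absolute_path: byte_offset} from a Taildir-style position file.

    Assumed layout (not taken from the article):
    [{"inode": 123, "pos": 42, "file": "/var/log/test1/example.log"}, ...]
    """
    path = Path(position_file)
    if not path.exists():
        # No position file yet: nothing has been recorded.
        return {}
    entries = json.loads(path.read_text(encoding="utf-8"))
    return {entry["file"]: entry["pos"] for entry in entries}


def resume_offset(positions, file_path):
    """Byte offset to resume from; 0 (start of file) when the file is unknown."""
    return positions.get(file_path, 0)
```

A file that has no entry falls back to offset 0, which mirrors the default of starting from the first line when nothing has been recorded.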

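In other scenarios, Taildir Source can also start reading each file from an arbitrary position by using a given position file. When no position file exists at the specified path, it starts tailing each file from the first line by default.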

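Files are consumed in order of modification time, with the least recently modified file consumed first. Taildir Source never renames, deletes, or otherwise modifies the files it tails. It currently does not support tailing binary files; it reads text files line by line.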
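The ordering rule can be illustrated with a short Python sketch. It only mimics the "oldest modification time first" behaviour described above and is not Flume code; the function name `consumption_order` is hypothetical.

```python
import os


def consumption_order(paths):
    """Sort file paths so the least recently modified file comes first."""
    return sorted(paths, key=os.path.getmtime)
```

For example, `consumption_order(["/var/log/test2/a.log", "/var/log/test2/b.log"])` would list whichever of the two files was modified earlier first.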

Common Taildir Source configuration properties

Properties marked (required) must be set in the agent's xxxx.conf file.

| Property Name | Default | Description |
|---|---|---|
| channels (required) | - | The channels this source writes to. |
| type (required) | - | The component type name; must be TAILDIR. |
| filegroups (required) | - | Space-separated list of file groups. Each file group denotes a set of files to be tailed. |
| filegroups.<filegroupName> (required) | - | Absolute path of the file group. Regular expressions (not file system patterns) can be used for the filename only. |
| positionFile | ~/.flume/taildir_position.json | JSON file that records the inode, the absolute path, and the last read position of each tailed file. |
| headers.<filegroupName>.<headerKey> | - | Header value to set for the given header key. Multiple headers can be specified for one file group. |
| byteOffsetHeader | false | Whether to add the byte offset of a tailed line to a header called 'byteoffset'. |
| skipToEnd | false | Whether to skip to the end of the file for files that have no entry in the position file. |
| idleTimeout | 120000 | Time (ms) after which an inactive file is closed. If new lines are appended to a closed file, the source reopens it automatically. |
| writePosInterval | 3000 | Interval (ms) at which the last position of each file is written to the position file. |
| batchSize | 100 | Maximum number of lines to read and send to the channel at a time. The default is usually fine. |
| backoffSleepIncrement | 1000 | Increment (ms) of the delay before polling for new data again when the last attempt found none. |
| maxBackoffSleep | 5000 | Maximum delay (ms) between polling attempts when the last attempt found no new data. |
| cachePatternMatching | true | Listing directories and applying the filename regex can be time consuming for directories containing thousands of files. Caching the list of matching files can improve performance. The order in which files are consumed is also cached. Requires that the file system tracks modification times with at least 1-second granularity. |
| fileHeader | false | Whether to add a header storing the absolute path of the file. |
| fileHeaderKey | file | Header key to use when adding the absolute path of the file to the event header. |
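How backoffSleepIncrement and maxBackoffSleep interact can be sketched in a few lines of Python. This assumes the delay grows linearly with the number of consecutive empty polls and is capped at the maximum; that reading of the two properties is an assumption, and the function name `backoff_delay_ms` is hypothetical.

```python
def backoff_delay_ms(consecutive_empty_polls, increment_ms=1000, max_ms=5000):
    """Delay before the next poll, assuming linear growth capped at max_ms.

    The defaults mirror backoffSleepIncrement and maxBackoffSleep above.
    """
    return min(consecutive_empty_polls * increment_ms, max_ms)


# Delays for the first seven consecutive empty polls, under that assumption.
delays = [backoff_delay_ms(n) for n in range(1, 8)]
```

Under this assumption and the default values, an idle source would never wait longer than 5 seconds between polls.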

Example: Taildir Source & Logger Sink

# 1. Start the agent in the background
nohup bin/flume-ng agent -n a1 -c conf -f conf/flume-conf.properties &


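# 2. Configure taildir_to_console.conf
# Name the components of agent a1
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Configure the source: tail two file groups and record read positions in test.json
a1.sources.r1.type = TAILDIR
a1.sources.r1.positionFile = /Users/shufang/program_files/flume-1.7.0/test.json
a1.sources.r1.filegroups = f1 f2
# f1: a single file
a1.sources.r1.filegroups.f1 = /var/log/test1/example.log
a1.sources.r1.headers.f1.headerKey1 = value1
# f2: every file in /var/log/test2 whose name matches the regex .*log.*
a1.sources.r1.filegroups.f2 = /var/log/test2/.*log.*
a1.sources.r1.headers.f2.headerKey1 = value2
a1.sources.r1.headers.f2.headerKey2 = value2-2
# Add the absolute path of the tailed file to each event's headers
a1.sources.r1.fileHeader = true

# Configure the sink
a1.sinks.k1.type = logger

# Configure the channel, which buffers the events received from the source
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and the sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1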
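The f2 file group uses a regular expression for the filename part only. The Python sketch below illustrates that idea: the directory is taken literally and the regex is applied to each file name. It assumes the whole name must match the pattern, which may differ from Flume's exact matching rules; the function name `match_filegroup` is hypothetical.

```python
import os
import re


def match_filegroup(pattern_path):
    """Return files in a directory whose names fully match a filename regex.

    pattern_path looks like "/var/log/test2/.*log.*": the directory part is
    literal and only the last path component is treated as a regex.
    """
    directory, filename_regex = os.path.split(pattern_path)
    regex = re.compile(filename_regex)
    matches = []
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        # Skipping directories is a choice made for this sketch.
        if regex.fullmatch(name) and os.path.isfile(full_path):
            matches.append(full_path)
    return sorted(matches)
```

With this sketch, `.*log.*` would pick up names such as `app.log` and `app.log.1` but not `notes.txt`.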


# 3. Start collecting
bin/flume-ng agent \
--conf conf \
--conf-file jobs/taildir_to_console.conf \
--name a1 \
-Dflume.root.logger=INFO,console