tail (Input Plugin)
http://docs.fluentd.org/articles/in_tail
tail Input Plugin
The in_tail Input plugin allows Fluentd to read events from the tail of text files. Its behavior is similar to the tail -F command.
Example Configuration
in_tail is included in Fluentd’s core. No additional installation process is required.
<source>
  type tail
  path /var/log/httpd-access.log
  pos_file /var/log/td-agent/httpd-access.log.pos
  tag apache.access
  format apache2
</source>
Please see the Config File article for the basic structure and syntax of the configuration file.
How it Works
- When Fluentd is first configured with in_tail, it will start reading from the tail of that log, not the beginning.
- Once the log is rotated, Fluentd starts reading the new file from the beginning. It keeps track of the current inode number.
- If td-agent restarts, it starts reading from the last position td-agent read before the restart. This position is recorded in the position file specified by the pos_file parameter. This is why pos_file is important and should always be set.
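The three behaviors above (start at the tail, restart from the beginning after rotation, resume from a recorded position) can be sketched in Python. This is only an illustration under stated assumptions, not Fluentd's implementation: the function name read_new_lines and the "inode offset" layout of the position file are hypothetical and do not match Fluentd's actual pos_file format.

```python
import os


def read_new_lines(log_path, pos_path):
    """Return complete lines appended to log_path since the previous call.

    The position file holds "<inode> <offset>" (a made-up layout used
    only for this sketch).
    """
    stat = os.stat(log_path)
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            inode, offset = (int(x) for x in f.read().split())
        # A different inode means the log was rotated; a smaller file
        # means it was truncated. Either way, read from the beginning.
        if inode != stat.st_ino or offset > stat.st_size:
            offset = 0
    else:
        # First run: start at the tail, so existing content is skipped.
        offset = stat.st_size

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume only complete lines; a partial last line is left for later.
    end = data.rfind(b"\n") + 1
    with open(pos_path, "w") as f:
        f.write(f"{stat.st_ino} {offset + end}")
    return data[:end].decode("utf-8", errors="replace").splitlines()
```

Calling read_new_lines repeatedly with the same position file path returns only what was written since the previous call, which is why losing the position file means losing the resume point.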
Parameters
type (required)
The value must be tail.
path (required)
The paths to read. Multiple paths can be specified, separated by ‘,’. This means several log files can be collected at once without defining another source.
tag (required)
The tag of the event.
format (required)
The format of the log. It is the name of a template or regexp surrounded by ‘/’.
The regexp must have at least one named capture (?<NAME>PATTERN). If the regexp has a capture named ‘time’, it is used as the time of the event. You can specify the time format using the time_format parameter. If the regexp has a capture named ‘tag’, the tag parameter + the captured tag is used as the tag of the event.
The following templates are supported:
regexp
The regexp for the format parameter can be specified. Fluentular is a great website to test your regexp for Fluentd configuration.
apache2
Reads Apache’s log file for the following fields: host, user, time, method, path, code, size, referer and agent. This template is analogous to the following configuration:
format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
time_format %d/%b/%Y:%H:%M:%S %z
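To see what the named captures and time_format produce, the apache2 pattern above can be tried out in Python. This is a sketch, not how Fluentd runs it: Python spells named captures as (?P<NAME>PATTERN) instead of (?<NAME>PATTERN), and the helper name parse_apache2 and the sample log line are assumptions made for illustration.

```python
import re
from datetime import datetime

# The apache2 pattern from the configuration above, with Ruby-style
# (?<name>...) captures rewritten as Python-style (?P<name>...).
APACHE2 = re.compile(
    r'^(?P<host>[^ ]*) [^ ]* (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^ ]*) +\S*)?" (?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^\"]*)" "(?P<agent>[^\"]*)")?$'
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_apache2(line):
    """Return (event_time, record), or None if the line does not match."""
    m = APACHE2.match(line)
    if m is None:
        return None
    record = m.groupdict()
    # The capture named 'time' becomes the event time; removing it from
    # the record is a choice made for this sketch.
    event_time = datetime.strptime(record.pop("time"), TIME_FORMAT)
    return event_time, record


# Illustrative access-log line (not taken from a real server).
sample = ('192.168.0.1 - - [28/Feb/2013:12:00:00 +0900] '
          '"GET / HTTP/1.1" 200 777 "-" "Opera/12.0"')
event_time, record = parse_apache2(sample)
print(event_time.isoformat(), record)
```

All captured values are strings at this stage; code and size are not converted to numbers by the pattern itself.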
syslog
Reads syslog’s output file (e.g. /var/log/syslog) for the following fields: time, host, ident, and message. This template is analogous to the following configuration:
format /^(?<time>[^ ]* [^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?[^\:]*\: *(?<message>.*)$/
time_format %b %d %H:%M:%S
tsv or csv
If you use tsv or csv format, please also specify the keys parameter.
format tsv
keys key1, key2, key3
time_key key2
If you specify the time_key parameter, it will be used to identify the timestamp of the record. By default, the time at which Fluentd reads the record is used.
format csv
keys key1, key2, key3
time_key key3
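The relationship between keys and time_key can be sketched in Python as follows. Several things here are assumptions for illustration rather than documented Fluentd behavior: the function name parse_delimited, the use of Python's csv module as a stand-in for Fluentd's own parser, treating the time_key value as Unix epoch seconds, and removing that key from the record.

```python
import csv
import io
import time


def parse_delimited(line, keys, delimiter="\t", time_key=None, now=time.time):
    """Map one tsv/csv line onto keys and pick the event timestamp."""
    values = next(csv.reader(io.StringIO(line), delimiter=delimiter))
    record = dict(zip(keys, values))
    if time_key is not None and time_key in record:
        # Assumption: the field holds Unix epoch seconds.
        event_time = int(record.pop(time_key))
    else:
        # No time_key: fall back to the time the line was read.
        event_time = int(now())
    return event_time, record


keys = ["key1", "key2", "key3"]
print(parse_delimited("a\t1362020400\tc", keys, time_key="key2"))
print(parse_delimited("a,b,1362020400", keys, delimiter=",", time_key="key3"))
```

The two calls mirror the tsv and csv configurations above: the same keys list is used, and only the delimiter and the column named by time_key differ.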
json
One JSON map per line. This is the most straightforward format :).
format json
The time_key parameter can also be specified.
format json
time_key key3
pos_file (highly recommended)
This parameter is highly recommended. Fluentd will record the position it last read into this file.
pos_file /var/log/td-agent/tmp/access.log.pos
time_format
The format of the time field. This parameter is required only if the format includes a ‘time’ capture and it cannot be parsed automatically. Please see Time#strftime for additional information.
rotate_wait
in_tail actually does a bit more than tail -F itself. When rotating a file, some data may still need to be written to the old file as opposed to the new one.
in_tail takes care of this by keeping a reference to the old file (even after it has been rotated) for some time before transitioning completely to the new file. This helps prevent data designated for the old file from getting lost. By default, this time interval is 5 seconds.
The rotate_wait parameter accepts a single integer representing the number of seconds you want this time interval to be.
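The idea behind rotate_wait can be sketched in Python: after a rotation, the handle to the old file is kept open and polled until the interval has elapsed. This is a simplified illustration, not Fluentd's implementation; the function name drain_old_file and the poll_interval, clock and sleep parameters are hypothetical.

```python
import time


def drain_old_file(old_handle, rotate_wait=5, poll_interval=0.5,
                   clock=time.monotonic, sleep=time.sleep):
    """Keep reading an already-rotated file until rotate_wait seconds pass."""
    deadline = clock() + rotate_wait
    lines = []
    while True:
        line = old_handle.readline()
        if line:
            # Late data written to the old file is still collected.
            lines.append(line.rstrip("\n"))
            continue
        if clock() >= deadline:
            break
        sleep(poll_interval)
    return lines
```

Only once this function returns would a reader drop the old handle and follow the new file exclusively. A larger rotate_wait tolerates writers that are slow to switch files, at the cost of holding the old file open longer.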
Example: a log line and a time_format that matches its timestamp up to the seconds:
[2013-03-29 07:21:55.483292] router - pid=14615 tid=7a93 fid=5354 DEBUG -- Request body: {"host":"api.vcap.me","stats":[{"response_latency":0,"request_tags":"BAh7BjoOY29tcG9uZW50SSIUQ2xvdWRDb250cm9sbGVyBjoGRVQ=","response_codes":{"responses_2xx":2},"response_samples":2}]}
time_format %Y-%m-%d %H:%M:%S