[Filebeat 6.1] Configuring Filebeat > Set up prospectors (Part 1): configuration file walkthrough


Set up prospectors
   Filebeat modules provide the fastest getting started experience for common log formats. See Quick start for common log formats to learn how to get started with modules. Also see Specify which modules to run for information about enabling and configuring modules.


Filebeat uses prospectors to locate and process files. To configure Filebeat, you specify a list of prospectors in the filebeat.prospectors section of the filebeat.yml config file.

Each item in the list begins with a dash (-) and specifies prospector-specific configuration options, including the list of paths that are crawled to locate the files.


Here is a sample configuration:

filebeat.prospectors:
- type: log
  paths:
    - /var/log/apache/httpd-*.log

- type: log
  paths:
    - /var/log/messages
    - /var/log/*.log

###Configuration options

####type

One of the following input types:

  • log: Reads every line of the log file (default).
  • stdin: Reads from standard input.
  • redis: Reads slow log entries from Redis (experimental).
  • udp: Reads events over UDP. Also see max_message_size.
  • docker: Reads logs from Docker. Also see containers (experimental).

The value that you specify here is used as the type for each event published to Logstash and Elasticsearch.


####paths

A list of glob-based paths that should be crawled and fetched. All patterns supported by Golang Glob are also supported here. For example, to fetch all files from a predefined level of subdirectories, the following pattern can be used: /var/log/*/*.log. This fetches all .log files from the subfolders of /var/log. It does not fetch log files from the /var/log folder itself. It is possible to recursively fetch all files in all subdirectories of a directory using the optional recursive_glob settings.

Filebeat starts a harvester for each file that it finds under the specified paths. You can specify one path per line. Each line begins with a dash (-).
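
As a rough sketch of the pattern described above, the following prospector collects .log files exactly one directory level below /var/log and skips files in /var/log itself. The file names in the comment are hypothetical and only illustrate what would and would not match.

```yaml
filebeat.prospectors:
- type: log
  paths:
    # Matches e.g. /var/log/nginx/access.log, but not /var/log/syslog.log
    - /var/log/*/*.log
```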

####recursive_glob.enabled

Enable expanding ** into recursive glob patterns. With this feature enabled, the rightmost ** in each path is expanded into a fixed number of glob patterns. For example: /foo/** expands to /foo, /foo/*, /foo/*/*, and so on. If enabled, it expands a single ** into an 8-level deep * pattern.

This feature is enabled by default. Set recursive_glob.enabled to false to disable it.
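
A minimal sketch of how ** might be used in a prospector, reusing the /foo path from the explanation above. The commented-out line shows where the expansion could be switched off; whether you need that depends on your directory layout.

```yaml
filebeat.prospectors:
- type: log
  paths:
    # With expansion enabled (the default), this covers /foo, /foo/*, /foo/*/*, and so on
    - /foo/**
  # Uncomment to turn the ** expansion off for this prospector
  #recursive_glob.enabled: false
```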


####encoding

The file encoding to use for reading files that contain international characters. See the encoding names recommended by the W3C for use in HTML5.

Here are some sample encodings from the W3C recommendation:

  • plain, latin1, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, gbk, hz-gb-2312, euc-kr, euc-jp, iso-2022-jp, shift-jis, and so on

The plain encoding is special, because it does not validate or transform any input.
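
For illustration only, a prospector that reads GBK-encoded files could look like the sketch below. The path and the choice of gbk are assumptions; use whichever encoding your application actually writes.

```yaml
filebeat.prospectors:
- type: log
  paths:
    - /var/log/myapp/*.log
  # Assumes the files are written in GBK
  encoding: gbk
```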



####exclude_lines

A list of regular expressions to match the lines that you want Filebeat to exclude. Filebeat drops any lines that match a regular expression in the list. By default, no lines are dropped.

If multiline is also specified, each multiline message is combined into a single line before the lines are filtered by exclude_lines.

The following example configures Filebeat to drop any lines that start with "DBG":

- paths:
    - /var/log/myapp/*.log
  exclude_lines: ['^DBG']

See Regular expression support for a list of supported regexp patterns.


####include_lines

A list of regular expressions to match the lines that you want Filebeat to include. Filebeat exports only the lines that match a regular expression in the list. By default, all lines are exported.

If multiline is also specified, each multiline message is combined into a single line before the lines are filtered by include_lines.

The following example configures Filebeat to export any lines that start with “ERR” or “WARN”:

- paths:
    - /var/log/myapp/*.log
  include_lines: ['^ERR', '^WARN']

If both include_lines and exclude_lines are defined, Filebeat executes include_lines first and then executes exclude_lines. The order in which the two options are defined doesn't matter: include_lines is always executed before exclude_lines, even if exclude_lines appears before include_lines in the config file.
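
To make the ordering concrete, here is a small sketch that combines both options. The path and the 'timeout' pattern are made up for the example. Even though exclude_lines is written first, a line is kept only if it first matches include_lines and then does not match exclude_lines.

```yaml
filebeat.prospectors:
- type: log
  paths:
    - /var/log/myapp/*.log
  # Listed first, but still applied second: drop surviving lines that mention "timeout"
  exclude_lines: ['timeout']
  # Applied first: keep only lines that start with ERR or WARN
  include_lines: ['^ERR', '^WARN']
```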



####exclude_files

A list of regular expressions to match the files that you want Filebeat to ignore. By default, no files are excluded.

The following example configures Filebeat to ignore all the files that have a gz extension:

  exclude_files: ['\.gz$']

See Regular expression support for a list of supported regexp patterns.


####tags

A list of tags that the Beat includes in the tags field of each published event. Tags make it easy to select specific events in Kibana or apply conditional filtering in Logstash. These tags will be appended to the list of tags specified in the general configuration.

Example:


- paths: ["/var/log/app/*.json"]
  tags: ["json"]



####fields

Optional fields that you can specify to add additional information to the output. For example, you might add fields that you can use for filtering log data. Fields can be scalar values, arrays, dictionaries, or any nested combination of these. By default, the fields that you specify here will be grouped under a fields sub-dictionary in the output document. To store the custom fields as top-level fields, set the fields_under_root option to true. If a duplicate field is declared in the general configuration, then its value will be overwritten by the value declared here.

Example:


- paths: ["/var/log/app/*.log"]
  fields:
    app_id: query_engine_12

####fields_under_root

If this option is set to true, the custom fields are stored as top-level fields in the output document instead of being grouped under a fields sub-dictionary. If the custom field names conflict with other field names added by Filebeat, then the custom fields overwrite the other fields.
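
A short sketch of the two options used together, reusing the app_id field from the fields example. With fields_under_root set to true, the event would carry app_id at the top level rather than as fields.app_id.

```yaml
filebeat.prospectors:
- type: log
  paths: ["/var/log/app/*.log"]
  fields:
    app_id: query_engine_12
  # Publish app_id as a top-level field instead of nesting it under "fields"
  fields_under_root: true
```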



####processors

A list of processors to apply to the data generated by the prospector. See Filter and enhance the exported data for information about specifying processors in your config.
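
As one possible illustration, the sketch below attaches a drop_event processor to a single prospector. The path and the "^DBG:" pattern are assumptions chosen for the example; check the processors documentation for the conditions available in your version.

```yaml
filebeat.prospectors:
- type: log
  paths:
    - /var/log/myapp/*.log
  processors:
    # Drop the whole event when the message field starts with "DBG:"
    - drop_event:
        when:
          regexp:
            message: "^DBG:"
```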


####ignore_older

If this option is enabled, Filebeat ignores any files that were modified before the specified timespan. Configuring ignore_older can be especially useful if you keep log files for a long time. For example, if you want to start Filebeat, but only want to send the newest files and files from last week, you can configure this option.


You can use time strings like 2h (2 hours) and 5m (5 minutes). The default is 0, which disables the setting. Commenting out the config has the same effect as setting it to 0.


You must set ignore_older to be greater than close_inactive.

The files affected by this setting fall into two categories:

  • Files that were never harvested
  • Files that were harvested but weren’t updated for longer than ignore_older

For files that were never seen before, the offset state is set to the end of the file. If a state already exists, the offset is not changed. If a file is updated again later, reading continues at the set offset position.


The ignore_older setting relies on the modification time of the file to determine if a file is ignored. If the modification time of the file is not updated when lines are written to a file (which can happen on Windows), the ignore_older setting may cause Filebeat to ignore files even though content was added at a later time.


To remove the state of previously harvested files from the registry file, use the clean_inactive configuration option.

Before a file can be ignored by the prospector, it must be closed. To ensure a file is no longer being harvested when it is ignored, you must set ignore_older to a longer duration than close_inactive.

If a file that’s currently being harvested falls under ignore_older, the harvester will first finish reading the file and close it after close_inactive is reached. Then, after that, the file will be ignored.
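
A minimal sketch of the "only the last week" scenario, assuming a hypothetical application log path. The value is written in hours (168h is seven days), and close_inactive is kept well below ignore_older as required above.

```yaml
filebeat.prospectors:
- type: log
  paths:
    - /var/log/myapp/*.log
  # Ignore files whose modification time is more than 7 days (168 hours) old
  ignore_older: 168h
  # Must be shorter than ignore_older; 5m is the default
  close_inactive: 5m
```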


####close_*

The close_* configuration options are used to close the harvester after a certain criterion or time. Closing the harvester means closing the file handler. If a file is updated after the harvester is closed, the file will be picked up again after scan_frequency has elapsed. However, if the file is moved or deleted while the harvester is closed, Filebeat will not be able to pick up the file again, and any data that the harvester hasn't read will be lost.


####close_inactive

When this option is enabled, Filebeat closes the file handle if a file has not been harvested for the specified duration. The counter for the defined period starts when the last log line was read by the harvester. It is not based on the modification time of the file. If the closed file changes again, a new harvester is started and the latest changes will be picked up after scan_frequency has elapsed.

We recommend that you set close_inactive to a value that is larger than the least frequent updates to your log files. For example, if your log files get updated every few seconds, you can safely set close_inactive to 1m. If there are log files with very different update rates, you can use multiple prospector configurations with different values.

Setting close_inactive to a lower value means that file handles are closed sooner. However, this has the side effect that new log lines are not sent in near real time if the harvester is closed.


The timestamp for closing a file does not depend on the modification time of the file. Instead, Filebeat uses an internal timestamp that reflects when the file was last harvested. For example, if close_inactive is set to 5 minutes, the countdown for the 5 minutes starts after the harvester reads the last line of the file.


You can use time strings like 2h (2 hours) and 5m (5 minutes). The default is 5m.
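
The advice about files with very different update rates could be sketched as two prospectors with different close_inactive values. The directory names and the 2h value are assumptions for illustration, not recommendations.

```yaml
filebeat.prospectors:
# Files updated every few seconds: handles can be released quickly
- type: log
  paths:
    - /var/log/busy-app/*.log
  close_inactive: 1m

# Files updated only occasionally: keep the harvester open longer
- type: log
  paths:
    - /var/log/batch-job/*.log
  close_inactive: 2h
```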


####close_renamed

Only use this option if you understand that data loss is a potential side effect.

When this option is enabled, Filebeat closes the file handler when a file is renamed. This happens, for example, when rotating files. By default, the harvester stays open and keeps reading the file because the file handler does not depend on the file name. If the close_renamed option is enabled and the file is renamed or moved in such a way that it’s no longer matched by the file patterns specified for the prospector, the file will not be picked up again. Filebeat will not finish reading the file.


WINDOWS: If your Windows log rotation system shows errors because it can’t rotate the files, you should enable this option.
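
For the Windows rotation case, a prospector with this option enabled might look like the sketch below. The path is hypothetical, and the data-loss caveat above still applies.

```yaml
filebeat.prospectors:
- type: log
  paths:
    - 'C:\logs\myapp\*.log'
  # Release the file handle as soon as the file is renamed (unread data may be lost)
  close_renamed: true
```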



