Shipping JSON Log Files to Elasticsearch with Filebeat
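Filebeat is a lightweight shipper for forwarding and centralizing log data. With a small amount of configuration it can be attached to any application that writes its logs to files.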
Test environment:
- CentOS 7, kernel 3.10
- Filebeat: 7.12.1
- Elasticsearch: 7.12.0
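Using log files as the Filebeat input
Filebeat uses YAML for its configuration file. Several log paths can be listed, and a path may span multiple directory levels; the supported patterns follow Go Glob.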
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /tmp/data.log
  ### JSON configuration
  json.keys_under_root: true
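As a rough mental model of json.keys_under_root, the Python sketch below decodes one log line and places the keys either under a json key (the default) or at the top level of the event. It is a simplification for illustration only, not Filebeat's implementation; the build_event helper, the stand-in metadata, and the sample fields other than search_request_path are hypothetical.

```python
import json


def build_event(line, keys_under_root):
    """Simplified model of where the decoded JSON keys end up in the event."""
    # Stand-in for the metadata Filebeat attaches on its own (hypothetical subset).
    event = {"input": {"type": "log"}}
    decoded = json.loads(line)
    if keys_under_root:
        # Keys are copied to the top level; in this model existing keys win.
        for key, value in decoded.items():
            event.setdefault(key, value)
    else:
        # Default behaviour: the decoded object sits under a "json" key.
        event["json"] = decoded
    return event


sample_line = '{"search_request_path": "/api/v1/items", "status": 200}'
print(build_event(sample_line, keys_under_root=True))
print(build_event(sample_line, keys_under_root=False))
```

With keys_under_root enabled, fields such as search_request_path can be mapped directly in the index template instead of under a json prefix.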
Using Elasticsearch as the Filebeat output
- Configure the Elasticsearch address:
output.elasticsearch:
  enabled: true
  hosts: ["localhost:9200"]
- Configure the index template:
setup.template.enabled: true
# Select the kind of index template. Available options: legacy, component, index.
setup.template.type: index
# Enable JSON template loading.
setup.template.json.enabled: true
setup.template.json.path: "${path.config}/es_template.json"
setup.template.json.name: "envoy_access_template_ID"
# Overwrite existing template
setup.template.overwrite: true
The specified JSON file is used as the index template, for example:
{
  "index_patterns": "envoy_access_log_ID*",
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "dmp-als-ilm-policy_ID",
          "rollover_alias": "envoy_access_log_ID"
        },
        "refresh_interval": "5s",
        "number_of_shards": "3",
        "number_of_replicas": "0"
      }
    },
    "mappings": {
      "properties": {
        "search_request_path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": "256"
            }
          }
        }
      }
    }
  }
}
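Because the template only applies to indices whose names match index_patterns, it can help to check that the rollover alias and the pattern agree before loading the file. The following Python sketch does this under stated assumptions: check_template is a hypothetical helper, the "-000001" suffix is only a stand-in for whatever name the first rollover index receives, and fnmatch is used as an approximation of Elasticsearch wildcard matching.

```python
import fnmatch
import json


def check_template(template):
    """Return a list of problems found in an index template dict."""
    problems = []
    patterns = template.get("index_patterns", [])
    if isinstance(patterns, str):
        patterns = [patterns]
    index_settings = template.get("template", {}).get("settings", {}).get("index", {})
    alias = index_settings.get("lifecycle", {}).get("rollover_alias")
    if alias is None:
        problems.append("index.lifecycle.rollover_alias is not set")
        return problems
    # Assumed shape of an index name created behind the alias.
    candidate = alias + "-000001"
    if not any(fnmatch.fnmatchcase(candidate, pattern) for pattern in patterns):
        problems.append("no index pattern matches " + candidate)
    return problems


template_text = """
{
  "index_patterns": "envoy_access_log_ID*",
  "template": {"settings": {"index": {"lifecycle": {
    "name": "dmp-als-ilm-policy_ID",
    "rollover_alias": "envoy_access_log_ID"}}}}
}
"""
print(check_template(json.loads(template_text)))
```

An empty list means the pattern covers indices created behind the alias; a mismatch would leave new indices without the intended settings and mappings.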
- Configure index lifecycle management (ILM):
setup.ilm.enabled: true
# Set the prefix used in the index lifecycle write alias name.
setup.ilm.rollover_alias: 'envoy_access_log_ID'
# Set the lifecycle policy name.
setup.ilm.policy_name: "dmp-als-ilm-policy_ID"
# The path to a JSON file that contains a lifecycle policy configuration. Used
# to load your own lifecycle policy.
#setup.ilm.policy_file: "${path.config}/es_policy.json"
# Disable the check for an existing lifecycle policy. The default is true. If
# you disable this check, set setup.ilm.overwrite: true so the lifecycle policy
# can be installed.
setup.ilm.check_exists: true
# Overwrite the lifecycle policy at startup.
setup.ilm.overwrite: true
You can rely on the default policy, or supply your own policy file through setup.ilm.policy_file.
- Configure filters (processors):
processors:
  - drop_fields:
      fields: ["@timestamp", "log","input","ecs","host","agent"]
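This removes some of the metadata fields that Filebeat adds to every event. Note that drop_fields never removes @timestamp or type, even when they appear in the list.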
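To make the effect of the drop_fields processor concrete, here is a small Python model of it. It is only an illustration of the behaviour for top-level fields, not Filebeat code; the drop_fields function and the sample event are hypothetical, and the protected set reflects the documented rule that @timestamp and type are never dropped.

```python
# Fields that the processor leaves in place even when they are listed.
PROTECTED_FIELDS = {"@timestamp", "type"}


def drop_fields(event, fields):
    """Return a copy of the event without the listed top-level fields."""
    return {
        key: value
        for key, value in event.items()
        if key not in fields or key in PROTECTED_FIELDS
    }


sample_event = {
    "@timestamp": "2021-05-10T08:00:00.000Z",
    "log": {"offset": 0},
    "input": {"type": "log"},
    "ecs": {"version": "1.8.0"},
    "host": {"name": "node-1"},
    "agent": {"type": "filebeat"},
    "search_request_path": "/api/v1/items",
}
to_drop = ["@timestamp", "log", "input", "ecs", "host", "agent"]
print(drop_fields(sample_event, to_drop))
```

What remains is the application's own payload plus the timestamp, which keeps the stored documents small.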
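Startup
When Elasticsearch is the output, Filebeat connects to it at startup. If the cluster does not yet have the required index template and ILM policy, Filebeat creates them automatically. If Elasticsearch already contains such resources, pushing them manually is recommended.
Recommended startup commands: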
./filebeat test config
./filebeat test output
./filebeat setup --index-management
./filebeat run -e
An error at any of these steps leads to unexpected behavior when the logs are read.
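Since each step should succeed before the next one runs, the sequence can be wrapped in a small script that stops at the first failure. The Python sketch below is one way to do that; run_steps and STEPS are hypothetical names, the commands are the ones listed above, and the runner argument exists only so the logic can be exercised without Filebeat installed.

```python
import subprocess

# The startup commands, in the order they should run.
STEPS = [
    ["./filebeat", "test", "config"],
    ["./filebeat", "test", "output"],
    ["./filebeat", "setup", "--index-management"],
    ["./filebeat", "run", "-e"],
]


def run_steps(steps, runner=subprocess.call):
    """Run each command in order; return the first failing command, or None."""
    for command in steps:
        exit_code = runner(command)
        if exit_code != 0:
            # Stop here so later steps never run against a broken setup.
            return command
    return None
```

Calling run_steps(STEPS) from the Filebeat directory runs the checks in order. The final command stays in the foreground until Filebeat is stopped, so the function only returns after that.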
Complete configuration files
Directory layout:
filebeat-7.12.1-linux-x86_64
├── es_policy.json
├── es_template.json
├── filebeat
├── filebeat.yml
├── ...
└── ...
- filebeat.yml
#=========================== Filebeat inputs =============================
filebeat.inputs:
#------------------------------ Log input --------------------------------
- type: log
  # Change to true to enable this input configuration.
  enabled: true
  paths:
    - /tmp/envoy_access.log
  ### JSON configuration
  # By default, the decoded JSON is placed under a "json" key in the output document.
  # If you enable this setting, the keys are copied top level in the output document.
  json.keys_under_root: true
#=========================== Filebeat Outputs =============================
# ------------------------------- Console Output -------------------------------
#just for test!
#output.console:
  # Boolean flag to enable or disable the output module.
  #enabled: true
  # Configure JSON encoding
  #codec.json:
    # Pretty-print JSON event
    #pretty: true
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Boolean flag to enable or disable the output module.
  enabled: true
  hosts: ["localhost:9200"]
# ================================== Template ==================================
# Set to false to disable template loading.
setup.template.enabled: true
# Select the kind of index template. From Elasticsearch 7.8, it is possible to
# use component templates. Available options: legacy, component, index.
# By default filebeat uses the legacy index templates.
setup.template.type: index
# Enable JSON template loading. If this is enabled, the fields.yml is ignored.
setup.template.json.enabled: true
# Path to the JSON template file
setup.template.json.path: "${path.config}/es_template.json"
# Name under which the template is stored in Elasticsearch
setup.template.json.name: "envoy_access_template_ID"
# Overwrite existing template
# Do not enable this option for more than one instance of filebeat as it might
# overload your Elasticsearch with too many update requests.
setup.template.overwrite: true
# ====================== Index Lifecycle Management (ILM) ======================
# Enable ILM support. Valid values are true, false, and auto. When set to auto
# (the default), the Beat uses index lifecycle management when it connects to a
# cluster that supports ILM; otherwise, it creates daily indices.
setup.ilm.enabled: true
# Set the prefix used in the index lifecycle write alias name. The default alias
# name is 'filebeat-%{[agent.version]}'.
setup.ilm.rollover_alias: 'envoy_access_log_ID'
# Set the lifecycle policy name. The default policy name is
# 'beatname'.
setup.ilm.policy_name: "dmp-als-ilm-policy_ID"
# The path to a JSON file that contains a lifecycle policy configuration. Used
# to load your own lifecycle policy.
#setup.ilm.policy_file: "${path.config}/es_policy.json"
# Disable the check for an existing lifecycle policy. The default is true. If
# you disable this check, set setup.ilm.overwrite: true so the lifecycle policy
# can be installed.
setup.ilm.check_exists: true
# Overwrite the lifecycle policy at startup. The default is false.
setup.ilm.overwrite: true
# ================================= Processors =================================
# Processors are used to reduce the number of fields in the exported event or to
# enhance the event with external metadata. This section defines a list of
# processors that are applied one by one and the first one receives the initial
# event:
#
# event -> filter1 -> event1 -> filter2 ->event2 ...
#
# The supported processors are drop_fields, drop_event, include_fields,
# decode_json_fields, and add_cloud_metadata.
processors:
  - drop_fields:
      fields: ["log","input","ecs","host","agent"]
- es_template.json
{
  "index_patterns": "envoy_access_log_ID*",
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "dmp-als-ilm-policy_ID",
          "rollover_alias": "envoy_access_log_ID"
        },
        "refresh_interval": "5s",
        "number_of_shards": "3",
        "number_of_replicas": "0"
      }
    },
    "mappings": {
      "properties": {
        "search_request_path": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": "256"
            }
          }
        }
      }
    }
  }
}
- es_policy.json
For reference only, as a template:
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_docs": 5
          }
        }
      },
      "warm": {
        "min_age": "10s",
        "actions": {
          "allocate": {
            "include": {
              "my_node_type": "warm"
            }
          }
        }
      },
      "cold": {
        "min_age": "15s",
        "actions": {
          "allocate": {
            "include": {
              "my_node_type": "cold"
            }
          }
        }
      },
      "delete": {
        "min_age": "20s",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
Log format
In the log file, each line holds exactly one JSON document.
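For an application that produces such a file, a minimal Python sketch could look like the following. The helper names, the temporary file location, and the sample fields other than search_request_path are assumptions made for illustration; the point is only that every record is serialized onto a single line.

```python
import json
import os
import tempfile


def to_json_line(record):
    """Serialize one record as a single line of JSON."""
    # json.dumps without indent emits no raw newlines; newlines inside
    # string values are escaped, so one record always stays on one line.
    return json.dumps(record, ensure_ascii=False) + "\n"


def append_record(path, record):
    """Append one record to the log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(to_json_line(record))


# Demo: write two records to a temporary file and read them back.
demo_path = os.path.join(tempfile.mkdtemp(), "envoy_access.log")
append_record(demo_path, {"search_request_path": "/api/v1/items", "status": 200})
append_record(demo_path, {"search_request_path": "/healthz", "status": 204})
with open(demo_path, encoding="utf-8") as log_file:
    for line in log_file:
        print(json.loads(line))
```

Pretty-printed JSON that spans several lines would not fit this format, because each line is decoded on its own.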