A Successful Practice: Implementing ES Log Search with Filebeat

三更闲

Last modified: 2023-02-01 21:50:45

First published: 2023-02-01 21:47:22

Notes on implementing ES log search with Filebeat

I. Problems encountered when implementing ES log search with Filebeat:

1. Version issues

The Filebeat version needs to match the version of searchcenter (presumably the Elasticsearch cluster that Filebeat ships to).

2. Merging multi-line logs

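Add the corresponding multiline settings to the configuration file. The effect is that a line matching the regular expression is merged with all the lines below it that do not match the expression.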

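  multiline:
    # Negate the pattern: with true, the lines that do NOT match the pattern are the ones merged into the adjacent matching line
    negate: true
    # Whether the non-matching lines are placed after or before the line that matched the pattern
    match: after
    # How long to wait for more lines; if nothing arrives within this time, the pending multi-line event is considered finished
    timeout: 2s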
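
The merge rule described above can be sketched in a few lines of Python. This is only an illustration of the grouping logic for `negate: true` with `match: after`, not Filebeat's implementation: the function name `merge_multiline`, the shortened date pattern, and the sample lines are assumptions made for the example, and the `timeout` setting is not modeled.

```python
import re

def merge_multiline(lines, pattern):
    """Group lines as described for negate: true + match: after.

    A line that matches `pattern` starts a new event; every following
    line that does not match is appended to that event.
    """
    start = re.compile(pattern)
    events, current = [], []
    for line in lines:
        # A matching line closes the event collected so far.
        if start.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events

# Hypothetical log lines: one stack trace followed by a normal line.
sample = [
    "2023-02-01 21:47:22.101 ERROR - request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Demo.run(Demo.java:42)",
    "2023-02-01 21:47:23.500 INFO - next request",
]
events = merge_multiline(sample, r"^[0-9]{4}-[0-9]{2}-[0-9]{2}")
for event in events:
    print(repr(event))
```

The three lines of the stack trace end up in one event, and the INFO line forms a second event of its own.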

3. The last log line is not read

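Filebeat uses the newline character to recognize a log line, so a last line without a trailing newline is not read. If the whole file has to be read, it is best to add an empty line at the end of the file.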
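
For a static file that is no longer being written, the missing newline can be checked and added with a small script. The following is a minimal sketch under that assumption; the helper name `ensure_trailing_newline` and the demo file name are made up for the example, and it should not be run against a file that an application is still appending to.

```python
import os
import tempfile

def ensure_trailing_newline(path):
    """Append a newline if the file is non-empty and does not end with one.

    Returns True when the file was modified.
    """
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # empty file, nothing to terminate
        f.seek(-1, os.SEEK_END)
        if f.read(1) == b"\n":
            return False  # already ends with a newline
        f.seek(0, os.SEEK_END)
        f.write(b"\n")
        return True

# Demo on a temporary file whose last line has no newline.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "interface-demo.log")
    with open(demo, "wb") as f:
        f.write(b"2023-02-01 21:47:22.101 INFO - last line without newline")
    first = ensure_trailing_newline(demo)
    second = ensure_trailing_newline(demo)
    print(first, second)
```

The second call returns False because the first call already terminated the last line.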

4. Log lines that are not read

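If the configured exclude pattern appears anywhere in a log line, the whole line is dropped and not processed. For inputs that use the multiline settings, Filebeat first merges the lines and then filters the merged event, so a match anywhere in a merged event drops the entire event.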

  # Drop any log line that contains DEBUG
  #exclude_lines:
  #  - "DEBUG"
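
The order of operations (merge first, filter second) is easy to underestimate, so here is a small Python sketch of the filtering step applied to already merged events. The function `filter_events` and the sample events are hypothetical; the sketch only mirrors the behavior described above.

```python
import re

def filter_events(events, exclude_patterns):
    """Drop every event in which any exclude pattern matches anywhere."""
    compiled = [re.compile(p) for p in exclude_patterns]
    return [e for e in events if not any(p.search(e) for p in compiled)]

# Events as they look after multi-line merging (hypothetical content).
merged_events = [
    "2023-02-01 21:47:22.101 ERROR - request failed\n"
    "java.lang.IllegalStateException: DEBUG flag was set\n"
    "    at com.example.Demo.run(Demo.java:42)",
    "2023-02-01 21:47:23.500 INFO - next request",
]
kept = filter_events(merged_events, ["DEBUG"])
print(len(kept))
```

The ERROR event is lost only because the word DEBUG appears in its stack trace. If that is not wanted, a narrower pattern that targets the log level position is worth considering.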

II. Filebeat configuration for multiple log files

1. Start command

filebeat -e -c filebeatTest.yml

2. Configuration file

filebeat.inputs:
- type: log
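  # Whether this input is enabled
  enabled: true
  # Paths to collect logs from. With multiple inputs, the paths should not overlap between inputs, otherwise problems can occur
  # Wildcards are allowed in the log path
  paths:
    - E:\h5\logs\interface-*.log
  # Drop any log line that contains DEBUG
  exclude_lines:
    - "DEBUG"
  # Add a custom tag
  tags:
    - "interface"
  # Multi-line log handling, e.g. Java exception stack traces
  multiline:
    # Regular expression that recognizes the first line of a log event (a date)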
    pattern: "(([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))|((0[469]|11)-(0[1-9]|[12][0-9]|30))|(02-(0[1-9]|[1][0-9]|2[0-8]))))|((([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[3579][26])00))-02-29)"
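    # Negate the pattern: with true, the lines that do NOT match the pattern are the ones merged into the adjacent matching line
    negate: true
    # Whether the non-matching lines are placed after or before the line that matched the pattern
    match: after
    # How long to wait for more lines; if nothing arrives within this time, the pending multi-line event is considered finished
    timeout: 2s
  # Process the data with an ES ingest node pipeline. In theory this belongs under output.elasticsearch, but in testing it did not take effect when configured there.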
  pipeline: pipeline-filebeat-h5


  
  
- type: log
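  # Whether this input is enabled
  enabled: true
  # Paths to collect logs from. With multiple inputs, the paths should not overlap between inputs, otherwise problems can occur
  # Wildcards are allowed in the log path
  paths:
    - E:\shop\logs\seller-*.log
  # Drop any log line that contains DEBUG
  exclude_lines:
    - "DEBUG"
  # Add a custom tag
  tags:
    - "seller"
  # Multi-line log handling, e.g. Java exception stack traces
  multiline:
    # Regular expression that recognizes the first line of a log event (a date)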
    pattern: "(([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))|((0[469]|11)-(0[1-9]|[12][0-9]|30))|(02-(0[1-9]|[1][0-9]|2[0-8]))))|((([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[3579][26])00))-02-29)"
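    # Negate the pattern: with true, the lines that do NOT match the pattern are the ones merged into the adjacent matching line
    negate: true
    # Whether the non-matching lines are placed after or before the line that matched the pattern
    match: after
    # How long to wait for more lines; if nothing arrives within this time, the pending multi-line event is considered finished
    timeout: 2s
  # Process the data with an ES ingest node pipeline. In theory this belongs under output.elasticsearch, but in testing it did not take effect when configured there.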
  pipeline: pipeline-filebeat-shop
  
 
  
# Name of the index template and the index pattern it applies to
setup.template.enabled: false
setup.template.name: "template-springboot"
setup.template.pattern: "springboot-*"

# Index lifecycle management (ILM) must be disabled, otherwise the custom index names may not be usable
setup.ilm.enabled: false



# Output to ES
output.elasticsearch:
  # Whether this output is enabled
  enabled: true
  # ES addresses
  hosts:
    - "http://192.168.47.128:9200"
    - "http://192.168.47.128:9201"
    - "http://192.168.47.128:9202"
  indices:
    - index: "springboot-interface-%{+yyyy.MM.dd}"
      when.contains:
        tags: "interface"
    - index: "springboot-seller-%{+yyyy.MM.dd}"
      when.contains:
        tags: "seller"
 

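# Data processing: if the data has no unique key of its own, derive the document ID with fingerprint; add_id is an alternative that generates an ID for each event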
processors:
  - drop_fields:
      fields: ['input.type','agent.hostname','agent.name','agent.id','agent.type','log.file.path','log.offset','agent.ephemeral_id']
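  # Fingerprint, so that the same record is not stored more than once in the output ES. (The message field is used here for demonstration only; in practice choose the fields according to the business.)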
  - fingerprint:
      fields: ["message"]
      ignore_missing: false
      target_field: "@metadata._id"
      method: "sha256"
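
The idea behind the fingerprint step can be sketched in Python: the document ID is derived from the content, so sending the same content again overwrites the earlier document instead of adding a duplicate. This is a simplified illustration only. The helper `message_id` and the in-memory `store` are hypothetical, and the hash computed here is not claimed to be byte-identical to the value the Filebeat processor produces.

```python
import hashlib

def message_id(message):
    """Derive a deterministic document ID from the message text."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

store = {}  # stands in for an index keyed by _id
for msg in ["order 1 created", "order 2 created", "order 1 created"]:
    # A repeated message maps to the same ID and overwrites the entry.
    store[message_id(msg)] = msg

print(len(store))
```

The sketch also shows the downside: two genuinely separate events with identical text collapse into one document, which is why the fields should be chosen per business case.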

3. Regular expression for splitting logs

# The regular expression is used to split the log file into events; log lines usually start with a timestamp

(([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))|((0[469]|11)-(0[1-9]|[12][0-9]|30))|(02-(0[1-9]|[1][0-9]|2[0-8]))))|((([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[3579][26])00))-02-29)
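
Before putting the expression into the configuration, it can help to try it against a few sample lines. The sketch below uses Python's `re` module as a stand-in for Filebeat's regex engine, which is an assumption (the expression uses only constructs that both support), and the helper `starts_new_event` and the sample lines are made up for the test.

```python
import re

# The date expression from above, split across lines for readability only.
DATE_PATTERN = re.compile(
    r"(([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-"
    r"(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))|"
    r"((0[469]|11)-(0[1-9]|[12][0-9]|30))|"
    r"(02-(0[1-9]|[1][0-9]|2[0-8]))))|"
    r"((([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[3579][26])00))-02-29)"
)

def starts_new_event(line):
    """True if the line contains a valid date anywhere (unanchored search)."""
    return DATE_PATTERN.search(line) is not None

checks = {
    "2023-02-01 21:47:22.101 INFO - started": starts_new_event(
        "2023-02-01 21:47:22.101 INFO - started"),
    "    at com.example.Demo.run(Demo.java:42)": starts_new_event(
        "    at com.example.Demo.run(Demo.java:42)"),
    "2024-02-29 leap day": starts_new_event("2024-02-29 leap day"),
    "2023-02-29 invalid day": starts_new_event("2023-02-29 invalid day"),
    "Caused by: retry scheduled for 2023-02-02": starts_new_event(
        "Caused by: retry scheduled for 2023-02-02"),
}
for line, result in checks.items():
    print(result, line)
```

The last sample is worth noting: the expression is not anchored, so under this unanchored matching a continuation line that merely mentions a date would also be treated as the start of a new event. Prefixing the expression with `^` is one way to restrict it to the beginning of the line.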

III. Log processing configuration in ES

1. Create the index template

PUT /_template/template-springboot
{
  "index_patterns": ["springboot-csm-*"],
  "order": 0,
  "mappings": {
    "properties": {
      "createTime":{
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss.SSS"
      }
    }
  }
}

2. Configure the pipeline that splits log lines into fields

PUT _ingest/pipeline/pipeline-filebeat-h5
{
  "description": "Pipeline processing for Spring Boot project logs",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["""(?m)%{TIMESTAMP_ISO8601:createTime}%{SPACE}\[%{DATA:ip}\]%{SPACE}\[%{DATA:systemName}\]%{SPACE}\[%{DATA:version}\]%{SPACE}\[%{DATA:memberNo}\]%{SPACE}\[%{DATA:request_id}\]%{SPACE}\[%{DATA:thread}\]%{SPACE}%{LOGLEVEL:level}%{SPACE}-%{SPACE}%{GREEDYDATA:message}"""],
        "pattern_definitions": {
          "METHODNAME": "[a-zA-Z_]+"
        }
      }
    },
    {
      "remove": {
        "field": "ecs",
        "ignore_failure": true
      }
    },
    {
      "date": {
        "field": "createTime",
        "formats": [
          "yyyy-MM-dd HH:mm:ss.SSS"
        ],
        "timezone": "Asia/Shanghai",
        "target_field": "@timestamp",
        "ignore_failure": true
      }
    }
  ]
}
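
To get a feeling for what the grok expression extracts, the following Python sketch approximates it with a plain regular expression. It is a simplified stand-in, not the grok engine: the timestamp and log level sub-patterns are narrower than `TIMESTAMP_ISO8601` and `LOGLEVEL`, and the sample log line (IP, system name, version, member number, request ID, thread) is invented to fit the bracketed layout the pattern expects.

```python
import re

# Rough equivalent of the grok pattern; DOTALL plays the role of (?m) in grok,
# so the trailing message may span several lines (e.g. a stack trace).
LINE = re.compile(
    r"(?P<createTime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s*"
    r"\[(?P<ip>.*?)\]\s*"
    r"\[(?P<systemName>.*?)\]\s*"
    r"\[(?P<version>.*?)\]\s*"
    r"\[(?P<memberNo>.*?)\]\s*"
    r"\[(?P<request_id>.*?)\]\s*"
    r"\[(?P<thread>.*?)\]\s*"
    r"(?P<level>TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\s*-\s*"
    r"(?P<message>.*)",
    re.DOTALL,
)

def parse(line):
    """Return the extracted fields, or None if the line does not fit."""
    m = LINE.match(line)
    return m.groupdict() if m else None

# Hypothetical log event with a two-line message.
sample = (
    "2023-02-01 21:47:22.101 [192.168.47.1] [h5] [1.0.0] [M1001] [req-42] "
    "[http-nio-8080-exec-1] ERROR - order failed\n"
    "java.lang.IllegalStateException: boom"
)
fields = parse(sample)
for key, value in fields.items():
    print(key, "=", repr(value))
```

Lines that do not fit the layout return None here; in the real pipeline the corresponding case is a grok failure, so it is worth testing the pipeline with a few real log lines before relying on it.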

3. Handling the time format

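# The time extracted from the log line carries no time zone and is therefore not interpreted correctly; for @timestamp to be derived from Beijing time, the time zone has to be specified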
    {
      "date": {
        "field": "createTime",
        "formats": [
          "yyyy-MM-dd HH:mm:ss.SSS"
        ],
        "timezone": "Asia/Shanghai",
        "target_field": "@timestamp",
        "ignore_failure": true
      }
    }
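
The effect of the `timezone` setting can be reproduced with a few lines of Python. The sketch assumes that a timestamp without zone information would otherwise be read as UTC; the helper `to_utc` is hypothetical, and a fixed UTC+8 offset is used in place of the `Asia/Shanghai` zone so that the example has no dependency on a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+8 offset, which is what Asia/Shanghai currently corresponds to.
BEIJING = timezone(timedelta(hours=8))

def to_utc(create_time, tz):
    """Interpret a zone-less log timestamp in `tz` and convert it to UTC."""
    local = datetime.strptime(create_time, "%Y-%m-%d %H:%M:%S.%f")
    return local.replace(tzinfo=tz).astimezone(timezone.utc)

with_zone = to_utc("2023-02-01 21:47:22.101", BEIJING)
without_zone = to_utc("2023-02-01 21:47:22.101", timezone.utc)

print(with_zone.isoformat())
print(without_zone.isoformat())
print(without_zone - with_zone)
```

Read as Beijing time, the log line maps to 13:47:22 UTC. Read as UTC, it would be stored eight hours too late, which is the offset that shows up in searches when the time zone is left out.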