很多同事认为filebeat采集日志不能做到多行处理,今天这里讨论下filebeat的multiline与include_lines。

 

先来个案例,以下日志,我们只要求采集error的字段,

2017/06/22 11:26:30 [error] 26067#0: *17918 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.32.17, server: localhost, request: "GET /wss/ HTTP/1.1", upstream: "http://192.168.12.106:8010/", host: "192.168.12.106"
2017/06/22 11:26:30 [info] 26067#0:
2017/06/22 12:05:10 [error] 26067#0: *17922 open() "/data/programs/nginx/html/ws" failed (2: No such file or directory), client: 192.168.32.17, server: localhost, request: "GET /ws HTTP/1.1", host: "192.168.12.106"


filebeat.yml文件配置如下:

filebeat.prospectors:
- input_type: log
  paths:
    - /tmp/test.log
  include_lines: ['error']
output.kafka:
  enabled: true
  hosts: ["192.168.12.105:9092"]
  topic: logstash-errors-log

查看下kafka队列

果然只有“error”关键字的日志被采集了

{"@timestamp":"2017-06-23T08:57:25.227Z","beat":{"name":"192.168.12.106"},"input_type":"log","message":"2017/06/22 12:05:10 [error] 26067#0: *17922 open() /data/programs/nginx/html/ws failed (2: No such file or directory), client: 192.168.32.17, server: localhost, request: GET /ws HTTP/1.1, host: 192.168.12.10