Beats: How to debug Beats processors

Elastic China Community Official Blog

Last modified: 2022-03-02 10:58:49

First published: 2021-03-17 17:17:00

Article link: https://blog.csdn.net/UbuntuTouch/article/details/114937542

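In the earlier article "Beats：Beats processors", I described in detail how to use Beats processors to clean up data, an approach that is useful in many situations. Many Beats processors have counterparts among the ingest pipeline processors and the Logstash filters. For an ingest pipeline, we can test with the Simulate pipeline API. For Logstash, a convenient approach is to print the output directly to the console; if you want to learn more, see the article "Logstash：Logstash 入门教程（二）" (Logstash getting-started tutorial, part 2). So for Beats processors, is there a way to test them directly without writing to Elasticsearch?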

The answer is yes. We can also write the Beats output to the console, which makes testing very convenient.

First, we create a Filebeat configuration file like the following:

filebeat.yml

logging.level: error

filebeat.inputs:
  - type: stdin

setup.template.settings:
  index.number_of_shards: 3

processors:
  - add_fields:
      target: example
      fields:
        key1: val1
        key2: val2

output.console:
  pretty: true
  enabled: true

The configuration above is very simple. It reads events from stdin and uses the add_fields processor for the demonstration. We can run it with the following command:

echo "message" | ./filebeat -c ~/data/filebeat_test/filebeat.yml -e 2> /dev/null

The command above prints:

$ echo "message" | ./filebeat -c ~/data/filebeat_test/filebeat.yml -e 2> /dev/null
{
  "@timestamp": "2021-03-17T09:02:08.483Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.11.0"
  },
  "example": {
    "key2": "val2",
    "key1": "val1"
  },
  "log": {
    "offset": 0,
    "file": {
      "path": ""
    }
  },
  "message": "message",
  "input": {
    "type": "stdin"
  },
  "ecs": {
    "version": "1.6.0"
  },
  "host": {
    "name": "liuxg"
  },
  "agent": {
    "ephemeral_id": "16146b1d-6135-4834-bd5d-3dccdf0ef864",
    "id": "e2b7365d-8953-453c-87b5-7e8a65a5bc07",
    "name": "liuxg",
    "type": "filebeat",
    "version": "7.11.0",
    "hostname": "liuxg"
  }
}

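In the output above, we can see that key1 and key2 have been added under the example field. The command also runs very quickly :) The benefit of this approach is obvious: we do not need to import the data into Elasticsearch and inspect it there.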
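When the console output gets long, it can help to check it with a script instead of reading it by eye. The following is a minimal Python sketch, not part of Filebeat: it assumes the console output is a sequence of pretty-printed JSON objects like the one shown above, and the helper names `iter_events` and `check_event` are hypothetical. `SAMPLE` is a shortened stand-in for real Filebeat output; the second event deliberately lacks the example field.

```python
import json


def iter_events(text):
    """Yield each JSON object from a string of concatenated JSON objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        event, pos = decoder.raw_decode(text, pos)
        yield event


def check_event(event):
    """Return a list of problems for one event (an empty list means OK)."""
    problems = []
    example = event.get("example", {})
    for key, expected in (("key1", "val1"), ("key2", "val2")):
        actual = example.get(key)
        if actual != expected:
            problems.append(f"example.{key} is {actual!r}, expected {expected!r}")
    return problems


# Shortened stand-in for the console output (assumption: real output has more fields).
SAMPLE = """
{
  "message": "message",
  "example": {
    "key2": "val2",
    "key1": "val1"
  }
}
{
  "message": "message"
}
"""

for event in iter_events(SAMPLE):
    problems = check_event(event)
    print("OK" if not problems else "; ".join(problems))
```

In practice, the text passed to `iter_events` would be the captured standard output of the Filebeat command shown above rather than the `SAMPLE` string.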

We can also use this approach to test newly added fields and tags:

filebeat.yml

logging.level: error

filebeat.inputs:
  - type: stdin
    tags: ["my_tag"]  # add a tag named "my_tag"
    fields:
      from: "stdin"  # add a new field named "from" with the value "stdin"
    fields_under_root: true # add the field to the root

setup.template.settings:
  index.number_of_shards: 3

processors:
  - add_fields:
      target: example
      fields:
        key1: val1
        key2: val2

output.console:
  pretty: true
  enabled: true

In the configuration above, we added a new field called from and a new tag. After running Filebeat, we see the following output:

$ echo "message" | ./filebeat -c ~/data/filebeat_test/filebeat.yml -e 2> /dev/null
{
  "@timestamp": "2021-10-13T01:32:20.436Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.15.0"
  },
  "ecs": {
    "version": "1.11.0"
  },
  "host": {
    "name": "liuxg"
  },
  "log": {
    "offset": 0,
    "file": {
      "path": ""
    }
  },
  "message": "message",
  "input": {
    "type": "stdin"
  },
  "example": {
    "key1": "val1",
    "key2": "val2"
  },
  "tags": [
    "my_tag"
  ],
  "from": "stdin",
  "agent": {
    "hostname": "liuxg",
    "ephemeral_id": "7659b250-f443-4427-989e-8a8b215c1b55",
    "id": "a8acf8d6-eb61-4cc1-a61c-787f4d953397",
    "name": "liuxg",
    "type": "filebeat",
    "version": "7.15.0"
  }
}

We can also demonstrate this with the example introduced in the earlier article "Beats：Beats processors".

We create the data file sample.log:

sample.log

"321 - App01 - WebServer is starting"
"321 - App01 - WebServer is up and running"
"321 - App01 - WebServer is scaling 2 pods"
"789 - App02 - Database is will be restarted in 5 minutes"
"789 - App02 - Database is up and running"
"789 - App02 - Database is refreshing tables"

We create the following filebeat.yml configuration file:

filebeat.yml

logging.level: error

filebeat.inputs:
  - type: stdin

processors:
 - dissect:
     tokenizer: '"%{pid|integer} - %{service.name} - %{service.status}"'
     field: "message"
     target_prefix: ""
 - drop_fields:
     fields: ["ecs", "agent", "log", "input", "host"]          

output.console:
  pretty: true

We can run it as follows:

cat sample.log | ./filebeat -c ~/data/filebeat_test/filebeat.yml -e 2> /dev/null

The command above produces the following result:

{
  "@timestamp": "2021-03-17T09:11:02.812Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.11.0"
  },
  "service": {
    "name": "App01",
    "status": "WebServer is starting"
  },
  "message": "\"321 - App01 - WebServer is starting\"",
  "pid": 321
}
{
  "@timestamp": "2021-03-17T09:11:02.812Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "7.11.0"
  },
  "service": {
    "status": "WebServer is up and running",
    "name": "App01"
  },
  "pid": 321,
  "message": "\"321 - App01 - WebServer is up and running\""
}
...

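The output above clearly shows the result of the Beats processors: the dissect processor has split each message into pid, service.name and service.status, and the drop_fields processor has removed the ecs, agent, log, input and host fields.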
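To reason about which fields a tokenizer should produce before running Filebeat, a rough emulation can be useful. The sketch below is an assumption-laden illustration in Python, not the Beats dissect implementation: it only handles plain `%{key}` and `%{key|integer}` tokens, translates the tokenizer into a regular expression, and nests dotted key names. The function names `compile_tokenizer` and `dissect` are hypothetical.

```python
import re

# Tokenizer copied from the filebeat.yml above.
TOKENIZER = '"%{pid|integer} - %{service.name} - %{service.status}"'

# Matches %{name} or %{name|type}.
KEY = re.compile(r"%\{([^}|]+)(?:\|([^}]+))?\}")


def compile_tokenizer(tokenizer):
    """Turn a simple dissect-style tokenizer into a regex plus key metadata."""
    pattern = "^"
    keys = []
    pos = 0
    for m in KEY.finditer(tokenizer):
        pattern += re.escape(tokenizer[pos:m.start()])  # literal delimiter
        pattern += "(.*?)"  # key value, up to the next delimiter
        keys.append((m.group(1), m.group(2)))
        pos = m.end()
    pattern += re.escape(tokenizer[pos:]) + "$"
    return re.compile(pattern), keys


def dissect(line, tokenizer=TOKENIZER):
    """Return the extracted fields as a nested dict, or None if no match."""
    regex, keys = compile_tokenizer(tokenizer)
    match = regex.match(line)
    if match is None:
        return None
    event = {}
    for (name, type_), value in zip(keys, match.groups()):
        if type_ == "integer":
            value = int(value)
        # Dotted names such as service.name become nested objects.
        target = event
        *parents, leaf = name.split(".")
        for part in parents:
            target = target.setdefault(part, {})
        target[leaf] = value
    return event


print(dissect('"321 - App01 - WebServer is starting"'))
```

For the first line of sample.log, this yields the same pid and service values that Filebeat printed above. Real dissect behavior (for example key modifiers and failure handling) may differ, so the Filebeat console output remains the reference.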