Log Analysis System [2] --- Installing and Using Logstash, Installing and Using Filebeat

This document describes how to set up and configure Logstash and Filebeat for enterprise-grade log analysis. It first walks through installing and configuring Logstash, including parsing web logs with the Grok filter and enriching data with the Geoip plugin. It then covers installing Filebeat, enabling its modules, and configuring data collection. Finally, it presents a simple Elasticsearch + Logstash + Filebeat configuration for production, enabling efficient search and analysis of log data.

Enterprise-Grade Log Analysis System

I. Getting Started with Logstash

Official website: link

1. Installation

This guide uses logstash-7.10.0.
Run the most basic Logstash pipeline to verify the installation:

[root@ela1 ~]# ls
logstash-7.10.0-linux-x86_64.tar.gz
[root@ela1 ~]# tar xf logstash-7.10.0-linux-x86_64.tar.gz 
[root@ela1 ~]# cd logstash-7.10.0
[root@ela1 logstash-7.10.0]# bin/logstash -e 'input { stdin { } } output { stdout {} }'

When the following line appears:

[2020-12-28T06:51:35,703][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

try typing hello.
Output:

hello
{
      "@version" => "1",
    "@timestamp" => 2020-12-28T12:01:42.559Z,
       "message" => "hello",
          "host" => "ela1"
}

The installation succeeded.

2. Configuring Input and Output

1. The pipeline configuration file

Create a Logstash pipeline that takes Apache web logs from standard input, parses them to create specific named fields, and writes the parsed data to an Elasticsearch cluster. Rather than defining the pipeline configuration on the command line, define it in a configuration file.

Create the file first-pipeline.conf and write the following content into it; this is the Logstash pipeline configuration file:

[root@ela1 logstash-7.10.0]# cat first-pipeline.conf

input { 
    stdin { } 
} 

output { 
    stdout {} 
}

Test the configuration file:

bin/logstash -f first-pipeline.conf --config.test_and_exit
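
If the file parses cleanly, Logstash prints a validation result along these lines (exact wording varies slightly by version) and exits:

Configuration OK
[INFO ][logstash.runner] Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash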

Start Logstash:

bin/logstash -f first-pipeline.conf --config.reload.automatic

--config.reload.automatic makes Logstash reload the pipeline configuration file automatically whenever you modify it, so you do not need to restart Logstash.

After startup, enter:

83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"

Output:

{
    "@timestamp" => 2020-12-28T12:32:09.982Z,
      "@version" => "1",
       "message" => "",
          "host" => "ela1"
}
{
    "@timestamp" => 2020-12-28T12:32:10.035Z,
      "@version" => "1",
       "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
          "host" => "ela1"
}

Troubleshooting

Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", \"input\", \"filter\", \"output\" at line 1, col

This usually means there is a mistake in the .conf file's contents; check it carefully.
If the file really is correct, an old Logstash process is still holding the pipeline: kill it and start again.
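
A quick way to locate and stop the stale instance from the shell (a generic sketch; the PID placeholder is whatever ps reports on your machine):

# Find the running Logstash (a Java process) and note its PID
ps -ef | grep -i logstash

# Stop it, then start Logstash again
kill <PID>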

2. Parsing web logs with the Grok filter plugin

Use the grok filter plugin to parse log messages and create specific named fields, turning unstructured log data into structured, queryable content.
The grok filter plugin assigns field names to the pieces of content you are interested in and binds those pieces to their field names.

How does grok know which content you are interested in? It recognizes fields of interest through predefined patterns, which you select in its configuration.
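
As a minimal illustration of how patterns bind content to names, a standalone filter like the following (a sketch in the style of the grok documentation, not part of this tutorial's pipeline) would parse a line such as 55.3.244.1 GET /index.html into three named fields:

filter {
    grok {
        # %{IP:client} captures an IP address into a field named "client",
        # %{WORD:method} captures a single word into "method",
        # %{URIPATHPARAM:request} captures the request path into "request".
        match => { "message" => "%{IP:client} %{WORD:method} %{URIPATHPARAM:request}" }
    }
}

For Apache-style web logs, a ready-made composite pattern already exists.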

The pattern used here is %{COMBINEDAPACHELOG}.

%{COMBINEDAPACHELOG} constructs the following fields from an Apache log line:

Original information → new field name:

IP address → clientip
user ID → ident
user authentication info → auth
timestamp → timestamp
HTTP request method → verb
requested URL → request
HTTP version → httpversion
response code → response
response body size → bytes
referrer → referrer
client user agent (browser) → agent

For more grok usage, see the grok reference documentation: link

To have the modified configuration file reload automatically, the input cannot be stdin,
so here we use the file input instead.
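
One caveat with the file input: it remembers how far it has read in a sincedb file, so re-running a test will not re-read lines it has already seen. During experimentation you can discard that bookkeeping (a common testing trick, not something to do in production):

input {
  file {
    path => "/usr/local/logstash-7.10.0/access_log"
    start_position => "beginning"
    # Throw away position tracking so each run re-reads the file from the start.
    sincedb_path => "/dev/null"
  }
}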

Add log files to make importing convenient:

[root@ela1 logstash-7.10.0]# cat /usr/local/logstash-7.10.0/access_log 
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
[root@ela1 logstash-7.10.0]# cat /usr/local/logstash-7.10.0/error_log 
 2020/12/29 15:25:10 [warn] 3380#3380: *161 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/5/00/0000000005 while reading upstream, client: 10.9.29.234, server: localhost, request: "GET /35949/bundles/plugin/data/data.plugin.js HTTP/1.1", upstream: "http://127.0.0.1:5601/35949/bundles/plugin/data/data.plugin.js", host: "10.9.12.250:8080", referrer: "http://10.9.12.250:8080/app/home"

match => { "message" => "%{COMBINEDAPACHELOG}" } means:
apply the pattern %{COMBINEDAPACHELOG} to the "message" field and map its contents to the named fields.

After configuring, verify again:

[root@ela1 logstash-7.10.0]# cat /usr/local/logstash-7.10.0/second-pipeline.conf 
input {
  file {
    path => "/usr/local/logstash-7.10.0/access_log"
    start_position => "beginning"
  }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        remove_field => [ "message" ]
    }
}

output {
  stdout {
     codec => rubydebug
  }
}

Output:

{
     "@timestamp" => 2020-12-29T06:26:15.259Z,
           "path" => "/usr/local/logstash-7.10.0/access_log",
       "clientip" => "83.149.9.216",
    "httpversion" => "1.1",
           "host" => "localhost",
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
       "response" => "200",
          "bytes" => "203023",
       "@version" => "1",
           "verb" => "GET",
           "auth" => "-",
          "ident" => "-",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
        "request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png"
}

The formerly unstructured data is now structured.
Without the remove_field setting, you would find that the original message field is still present; since you no longer need it once parsing succeeds, one of grok's common options, remove_field, removes it.
In fact, remove_field can remove any field, and the value it accepts is an array, as shown below.
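
For example, to drop several fields in one go (the second field name here is purely illustrative):

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
        # remove_field accepts an array, so multiple fields can be removed at once.
        remove_field => [ "message", "path" ]
    }
}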

The modified pipeline configuration file looks like this:

[root@localhost logstash-7.10.0]# cat first-pipeline.conf 
input {
  file {
    path => "/usr/local/logstash-7.10.0/access_log"
    start_position => "beginning"
  }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
        remove_field => [ "message" ]
    }
}

output {
  stdout { codec => rubydebug }
}

Run it again:

bin/logstash -f first-pipeline.conf --config.reload.automatic

You will find that the message field is gone.

Use the following command to append a new line to the sample log file:

[root@localhost logstash-7.10.0]# echo '83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"' >> /usr/local/logstash-7.10.0/access_log

Output:

{
          "bytes" => "203023",
           "path" => "/usr/local/logstash-7.10.0/access_log",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
       "clientip" => "83.149.9.216",
       "response" => "200",
           "verb" => "GET",
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
       "@version" => "1",
          "ident" => "-",
     "@timestamp" => 2020-12-29T06:36:51.119Z,
        "request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
           "host" => "localhost",
    "httpversion" => "1.1",
           "auth" => "-"
}
3. Enhancing data with the Geoip filter plugin

The new pipeline configuration file:

input {
    stdin {}
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    geoip {
        source => "clientip"
    }
}

output {
    stdout { codec => rubydebug }
}
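
If you only need part of the lookup result, the geoip filter also accepts a fields option that limits what gets added to the event (an optional refinement, not used in this tutorial):

geoip {
    source => "clientip"
    # Keep only these parts of the GeoIP lookup result.
    fields => ["country_name", "city_name", "location"]
}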

Enter the same line as before:

83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"

Output:

{
        "message" => "83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] \"GET /presentations/logstash-monitorama-2013/imageskibana-search.png HTTP/1.1\" 200 203023 \"http://semicomplete.com/presentations/logstash-monitorama-2013/\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
           "auth" => "-",
          "geoip" => {
          "country_name" => "Russia",
              "latitude" => 55.7527,
              "location" => {
            "lon" => 37.6172,
            "lat" => 55.7527
        },
                    "ip" => "83.149.9.216",
         "country_code2" => "RU",
         "country_code3" => "RU",
             "city_name" => "Moscow",
        "continent_code" => "EU",
              "timezone" => "Europe/Moscow",
             "longitude" => 37.6172,
           "postal_code" => "144700",
           "region_code" => "MOW",
           "region_name" => "Moscow"
    },
       "@version" => "1",
       "clientip" => "83.149.9.216",
      "timestamp" => "04/Jan/2015:05:13:42 +0000",
        "request" => "/presentations/logstash-monitorama-2013/imageskibana-search.png",
    "httpversion" => "1.1",
       "response" => "200",
           "verb" => "GET",
          "bytes" => "203023",
          "ident" => "-",
       "referrer" => "\"http://semicomplete.com/presentations/logstash-monitorama-2013/\"",
     "@timestamp" => 2020-12-29T02:19:09.153Z,
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36\"",
           "host" => "ela1"
}

For details, see the grok and geoip documentation: links

II. Installing and Using Filebeat

1. Installation

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.10.0-linux-x86_64.tar.gz

tar xzvf filebeat-7.10.0-linux-x86_64.tar.gz -C /usr/local

2. Enabling and Configuring Data Collection Modules

Filebeat uses modules to collect and parse log data.

1. List the modules that can be enabled
[root@localhost ~]# cd /usr/local/filebeat-7.10.0-linux-x86_64/
[root@localhost filebeat-7.10.0-linux-x86_64]# ./filebeat modules list
Enabled:

Disabled:
activemq
apache
...
2. In the installation directory, enable one or more modules.
2.1 Enable the nginx module

For example, the following command enables the nginx module:

[root@localhost filebeat-7.10.0-linux-x86_64]# ./filebeat modules enable  nginx
Enabled nginx

What this command actually does is rename the file nginx.yml.disabled in the modules.d/ directory to nginx.yml.
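
The manual equivalent, run from the Filebeat installation directory, would be:

mv modules.d/nginx.yml.disabled modules.d/nginx.yml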

2.2 Configure the nginx module

Contents of nginx.yml:

- module: nginx
  access:
    # Collect access logs
    enabled: true
    #var.paths:
  error:
    # Collect error logs
    enabled: true
    #var.paths:
  ingress_controller:
    # Parses nginx ingress logs in Kubernetes environments; disabled by default
    enabled: false
    #var.paths:
var.paths

This option sets custom paths for the log files. If you do not set it, Filebeat chooses paths appropriate for your operating system, for example:

/var/log/nginx/access.log
/var/log/nginx/error.log

var.paths accepts an array, so it can be configured like this:

- module: nginx
  access:
    # Collect access logs
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]   # custom location

Deployment Steps

1. Logstash: add a beats input:

beats {
  port => 5044
}

2. Filebeat

Data source: decide where to collect logs from. Enable the matching module:

./filebeat modules enable nginx

and, if necessary, point it at the log files:

var.paths: ["/path/to/*.log"]

Configure the output in filebeat.yml: output to Logstash, filling in the Logstash IP address.

3. Start Filebeat

./filebeat           # foreground

nohup ./filebeat &   # background

3. Modify the configuration file
[root@localhost ~]# cat  /usr/local/logstash-7.10.0/first-pipeline.conf 
input {
    beats {
      port => 5044
   }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
        remove_field => [ "message" ]
    }
    geoip { source => "clientip" }
}

output {
  stdout { codec => rubydebug }
}

Then edit /usr/local/filebeat-7.10.0-linux-x86_64/filebeat.yml so that Filebeat sends its events to Logstash rather than directly to Elasticsearch.
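
The relevant part of filebeat.yml looks roughly like this (a sketch; here Logstash is assumed to be listening on the same host, on the beats port 5044 configured above):

# Comment out the default Elasticsearch output:
#output.elasticsearch:
#  hosts: ["localhost:9200"]

# ...and send events to Logstash instead:
output.logstash:
  hosts: ["localhost:5044"]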

4. Start
[root@localhost ~]# nohup /usr/local/filebeat-7.10.0-linux-x86_64/filebeat &   # run Filebeat in the background
[root@localhost ~]# cd /usr/local/logstash-7.10.0/
[root@localhost logstash-7.10.0]# bin/logstash -f first-pipeline.conf --config.reload.automatic

Output:

[2020-12-29T02:22:28,710][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
{
       "clientip" => "10.9.29.250",
      "timestamp" => "29/Dec/2020:08:46:43 +0800",
       "referrer" => "\"http://10.9.12.250:8080/app/home\"",
           "verb" => "POST",
          "ident" => "-",
           "auth" => "-",
            "ecs" => {
        "version" => "1.5.0"
    },
          "event" => {
        "timezone" => "-05:00",
         "dataset" => "nginx.access",
          "module" => "nginx"
    },
     "@timestamp" => 2020-12-29T07:21:11.293Z,
           "host" => {
         "architecture" => "x86_64",
                 "name" => "localhost",
                   "ip" => [
            [0] "192.168.116.167",
            [1] "fe80::2bcc:46ea:d75d:d5dc"
        ],
...

III. A Simple Elasticsearch + Logstash + Filebeat Configuration for Production

Combined with Elasticsearch, queries can be run across the cluster:

[root@localhost filebeat-7.10.0-linux-x86_64]# curl -X GET "192.168.116.167:9200/_cat/nodes"
192.168.116.155 33 96  6 0.33 0.23 0.17 cdhilmrstw * ela2
192.168.116.166 31 96  6 0.10 0.11 0.13 cdhilmrstw - ela3
192.168.116.167 29 96 39 1.39 1.67 1.84 cdhilmrstw - ela1
[root@ela3 ~]# curl -X GET "192.168.116.167:9200/_cat/indices?v"
health status index                      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2020.12.28-000001 W1R0NfkXToCDbgy_nSmr8A   1   1          0            0    
[root@localhost ~]# cat /usr/local/logstash-7.10.0/first-pipeline.conf 
input {
    beats {
      port => 5044
   }
}

filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}"}
        remove_field => [ "message" ]
    }
    geoip { source => "clientip" }
}

output {
  stdout { codec => rubydebug }

  elasticsearch {
    # Output to the Elasticsearch cluster
    hosts => ["192.168.116.167:9200","192.168.116.155:9200","192.168.116.166:9200"]
  }
}
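
By default events land in a logstash-* index, which matches the _cat/indices output shown above. If you prefer a custom, date-stamped index name, the elasticsearch output also accepts an index setting (an optional tweak, not used here; the index name is illustrative):

elasticsearch {
    hosts => ["192.168.116.167:9200","192.168.116.155:9200","192.168.116.166:9200"]
    # Optional: write to a custom daily index instead of the default logstash-*.
    index => "weblogs-%{+YYYY.MM.dd}"
}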

As new lines are appended to the log, processing continues.
Output:

{
        "request" => "/api/ui_metric/report",
       "clientip" => "10.9.29.250",
        "fileset" => {
        "name" => "access"
    },
       "referrer" => "\"http://10.9.12.250:8080/app/home\"",
      "timestamp" => "29/Dec/2020:08:46:43 +0800",
        "service" => {
        "type" => "nginx"
    },
          "agent" => {
        "ephemeral_id" => "68962cc2-f4d6-465a-b7bc-8cc3aa91429d",
                  "id" => "dce975d3-24f5-421f-a7ca-0dadfc6348f1",
            "hostname" => "localhost",
                "type" => "filebeat",
                "name" => "localhost",
             "version" => "7.10.0"
    },
          "geoip" => {},
            "log" => {
          "file" => {
            "path" => "/var/log/nginx/access.log"
        },
        "offset" => 0
    },
    "httpversion" => "1.1",
          "bytes" => "0",
           "tags" => [
        [0] "beats_input_codec_plain_applied",
        [1] "_geoip_lookup_failure"
    ],
       "response" => "499",
          "input" => {
        "type" => "log"
    },
           "verb" => "POST",
           "host" => {
                   "id" => "38b8887c97c045caa0333f41031ea4ea",
             "hostname" => "localhost",
                  "mac" => [
            [0] "00:0c:29:6d:70:86"
        ],
                   "os" => {
            "platform" => "centos",
                "name" => "CentOS Linux",
            "codename" => "Core",
              "family" => "redhat",
             "version" => "7 (Core)",
              "kernel" => "3.10.0-1127.19.1.el7.x86_64"
        },
         "architecture" => "x86_64",
        "containerized" => false,
                 "name" => "localhost",
                   "ip" => [
            [0] "192.168.116.167",
            [1] "fe80::2bcc:46ea:d75d:d5dc"
        ]
    },
       "@version" => "1",
          "ident" => "-",
           "auth" => "-",
            "ecs" => {
        "version" => "1.5.0"
    },
          "event" => {
          "module" => "nginx",
        "timezone" => "-05:00",
         "dataset" => "nginx.access"
    },
     "@timestamp" => 2020-12-29T09:53:38.427Z
}
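
Note the _geoip_lookup_failure tag and the empty geoip field in this event: the client address 10.9.29.250 is a private (RFC 1918) address, so it has no entry in the GeoIP database. Events whose clientip is public, such as the earlier 83.149.9.216 example, are enriched normally.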