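When an application fails, it usually dumps a stack trace, which spans many lines in the log file. When Logstash collects such a log, the multiline plugin can gather the whole stack trace into a single record. As the name suggests, multiline handles log entries that span multiple lines.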

multiline configuration and usage

input {
  stdin {
    codec => multiline {
      pattern => "pattern, a regexp"
      negate => "true" or "false"
      what => "previous" or "next"
    }
  }
}

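## pattern: a regular expression that is matched against each log line. Patterns defined for grok are also supported, e.g. %{TIMESTAMP_ISO8601}; see grok-patterns for the full list.

## negate: a boolean, true or false; defaults to false. If true, the lines that do NOT match the pattern are the ones merged into a neighbouring line; whether they join the line before or the line after depends on the what parameter. If false, the lines that DO match the pattern are the ones merged.

## what: previous or next. Specifies whether the lines selected by the rule above are merged with the preceding line or with the following line.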
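
The way pattern, negate and what interact is easier to see in code. The following is a minimal Python sketch of the grouping rule, not the plugin's actual implementation; the function name merge_multiline and the sample lines are made up for illustration.

```python
import re

def merge_multiline(lines, pattern, negate=False, what="previous"):
    """Group lines into events, roughly following the multiline rule."""
    regex = re.compile(pattern)
    events, buffer = [], []
    for line in lines:
        matched = bool(regex.search(line))
        if negate:
            # negate => true inverts the result of the pattern test
            matched = not matched
        if what == "previous":
            # A selected line joins the event already being built;
            # any other line closes that event and starts a new one.
            if not matched and buffer:
                events.append("\n".join(buffer))
                buffer = []
            buffer.append(line)
        elif what == "next":
            # A selected line joins the lines that follow it;
            # any other line ends the current event.
            buffer.append(line)
            if not matched:
                events.append("\n".join(buffer))
                buffer = []
        else:
            raise ValueError("what must be 'previous' or 'next'")
    if buffer:
        events.append("\n".join(buffer))
    return events

# Hypothetical input: indented lines continue the line above them.
trace = ["ERROR boom", "  at a()", "  at b()", "INFO ok"]
for event in merge_multiline(trace, r"^\s", negate=False, what="previous"):
    print(repr(event))
```

Calling the same function with pattern r"^(ERROR|INFO)" and negate=True produces the same grouping, which shows that negate only changes which lines count as continuation lines.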

 

Official documentation

The pattern should match what you believe to be an indicator that the field is part of a multi-line event.

The what must be previous or next and indicates the relation to the multi-line event.

The negate can be true or false (defaults to false). If true, a message not matching the pattern will constitute a match of the multiline filter and the what will be applied. (vice-versa is also true)

 

Here we run a test using the PHP-FPM slow log.

The PHP-FPM slow log looks like this:

[11-Mar-2015 16:54:17]  [pool www] pid 12873
script_filename = /data//index.php
[0x00007f497fa5b620] curl_exec() /data//Account.php:221
[0x00007f497fa5a4e0] call() /data/gintama_app/jidong/game_code/app/controllers/Game.php:31
[0x00007fff29eea180] load() unknown:0
[0x00007f497fa59e18] call_user_func_array() /data/library/BaseCtrl.php:20
[0x00007fff29eea470] handoutAction() unknown:0
[0x00007f497fa59400] run() /data//index.php:30
  
[11-Mar-2015 16:56:46]  [pool www] pid 12881
script_filename = /data/index.php
[0x00007f497fa5b620] curl_exec() /data//Account.php:221
[0x00007f497fa5a4e0] call() /data/Game.php:31
[0x00007fff29eea180] load() unknown:0
[0x00007f497fa59e18] call_user_func_array() /data/library/BaseCtrl.php:20
[0x00007fff29eea470] handoutAction() unknown:0
[0x00007f497fa59400] run() /data/index.php:30


Add a Logstash config file, logstash_php-fpm.conf

input {
    file {
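        path => "/tmp/php-slow.log"  ### path of the log file to collect
        codec => multiline {  ### use the multiline codec
            pattern => "^(\[\d{2}-%{MONTH}-\d{4})"  ### a regular expression; %{MONTH} is already defined by grok, so we take a shortcut and reuse it
            negate => true  ### true: select the lines that do NOT match the pattern, then merge them with the previous or the next line
            what => "previous"  ### previous: merge with the preceding line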
        }
    }
}

output{
   stdout { codec => rubydebug }
   elasticsearch{
        hosts => ["110.22.145.155:9200"]
        index => "logstash-php_%{+YYYY.MM.dd}"
   }
}
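
To check the rule before running Logstash, the same grouping can be tried offline. This is a rough Python sketch under stated assumptions: the regex is a simplified stand-in for the grok-based pattern above (the real %{MONTH} also accepts other spellings such as full month names), the function name split_slowlog is made up, and the log text is a shortened copy of the sample shown earlier.

```python
import re

# Simplified stand-in for ^(\[\d{2}-%{MONTH}-\d{4}); only three-letter
# month abbreviations such as "Mar" are recognised here.
HEADER = re.compile(r"^\[\d{2}-[A-Z][a-z]{2}-\d{4}")

def split_slowlog(text):
    """Start a new event at each header line; append other lines to it."""
    events = []
    for line in text.splitlines():
        if HEADER.match(line) or not events:
            events.append([line])
        else:
            # negate => true, what => "previous": a non-matching line
            # is attached to the event that precedes it.
            events[-1].append(line)
    return ["\n".join(event) for event in events]

# Shortened copy of the PHP-FPM slow log sample.
log = """[11-Mar-2015 16:54:17]  [pool www] pid 12873
script_filename = /data//index.php
[0x00007f497fa5b620] curl_exec() /data//Account.php:221
[0x00007f497fa59400] run() /data//index.php:30

[11-Mar-2015 16:56:46]  [pool www] pid 12881
script_filename = /data/index.php
[0x00007f497fa5b620] curl_exec() /data//Account.php:221
[0x00007f497fa59400] run() /data/index.php:30"""

events = split_slowlog(log)
for event in events:
    print(repr(event))
```

The stack frame lines also begin with "[", but they do not match the header regex because "0x" is not two digits followed by a hyphen, so they stay attached to the entry above them.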

 

## Check that the config file is syntactically valid

{logstash_home}/bin/logstash -t -f logstash_php-fpm.conf

# /opt/logstash/bin/logstash -t logstash-php_slow.conf 
Configuration OK

 

Run Logstash and check the output

# /opt/logstash/bin/logstash -f ./logstash-php_slow.conf 
Settings: Default pipeline workers: 8
Pipeline main started

{
    "@timestamp" => "2017-07-17T05:40:40.310Z",
       "message" => "[11-Mar-2015 16:54:17]  [pool www] pid 12873\nscript_filename = /data//index.php\n[0x00007f497fa5b620] curl_exec() /data//Account.php:221\n[0x00007f497fa5a4e0] call() /data/gintama_app/jidong/game_code/app/controllers/Game.php:31\n[0x00007fff29eea180] load() unknown:0\n[0x00007f497fa59e18] call_user_func_array() /data/library/BaseCtrl.php:20\n[0x00007fff29eea470] handoutAction() unknown:0\n[0x00007f497fa59400] run() /data//index.php:30\n  ",
      "@version" => "1",
          "tags" => [
        [0] "multiline"
    ],
          "path" => "/tmp/php-slow.log",
          "host" => "test2-web"
}
{
    "@timestamp" => "2017-07-17T05:40:47.321Z",
       "message" => "[11-Mar-2015 16:56:46]  [pool www] pid 12881\nscript_filename = /data/index.php\n[0x00007f497fa5b620] curl_exec() /data//Account.php:221\n[0x00007f497fa5a4e0] call() /data/Game.php:31\n[0x00007fff29eea180] load() unknown:0\n[0x00007f497fa59e18] call_user_func_array() /data/library/BaseCtrl.php:20\n[0x00007fff29eea470] handoutAction() unknown:0\n[0x00007f497fa59400] run() /data/index.php:30",
      "@version" => "1",
          "tags" => [
        [0] "multiline"
    ],
          "path" => "/tmp/php-slow.log",
          "host" => "test2-web"
}

 

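Collecting Tomcat stack traces works the same way: work out the rule that marks the start of a log entry, then match on it. The test is not repeated here.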
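
As a rough illustration of "finding the rule" for a Java stack trace, the Python sketch below assumes that every new log entry starts with a timestamp and that any line without one belongs to the entry above it. The regex is a simplified stand-in for %{TIMESTAMP_ISO8601}, and the log lines are hypothetical, not taken from a real Tomcat instance; the actual format depends on how logging is configured.

```python
import re

# Assumption: each new entry starts with a timestamp such as
# "2017-07-17 13:40:40". Simplified stand-in for %{TIMESTAMP_ISO8601}.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")

def group_entries(lines):
    """Fold lines without a leading timestamp into the preceding entry."""
    entries = []
    for line in lines:
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries

# Hypothetical Tomcat-style log lines, made up for illustration.
sample = [
    "2017-07-17 13:40:40 ERROR [http-nio-8080-exec-1] Request failed",
    "java.lang.IllegalStateException: demo failure",
    "\tat com.example.Demo.run(Demo.java:12)",
    "Caused by: java.io.IOException: disk full",
    "\t... 3 more",
    "2017-07-17 13:40:41 INFO [main] Server startup in 1200 ms",
]

entries = group_entries(sample)
for entry in entries:
    print(repr(entry))
```

Matching on the start of an entry, rather than on indentation, keeps unindented lines such as the exception class name and "Caused by:" inside the same record.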