Logstash 7.15.1: reading a file and writing to Elasticsearch 7.5.1 (multiline codec and filters)

The installation and deployment of Elasticsearch 7.5.1, Kibana 7.5.1, and Logstash 7.15.1 are skipped here.

1. Configuration file 1

input {
  file {
    start_position => "end"
    path => "E:/home/wxp/box/task/box-task-info.log"
    type => "type1" ### used in the output stage to decide which index to write to
    codec => multiline {
      ### negate the pattern: lines that do NOT match are treated as continuations
      negate => true
      ### regex that identifies the start of an event
      pattern => "(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}) *"
      ### merge non-matching lines into the previous or next event; "previous" appends them to the preceding matching line
      what => "previous"
      max_lines => 1000 ### maximum number of lines per event
      max_bytes => "10MiB" ### maximum event size
      auto_flush_interval => 30 ### flush the pending event if no new lines arrive within this many seconds
    }
  }
}

output {
  stdout {}
  elasticsearch {
    ### ES addresses; multiple hosts are allowed
    hosts => ["localhost:9200"]
    action => "index"
    ### target index; if it does not exist it is created automatically (requires auto-create to be enabled in ES)
    index => "test_log_index"
  }
}
  1. start_position => "end" means reading starts from the end of the file;
  2. type => "type1" adds a type field that later stages can branch on;
  3. pattern => "(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}) *" is the match regex; here a line beginning with a full date-time marks the start of an event;
  4. auto_flush_interval => 30 is the flush interval for a pending multiline event;
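The multiline behaviour can be sanity-checked outside Logstash. Below is a minimal Python sketch (illustrative only, with made-up sample lines) applying the same start-of-event regex: a timestamped line starts a new event, while a non-matching continuation line is merged into the previous event, mirroring negate => true / what => "previous":

```python
import re

# Same start-of-event regex as the multiline codec pattern above
# (only match/no-match matters here, not the captured value).
event_start = re.compile(r"(?P<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})")

# Hypothetical sample: one log line followed by a stack trace
lines = [
    "2020-02-19 17:09:31.829 [main] ERROR o.a.c.c.StandardService - boom",
    "java.lang.RuntimeException: boom",
    "\tat com.example.Foo.bar(Foo.java:42)",
]

# negate => true / what => "previous": non-matching lines join the prior event
events = []
for line in lines:
    if event_start.match(line):
        events.append(line)
    elif events:
        events[-1] += "\n" + line

print(len(events))  # the three lines collapse into one event
```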

2. Adding a filter

input {
  file {
    start_position => "end"
    path => "E:/home/wxp/box/task/box-task-info.log"
    type => "type1" ### used in the output stage to decide which index to write to
    codec => multiline {
      ### negate the pattern: lines that do NOT match are treated as continuations
      negate => true
      ### regex that identifies the start of an event
      pattern => "(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}) *"
      ### merge non-matching lines into the previous or next event; "previous" appends them to the preceding matching line
      what => "previous"
      max_lines => 1000 ### maximum number of lines per event
      max_bytes => "10MiB" ### maximum event size
      auto_flush_interval => 30 ### flush the pending event if no new lines arrive within this many seconds
    }
  }
}


filter {
  grok {
    match => { "message" => "(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3}) *" }
  } ### extract the datetime field from the message via grok
  date {
    match => ["datetime", "yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSSZ"]
    target => "@timestamp"
  } ### parse the extracted time into @timestamp
}


output {
  stdout {}
  elasticsearch {
    ### ES addresses; multiple hosts are allowed
    hosts => ["localhost:9200"]
    action => "index"
    ### target index; if it does not exist it is created automatically (requires auto-create to be enabled in ES)
    index => "test_log_index"
  }
}

This is the simplest possible way to extract the timestamp.
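The date filter parses the extracted local timestamp and stores UTC in @timestamp. A minimal Python sketch of the same conversion, assuming the log host runs in UTC+8 (consistent with the console output shown in the next section):

```python
from datetime import datetime, timedelta, timezone

# Parse the "datetime" field the way the date filter's
# "yyyy-MM-dd HH:mm:ss.SSS" pattern does.
local = datetime.strptime("2020-02-19 17:09:31.829", "%Y-%m-%d %H:%M:%S.%f")

# Assumption: host timezone is UTC+8 (Logstash falls back to the JVM
# default zone when the pattern carries no zone); @timestamp is UTC.
cst = timezone(timedelta(hours=8))
utc = local.replace(tzinfo=cst).astimezone(timezone.utc)

print(utc.isoformat())  # 2020-02-19T09:09:31.829000+00:00
```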

3. Starting Logstash for a test

Append the following lines to the end of the log file:

2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]
2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]

Console output:

{
      "datetime" => "2020-02-19 17:09:31.829",
    "@timestamp" => 2020-02-19T09:09:31.829Z,
          "type" => "type1",
          "path" => "E:/home/wxp/box/task/box-task-info.log",
       "message" => "2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]\r",
      "@version" => "1",
          "host" => "DESKTOP-O93E7VQ"
}
{
      "datetime" => "2020-02-19 17:09:31.829",
    "@timestamp" => 2020-02-19T09:09:31.829Z,
          "type" => "type1",
          "path" => "E:/home/wxp/box/task/box-task-info.log",
       "message" => "2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]\r",
      "@version" => "1",
          "host" => "DESKTOP-O93E7VQ"
}

As the output shows, @timestamp is extracted from the log line itself rather than taken from the system clock: the event time is set from the time recorded in the log.

4. Grok matching rules in the filter

Adding a filter formats the raw log into structured fields.

Typical grok matching constructs:

  • %{NUMBER:duration} — matches a number (including floats)
  • %{IP:client} — matches an IP address
  • (?<name>pattern) — a custom regex with a named capture
  • (?<class_info>([\S+]*)) — a custom regex capturing a run of non-space characters
  • \s* or \s+ — zero/one or more whitespace characters
  • \S* or \S+ — runs of non-whitespace characters
  • the ":xxx" part inside the braces names (aliases) the captured field
  • %{UUID} — matches values like 091ece39-5444-44a1-9f1e-019a17286b48
  • %{WORD} — matches a single word (e.g. a request method)
  • %{GREEDYDATA} — matches all remaining data
  • %{LOGLEVEL:loglevel} — matches the log level
  • custom pattern types

Kibana's Dev Tools includes a "Grok Debugger": enter sample data and a pattern, and it displays the match result.
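Without Kibana at hand, individual grok references can be approximated with plain regexes. A rough sketch (these regexes are simplified stand-ins, not the exact grok definitions):

```python
import re

# Simplified equivalents of a few grok pattern references
GROK = {
    "UUID": r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}",
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "LOGLEVEL": r"TRACE|DEBUG|INFO|WARN|ERROR|FATAL",
}

assert re.fullmatch(GROK["UUID"], "091ece39-5444-44a1-9f1e-019a17286b48")
assert re.fullmatch(GROK["IP"], "10.50.245.25")
assert re.fullmatch(GROK["LOGLEVEL"], "INFO")
print("ok")
```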

Case 1

2020-03-18 14:04:23.944 [DubboServerHandler-10.50.245.25:63046-thread-168] INFO  c.f.l.d.LogTraceDubboProviderFilter - c79b0905-03c7-4e54-a5a6-ff1b34058cdf CALLEE_IN dubbo:EstatePriceService.listCellPrice

\s*%{TIMESTAMP_ISO8601:timestamp} \s*\[%{DATA:current_thread}\]\s*%{LOGLEVEL:loglevel}\s*(?<class_info>([\S+]*))

{
  "current_thread": "DubboServerHandler-10.50.245.25:63046-thread-168",
  "loglevel": "INFO",
  "class_info": "c.f.l.d.LogTraceDubboProviderFilter",
  "timestamp": "2020-03-18 14:04:23.944"
}
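The same extraction can be reproduced with Python's re module (named groups are written (?P<name>...) there); %{TIMESTAMP_ISO8601}, %{DATA} and %{LOGLEVEL} are replaced with hand-written stand-ins:

```python
import re

line = ("2020-03-18 14:04:23.944 [DubboServerHandler-10.50.245.25:63046-thread-168] "
        "INFO  c.f.l.d.LogTraceDubboProviderFilter - c79b0905-03c7-4e54-a5a6-ff1b34058cdf "
        "CALLEE_IN dubbo:EstatePriceService.listCellPrice")

# Plain-regex approximation of the grok expression from case 1
pat = re.compile(
    r"\s*(?P<timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})"
    r"\s*\[(?P<current_thread>[^\]]*)\]\s*(?P<loglevel>\w+)\s*(?P<class_info>\S*)"
)
m = pat.match(line)
print(m.groupdict())
```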

Case 2

2020-12-04 14:16:30.003  INFO 19095 --- [pool-4-thread-1] com.yck.laochangzhang.task.EndorseTask   : 平台发起背书结束.....

(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s+(?<level>\w+)\s+\d+\s+-+\s\[[^\[\]]+\]\s+(?<handler>\S+)\s+:(?<msg>.*)

{
  "msg": " 平台发起背书结束.....",
  "handler": "com.yck.laochangzhang.task.EndorseTask",
  "datetime": "2020-12-04 14:16:30.003",
  "level": "INFO"
}
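The case-2 pattern contains no grok references at all, so it can be verified directly in Python; only the named-group syntax changes from (?<name>) to (?P<name>):

```python
import re

line = ("2020-12-04 14:16:30.003  INFO 19095 --- [pool-4-thread-1] "
        "com.yck.laochangzhang.task.EndorseTask   : 平台发起背书结束.....")

# Same regex as case 2, with Python named-group syntax
pat = re.compile(
    r"(?P<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s+(?P<level>\w+)"
    r"\s+\d+\s+-+\s\[[^\[\]]+\]\s+(?P<handler>\S+)\s+:(?P<msg>.*)"
)
m = pat.match(line)
print(m.group("handler"), m.group("level"))
```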

Case 3

2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]

(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}) \[%{DATA:thread}\] %{LOGLEVEL:loglevel}\s*%{GREEDYDATA:msg} *

{
  "msg": "o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]",
  "datetime": "2020-02-19 17:09:31.829",
  "loglevel": "INFO",
  "thread": "main"
}
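Likewise for case 3, which is the same pattern reused by the multiline codec and grok filter in the next section; %{DATA}, %{LOGLEVEL} and %{GREEDYDATA} are again replaced with plain-regex stand-ins:

```python
import re

line = ("2020-02-19 17:09:31.829 [main] INFO  "
        "o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]")

# Plain-regex approximation of the grok expression from case 3
pat = re.compile(
    r"(?P<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})"
    r" \[(?P<thread>[^\]]*)\] (?P<loglevel>\w+)\s*(?P<msg>.*)"
)
m = pat.match(line)
print(m.groupdict())
```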

5. Extracting the thread and log level, and generating indices by year and month

input {
  file {
    start_position => "end"
    path => "E:/home/wxp/box/task/box-task-info.log"
    type => "type1" ### used in the output stage to decide which index to write to
    codec => multiline {
      ### negate the pattern: lines that do NOT match are treated as continuations
      negate => true
      ### regex for the start of an event; grok references are allowed here
      pattern => "(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s*\[%{DATA:thread}\]\s*%{LOGLEVEL:loglevel}\s*%{GREEDYDATA:msg}"
      ### merge non-matching lines into the previous or next event; "previous" appends them to the preceding matching line
      what => "previous"
      max_lines => 1000 ### maximum number of lines per event
      max_bytes => "10MiB" ### maximum event size
      auto_flush_interval => 30 ### flush the pending event if no new lines arrive within this many seconds
    }
  }
}


filter {
  grok {
    match => { "message" => "(?<datetime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s*\[%{DATA:thread}\]\s*%{LOGLEVEL:loglevel}\s*%{GREEDYDATA:msg}" }
  } ### extract datetime, thread, loglevel and msg from the message via grok
  date {
    match => ["datetime", "yyyy-MM-dd HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss.SSSZ"]
    target => "@timestamp"
  } ### parse the extracted time into @timestamp
}


output {
  stdout {}
  elasticsearch {
    ### ES addresses; multiple hosts are allowed
    hosts => ["localhost:9200"]
    action => "index"
    ### monthly index derived from the event's @timestamp; created automatically if missing (requires auto-create to be enabled in ES)
    index => "test_log_index-%{+YYYY-MM}"
  }
}

Logstash stdout output:

{
          "type" => "type1",
      "datetime" => "2020-02-19 17:09:31.829",
    "@timestamp" => 2020-02-19T09:09:31.829Z,
      "@version" => "1",
        "thread" => "main",
          "host" => "DESKTOP-O93E7VQ",
           "msg" => "o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]\r",
          "path" => "E:/home/wxp/box/task/box-task-info.log",
      "loglevel" => "INFO",
       "message" => "2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]\r"
}
{
          "type" => "type1",
      "datetime" => "2020-02-19 17:09:31.829",
    "@timestamp" => 2020-02-19T09:09:31.829Z,
      "@version" => "1",
        "thread" => "main",
          "host" => "DESKTOP-O93E7VQ",
           "msg" => "o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]\r",
          "path" => "E:/home/wxp/box/task/box-task-info.log",
      "loglevel" => "INFO",
       "message" => "2020-02-19 17:09:31.829 [main] INFO  o.a.c.c.StandardService - [log,173] - Stopping service [Tomcat]\r"
}

As the output shows, the loglevel and thread fields have been added, and the index is now generated per year and month.
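The %{+YYYY-MM} suffix in the index name is expanded by the elasticsearch output from the event's @timestamp (in UTC), so each event lands in a monthly index. A minimal Python illustration of the naming:

```python
from datetime import datetime, timezone

# @timestamp of the event above, in UTC
ts = datetime(2020, 2, 19, 9, 9, 31, 829000, tzinfo=timezone.utc)

# index => "test_log_index-%{+YYYY-MM}" expands to:
index = "test_log_index-" + ts.strftime("%Y-%m")
print(index)  # test_log_index-2020-02
```

One caveat: in the Joda notation Logstash uses, YYYY is the week-based year, which can differ from yyyy around the first and last days of a year; yyyy is the safer choice for calendar-based indices.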

6. Matching Spring Boot logs with the grok plugin

input { 
    stdin { } 
    file {
        # log files inside the container
        path => ["/usr/share/logstash/logs/*.log"]
        # multiline handling, option 1: at the codec level
        codec => multiline {
            pattern => "^(%{TIMESTAMP_ISO8601})"
            negate => true
            what => "previous"
        }
        # "NUL" disables sincedb persistence on Windows (use "/dev/null" on Linux)
        sincedb_path => "NUL"
        type => "spring"
        start_position => "beginning"
    }
}
 
filter {
    if [type] == "spring" {
        # multiline handling, option 2: a multiline filter (commented out here)
       # multiline {
            # pattern => "^(%{TIMESTAMP_ISO8601})"
            # negate => true
            # what => "previous"
       # }
        grok {
            # Do multiline matching with (?m), as the multiline handling above may leave newlines in the log messages.
            match => [ "message", "(?m)^%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}%{NUMBER:pid}%{SPACE}---%{SPACE}%{SYSLOG5424SD:threadName}%{SPACE}%{NOTSPACE:loggerName}%{SPACE}:%{SPACE}%{GREEDYDATA:message}" ]
            # overwrite the original message field with the extracted one
            overwrite => [ "message" ]
        }
    }
}
 
 
output {
    if [type] == "spring" {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "springboot-%{+YYYY-MM-dd}"
        }
    }
	stdout { codec => rubydebug }
}

Log format being matched:

2020-05-15 17:54:50.805 DEBUG 8296 --- [scheduling-1] org.jooq.tools.LoggerListener            : Executing query          : select id from user
2020-05-15 17:54:52.945 DEBUG 1012 --- [scheduling-1] org.jooq.tools.LoggerListener            : Executing query          : select id from user

The format above is only an example; the format in a real project may differ, and the pattern should be adjusted accordingly.
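This Spring Boot grok expression can also be checked offline. In the Python sketch below, the grok references (%{TIMESTAMP_ISO8601}, %{SPACE}, %{NUMBER}, %{SYSLOG5424SD}, %{NOTSPACE}, %{GREEDYDATA}) are replaced with simplified plain-regex equivalents:

```python
import re

line = ("2020-05-15 17:54:50.805 DEBUG 8296 --- [scheduling-1] "
        "org.jooq.tools.LoggerListener            : "
        "Executing query          : select id from user")

# Simplified plain-regex version of the Spring Boot grok expression
pat = re.compile(
    r"(?m)^(?P<timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s*"
    r"(?P<logLevel>\w+)\s*(?P<pid>\d+)\s*---\s*"
    r"(?P<threadName>\[[^\]]*\])\s*(?P<loggerName>\S+)\s*:\s*(?P<message>.*)"
)
m = pat.match(line)
print(m.group("loggerName"), m.group("pid"))
```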
