1: Basic configuration
The basic configuration below is enough to read the log file and parse each line.
input {
  file {
    path => "/var/log/logstash/info.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} %{NUMBER:pid} --- \[%{DATA:thread}\] %{JAVACLASS:package} : %{GREEDYDATA:message}" }
  }
}
output {
  elasticsearch {
    hosts => ["http://172.17.0.2:9200"]
    index => "elk"
  }
  stdout {
    codec => rubydebug
  }
}
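However, the parsed output is not yet clean.
2: Known issues
2.1: Fix: the log file is re-read on every start
Choose one of the following inside the file block:
sincedb_path => "/dev/null" # read position is not saved, so the log file is re-read from the start every time
sincedb_path => "/var/log/logstash/sincedb" # read position is saved, so reading resumes where the previous run stopped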
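As a minimal sketch, the input block with a persistent sincedb file could look like the following. The path and sincedb location are the ones used above. Note that start_position only applies to files Logstash has not seen before; once a position is recorded in the sincedb file, reading resumes from there.

```
input {
  file {
    path => "/var/log/logstash/info.log"
    # Only takes effect the first time this file is seen
    start_position => "beginning"
    # Read position is persisted here between restarts
    sincedb_path => "/var/log/logstash/sincedb"
  }
}
```

The Logstash process needs write permission on the sincedb path for the position to be saved.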
2.2: Fix: message is an array (duplicated/redundant)
Symptom:
{
"path" => "/var/log/logstash/info.log",
"package" => "o.s.b.c.embedded.FilterRegistrationBean",
"timestamp" => "2014-03-05 10:57:51.702",
"loglevel" => "INFO",
"message" => [
[0] "2014-03-05 10:57:51.702 INFO 45469 --- [ost-startStop-1] o.s.b.c.embedded.FilterRegistrationBean : info日志第一行\r",
[1] "info日志第一行\r"
],
"pid" => "45469",
"thread" => "ost-startStop-1",
"@timestamp" => 2024-08-11T03:17:59.712Z,
"@version" => "1",
"host" => "b10e967d1b3d"
}
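Note that message is an array: the first element is the full raw log line, and the second element is the actual log text captured by grok. This happens because the pattern captures into a field named message, which already exists on the event, so grok appends the captured value instead of replacing the original.
2.2.1: Solution 1
Inside the grok block under filter, add overwrite => ["message"] so that the captured value overwrites the existing field.
With this option, the event is parsed as follows: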
{
"path" => "/var/log/logstash/info.log",
"package" => "o.s.b.c.embedded.FilterRegistrationBean",
"timestamp" => "2014-03-05 10:57:51.702",
"loglevel" => "INFO",
"message" => "info日志第一行\r",
"pid" => "45469",
"thread" => "ost-startStop-1",
"@timestamp" => 2024-08-11T03:17:59.712Z,
"@version" => "1",
"host" => "b10e967d1b3d"
}
Now message holds only the log text.
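For reference, a sketch of the filter block with the overwrite option in place, reusing the grok pattern from the basic configuration:

```
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} %{NUMBER:pid} --- \[%{DATA:thread}\] %{JAVACLASS:package} : %{GREEDYDATA:message}" }
    # Replace the existing message field with the captured value
    # instead of appending to it
    overwrite => ["message"]
  }
}
```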
2.2.2: Solution 2
Do not use message as the capture name in %{GREEDYDATA:message}.
Any other name works, for example %{GREEDYDATA:log_message}; the event then has two fields, message and log_message.
The result looks like this:
{
"path" => "/var/log/logstash/info.log",
"package" => "o.s.b.c.embedded.FilterRegistrationBean",
"timestamp" => "2014-03-05 10:57:51.702",
"loglevel" => "INFO",
"message" => "2014-03-05 10:57:51.702 INFO 45469 --- [ost-startStop-1] o.s.b.c.embedded.FilterRegistrationBean : info日志第一行\r",
"log_message" => "info日志第一行\r",
"pid" => "45469",
"thread" => "ost-startStop-1",
"@timestamp" => 2024-08-11T03:17:59.712Z,
"@version" => "1",
"host" => "b10e967d1b3d"
}
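A sketch of the corresponding filter block is shown below. Only the last capture name differs from the basic configuration. The commented-out remove_field line is an optional addition, not part of the setup above: it would drop the raw line once grok succeeds, if keeping both fields is not wanted.

```
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} %{NUMBER:pid} --- \[%{DATA:thread}\] %{JAVACLASS:package} : %{GREEDYDATA:log_message}" }
    # Optional (assumption): discard the raw line after a successful match
    # remove_field => ["message"]
  }
}
```

Compared with Solution 1, this keeps the original line available in message, at the cost of storing the log text twice.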
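2.3: Fix: strip the trailing carriage return from the log line
Use the gsub option of the mutate filter to remove \r from the message field.
Usage: add the following inside the filter block:
mutate {
  gsub => [
    # Each rule takes three array elements: the field name, a regular expression, and the replacement text
    # Here every \r in the message field is replaced with an empty string
    "message", "\r", "",
    # To remove \n as well, add another rule
    "message", "\n", ""
  ]
}
Parsed result: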
{
"@timestamp" => 2024-08-11T06:07:08.744Z,
"pid" => "45469",
"thread" => "ost-startStop-1",
"message" => "info日志第一行",
"path" => "/var/log/logstash/info.log",
"@version" => "1",
"host" => "b10e967d1b3d",
"timestamp" => "2014-03-05 10:57:51.702",
"loglevel" => "INFO",
"package" => "o.s.b.c.embedded.FilterRegistrationBean"
}
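Putting the fixes together, a complete filter block might look like the sketch below. It assumes Solution 1 (overwrite) is used for the message field; with Solution 2, the gsub rules would target log_message instead.

```
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:loglevel} %{NUMBER:pid} --- \[%{DATA:thread}\] %{JAVACLASS:package} : %{GREEDYDATA:message}" }
    overwrite => ["message"]
  }
  # Filters run in the order they are listed, so mutate
  # cleans the message value that grok has already overwritten
  mutate {
    gsub => [
      "message", "\r", "",
      "message", "\n", ""
    ]
  }
}
```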