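Introduction
In "Real-time monitoring of Java logs with nagios + logstash (Part 1)" we configured the Java application to send its logs to logstash on port 4800 for collection. That approach has the following problems:
1. If the volume of INFO-level logs is large, writes from the Java application to the logstash port become slow and block, which makes the Java application misbehave;
2. The logstash process may go down, in which case the Java application cannot write its logs and the service fails.
Given these two problems, we change the approach: the Java application writes ERROR-level logs to a separate file, and logstash watches that file and works with nagios to raise alerts. This way the Java application is unaffected even if logstash goes down.
Implementation
1. Configure the log4j output of the Java application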
#LogStash
log4j.appender.LogStash=org.apache.log4j.DailyRollingFileAppender
log4j.appender.LogStash.File=logs/logstash-service.log
log4j.appender.LogStash.Threshold = ERROR
log4j.appender.LogStash.layout=org.apache.log4j.PatternLayout
log4j.appender.LogStash.layout.ConversionPattern=%-d{yyyy-MM-dd HH:mm:ss} [%t:%r] - [%p] %m%n
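Note: Threshold=ERROR restricts this appender to events of level ERROR and above. The appender must also be attached to a logger (for example through log4j.rootLogger) before it receives any events.
2. logstash configuration files
(1) Define custom grok patterns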
mkdir -p /usr/local/logstash/patterns/
cd /usr/local/logstash/patterns/
vim java
JETTYDATE %{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{HOUR}:?%{MINUTE}(?::?%{SECOND})
JAVATHREAD1 (?:[A-Za-z]{1,10}[\d]+-[\d]+)
JAVAACCESS %{JETTYDATE:timestamp} \[%{JAVATHREAD1:thread}:%{INT:time}\] - \[%{LOGLEVEL:level}\] %{GREEDYDATA:message}
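The patterns above can be adapted from the ones that ship with logstash, found in the following files:
# Java patterns; the same directory also holds patterns for ruby, redis, haproxy and others
/usr/local/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.2/patterns/java
# Basic patterns; the java, ruby and other pattern files are built by combining these
/usr/local/logstash/vendor/bundle/jruby/1.9/gems/logstash-patterns-core-2.0.2/patterns/grok-patterns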
If the regular-expression syntax is unclear, refer to the regex syntax of perl or ruby.
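To make the composition concrete, the following Python sketch expands JAVAACCESS by hand into a single regular expression with named groups. The sub-patterns are simplified stand-ins for the real grok definitions (an assumption made for readability), and the sample log line and the parse_line helper are hypothetical.

```python
import re

# Simplified stand-ins for the grok building blocks. These are assumptions,
# not the exact definitions shipped in the grok-patterns file.
JETTYDATE = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
JAVATHREAD1 = r"[A-Za-z]{1,10}\d+-\d+"
LOGLEVEL = r"[A-Z]+"

# Hand-expanded equivalent of the JAVAACCESS pattern defined above.
JAVAACCESS = re.compile(
    rf"(?P<timestamp>{JETTYDATE}) "
    rf"\[(?P<thread>{JAVATHREAD1}):(?P<time>[+-]?\d+)\] - "
    rf"\[(?P<level>{LOGLEVEL})\] (?P<message>.*)"
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = JAVAACCESS.match(line)
    return match.groupdict() if match else None


# Hypothetical line in the layout produced by the ConversionPattern above.
sample = "2016-03-01 12:00:05 [qtp2070291850-78:123456] - [ERROR] Connection refused"
fields = parse_line(sample)
```

Each named group corresponds to one `%{PATTERN:field}` reference in the grok definition, which is roughly how grok turns a matching line into event fields.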
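Notes:
1. The Java patterns that ship with logstash may not match our log output format. For example:
My logs print the thread as "qtp2070291850-78", but the built-in JAVATHREAD pattern is "(?:[A-Z]{2}-Processor[\d]+)". Our thread name "qtp2070291850-78" does not contain "Processor", so I defined a custom pattern, JAVATHREAD1, as (?:[A-Za-z]{1,10}[\d]+-[\d]+), which matches our output.
2. To check whether a custom pattern is correct, you can test it at https://grokconstructor.appspot.com/do/match.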
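As a quick offline check, the two thread patterns can also be compared in Python, whose regex syntax is close enough to grok's for these particular expressions. This is only a sketch: the matches_fully helper and the sample names "TP-Processor12" and "main" are hypothetical.

```python
import re

# The two patterns quoted above, copied verbatim.
BUILTIN_JAVATHREAD = re.compile(r"(?:[A-Z]{2}-Processor[\d]+)")
CUSTOM_JAVATHREAD1 = re.compile(r"(?:[A-Za-z]{1,10}[\d]+-[\d]+)")


def matches_fully(pattern, text):
    """True if the whole thread name is matched by the pattern."""
    return pattern.fullmatch(text) is not None


jetty_thread = "qtp2070291850-78"
builtin_ok = matches_fully(BUILTIN_JAVATHREAD, jetty_thread)  # False
custom_ok = matches_fully(CUSTOM_JAVATHREAD1, jetty_thread)   # True
```

Note that JAVATHREAD1 requires letters, digits, a hyphen and more digits, so a thread named "main" would not match it. If such threads can log errors in your application, the pattern probably needs to be widened.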
(2) Configure logstash
input {
    file {
        path => "/data/jetty/logs/logstash-service.log"
        type => "jetty-access"
        codec => multiline {
            pattern => "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY}"
            negate => true
            what => "previous"
        }
        start_position => "beginning"
        sincedb_path => "/usr/local/logstash/sincedb"
    }
}
filter {
    grok {
        patterns_dir => "/usr/local/logstash/patterns"
        match => {
            message => "%{JAVAACCESS}"
        }
    }
}
output {
    # stdout {
    #     codec => "json"
    #     codec => "rubydebug"
    # }
    # file {
    #     path => "/logs/out.log"
    # }
    # Because only ERROR logs are monitored, the nagios service would stay in the
    # error (CRITICAL) state after the first alert. I therefore added a second if:
    # every ERROR alert is followed by a recovery (OK) notification.
    if [level] == "ERROR" {
        nagios_nsca {
            host => "192.168.1.1"
            port => "5667"
            message_format => "%{timestamp} %{thread} %{message}"
            send_nsca_bin => "/usr/local/nagios/bin/send_nsca"
            send_nsca_config => "/usr/local/nagios/etc/send_nsca.cfg"
            nagios_host => "192.168.1.2"
            nagios_service => "java service"
            nagios_status => 2
        }
    }
    # The type must match the one set in the file input ("jetty-access").
    if [type] == "jetty-access" {
        nagios_nsca {
            host => "192.168.1.1"
            port => "5667"
            message_format => "%{timestamp} %{thread} OK"
            send_nsca_bin => "/usr/local/nagios/bin/send_nsca"
            send_nsca_config => "/usr/local/nagios/etc/send_nsca.cfg"
            nagios_host => "192.168.1.2"
            nagios_service => "java service"
            nagios_status => 0
        }
    }
}
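The multiline codec in the input section is what keeps a Java stack trace attached to the log line that produced it: with negate => true and what => "previous", any line that does not start with a date is appended to the preceding event. The Python sketch below imitates that grouping rule only; the group_events function and the sample lines are hypothetical, and the date regex is a simplified stand-in for %{YEAR}-%{MONTHNUM}-%{MONTHDAY}.

```python
import re

# Simplified stand-in for "^%{YEAR}-%{MONTHNUM}-%{MONTHDAY}" (an assumption).
LINE_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def group_events(lines):
    """Merge lines that do not start with a date into the previous event."""
    events = []
    for line in lines:
        if LINE_START.search(line) or not events:
            # A dated line (or the very first line) opens a new event.
            events.append(line)
        else:
            # negate => true, what => "previous": append to the last event.
            events[-1] += "\n" + line
    return events


# Hypothetical log excerpt: one error with a stack trace, then another error.
raw_lines = [
    "2016-03-01 12:00:05 [qtp2070291850-78:10] - [ERROR] boom",
    "java.lang.RuntimeException: boom",
    "\tat com.example.Foo.bar(Foo.java:1)",
    "2016-03-01 12:00:06 [qtp2070291850-78:11] - [ERROR] again",
]
events = group_events(raw_lines)
```

Here the four input lines become two events, the first of which carries the whole stack trace. Unlike this sketch, the real codec has to wait for a following line before it can tell that the current event is complete, so the last error in the file may be reported with a delay.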
OK, that completes both approaches to monitoring Java logs with logstash.