check_logfiles is a Nagios plugin that scans log files for keywords (patterns), and it is quite powerful. The project page is https://labs.consol.de/nagios/check_logfiles/
I. Installation
tar -zxvf check_logfiles-2.3.1.2.tar.gz
cd check_logfiles-2.3.1.2
./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-seekfiles-dir=/usr/local/nagios/var/tmp --with-protocols-dir=/usr/local/nagios/var/tmp --with-trusted-path=/sbin:/usr/sbin:/usr/local/sbin:/bin:/usr/bin:/usr/local/nagios/libexec --with-perl=/usr/bin/perl --with-gzip=/bin/gzip
make
make install
II. Configuration
Using check_logfiles:
[root@WEBServer10414 libexec]# ./check_logfiles --help
This Nagios Plugin comes with absolutely NO WARRANTY. You may use
it on your own risk!
Copyright by ConSol Software GmbH, Gerhard Lausser.
This plugin looks for patterns in logfiles, even in those who were rotated
since the last run of this plugin.
Usage: check_logfiles [-t timeout] -f <configfile>
The configfile looks like this:
$seekfilesdir = '/opt/nagios/var/tmp'; # directory for state information; it records what has already been checked, i.e. the history
# where the state information will be saved.
$protocolsdir = '/opt/nagios/var/tmp'; # directory for protocol information; it records the lines that matched
# where protocols with found patterns will be stored.
$scriptpath = '/opt/nagios/var/tmp'; # directory with the scripts or programs that can be called
# where scripts will be searched for.
$MACROS = { CL_DISK01 => "/dev/dsk/c0d1", CL_DISK02 => "/dev/dsk/c0d2" }; # user-defined macros
@searches = ( # the searches; they can be defined in a config file or directly on the command line, and a config file is more convenient
  {
    tag => 'temperature', # unique identifier; it is only used as part of the state and protocol file names
    logfile => '/var/adm/syslog/syslog.log', # location of the log file
    rotation => 'bmwhpux', # how to find the archived log files if the log is rotated
    criticalpatterns => ['OVERTEMP_EMERG', 'Power supply failed'], # critical errors; one or more regular expressions
    warningpatterns => ['OVERTEMP_CRIT', 'Corrected ECC Error'], # warnings; one or more regular expressions
    options => 'script,protocol,nocount', # option list; here: run a script, write a protocol, do not count
    script => 'sendnsca_cmd' # name of the script
  },
  {
    tag => 'scsi',
    logfile => '/var/adm/messages',
    rotation => 'solaris',
    criticalpatterns => 'Sense Key: Not Ready',
    criticalexceptions => 'Sense Key: Not Ready /dev/testdisk',
    options => 'noprotocol'
  },
  {
    tag => 'logins',
    logfile => '/var/adm/messages',
    rotation => 'solaris',
    criticalpatterns => ['illegal key', 'read error.*$CL_DISK01$'],
    criticalthreshold => 4,
    warningpatterns => ['read error.*$CL_DISK02$'],
  }
);
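The example above puts all settings into a configuration file. They can also be passed on the command line. The two ways of calling the plugin are: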
[root@WEBServer10414 libexec]# ./check_logfiles
Usage: check_logfiles [-t timeout] -f <configfile> [--searches=tag1,tag2,...]
       check_logfiles [-t timeout] --logfile=<logfile> --tag=<tag> --rotation=<rotation>
                      --criticalpattern=<regexp> --warningpattern=<regexp>
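To make the roles of criticalpatterns, criticalexceptions and warningpatterns more concrete, here is a minimal Python sketch of how a single log line could be classified. It is an illustration under assumptions, not the plugin's actual Perl code: the function name classify_line and the rule that critical is tested before warning are hypothetical.

```python
import re

def classify_line(line, critical=(), critical_exceptions=(),
                  warning=(), warning_exceptions=()):
    """Return 'CRITICAL', 'WARNING' or 'OK' for one log line.

    Hypothetical helper: a pattern hit is discarded when an exception
    pattern of the same level also matches the line.
    """
    def hit(patterns, exceptions):
        matched = any(re.search(p, line) for p in patterns)
        excepted = any(re.search(e, line) for e in exceptions)
        return matched and not excepted

    if hit(critical, critical_exceptions):
        return "CRITICAL"
    if hit(warning, warning_exceptions):
        return "WARNING"
    return "OK"

# Patterns taken from the 'scsi' search in the sample configuration.
scsi = dict(critical=["Sense Key: Not Ready"],
            critical_exceptions=["Sense Key: Not Ready /dev/testdisk"])
real_disk = classify_line("scsi: Sense Key: Not Ready /dev/dsk/c0d1", **scsi)
test_disk = classify_line("scsi: Sense Key: Not Ready /dev/testdisk", **scsi)
print(real_disk, test_disk)
```

In this sketch the second line is not reported, because the exception pattern for /dev/testdisk matches it as well.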
III. Production example
1. On the monitored host, create a configuration file as follows:
vim /usr/local/nagios/var/catalina.cfg
$seekfilesdir = "/usr/local/nagios/var/tmp";
$protocolsdir = "/usr/local/nagios/var/tmp";
@searches = (
  {
    tag => 'tomcat',
    logfile => '/opt/tomcat7/logs/catalina.out',
    rotation => 'catalina.$CL_DATE_YYYY$-$CL_DATE_MM$-$CL_DATE_DD$.log',
    criticalpatterns => [
      'java.net.SocketTimeoutException',
      'Exception'
    ],
    warningpatterns => [
    ],
    options => 'nocase,encoding=UTF-8,criticalthreshold=1,warningthreshold=1'
  },
);
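This defines a search with the tag tomcat that watches the log file /opt/tomcat7/logs/catalina.out. When a log line matches one of the entries in criticalpatterns, a critical error is reported. State and protocol information is written to /usr/local/nagios/var/tmp.
$CL_DATE_YYYY$-$CL_DATE_MM$-$CL_DATE_DD$ are date macros; here they make the rotation value match the archived log for the current date. In options, nocase makes the regular expressions case-insensitive, and criticalthreshold=1,warningthreshold=1 set how many matches are needed before one is counted.
With a value of 3, for example, the first 2 matches are ignored and only the 3rd one is counted. With the value 1 used here nothing is ignored, so every match is counted.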
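The threshold rule can be sketched in a few lines of Python. This is only a model of the behaviour described above; the function name count_with_threshold is hypothetical, and the assumption that the counter starts over after each counted match is mine, not something the configuration shown here confirms.

```python
def count_with_threshold(matches, threshold=1):
    """Return how many of `matches` raw pattern hits end up being counted.

    Assumption: the hit counter restarts after every counted match,
    so with threshold=3 every 3rd hit is counted.
    """
    counted = 0
    pending = 0
    for _ in range(matches):
        pending += 1
        if pending >= threshold:
            counted += 1
            pending = 0
    return counted

for raw_hits in (2, 3, 7):
    print(raw_hits, "hits with threshold 3 ->", count_with_threshold(raw_hits, 3))
print(5, "hits with threshold 1 ->", count_with_threshold(5, 1))
```

Under these assumptions, a threshold of 1 behaves as if no threshold were set at all.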
2. In the /usr/local/nagios/libexec directory, run the plugin against the configuration file. It runs normally and finds no errors in the log.
[root@WEBServer10414 libexec]# ./check_logfiles --config /usr/local/nagios/var/catalina.cfg
OK - no errors or warnings|tomcat.catalina.out_lines=192 tomcat.catalina.out_warnings=0 tomcat.catalina.out_criticals=0 tomcat.catalina.out_unknowns=0
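The part after the | character is performance data in label=value form. If you want to post-process it, a small parser along the following lines may be enough. It is a sketch that assumes plain integer values, as in the output above; the function name parse_plugin_output is hypothetical and is not part of check_logfiles.

```python
def parse_plugin_output(output):
    """Split one plugin output line into its status text and perfdata.

    Assumes space-separated label=value pairs with integer values and
    no units or thresholds attached.
    """
    text, _, perfdata = output.partition("|")
    metrics = {}
    for item in perfdata.split():
        label, _, value = item.partition("=")
        metrics[label] = int(value)
    return text.strip(), metrics

sample = ("OK - no errors or warnings|tomcat.catalina.out_lines=192 "
          "tomcat.catalina.out_warnings=0 tomcat.catalina.out_criticals=0 "
          "tomcat.catalina.out_unknowns=0")
status_text, metrics = parse_plugin_output(sample)
print(status_text)
print(metrics["tomcat.catalina.out_lines"])
```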
3. Look at the file catalina._opt_tomcat7_logs_catalina.out.tomcat that was created in /usr/local/nagios/var/tmp.
The trailing tomcat in the file name is the tag we configured. The file content is:
[root@WEBServer10414 tmp]# cat catalina._opt_tomcat7_logs_catalina.out.tomcat
$state = {
'logoffset' => 166891197,
'devino' => '2053:27754546',
'servicestateid' => 0,
'logtime' => 1470399570,
'serviceoutput' => ''
};
1;
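The logoffset and devino fields suggest how the plugin avoids re-reading old lines: it remembers the byte offset it reached and which file it was reading. The Python sketch below shows that idea in a simplified form. It is an assumption-laden illustration, not the plugin's code: read_new_lines is a hypothetical name, devino is taken to mean device:inode, and rotated archives are not searched here.

```python
import os
import tempfile

def read_new_lines(path, state):
    """Return (new_lines, new_state) for a log file.

    Reading resumes at state['logoffset'] unless the file was replaced
    (different device:inode) or truncated (smaller than the offset).
    """
    st = os.stat(path)
    devino = f"{st.st_dev}:{st.st_ino}"
    offset = state.get("logoffset", 0)
    if state.get("devino") != devino or st.st_size < offset:
        offset = 0  # new or truncated file: start from the beginning
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
        new_offset = fh.tell()
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines, {"logoffset": new_offset, "devino": devino}

# Demo on a throwaway file: the second call only sees the appended line.
with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "catalina.out")
    with open(log, "w") as fh:
        fh.write("line 1\nline 2\n")
    first, state = read_new_lines(log, {})
    with open(log, "a") as fh:
        fh.write("line 3\n")
    second, state = read_new_lines(log, state)
print(first, second)
```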
4. On the monitored host, add the check command to nrpe.cfg
command[check_tomcat_logfiles]=/usr/local/nagios/libexec/check_logfiles --config /usr/local/nagios/var/catalina.cfg
5. On the monitoring server, define the service for log keyword monitoring
define service{
    use                     local-service,srv-pnp   ; Name of service template to use
    host_name               WEBServer10414
    service_description     Tomcat Front End Log Keyword Monitoring
    check_command           check_nrpe_arg!check_tomcat_logfiles!60!/usr/local/nagios/var/catalina.cfg
    notifications_enabled   1
}
6. Definition of the check_nrpe_arg command
# 'check_nrpe_arg' command definition
define command {
    command_name    check_nrpe_arg
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t $ARG2$ -a $ARG3$
}
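To see which command line results from the service and command definitions above, the macro substitution can be imitated in Python. This is a rough model, not Nagios code: expand_command is a hypothetical helper, the $USER1$ value is assumed to be the libexec directory used earlier, and 192.0.2.10 is a placeholder address.

```python
def expand_command(command_line, macros):
    """Replace every $NAME$ macro in command_line with its value."""
    for name, value in macros.items():
        command_line = command_line.replace(f"${name}$", value)
    return command_line

check_command = "check_nrpe_arg!check_tomcat_logfiles!60!/usr/local/nagios/var/catalina.cfg"
command_name, *args = check_command.split("!")

# The parts after the command name become $ARG1$, $ARG2$, $ARG3$.
macros = {f"ARG{i}": arg for i, arg in enumerate(args, start=1)}
macros["USER1"] = "/usr/local/nagios/libexec"  # assumption: plugin directory
macros["HOSTADDRESS"] = "192.0.2.10"           # placeholder address

expanded = expand_command(
    "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t $ARG2$ -a $ARG3$", macros)
print(expanded)
```

Note that the nrpe.cfg command shown in step 4 has the --config path hardcoded and does not reference $ARG1$, so the third argument does not appear to be used on the monitored host.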
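7. Check the service information displayed in Nagios
Note: the file /usr/local/nagios/var/catalina.cfg, the directory /usr/local/nagios/var/tmp and the state files created in it must be owned by nagios. Otherwise the plugin reports an error that it has no permission to write.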
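A quick way to spot wrong ownership is to compare each path's owner with the expected uid. The following Python sketch does this. The helper name paths_not_owned_by is hypothetical, and the demo uses a temporary file and the current user only so that it runs anywhere.

```python
import os
import tempfile

def paths_not_owned_by(paths, uid):
    """Return the paths whose owner differs from the given numeric uid."""
    return [p for p in paths if os.stat(p).st_uid != uid]

# Demo: a file we just created belongs to the current user.
with tempfile.NamedTemporaryFile() as tmp:
    mismatched = paths_not_owned_by([tmp.name], os.getuid())
print(mismatched)
```

On the monitored host you would pass the uid of the nagios user (for example from pwd.getpwnam("nagios").pw_uid) together with /usr/local/nagios/var/catalina.cfg, /usr/local/nagios/var/tmp and the state files in that directory.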
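This article is reposted from the server operations blog on 51CTO. Original link: http://blog.51cto.com/shamereedwine/1834872. If you want to repost it, please contact the original author.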