Nagios starts fine but cannot send mail

1. Check that sendmail (or whatever mail service you use) can actually send mail. If sendmail is abnormally slow to send, see my network-services article: http://blog.csdn.net/miltonzhong/article/details/10951347
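As a rough sketch of this first check, the helper below pipes a one-line body into a mail client the same way the Nagios notification commands do. The function name send_test_mail, the subject line and the example recipient are my own placeholders, not part of Nagios or sendmail.

```shell
#!/bin/sh
# Hypothetical helper: send a one-line test message through a given
# mail client, mimicking "printf ... | mail -s ..." from commands.cfg.
# Usage: send_test_mail <path-to-mail-client> <recipient>
send_test_mail() {
    mailer=$1
    rcpt=$2
    # Fail early with 127 (the shell's "command not found" status)
    # if the client is missing or not executable.
    if [ ! -x "$mailer" ]; then
        echo "mail client not found or not executable: $mailer" >&2
        return 127
    fi
    printf '%b' "Test message from $(hostname)\n" | "$mailer" -s "nagios mail test" "$rcpt"
}

# Example (not run automatically; the recipient is a placeholder):
#   send_test_mail /usr/bin/mail you@example.com && tail /var/log/maillog
```

If the message shows up in the mail log with stat=Sent, the mail service itself is working and the problem lies on the Nagios side.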

2. Make sure the Nagios configuration files are correct


I checked the sendmail log, /var/log/maillog, and found only the record for the mail I had sent by hand, with no other deliveries. This was the only entry:

Jul 27 14:27:48 nagios sm-mta[37141]: m6RERkYR037139: to=< sery@163.com>, ctladdr=< nagios@nagios.sery.com> (1003/1003), delay=00:00:02, xdelay=00:00:01, mailer=esmtp, pri=30623, relay=163mx02.mxmail.netease.com. [220.181.12.66], dsn=2.0.0, stat=Sent (Mail OK queued as mx16,QsCowLDbPSxWFYxIb6TzGw==.27600S2 1217140055)

So it looks like Nagios never got as far as handing a message to sendmail.
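One way to make this comparison less of an eyeball exercise is to count how many maillog entries were submitted by the nagios user and set that against the number of notifications Nagios claims to have sent. This is only a sketch: count_mail_from is a name I made up, and it assumes the sendmail-style ctladdr= field shown in the entry above.

```shell
#!/bin/sh
# Hypothetical helper: count maillog delivery lines whose controlling
# address (ctladdr) belongs to a given local user.
# Usage: count_mail_from <user> <logfile>
count_mail_from() {
    user=$1
    log=$2
    # "< *" tolerates an optional space after "<", as in the excerpt above.
    grep -c "ctladdr=< *${user}@" "$log"
}

# Example (not run automatically):
#   count_mail_from nagios /var/log/maillog
```

A count that only reflects your own manual tests, while the Nagios log keeps recording notifications, suggests the notification command fails before it reaches the mail service.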
 
I almost forgot that Nagios keeps its own log too. I opened it right away and found quite a few Warnings. Here is one of them:
[1217166816] HOST NOTIFICATION: sery;mail-server;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 10 seconds
[1217166816] Warning: Attempting to execute the command "/usr/bin/printf "%b" "***** Nagios 2.9 *****\n\nNotification Type: PROBLEM\nHost: mail-server\nState: DOWN\nAddress: 211.155.115.66\nInfo: CRITICAL - Plugin timed out after 10 seconds\n\nDate/Time: Sun Jul 27 13:53:36 UTC 2008\n" | /bin/mail -s "Host DOWN alert for mail-server!" sery@163.com" resulted in a return code of 127.  Make sure the script or binary you are trying to execute actually exists...
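The other entries look much the same. The most useful part is the end of the message: in short, a binary or script named in the command could not be executed. Return code 127 is the status a shell reports when it cannot find the command. Only two executables appear in this entry, printf and mail, so I ran each one by hand exactly as written:
(1) /usr/bin/printf "%b" "***** Nagios 2.9 *****\n" prints ***** Nagios 2.9 *****, which is the expected result.
(2) /bin/mail -s "Host DOWN alert for mail-server!" sery@163.com prints su: /bin/mail: No such file or directory, meaning there is nothing at that path. But I had just sent mail by hand, so the mail client is clearly installed. The path is probably wrong: /bin/mail is where mail lives on Linux. To find where it lives on FreeBSD I ran find / -name, which showed that the FreeBSD path is /usr/bin/mail.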
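The manual check above can be scripted. The sketch below tests each absolute path from a notification command for an executable file; check_binaries is a hypothetical name of mine, not a Nagios tool.

```shell
#!/bin/sh
# Hypothetical helper: report whether each path given is an executable
# file. Returns 1 if any of them is missing, 0 otherwise.
check_binaries() {
    missing=0
    for bin in "$@"; do
        if [ -x "$bin" ]; then
            echo "ok       $bin"
        else
            echo "missing  $bin"
            missing=1
        fi
    done
    return "$missing"
}

# The shell itself reports a missing binary with exit status 127,
# which is the return code Nagios logged:
#   sh -c '/no/such/dir/mail -s test nobody@example.invalid'; echo $?
#
# Example, using the two paths from the warning (not run automatically):
#   check_binaries /usr/bin/printf /bin/mail
```

On the FreeBSD box described here, the second path would be the one reported as missing.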
 
Now we know the root cause of the missing mail. Next, in the Nagios configuration file commands.cfg, I replaced "/bin/mail" with "/usr/bin/mail" in both host-notify-by-email and service-notify-by-email. The full definitions are:
# 'host-notify-by-email' command definition
define command{
        command_name    host-notify-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios 2.9 *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "Host $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$
        }
# 'service-notify-by-email' command definition
define command{
        command_name    service-notify-by-email
        command_line    /usr/bin/printf "%b" "***** Nagios 2.9 *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
        }
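If you would rather not edit the file by hand, the replacement can be sketched with sed. fix_mail_path is a name I invented; it assumes the mail client is always invoked as "| /bin/mail " (pipe, space, path, space) as in the definitions above, and it writes through a backup file instead of using sed -i, whose syntax differs between FreeBSD and GNU sed.

```shell
#!/bin/sh
# Hypothetical helper: rewrite "| /bin/mail " to "| /usr/bin/mail " in a
# Nagios command file, keeping a .bak copy of the original.
# Matching the leading "| " leaves lines that already use
# /usr/bin/mail untouched.
# Usage: fix_mail_path <path-to-commands.cfg>
fix_mail_path() {
    cfg=$1
    cp "$cfg" "$cfg.bak" || return 1
    sed 's#| /bin/mail #| /usr/bin/mail #g' "$cfg.bak" > "$cfg"
}

# Example (not run automatically; the etc/ paths are assumptions based
# on the /usr/local/nagios prefix seen in this post):
#   fix_mail_path /usr/local/nagios/etc/commands.cfg
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
```

The second example command uses the -v option of the nagios binary to verify the configuration before restarting.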
After editing commands.cfg I restarted Nagios and checked the Nagios log again. The "Make sure the script or binary you are trying to execute actually exists..." error was gone, and there was now a record of an alert mail being sent:
[root@nagios /usr/local/nagios/var]# tail -f nagios.log
[1217170467] SERVICE ALERT: mail-server;check_tcp 995;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[1217170534] Auto-save of retention data completed successfully.
[1217170577] HOST ALERT: mail-server;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[1217170587] HOST ALERT: mail-server;DOWN;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[1217170597] HOST ALERT: mail-server;DOWN;SOFT;3;CRITICAL - Plugin timed out after 10 seconds
[1217170607] HOST ALERT: mail-server;DOWN;SOFT;4;CRITICAL - Plugin timed out after 10 seconds
[1217170607] HOST ALERT: mail-server;UP;SOFT;5;PING OK - Packet loss = 0%, RTA = 111.63 ms
[1217170607] SERVICE ALERT: mail-server;check_tcp 995;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
[1217170687] SERVICE ALERT: mail-server;check_tcp 995;OK;SOFT;3;TCP OK - 3.137 second response time on port 995
[1217171057] SERVICE NOTIFICATION: sery;fav-0;check_tcp 443;CRITICAL;service-notify-by-email;CRITICAL - Socket timeout after 10 seconds
 
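I checked my inbox straight away and, sure enough, the long-awaited alert had arrived in my 163 mailbox. Looking back at the mail log, /var/log/maillog, the delivery was recorded there as well.
There is of course more to configure, such as the contacts (the Nagios configuration files all reference one another, so keep an eye on that).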
 
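The next point may be the most important one, because I have solved this problem with it myself: a Nagios server I set up on an internal network ended up sending mail promptly as well.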


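I recently set up a new Nagios system on the public network. For the first few days it sent mail notifications whenever something went wrong, but lately the alerts stopped arriving, even though sending mail by hand on the server still worked. Checking the sendmail and Nagios logs, I found that the sendmail log showed only message information and the messages never entered the mail queue, while the Nagios log reported the following warning: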

[1292174436] Warning: Contact 'wahaha' service notification command '/usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: PROBLEM\n\nService: /boot\nHost: hostname\nAddress: 192.168.3.11\nState: CRITICAL\n\nDate/Time: Mon Dec 13 01:20:05 CST 2010\n\nAdditional Info:\n\nDISK CRITICAL - free space: /boot 8 MB (8% inode=99%):" | /bin/mail -s "** PROBLEM Service Alert: hostname//boot is CRITICAL **" wahaha@163.com' timed out after 30 seconds

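From this log it looks like the send timed out; that is, the configured notification timeout was shorter than the time sendmail needed to get the message out. From there the fix is easy: in the Nagios configuration file nagios.cfg, change notification_timeout=30 to notification_timeout=120 and restart Nagios. After that the alert mails arrived, and the problem was solved.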
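The timeout change can also be sketched as a small script. set_notification_timeout is a hypothetical helper of mine; it assumes the directive sits at the start of a line in nagios.cfg in the form notification_timeout=N, and it keeps a .bak copy instead of relying on sed -i.

```shell
#!/bin/sh
# Hypothetical helper: set notification_timeout in nagios.cfg to the
# given number of seconds, keeping a .bak copy of the original file.
# Usage: set_notification_timeout <path-to-nagios.cfg> <seconds>
set_notification_timeout() {
    cfg=$1
    secs=$2
    cp "$cfg" "$cfg.bak" || return 1
    sed "s/^notification_timeout=.*/notification_timeout=${secs}/" "$cfg.bak" > "$cfg"
}

# Example (not run automatically; the path is an assumption based on
# the /usr/local/nagios prefix seen earlier in this post):
#   set_notification_timeout /usr/local/nagios/etc/nagios.cfg 120
```

120 seconds is simply the value that worked here. If the mail service needs that long to accept a message, it is worth finding out why it is slow as well.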


