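Following the CSDN blog tutorial "nagios安装与配置教程(详细版)【入门教程】" (Nagios installation and configuration tutorial, detailed edition, for beginners; Eye to eye's blog, 2020-11-16), install the Nagios server and client.
After changing any configuration, remember to run # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg to verify that none of the configuration files contain errors. If it reports an error, revert the change. I usually verify after every file I edit and move on to the next one only once the check passes; otherwise, when something breaks, I can't tell which file caused it. The verification output looks like this: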
[root@localhost etc]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.3.1
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 02-23-2017
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Checked 15 services.
Checked 4 hosts.
Checked 2 host groups.
Checked 0 service groups.
Checked 1 contacts.
Checked 1 contact groups.
Checked 25 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 4 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Two zeros here mean the configuration files are fine, so the next step is to restart httpd and nagios:
systemctl restart httpd
systemctl restart nagios
As long as the configuration verifies cleanly, the restart generally goes through without errors.
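The verify-then-restart routine above can be wrapped in a small script. This is only a sketch: the function names preflight_clean and verify_and_restart are my own, the paths are the ones used in this post, and it assumes the two "Total ..." summary lines look like the output shown above.

```shell
#!/usr/bin/env bash
# Sketch only: restart httpd and nagios when the pre-flight check is clean.
# NAGIOS_BIN / NAGIOS_CFG default to the paths used in this post.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"

# Read `nagios -v` output on stdin; succeed only if both totals are zero.
preflight_clean() {
    awk '
        /^Total Warnings:[ \t]*0[ \t]*$/ { w = 1 }
        /^Total Errors:[ \t]*0[ \t]*$/   { e = 1 }
        END { exit !(w && e) }
    '
}

# Run the check, print its output, and restart only on a clean result.
verify_and_restart() {
    local out
    out="$("$NAGIOS_BIN" -v "$NAGIOS_CFG" 2>&1)"
    printf '%s\n' "$out"
    if printf '%s\n' "$out" | preflight_clean; then
        systemctl restart httpd
        systemctl restart nagios
    else
        echo "Pre-flight check reported problems; services were not restarted." >&2
        return 1
    fi
}
```

Call verify_and_restart as root after each edit. Note that this sketch also refuses to restart on warnings, which is stricter than Nagios itself requires.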
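II. Check the nagios process on the monitoring server with ps -ef | grep nagios; the nagios processes should be listed.
Check port 5666 on the monitored host with netstat -nltp | grep 5666; port 5666 should be listening. On Windows the command is: netstat -ano | findstr "5666"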
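If you want the port check in a script rather than by eye, something like the following might do on Linux. port_listening is a name I made up; it only parses the text that netstat -nlt (or ss -lnt) prints, so it assumes the usual column layout of those tools.

```shell
# Read `netstat -nlt` (or `ss -lnt`) output on stdin and succeed if a
# listening socket is bound to the given port. Hypothetical helper.
port_listening() {
    awk -v port="$1" '
        /LISTEN/ {
            # The local address ends in ":<port>"; the peer column ends in ":*".
            for (i = 1; i <= NF; i++)
                if ($i ~ (":" port "$")) found = 1
        }
        END { exit !found }
    '
}
```

Usage on the monitored host: netstat -nlt | port_listening 5666 && echo "NRPE is listening"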
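III. Both the monitoring server and the monitored host need the Nagios-plugins package and NRPE installed. The difference is that on the monitored host the NRPE installation has two extra steps so that it can be monitored, whereas on the monitoring server NRPE only needs to be installed.
IV. A Windows host only needs one plugin: NSClient++
The download page is: "NSClient++ (Nagios monitoring system client) download V0.4.3.88, official latest version, with tutorial" on 西西软件下载.
During installation, just enter the addresses allowed to monitor the host and tick the checkboxes below that field. Then find the NSClient++ service in the Services console, tick "Allow service to interact with desktop" on the Log On tab, and restart the service.
V. To add a Windows host, run: cp windows.cfg winERP.cfg, then edit winERP.cfg to set the right IP address and the services you want to monitor, and add this line to nagios.cfg in the etc directory: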
cfg_file=/usr/local/nagios/etc/objects/winERP.cfg
A Linux host is handled in much the same way. For example, linux37.cfg, placed in the objects directory:
define host{
use linux-server
host_name linux37
alias linux37
address 192.168.7.37
}
define service{
use generic-service
host_name linux37
service_description CHECK USERS
check_command check_nrpe!check_users
}
define service{
use generic-service
host_name linux37
service_description load
check_command check_nrpe!check_load
}
define service{
use generic-service
host_name linux37
service_description disk sda1
check_command check_nrpe!check_sda1
}
define service{
use generic-service
host_name linux37
service_description Zombie procs
check_command check_nrpe!check_zombie_procs
}
define service{
use generic-service
host_name linux37
service_description total procs
check_command check_nrpe!check_total_procs
}
Then add this line to nagios.cfg: cfg_file=/usr/local/nagios/etc/objects/linux37.cfg
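Since every new host file needs its own cfg_file line, a tiny helper can add it without creating duplicates. register_cfg_file is a hypothetical name, not anything shipped with Nagios; it just appends the line if an identical one is not already present.

```shell
# Append a cfg_file= line to the main config unless it is already there.
# $1 = main config (e.g. nagios.cfg), $2 = path of the object file.
register_cfg_file() {
    local main_cfg line
    main_cfg="$1"
    line="cfg_file=$2"
    if grep -Fxq "$line" "$main_cfg"; then
        echo "already registered: $2"
    else
        printf '%s\n' "$line" >> "$main_cfg"
        echo "registered: $2"
    fi
}
```

Usage: register_cfg_file /usr/local/nagios/etc/nagios.cfg /usr/local/nagios/etc/objects/linux37.cfg, followed by the usual verification run.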
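VI. After adding a check command to nrpe.cfg on the monitored host, for example: command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
NRPE must be restarted for the change to take effect:
1. Run "ps -ef | grep nrpe" to find the process ID of nrpe
2. Run "# kill -9 8516" to kill the process, where 8516 is the process ID
3. Run "/usr/local/nagios/bin/nrpe -n -c /usr/local/nagios/etc/nrpe.cfg -d" to start the nrpe process again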
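The three restart steps above could be scripted roughly as follows. nrpe_pids and restart_nrpe are names I invented for this sketch; it assumes the standard ps -ef column layout (command in the 8th column), and it sends a plain kill (SIGTERM) rather than kill -9, which is my own substitution.

```shell
# Print the PIDs of nrpe daemon processes from `ps -ef` output on stdin.
# Matching on the command column skips the "grep nrpe" line itself.
nrpe_pids() {
    awk '$8 ~ /(^|\/)nrpe$/ { print $2 }'
}

# Stop any running nrpe, then start it with the same command as in step 3.
restart_nrpe() {
    local pid
    for pid in $(ps -ef | nrpe_pids); do
        kill "$pid"    # SIGTERM; fall back to kill -9 only if the process ignores it
    done
    sleep 1
    /usr/local/nagios/bin/nrpe -n -c /usr/local/nagios/etc/nrpe.cfg -d
}
```

Run restart_nrpe as root on the monitored host after editing nrpe.cfg.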
VII. Sending alert emails from nagios
I use sendEmail.
Test command: sendEmail -f answanXXX@163.com -t wyp-4txl9XXX@dingtalk.com -s smtp.163.com -xu answanXXX -xp password -u "nagios test" -m "nagios test "
In actual use:
define command{
command_name notify-host-by-email
command_line /usr/local/bin/sendEmail -f answanXXX@163.com -t $CONTACTEMAIL$ -s smtp.163.com -u "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" -xu answanXXX -xp password
}
# 'notify-service-by-email' command definition
define command{
command_name notify-service-by-email
command_line /usr/local/bin/sendEmail -f answanXXX@163.com -t $CONTACTEMAIL$ -s smtp.163.com -u "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" -xu answanXXX -xp password
}
VIII. Relationship diagram of the NRPE commands: [diagram]
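In words, the chain runs like this: a service's check_command check_nrpe!NAME makes the monitoring server call the check_nrpe plugin against port 5666 on the host, and the nrpe daemon there looks up command[NAME] in its nrpe.cfg and runs that local plugin. So every NAME used on the server side needs a matching command[NAME] on the host. A rough way to spot mismatches, assuming you have a copy of the host's nrpe.cfg at hand (missing_nrpe_commands is a made-up helper):

```shell
# List NRPE command names that an object file references through
# check_nrpe!NAME but that nrpe.cfg does not define as command[NAME]=...
# $1 = object file on the monitoring server, $2 = copy of the host's nrpe.cfg.
missing_nrpe_commands() {
    sed -n 's/^[[:space:]]*check_command[[:space:]]\{1,\}check_nrpe!\([^![:space:]]*\).*/\1/p' "$1" |
        sort -u |
        while read -r name; do
            # Anchored at line start, so commented-out definitions do not count.
            grep -q "^[[:space:]]*command\[$name]=" "$2" || echo "$name"
        done
}
```

Usage: missing_nrpe_commands /usr/local/nagios/etc/objects/linux37.cfg ./nrpe.cfg prints one undefined command name per line, and nothing if everything matches.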
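IX. Across the per-host configuration files, a given host group name can be defined only once. If another host's configuration file defines the same group again, the verification reports an error: duplicate hostgroup definition.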
# Define a hostgroup for Windows machines
# All hosts that use the windows-server template will automatically be a member of this group
define hostgroup{
hostgroup_name windows-servers ; The name of the hostgroup
alias Windows Servers ; Long name of the group
}
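Before running the full verification, a quick scan of the object files can show which host group names are defined more than once. This is a sketch under the assumption that each define hostgroup block is written one directive per line, as in the sample files; duplicate_hostgroups is a hypothetical name.

```shell
# Print every hostgroup_name defined more than once across the given files.
# Only names inside "define hostgroup{ ... }" blocks are counted.
duplicate_hostgroups() {
    awk '
        { sub(/[ \t]*;.*$/, "") }                      # drop trailing ; comments
        /^[ \t]*define[ \t]+hostgroup[ \t]*[{]/ { in_group = 1; next }
        /^[ \t]*[}]/                             { in_group = 0 }
        in_group && $1 == "hostgroup_name"       { print $2 }
    ' "$@" | sort | uniq -d
}
```

Usage: duplicate_hostgroups /usr/local/nagios/etc/objects/*.cfg; any name it prints should be kept in one file only.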