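Detailed reference for Nagios configuration file parameters: https://blog.51cto.com/ixdba/752870
IP | Hostname | Role |
---|---|---|
192.168.117.14 | nagios | Monitoring host |
192.168.117.15 | client | Client |
Deploying Nagios on the monitoring host
Base environment
1. Install the dependency packages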
[root@nagios ~]# yum install -y gcc glibc glibc-common wget unzip httpd php gd php-gd gd-devel perl
2. Obtain Nagios and the related plugin packages (already downloaded here and placed in /usr/local/src)
Nagios: https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.3.tar.gz
Client nrpe: https://github.com/NagiosEnterprises/nrpe/archive/nrpe-3.2.1.tar.gz
nagios-plugins: https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
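If the tarballs are not yet in place, the downloads can be scripted. The following is a minimal sketch, assuming wget is installed and that each archive should be saved under the file name used by the tar commands in this guide. nagios_sources and fetch_sources are hypothetical helper names, not part of Nagios.

```shell
#!/bin/bash
# Hypothetical helper: list "local-filename URL" pairs, one per line.
# The local names are assumptions chosen to match the tar commands in this guide.
nagios_sources() {
    cat <<'EOF'
nagioscore-nagios-4.4.3.tar.gz https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.3.tar.gz
nrpe-nrpe-3.2.1.tar.gz https://github.com/NagiosEnterprises/nrpe/archive/nrpe-3.2.1.tar.gz
nagios-plugins-2.2.1.tar.gz https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
EOF
}

# Hypothetical helper: download every source into the given directory,
# stopping at the first failed download.
fetch_sources() {
    local dest="${1:-/usr/local/src}" name url
    while read -r name url; do
        wget -O "$dest/$name" "$url" || return 1
    done <<EOF
$(nagios_sources)
EOF
}

# Usage (not run automatically): fetch_sources /usr/local/src
```

Keeping the list in one function makes it easy to bump a version in a single place.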
Installing Nagios
1. Install nagioscore
[root@nagios ~]# cd /usr/local/src
[root@nagios src]# tar zxvf nagioscore-nagios-4.4.3.tar.gz
[root@nagios src]# cd nagioscore-nagios-4.4.3
[root@nagios nagioscore-nagios-4.4.3]# ./configure
[root@nagios nagioscore-nagios-4.4.3]# make all
[root@nagios nagioscore-nagios-4.4.3]# make install-groups-users
[root@nagios nagioscore-nagios-4.4.3]# usermod -a -G nagios apache
2. Install the main program
[root@nagios nagioscore-nagios-4.4.3]# make install
[root@nagios nagioscore-nagios-4.4.3]# make install-daemoninit
3. Configure directory permissions
[root@nagios nagioscore-nagios-4.4.3]# make install-commandmode
4. Install the sample configuration files
[root@nagios nagioscore-nagios-4.4.3]# make install-config
5. Install the web interface
[root@nagios nagioscore-nagios-4.4.3]# make install-webconf
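6. Change the email address that receives Nagios alerts
[root@nagios ~]# vim /usr/local/nagios/etc/objects/contacts.cfg   # alert emails cannot be sent yet at this point; mail setup is covered further down
email 82900528@qq.com
7. Set a password for nagiosadmin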
[root@nagios ~]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
8. Start the httpd and nagios services
[root@nagios ~]# systemctl enable --now httpd
[root@nagios ~]# systemctl enable --now nagios
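9. Browse to IP/nagios; the Nagios interface should appear
10. Add a GQD account and configure its permissions
[root@nagios ~]# htpasswd -b /usr/local/nagios/etc/htpasswd.users GQD 123456   # no -c here: -c would recreate the file and drop nagiosadmin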
[root@nagios ~]# sed -i 's@nagiosadmin@GQD@g' /usr/local/nagios/etc/cgi.cfg
[root@nagios ~]# systemctl restart httpd
11. The following command checks the Nagios configuration files for syntax errors
[root@nagios ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
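Since a broken configuration keeps Nagios from starting, it can help to gate every restart on this check. Below is a minimal sketch, assuming the install paths used in this guide; check_nagios_config is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: run the Nagios pre-flight check quietly.
# $1 = nagios binary, $2 = main config file (defaults follow this guide).
check_nagios_config() {
    local nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    local cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if "$nagios_bin" -v "$cfg" >/dev/null 2>&1; then
        echo "config OK"
    else
        echo "config check failed, service left untouched" >&2
        return 1
    fi
}

# Usage (not run automatically):
#   check_nagios_config && systemctl restart nagios
```

When the check fails, run the nagios -v command by hand to see the full error report.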
Installing nagios-plugins
1. Install the dependency packages
[root@nagios ~]# yum install -y gcc glibc glibc-common make gettext automake autoconf wget openssl-devel net-snmp net-snmp-utils epel-release
[root@nagios ~]# yum install -y perl-Net-SNMP
2. Install nagios-plugins
[root@nagios src]# cd /usr/local/src
[root@nagios src]# tar zxvf nagios-plugins-2.2.1.tar.gz
[root@nagios src]# cd nagios-plugins-2.2.1
[root@nagios nagios-plugins-2.2.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@nagios nagios-plugins-2.2.1]# make && make install
[root@nagios nagios-plugins-2.2.1]# systemctl restart nagios
3. Open the Nagios page in a browser and review the monitored items
Installing nrpe
1. Install nrpe
[root@nagios ~]# cd /usr/local/src
[root@nagios src]# tar xvf nrpe-nrpe-3.2.1.tar.gz
[root@nagios src]# cd nrpe-nrpe-3.2.1
[root@nagios nrpe-nrpe-3.2.1]# ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
[root@nagios nrpe-nrpe-3.2.1]# make all
[root@nagios nrpe-nrpe-3.2.1]# make install-plugin
[root@nagios nrpe-nrpe-3.2.1]# make install-daemon
[root@nagios nrpe-nrpe-3.2.1]# make install-config
[root@nagios nrpe-nrpe-3.2.1]# make install-init   # generates the service startup entry; if this is run, no separate startup script is needed
2. Start nrpe
[root@nagios ~]# systemctl start nrpe
[root@nagios ~]# systemctl enable nrpe
Deploying nrpe on the client (nagios-plugins must be installed first)
Base environment
1. Install the dependency packages
[root@client ~]# yum install -y gcc glibc glibc-common gd gd-devel openssl openssl-devel php php-gd perl net-tools make gettext automake autoconf wget net-snmp net-snmp-utils epel-release
[root@client ~]# yum install -y perl-Net-SNMP
2. Create the user and set its password
[root@client ~]# useradd nagios
[root@client ~]# echo '123456' | passwd --stdin nagios
Installing nagios-plugins
1. Install nagios-plugins (the nagios-plugins and nrpe tarballs have likewise been placed in /usr/local/src)
[root@client src]# cd /usr/local/src
[root@client src]# tar zxvf nagios-plugins-2.2.1.tar.gz
[root@client src]# cd nagios-plugins-2.2.1
[root@client nagios-plugins-2.2.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@client nagios-plugins-2.2.1]# make && make install
Installing nrpe
1. Install nrpe
[root@client ~]# cd /usr/local/src
[root@client src]# tar xvf nrpe-nrpe-3.2.1.tar.gz
[root@client src]# cd nrpe-nrpe-3.2.1
[root@client nrpe-nrpe-3.2.1]# ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
[root@client nrpe-nrpe-3.2.1]# make all
[root@client nrpe-nrpe-3.2.1]# make install-plugin
[root@client nrpe-nrpe-3.2.1]# make install-daemon
[root@client nrpe-nrpe-3.2.1]# make install-config
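[root@client nrpe-nrpe-3.2.1]# make install-init   # generates the service startup entry; if this is run, no separate startup script is needed
2. Create the nrpe startup script (can be skipped if make install-init was run above)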
[root@client ~]# vim /etc/init.d/nrpe
#!/bin/bash
# chkconfig: 2345 80 20
# description: NRPE daemon (the two header lines above are required by chkconfig --add)
NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg
case "$1" in
start)
    echo -n "Starting NRPE daemon..."
    $NRPE -c $NRPECONF -d
    echo " done."
    ;;
stop)
    echo -n "Stopping NRPE daemon..."
    pkill -u nagios nrpe
    echo " done."
    ;;
restart)
    $0 stop
    sleep 2
    $0 start
    ;;
*)
    echo "Usage: $0 start|stop|restart"
    ;;
esac
exit 0
3. Make the script executable and start nrpe
[root@client ~]# chmod 755 /etc/init.d/nrpe
[root@client ~]# chkconfig --add nrpe
[root@client ~]# systemctl start nrpe
[root@client ~]# systemctl enable nrpe
4. Edit the nrpe configuration file to allow the monitoring host's IP address, then restart the nrpe service
[root@client ~]# vim /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=127.0.0.1,192.168.117.0/24
[root@client ~]# systemctl restart nrpe
5. On the monitoring host, check whether it can communicate with the client
[root@nagios ~]# cd /usr/local/nagios/libexec/
[root@nagios libexec]# ./check_nrpe -H 192.168.117.15
NRPE v3.2.1   # communication succeeded
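With more than one client, the same connectivity test can be looped over a list of hosts. This is a minimal sketch, assuming check_nrpe lives at the path used in this guide; probe_nrpe_hosts is a hypothetical helper name.

```shell
#!/bin/bash
# Path follows this guide; override CHECK_NRPE to point elsewhere.
CHECK_NRPE="${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}"

# Hypothetical helper: probe each host given as an argument and report
# whether its NRPE daemon answered. Returns non-zero if any host failed.
probe_nrpe_hosts() {
    local host reply failed=0
    for host in "$@"; do
        if reply=$("$CHECK_NRPE" -H "$host" 2>&1); then
            echo "$host: reachable ($reply)"
        else
            echo "$host: NOT reachable ($reply)"
            failed=1
        fi
    done
    return "$failed"
}

# Usage (not run automatically): probe_nrpe_hosts 192.168.117.15
```

A host that is not reachable usually points to allowed_hosts in the client's nrpe.cfg or to a firewall in between.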
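Configuring custom checks
Requirements:
1. Write a monitoring script that checks the nginx port, and add it to the remote checks
2. Write a monitoring script that checks the mysql port, and add it to the remote checks
3. Write a monitoring script that checks the apache port, and add it to the remote checks
1. On the client, create the custom script in the Nagios plugin directory and make it executable
[root@client ~]# vim /usr/local/nagios/libexec/check_nma.sh   # checks the nginx, mysql and apache service ports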
#!/bin/bash
# Count listening TCP sockets per port. The trailing space in each pattern
# keeps ':80' from also matching ':81' or ':8080'.
n=$(netstat -lntp | grep -c ':81 ')
m=$(netstat -lntp | grep -c ':3306 ')
a=$(netstat -lntp | grep -c ':80 ')
case $1 in
nginx)
    if [ "$n" -eq 0 ]; then
        echo "Nginx: so much problems!"
        exit 2
    else
        echo "Nginx: I feel good!"
        exit 0
    fi
    ;;
mysql)
    if [ "$m" -eq 0 ]; then
        echo "Mysql: is something wrong?"
        exit 2
    else
        echo "Mysql: I'm ok!"
        exit 0
    fi
    ;;
apache)
    if [ "$a" -eq 0 ]; then
        echo "HTTP: so bad."
        exit 2
    else
        echo "HTTP: all is ok!"
        exit 0
    fi
    ;;
*)
    # 'return' is only valid inside a function; exit 3 reports UNKNOWN
    echo "Usage: $0 nginx|mysql|apache"
    exit 3
    ;;
esac
[root@client ~]# chown nagios:nagios /usr/local/nagios/libexec/check_nma.sh
[root@client ~]# chmod +x /usr/local/nagios/libexec/check_nma.sh
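Matching ports with a plain grep is easy to get subtly wrong, because ':80' is also a prefix of ':8080'. As an alternative, here is a minimal sketch that compares the local-address column exactly. It assumes the usual `netstat -lnt` column layout; port_listening and check_port are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical helper: read `netstat -lnt`-style text on stdin and succeed
# if the local address (4th column) ends in exactly ":<port>".
port_listening() {
    awk -v p="$1" '$4 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Hypothetical helper: Nagios-style check for one service.
# $1 = display name, $2 = TCP port. Exits 0 (OK) or 2 (CRITICAL).
check_port() {
    local name="$1" port="$2"
    if netstat -lnt | port_listening "$port"; then
        echo "$name OK: port $port is listening"
        return 0
    fi
    echo "$name CRITICAL: nothing is listening on port $port"
    return 2
}

# Usage (not run automatically): check_port nginx 81
```

Separating the parsing from the netstat call also makes the matching logic testable with canned input.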
2. Edit nrpe.cfg, then restart the nrpe service
[root@client ~]# vim /usr/local/nagios/etc/nrpe.cfg   # add three lines, one each for the nginx, mysql and apache services
command[check_nginx]=/usr/local/nagios/libexec/check_nma.sh nginx
command[check_mysql]=/usr/local/nagios/libexec/check_nma.sh mysql
command[check_apache]=/usr/local/nagios/libexec/check_nma.sh apache
[root@client ~]# systemctl restart nrpe
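3. On the monitoring host, test whether the value returned by the client script is received correctly
[root@nagios ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.117.15 -c check_nginx   # -c takes the command name defined in the client's nrpe.cfg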
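Nagios derives the service state from the plugin's exit code: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The sketch below makes that mapping visible when testing a check by hand. nagios_state_name and run_check are hypothetical helper names, and any other exit code is reported as UNKNOWN in this sketch.

```shell
#!/bin/bash
# Map a plugin exit code to the corresponding Nagios state name.
nagios_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Hypothetical helper: run any check command, then print its output
# prefixed with the state implied by its exit code.
run_check() {
    local output status
    output=$("$@")
    status=$?
    echo "[$(nagios_state_name "$status")] $output"
    return "$status"
}

# Usage (not run automatically):
#   run_check /usr/local/nagios/libexec/check_nrpe -H 192.168.117.15 -c check_nginx
```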
4. Create the remote monitoring file
[root@nagios ~]# vim /usr/local/nagios/etc/objects/client.cfg
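define host{ ; host to monitor
use linux-server ; inherit the settings of the linux-server template
host_name client ; host name
alias nagios_client ; host alias
address 192.168.117.15 ; IP address of the monitored host
}
define hostgroup{ ; define a host group
hostgroup_name nagios-client ; host group name
alias nagios--client ; host group alias
members localhost,client ; group members, here the local machine and the client host
}
define service{ ; service to monitor
use local-service ; inherit the settings of the local-service template
host_name client ; host the service runs on
service_description check_nginx ; service description
check_command check_nrpe!check_nginx ; check command
max_check_attempts 5 ; maximum number of check attempts
normal_check_interval 1 ; interval between service checks, in minutes
}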
define service{
use local-service
host_name client
service_description check_mysql
check_command check_nrpe!check_mysql
max_check_attempts 5
normal_check_interval 1
}
define service{
use local-service
host_name client
service_description check_apache
check_command check_nrpe!check_apache
max_check_attempts 5
normal_check_interval 1
}
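The three service definitions differ only in the NRPE command name, so they can be generated instead of copied. This is a minimal sketch under that assumption; gen_nrpe_service is a hypothetical helper name, and the attribute values are the ones used in this guide.

```shell
#!/bin/bash
# Hypothetical helper: print one service definition for an NRPE command.
# $1 = host_name, $2 = NRPE command name as defined in the client's nrpe.cfg.
gen_nrpe_service() {
    local host="$1" cmd="$2"
    cat <<EOF
define service{
    use                     local-service
    host_name               $host
    service_description     $cmd
    check_command           check_nrpe!$cmd
    max_check_attempts      5
    normal_check_interval   1
}
EOF
}

# Usage (not run automatically):
#   for c in check_nginx check_mysql check_apache; do
#       gen_nrpe_service client "$c"
#   done >> /usr/local/nagios/etc/objects/client.cfg
```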
5. Edit the main Nagios configuration file
[root@nagios ~]# vim /usr/local/nagios/etc/nagios.cfg   # add the following line, otherwise client.cfg does not take effect
cfg_file=/usr/local/nagios/etc/objects/client.cfg
6. Edit commands.cfg, then restart the nagios service
[root@nagios ~]# vim /usr/local/nagios/etc/objects/commands.cfg   # add the following definition
define command {
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
# $USER1$ is the path to the Nagios plugins, defined in resource.cfg
# $HOSTADDRESS$ is the IP address of the host
# $ARG1$ is the first argument of the check command, e.g. check_nginx
[root@nagios ~]# systemctl restart nagios
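7. Open the Nagios page in a browser; the new client host and its three checks should now be listed
Nagios high availability
Reference: https://www.jianshu.com/p/36584ff88cb9
IP | Hostname | Node |
---|---|---|
192.168.117.14 | nagios_master | Master |
192.168.117.16 | nagios_slaver | Slave |
Both hosts already have nagioscore, nagios-plugins and nrpe installed, with identical configuration. Make sure nrpe communication works between the two hosts.
1. On the Nagios master, edit nrpe.cfg, then restart nrpe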
[root@nagios_master ~]# vim /usr/local/nagios/etc/nrpe.cfg
# add the slave's IP
allowed_hosts=127.0.0.1,::1,192.168.117.16
# add this line; it is used to check the nagios process
command[check_nagios]=/usr/local/nagios/libexec/check_nagios -e 5 -F /usr/local/nagios/var/status.dat -C /usr/local/nagios/bin/nagios
[root@nagios_master ~]# systemctl restart nrpe
2. On the Nagios slave, edit nrpe.cfg, then restart nrpe
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/nrpe.cfg
# add the master's IP
allowed_hosts=127.0.0.1,::1,192.168.117.14
[root@nagios_slaver ~]# systemctl restart nrpe
3. Check that the slave can retrieve the master's check_nagios result
[root@nagios_slaver ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.117.14 -c check_nagios
NAGIOS OK: 6 processes, status log updated 8 seconds ago
4. Create the eventhandlers directory and copy the relevant files from the nagioscore source tree into it
[root@nagios_slaver ~]# mkdir /usr/local/nagios/libexec/eventhandlers
[root@nagios_slaver ~]# cd /usr/local/src/nagioscore-nagios-4.4.3/contrib/eventhandlers/
[root@nagios_slaver eventhandlers]# cp enable_notifications /usr/local/nagios/libexec/eventhandlers/
[root@nagios_slaver eventhandlers]# cp disable_notifications /usr/local/nagios/libexec/eventhandlers/
[root@nagios_slaver eventhandlers]# cp redundancy-scenario1/handle-master-host-event /usr/local/nagios/libexec/eventhandlers/
[root@nagios_slaver eventhandlers]# cp redundancy-scenario1/handle-master-proc-event /usr/local/nagios/libexec/eventhandlers/
5. Edit handle-master-proc-event and change these two lines
[root@nagios_slaver ~]# vim /usr/local/nagios/libexec/eventhandlers/handle-master-proc-event
`$eventhandlerdir/enable_notifications`
`$eventhandlerdir/disable_notifications`
6. Set ownership and permissions on the files in the eventhandlers directory
[root@nagios_slaver ~]# chown nagios.nagios /usr/local/nagios/libexec/eventhandlers/*
[root@nagios_slaver ~]# chmod 755 /usr/local/nagios/libexec/eventhandlers/*
7. Edit commands.cfg and add three definitions
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/commands.cfg
define command {
command_name handle-master-host-event
command_line $USER1$/eventhandlers/handle-master-host-event $HOSTSTATE$ $HOSTSTATETYPE$ $HOSTATTEMPT$
}
define command {
command_name handle-master-proc-event
command_line $USER1$/eventhandlers/handle-master-proc-event $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}
define command {
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
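The macros passed to handle-master-proc-event tell the handler the state of the master's NAGIOS service and whether that state is confirmed (HARD) or still being retried (SOFT). The following is a simplified sketch of the decision such a handler makes, not the contrib script verbatim; failover_action is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: decide what the slave should do for a given state of
# the master's NAGIOS service. Arguments mirror the macros passed above:
# $1 = $SERVICESTATE$, $2 = $SERVICESTATETYPE$.
failover_action() {
    case "$1" in
        OK)
            # Master process is running again: the slave goes quiet.
            echo disable_notifications
            ;;
        CRITICAL)
            if [ "$2" = "HARD" ]; then
                # Master confirmed down: the slave takes over notifications.
                echo enable_notifications
            else
                # SOFT state: wait for the remaining retries.
                echo none
            fi
            ;;
        *)
            echo none
            ;;
    esac
}

# Usage (not run automatically): failover_action CRITICAL HARD
```

Acting only on HARD states avoids flipping notifications on a single failed check.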
8. Edit localhost.cfg and add two definitions
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/localhost.cfg
define host {
use critical-host
host_name nagiosMaster
alias nagios master
address 192.168.117.14
event_handler handle-master-host-event
}
define service {
use critical-service
host_name nagiosMaster
service_description NAGIOS
check_command check_nrpe!check_nagios
event_handler handle-master-proc-event
}
9. Edit templates.cfg and add two definitions
define host{
name critical-host
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_period workhours
notification_interval 120
notification_options d,u,r
contact_groups admins
register 0
}
define service{
name critical-service
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1
obsess_over_service 1
check_freshness 0
notifications_enabled 1
event_handler_enabled 1
flap_detection_enabled 0 ; if set to 1, notifications are suppressed while the service state changes frequently
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
is_volatile 0
check_period 24x7
max_check_attempts 1
check_interval 1
retry_interval 1
contact_groups admins
notification_options w,u,c,r
notification_interval 60
notification_period 24x7
register 0
}
10. Edit nagios.cfg, then restart the nagios service
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/nagios.cfg
# disable retention of program state
use_retained_program_state=0
[root@nagios_slaver ~]# systemctl restart nagios
11. Stop nagios on the master, then check the status of the monitored items on the slave's Nagios page
Configuring email alerts
1. Install sendmail and mailx
[root@nagios_slaver ~]# yum install -y sendmail* mailx
2. Configure SMTP
[root@nagios_slaver ~]# vim /etc/mail.rc
set from=82900528@qq.com
set smtp=smtp.qq.com
set smtp-auth-user=82900528@qq.com
# smtp-auth-password takes the mailbox authorization code
set smtp-auth-password=**************
set smtp-auth=login
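Before wiring mail into Nagios, it is worth confirming that the SMTP settings work on their own. Here is a minimal sketch, assuming the `mail` command installed by mailx picks up /etc/mail.rc; send_test_mail and the MAILER variable are hypothetical names.

```shell
#!/bin/bash
# Mail client to use; mailx installs the `mail` command.
MAILER="${MAILER:-mail}"

# Hypothetical helper: send a one-line test message.
# $1 = recipient, $2 = subject (optional).
send_test_mail() {
    local to="$1" subject="${2:-Nagios mail test}"
    echo "Test message sent from $(uname -n)" | "$MAILER" -s "$subject" "$to"
}

# Usage (not run automatically): send_test_mail 82900528@qq.com
```

If no message arrives, check the authorization code and the sender address in /etc/mail.rc first.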
3. Edit contacts.cfg and make sure email is set to the correct address
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/contacts.cfg
email 82900528@qq.com
4. Add a contact group line in localhost.cfg
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/localhost.cfg   # add one line to the service definition: the contact group admins
contact_groups admins
5. Make sure that in nagios.cfg state retention is disabled and notifications are enabled
[root@nagios_slaver ~]# vim /usr/local/nagios/etc/nagios.cfg
# disable retention of program state
use_retained_program_state=0
# enable notifications
enable_notifications=1
6. In commands.cfg, change /usr/sbin/sendmail to /usr/bin/mail
[root@nagios_slaver ~]# sed -i.bak 's@/usr/sbin/sendmail@/usr/bin/mail@g' /usr/local/nagios/etc/objects/commands.cfg
7. Restart the sendmail and nagios services
[root@nagios_slaver ~]# systemctl restart sendmail
[root@nagios_slaver ~]# systemctl restart nagios
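8. Change the state of nagios on the master; an alert email should arrive
Sending the alert email only once
1. Edit the template used by the service in templates.cfg; remember to restart the service afterwards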
[root@nagios_slaver objects]# vim templates.cfg
notification_interval 0 ; interval between repeat notifications, in minutes; 0 means notify only once
[root@nagios_slaver objects]# systemctl restart nagios
Ongoing Nagios maintenance: https://blog.51cto.com/xtony/978758