Nagios Installation and Deployment, Monitoring Item Configuration, and High Availability

Detailed parameter reference for the Nagios configuration files: https://blog.51cto.com/ixdba/752870

IP               Hostname   Role
192.168.117.14   nagios     Monitoring host
192.168.117.15   client     Client

Deploying Nagios on the Monitoring Host

Base environment

1. Install the dependency packages

[root@nagios ~]# yum install -y gcc glibc glibc-common wget unzip httpd php gd php-gd gd-devel perl

2. Obtain Nagios and the related plugin packages (here they have already been downloaded to /usr/local/src)
Nagios: https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.3.tar.gz
Client NRPE: https://github.com/NagiosEnterprises/nrpe/archive/nrpe-3.2.1.tar.gz
nagios-plugins: https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz
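
The three archives can be fetched in one pass. The following is a minimal sketch, assuming wget is installed and the host has outbound access; fetch_sources and DRY_RUN are illustrative names, not part of Nagios. By default wget names a file after the last component of the URL, so -O is used to save each archive under the file name that the later tar commands expect.

```shell
#!/bin/bash
# Sketch: download the source archives listed above into /usr/local/src.
# fetch_sources and DRY_RUN are hypothetical helpers for illustration.
SRC_DIR=${SRC_DIR:-/usr/local/src}

fetch_sources() {
    local entry url name
    # Each entry is "URL|file name to save as"
    for entry in \
        "https://github.com/NagiosEnterprises/nagioscore/archive/nagios-4.4.3.tar.gz|nagioscore-nagios-4.4.3.tar.gz" \
        "https://github.com/NagiosEnterprises/nrpe/archive/nrpe-3.2.1.tar.gz|nrpe-nrpe-3.2.1.tar.gz" \
        "https://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz|nagios-plugins-2.2.1.tar.gz"
    do
        url=${entry%%|*}
        name=${entry##*|}
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only show what would be run
            echo "wget -O $SRC_DIR/$name $url"
        else
            wget -O "$SRC_DIR/$name" "$url"
        fi
    done
}

# Preview the commands without downloading anything
DRY_RUN=1 fetch_sources
```

Calling fetch_sources without DRY_RUN=1 performs the actual downloads.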

Installing Nagios

1. Install Nagios Core

[root@nagios ~]# cd /usr/local/src
[root@nagios src]# tar zxvf nagioscore-nagios-4.4.3.tar.gz
[root@nagios src]# cd nagioscore-nagios-4.4.3
[root@nagios nagioscore-nagios-4.4.3]# ./configure
[root@nagios nagioscore-nagios-4.4.3]# make all
[root@nagios nagioscore-nagios-4.4.3]# make install-groups-users
[root@nagios nagioscore-nagios-4.4.3]# usermod -a -G nagios apache

2. Install the main program

[root@nagios nagioscore-nagios-4.4.3]# make install
[root@nagios nagioscore-nagios-4.4.3]# make install-daemoninit

3. Set the directory permissions

[root@nagios nagioscore-nagios-4.4.3]# make install-commandmode

4. Install the sample configuration files

[root@nagios nagioscore-nagios-4.4.3]# make install-config

5. Install the web interface

[root@nagios nagioscore-nagios-4.4.3]# make install-webconf

6. Change the e-mail address used for Nagios alerts

[root@nagios ~]# vim /usr/local/nagios/etc/objects/contacts.cfg  # alert e-mails cannot be sent yet; the mail setup is covered further down
    email                   82900528@qq.com

7. Set a password for nagiosadmin

[root@nagios ~]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

8. Start the httpd and nagios services

[root@nagios ~]# systemctl enable --now httpd
[root@nagios ~]# systemctl enable --now nagios

9. Browse to IP/nagios; the Nagios web interface should be displayed

10. Add a GQD account and configure its permissions

[root@nagios ~]# htpasswd -b /usr/local/nagios/etc/htpasswd.users GQD 123456  # no -c here: -c would recreate the file and drop nagiosadmin
[root@nagios ~]# sed -i 's@nagiosadmin@GQD@g' /usr/local/nagios/etc/cgi.cfg
[root@nagios ~]# systemctl restart httpd

11. The following command checks the Nagios configuration files for syntax errors

[root@nagios ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
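
Because the verification command exits with a non-zero status when it finds errors, it can gate a restart. The following is a minimal sketch, assuming the paths of this installation; verify_then_restart is a hypothetical helper name, not something shipped with Nagios.

```shell
#!/bin/bash
# Sketch: only restart Nagios when the configuration passes verification.
# verify_then_restart is a hypothetical helper; paths match this install.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

verify_then_restart() {
    # "$@" is the restart command, e.g. systemctl restart nagios
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        echo "configuration OK, restarting"
        "$@"
    else
        echo "configuration check failed, not restarting" >&2
        return 1
    fi
}

# Usage: verify_then_restart systemctl restart nagios
```

This keeps a running Nagios instance untouched when an edited file contains a typo.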

Installing nagios-plugins

1. Install the dependency packages

[root@nagios ~]# yum install -y gcc glibc glibc-common make gettext automake autoconf wget openssl-devel net-snmp net-snmp-utils epel-release
[root@nagios ~]# yum install -y perl-Net-SNMP

2. Install nagios-plugins

[root@nagios src]# cd /usr/local/src
[root@nagios src]# tar zxvf nagios-plugins-2.2.1.tar.gz
[root@nagios src]# cd nagios-plugins-2.2.1
[root@nagios nagios-plugins-2.2.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@nagios nagios-plugins-2.2.1]# make && make install
[root@nagios nagios-plugins-2.2.1]# systemctl restart nagios

3. Open the Nagios page in a browser and review the monitored items

Installing NRPE

1. Install NRPE

[root@nagios ~]# cd /usr/local/src
[root@nagios src]# tar xvf nrpe-nrpe-3.2.1.tar.gz
[root@nagios src]# cd nrpe-nrpe-3.2.1
[root@nagios nrpe-nrpe-3.2.1]# ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
[root@nagios nrpe-nrpe-3.2.1]# make all
[root@nagios nrpe-nrpe-3.2.1]# make install-plugin
[root@nagios nrpe-nrpe-3.2.1]# make install-daemon
[root@nagios nrpe-nrpe-3.2.1]# make install-config
[root@nagios nrpe-nrpe-3.2.1]# make install-init  # installs the startup unit; once this has run, no separate init script is needed

2. Start NRPE

[root@nagios ~]# systemctl start nrpe
[root@nagios ~]# systemctl enable nrpe

Deploying NRPE on the Client (nagios-plugins must be installed first)

Base environment

1. Install the dependency packages

[root@client ~]# yum install -y gcc glibc glibc-common gd gd-devel openssl openssl-devel php php-gd perl net-tools make gettext automake autoconf wget net-snmp net-snmp-utils epel-release
[root@client ~]# yum install -y perl-Net-SNMP

2. Create the user and set its password

[root@client ~]# useradd nagios
[root@client ~]# echo '123456' | passwd --stdin nagios

Installing nagios-plugins

1. Install nagios-plugins (the nagios-plugins and nrpe packages have likewise been placed in /usr/local/src)

[root@client src]# cd /usr/local/src
[root@client src]# tar zxvf nagios-plugins-2.2.1.tar.gz
[root@client src]# cd nagios-plugins-2.2.1
[root@client nagios-plugins-2.2.1]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@client nagios-plugins-2.2.1]# make && make install

Installing NRPE

1. Install NRPE

[root@client ~]# cd /usr/local/src
[root@client src]# tar xvf nrpe-nrpe-3.2.1.tar.gz
[root@client src]# cd nrpe-nrpe-3.2.1
[root@client nrpe-nrpe-3.2.1]# ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --with-nagios-user=nagios --with-nagios-group=nagios --enable-command-args --enable-ssl
[root@client nrpe-nrpe-3.2.1]# make all
[root@client nrpe-nrpe-3.2.1]# make install-plugin
[root@client nrpe-nrpe-3.2.1]# make install-daemon
[root@client nrpe-nrpe-3.2.1]# make install-config
[root@client nrpe-nrpe-3.2.1]# make install-init  # installs the startup unit; once this has run, no separate init script is needed

2. Create the NRPE startup script (only needed if make install-init was not run)

[root@client ~]# vim /etc/init.d/nrpe
#!/bin/bash

NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg

case "$1" in
       start)
              echo -n "Starting NRPE daemon..."
              $NRPE -c $NRPECONF -d
              echo " done."
              ;;
       stop)
              echo -n "Stopping NRPE daemon..."
              pkill -u nagios nrpe
              echo " done."
              ;;
       restart)
              $0 stop
              sleep 2
              $0 start
              ;;
       *)
              echo "Usage: $0 start|stop|restart"
              ;;
esac
exit 0

3. Make the script executable and start NRPE

[root@client ~]# chmod 755 /etc/init.d/nrpe
[root@client ~]# chkconfig --add nrpe
[root@client ~]# systemctl start nrpe
[root@client ~]# systemctl enable nrpe

4. Edit the NRPE configuration file to add the monitoring host's IP address, then restart the nrpe service

[root@client ~]# vim /usr/local/nagios/etc/nrpe.cfg
allowed_hosts=127.0.0.1,192.168.117.0/24
[root@client ~]# systemctl restart nrpe

5. On the monitoring host, check that it can communicate with the client

[root@nagios ~]# cd /usr/local/nagios/libexec/
[root@nagios libexec]# ./check_nrpe -H 192.168.117.15
NRPE v3.2.1    # communication succeeded
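
If check_nrpe reports a connection error instead of the version string, a quick way to narrow it down is to test whether the NRPE port accepts connections at all. The following is a minimal sketch, assuming NRPE listens on its default TCP port 5666 and that bash and timeout (coreutils) are available; port_open is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: test whether a TCP port accepts connections, using bash's /dev/tcp.
# port_open is a hypothetical helper; 5666 is NRPE's default port.
port_open() {
    local host=$1 port=$2
    # Give up after 3 seconds so a filtered port does not hang the shell
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Usage:
#   if port_open 192.168.117.15 5666; then
#       echo "NRPE port reachable"
#   else
#       echo "NRPE port not reachable"
#   fi
```

An unreachable port points at the nrpe service or a firewall on the client rather than at the allowed_hosts setting.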
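
Configuring Custom Monitoring Items

Requirements:
1. Add a monitoring script that checks the nginx port, and add it to the remote monitoring items
2. Add a monitoring script that checks the mysql port, and add it to the remote monitoring items
3. Add a monitoring script that checks the apache port, and add it to the remote monitoring items

1. On the client, create a custom script in the Nagios plugin directory and make it executable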

[root@client ~]# vim /usr/local/nagios/libexec/check_nma.sh  # checks the nginx, mysql, and apache service ports
#!/bin/bash

# Count listening sockets per port; the trailing space keeps ':80' from also matching ':8080'
n=$(netstat -lntp | grep -c ':81 ')
m=$(netstat -lntp | grep -c ':3306 ')
a=$(netstat -lntp | grep -c ':80 ')

case $1 in
nginx)
    if [ $n -eq 0 ]; then
        echo "Nginx: so much problems!"
        exit 2
    else
        echo "Nginx: I feel good!"
        exit 0
    fi
    ;;
mysql)
    if [ $m -eq 0 ]; then
        echo "Mysql: is something wrong?"
        exit 2
    else
        echo "Mysql: I'm ok!"
        exit 0
    fi
    ;;
apache)
    if [ $a -eq 0 ]; then
        echo "HTTP: so bad."
        exit 2
    else
        echo "HTTP: all is ok!"
        exit 0
    fi
    ;;
*)
    echo "Usage: $0 nginx|mysql|apache"
    exit 3
    ;;
esac

[root@client ~]# chown nagios:nagios /usr/local/nagios/libexec/check_nma.sh
[root@client ~]# chmod +x /usr/local/nagios/libexec/check_nma.sh
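
Matching a port with a plain substring search is fragile, since one port number can be a prefix of another. As an alternative, the port can be compared exactly. The following is a minimal sketch, assuming netstat -lnt style output where the local address is the fourth column; count_listeners is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: count listening TCP sockets on one exact port.
# count_listeners is a hypothetical helper; it reads `netstat -lnt` style
# output on stdin and compares the port after the last ':' of column 4.
count_listeners() {
    local port=$1
    awk -v p="$port" '
        $1 ~ /^tcp/ {
            n = split($4, parts, ":")      # handles 0.0.0.0:80 and :::80
            if (parts[n] == p) count++
        }
        END { print count + 0 }
    '
}

# Usage on a live system: netstat -lnt | count_listeners 80
```

A result of 0 would map to exit status 2 (CRITICAL) in a plugin, and anything else to 0 (OK).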

2. Edit nrpe.cfg, then restart the nrpe service

[root@client ~]# vim /usr/local/nagios/etc/nrpe.cfg  # add three lines, one each for the nginx, mysql, and apache services
command[check_nginx]=/usr/local/nagios/libexec/check_nma.sh nginx
command[check_mysql]=/usr/local/nagios/libexec/check_nma.sh mysql
command[check_apache]=/usr/local/nagios/libexec/check_nma.sh apache

[root@client ~]# systemctl restart nrpe

3. On the monitoring host, test that the value returned by the client's script is received correctly

[root@nagios ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.117.15 -c check_nginx  # the value after -c is the command name defined in the client's nrpe.cfg
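
Nagios reads the result of a check from the plugin's exit status: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The following is a minimal sketch that runs the three checks and prints the state for each, assuming the command names defined in nrpe.cfg above; state_name and run_checks are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: run the three NRPE checks and translate each exit status.
# state_name and run_checks are hypothetical helpers.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

run_checks() {
    local host=$1 cmd output rc
    for cmd in check_nginx check_mysql check_apache; do
        output=$("$CHECK_NRPE" -H "$host" -c "$cmd")
        rc=$?                      # exit status of check_nrpe itself
        echo "$cmd: $(state_name "$rc") ($output)"
    done
}

# Usage: run_checks 192.168.117.15
```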

4. Create the remote monitoring file

[root@nagios ~]# vim /usr/local/nagios/etc/objects/client.cfg
define host{                     ; define the monitored host
    use          linux-server    ; inherit from the linux-server host template
    host_name    client          ; host name
    alias        nagios_client   ; host alias
    address      192.168.117.15  ; IP of the monitored host
    }

define hostgroup{                         ; define a host group
    hostgroup_name    nagios-client       ; host group name
    alias             nagios--client      ; host group alias
    members           localhost,client    ; group members, here the local host and the client host
    }

define service{                                     ; define the monitored service
    use                    local-service            ; inherit from the local-service template
    host_name              client                   ; host to monitor
    service_description    check_nginx              ; service description
    check_command          check_nrpe!check_nginx   ; check command
    max_check_attempts     5                        ; maximum number of check attempts
    normal_check_interval  1                        ; interval between service checks, in minutes
    }

define service{
    use                    local-service
    host_name              client
    service_description    check_mysql
    check_command          check_nrpe!check_mysql
    max_check_attempts     5
    normal_check_interval  1
    }

define service{
    use                    local-service
    host_name              client
    service_description    check_apache
    check_command          check_nrpe!check_apache
    max_check_attempts     5
    normal_check_interval  1
    }
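
The three service blocks differ only in the NRPE command name, so they can also be generated. The following is a minimal sketch that prints one definition per command; gen_services is a hypothetical helper name, and the field values are copied from the definitions above.

```shell
#!/bin/bash
# Sketch: print one service definition per NRPE command.
# gen_services is a hypothetical helper; redirect its output into a .cfg file.
gen_services() {
    local host=$1 cmd
    shift
    for cmd in "$@"; do
        cat <<EOF
define service{
    use                    local-service
    host_name              $host
    service_description    $cmd
    check_command          check_nrpe!$cmd
    max_check_attempts     5
    normal_check_interval  1
    }

EOF
    done
}

gen_services client check_nginx check_mysql check_apache
```

Adding a fourth check then only requires one more argument plus the matching command line in the client's nrpe.cfg.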

5. Edit the main Nagios configuration file

[root@nagios ~]# vim /usr/local/nagios/etc/nagios.cfg  # add the following line, otherwise client.cfg will not take effect
cfg_file=/usr/local/nagios/etc/objects/client.cfg

6. Edit commands.cfg, then restart the nagios service

[root@nagios ~]# vim /usr/local/nagios/etc/objects/commands.cfg  # add the following definition
define command {

    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

# $USER1$ is the Nagios plugin path, defined in resource.cfg
# $HOSTADDRESS$ is the host's IP address
# $ARG1$ is the first argument of the check command, here check_nginx

[root@nagios ~]# systemctl restart nagios
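
To see what the macros turn into for the client host, the substitution can be imitated in the shell. The following is a minimal sketch for illustration only; Nagios performs this expansion internally, and expand_command is a hypothetical helper name. The plugin path assumes the default value of $USER1$ for this installation.

```shell
#!/bin/bash
# Sketch: imitate how Nagios fills in the macros of the check_nrpe command.
# expand_command is a hypothetical helper for illustration only.
expand_command() {
    local user1=$1 address=$2 arg1=$3
    local template='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
    # Quoted patterns are matched literally, so the $ signs are not expanded
    template=${template//'$USER1$'/$user1}
    template=${template//'$HOSTADDRESS$'/$address}
    template=${template//'$ARG1$'/$arg1}
    echo "$template"
}

expand_command /usr/local/nagios/libexec 192.168.117.15 check_nginx
# prints: /usr/local/nagios/libexec/check_nrpe -H 192.168.117.15 -c check_nginx
```

The printed line is the same command that was run by hand in step 3.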

7. Open the Nagios page in a browser; the client host and its three monitored items should now be listed

Nagios High Availability

Reference: https://www.jianshu.com/p/36584ff88cb9

IP               Hostname        Node
192.168.117.14   nagios_master   Master
192.168.117.16   nagios_slaver   Slave

Both hosts already have Nagios Core, nagios-plugins, and NRPE installed with identical configuration. Make sure NRPE on the two hosts can communicate.

1. On the Nagios master, edit nrpe.cfg, then restart NRPE

[root@nagios_master ~]# vim /usr/local/nagios/etc/nrpe.cfg
# add the slave's IP
allowed_hosts=127.0.0.1,::1,192.168.117.16
# add this line; it is used to check the Nagios process
command[check_nagios]=/usr/local/nagios/libexec/check_nagios -e 5 -F /usr/local/nagios/var/status.dat -C /usr/local/nagios/bin/nagios

[root@nagios_master ~]# systemctl restart nrpe

2. On the Nagios slave, edit nrpe.cfg, then restart NRPE

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/nrpe.cfg
# add the master's IP
allowed_hosts=127.0.0.1,::1,192.168.117.14

[root@nagios_slaver ~]# systemctl restart nrpe

3. Check that the slave can retrieve the master's check_nagios result

[root@nagios_slaver ~]# /usr/local/nagios/libexec/check_nrpe -H 192.168.117.14 -c check_nagios
NAGIOS OK: 6 processes, status log updated 8 seconds ago

4. Create the eventhandlers directory and copy the relevant scripts from the Nagios Core source tree into it

[root@nagios_slaver ~]# mkdir /usr/local/nagios/libexec/eventhandlers
[root@nagios_slaver ~]# cd /usr/local/src/nagioscore-nagios-4.4.3/contrib/eventhandlers/
[root@nagios_slaver eventhandlers]# cp enable_notifications /usr/local/nagios/libexec/eventhandlers/
[root@nagios_slaver eventhandlers]# cp disable_notifications /usr/local/nagios/libexec/eventhandlers/
[root@nagios_slaver eventhandlers]# cp redundancy-scenario1/handle-master-host-event /usr/local/nagios/libexec/eventhandlers/
[root@nagios_slaver eventhandlers]# cp redundancy-scenario1/handle-master-proc-event /usr/local/nagios/libexec/eventhandlers/

5. Edit handle-master-proc-event and change two lines

[root@nagios_slaver ~]# vim /usr/local/nagios/libexec/eventhandlers/handle-master-proc-event
                `$eventhandlerdir/enable_notifications`
                `$eventhandlerdir/disable_notifications`
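
The two lines sit inside a decision on the service state and state type that Nagios passes to the handler. The following is a simplified sketch of that decision, not a copy of the shipped script; decide_action is a hypothetical helper that only prints which script would be called.

```shell
#!/bin/bash
# Simplified sketch of the decision an event handler such as
# handle-master-proc-event makes; not a copy of the shipped script.
# decide_action is a hypothetical helper that only prints the action.
decide_action() {
    local state=$1 state_type=$2
    # Act only on HARD states so a single failed check does not switch roles
    if [ "$state_type" != "HARD" ]; then
        echo "none"
        return
    fi
    case "$state" in
        CRITICAL) echo "enable_notifications" ;;   # master's Nagios is down: slave starts alerting
        OK)       echo "disable_notifications" ;;  # master is back: slave stops alerting
        *)        echo "none" ;;
    esac
}

decide_action CRITICAL HARD
```

The arguments correspond to $SERVICESTATE$ and $SERVICESTATETYPE$ in the command definition for this handler.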

6. Set the ownership and permissions of the files in the eventhandlers directory

[root@nagios_slaver ~]# chown nagios:nagios /usr/local/nagios/libexec/eventhandlers/*
[root@nagios_slaver ~]# chmod 755 /usr/local/nagios/libexec/eventhandlers/*

7. Edit commands.cfg and add three definitions

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/commands.cfg
define command {
command_name handle-master-host-event
command_line $USER1$/eventhandlers/handle-master-host-event $HOSTSTATE$ $HOSTSTATETYPE$ $HOSTATTEMPT$
}


define command {
command_name handle-master-proc-event
command_line $USER1$/eventhandlers/handle-master-proc-event $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}


define command {
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

8. Edit localhost.cfg and add two definitions

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/localhost.cfg
define host {
        use                             critical-host
        host_name                       nagiosMaster
        alias                           nagios master
        address                         192.168.117.14
        event_handler                   handle-master-host-event
}

define service {
        use                              critical-service
        host_name                        nagiosMaster
        service_description              NAGIOS
        check_command                    check_nrpe!check_nagios
        event_handler                    handle-master-proc-event
}

9. Edit templates.cfg and add two definitions

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/templates.cfg
define host{
        name                            critical-host
        use                             generic-host
        check_period                    24x7
        check_interval                  5
        retry_interval                  1
        max_check_attempts              10
        check_command                   check-host-alive
        notification_period             workhours
        notification_interval           120
        notification_options            d,u,r
        contact_groups                  admins
        register                        0
        }

define service{
        name                            critical-service
        active_checks_enabled           1
        passive_checks_enabled          1
        parallelize_check               1
        obsess_over_service             1
        check_freshness                 0
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          0    ; if set to 1, alerts are suppressed while the service state is flapping
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              1
        check_interval                  1
        retry_interval                  1
        contact_groups                  admins
        notification_options            w,u,c,r
        notification_interval           60
        notification_period             24x7
        register                        0
        }

10. Edit nagios.cfg and restart the nagios service

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/nagios.cfg
# disable retention of program state
use_retained_program_state=0
[root@nagios_slaver ~]# systemctl restart nagios

11. Stop Nagios on the master and check the status of the monitored items on the slave's Nagios page

Configuring E-mail Alerts

1. Install sendmail and mailx

[root@nagios_slaver ~]# yum install -y sendmail* mailx

2. Configure SMTP

[root@nagios_slaver ~]# vim /etc/mail.rc
set from=82900528@qq.com
set smtp=smtp.qq.com
set smtp-auth-user=82900528@qq.com
# smtp-auth-password takes the mailbox's authorization code
set smtp-auth-password=**************
set smtp-auth=login

3. Edit contacts.cfg and make sure email is set to the correct address

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/contacts.cfg
    email                   82900528@qq.com

4. Add a contact group line in localhost.cfg

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/objects/localhost.cfg  # add one line to the service definition: contact group admins
contact_groups                   admins

5. Make sure that in nagios.cfg state retention is off and notifications are on

[root@nagios_slaver ~]# vim /usr/local/nagios/etc/nagios.cfg
# disable retention of program state
use_retained_program_state=0
# turn notifications on
enable_notifications=1

6. In commands.cfg, replace /usr/sbin/sendmail with /usr/bin/mail

[root@nagios_slaver ~]# sed -i.bak 's@/usr/sbin/sendmail@/usr/bin/mail@g' /usr/local/nagios/etc/objects/commands.cfg
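
Before editing the file in place, the substitution can be previewed on a sample line. The following is a minimal sketch; the sample command_line is a made-up stand-in, not the exact line shipped in commands.cfg, and swap_mailer is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: preview the substitution on a sample line before editing the file.
# The sample command_line below is a made-up stand-in, not the exact line
# shipped in commands.cfg.
sample='command_line /usr/bin/printf "%b" "alert text" | /usr/sbin/sendmail $CONTACTEMAIL$'

swap_mailer() {
    # Same expression as the sed -i.bak command, applied to stdin only
    sed 's@/usr/sbin/sendmail@/usr/bin/mail@g'
}

echo "$sample" | swap_mailer
```

Running grep -n '/usr/sbin/sendmail' against commands.cfg first shows which lines the in-place edit will touch.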

7. Restart the sendmail and nagios services

[root@nagios_slaver ~]# systemctl restart sendmail
[root@nagios_slaver ~]# systemctl restart nagios

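8. Change the state of Nagios on the master; the alert e-mail should then arrive

Sending the alert e-mail only once

1. Edit the template used by the relevant service in templates.cfg, and remember to restart the service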

[root@nagios_slaver objects]# vim templates.cfg
        notification_interval           0   ; interval between repeat notifications, in minutes; 0 sends the notification only once
[root@nagios_slaver objects]# systemctl restart nagios

Ongoing Nagios maintenance: https://blog.51cto.com/xtony/978758
