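Installing and Configuring Nagios

Monitoring server

1. Preparation before installation

(1) Resolve the dependencies needed to install Nagios:

The core Nagios components depend on httpd, gcc, and gd. The following command installs any of the required RPM packages that are not yet present: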

# yum -y install httpd gcc glibc glibc-common gd gd-devel php php-mysql mysql mysql-devel mysql-server
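To see which of these packages are actually missing before installing anything, a check along the following lines may help. This is only a minimal sketch: `missing_packages` is a hypothetical helper name (it is not part of rpm or Nagios), and it assumes the installed package names arrive one per line on standard input.

```shell
#!/bin/sh
# missing_packages: hypothetical helper, not part of rpm or Nagios.
# Reads installed package names (one per line) on stdin and prints each
# required package (given as arguments) that is not in that list.
missing_packages() {
    installed=$(cat)
    for pkg in "$@"; do
        # -x matches the whole line, -F treats the name as a literal string
        if ! printf '%s\n' "$installed" | grep -qxF -- "$pkg"; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

On an RPM-based system it could be fed with `rpm -qa --qf '%{NAME}\n' | missing_packages httpd gcc gd gd-devel`; an empty result would mean nothing from that list is missing.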

(2) Add the user and group that Nagios needs in order to run:

# groupadd  nagcmd

# useradd -G nagcmd nagios

# passwd nagios

Add apache to the nagcmd group so that it has sufficient permissions when Nagios is operated through the web interface:

# usermod -a -G nagcmd apache
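Whether the group memberships took effect can be verified afterwards. The sketch below assumes the group list is passed in as a space-separated string, such as the output of `id -nG`; `has_group` is a hypothetical helper name introduced only for illustration.

```shell
#!/bin/sh
# has_group: hypothetical helper. Succeeds (exit status 0) if the wanted
# group appears in a space-separated group list, e.g. from `id -nG USER`.
has_group() {
    group_list=$1
    wanted=$2
    for g in $group_list; do
        # Compare whole names so that "nagcmdx" does not match "nagcmd"
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}
```

For example, `has_group "$(id -nG apache)" nagcmd && echo "apache is in nagcmd"` should print the message once the usermod command above has been run.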

2. Compile and install Nagios:

# tar zxf nagios-3.3.1.tar.gz

# cd nagios-3.3.1

# ./configure --with-command-group=nagcmd --enable-event-broker

# make all

# make install

# make install-init

# make install-commandmode

# make install-config

Create the Nagios web configuration file in httpd's configuration directory (conf.d):

# make install-webconf

Create a user for logging in to the Nagios web interface; this account is used for authentication whenever Nagios is accessed through the web:

# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Once the steps above are complete, httpd must be restarted:

# service httpd restart

3. Compile and install nagios-plugins

All Nagios monitoring is carried out by plugins, so the official plugins must be installed before Nagios is started.

# tar zxf nagios-plugins-1.4.15.tar.gz

# cd nagios-plugins-1.4.15

# ./configure --with-nagios-user=nagios --with-nagios-group=nagios

# make

# make install

4. Configure and start Nagios

(1) Register nagios as a system service and add it to the list of services started at boot:

# chkconfig --add nagios

# chkconfig nagios on

(2) Check that the syntax of the main configuration file is correct:

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

(3) If the syntax check above reports no problems, the nagios service can now be started:

# service nagios start
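Steps (2) and (3) can be chained so that the service is only started when the check succeeds. The following is a rough sketch under the assumption that the check command signals failure through a non-zero exit status; `start_if_valid` is a hypothetical wrapper name.

```shell
#!/bin/sh
# start_if_valid: hypothetical wrapper. Runs a check command and only
# runs the start command when the check exits with status 0.
start_if_valid() {
    check_cmd=$1
    start_cmd=$2
    # Output of the check is discarded here; rerun it by hand to see errors
    if sh -c "$check_cmd" >/dev/null 2>&1; then
        sh -c "$start_cmd"
    else
        echo "configuration check failed; not starting" >&2
        return 1
    fi
}
```

With the paths used in this article, the call would be `start_if_valid "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" "service nagios start"`.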

(4) View Nagios through the web interface:

http://your_nagios_IP/nagios

 

Monitored host (monitoring a remote Linux host via NRPE)

1. Install and configure the monitored host

1) First add the nagios user

# useradd -s /sbin/nologin nagios

2) NRPE depends on nagios-plugins, so install that first

# tar zxf nagios-plugins-1.4.15.tar.gz

# cd nagios-plugins-1.4.15

# ./configure --with-nagios-user=nagios --with-nagios-group=nagios

# make all

# make install

3) Install NRPE

# tar -zxvf nrpe-2.12.tar.gz

# cd nrpe-2.12

# ./configure --with-nrpe-user=nagios \
    --with-nrpe-group=nagios \
    --with-nagios-user=nagios \
    --with-nagios-group=nagios \
    --enable-command-args \
    --enable-ssl

# make all

# make install-plugin

# make install-daemon

# make install-daemon-config

4) Configure NRPE

# vim /usr/local/nagios/etc/nrpe.cfg

log_facility=daemon

pid_file=/var/run/nrpe.pid

server_address=192.168.210.12

server_port=5666

nrpe_user=nagios

nrpe_group=nagios

allowed_hosts=192.168.210.11

command_timeout=60

connection_timeout=300

debug=0
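After editing, it can be useful to read individual settings back, for instance to confirm that allowed_hosts names the monitoring server (192.168.210.11 in this example). A minimal sketch follows, assuming simple `key=value` lines as shown above; `cfg_get` is a hypothetical helper name.

```shell
#!/bin/sh
# cfg_get: hypothetical helper. Prints the value of the first KEY=value
# line in FILE. Commented-out lines start with '#', so they never match.
cfg_get() {
    file=$1
    key=$2
    sed -n "s/^${key}=//p" "$file" | head -n 1
}
```

For example, `cfg_get /usr/local/nagios/etc/nrpe.cfg allowed_hosts` would be expected to print 192.168.210.11 with the settings above.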

5) Start NRPE

# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

To make the NRPE service easier to start, the following can be saved as the /etc/init.d/nrped script:

#!/bin/bash

# chkconfig: 2345 88 12

# description: NRPE DAEMON

NRPE=/usr/local/nagios/bin/nrpe

NRPECONF=/usr/local/nagios/etc/nrpe.cfg

case "$1" in

start)

echo -n "Starting NRPE daemon..."

$NRPE -c $NRPECONF -d

echo " done."

;;

stop)

echo -n "Stopping NRPE daemon..."

pkill -u nagios nrpe

echo " done."

;;

restart)

$0 stop

sleep 2

$0 start

;;

*)

echo "Usage: $0 start|stop|restart"

;;

esac

exit 0

 

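6) Configure the objects that remote hosts are allowed to monitor

On the monitored host, every service or resource that can be checked through NRPE must be defined as a command in nrpe.cfg: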

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10

command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1

command[check_sda3]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3

command[check_swap]=/usr/local/nagios/libexec/check_disk -w 40% -c 20% -p /dev/shm

command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

command[check_diskdisk]=/usr/local/nagios/libexec/check_diskdisk.sh
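Each command above points at a plugin, and a custom plugin only has to follow the Nagios plugin convention: print one line of status text and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The sketch below illustrates that convention only; `check_value` is a hypothetical function and is not the check_diskdisk.sh script referenced above, whose contents are not shown.

```shell
#!/bin/sh
# check_value: hypothetical illustration of the plugin convention.
# Usage: check_value LABEL VALUE WARN CRIT   (integers; higher is worse)
check_value() {
    label=$1
    value=$2
    warn=$3
    crit=$4
    if [ "$value" -ge "$crit" ]; then
        echo "$label CRITICAL - value is $value"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "$label WARNING - value is $value"
        return 1
    fi
    echo "$label OK - value is $value"
    return 0
}
```

A real plugin script would end with `exit` instead of `return` so that NRPE receives the status code.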

2. Configure the monitoring server

1) Install NRPE

# tar -zxvf nrpe-2.12.tar.gz

# cd nrpe-2.12

# ./configure --with-nrpe-user=nagios \
    --with-nrpe-group=nagios \
    --with-nagios-user=nagios \
    --with-nagios-group=nagios \
    --enable-command-args \
    --enable-ssl

# make all

# make install-plugin
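Installing the plugin only places the check_nrpe binary in libexec; services that use check_nrpe!check_load as their check_command also need a Nagios command named check_nrpe. The sketch below prints a commonly used definition. It assumes that $USER1$ points at /usr/local/nagios/libexec (as set in resource.cfg by a default source install), and `print_check_nrpe_command` is a hypothetical helper name.

```shell
#!/bin/sh
# print_check_nrpe_command: hypothetical helper that prints a Nagios
# command definition for check_nrpe. The quoted 'EOF' keeps the Nagios
# macros ($USER1$, $HOSTADDRESS$, $ARG1$) from being expanded by the shell.
print_check_nrpe_command() {
    cat <<'EOF'
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }
EOF
}
```

Assuming commands are kept in /usr/local/nagios/etc/objects/commands.cfg, the definition could be appended with `print_check_nrpe_command >> /usr/local/nagios/etc/objects/commands.cfg`, provided no check_nrpe command is defined there already.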

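2) Define how the remote host and its services are monitored:

Add one line to the main configuration file nagios.cfg: cfg_file=/usr/local/nagios/etc/objects/192.168.210.12.cfg

The contents of 192.168.210.12.cfg are as follows: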

define host{

use                     linux-server

host_name           192.168.210.12

alias                       0.12

address                 192.168.210.12

}

 

define service{

use                     generic-service

host_name               192.168.210.12

service_description     check_ping

check_command           check_ping!100.0,20%!200.0,50%

max_check_attempts 5

normal_check_interval 1

}

 

define service{

use                     generic-service

host_name               192.168.210.12

service_description     check_ssh

check_command           check_ssh

max_check_attempts      5

normal_check_interval 1

notification_interval           60

}

 

define service{

use                     generic-service

host_name               192.168.210.12

service_description     check_http

check_command           check_http

max_check_attempts      5

normal_check_interval 1

contact_groups         common

notifications_enabled  1

notification_period   24x7

notification_options   w,u,c,r

}

define service{

use     generic-service

host_name       192.168.210.12

service_description     check_load

check_command           check_nrpe!check_load

max_check_attempts 5

normal_check_interval 1

}

define service{

use     generic-service

host_name       192.168.210.12

service_description     check_disk_sda1

check_command           check_nrpe!check_sda1

max_check_attempts 5

normal_check_interval 1

}
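The NRPE-based service blocks differ only in the command name, so further ones (for example for check_users, which is defined on the monitored host) could be generated rather than typed by hand. This is merely a sketch: `nrpe_service` is a hypothetical helper, and the field values simply mirror the blocks above.

```shell
#!/bin/sh
# nrpe_service: hypothetical generator for the repetitive service blocks.
# Usage: nrpe_service HOST NRPE_COMMAND
nrpe_service() {
    host=$1
    cmd=$2
    # Unquoted EOF: $host and $cmd are expanded by the shell
    cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     $cmd
        check_command           check_nrpe!$cmd
        max_check_attempts      5
        normal_check_interval   1
        }
EOF
}
```

For example, `nrpe_service 192.168.210.12 check_users >> /usr/local/nagios/etc/objects/192.168.210.12.cfg` would add one more service; rerunning `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg` before restarting Nagios checks the result.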