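Part 1. Deploying Nagios
The Nagios deployment consists of the following steps:
1) Server-side setup: prepare the Apache and PHP environment first, then install Nagios
2) Install nrpe and the nagios-plugins package on the clients
3) Start nagios and nrpe, then test that the server and the clients can reach each other
4) Install rrdtool
5) Install pnp4nagios and integrate it with Apache and Nagios
6) NRPE proxy on Linux
7) Windows client configuration
8) NRPE proxy for Windows
Required software:
Source: httpd-2.2.11.tar, php-5.2.9.tar, nagios-3.3.1.tar, nagios-plugins-1.4.15.tar, nrpe-2.12.tar
RPM: xinetd-2.3.14-10.el5.i386, openssl, openssl-devel
Hosts:
Server
Client
Jump host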
(I) Server-side setup: prepare the Apache and PHP environment first, then install Nagios
1) Install Apache
Copy the software:
[/root]#scp -r nagios/ admin@10.35.100.109:/home/admin
Password: clone1root
[/root]#ssh admin@10.35.100.109
Password: clone1root
[admin@localhost ~]$ scp -r nagios/ 11.168.6.11:/home/admin
admin@11.168.6.11's password:
[admin@localhost ~]$ ssh 11.168.6.11
admin@11.168.6.11's password:
[admin@localhost ~]$su - root
Install Apache:
[/root/Desktop/mysql]#tar jxf httpd-2.2.11.tar.bz2 -C /usr/local/src/
[/usr/local/src/httpd-2.2.11]#ls
INSTALL README
[/usr/local/src/httpd-2.2.11]#vim INSTALL    # describes how to install
$ ./configure --prefix=PREFIX
$ make
$ make install
$ PREFIX/bin/apachectl start
[/usr/local/src/httpd-2.2.11]#./configure --prefix=/usr/local/apache2 --enable-mods-shared=most --enable-so --enable-rewrite --enable-ssl
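--enable-mods-shared=most: build most modules as shared modules
--enable-so: enable dynamic loading of modules
--enable-rewrite: URL rewriting
--enable-ssl: SSL support (https)
--enable-ssl requires the following packages: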
[/usr/local/src/httpd-2.2.11]#rpm -qa|grep openssl
openssl-0.9.8e-12.el5_4.6
openssl-devel-0.9.8e-12.el5_4.6
[/usr/local/src/httpd-2.2.11]#make
[/usr/local/src/httpd-2.2.11]#make install    # copies the files into place
[/usr/local/apache2]#ls
bin cgi-bin error icons logs manual
build conf htdocs include man modules
htdocs: web pages
bin: executables
modules: modules
logs: logs
conf: configuration files
[/usr/local/apache2]#service httpd stop
Stopping httpd: [  OK  ]
[/usr/local/apache2]#chkconfig httpd off
[/usr/local/apache2]#/usr/local/apache2/bin/apachectl restart
httpd not running, trying to start
[/usr/local/apache2]#ps -e|grep httpd
16977 ? 00:00:00 httpd
16979 ? 00:00:00 httpd
16980 ? 00:00:00 httpd
16981 ? 00:00:00 httpd
16982 ? 00:00:00 httpd
16983 ? 00:00:00 httpd
http://localhost/
It works!
Edit the Apache configuration file to add the Nagios directories and require authentication to access them:
vi /usr/local/apache2/conf/httpd.conf and append the following at the end:
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin
<Directory "/usr/local/nagios/sbin">
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
# File used to authenticate access to this directory
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
Alias /nagios /usr/local/nagios/share
<Directory "/usr/local/nagios/share">
Options None
AllowOverride None
Order allow,deny
Allow from all
AuthName "Nagios Access"
AuthType Basic
# File used to authenticate access to this directory
AuthUserFile /usr/local/nagios/etc/htpasswd
Require valid-user
</Directory>
2) Install PHP
[/root/Desktop/mysql]#tar jxf php-5.2.9.tar.bz2 -C /usr/local/src/
[/usr/local/src/php-5.2.9]#./configure --prefix=/usr/local/php --with-apxs2=/usr/local/apache2/bin/apxs --with-mysql=/usr/local/mysql --with-config-file-path=/usr/local/php
--with-apxs2=/usr/local/apache2/bin/apxs: use Apache's apxs tool to build PHP as an Apache module
--with-mysql=/usr/local/mysql: build with MySQL support
[/usr/local/src/php-5.2.9]#make
[/usr/local/src/php-5.2.9]#make install
[/usr/local/src/php-5.2.9]#cp php.ini-dist /usr/local/php/
[/usr/local/php]#mv php.ini-dist php.ini
[/usr/local/php]#ls
bin etc include lib man php.ini
[/usr/local/apache2/modules]#ls libphp5.so
libphp5.so
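[/usr/local/apache2]#vim conf/httpd.conf
# Load the PHP module (around line 99)
LoadModule php5_module modules/libphp5.so
# Recognize PHP pages (around line 355); note the space between "php" and ".php"
AddType application/x-httpd-php .php
# Default pages (around line 212)
DirectoryIndex index.html index.php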
[/usr/local/apache2/htdocs]#cat index.php
<?php
phpinfo();
?>
[/usr/local/apache2/htdocs]#
http://localhost/index.php
3) Install Nagios
[root@tjapp nagios]# tar zxf nagios-3.3.1.tar.gz
[root@tjapp nagios]# cd nagios
[root@tjapp nagios]# ./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios
[root@tjapp nagios]# make all
[root@tjapp nagios]# make install
[root@tjapp nagios]# make install-init
[root@tjapp nagios]# make install-commandmode
[root@tjapp nagios]# make install-config
[root@tjapp nagios]# ls /usr/local/nagios/
bin
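Compared with a typical GNU source package, installing Nagios takes a few extra steps (most
packages are finished after make install). These extra steps can also be skipped if you set the
directory and file permissions by hand and create the configuration files manually; the result is
the same. After installation, the following directories exist under /usr/local/nagios:
bin: the Nagios executables, chiefly the nagios binary
etc: the Nagios configuration files; right after installation it holds only a few *.cfg files
sbin: the Nagios CGI files, that is, the files needed to run external commands
share: the Nagios web pages
var: Nagios log files, the pid file, and similar files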
4) Install the Nagios plugins
[root@tjapp nagios]# tar zxf nagios-plugins-1.4.15.tar.gz
[root@tjapp nagios]# cd nagios-plugins-1.4.15
[root@tjapp nagios-plugins-1.4.15]# ./configure --prefix=/usr/local/nagios/
Note: if configure hangs at "checking for redhat spopen problem...", rerun it with --enable-redhat-pthread-workaround.
[root@tjapp nagios-plugins-1.4.15]# make
[root@tjapp nagios-plugins-1.4.15]# make install
[root@tjapp nagios-plugins-1.4.15]# ls /usr/local/nagios/libexec/
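This lists the installed plugin files; all plugins are installed under the libexec directory.
Add the user that Apache runs as to the nagios group:
Find the current Apache run user in httpd.conf: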
grep ^User /usr/local/apache2/conf/httpd.conf
User nobody
UserDir public_html
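Here it is nobody; add this user to the nagios group (-a appends, so existing supplementary groups are kept):
usermod -a -G nagios nobody
Web access to Nagios requires logging in as an authorized user. Here we add the user nagiosadmin with password 123: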
[root@tjapp nagios-plugins-1.4.15]# /usr/local/apache2/bin/htpasswd -c /usr/local/nagios/etc/htpasswd nagiosadmin
Check the file contents:
[root@tjapp nagios-plugins-1.4.15]# cat /usr/local/nagios/etc/htpasswd
nagiosadmin:4P.u3FO0gbI8Q
Check the Nagios configuration for errors:
[root@tjapp ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
A clean check ends like this:
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
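The pre-flight summary can also be checked from a script. The following is a minimal sketch, not part of Nagios: preflight_errors is a hypothetical helper that reads the output of "nagios -v" on standard input and prints the number after "Total Errors:". It assumes the summary line has the form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios): read "nagios -v" output on
# stdin and print the error count from the "Total Errors:" summary line.
# Exits non-zero if no such line is found.
preflight_errors() {
    awk -F: '/^Total Errors/ { gsub(/[ \t]/, "", $2); print $2; found = 1 }
             END { if (!found) exit 1 }'
}

# Demonstration on a sample summary instead of a live Nagios install.
printf '%s\n' 'Total Warnings: 0' 'Total Errors:   0' | preflight_errors
```

On a real server the helper would be fed from the check itself, for example: /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | preflight_errors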
[root@tjapp ~]# service nagios restart
Running configuration check...done.
Stopping nagios: done.
Starting nagios: done.
Open the URL: http://10.35.100.109/nagios/
user:nagiosadmin
passwd:123
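5) Configure Nagios
Edit the Nagios configuration files.
1. Edit the main configuration file nagios.cfg. For ease of maintenance, each kind of object goes in its own file; for example,
contacts are defined in contacts.cfg. nagios.cfg is long, so only the changed parts are shown:
# Comment out or delete this line
#cfg_file=/usr/local/nagios/etc/localhost.cfg
# Path to the host configuration file
cfg_file=/usr/local/nagios/etc/hosts.cfg
# Path to the host group configuration file
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
# Path to the contact configuration file
cfg_file=/usr/local/nagios/etc/contacts.cfg
# Path to the contact group configuration file
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
# Path to the service configuration file
cfg_file=/usr/local/nagios/etc/services.cfg
# Path to the time period configuration file
cfg_file=/usr/local/nagios/etc/timeperiods.cfg
# Allow actions from the web interface such as restarting Nagios or stopping host/service checks. The default is 0.
check_external_commands=1
# Interval at which external commands are checked; set it to suit your environment. The default is 1 second.
command_check_interval=10s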
2. Edit the CGI configuration file cgi.cfg. As with nagios.cfg, only the changed parts are shown:
# Separate multiple users with commas
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
authorized_for_system_commands=nagiosadmin
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin
The user "nagiosadmin" named here can stop, restart, and otherwise control the Nagios service from the browser; nagiosadmin is also the default user.
3. Edit the commands.cfg configuration file
define command{
4. Edit the hosts.cfg configuration file
define host {
host_name nagios-server
alias nagios server
address 11.168.6.11
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 252-server
alias nagios server
address
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 242-server
alias nagios server
address 192.168.0.242
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 241-server
alias nagios server
address 192.168.0.241
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 239-server
alias nagios server
address 192.168.0.239
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 2-server
alias nagios server
address 192.168.0.2
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 198-server
alias nagios server
address 192.168.0.198
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
define host {
host_name 172-server
alias nagios server
address 192.168.0.172
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
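The host definitions above differ only in host_name and address, so they can be generated rather than typed. The sketch below is an illustration under stated assumptions: gen_host is a hypothetical helper, not a Nagios tool, and it reuses the host name as the alias instead of the shared "nagios server" alias used above. The remaining directives are copied from the definitions in this section.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): print one "define host" block.
# $1 = host_name, $2 = address. The alias is set to the host name here,
# which is an assumption made for this sketch.
gen_host() {
    name=$1
    addr=$2
    cat <<EOF
define host {
host_name $name
alias $name
address $addr
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
EOF
}

# Example: emit definitions for two of the hosts listed above.
gen_host 242-server 192.168.0.242
gen_host 241-server 192.168.0.241
```

Redirecting the output (for example, >> /usr/local/nagios/etc/hosts.cfg) and rerunning the configuration check keeps the blocks consistent and avoids copy-and-paste slips such as a duplicated directive.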
5. Define the host group configuration file hostgroups.cfg
define hostgroup {
hostgroup_name dcw-servers
alias dcw servers
members nagios-server
}
define hostgroup {
hostgroup_name group-servers
alias 252 servers
members 252-server,242-server,241-server,239-server,2-server,198-server,172-server
}
6. Define the contact group configuration file contactgroups.cfg
define contactgroup {
contactgroup_name dcwgroup
alias system administrator group
members nagiosadmin
}
7. Define the service configuration file services.cfg
define service {
host_name nagios-server
service_description check-host-alive
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
contact_groups dcwgroup
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
check_command check-host-alive
}
define service {
host_name nagios-server
service_description check_tcp 80
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
contact_groups dcwgroup
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
check_command check_tcp!80
}
define service {
host_name nagios-server
service_description check-disk
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
contact_groups dcwgroup
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
check_command check_local_disk!20%!10%!/
}
define service {
host_name nagios-server
service_description check-load
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
contact_groups dcwgroup
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
check_command check_nrpe!check_load
}
define service {
host_name nagios-server
service_description total_procs
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
contact_groups dcwgroup
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
check_command check_nrpe!check_total_procs
}
Note:
define host{
host_name Nagios-Server ; host name; it is referenced in hostgroups.cfg and services.cfg
alias Nagios Server ; just an alias
address 192.168.0.206 ; IP address of the monitored host
check_command check-host-alive ; command used for the check
check_interval 1 ; interval between checks
retry_interval 1
max_check_attempts 5
check_period 24x7 ; time period for checks; servers are normally checked 24x7
process_perf_data 0
retain_nonstatus_information 0
contact_groups sagroup ; contact group notified when this host raises an alert
notification_interval 10 ; interval between notifications; the unit is minutes, not seconds as some documents state
notification_period 24x7 ; time period for notifications
notification_options d,u,r
}
6) Install nrpe on the server, mainly for verification (optional)
1.
[root@UnixHot nrpe-2.12]# make install-plugin
[root@UnixHot nrpe-2.12]# make install-daemon
[root@UnixHot nrpe-2.12]# make install-daemon-config
2.
First check whether xinetd is installed.
[root@UnixHot nrpe-2.12]# make install-xinetd
When editing xinetd.d/nrpe, add the IP address of the Nagios server to only_from.
vim /etc/xinetd.d/nrpe
service nrpe {
flags = REUSE
socket_type = stream
port = 5666
wait = no
user = nagios
group = nagios
server = /usr/local/nagios/bin/nrpe
server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
log_on_failure += USERID
disable = no
# only_from lists the IP addresses of the monitoring hosts, separated by spaces
only_from = 127.0.0.1 192.168.0.206
}
3. Add the port
[root@UnixHot nrpe-2.12]# vi /etc/services    # append the following line at the end
nrpe 5666/tcp #nrpe
4. Change the file owner
[root@prdora1 ~]# chown -R nagios:nagios /usr/local/nagios
[root@prdora1 ~]# /etc/init.d/xinetd restart
[root@UnixHot nrpe-2.12]# netstat -na | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
Or use: [/usr/local/nagios/etc/objects]#lsof -i:5666
COMMAND
xinetd
Notes:
1. To start nrpe as a standalone daemon: /usr/local/nrpe/bin/nrpe -c /usr/local/nrpe/etc/nrpe.cfg -d
2. If xinetd is not installed, the port can instead be set in the configuration file /usr/local/nagios/etc/nrpe.cfg.
[/usr/local/nagios/etc]#cat nrpe.cfg |grep hosts
# address.
allowed_hosts=127.0.0.1 192.168.0.252
[/usr/local/nagios/etc]#cat nrpe.cfg |grep allowed_hosts
allowed_hosts=127.0.0.1 192.168.0.252
(II) Client installation:
Required software: nagios-plugins-1.4.15.tar.gz
Preparation:
[admin@tjapp ~]$ scp -r nrpe/ 11.168.6.21:/home/admin
[admin@tjapp ~]$ ssh
passwd:clone1root
[admin@smapp1 ~]$ su - root
Password:
[root@smapp1 ~]# scp -r /home/admin/nrpe/ /root/
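Password: 123$%^
1. Add the nagios user
[root@prdora1 ~]# useradd -s /sbin/nologin nagios
Note: if the command above fails as shown below, clear the immutable attribute on the account files first:
[root@tjdb01 nrpe]# useradd -s /sbin/nologin nagios
useradd: unable to open password file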
[root@tjdb01 nrpe]# lsattr /etc/passwd
----i-------- /etc/passwd
[root@tjdb01 nrpe]# chattr -i /etc/passwd
[root@tjdb01 nrpe]# lsattr /etc/passwd
------------- /etc/passwd
[root@tjdb01 nrpe]# chattr -i /etc/shadow
[root@tjdb01 nrpe]# chattr -i /etc/group
[root@tjdb01 nrpe]# useradd -s /sbin/nologin nagios
2. Install the Nagios plugins (nagios-plugins)
[root@prdora1 src]# tar zxvf nagios-plugins-1.4.15.tar.gz
[root@prdora1 src]# cd nagios-plugins-1.4.15
[root@prdora1 nagios-plugins-1.4.15]# ./configure --enable-redhat-pthread-workaround
[root@prdora1 nagios-plugins-1.4.15]# make && make install
Note: this server runs AS4.8, so the extra configure option was added.
3. Install nrpe
[root@UnixHot src]# tar zxvf nrpe-2.12.tar.gz
[root@UnixHot nrpe-2.12]# ./configure && make all
[root@UnixHot nrpe-2.12]# make install-plugin
[root@UnixHot nrpe-2.12]# make install-daemon
[root@UnixHot nrpe-2.12]# make install-daemon-config
First check whether xinetd is installed.
[root@UnixHot nrpe-2.12]# make install-xinetd
This is the same as installing nrpe on the server; the only difference is that only_from in xinetd.d/nrpe needs just the IP address of the Nagios server.
vim /etc/xinetd.d/nrpe
service nrpe {
flags = REUSE
socket_type = stream
port = 5666
wait = no
user = nagios
group = nagios
server = /usr/local/nagios/bin/nrpe
server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd
log_on_failure += USERID
disable = no
# only_from lists the IP addresses of the monitoring hosts, separated by spaces
only_from = 127.0.0.1 11.168.6.21
}
Add the port
[root@UnixHot nrpe-2.12]# vi /etc/services    # append the following line at the end
nrpe 5666/tcp #nrpe
4. Change the file owner
[root@prdora1 ~]# chown -R nagios:nagios /usr/local/nagios
[root@prdora1 ~]# /etc/init.d/xinetd restart
[root@UnixHot nrpe-2.12]# netstat -na | grep 5666
tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN
Or use: [/usr/local/nagios/etc/objects]#lsof -i:5666
COMMAND
xinetd
5. Which nrpe check commands are available, and where are they defined?
[root@UnixHot ~]# vi /usr/local/nagios/etc/nrpe.cfg    (the following commands are defined by default)
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
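To see which command names a client exposes without opening the file in an editor, the command[...] lines can be extracted directly. This is a minimal sketch; list_nrpe_commands is a hypothetical helper, not part of NRPE, and it assumes each definition starts at the beginning of a line in the form command[name]=... as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of NRPE): print the names of all commands
# defined as "command[name]=..." in the nrpe.cfg-style file given as $1.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demonstration on a temporary file that mimics part of nrpe.cfg.
demo_cfg=$(mktemp)
cat > "$demo_cfg" <<'EOF'
# sample entries
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF
list_nrpe_commands "$demo_cfg"
rm -f "$demo_cfg"
```

The names printed are the ones a service definition on the Nagios server can pass to check_nrpe, as in check_nrpe!check_load.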
(III) Test connectivity between the client and the server
Test nrpe on the client:
/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.12
Test from the server:
/usr/local/nagios/libexec/check_nrpe -H 192.168.0.206
NRPE v2.12
If the commands above fail with the following error:
CHECK_NRPE: Error - Could not complete SSL handshake.
Fix: kill nrpe and restart it
pkill nrpe
service xinetd restart
11.168.255.24
Some machines do not allow users to be created, so the immutable attribute has to be cleared first:
[root@tongjiora ~]# chattr -i /etc/passwd
[root@tongjiora ~]# chattr -i /etc/shadow
[root@tongjiora ~]# lsattr /etc/group
----i-------- /etc/group
[root@tongjiora ~]# chattr -i /etc/group
To restore the attribute afterwards: chattr +i <file>
If xinetd cannot be installed, configure /usr/local/nagios/etc/nrpe.cfg instead and start nrpe as a daemon:
server_port=5666
server_address=127.0.0.1
allowed_hosts=127.0.0.1,11.168.6.11
[root@yzApache nrpe-2.12]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
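When nrpe runs as a daemon, a monitoring host that is missing from allowed_hosts is a common reason for a failed check_nrpe test. The sketch below shows one way to verify the setting; host_allowed is a hypothetical helper, not part of NRPE, and it does a plain textual comparison of IP addresses (no hostnames or network ranges).

```shell
#!/bin/sh
# Hypothetical helper (not part of NRPE): succeed if the IP address $2 is
# listed on the allowed_hosts line of the nrpe.cfg-style file $1.
# Entries may be separated by commas or spaces.
host_allowed() {
    grep '^allowed_hosts=' "$1" | cut -d= -f2 | tr ', ' '\n\n' | grep -Fqx "$2"
}

# Demonstration on a temporary file with the values used above.
demo_cfg=$(mktemp)
printf '%s\n' 'server_port=5666' 'allowed_hosts=127.0.0.1,11.168.6.11' > "$demo_cfg"
if host_allowed "$demo_cfg" 11.168.6.11; then
    echo "11.168.6.11 is allowed"
fi
rm -f "$demo_cfg"
```

The match is on whole entries, so 11.168.6.1 is not mistaken for 11.168.6.11.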
(IV) Install rrdtool
yum install -y pango pango-devel freetype freetype-devel libpng libpng-devel gettext gettext-devel libjpeg libjpeg-devel gd gd-devel libxml2 libxml2-devel libiconv libiconv-devel
tar xvf rrdtool-1.4.5.tar.gz    # unpack the archive
cd rrdtool-1.4.5
./configure --prefix=/usr/local/rrdtool
make && make install    # build and install
Or install from RPM packages with yum:
yum localinstall -y rrdtool-perl-1.4.4-1.el5.wrl.i386.rpm rrdtool-devel-1.4.4-1.el5.wrl.i386.rpm rrdtool-1.4.4-1.el5.wrl.i386.rpm    # installs the downloaded rrdtool RPMs together with their dependencies
(V) Installing and configuring pnp4nagios
1) Installation
# tar zvxf pnp4nagios-0.6.11.tar.gz
# mv pnp4nagios-0.6.11 pnp4nagios
# cd pnp4nagios
#./configure --prefix=/usr/local/pnp4nagios \
# make all
#make install
#make install-webconf
#make install-config
#make install-init
#make fullinstall
Note: problems you may run into
i. After make all
the output ends with:
*** Configuration summary for pnp4nagios-0.6.6 08-07-2010 ***
Based on ./configure --help, the following options can be used:
./configure --prefix=/usr/local/pnp4nagios --with-user=nagios --with-group=nagios --with-rrdtool=/usr/local/rrdtool/bin/rrdtool --with-httpd-conf=/usr/local/apache2/conf --with-init-dir=/etc/init.d
ii. ./configure --prefix=/usr/local/pnp4nagios --with-rrdtool=/usr/local/rrdtool/bin/rrdtool --with-nagios-user=nagios --with-nagios-group=nagios    # the nagios user and the nagcmd group were already created when Nagios was installed
Note: with rrdtool built from source, this step failed with an error.
Fix:
cp -R /usr/local/rrdtool/lib/perl/5.8.8/i386-linux-thread-multi/* /usr/lib/perl5/5.8.8/i386-linux-thread-multi/
./configure --prefix=/usr/local/pnp4nagios --with-rrdtool=/usr/local/rrdtool/bin/rrdtool --with-nagios-user=nagios --with-nagios-group=nagcmd    # rerun configure after copying the required Perl files
2) Configure pnp4nagios
i. Integrate pnp4nagios with Apache
First append "Include conf/pnp4nagios.conf" to the end of httpd.conf,
then add the file pnp4nagios.conf under /usr/local/apache2/conf/:
Alias /pnp4nagios "/usr/local/pnp4nagios/share"
<Directory "/usr/local/pnp4nagios/share">
# Use the same value as defined in nagios.conf
#
# Installation directory
RewriteCond %{REQUEST_FILENAME} !-f
</Directory>
ii. Rename the sample configuration files so that the programs recognize them, then restart the service
cd
mv misccommands.cfg-sample
mv nagios.cfg-sample nagios.cfg
mv rra.cfg-sample rra.cfg
cd /usr/local/pnp4nagios/etc/pages/
mv web_traffic.cfg-sample web_traffic.cfg
cd ../check_commands
mv check_all_local_disks.cfg-sample
mv check_nrpe.cfg-sample
service npcd restart
iii. Adjust the related parameters in the Nagios configuration files
[/usr/local/nagios/etc]#vim nagios.cfg
# Set this variable to 1
process_performance_data=1
# Uncomment the following two lines
host_perfdata_command=process-host-perfdata
service_perfdata_command=process-service-perfdata
cd /usr/local/nagios/etc/objects
vim commands.cfg    # edit the Nagios command configuration file
# 'process-host-perfdata' command definition
define command{
# 'process-service-perfdata' command definition
define command{
command_name
command_line
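}
# Append the definitions above to the end of the file, and delete or comment out the existing definitions of process-host-perfdata and process-service-perfdata. Note: the default commands.cfg already defines both commands; adding new definitions without removing the defaults causes a conflict, and the configuration check then reports an error.
vim templates.cfg    # edit the Nagios template configuration file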
define host {
action_url
register
define service {
name
action_url
register
}
vim hosts.cfg
define host {
host_name nagios-server
use linux-server,host-pnp
alias nagios server
address 192.168.0.123
contact_groups sagroup
check_command check-host-alive
max_check_attempts 5
notification_interval 10
notification_period 24x7
notification_options d,u
}
Define the services:
[/usr/local/nagios/etc/objects]#head -20 services.cfg
define service {
host_name nagios-server
service_description check-host-alive
use generic-service,srv-pnp
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
contact_groups sagroup
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
check_command check-host-alive
}
define serviceextinfo {
host_name
service_description
action_url
}
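# This is how a pnp4nagios icon is added to a service in the Nagios web interface. Note: once pnp4nagios is installed and hooked into Nagios it graphs the monitored services, but not every service automatically gets a pnp4nagios icon in the Nagios interface, so the definition above adds one explicitly.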
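(VI) NRPE proxy on Linux
Nagios monitors remote clients through NRPE. On each monitored host we install
(nagios-3.2.1)
nagios-plugins-1.4.14
nrpe-2.12
and start the service with /opt/nagios/bin/nrpe -d -c /opt/nagios/etc/nrpe.cfg (default listening port 5666) so that it can receive and handle nrpe requests from the monitoring host. On the monitoring host, test with:
/usr/local/nagios/libexec/check_nrpe -H remoteIP -c nrpeCommand
In a standard LAN, where all machines are on the same subnet, monitoring works without further changes. This section covers deploying Nagios monitoring across subnets.
Take monitoring the disk partition /app (/dev/sdb1) of production server 11.168.6.21 from server 192.168.0.123 (the company Nagios server) as an example. The following changes are needed:
On server 11.168.6.21:
1. No change is needed to "allowed_hosts" in /usr/local/nagios/etc/nrpe.cfg, because nrpe started through xinetd ignores this parameter
2. Edit /etc/xinetd.d/nrpe (note: separate multiple IPs with spaces)
only_from
3. Restart the xinetd service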
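On server 11.168.6.13:
1. Copy /usr/local/nagios/ to /opt/nagios
2. Edit nrpe.cfg
# Use a new pid file to avoid a conflict with the other nrpe instance
pid_file=/var/run/nrpe2.pid
# Set the port; 1414 is the only unused port among those mapped by the firewall
server_port=1414
# Add the company's gateway address on the medical insurance network (note: comma-separated)
allowed_hosts=127.0.0.1,10.40.251.1
# Define the command. To monitor further servers through the 11.168.6.13 proxy later, add similar commands; always include the IP address in the name to tell them apart.
command[check_6_21_sdb1]=/opt/nagios/libexec/check_nrpe -H 11.168.6.21
3. Start the newly configured nrpe (it could also be set up as an xinetd service)
# /opt/nagios/bin/nrpe -d -c /opt/nagios/etc/nrpe.cfg
On server 192.168.0.123:
The host group is named sm-center (the real-name center side)
1. The Nagios command check_nrpe uses the default nrpe port 5666, but this setup needs port 1414, so define a new command in the configuration file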
/usr/local/nagios/etc/objects/commands.cfg
Add the following:
#nrpe2
define command{
}
2. Create the directory /usr/local/nagios/etc/sm-center
3. Create the host configuration file /usr/local/nagios/etc/sm-center/6_21.cfg
define host{
define service{
4. Edit /usr/local/nagios/etc/hostgroups.cfg and add the host group sm-center
define hostgroup {
hostgroup_name sm-center
alias sm-center
members 11.168.6.21
}
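5. Start nagios
(VII) Nagios Windows client (monitoring Windows):
NSClient works differently: only NSClient is installed on the monitored machine, with no plugins. When the monitoring host sends a request to NSClient, NSClient carries out the check itself; all monitoring is done by NSClient.
This also points to a major limitation of NSClient: it is inflexible and not extensible. It can only perform its built-in checks and cannot be extended with plugins. Fortunately NSClient is already quite capable and covers essentially all of our monitoring needs.
Install NSClient (NSClient++-0.3.8-Win32.msi):
Download from: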
http://sourceforge.net/projects/nscplus/files/nscplus/NSClient++ 0.3.8/
If there is a router in between, set the allowed IP to the router's gateway address: 10.40.251.1
Service definitions on the server (Windows):
#vi /usr/local/nagios/etc/services/serverbj6.cfg
define service{
define service{
define service{
define service{
define service{
define service{
#/etc/init.d/nagios reload
Test commands to run from Linux
The following check_nt commands are run on the server:
# Check the uptime since the last boot
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v UPTIME
# Check memory usage
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v MEMUSE -w 80 -c 90
# Check the client version
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v CLIENTVERSION
# Check CPU load over the last 5 minutes
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v CPULOAD -w 80 -c 90 -l 5,80,90
# Check usage of drive C
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v USEDDISKSPACE -d SHOWALL -l C
# Check the state of a service
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v SERVICESTATE -l Spooler -d SHOWALL
# Check the state of a process
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v PROCSTATE -l spark.exe -d SHOWALL
# List all process instances
check_nt -H 192.168.1.121 -p 12489 -s 12345 -v INSTANCES -l process
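When these commands are run by hand, the result is also carried in the exit status, which follows the usual Nagios plugin convention: 0 is OK, 1 is WARNING, 2 is CRITICAL, and 3 is UNKNOWN. The sketch below turns that status into a readable name; nagios_state is a hypothetical helper written for this illustration, not a command shipped with the plugins.

```shell
#!/bin/sh
# Hypothetical helper: map a plugin exit status to its Nagios state name.
# Any value outside 0..2 is reported as UNKNOWN.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Demonstration with a fixed status; prints "CRITICAL".
nagios_state 2
```

In practice the helper would be called right after a check, for example by running one of the check_nt commands above and then nagios_state $?.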
(VIII) NRPE proxy for Windows
This works much like the Linux case; only the commands on the intermediate host 11.168.6.13 differ.
On 11.168.6.13:
[root@localhost etc]# vim nrpe.cfg
#10.40.45.250
command[check_45_250_mem]=/usr/local/nagios/libexec/check_nt -H 10.40.45.250 -p 12489 -v MEMUSE -w 80 -c 90
command[check_45_250_c]=/usr/local/nagios/libexec/check_nt -H 10.40.45.250 -p 12489 -v USEDDISKSPACE -l c -w 80 -c 90
command[check_45_250_d]=/usr/local/nagios/libexec/check_nt -H 10.40.45.250 -p 12489 -v USEDDISKSPACE -l d -w 80 -c 90
command[check_45_250_cpu]=/usr/local/nagios/libexec/check_nt -H 10.40.45.250 -p 12489 -v CPULOAD -w 80 -c 90 -l 5,80,90
Kill the process and restart it:
[root@localhost etc]# pkill nrpe
[root@localhost etc]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@localhost etc]# ps -ef|grep nrpe
nagios
root
On the client, set the allowed IP to: 10.35.100.109