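Monitoring devices with Nagios
*Nagios can effectively monitor the state of Windows, Linux, and Unix hosts, as well as network devices such as switches and routers
*Features:
(1) Monitors network services (SMTP|POP3|HTTP|TCP|ping, etc.)
(2) Monitors host resources (CPU|load|I/O|virtual and physical disk usage, etc.)
(3) Simple plugin design, so you can write checks for your own software
(4) Parallel service checks
(5) Sends notifications to contacts when a service or host problem occurs and when it is resolved
(6) Automatic log rotation
(7) Supports user-defined event handlers, which run when a host or service event occurs to help locate or resolve the problem
(8) Supports redundant monitoring hosts (distributed monitoring)
(9) Optional web interface for viewing current network status, notification and problem history, log files, etc.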
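The plugin design in item (3) rests on a simple convention: a plugin prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). A minimal sketch of that convention follows. The function name check_threshold and its integer-only thresholds are hypothetical and chosen for illustration; this is not a plugin shipped with nagios-plugins.

```shell
#!/bin/bash
# Hypothetical helper illustrating the Nagios plugin convention:
# one line of output plus an exit status of 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.

# check_threshold VALUE WARN CRIT
# Compares a non-negative integer VALUE against WARN and CRIT thresholds.
check_threshold() {
    local value=$1 warn=$2 crit=$3 n
    # Anything that is not a non-negative integer is reported as UNKNOWN.
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "UNKNOWN - invalid input"; return 3 ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"; return 1
    fi
    echo "OK - value is $value"; return 0
}
```

A real plugin would gather the value itself (for example disk usage) and finish with `exit $?` after calling a function like this one, so that Nagios sees the status code.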
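Common monitoring software and its characteristics
mrtg: in the early days, mainly used for site bandwidth and traffic graphs and historical trend graphs
nagios: its main strength is alerting; it can be combined with pnp, cacti, or hyperic for graphing | particularly well suited to checking whether a large number of services on many servers are healthy
cacti: graphing and historical trend graphs (via rrdtool); alerting is possible through plugins but is fairly weak, and fault analysis is poor
zabbix: graphing and alerting; graphs are drawn by a PHP program
munin: focused on historical trend graphs
hyperic: Java-based monitoring software
Components of the Nagios family
*A Nagios setup usually consists of the main program (nagios), a plugin package (nagios-plugins), and some optional add-ons (NRPE|NSClient++|NDOUtils|NSCA)
(1) NRPE: runs on the monitored host, usually a Linux/Unix system. It executes plugin scripts on the remote host and returns the results to the server, which is how host resources are monitored. It runs as a daemon and listens on port 5666
(2) NSClient++: runs on monitored Windows hosts and plays the same role as NRPE does on Linux
(3) NDOUtils: runs on the Nagios server and stores Nagios configuration information and event data in a database so that the data can be queried and processed (not recommended by the author, who sees no benefit in keeping the data in a database: if the database goes down, monitoring fails with it, and configuration becomes less flexible)
(4) NSCA: must be installed on both the Nagios server and the clients. It provides purely passive monitoring: the monitored remote host sends its check results to the Nagios server on its own initiative (used in distributed setups; with fewer than about 200 servers it can be skipped)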
Lab environment
server2:RS
server3|server4:Client
selinux Disabled
iptables off
Time synchronized across all hosts
Work around Perl build problems by setting the locale
[root@server2 ~]# echo 'export LC_ALL=C'>>/etc/profile
[root@server2 ~]# tail -1 /etc/profile
export LC_ALL=C
[root@server2 ~]# source /etc/profile
[root@server2 ~]# echo $LC_ALL
C
[root@server2 ~]#
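Appending with `>>` as above adds a duplicate line each time the step is repeated. A minimal sketch of an idempotent variant follows; the helper name append_once is hypothetical, and the demonstration writes to a scratch file rather than the real /etc/profile.

```shell
#!/bin/bash
# append_once FILE LINE (hypothetical helper)
# Appends LINE to FILE only if an identical line is not already present.
append_once() {
    local file=$1 line=$2
    # -x matches whole lines, -F treats the text literally, -q suppresses output.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a temporary file standing in for /etc/profile.
demo=$(mktemp)
append_once "$demo" 'export LC_ALL=C'
append_once "$demo" 'export LC_ALL=C'   # second call changes nothing
rm -f "$demo"
```

On the real system the call would be `append_once /etc/profile 'export LC_ALL=C'`, followed by `source /etc/profile` as in the transcript.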
On server2
[root@server2 ~]# yum install gcc glibc-common glibc gd gd-devel php* httpd -y
.....
gcc glibc-common glibc | build toolchain (install or upgrade)
gd gd-devel | graphics libraries used for pnp graphing
.....
[root@server2 ~]# yum install mysql* -y
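Server side
Install the Nagios main program
*The nagios user and the nagcmd group referenced by the commands below must already exist on server2
Nagios download URL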
https://sourceforge.net/projects/nagios/files/
[root@server2 ~]# tar zxf nagios-3.5.1.tar.gz
[root@server2 ~]# cd nagios
[root@server2 nagios]# ./configure --with-command-group=nagcmd
.....
Review the options above for accuracy. If they look okay,
type 'make all' to compile the main program and CGIs.
.....
[root@server2 nagios]# make all
.....
http://support.nagios.com
*************************************************************
Enjoy.
.....
[root@server2 nagios]# make install
.....
make install-init
- This installs the init script in /etc/rc.d/init.d
make install-commandmode
- This installs and configures permissions on the
directory for holding the external command file
make install-config
- This installs sample config files in /usr/local/nagios/etc
make[1]: Leaving directory `/root/nagios'
.....
*As the output suggests, a few more steps are needed
[root@server2 nagios]# make install-init
[root@server2 nagios]# make install-commandmode
[root@server2 nagios]# make install-config
Generate the Apache configuration for Nagios
[root@server2 nagios]# make install-webconf
.....
/usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios.conf
*** Nagios/Apache conf file installed ***
.....
[root@server2 nagios]# more /etc/httpd/conf.d/nagios.conf
.....
# SAMPLE CONFIG SNIPPETS FOR APACHE WEB SERVER
# Last Modified: 11-26-2005
#
# This file contains examples of entries that need
# to be incorporated into your Apache web server
# configuration file. Customize the paths, etc. as
# needed to fit your system.
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
# SSLRequireSSL
Options ExecCGI
AllowOverride None
Order allow,deny
Allow from all
# Order deny,allow
# Deny from all
# Allow from 127.0.0.1
AuthName "Nagios Access"
AuthType Basic
AuthUserFile /usr/local/nagios/etc/htpasswd.users
# File that stores the users and passwords required to log in to the Nagios web interface
Require valid-user
.....
[root@server2 nagios]#
*Set the account and password required to log in to the web interface
[root@server2 nagios]# htpasswd -c /usr/local/nagios/etc/htpasswd.users xmj
New password:
Re-type new password:
Adding password for user xmj
[root@server2 nagios]# cat /usr/local/nagios/etc/htpasswd.users
xmj:kGA87J7l.CBaI
[root@server2 nagios]#
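*Log in with the account and password just set
*At this point, clicking any link gives the state shown in the screenshot [screenshot not included]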
Install the plugin package
Install nagios-plugins
nagios-plugins download URL: https://nagios-plugins.org/downloads/
[root@server2 ~]# tar zxf nagios-plugins-1.4.16.tar.gz
[root@server2 ~]# cd nagios-plugins-1.4.16
[root@server2 nagios-plugins-1.4.16]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-mysql=/usr/bin/mysql --prefix=/usr/local/nagios
[root@server2 nagios-plugins-1.4.16]# make
[root@server2 nagios-plugins-1.4.16]# make install
[root@server2 ~]# ll /usr/local/nagios/libexec/ | wc -l # plugin directory
60
[root@server2 ~]# /etc/init.d/nagios start
Starting nagios: done.
[root@server2 ~]# /etc/init.d/nagios checkconfig # check the configuration syntax
Running configuration check... OK.
[root@server2 ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg # check the syntax with detailed output
[root@server2 ~]# ps -ef | grep nagios
nagios 26898 1 0 20:28 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg # nagios start command | configuration file
root 26904 1186 0 20:28 pts/0 00:00:00 grep nagios
[root@server2 ~]#
*This indicates a successful installation
*Note:
If the web interface returns an Internal Server Error page,
check the iptables and SELinux status
Install nrpe
*Client-side software
*The server needs the check_nrpe plugin (used to talk to NRPE on the clients)
nrpe download URL: https://sourceforge.net/projects/nagios/files/nrpe-2.x/
[root@server2 ~]# tar zxf nrpe-2.12.tar.gz
[root@server2 ~]# cd nrpe-2.12
[root@server2 nrpe-2.12]# ./configure
[root@server2 nrpe-2.12]# make all
[root@server2 nrpe-2.12]# make install-plugin
[root@server2 nrpe-2.12]# make install-daemon
[root@server2 nrpe-2.12]# make install-daemon-config
[root@server2 ~]# ls /usr/local/nagios/libexec/check_nrpe
/usr/local/nagios/libexec/check_nrpe
[root@server2 ~]#
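Client side
Compared with the Nagios server,
the client does not need: LAMP environment|nagios main program|gcc|glibc|glibc-common|gd|gd-devel|mysql*|httpd|php|php-gd (note that a C compiler such as gcc is still required when the plugins and nrpe are built from source, as they are below)
Install: nagios-plugins-1.4.16.tar.gz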
[root@server4 nagios-plugins-1.4.16]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios --with-perl-modules
[root@server4 nagios-plugins-1.4.16]# make
[root@server4 nagios-plugins-1.4.16]# make install
If the build fails with Perl module errors:
yum install perl-C*
Install: nrpe-2.12.tar.gz
[root@server3 ~]# cd /usr/local/nagios/etc/
[root@server3 etc]# cp nrpe.cfg nrpe.cfg.ori
[root@server3 etc]# vim nrpe.cfg # nrpe configuration file
[root@server3 etc]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d # start nrpe as a background daemon
[root@server3 etc]# ps -ef | grep nrpe
nagios 1657 1 0 19:35 ? 00:00:00 /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
root 1659 1062 0 19:35 pts/0 00:00:00 grep nrpe
[root@server3 etc]# lsof -i :5666
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nrpe 1657 nagios 4u IPv4 13374 0t0 TCP *:5666 (LISTEN)
[root@server3 etc]#
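The change made to nrpe.cfg above is not shown. One edit that is commonly needed, and assumed here, is adding the Nagios server address to allowed_hosts so that check_nrpe requests from server2 (172.25.66.2) are accepted. The sketch below applies it to a scratch copy; the helper name set_allowed_hosts is hypothetical, and the command[check_load] line mirrors the sample entry usually found in nrpe.cfg.

```shell
#!/bin/bash
# set_allowed_hosts FILE HOSTS (hypothetical helper)
# Rewrites the allowed_hosts= line of an nrpe.cfg-style file, keeping FILE.bak as a backup.
set_allowed_hosts() {
    local file=$1 hosts=$2
    sed -i.bak "s/^allowed_hosts=.*/allowed_hosts=${hosts}/" "$file"
}

# Demonstration on a temporary file standing in for /usr/local/nagios/etc/nrpe.cfg.
cfg=$(mktemp)
printf '%s\n' \
    'allowed_hosts=127.0.0.1' \
    'command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20' > "$cfg"
set_allowed_hosts "$cfg" '127.0.0.1,172.25.66.2'
grep '^allowed_hosts=' "$cfg"
rm -f "$cfg" "$cfg.bak"
```

After changing the real file, the nrpe daemon has to be restarted for the new setting to take effect.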
Install the related Perl modules:
[root@server3 ~]# tar zxfv Params-Validate-0.91.tar.gz
[root@server3 ~]# tar zxvf Class-Accessor-0.31.tar.gz
[root@server3 ~]# tar zxvf Config-Tiny-2.12.tar.gz
[root@server3 ~]# tar zxvf Math-Calc-Units-1.07.tar.gz
[root@server3 ~]# tar zxvf Regexp-Common-2010010201.tar.gz
[root@server3 ~]# cd Params-Validate-0.91
[root@server3 Params-Validate-0.91]# perl Makefile.PL
[root@server3 Params-Validate-0.91]# make
[root@server3 Params-Validate-0.91]# make install
[root@server3 ~]# yum install sysstat -y
Common commands for monitoring system resources
Memory: top|iostat|sar|free|vmstat|mpstat
CPU: top|vmstat|mpstat|iostat|sar
I/O: vmstat|mpstat|iostat|sar
Processes (IPC resources): ipcs|ipcrm
Load: uptime
Configure the Nagios server
[root@server2 ~]# tree /usr/local/nagios/etc/ # Nagios configuration files
/usr/local/nagios/etc/
|-- cgi.cfg
|-- htpasswd.user
|-- htpasswd.users
|-- nagios.cfg # main Nagios configuration file
|-- nrpe.cfg # client (NRPE) configuration file
|-- objects
| |-- commands.cfg
| |-- contacts.cfg
| |-- localhost.cfg
| |-- printer.cfg
| |-- switch.cfg
| |-- templates.cfg
| |-- timeperiods.cfg
| `-- windows.cfg
`-- resource.cfg
1 directory, 14 files
[root@server2 etc]# cp nagios.cfg nagios.cfg.ori
[root@server2 etc]# vim nagios.cfg
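The edit to nagios.cfg is not shown. A plausible version, assumed here, comments out the sample localhost.cfg entry and registers separate host and service files through cfg_file lines. The sketch below does this on a scratch copy; the helper name use_split_objects and the file names hosts.cfg and services.cfg are assumptions, and whatever names are registered must match the files actually created under objects/.

```shell
#!/bin/bash
# use_split_objects FILE (hypothetical helper)
# Comments out the localhost.cfg entry and registers hosts.cfg and services.cfg.
use_split_objects() {
    local file=$1 dir=/usr/local/nagios/etc/objects name
    # "&" re-inserts the matched text, so the line is kept but prefixed with "#".
    sed -i.bak "s|^cfg_file=${dir}/localhost.cfg|#&|" "$file"
    for name in hosts services; do
        # Only append the entry if it is not already registered.
        grep -qxF "cfg_file=${dir}/${name}.cfg" "$file" ||
            printf 'cfg_file=%s/%s.cfg\n' "$dir" "$name" >> "$file"
    done
}

# Demonstration on a temporary file standing in for nagios.cfg.
cfg=$(mktemp)
printf '%s\n' \
    'cfg_file=/usr/local/nagios/etc/objects/commands.cfg' \
    'cfg_file=/usr/local/nagios/etc/objects/localhost.cfg' > "$cfg"
use_split_objects "$cfg"
cat "$cfg"
rm -f "$cfg" "$cfg.bak"
```

Running `/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg` afterwards reports any cfg_file entry that points at a missing file.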
[root@server2 etc]# pwd
/usr/local/nagios/etc
[root@server2 etc]# cd objects/
[root@server2 objects]# head -51 localhost.cfg > hosts.cfg
[root@server2 objects]# touch service.cfg
[root@server2 objects]# chown nagios.nagios hosts.cfg service.cfg
[root@server2 objects]# ll
total 52
-rw-rw-r-- 1 nagios nagios 7716 Jun 6 17:03 commands.cfg
-rw-rw-r-- 1 nagios nagios 2166 Jun 6 17:03 contacts.cfg
-rw-r--r-- 1 nagios nagios 1870 Jun 9 20:15 hosts.cfg
-rw-rw-r-- 1 nagios nagios 5403 Jun 6 17:03 localhost.cfg
-rw-rw-r-- 1 nagios nagios 3124 Jun 6 17:03 printer.cfg
-rw-r--r-- 1 nagios nagios 0 Jun 9 20:15 service.cfg
-rw-rw-r-- 1 nagios nagios 3293 Jun 6 17:03 switch.cfg
-rw-rw-r-- 1 nagios nagios 10812 Jun 6 17:03 templates.cfg
-rw-rw-r-- 1 nagios nagios 3208 Jun 6 17:03 timeperiods.cfg
-rw-rw-r-- 1 nagios nagios 4019 Jun 6 17:03 windows.cfg
[root@server2 objects]# vim hosts.cfg
.....
define host{
use linux-server
host_name server2
alias server2
address 172.25.66.2
}
define host{
use linux-server
host_name server3
alias server3
address 172.25.66.3
.....
}
define hostgroup{
hostgroup_name linux-servers ; The name of the hostgroup
alias Linux Servers ; Long name of the group
members server2,server3
}
.....
[root@server2 objects]# /etc/init.d/nagios checkconfig
Running configuration check... CONFIG ERROR! Check your Nagios configuration.
[root@server2 objects]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
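[root@server2 objects]# vim services.cfg # the file created earlier was service.cfg; the name used here must match the cfg_file entry in nagios.cfg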
.....
define service{
use generic-service
host_name server2
service_description Current Load
check_command check_nrpe!check_load
max_check_attempts 2
normal_check_interval 4
retry_check_interval 4
check_period 24x7
notification_interval 1440
notification_period 24x7
notification_options w,u,c,r
contact_groups admins
process_perf_data
.....
[root@server2 objects]# vi commands.cfg
.....
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
.....
[root@server2 objects]# ll /usr/local/nagios/libexec/check_nrpe
-rwxrwxr-x 1 nagios nagios 76736 Jun 6 21:34 /usr/local/nagios/libexec/check_nrpe
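To see how the service's check_command value `check_nrpe!check_load` relates to the command_line above: the text before the first `!` selects the command, and the text after it becomes $ARG1$. The sketch below mimics that substitution for the check_nrpe case only. It is an illustration, not how Nagios itself is implemented; the function name is hypothetical, and it assumes $USER1$ is /usr/local/nagios/libexec (the usual value in resource.cfg for this install prefix).

```shell
#!/bin/bash
# expand_check_command CHECK_COMMAND HOSTADDRESS (hypothetical helper)
# Prints the command line that "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$"
# would expand to. Assumes the command_name equals the plugin file name.
expand_check_command() {
    local check_command=$1 hostaddress=$2
    local user1=/usr/local/nagios/libexec   # assumed value of $USER1$
    local name arg1
    # Split on the first "!": name gets the command, arg1 gets the rest.
    IFS='!' read -r name arg1 <<< "$check_command"
    printf '%s/%s -H %s -c %s\n' "$user1" "$name" "$hostaddress" "$arg1"
}

expand_check_command 'check_nrpe!check_load' 172.25.66.2
```

Running the printed command by hand on server2 is a quick way to confirm that the NRPE daemon on the target host answers before relying on the web interface.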
[root@server2 objects]# /etc/init.d/nagios start
Starting nagios: done.
[root@server2 objects]# vi ../cgi.cfg +119
.....
authorized_for_system_information=xmj
authorized_for_configuration_information=xmj
authorized_for_all_services=xmj
authorized_for_all_hosts=xmj
.....
[root@server2 objects]# /etc/init.d/nagios reload
Running configuration check...done.
Reloading nagios configuration...done
[root@server2 objects]#