Nagios Installation

I. Installing Nagios on the server (monitoring host). Server IP address: 192.168.0.13


1. Before installing, check that the following packages are present: httpd php gcc glibc glibc-common gd gd-devel


#rpm -qa | grep httpd
#rpm -qa | grep php 
....
#rpm -qa | grep gd
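
The per-package checks above can be folded into one loop. This is only a sketch: it assumes `rpm -q` is available, and the function name `missing_packages` and the `PKG_QUERY` override variable are made up for illustration.

```shell
# Print every package from the argument list that the query command
# reports as not installed. PKG_QUERY is a hypothetical override hook;
# it defaults to "rpm -q", which exits non-zero for a missing package.
missing_packages() {
    for pkg in "$@"; do
        ${PKG_QUERY:-rpm -q} "$pkg" >/dev/null 2>&1 || printf '%s\n' "$pkg"
    done
}

# Usage on the monitoring host:
#   missing_packages httpd php gcc glibc glibc-common gd gd-devel
```

An empty result means every listed package is installed; anything printed still needs to be installed before continuing.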


2. Create the user and group


#useradd nagios
#groupadd nagcmd
#/usr/sbin/usermod -a -G nagcmd nagios
#/usr/sbin/usermod -a -G nagcmd apache


3. Install the Nagios package (version 3.2.0 is used here)


#tar zxvf nagios-3.2.0.tar.gz
#cd nagios-3.2.0
#./configure --prefix=/usr/local/nagios --with-command-group=nagcmd
#make
#make install
#make install-init
#make install-config
#make install-commandmode


#make install-webconf  


4. Create the admin user and start Apache


#htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
passwd:******
The user created here is nagiosadmin. If you use a different user name, you will have to edit /usr/local/nagios/etc/cgi.cfg later; this is covered further down.


#service httpd restart


5. Install nagios-plugins (version 1.4.13 is used here)


#tar zxvf nagios-plugins-1.4.13.tar.gz
#cd nagios-plugins-1.4.13
#./configure --with-nagios-user=nagios --with-nagios-group=nagios --prefix=/usr/local/nagios
#make
#make install


6. Register the service and enable it at boot


#chkconfig --add nagios
#chkconfig nagios on


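7. The basic installation is now complete and can already monitor some services on the local host. Verify the configuration file and start Nagios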


#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...


Total Warnings: 0
Total Errors:   0


If you see this output, the configuration file has no errors and Nagios can be started.


#service nagios start
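
The verify-then-start step can be scripted so that Nagios is only started when the summary reports zero errors. A minimal sketch, assuming the summary lines look like the output shown above; the function name `config_summary_ok` is my own.

```shell
# Read the output of "nagios -v" on stdin and succeed only when the
# summary reports zero errors. Warnings are printed but not fatal.
# Exit codes: 0 = no errors, 1 = errors found, 2 = no summary found.
config_summary_ok() {
    awk '
        /^Total Warnings:/ { warnings = $3 }
        /^Total Errors:/   { errors = $3; seen = 1 }
        END {
            if (!seen) exit 2
            if (warnings > 0) print "warnings: " warnings
            exit (errors == 0 ? 0 : 1)
        }'
}

# Usage:
#   /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg \
#       | config_summary_ok && service nagios start
```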


8. Log in and check
http://192.168.0.13/nagios/
Enter the user name nagiosadmin and the password you set; you can then start using the interface.




########################################################################
 
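At this point only the most basic steps are done. The most important part is configuration: with our own configuration we can monitor exactly the hosts and services we care about. The goal of installing Nagios is certainly not to monitor a single server but a whole group of servers, and that requires NRPE. NRPE has to be installed on both the monitoring host and the monitored hosts, and it uses port 5666 by default.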


########################################################################
 
II. Nagios configuration


1. Install NRPE on the server (version 2.12 is used here)


#tar zxvf nrpe-2.12.tar.gz
#cd nrpe-2.12
#./configure     (by default NRPE installs under /usr/local/nagios/, i.e. the same directory as nagios-plugins)
#make all
#make install-plugin
#make install-daemon
#make install-daemon-config


# ls /usr/local/nagios/libexec/check_nrpe 
/usr/local/nagios/libexec/check_nrpe     
If this file exists, the installation succeeded.


# ll /usr/local/nagios/
total 24
drwxrwxr-x  2 nagios nagios 4096 Jul 21 19:09 bin
drwxrwxr-x  3 nagios nagios 4096 Jul 22 13:35 etc
drwxrwxr-x  2 nagios nagios 4096 Jul 21 19:09 libexec
drwxrwxr-x  2 nagios nagios 4096 Jul 21 18:57 sbin
drwxrwxr-x 10 nagios nagios 4096 Jul 21 19:03 share
drwxrwxr-x  5 nagios nagios 4096 Jul 22 14:25 var


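Note that at this point all files and subdirectories under the nagios directory are owned by user nagios and group nagios, with one exception: /usr/local/nagios/etc/htpasswd.users is owned by root root. Files added later should also be nagios nagios; if this goes wrong, permission problems may show up later.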
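
Ownership can be checked mechanically rather than by eye. The sketch below uses plain `find`; the function name `list_foreign_owned` is hypothetical, and it assumes that anything it prints other than htpasswd.users deserves a second look.

```shell
# List everything under a path whose owner or group differs from the
# expected ones. Prints nothing when ownership is consistent.
list_foreign_owned() {
    dir=$1 user=$2 group=$3
    find "$dir" \( ! -user "$user" -o ! -group "$group" \) -print
}

# Usage:
#   list_foreign_owned /usr/local/nagios nagios nagios
```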


2. Configure the main Nagios configuration file nagios.cfg
#  cat nagios.cfg     (only the changed parts are shown; same below)


cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg


Add the following 4 lines, pointing to the location of the new files
cfg_file=/usr/local/nagios/etc/hosts.cfg
cfg_file=/usr/local/nagios/etc/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/contactgroups.cfg
cfg_file=/usr/local/nagios/etc/services.cfg




# Definitions for monitoring the local (Linux) host
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg  # commented out, because hosts.cfg replaces it


command_check_interval=10s
#command_check_interval=-1  # originally -1; changed to 10s




3. Create the files referenced by the 4 new lines above: hosts.cfg hostgroups.cfg contactgroups.cfg services.cfg
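
Creating the files takes only a few commands. The sketch below wraps them in a hypothetical helper, `create_object_files`, whose directory argument defaults to the install prefix used in this article.

```shell
# Create the four object files referenced from nagios.cfg if they do
# not exist yet. Existing files are left untouched apart from mtime.
create_object_files() {
    etc_dir=${1:-/usr/local/nagios/etc}
    for name in hosts hostgroups contactgroups services; do
        touch "$etc_dir/$name.cfg" || return 1
    done
}

# Usage (as root), followed by handing the files to the nagios user:
#   create_object_files
#   cd /usr/local/nagios/etc
#   chown nagios:nagios hosts.cfg hostgroups.cfg contactgroups.cfg services.cfg
```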


4. Configure hosts.cfg, hostgroups.cfg and contactgroups.cfg


# cat hosts.cfg 


define host {
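host_name               nagios-server    ; must match the name used in hostgroups.cfg
alias                   nagios server
address                 192.168.0.13     ; IP of the monitored host
contact_groups          sagroup          ; contact group to notify, defined in contactgroups.cfg
check_command           check-host-alive  ; a command that must be defined in objects/commands.cfg
max_check_attempts      5          ; number of check attempts, usually 3-5
notification_interval   10  ; interval between repeat notifications, in minutes; adjust as needed
notification_period     24x7              ; around the clock; must be written with x, not *; same below
notification_options    d,u,r           ; states to notify on: d-down, u-unreachable, r-recovery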
}


----------------------------------------------------
# cat hostgroups.cfg     (defines the group and its members)


define hostgroup {
hostgroup_name  sa-servers
alias           sa servers
members         nagios-server     ; (multiple hosts are separated by "," with no spaces)
}


----------------------------------------------------


# cat contactgroups.cfg 


define contactgroup {
contactgroup_name       sagroup
alias                   system administrator group
members                 nagiosadmin
}


--------------------


5. Configure cgi.cfg


# cat cgi.cfg
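# Set to 0 to disable CGI authentication of users
use_authentication=0


# The admin user created earlier is nagiosadmin, so nothing needs changing here.
# If you created a different user, change it; multiple users are separated by ",".
authorized_for_system_information=nagiosadmin
authorized_for_configuration_information=nagiosadmin
# * Do not change this line, even if you use a different user. *
authorized_for_system_commands=nagiosadmin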
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin




6. Configure nrpe.cfg


# cat nrpe.cfg | sed -n '/^[^#]/p'


log_facility=daemon
pid_file=/var/run/nrpe.pid
# Port number; can be changed
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
# Hosts allowed to connect to this daemon, i.e. the IP of the monitoring server
allowed_hosts=127.0.0.1,192.168.0.13
 
dont_blame_nrpe=0
debug=0
command_timeout=60
connection_timeout=300
# Command definitions follow
# Logged-in users: WARNING above 5, CRITICAL above 10
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
# Load average: the three values are the 1-, 5- and 15-minute averages
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
# Number of zombie processes
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
# Total number of processes
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
# Swap: WARNING below 20% free, CRITICAL below 10% free
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
# Disk partition: WARNING below 20% free, CRITICAL below 10% free
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda3






You can also define your own checks by writing scripts; I will add that later.
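
Until that part is written, here is a rough idea of what such a script might look like. It follows the standard plugin convention of one status line on stdout plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name `check_value` and the example check are hypothetical, not something from the original setup.

```shell
# Compare an integer value against warning and critical thresholds and
# report it the way Nagios plugins do: a status line (with perfdata
# after the "|") and an exit code of 0, 1, 2 or 3.
check_value() {
    label=$1 value=$2 warn=$3 crit=$4
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*) echo "UNKNOWN - $label: expected integers"; return 3 ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label=$value|$label=$value;$warn;$crit"; return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label=$value|$label=$value;$warn;$crit"; return 1
    fi
    echo "OK - $label=$value|$label=$value;$warn;$crit"
    return 0
}

# In a standalone plugin script this could end with, for example:
#   check_value logins "$(who | wc -l | tr -d ' ')" 5 10; exit $?
```

Saved as an executable script under /usr/local/nagios/libexec/ (the file name is up to you), it could then be registered in nrpe.cfg with another `command[...]=` line like the ones above.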


7. Configure objects/contacts.cfg


# cat objects/contacts.cfg


define contact{
contact_name                    nagiosadmin
alias                           system administrator
service_notification_period     24x7
host_notification_period        24x7
service_notification_options    w,u,c,r                  ; Warning, Unknown, Critical, Recovery
host_notification_options       d,u,r
service_notification_commands   notify-service-by-fetion,notify-service-by-sms   ; notification methods
host_notification_commands      notify-host-by-fetion,notify-host-by-sms         ; same as above
email                           **********@139.com
pager                           152******13
}




8. Configure objects/commands.cfg


# cat objects/commands.cfg     (only the definitions that must be present are listed; the rest need no changes)


# 'check-host-alive' command definition


define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
        }
# 'check_nrpe' command definition (you must define this one yourself; it matters because services.cfg depends on it)


define command{
       command_name check_nrpe
       command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$    ; $ARG1$ is the command passed after check_nrpe, e.g. check_disk
       }




# 'notify-host-by-fetion' command definition   (Fetion alert configuration)


define command{
        command_name    notify-host-by-fetion
        command_line    /usr/local/fetion/fetion --mobile=152******** --pwd=******** --to $CONTACTPAGER$ --msg-utf8="$HOSTNAME$ is $HOSTSTATE$" --debug
}


# 'notify-service-by-fetion' command definition
define command{
        command_name    notify-service-by-fetion
        command_line    /usr/local/fetion/fetion --mobile=152******** --pwd=******** --to $CONTACTPAGER$ --msg-utf8="$NOTIFICATIONTYPE$: $HOSTALIAS$/$SERVICEDESC$ IS $SERVICESTATE$" --debug
        }




# 'notify-host-by-sms' command definition      (e-mail alert configuration)


define command {
       command_name notify-host-by-sms
       command_line  /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" |/usr/local/sendEmail/sendEmail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
        }


# 'notify-service-by-sms' command definition


define command {
       command_name notify-service-by-sms
       command_line  /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/local/sendEmail/sendEmail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
       }


9. Configure services.cfg


#cat services.cfg


###nagios-server:services.cfg###


define service {
host_name               nagios-server     ; the host name must match the definition in hosts.cfg
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive  ; a command already defined in objects/commands.cfg
}




define service {
host_name               nagios-server
service_description     check_tcp 80
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_tcp!80   ; what follows the exclamation mark is the argument
}






define service {
host_name               nagios-server
service_description     check_local_disk
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
#check_command          check_local_disk!20%!10%!/
check_command           check_nrpe!check_disk
}






define service {
host_name               nagios-server
service_description     check_load
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_load
}


define service {
host_name               nagios-server
service_description     check_total_procs
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_total_procs
}


define service {
host_name               nagios-server
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}




Six services are defined here. To monitor services on other hosts, define them here as well; this is covered below.
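
Since the NRPE-based definitions differ only in host name and check name, they can be generated instead of typed. A small sketch with a hypothetical helper, `gen_services`; the fixed values are copied from the definitions above, and it only covers checks that go through check_nrpe.

```shell
# Emit one check_nrpe service definition per check name for one host.
gen_services() {
    host=$1; shift
    for check in "$@"; do
        cat <<EOF
define service {
host_name               $host
service_description     $check
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!$check
}

EOF
    done
}

# Usage:
#   gen_services nagios-server check_load check_users >> services.cfg
```

Review the generated text before appending it, so an existing definition is not duplicated.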






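10. A big part of the configuration is now done; anything added later builds on this and will be easy.
Next, start NRPE and restart Nagios to check that the configuration works.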


#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...


Total Warnings: 0
Total Errors:   0


If you see this output, the configuration file has no errors and Nagios can be restarted.


#service nagios restart     (restarted successfully)




# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
# tail -f /var/log/messages
Jul 22 16:25:16 localhost nrpe[14911]: Starting up daemon
Jul 22 16:25:16 localhost nrpe[14911]: Listening for connections on port 5666 
Jul 22 16:25:16 localhost nrpe[14911]: Allowing connections from: 127.0.0.1,192.168.0.13
Log messages like the above indicate a successful start. Now test it.




# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.13
NRPE v2.12                     (the NRPE version is displayed)


# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.13 -c check_disk
DISK OK - free space: / 242377 MB (87% inode=99%);| /=34099MB;233219;262371;0;291524


Output like this indicates success!
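
The part of the plugin output after the `|` is performance data in the form `label=value;warn;crit;min;max`. As a sketch of how to pull the values out for a quick look, here is a hypothetical helper, `perfdata_values`. It assumes labels contain no spaces, which holds for the checks in this article.

```shell
# Read one line of plugin output on stdin and print "label=value" for
# each performance-data item found after the "|". Returns 1 when the
# line carries no performance data.
perfdata_values() {
    IFS= read -r line || [ -n "$line" ] || return 1
    case "$line" in
        *'|'*) perf=${line#*"|"} ;;
        *)     return 1 ;;
    esac
    set -f                      # no filename expansion while word-splitting
    for item in $perf; do
        printf '%s\n' "${item%%;*}"
    done
    set +f
}

# Usage:
#   /usr/local/nagios/libexec/check_nrpe -H 192.168.0.13 -c check_disk | perfdata_values
```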
 
III. Install and configure the monitored hosts 192.168.0.61, 192.168.0.62, ...


1. Create the nagios user (apply the same configuration on every host; if you need to monitor other services, adjust as needed)


# useradd nagios


2. Install nagios-plugins


# tar zxvf nagios-plugins-1.4.13.tar.gz
# cd nagios-plugins-1.4.13
# ./configure --prefix=/usr/local/nagios/
# make 
# make install


# chown -R nagios.nagios /usr/local/nagios  


3. Install NRPE; use the same version as on the monitoring host


# tar zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config


# ll /usr/local/nagios/
total 16
drwxrwxr-x 2 nagios nagios 4096 Jul 21 11:30 bin
drwxrwxr-x 2 nagios nagios 4096 Jul 22 13:40 etc
drwxr-xr-x 2 nagios nagios 4096 Jul 21 11:20 libexec
drwxr-xr-x 3 root   root   4096 Jul 21 11:19 share


4. Edit the configuration file nrpe.cfg
This file can be copied over from the monitoring server, where it is already configured; I use exactly the same settings.


# scp 192.168.0.13:/usr/local/nagios/etc/nrpe.cfg /usr/local/nagios/etc/nrpe.cfg


#cat /usr/local/nagios/etc/nrpe.cfg | grep allowed_hosts
allowed_hosts=127.0.0.1,192.168.0.13      (the IP address of the monitoring server)


5. Start NRPE on the client


# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
# tail -f /var/log/messages
Jul 22 16:41:16 localhost nrpe[14911]: Starting up daemon
Jul 22 16:41:16 localhost nrpe[14911]: Listening for connections on port 5666 
Jul 22 16:41:16 localhost nrpe[14911]: Allowing connections from: 127.0.0.1,192.168.0.13
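Log messages like the above indicate a successful start. Now test it.


Test from the monitoring host. It must be run there, not on the freshly installed client; I made that mistake before and kept getting SSL errors.
# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.61
NRPE v2.12                     (the NRPE version is displayed)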


# /usr/local/nagios/libexec/check_nrpe -H 192.168.0.61 -c check_load
OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;


Success again!
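
An SSL handshake error from check_nrpe is often a sign that the caller's address is missing from allowed_hosts on the target. That would fit the failure seen when testing from the client itself, although this is my inference rather than something verified here. The sketch below checks an nrpe.cfg for a given address; the function name `host_is_allowed` is hypothetical, and it only matches exact comma-separated entries, not network ranges.

```shell
# Succeed if the given address appears in the allowed_hosts line of an
# nrpe.cfg file. Trailing "#" comments and spaces are ignored.
host_is_allowed() {
    cfg=$1 addr=$2
    sed -n -e 's/#.*//' -e 's/^allowed_hosts=//p' "$cfg" \
        | tr ',' '\n' | tr -d ' \t\r' | grep -Fxq "$addr"
}

# Usage on a monitored host:
#   host_is_allowed /usr/local/nagios/etc/nrpe.cfg 192.168.0.13 && echo allowed
```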


IV. On the monitoring host, configure hosts.cfg, hostgroups.cfg and services.cfg to monitor the whole server group


On 192.168.0.13:


1. Configure hosts.cfg


# cat hosts.cfg     (add the new machines)


define host {
host_name               nagios-server
alias                   nagios server
address                 192.168.0.13
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}


define host {
host_name               mysql-server-61
alias                   mysql server 61
address                 192.168.0.61
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}


define host {
host_name               mysql-server-62
alias                   mysql server 62
address                 192.168.0.62
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}
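
With more than a handful of machines, the host blocks can be generated from a list. A sketch with a hypothetical helper, `gen_hosts`, that reads "name address" pairs from stdin. For simplicity it reuses the host name as the alias, unlike the hand-written blocks above.

```shell
# Emit a host definition for each "name address" pair read from stdin.
# Blank lines are skipped.
gen_hosts() {
    while read -r name address; do
        [ -n "$name" ] || continue
        cat <<EOF
define host {
host_name               $name
alias                   $name
address                 $address
contact_groups          sagroup
check_command           check-host-alive
max_check_attempts      5
notification_interval   10
notification_period     24x7
notification_options    d,u,r
}

EOF
    done
}

# Usage:
#   printf '%s\n' 'mysql-server-61 192.168.0.61' 'mysql-server-62 192.168.0.62' | gen_hosts
```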


2. Configure hostgroups.cfg


# cat hostgroups.cfg 


define hostgroup {
hostgroup_name  sa-servers
alias           sa servers
members         nagios-server,mysql-server-61,mysql-server-62  ; as mentioned earlier, add the new hosts here
}


3. Configure services.cfg


[root@localhost etc]# cat services.cfg 


##### nagios-server:services.cfg ######


define service {
host_name               nagios-server
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive
}




define service {
host_name               nagios-server
service_description     check_tcp 80
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_tcp!80
}






define service {
host_name               nagios-server
service_description     check_local_disk
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
#check_command          check_local_disk!20%!10%!/
check_command           check_nrpe!check_disk
}






define service {
host_name               nagios-server
service_description     check_load
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_load
}


define service {
host_name               nagios-server
service_description     check_total_procs
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_total_procs
}


define service {
host_name               nagios-server
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}


#### mysql-server-61:services.cfg ######


define service {
host_name               mysql-server-61
service_description     check_total_procs
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_total_procs
}




define service {
host_name               mysql-server-61
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}


define service {
host_name               mysql-server-61
service_description     check_disk_/dev/sda3
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_disk
}


define service {
host_name               mysql-server-61
service_description     check_load
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_load
}


define service {
host_name               mysql-server-61
service_description     check_swap
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_swap
}


#### mysql-server-62:services.cfg #####


define service {
host_name               mysql-server-62
service_description     check-host-alive
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check-host-alive
}




define service {
host_name               mysql-server-62
service_description     check_users
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_users
}


define service {
host_name               mysql-server-62
service_description     check_disk_/dev/sda3
check_period            24x7
max_check_attempts      4
normal_check_interval   3
retry_check_interval    2
contact_groups          sagroup
notification_interval   10
notification_period     24x7
notification_options    w,u,c,r
check_command           check_nrpe!check_disk
}


4. The configuration is complete; verify it.


#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...


Total Warnings: 0
Total Errors:   0


If you see this output, the configuration file has no errors and Nagios can be restarted.


#service nagios restart     (restarted successfully)


5. Open the web monitoring interface


http://192.168.0.13/nagios/


All done!!
 


Source: ITPUB blog, http://blog.itpub.net/31448824/viewspace-2137651/. If you repost this article, please credit the source; otherwise legal action may be taken.

Reposted from: http://blog.itpub.net/31448824/viewspace-2137651/
