Nagios installation notes on Debian


weixin_33691700


Tags: operating systems, operations

Original article: http://blog.51cto.com/dlsxw/905638

Install Apache and related packages

apt-get install apache2

apt-get install libapache2-mod-php5

apt-get install build-essential

apt-get install libgd2-xpm-dev

Add the required users and groups

useradd -m -s /bin/nologin nagios

usermod -G nagios nagios

/usr/sbin/groupadd nagcmd

/usr/sbin/usermod -a -G nagcmd nagios

/usr/sbin/usermod -a -G nagcmd www-data
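
To confirm that the group assignments took effect, a check along these lines could be used. This is only a sketch: the helper name has_group is made up for illustration, and it simply inspects a list of group names such as the one printed by `id -nG`.

```shell
# has_group "<space-separated group list>" <group>
# Succeeds if <group> appears as a whole word in the list.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Intended use on the real system (not run here):
#   has_group "$(id -nG www-data)" nagcmd && echo "www-data is in nagcmd"
# Demonstration with a made-up group list:
if has_group "www-data nagcmd" nagcmd; then
    echo "member"
fi
```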

Download Nagios and the plugins

wget http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.2.0.tar.gz

wget http://osdn.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz
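
The notes do not show the unpacking step, so here is a small sketch of it. It assumes each tarball unpacks into a directory named after the file without the .tar.gz suffix (the later `cd nagios-plugins-1.4.11` suggests this); the helper name src_dir is hypothetical.

```shell
# src_dir <tarball-or-url>: print the directory the tarball is expected
# to unpack into (assumption: file name minus the .tar.gz suffix).
src_dir() {
    basename "$1" .tar.gz
}

src_dir "http://osdn.dl.sourceforge.net/sourceforge/nagios/nagios-3.2.0.tar.gz"
# prints: nagios-3.2.0

# Intended use (not run here):
#   tar xzf nagios-3.2.0.tar.gz && cd "$(src_dir nagios-3.2.0.tar.gz)"
```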

Compile and install (run inside the unpacked Nagios source directory)

./configure --with-command-group=nagcmd

make all

make install

make install-init

make install-config

make install-commandmode

The default configuration files work as installed

Edit the email address in the contacts file; alert emails are sent to it

vi /usr/local/nagios/etc/objects/contacts.cfg

Configure Apache to serve the Nagios web interface

make install-webconf

The configuration file is installed to /etc/httpd/conf.d/nagios.conf, so copy it into the Apache directory that Debian uses

cp /etc/httpd/conf.d/nagios.conf /etc/apache2/conf.d

Add a user for the web login

htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Reload Apache

/etc/init.d/apache2 reload

Compile and install the plugins

cd nagios-plugins-1.4.11

./configure --with-nagios-user=nagios --with-nagios-group=nagios

make

make install

Enable start at boot

update-rc.d nagios defaults 99 20

Verify the configuration

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, start Nagios

/etc/init.d/nagios start
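
The verify-then-start sequence can be wrapped so that Nagios is only started when the check passes. A minimal sketch, assuming `nagios -v` exits with a non-zero status when the configuration has errors; the helper name run_if_valid is hypothetical.

```shell
# run_if_valid "<verify command>" "<start command>"
# Runs the start command only when the verify command exits with status 0.
run_if_valid() {
    if sh -c "$1" >/dev/null 2>&1; then
        sh -c "$2"
    else
        echo "configuration check failed; not starting" >&2
        return 1
    fi
}

# Intended use on the monitoring host (not run here):
#   run_if_valid "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#                "/etc/init.d/nagios start"
# Demonstration with stand-in commands:
run_if_valid "true" "echo started"
```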

Test access in a browser

http://172.17.1.202/nagios/

If errors occur, check the Nagios log and the Apache log
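
A quick way to look at both logs is sketched below. The two paths are the usual defaults for a source install under /usr/local/nagios and for Debian's apache2 package; treat them as assumptions and adjust as needed. The helper name show_log is hypothetical.

```shell
# show_log <file> [lines]: print the last lines of a log, or complain if unreadable.
show_log() {
    if [ -r "$1" ]; then
        tail -n "${2:-20}" "$1"
    else
        echo "cannot read $1" >&2
        return 1
    fi
}

# Assumed default locations; a missing file is reported and skipped.
for f in /usr/local/nagios/var/nagios.log /var/log/apache2/error.log; do
    show_log "$f" 5 || true
done
```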

Install NRPE. It is used to monitor remote hosts and is not needed for monitoring the local machine

cd nrpe-2.12

./configure --enable-ssl=no --with-nagios-user=nagios --with-nagios-group=nagios

make all

make install-plugin

make install-daemon

make install-daemon-config

Start NRPE without SSL

/usr/local/nagios/bin/nrpe -n -c /usr/local/nagios/etc/nrpe.cfg -d

The NRPE daemon does not need to run on the monitoring server

Test check_nrpe

/usr/local/nagios/libexec/check_nrpe -n -H localhost

NRPE v2.12

================ Install mail sending (optional) ===========

apt-get install mailx

apt-get install postfix

vi /usr/local/nagios/etc/objects/commands.cfg

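:%s/bin\/mail/usr\/bin\/mail/g

This changes /bin/mail to /usr/bin/mail

In the end the method above was not used

Mail is sent through telnet instead

Script contents

The username and password must be converted to base64: http://maclife.net/tools/base64/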
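
The encoding can also be done locally, which avoids pasting credentials into a website. A small sketch, assuming the coreutils `base64` tool is installed; the helper name b64 is made up for illustration.

```shell
# b64 <string>: base64-encode the string (printf avoids adding a trailing newline).
b64() {
    printf '%s' "$1" | base64
}

b64 'test@163.com'
# prints: dGVzdEAxNjMuY29t
```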

#!/bin/bash
# Send a Nagios notification through an authenticated SMTP session over telnet.
# Arguments $1..$6 are supplied by the Nagios notification command.
IP="smtp.163.com"
PORT="25"
USER="test@163.com"
USER64="dGVzdEAxNjMuY29t"   # base64 of the username
PASD64="eDNsSo="            # base64 of the password
TOUSER="test11@qq.com"
TOUSER1="test22@qq.com"
Date=`date +%Y-%m-%d_%T`
# One SMTP session per recipient; the sleeps give the server time to answer.
for RCPT in "$TOUSER" "$TOUSER1"; do
(sleep 1;echo "helo hello";sleep 1;echo "auth login";sleep 1;echo "$USER64";sleep 1;echo "$PASD64";sleep 1;echo "mail from: <$USER>";sleep 1;echo "rcpt to: <$RCPT>";sleep 1;echo "data";sleep 1;echo -e "subject: $1\n";sleep 1;echo -e "nagios:\nType:$1\nHost:$2\nState:$3\nAddress:$4\nInfo:$5\nDate/Time:$6\n$Date";echo ".";echo "quit") | telnet $IP $PORT
done

================================================

Installation on the monitored host

useradd -m -s /bin/nologin nagios

cd nagios-plugins-1.4.11

./configure --with-nagios-user=nagios --with-nagios-group=nagios

make

make install

chown -R nagios:nagios /usr/local/nagios

cd nrpe-2.12

./configure --enable-ssl=no --with-nagios-user=nagios --with-nagios-group=nagios

make all

make install-plugin

make install-daemon

make install-daemon-config

Start NRPE without SSL

/usr/local/nagios/bin/nrpe -n -c /usr/local/nagios/etc/nrpe.cfg -d

Test check_nrpe

/usr/local/nagios/libexec/check_nrpe -H localhost

NRPE v2.12

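Test from the monitoring host

/usr/local/nagios/libexec/check_nrpe -n -H 172.17.1.201

The request is refused

First, on the monitored host, add the IP address of the host that is allowed to connect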

vi nrpe.cfg

allowed_hosts=127.0.0.1,172.17.1.202

/usr/local/nagios/libexec/check_nrpe -n -H 172.17.1.201

NRPE v2.12
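
When a request is refused, it helps to check whether the monitoring server's address is really listed. The sketch below does an exact string match against the allowed_hosts line; the helper name host_allowed is hypothetical and the temporary file stands in for the real nrpe.cfg. After editing the real file, the nrpe daemon has to reread its configuration (for example by restarting it) before the change takes effect.

```shell
# host_allowed <nrpe.cfg> <ip>: succeed if <ip> is listed in allowed_hosts.
host_allowed() {
    grep '^allowed_hosts=' "$1" | cut -d= -f2 | tr ',' '\n' | grep -qxF "$2"
}

# Demonstration on a temporary file that mimics nrpe.cfg.
cfg=$(mktemp)
echo 'allowed_hosts=127.0.0.1,172.17.1.202' > "$cfg"
host_allowed "$cfg" 172.17.1.202 && echo "172.17.1.202 may connect"
host_allowed "$cfg" 172.17.1.203 || echo "172.17.1.203 is not listed"
rm -f "$cfg"
```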

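Configure the monitoring server

A brief explanation of how it works

The monitoring server uses check_nrpe to send a command name to the NRPE daemon on the monitored server. The daemon then looks that name up in its nrpe.cfg and runs the matching command.

The monitored host therefore needs the relevant plugins installed and nrpe.cfg configured

Some plugins need no entry in the monitored host's nrpe.cfg, for example check_ping and check_http, because they run on the monitoring server and probe the host over the network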
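
The lookup on the monitored side can be pictured with a few lines of shell. This is only an illustration of the name-to-command mapping, not how the NRPE daemon is implemented; the helper name lookup_command is hypothetical and the temporary file stands in for nrpe.cfg.

```shell
# lookup_command <nrpe.cfg> <name>: print the command line defined as command[<name>]=...
lookup_command() {
    sed -n "s/^command\[$2\]=//p" "$1"
}

# Demonstration on a temporary file that mimics nrpe.cfg.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
command[check_df]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF
lookup_command "$cfg" check_load
rm -f "$cfg"
```

A name with no matching entry prints nothing, which corresponds to the daemon reporting that the command is not defined.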

Verify the configuration

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Configure monitoring

Monitored host

Download the script and put it under libexec

Test the command

check_mem.pl

./check_mem.pl -w 95,60 -c 120,80 -v

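Add the commands the monitored host should run, defining each command's name, the plugin path, and its arguments. To use variables ($ARG1$ and so on), NRPE must be compiled with --enable-command-args and dont_blame_nrpe=1 must be set in nrpe.cfg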

vi nrpe.cfg

command[check_df]=/usr/local/nagios/libexec/check_disk -w 20% -c 10%

command[check_mem]=/usr/local/nagios/libexec/check_mem.pl -w 90,30 -c 95,50 -v

Count all connections on a given port

command[check_netstat]=/usr/local/nagios/libexec/check_netstat.pl -p '>'$ARG1$ -w $ARG2$ -c $ARG3$

Count the ESTABLISHED connections on a given port

command[check_netstat_ESTABLISHED]=/usr/local/nagios/libexec/check_netstat.pl -p '>'$ARG1$ -w $ARG2$ -c $ARG3$ -e

The same two commands with fixed values instead of variables

command[check_netstat]=/usr/local/nagios/libexec/check_netstat.pl -p 80 -w 400 -c 800

command[check_netstat_ESTABLISHED]=/usr/local/nagios/libexec/check_netstat.pl -p 80 -w 200 -c 500 -e

Search the requested page for a given string and return OK if it is present; response time warning value 11, critical value 21

command[check_http]=/usr/local/nagios/libexec/check_http -H $ARG1$ -r $ARG2$ -w $ARG3$ -c $ARG4$

Check the number of processes owned by a user

command[check_user_procs]=/usr/local/nagios/libexec/check_procs -u $ARG1$

Check DNS status

command[check_bind]=/usr/local/nagios/libexec/check_bind.sh -p /var/run/bind/run/ -n named.172.pid -s /etc/bind

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10

command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
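
For the entries that use variables, the substitution can be sketched as follows. This only mimics the textual replacement of $ARG1$, $ARG2$ and so on; the helper name fill_args is hypothetical, and it assumes the values contain no `|`, `&` or backslash characters. On a real setup check_nrpe also has to pass the values with its -a option.

```shell
# fill_args "<template>" arg1 arg2 ...: replace $ARG1$, $ARG2$, ... in order.
fill_args() {
    line=$1
    shift
    n=1
    for a in "$@"; do
        # \$ARGn\$ matches the literal macro text, for example $ARG1$
        line=$(printf '%s\n' "$line" | sed "s|\\\$ARG$n\\\$|$a|g")
        n=$((n + 1))
    done
    printf '%s\n' "$line"
}

fill_args '/usr/local/nagios/libexec/check_procs -u $ARG1$' www-data
# prints: /usr/local/nagios/libexec/check_procs -u www-data
```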

Monitoring host

Add one host definition file and one service definition file

vi nagios.cfg

#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

cfg_file=/usr/local/nagios/etc/objects/hosts.cfg

cfg_file=/usr/local/nagios/etc/objects/services.cfg

command_check_interval=10s

Add the check_nrpe command definition to the commands file

vi commands.cfg

define command{

command_name check_nrpe

command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

}

define command{

command_name check_bind

command_line /usr/local/nagios/libexec/check_bind.sh -p /var/run/bind/run/ -n named.172.pid -s /etc/bind

}

define command{

command_name check_web

command_line $USER1$/check_http -H $HOSTADDRESS$ -r $ARG1$ -w $ARG2$ -c $ARG3$

}

check_http -H www.dledu.com -r 2008 -w 1 -c 2
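
In a service definition, the check_command value is split on `!`: the first field is the command name and the remaining fields become $ARG1$, $ARG2$ and so on. The sketch below shows that split for a value such as check_web!2008!1!2; the helper name split_check is hypothetical and this is an illustration, not Nagios code.

```shell
# split_check "<check_command value>": print the command name and each argument.
split_check() {
    old_ifs=$IFS
    IFS='!'
    set -- $1          # split the value on "!" into positional parameters
    IFS=$old_ifs
    echo "command_name: $1"
    shift
    n=1
    for a in "$@"; do
        printf '$ARG%s$: %s\n' "$n" "$a"
        n=$((n + 1))
    done
}

split_check 'check_web!2008!1!2'
```

With the check_web definition from these notes, those three arguments end up as the -r, -w and -c values.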

Configure mail sending ===================================================

Custom mail-sending scripts

cp mail_host.sh mail_ser.sh /bin/

chown nagios /bin/mail_*

chmod u+x /bin/mail_*

define command{

command_name notify-host-by-email

command_line /bin/mail_host.sh "$NOTIFICATIONTYPE$" "$HOSTNAME$" "$HOSTSTATE$" "$HOSTADDRESS$" "$HOSTOUTPUT$"

}

define command{

command_name notify-service-by-email

command_line /bin/mail_ser.sh "$NOTIFICATIONTYPE$" "$SERVICEDESC$" "$HOSTALIAS$" "$HOSTADDRESS$" "$SERVICESTATE$" "$LONGDATETIME$" "$SERVICEOUTPUT$"

}

Edit the contacts file and add your own email address

vi contacts.cfg

define contact{

contact_name nagiosadmin

# use generic-contact

alias Nagios Admin

host_notifications_enabled 1

service_notifications_enabled 1

service_notification_period 24x7

host_notification_period 24x7

service_notification_options w,u,c,r

host_notification_options d,u,r

service_notification_commands notify-service-by-email

host_notification_commands notify-host-by-email

email dlsxw@qq.com

}

==========================================================================

========================= Repeat this for every host added ===========================

Define the host to monitor and its services

vi hosts.cfg

define host{

host_name test-server

alias test server

address 172.17.1.201

check_command check-host-alive

max_check_attempts 5

notification_interval 10

notification_period 24x7

notification_options d,u,r

contacts nagiosadmin

}

=============== Services to check on each host =========================

The main services monitored are process, load, disk, and http

vi services.cfg

define service{

host_name test-server

service_description memory

check_command check_nrpe!check_mem!110,50!150,80

check_period 24x7

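max_check_attempts 4 ; after 4 consecutive failed checks the problem is treated as real

normal_check_interval 3 ; check every 3 minutes under normal conditions

retry_check_interval 2 ; after a failed check, re-check every 2 minutes

notification_interval 10 ; if still not recovered after 10 minutes, send the alert again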

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}
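
As a rough guide to how these settings interact: the first failed check counts as attempt 1 and the remaining attempts are retries, so the delay before the first alert is about (max_check_attempts - 1) x retry_check_interval. The arithmetic below assumes the default interval_length of 60 seconds (intervals in minutes) and ignores scheduling jitter.

```shell
# Values from the memory service definition.
max_check_attempts=4
retry_check_interval=2    # minutes, assuming interval_length=60

# Attempt 1 is the first failure; the other attempts are spaced by the retry interval.
minutes_to_hard=$(( (max_check_attempts - 1) * retry_check_interval ))
echo "about $minutes_to_hard minutes from first failure to notification"
# prints: about 6 minutes from first failure to notification
```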

define service{

host_name test-server

service_description disk

check_command check_nrpe!check_df

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description load

check_command check_nrpe!check_load

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description netstat

check_command check_nrpe!check_netstat

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description netstat_ESTABLISHED

check_command check_nrpe!check_netstat_ESTABLISHED

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description www.dledu.com

check_command check_http

check_period 24x7

max_check_attempts 4

normal_check_interval 1

retry_check_interval 1

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description procs_apache

check_command check_nrpe!check_apache_procs

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description bind_status

check_command check_bind

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description procs_total

check_command check_nrpe!check_total_procs

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}

define service{

host_name test-server

service_description procs_zombie

check_command check_nrpe!check_zombie_procs

check_period 24x7

max_check_attempts 4

normal_check_interval 3

retry_check_interval 2

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contacts nagiosadmin

}
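
The service blocks differ only in the description and the check command, so they can be generated instead of copied by hand. A minimal sketch with the shared settings taken from the blocks in these notes; the helper name service_block is hypothetical. Nagios object templates (the `use` directive) are another way to avoid the repetition.

```shell
# service_block <host> <description> <check_command>
# Prints one service definition with the shared settings used in these notes.
service_block() {
    cat <<EOF
define service{
host_name $1
service_description $2
check_command $3
check_period 24x7
max_check_attempts 4
normal_check_interval 3
retry_check_interval 2
notification_interval 10
notification_period 24x7
notification_options w,u,c,r
contacts nagiosadmin
}
EOF
}

service_block test-server load 'check_nrpe!check_load'
```

Redirecting the output of several calls into services.cfg gives one block per service, for example `service_block test-server disk 'check_nrpe!check_df' >> services.cfg`.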

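Windows monitored host

First install NSClient++-0.3.7-Win32.msi. During installation, select the first three modules and specify the address of the Nagios server that is allowed to monitor this host

Nagios server:

Define the items to monitor

Edit /usr/local/nagios/etc/nagios.cfg and add: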

cfg_file=/usr/local/nagios/etc/objects/windows.cfg

For check_nt usage, view the help with:

# /usr/local/nagios/libexec/check_nt -h

Some commonly used parameters:

1) Monitor the Windows server's uptime

check_command check_nt!UPTIME

2) Monitor the Windows server's CPU load: warning if the 5-minute average exceeds 80%, critical if it exceeds 90%

check_command check_nt!CPULOAD!-l 5,80,90

3) Monitor the Windows server's memory usage: warning above 80%, critical above 90%

check_command check_nt!MEMUSE!-w 80 -c 90

4) Monitor usage of the C:\ drive on the Windows server: warning when more than 80% is used, critical above 90%

check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90

Note: the argument after -l specifies the drive letter

5) Monitor usage of the D:\ drive on the Windows server: warning when more than 80% is used, critical above 90%

check_command check_nt!USEDDISKSPACE!-l d -w 80 -c 90

6) Monitor the state of the W3SVC service on the Windows server: critical if the service is stopped

check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC

7) Monitor the state of the Explorer.exe process on the Windows server: critical if the process is not running

check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe

Reposted from: https://blog.51cto.com/dlsxw/905638
