Monitoring Linux and Windows Servers with Nagios (Part 5)

Latest recommended article published on 2024-09-16 13:36:19

weixin_34166847


Tags: operating systems, operations, networking

Original link: http://blog.51cto.com/andyxu/624579


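VIII. Monitoring Windows Servers with Nagios

1. Configuring the Windows machine

1). Choosing a monitoring method

Nagios can monitor servers in many ways, but they fall broadly into three categories:

a. Write scripts over the SNMP protocol, using client programs such as snmpwalk or snmpget to fetch data from the remote host

b. Client/server: a dedicated client fetches data from the server using its own protocol. This requires installing a server-side agent (a listener) on the target server; the agent collects data on that server through its own means (WMI, VBScript), and the client on the Nagios server then retrieves it. Typical examples are NSClient++, pNSClient, and nrpe_nt

c. Also client/server, but this time the Nagios host acts as the server: a client installed on each monitored server pushes that machine's data to the Nagios server. A typical example is NSCA

Since I need performance data for graphing with pnp, my programming skills are very limited, and I am a rather lazy SA, I chose the second approach above to monitor all my Windows servers, using NSClient++.

NSClient++ is a simple yet powerful and secure monitoring agent for Windows that supports the NSClient, NRPE, and NSCA modes. It can monitor CPU, memory, disk, processes, service state, performance counters, and more. NSClient++ also provides its own CheckCommands.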

2). Download NSClient++-Win32-0.3.5.msi and install it.

To run NSClient++, open a cmd window and enter:

NSClient++ /install

NSClient++ SysTray install

NSClient++ /start

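If a firewall is running, open the relevant ports.

3). Edit the configuration file

a. Open the NSC.ini file in the installation directory and edit it:

In the [modules] section, remove the comment marker (;) in front of every DLL file name except RemoteConfiguration.dll.

In the [Settings] section you can set a connection password with password=PWD; for simplicity, no password is set here. Set allowed_hosts=127.0.0.1/32,192.168.0.19 to list the monitoring servers that may connect. Writing 192.168.0.0/24 allows every machine in that subnet, and leaving the value blank allows any host to connect. (Note that the [NSClient] section has an allowed_hosts setting of its own; take care not to confuse the two.) Finally, remember to remove the leading comment marker (;).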
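To make the allowed_hosts rules above concrete, here is a minimal Python sketch of the matching logic as the text describes it: a comma-separated list of single addresses or CIDR ranges, where an empty list admits everyone. The function name is_allowed is hypothetical, and the sketch handles IP entries only; it is an illustration of the rule, not NSClient++'s actual implementation.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_hosts: str) -> bool:
    """Return True if client_ip matches any entry of an allowed_hosts value."""
    entries = [e.strip() for e in allowed_hosts.split(",") if e.strip()]
    if not entries:
        # Per the text above, an empty list lets every host connect.
        return True
    addr = ipaddress.ip_address(client_ip)
    for entry in entries:
        # strict=False lets a bare address such as "192.168.0.19" parse as a /32.
        if addr in ipaddress.ip_network(entry, strict=False):
            return True
    return False


ALLOWED = "127.0.0.1/32,192.168.0.19"
print(is_allowed("192.168.0.19", ALLOWED))           # the monitoring server
print(is_allowed("192.168.0.20", ALLOWED))           # a neighbour not on the list
print(is_allowed("192.168.0.20", "192.168.0.0/24"))  # whole subnet allowed
```

Running a candidate value through a check like this before restarting the service is a cheap way to catch a typo in the list.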

b. Configuration of the other sections (simply uncomment the relevant lines in the sections below):

[log]

file=nsclient.log

date_mask=%Y-%m-%d %H:%M:%S

root_folder=exe

[NSClient]

# IP addresses of hosts allowed to connect; separate multiple hosts with commas

allowed_hosts=127.0.0.1/32

# Listening port

port=12489

socket_timeout=30

[NRPE]

# Listening port

port=5666

command_timeout=60

# Do not use SSL; enabling it tends to cause errors

use_ssl=0

# IP addresses of hosts allowed to connect; separate multiple hosts with commas

allowed_hosts=127.0.0.1/32

socket_timeout=30

# Enable performance_data (essential: the graphs are drawn from it)

performance_data=1

[NRPE Handlers]

# Define the NRPE commands

# Check memory

check_mem=inject checkMem MaxWarn=80% MaxCrit=90% ShowAll=long type=physical
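As a quick sanity check of the settings listed above, the following Python sketch parses an NSC.ini fragment with the standard configparser module and summarizes the values that matter most for this setup. The names SAMPLE_NSC_INI, load_nsc_ini, and summarize are hypothetical, and the sample text is a trimmed copy of the listing above rather than a complete NSC.ini.

```python
import configparser

# Trimmed copy of the settings shown above (not a complete NSC.ini).
SAMPLE_NSC_INI = """\
[log]
date_mask=%Y-%m-%d %H:%M:%S

[NSClient]
allowed_hosts=127.0.0.1/32
port=12489
socket_timeout=30

[NRPE]
port=5666
command_timeout=60
use_ssl=0
allowed_hosts=127.0.0.1/32
socket_timeout=30
performance_data=1
"""


def load_nsc_ini(text: str) -> configparser.ConfigParser:
    """Parse NSC.ini-style text into a ConfigParser."""
    parser = configparser.ConfigParser(
        interpolation=None,    # values such as %Y-%m-%d must stay literal
        allow_no_value=True,   # [modules] lists bare DLL names without '='
        strict=False,          # tolerate repeated sections or keys
    )
    parser.read_string(text)
    return parser


def summarize(parser: configparser.ConfigParser) -> dict:
    """Collect the listener ports and the NRPE flags discussed in the text."""
    return {
        "nsclient_port": parser.getint("NSClient", "port"),
        "nrpe_port": parser.getint("NRPE", "port"),
        "nrpe_ssl": parser.getboolean("NRPE", "use_ssl"),
        "perfdata": parser.getboolean("NRPE", "performance_data"),
    }


print(summarize(load_nsc_ini(SAMPLE_NSC_INI)))
```

The two port values reported here are the ones the check_nt and check_nrpe commands on the Nagios side have to use.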

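After editing, save and close the file, open TCP ports 12489 and 5666 in the Windows firewall, then find the newly installed NSClientpp service in the Windows Services console and start it.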
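Once the service is running, it can help to confirm from the Nagios server that both ports are actually reachable through the firewall. The sketch below is one way to do that in Python using only the standard library; the function names are hypothetical, and a successful TCP connection only shows that something is listening, not that allowed_hosts will accept the request.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_agent_ports(host: str, ports=(12489, 5666)) -> dict:
    """Check the NSClient (12489) and NRPE (5666) ports named in the text."""
    return {port: port_is_open(host, port) for port in ports}
```

For the Windows host used later in this article, the call would be check_agent_ports("192.168.0.80"); a False entry usually points to the firewall rule or to the service not having started.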

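2. Configuring Nagios on the Linux monitoring server

1). Create the monitoring configuration file print-w-80.cfg and use the check_nt command to monitor Windows system information (this command is defined by default).

Sample Windows monitoring configuration file: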

[root@tech etc]# vi /usr/local/nagios/etc/servers/print-w-80.cfg

###################################################################

# WINDOWS.CFG – SAMPLE CONFIG FILE FOR MONITORING A WINDOWS MACHINE

# Last Modified: 06-13-2007

# NOTES: This config file assumes that you are using the sample configuration

# files that get installed with the Nagios quickstart guide.

####################################################################

# HOST DEFINITIONS

####################################################################

# Define a host for the Windows machine we’ll be monitoring

# Change the host_name, alias, and address to fit your situation

# The Windows host name and IP address are defined here

define host{

use windows-server ; Inherit default values from a template

host_name print80 ; The name we’re giving to this host

alias Print80 ; A longer name associated with the host

address 192.168.0.80 ; IP address of the host

}

# address is followed by the IP address of the monitored Windows machine (one address per host definition)

####################################################################

# HOST GROUP DEFINITIONS

# Host groups are configured separately in /usr/local/nagios/etc/servers/hostgroup.cfg

####################################################################

# Define a hostgroup for Windows machines

# All hosts that use the windows-server template will automatically be a member of this group

#define hostgroup{

# hostgroup_name windows-servers ; The name of the hostgroup

# alias Windows Servers ; Long name of the group

# }

#####################################################################

# SERVICE DEFINITIONS

#####################################################################

# Create a service for monitoring the version of NSCLient++ that is installed

# Change the host_name to match the name of the host you defined above

# Monitor NSClient++ with the check_nt command, which is defined by default

define service{

use generic-service

host_name print80

service_description NSClient++ Version

check_command check_nt!CLIENTVERSION

}

# Create a service for monitoring the uptime of the server

# Change the host_name to match the name of the host you defined above

# Monitor system uptime

define service{

use generic-service

host_name print80

service_description Uptime

check_command check_nt!UPTIME

}

# Create a service for monitoring CPU load

# Change the host_name to match the name of the host you defined above

# Monitor CPU usage over a 5-minute average: warning at 80%, critical at 90%

define service{

use generic-service

host_name print80

service_description CPU Load

check_command check_nt!CPULOAD!-l 5,80,90

}

# Create a service for monitoring memory usage

# Change the host_name to match the name of the host you defined above

# Monitor memory usage; note that this is the total of virtual and physical memory. Warning at 80%, critical at 90%

define service{

use generic-service

host_name print80

service_description Memory Usage

check_command check_nt!MEMUSE!-w 80 -c 90

}

# Create a service for monitoring C:\ disk usage

# Change the host_name to match the name of the host you defined above

# Monitor C: drive usage: warning at 80%, critical at 90%

define service{

use generic-service

host_name print80

service_description C_Drive_Space

check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90

}

# Create a service for monitoring the W3SVC service

# Change the host_name to match the name of the host you defined above

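# Monitor the state of the W3SVC service (the IIS World Wide Web Publishing Service), which indicates whether the web service is healthy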

define service{

use generic-service

host_name print80

service_description W3SVC

check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC

}

# Create a service for monitoring the Explorer.exe process

# Change the host_name to match the name of the host you defined above

# Monitor the Explorer.exe process

define service{

use generic-service

host_name print80

service_description Explorer

check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe

}
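The CPULOAD check above passes "-l 5,80,90", which check_nt reads as one or more triplets of averaging interval in minutes, warning threshold, and critical threshold. The Python sketch below only illustrates how such a value breaks down; parse_cpuload_triplets is a hypothetical helper, not part of Nagios or its plugins.

```python
def parse_cpuload_triplets(value: str) -> list:
    """Split a CPULOAD -l value into (minutes, warn_percent, crit_percent) tuples."""
    numbers = [int(part) for part in value.split(",")]
    if not numbers or len(numbers) % 3 != 0:
        raise ValueError("expected groups of three: minutes,warning,critical")
    # Walk the flat list three items at a time.
    return [tuple(numbers[i:i + 3]) for i in range(0, len(numbers), 3)]


# The value used in the service definition above.
print(parse_cpuload_triplets("5,80,90"))
# Several averaging windows can be chained in one value.
print(parse_cpuload_triplets("1,90,95,5,80,90"))
```

Seen this way, the definition above reads as: average CPU load over 5 minutes, warn at 80%, go critical at 90%.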

2). Host group configuration file

[root@tech etc]# vi /usr/local/nagios/etc/servers/hostgroup.cfg

# Linux host group; add the Linux hosts to be monitored here

define hostgroup{

hostgroup_name linux-servers ; The name of the hostgroup

alias Linux Servers ; Long name of the group

members localhost,wiki ; Comma separated list of hosts that belong to this group

}

# Windows host group; add the Windows hosts to be monitored here

define hostgroup{

hostgroup_name windows-servers ; The name of the hostgroup

alias Windows Servers ; Long name of the group

members print80 ; Comma separated list of hosts that belong to this group

}

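# Remove the comment markers from the Windows host configuration section above

3). Monitoring with NSClient and NRPE

Edit commands.cfg and add commands that collect data through NSClient and NRPE. Because the memory size reported through NSClient is always larger than the actual physical memory (it is probably a combined total), memory is monitored through NRPE instead.

# 'check_remote_nt_disk' command definition: monitors disk usage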

define command{

command_name check_remote_nt_disk

command_line $USER1$/check_nt -H $ARG1$ -p $ARG2$ -v $ARG3$ -l $ARG4$ -w $ARG5$ -c $ARG6$

}

# 'check_remote_nt_cpu' command definition: monitors CPU load

define command{

command_name check_remote_nt_cpu

command_line $USER1$/check_nt -H $ARG1$ -p $ARG2$ -v $ARG3$ -l $ARG4$

}

# 'check_nt_mem_nrpe' command definition: monitors memory usage

define command{

command_name check_nt_mem_nrpe

command_line $USER1$/check_nrpe -H $ARG1$ -n -p $ARG2$ -c $ARG3$

}

# 'check_avg_disk_queue' command definition: monitors the disk read/write queue

define command{

command_name check_avg_disk_queue

command_line $USER1$/check_nt -H $ARG1$ -p $ARG2$ -v $ARG3$ -l $ARG4$ -d $ARG5$ -w $ARG6$ -c $ARG7$

}

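Modify the check_command entries in the service definitions in print-w-80.cfg. The examples below come from other hosts (web1, web4); the port argument must match the port on which the corresponding listener ([NSClient] for check_nt, [NRPE] for check_nrpe) runs on each monitored host.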

define service{

use web-service,service-pnp

host_name web1

service_description disk-d

check_command check_remote_nt_disk!10.10.10.11!5666!USEDDISKSPACE!d!85!90

}

define service{

use web-service,service-pnp

host_name web1

service_description mem

check_command check_nt_mem_nrpe!10.10.10.11!5667!check_mem

}

define service{

use web-service,service-pnp

host_name web4

service_description avg-disk-queue

check_command check_avg_disk_queue!10.10.10.24!5666!COUNTER!"\\PhysicalDisk(_Total)\\Avg. Disk Queue Length","%.2f"!SHOWALL!14!28

}
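To see how a check_command value in the service definitions maps onto a command_line from commands.cfg, the following Python sketch imitates the substitution: the text before the first "!" selects the command, and the remaining "!"-separated fields fill $ARG1$, $ARG2$, and so on. The helper expand_command is hypothetical, and the $USER1$ path /usr/local/nagios/libexec is an assumption about a typical source install rather than something stated in this article.

```python
import re

# command_line values copied from the commands.cfg definitions above.
COMMANDS = {
    "check_remote_nt_disk":
        "$USER1$/check_nt -H $ARG1$ -p $ARG2$ -v $ARG3$ -l $ARG4$ -w $ARG5$ -c $ARG6$",
    "check_nt_mem_nrpe":
        "$USER1$/check_nrpe -H $ARG1$ -n -p $ARG2$ -c $ARG3$",
}


def expand_command(check_command: str, commands: dict,
                   user1: str = "/usr/local/nagios/libexec") -> str:
    """Expand a check_command string into the command line it stands for."""
    name, *args = check_command.split("!")

    def substitute(match):
        index = int(match.group(1))
        # Arguments that were not supplied expand to an empty string.
        return args[index - 1] if 1 <= index <= len(args) else ""

    line = re.sub(r"\$ARG(\d+)\$", substitute, commands[name])
    # user1 is an assumed plugin directory; adjust it to the local install.
    return line.replace("$USER1$", user1)


print(expand_command(
    "check_remote_nt_disk!10.10.10.11!5666!USEDDISKSPACE!d!85!90", COMMANDS))
print(expand_command("check_nt_mem_nrpe!10.10.10.11!5667!check_mem", COMMANDS))
```

Running the expanded line by hand on the Nagios server is a convenient way to debug a check before reloading the configuration.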

After making the changes, reload Nagios so that the configuration takes effect

# /etc/init.d/nagios reload

Reposted from: https://blog.51cto.com/andyxu/624579
