Contents
- 1. Preface
- 2. Relationships among the Nagios server configuration files
- 3. Edit /usr/local/nagios/etc/objects/localhost.cfg
- 4. Edit /usr/local/nagios/etc/objects/hosts.cfg
- 5. Edit /usr/local/nagios/etc/objects/services.cfg
- 6. Edit /usr/local/nagios/etc/objects/commands.cfg
- 7. Edit /usr/local/nagios/etc/objects/contacts.cfg
- 8. Configure /usr/local/nagios/etc/nagios.cfg
- 9. Verify the configuration files
- 10. Restart nagios-server
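1. Preface
Nagios is mainly used to monitor one or more local and remote hosts, covering both each host's own resources and the services it exposes.
The default Nagios configuration monitors nothing; it ships only template files. For Nagios to be useful, you must edit the configuration files and add the hosts and services you want to monitor, as described in detail below.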
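2. Relationships among the Nagios server configuration files
Configuring the Nagios server involves several kinds of definitions:
hosts and host groups, services and service groups, contacts and contact groups, monitoring time periods, monitoring commands, and so on.
As these definitions suggest, the Nagios server configuration files are interrelated and reference one another.
To build a working monitoring system with Nagios, you need to understand which configuration file depends on which. Four points matter most:
- First: define which hosts, host groups, services, and service groups to monitor;
- Second: define which command performs each check;
- Third: define the time periods during which monitoring runs;
- Fourth: define the contacts and contact groups to notify when a host or service has a problem.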
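To make the four points above concrete, here is a minimal sketch that lists the directives in a service definition which point at objects defined in other files. The sample service is hypothetical (its `check_period` and `contact_groups` values are assumptions chosen for illustration), and `list_refs` is a helper name invented for this example.

```shell
#!/bin/sh
# Sketch: list the directives in a service definition that point at objects
# defined elsewhere. The sample service below is hypothetical.
sample=$(mktemp)
cat > "$sample" <<'EOF'
define service{
    use                 local-service         ; template (templates.cfg)
    host_name           local-192.168.2.122   ; host (hosts.cfg)
    service_description Current Load
    check_command       check_nrpe!check_load ; command (commands.cfg)
    check_period        24x7                  ; time period (timeperiods.cfg)
    contact_groups      admins                ; contact group (contacts.cfg)
}
EOF

# Print "directive -> value" for each directive that references another object.
list_refs() {
    awk '$1 ~ /^(use|host_name|hostgroup_name|check_command|check_period|contacts|contact_groups)$/ {
        print $1 " -> " $2
    }' "$1"
}

refs=$(list_refs "$sample")
rm -f "$sample"
printf '%s\n' "$refs"
```

Each printed line names an object that has to exist in some other file, which is why a missing or misspelled definition in one file surfaces as an error in another.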
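To make this clearer and easier to maintain, it is recommended to keep each kind of object definition in its own configuration file under /usr/local/nagios/etc/objects/ on the Nagios server: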
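File | Present by default or created manually | Description |
---|---|---|
hosts.cfg | Must be created manually | Defines hosts and host groups |
services.cfg | Must be created manually | Defines services and service groups |
commands.cfg | Present by default | Defines monitoring commands |
timeperiods.cfg | Present by default | Defines monitoring time periods |
contacts.cfg | Present by default | Defines contacts and contact groups |
templates.cfg | Present by default | Reference file of templates used by other definitions |
localhost.cfg | Present by default | Defines the checks for the monitoring server itself |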
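Note: once the nagios-server main program and the Nagios plugins are installed, the server can already monitor itself, with no need to install nrpe as you would on other remote hosts. This works because every check for the local host comes from /usr/local/nagios/etc/objects/localhost.cfg. If you do not want to monitor the local host, comment out this line in /usr/local/nagios/etc/nagios.cfg: cfg_file=/usr/local/nagios/etc/objects/localhost.cfg. You can of course also install nrpe on this host and monitor it in the same way as the other remote hosts.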
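Commenting out that line can be scripted. The sketch below works on a temporary stand-in for nagios.cfg so it is safe to run anywhere; `disable_cfg_file` is a hypothetical helper name, and the dots in the path are left unescaped in the pattern for brevity.

```shell
#!/bin/sh
# Sketch: comment out the localhost.cfg include. A temporary file stands in
# for /usr/local/nagios/etc/nagios.cfg.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/localhost.cfg
EOF

# Prefix the matching cfg_file line with '#'; all other lines are untouched.
disable_cfg_file() {
    # $1 = main config file, $2 = object file path to disable
    sed -i.bak "s|^cfg_file=$2\$|#&|" "$1"
}

disable_cfg_file "$cfg" /usr/local/nagios/etc/objects/localhost.cfg
result=$(cat "$cfg")
rm -f "$cfg" "$cfg.bak"
printf '%s\n' "$result"
```

On a real server the first argument would be /usr/local/nagios/etc/nagios.cfg, and the `.bak` copy that sed leaves behind makes the change easy to undo.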
[root@nagios-server ~]# cd /usr/local/nagios/etc/
[root@nagios-server etc]# ll
总用量 72
-rw-rw-r--. 1 nagios nagios 13374 5月 8 15:16 cgi.cfg
-rw-r--r--. 1 root root 50 5月 8 15:13 htpasswd
-rw-rw-r--. 1 nagios nagios 44833 5月 8 14:41 nagios.cfg
drwxrwxr-x. 2 nagios nagios 4096 5月 8 14:41 objects
-rw-rw----. 1 nagios nagios 1312 5月 8 14:41 resource.cfg
[root@nagios-server ~]# cd /usr/local/nagios/etc/objects
[root@nagios-server objects]# ll
总用量 48
-rw-rw-r--. 1 nagios nagios 7696 5月 8 14:41 commands.cfg
-rw-rw-r--. 1 nagios nagios 2138 5月 8 14:41 contacts.cfg
-rw-r--r--. 1 root root 0 5月 8 15:42 hosts.cfg # newly created
-rw-rw-r--. 1 nagios nagios 5379 5月 8 14:41 localhost.cfg
-rw-rw-r--. 1 nagios nagios 3069 5月 8 14:41 printer.cfg
-rw-r--r--. 1 root root 0 5月 8 15:42 services.cfg # newly created
-rw-rw-r--. 1 nagios nagios 3252 5月 8 14:41 switch.cfg
-rw-rw-r--. 1 nagios nagios 10595 5月 8 14:41 templates.cfg
-rw-rw-r--. 1 nagios nagios 3178 5月 8 14:41 timeperiods.cfg
-rw-rw-r--. 1 nagios nagios 3991 5月 8 14:41 windows.cfg
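The two new files in the listing above start out empty. One way to create them without clobbering files that already exist is sketched here; a temporary directory stands in for /usr/local/nagios/etc/objects, and the pre-existing commands.cfg is only simulated.

```shell
#!/bin/sh
# Sketch: create hosts.cfg and services.cfg only if they are missing.
# A temporary directory stands in for /usr/local/nagios/etc/objects.
objects_dir=$(mktemp -d)
: > "$objects_dir/commands.cfg"   # pretend one default file already exists

for f in hosts.cfg services.cfg; do
    if [ ! -e "$objects_dir/$f" ]; then
        : > "$objects_dir/$f"     # create an empty file
        echo "created $f"
    fi
done

listing=$(ls "$objects_dir")
rm -rf "$objects_dir"
printf '%s\n' "$listing"
```

Files created this way as root are owned by root, as in the listing; the chown in step 10 hands them over to the nagios user.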
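3. Edit /usr/local/nagios/etc/objects/localhost.cfg
The check_http service uses the check_http command from /usr/local/nagios/etc/objects/commands.cfg, which monitors HTTP on port 80 by default. Here I want to monitor port 8081 on the local host instead, so the service points at a custom check_local_http command: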
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
use local-service ; Name of service template to use
host_name localhost
service_description HTTP
check_command check_local_http ; must match command_name in the command definition
notifications_enabled 0
}
# 'check_local_http' command definition
define command{
command_name check_local_http ; must match check_command in the service definition
command_line $USER1$/check_http -I 127.0.0.1 -p 8081 ; local IP address and HTTP port
}
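Before relying on the new command, it can help to run the same plugin invocation by hand. This sketch builds the command line from check_local_http and runs it only if the plugin is present. The `USER1` shell variable mirrors the `$USER1$` macro; its default path here is an assumption based on a source install under /usr/local/nagios.

```shell
#!/bin/sh
# Sketch: run by hand the command line Nagios would use for check_local_http.
# USER1 mirrors the $USER1$ macro; the default path below is an assumption.
USER1=${USER1:-/usr/local/nagios/libexec}
cmd="$USER1/check_http -I 127.0.0.1 -p 8081"
echo "$cmd"

if [ -x "$USER1/check_http" ]; then
    status=0
    $cmd || status=$?   # unquoted on purpose so the options are split
    # Plugin exit codes: 0=OK 1=WARNING 2=CRITICAL 3=UNKNOWN
    echo "plugin exit status: $status"
else
    echo "check_http not found under $USER1; skipping the live check"
fi
```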
4. Edit /usr/local/nagios/etc/objects/hosts.cfg
[root@nagios-server objects]# cat /usr/local/nagios/etc/objects/hosts.cfg
define host{
use linux-server
host_name local-192.168.2.122
alias local-192.168.2.122
address 192.168.2.122
}
define host{
use linux-server
host_name local-192.168.2.124
alias local-192.168.2.124
address 192.168.2.124
}
define host{
use linux-server
host_name local-192.168.2.125
alias local-192.168.2.125
address 192.168.2.125
}
define host{
use linux-server
host_name local-192.168.2.167
alias local-192.168.2.167
address 192.168.2.167
}
define hostgroup{
hostgroup_name linux-servers
alias Linux Servers
members local-192.168.2.122,local-192.168.2.124,local-192.168.2.125,local-192.168.2.167
}
define hostgroup{
hostgroup_name Mysql-servers
alias Mysql Servers
members local-192.168.2.124,local-192.168.2.167
}
define hostgroup{
hostgroup_name Tomcat-servers
alias Tomcat Servers
members local-192.168.2.122
}
define hostgroup{
hostgroup_name Tcp-servers
alias Tcp Servers
members local-192.168.2.122,local-192.168.2.124,local-192.168.2.125
}
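The host definitions above differ only in the IP address, so they can be generated instead of typed. A minimal sketch follows; `gen_host` is a hypothetical helper, and the `local-` prefix simply follows the naming used in this file.

```shell
#!/bin/sh
# Sketch: generate one "define host" block per IP address.
gen_host() {
    # $1 = IP address; host_name and alias follow the "local-<ip>" convention
    cat <<EOF
define host{
    use        linux-server
    host_name  local-$1
    alias      local-$1
    address    $1
}
EOF
}

hosts_cfg=$(
    for ip in 192.168.2.122 192.168.2.124 192.168.2.125 192.168.2.167; do
        gen_host "$ip"
    done
)
printf '%s\n' "$hosts_cfg"
```

Redirecting the output into hosts.cfg would reproduce the four host blocks; the hostgroup definitions still have to be written by hand, because group membership is a decision rather than a pattern.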
5. Edit /usr/local/nagios/etc/objects/services.cfg
[root@nagios-server objects]# cat /usr/local/nagios/etc/objects/services.cfg
#define service{
# use local-service ; Name of service template to use
# hostgroup_name linux-servers
# service_description SSH
# check_command check_ssh ; does not depend on check_nrpe, i.e. no NRPE agent is needed on the client
# notifications_enabled 0
# }
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description PING
check_command check_ping!100.0,20%!500.0,60% ; does not depend on check_nrpe, i.e. no NRPE agent is needed on the client
}
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description Root Partition
check_command check_nrpe!check_root_disk ; depends on check_nrpe; the NRPE agent on the client checks the root partition
}
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description Current Users
check_command check_nrpe!check_users ; depends on check_nrpe; the NRPE agent on the client checks logged-in users
}
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description Total Processes
check_command check_nrpe!check_total_procs ; depends on check_nrpe; the NRPE agent on the client checks the host's processes
}
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description Current Load
check_command check_nrpe!check_load ; depends on check_nrpe; the NRPE agent on the client checks the host's load
}
define service{
use local-service ; Name of service template to use
hostgroup_name linux-servers
service_description Swap Usage
check_command check_nrpe!check_swap ; depends on check_nrpe; the NRPE agent on the client checks the swap partition
}
define service{
use local-service ; Name of service template to use
hostgroup_name Mysql-servers
service_description Mysql Status
check_command check_nrpe!check_mysql ; depends on check_nrpe; the NRPE agent on the client checks whether MySQL is running
}
define service{
use local-service ; Name of service template to use
hostgroup_name Tomcat-servers
service_description Tomcat Status
check_command check_nrpe!check_tomcat ; depends on check_nrpe; the NRPE agent on the client checks whether Tomcat is running
}
define service{
use local-service ; Name of service template to use
hostgroup_name Tcp-servers
service_description Tcp Status
check_command check_nrpe!check_tcp ; depends on check_nrpe; the NRPE agent on the client checks the TCP port status
}
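Every name that follows `check_nrpe!` in the services above has to be known to the NRPE agent on the monitored host (in a standard NRPE setup, as a `command[...]` entry in its configuration). The sketch below extracts that list from a services file so it can be compared with the client side. The sample input is a shortened stand-in for services.cfg, and `nrpe_commands` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the distinct NRPE command names a services file refers to.
# The heredoc is a shortened stand-in for services.cfg.
services=$(mktemp)
cat > "$services" <<'EOF'
check_command check_ping!100.0,20%!500.0,60%
check_command check_nrpe!check_root_disk ; root partition
check_command check_nrpe!check_load ; load
check_command check_nrpe!check_load
EOF

nrpe_commands() {
    # Keep only check_nrpe lines and print the name after the '!'
    sed -n 's/^[[:space:]]*check_command[[:space:]]*check_nrpe!\([A-Za-z0-9_]*\).*/\1/p' "$1" |
        sort -u
}

needed=$(nrpe_commands "$services")
rm -f "$services"
printf '%s\n' "$needed"
```

Checks such as check_ping are skipped because they run on the Nagios server itself and need nothing on the client.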
6. Edit /usr/local/nagios/etc/objects/commands.cfg
[root@nagios-server objects]# vi /usr/local/nagios/etc/objects/commands.cfg
Append the following to the end of this configuration file to define the monitoring command:
################################################################################
#
# created by yuki on 2020.05.08
# related to the commands.cfg about check_nrpe
################################################################################
# 'check_nrpe' command definition
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
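To see how a service's `check_command` value turns into a concrete command line, the sketch below imitates the macro substitution for the check_nrpe definition above. It is only an illustration of the idea, not how Nagios implements it. `expand_check` and `NAGIOS_USER1` are hypothetical names, and the plugin path is an assumed source-install default.

```shell
#!/bin/sh
# Sketch: imitate how $USER1$, $HOSTADDRESS$ and $ARG1$ are filled in for
# the check_nrpe command. Illustration only; Nagios does this internally.
NAGIOS_USER1=/usr/local/nagios/libexec   # assumed value of $USER1$

expand_check() {
    # $1 = check_command value (e.g. check_nrpe!check_load), $2 = host address
    arg1=${1#*!}   # text after the first '!' becomes $ARG1$
    template='$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$'
    printf '%s\n' "$template" |
        sed -e 's|\$USER1\$|'"$NAGIOS_USER1"'|' \
            -e 's|\$HOSTADDRESS\$|'"$2"'|' \
            -e 's|\$ARG1\$|'"$arg1"'|'
}

expanded=$(expand_check 'check_nrpe!check_load' 192.168.2.122)
printf '%s\n' "$expanded"
```

Running the printed command by hand on the Nagios server is a quick way to tell a configuration problem from a connectivity problem.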
7. Edit /usr/local/nagios/etc/objects/contacts.cfg
[root@nagios-server objects]# vi /usr/local/nagios/etc/objects/contacts.cfg
###############################################################################
###############################################################################
#
# CONTACTS
#
###############################################################################
###############################################################################
define contact{
contact_name nagiosadmin ; Short name of user
use generic-contact ; Inherit default values from generic-contact template (defined above)
alias Nagios Admin ; Full name of user
email nagios@localhost ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}
################################################################################
#
# created by yuki on 2020.05.08
#
################################################################################
define contact{
contact_name yuki
use generic-contact
alias yuki
email 123456789@qq.com
}
###############################################################################
###############################################################################
#
# CONTACT GROUPS
#
# modified by yuki on 2020.05.08
###############################################################################
###############################################################################
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members yuki
}
8. Configure /usr/local/nagios/etc/nagios.cfg
Once the files above are configured, edit /usr/local/nagios/etc/nagios.cfg so that it references them.
[root@nagios-server etc]# vi /usr/local/nagios/etc/nagios.cfg +36
Add the following two lines below line 32 so that Nagios loads these configuration files:
cfg_file=/usr/local/nagios/etc/objects/hosts.cfg
cfg_file=/usr/local/nagios/etc/objects/services.cfg
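If the edit is scripted, it is worth making it idempotent so a second run does not add duplicate lines. A minimal sketch, with a temporary stand-in for nagios.cfg: `add_cfg_file` is a hypothetical helper, and it appends at the end of the file rather than at a specific line number.

```shell
#!/bin/sh
# Sketch: add cfg_file lines only when they are not already present.
# A temporary file stands in for /usr/local/nagios/etc/nagios.cfg.
cfg=$(mktemp)
echo 'cfg_file=/usr/local/nagios/etc/objects/commands.cfg' > "$cfg"

add_cfg_file() {
    # $1 = main config file, $2 = object file path
    line="cfg_file=$2"
    grep -qxF "$line" "$1" || printf '%s\n' "$line" >> "$1"
}

for f in hosts.cfg services.cfg; do
    add_cfg_file "$cfg" "/usr/local/nagios/etc/objects/$f"
    add_cfg_file "$cfg" "/usr/local/nagios/etc/objects/$f"   # second call is a no-op
done

result=$(cat "$cfg")
rm -f "$cfg"
printf '%s\n' "$result"
```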
9. Verify the configuration files
[root@nagios-server etc]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Nagios Core 4.3.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 2017-05-09
License: GPL
Website: https://www.nagios.org
Reading configuration data...
Read main config file okay...
Warning: Duplicate definition found for hostgroup 'linux-servers' (config file '/usr/local/nagios/etc/objects/localhost.cfg', starting on line 45)
# The warning above means the host group name is duplicated; rename one of them to something else, e.g. linux-servers-local
Error: Could not add object property in file '/usr/local/nagios/etc/objects/localhost.cfg' on line 46.
Error: Invalid max_check_attempts value for host 'local-192.168.2.122'
Error: Could not register host (config file '/usr/local/nagios/etc/objects/hosts.cfg', starting on line 1)
Error processing object config files!
***> One or more problems was encountered while processing the config files...
Check your configuration file(s) to ensure that they contain valid
directives and data definitions. If you are upgrading from a previous
version of Nagios, you should be aware that some variables/definitions
may have been removed or modified in this version. Make sure to read
the HTML documentation regarding the config files, as well as the
'Whats New' section to find out what has changed.
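When the verification output is long, counting the `Warning:` and `Error:` lines gives a quick overview. The sketch below does that on a saved copy of the messages shown above; on a real server the input would be the captured output of the `nagios -v` command.

```shell
#!/bin/sh
# Sketch: count warnings and errors in saved verification output.
# The heredoc repeats the messages from the run shown above.
log=$(mktemp)
cat > "$log" <<'EOF'
Warning: Duplicate definition found for hostgroup 'linux-servers' (config file '/usr/local/nagios/etc/objects/localhost.cfg', starting on line 45)
Error: Could not add object property in file '/usr/local/nagios/etc/objects/localhost.cfg' on line 46.
Error: Invalid max_check_attempts value for host 'local-192.168.2.122'
Error: Could not register host (config file '/usr/local/nagios/etc/objects/hosts.cfg', starting on line 1)
Error processing object config files!
EOF

warnings=$(grep -c '^Warning:' "$log")
errors=$(grep -c '^Error:' "$log")
rm -f "$log"
echo "warnings=$warnings errors=$errors"
```

The final summary line ("Error processing object config files!") is not counted because it has no colon after "Error".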
10. Restart nagios-server
[root@dscq-236 objects]# chown -R nagios:nagios /usr/local/nagios/
[root@nagios-server objects]# systemctl start nagios.service && echo $?
0
[root@nagios-server objects]# systemctl stop nagios.service && echo $?
0
[root@nagios-server objects]# systemctl restart nagios.service && echo $?
0
[root@nagios-server objects]# systemctl status nagios.service && echo $?
0
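Steps 9 and 10 can be chained so that a restart happens only when verification succeeds. The sketch below shows the control flow with stand-in commands. `verify_then_restart` is a hypothetical helper, and it assumes the verify command exits non-zero when it finds errors.

```shell
#!/bin/sh
# Sketch: restart only if the verify command succeeds.
verify_then_restart() {
    # $1 = verify command, $2 = restart command (both given as strings)
    if sh -c "$1" >/dev/null 2>&1; then
        sh -c "$2"
    else
        echo "configuration check failed; not restarting" >&2
        return 1
    fi
}

# Demonstration with stand-in commands instead of the real ones.
ok=$(verify_then_restart "true" "echo restarted")
bad=$(verify_then_restart "false" "echo restarted" 2>/dev/null) || bad="skipped"
echo "$ok / $bad"

# On the Nagios server the call would look like this (not run here):
# verify_then_restart \
#     "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#     "systemctl restart nagios.service"
```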