Learning and Using Icinga 2 (Part 2)

weixin_33693070

Tags: operations

Original link: http://blog.51cto.com/xyfft/1613606

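Configuration Files Explained

This post explains the Icinga 2 configuration files. Compared with Nagios, the underlying logic is the same: you define hosts (and host groups), services (and service groups), check commands, templates, check intervals, and so on. The actual syntax is different, though, and uses a whole new set of keywords. The details follow below. There are a few places I have not fully figured out myself, and I hope readers will join in and discuss them.

This assumes Icinga 2 was installed via yum.

1 The two configuration directories on the master server:

/etc/icinga2/: most of the configuration lives under ./conf.d, which is mainly where your custom configuration goes.

File names can be anything, as long as you know what each file is for; you do not have to split them strictly by user, service, and so on.

/usr/share/icinga2/include/: this mainly holds predefined commands that can be used directly. You can also use these definitions as a reference when writing your own script plugins.

2 The individual configuration files

commands.conf and command-plugins.conf: command definitions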

object CheckCommand "ssh" {
      import "plugin-check-command"
      command = PluginDir + "/check_ssh"  # PluginDir is defined in constants.conf
      arguments = {
             "-p" = "$ssh_port$"
             "host" = {
                    value = "$ssh_address$"
                    skip_key = true
                    order = -1
             }
      }
      vars.ssh_address = "$address$"
}

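object CheckCommand is the fixed keyword for defining a command.

import pulls in a template, here "plugin-check-command" from command.conf.

command is the plugin to run; PluginDir is defined in constants.conf.

arguments holds the command-line parameters; for a custom script you may not need to define any here.

"-p" = "$ssh_port$": -p is the plugin's own option, and ssh_port is a custom variable name, written in the form $...$.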
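As a sketch of the same pattern applied to a custom script, the definition below assumes a hypothetical plugin named check_my_script that accepts a -w warning threshold. The command name "my-script", the script name and the my_warn variable are all invented for illustration and are not part of the original setup. The command is written as an array here, which is the form the current Icinga 2 documentation asks for when arguments is set.

```
object CheckCommand "my-script" {
      import "plugin-check-command"
      // Hypothetical script placed in the plugin directory
      command = [ PluginDir + "/check_my_script" ]
      arguments = {
             // Rendered on the command line as: -w <value of my_warn>
             "-w" = "$my_warn$"
      }
      // Default value; a host or service can override it with vars.my_warn
      vars.my_warn = "80"
}
```

A service would then use it with check_command = "my-script" and, if needed, its own vars.my_warn.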

templates.conf: template definitions

Check template for hosts:

template Host "generic-host" {
    max_check_attempts = 3
    check_interval = 5m
    retry_interval = 30s
    check_command = "hostalive"
}

Check template for services:

template Service "generic-service" {
    max_check_attempts = 2
    check_interval = 5m
    retry_interval = 20s
}

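template Host and template Service are fixed syntax; the name in quotes that follows is up to you.

max_check_attempts: the maximum number of check attempts once a problem is detected

check_interval: how often the check runs

retry_interval: how often the check is repeated once a problem has been detected

Notification template: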

template Notification "30mins-notification" {
     interval = 30m
     command = "mail-service-notification"
     states = [ Critical ]
     types = [ Problem, Recovery ]
     period = "24x7"
}

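command is defined in commands.conf.

states sets which states trigger an alert email; I only set Critical, to cut down on the volume of mail.

types selects which notification types are sent (Problem and Recovery here; there are many others).

period is the time period during which alerts are sent.

If you want to delay the first notification, you can do the following: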

apply Notification "mail" to Service {
     import "generic-notification"
     command = "mail-notification"
     users = [ "icingaadmin" ]
     times.begin = 15m // delay first notification
     assign where service.name == "ping4"
}

Tips:

When detecting a problem with a host/service Icinga re-checks the object a number of times (based on the max_check_attempts and retry_interval settings) before sending notifications. This ensures that no unnecessary notifications are sent for transient failures. During this time the object is in a SOFT state. After all re-checks have been executed and the object is still in a non-OK state the host/service switches to a HARD state and notifications are sent.
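To make the SOFT/HARD transition concrete, here is a rough timeline for the generic-host template shown earlier (max_check_attempts = 3, retry_interval = 30s), written as comments on a copy of that template. It is an annotated copy, not an additional definition, and the timings are approximate: they ignore check execution time and scheduling jitter.

```
template Host "generic-host" {
    max_check_attempts = 3   // the 3rd consecutive failed check makes the state HARD
    check_interval = 5m      // normal schedule while the host is up
    retry_interval = 30s     // schedule used while the state is SOFT
    check_command = "hostalive"
}
// t = 0s     check fails -> SOFT, attempt 1 of 3
// t ~ 30s    check fails -> SOFT, attempt 2 of 3
// t ~ 60s    check fails -> HARD, notifications are sent
```

With these values a host that stays down is reported about a minute after the first failed check, while a failure that clears within that minute produces no mail.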

users.conf: defines alert recipients (users) and hosts

object User "icingaadmin" {
     import "generic-user"
     display_name = "Icinga 2 Admin"
     groups = [ "icingaadmins" ]
     email = "dl-monitor@vuclip.com"
}
object Host "xx" {
     display_name = "xx"
     address = "xx"
     groups = [ "cs" ]
     check_command = "hostalive"
}

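object User and object Host are fixed syntax; what follows them is user-defined.

Host notes:

import: imports a template from templates.conf

display_name: free-form

groups: user-defined; separate multiple groups with commas (whether every one of them takes effect still needs to be confirmed)

address: can be a domain name or an IP

check_command: the command used to check the host; hostalive here, which is a ping check…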
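As an illustration of import and a multi-group host together, the sketch below defines a host that inherits from the generic-host template in templates.conf and belongs to two groups. The host name web01 and the address are placeholders I made up (192.0.2.10 is from the documentation address range); the group names cs and vu are the ones used elsewhere in this post, and the HostGroup objects are only needed if they are not already defined.

```
object HostGroup "cs" {
     display_name = "cs servers"
}
object HostGroup "vu" {
     display_name = "vu servers"
}
object Host "web01" {
     import "generic-host"      // inherits check_command and the check intervals
     display_name = "web01"
     address = "192.0.2.10"     // placeholder address
     groups = [ "cs", "vu" ]    // several groups, separated by commas
}
```

Because check_command comes from the template, it does not need to be repeated in the host object.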

services.conf: service definitions (you can also give a particular service its own xxx.conf)

object Service "ssh" {
     import "generic-service"
     check_command = "ssh"
     host_name = "hk"
     vars.ssh_port = "22221"
}

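For a service on a single host, you can define it with object Service.

vars.ssh_port shows how a custom variable is used. vars. is the fixed prefix, followed by the variable name, which is the one defined in command-plugins.conf; the value after the equals sign is the custom port.

When one service applies to many hosts, define it with apply Service as follows: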

apply Service "total_procs" {
     import "generic-service"
     check_command = "nrpe"   # use the nrpe command to check
     vars.nrpe_command = "check_total_procs" # command on the client server
     assign where "es" in host.groups
     ignore where host.address == ""
}
apply Service "http 80" {
     import "generic-service"
     check_command = "http" # command on the monitoring server, which takes the "-H" argument
     assign where "vu" in host.groups
     ignore where host.address == ""
}

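With apply, the keywords assign and ignore are used: assign is required, while ignore may be left out. You can write several ignore rules on separate lines (I did not manage to get them working on a single line).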
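One way to put several exclusions on a single line is to join them with the || operator inside one ignore where expression. The sketch below is a variant of the "http 80" rule above (meant to replace it, not to sit next to it); host.vars.no_http is a hypothetical custom variable introduced only for this example.

```
apply Service "http 80" {
     import "generic-service"
     check_command = "http"
     assign where "vu" in host.groups
     // Two exclusions combined in a single expression with ||
     ignore where host.address == "" || host.vars.no_http == true
}
```

A host is skipped if either condition holds, which is the same result as writing the two ignore where rules on separate lines.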

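The two service definitions here work the same way: both rely on a plugin, check_nrpe or check_http. The commands named here, http and nrpe, are already defined in command-plugins.conf.

groups.conf: service group and host group definitions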

object ServiceGroup "load" {
     display_name = "Load Checks"
     assign where service.vars.nrpe_command == "check_load"
}
object ServiceGroup "ssh" {
     display_name = "Ssh Checks"
     assign where service.check_command == "ssh"
}
object HostGroup "es" {
     display_name = "es server"
}

notifications.conf: applying notifications (the templates were defined earlier; here they are applied)

apply Notification "mail-icingaadmin" to Host {
     import "mail-host-notification"
     user_groups = [ "icingaadmins" ]
     assign where host.vars.sla == "24x7"
}

apply Notification "mail-icingaadmin-5" to Service {
     import "5mins-notification"
     user_groups = [ "icingaadmins" ]
     assign where service.name == "ssh"
     assign where service.name == "check_system_5"
     assign where service.name == "zombie_procs"
     assign where service.name == "http 80"
}

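The Icinga 2 configuration syntax is far richer than what I have shown above: it supports regular expressions and all kinds of macro variables, and is very flexible.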
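As a small taste of that flexibility, the sketch below selects hosts by name pattern using the built-in match() (wildcards) and regex() (regular expressions) functions. The service name "ssh-by-name" and the host name patterns are assumptions chosen for illustration; adjust them to your own naming scheme.

```
apply Service "ssh-by-name" {
     import "generic-service"
     check_command = "ssh"
     // Wildcard match: host names starting with "web"
     assign where match("web*", host.name)
     // Regular expression match: "db" followed by digits only
     assign where regex("^db[0-9]+$", host.name)
     ignore where host.address == ""
}
```

Several assign where rules are alternatives: a host gets the service if any one of them matches and no ignore where rule excludes it.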

Validating and loading the configuration

/etc/init.d/icinga2 reload (checks the configuration automatically)

Reposted from: https://blog.51cto.com/xyfft/1613606
