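What are white-box and black-box monitoring?
- White-box monitoring
Install the matching exporter on the monitored host; the exporter exposes the host's internal metrics for Prometheus to scrape.
- Black-box monitoring
Nothing has to be installed on the target. From any environment where Prometheus and the monitored target can reach each other, the target is probed over the network using HTTP, HTTPS, DNS, TCP, ICMP, and similar protocols.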
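What is Blackbox Exporter?
Blackbox Exporter is the official Prometheus solution for black-box monitoring. It lets users probe endpoints over HTTP, HTTPS, DNS, TCP, and ICMP. Such probes are commonly used to check whether a service is up and running normally.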
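To make the probing model concrete: each probe is an HTTP request to the exporter's /probe endpoint, with the module and the target passed as query parameters. The following is a minimal sketch, assuming the exporter listens on its default port 9115; `probe_url` is a hypothetical helper written for illustration, not part of Blackbox Exporter.

```shell
# Hypothetical helper: build the URL of a Blackbox Exporter probe request.
# Arguments: exporter address (host:port), module name, probe target.
probe_url() {
  local exporter="$1" module="$2" target="$3"
  printf 'http://%s/probe?module=%s&target=%s\n' "$exporter" "$module" "$target"
}

# Example (requires a running exporter; address and target are assumptions):
#   curl -s "$(probe_url localhost:9115 http_2xx prometheus.io)"
```

The response is plain-text Prometheus metrics, which is why Prometheus can scrape the same endpoint directly.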
I. Installation
1. Download and install
wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.21.0/blackbox_exporter-0.21.0.linux-amd64.tar.gz
tar xf blackbox_exporter-0.21.0.linux-amd64.tar.gz
mv blackbox_exporter-0.21.0.linux-amd64 /blackbox_exporter
# Manage the blackbox_exporter service with systemd
vim /usr/lib/systemd/system/blackbox_exporter.service
[Unit]
Description=blackbox_exporter
After=network.target
[Service]
User=root
Type=simple
ExecStart=/blackbox_exporter/blackbox_exporter --config.file=/blackbox_exporter/blackbox.yml
Restart=on-failure
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl start blackbox_exporter
systemctl enable blackbox_exporter
ss -tanpl | grep 9115
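Once port 9115 is listening, a probe can be run by hand to confirm that the exporter works before Prometheus is involved. The sketch below assumes the exporter runs on localhost with the http_2xx module available; `probe_success_value` is a hypothetical helper that pulls the probe_success metric out of the response.

```shell
# Hypothetical helper: read probe output on stdin and print the value of
# probe_success (1 = probe succeeded, 0 = probe failed).
# Comment lines (# HELP / # TYPE) are skipped because their first field is "#".
probe_success_value() {
  awk '$1 == "probe_success" { print $2 }'
}

# Example (requires the exporter started above; the target is only an illustration):
#   curl -s 'http://localhost:9115/probe?module=http_2xx&target=http://www.baidu.com' | probe_success_value
```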
2. Create the scrape configuration
vim /prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  # Website (HTTP) monitoring
  - job_name: 'http_status'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
        - http://www.baidu.com
    relabel_configs:
      # Pass the listed target to the exporter as the ?target= parameter
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the Blackbox Exporter instead of the target itself
      - target_label: __address__
        replacement: 192.168.31.63:9115
  # Ping (ICMP) check
  - job_name: 'ping_status'
    metrics_path: /probe
    params:
      module: [icmp]
    static_configs:
      - targets: ['192.168.31.62']
        labels:
          instance: 'ping_status'
          group: 'icmp'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: 192.168.31.63:9115
  # Port (TCP) monitoring
  - job_name: 'port_status'
    metrics_path: /probe
    params:
      module: [tcp_connect]
    static_configs:
      - targets: ['192.168.31.62:80']
        labels:
          instance: 'port_status'
          group: 'port'
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - target_label: __address__
        replacement: 192.168.31.63:9115
systemctl restart prometheus
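The three jobs above refer to the modules http_2xx, icmp, and tcp_connect, which must be defined in the file passed to --config.file. The blackbox.yml shipped in the release archive normally defines them already, so the following is only a sketch of what such definitions look like. The scratch path and the 5s timeouts are assumptions for illustration.

```shell
# Assumption: write to a scratch file so the shipped blackbox.yml is not overwritten.
BLACKBOX_CONF="${BLACKBOX_CONF:-./blackbox.example.yml}"

# Minimal module definitions matching the module names used in prometheus.yml.
cat > "$BLACKBOX_CONF" <<'EOF'
modules:
  http_2xx:
    prober: http
    timeout: 5s
  tcp_connect:
    prober: tcp
    timeout: 5s
  icmp:
    prober: icmp
    timeout: 5s
EOF
```

Each module name on the left is what the `module: [...]` parameter of a scrape job selects; the `prober` field decides which protocol is used.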
Configure alerting rules for Alertmanager
groups:
  - name: example
    rules:
      - alert: curlHttpStatus
        expr: probe_http_status_code{job="http_status"}>=400 and probe_success{job="http_status"}==0
        for: 1m
        labels:
          severity: critical
        annotations:
          description: '{{$labels.instance}} is unreachable, please check promptly; current status code is {{$value}}'
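The rule above only matches the http_status job, and only when an HTTP status of 400 or higher was returned. The ICMP and TCP jobs are not covered, and a target that cannot be connected to at all usually returns no HTTP status, so it would not match either. A possible complement, sketched under assumptions, is a rule on probe_success alone. The file name, group name, and alert name below are hypothetical.

```shell
# Assumption: scratch path; in a real setup the file must be listed under
# rule_files in prometheus.yml.
RULES_FILE="${RULES_FILE:-./blackbox_probe.rules.yml}"

# The quoted 'EOF' keeps the shell from expanding $labels inside the template.
cat > "$RULES_FILE" <<'EOF'
groups:
  - name: blackbox_probe_example
    rules:
      - alert: ProbeFailed
        expr: probe_success == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          description: '{{ $labels.instance }} (job {{ $labels.job }}) failed its probe'
EOF
```

If promtool is installed, `promtool check rules "$RULES_FILE"` can validate the file before Prometheus is restarted.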