wget https://github.com/prometheus/alertmanager/releases/download/v0.15.2/alertmanager-0.15.2.linux-amd64.tar.gz
tar zxvf alertmanager-0.15.2.linux-amd64.tar.gz
Enter the Alertmanager directory and copy the binaries and the configuration file to their target directories:
cd alertmanager-0.15.2.linux-amd64
cp alertmanager amtool /usr/bin/
cp alertmanager.yml /etc/prometheus/
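Before wiring the binary into systemd, it can help to confirm that the copy step worked. The following is a minimal sketch, assuming the paths used above; the function name check_alertmanager_install is hypothetical, and the checks are skipped with a message when a tool or the config file is missing.

```shell
# Hypothetical helper: print the installed version and lint the config file.
check_alertmanager_install() {
  if command -v alertmanager >/dev/null 2>&1; then
    alertmanager --version
  else
    echo "alertmanager: not found on PATH"
  fi

  # amtool check-config validates the Alertmanager configuration syntax
  if command -v amtool >/dev/null 2>&1 && [ -f /etc/prometheus/alertmanager.yml ]; then
    amtool check-config /etc/prometheus/alertmanager.yml
  else
    echo "amtool or /etc/prometheus/alertmanager.yml missing; config check skipped"
  fi
}

check_alertmanager_install
```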
Configure the systemd unit file:
vi /lib/systemd/system/alertmanager.service
[Unit]
Description=Prometheus: the alerting system
Documentation=http://prometheus.io/docs/
After=prometheus.service
[Service]
ExecStart=/usr/bin/alertmanager --config.file=/etc/prometheus/alertmanager.yml
Restart=always
StartLimitInterval=0
RestartSec=10
[Install]
WantedBy=multi-user.target
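After the unit file is saved, systemd has to reload its unit definitions before the service can be enabled and started. A minimal sketch, assuming the unit was saved as alertmanager.service as above; the function name enable_alertmanager is hypothetical, and the guard only avoids errors on hosts without a running systemd or without root privileges.

```shell
# Hypothetical helper: reload unit files, enable the service at boot, start it, report its state.
enable_alertmanager() {
  systemctl daemon-reload &&
    systemctl enable alertmanager.service &&
    systemctl start alertmanager.service &&
    systemctl is-active alertmanager.service
}

# /run/systemd/system exists only when systemd is the running init system
if [ -d /run/systemd/system ] && [ "$(id -u)" -eq 0 ]; then
  enable_alertmanager || echo "alertmanager.service did not start; check journalctl -u alertmanager"
else
  echo "skipped: systemd and root privileges are required"
fi
```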
Edit the Prometheus configuration so that it loads the rule file:
vi /etc/prometheus/prometheus.yml
......
rule_files:
- /etc/prometheus/rules/ceph.yaml
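For alerts to reach Alertmanager, prometheus.yml needs both the rule_files entry and an alerting section whose alertmanagers list points at the Alertmanager address (it listens on port 9093 by default). The elided part of the file is not shown here, so the sketch below only checks that both sections are present; the function name check_prometheus_config is hypothetical, and promtool (shipped with Prometheus) is used for a full syntax check only when it is installed.

```shell
# Hypothetical helper: basic sanity checks on a Prometheus configuration file.
check_prometheus_config() {
  conf="$1"
  grep -q '^rule_files:' "$conf" || { echo "no rule_files section in $conf"; return 1; }
  grep -q 'alertmanagers:' "$conf" || { echo "no alertmanagers section in $conf"; return 1; }

  # Full validation, including the referenced rule files, when promtool is available
  if command -v promtool >/dev/null 2>&1; then
    promtool check config "$conf"
  else
    echo "promtool not found; only the basic checks above were run"
  fi
}
```

A typical call would be check_prometheus_config /etc/prometheus/prometheus.yml, followed by a restart of Prometheus once the check passes.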
Edit /etc/prometheus/rules/ceph.yaml to define the rule that raises the alert:
groups:
- name: ceph-rule
  rules:
  - alert: CephCapacityUsage
    # Used share of capacity in percent: (1 - available / capacity) * 100
    expr: (1 - ceph_cluster_available_bytes / ceph_cluster_capacity_bytes) * 100 > 85
    for: 2m
    labels:
      product: ceph
    annotations:
      summary: "{{$labels.instance}}: Not enough capacity in Ceph detected"
      description: "{{$labels.instance}}: More than 85% of capacity is used (current value is: {{ $value }})"
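The threshold arithmetic behind a capacity alert is easy to get backwards, so a quick worked example may help. The sketch below uses hypothetical byte counts (not taken from a real cluster) and computes both the available and the used percentage; a usage alert with an 85% threshold should compare the used percentage.

```shell
# Hypothetical sample values in bytes (1 TB capacity, 100 GB still available)
capacity=1000000000000
available=100000000000

# Integer percentages: share still available vs. share already used
avail_pct=$(( available * 100 / capacity ))
used_pct=$(( (capacity - available) * 100 / capacity ))

echo "available: ${avail_pct}%  used: ${used_pct}%"

# The alert condition holds only when usage exceeds the 85% threshold
if [ "$used_pct" -gt 85 ]; then
  echo "CephCapacityUsage would fire"
fi
```

With these values the cluster is 90% used and 10% available, so the condition holds; the alert itself fires only after the condition has stayed true for the 2m window set by for.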