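Before starting, make sure Prometheus and Alertmanager are already installed; installation is not covered here.
Defining custom Prometheus alerting rules
Edit the Prometheus configuration file prometheus.yml and add the following: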
rule_files:
- /etc/prometheus/rules/*.rules
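The rule_files entry is a glob, so only files in that directory whose names end in .rules are picked up. Prometheus expands the glob itself; the following is only a rough Python sketch of the matching behaviour, useful for checking a file name before restarting. The function name is_loaded and the two non-matching paths are hypothetical and chosen purely for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Pattern copied from the rule_files entry in prometheus.yml.
RULE_GLOB = "/etc/prometheus/rules/*.rules"


def is_loaded(path: str, pattern: str = RULE_GLOB) -> bool:
    """Approximate whether `path` is matched by the rule_files glob."""
    p, g = PurePosixPath(path), PurePosixPath(pattern)
    # Compare the directory exactly and only the file name against the
    # wildcard, so "*" never crosses a "/" (plain fnmatch would allow that).
    return p.parent == g.parent and fnmatchcase(p.name, g.name)


candidates = [
    "/etc/prometheus/rules/hoststats-alert.rules",
    "/etc/prometheus/rules/hoststats-alert.yml",        # hypothetical: wrong extension
    "/etc/prometheus/rules/sub/hoststats-alert.rules",  # hypothetical: nested directory
]
for c in candidates:
    print(c, "->", is_loaded(c))
```

Under these assumptions only the first path matches, which is why the alert file in this walkthrough uses the .rules extension and sits directly in /etc/prometheus/rules/.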
In the directory /etc/prometheus/rules/, create an alert file named hoststats-alert.rules with the following content:
groups:
- name: hostStatsAlert
  rules:
  # Note: node_exporter 0.16+ renamed node_cpu to node_cpu_seconds_total and
  # added a _bytes suffix to the memory metrics; adjust the expressions if needed.
  - alert: hostCpuUsageAlert
    expr: sum(avg without (cpu)(irate(node_cpu{mode!='idle'}[5m]))) by (instance) > 0.85
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance }} CPU usage high"
      description: "{{ $labels.instance }} CPU usage above 85% (current value: {{ $value }})"
  - alert: hostMemUsageAlert
    expr: (node_memory_MemTotal - node_memory_MemAvailable)/node_memory_MemTotal > 0.85
    for: 1m
    labels:
      severity: page
    annotations:
      summary: "Instance {{ $labels.instance