Email Alerting with Prometheus + Alertmanager

2018_like菜

Article link: https://blog.csdn.net/u014756339/article/details/114674443

1. Alert rule reference

https://awesome-prometheus-alerts.grep.to/rules#host-and-hardware

The deployment steps follow.

2. Deploy Alertmanager
wget https://github.com/prometheus/alertmanager/releases/download/v0.20.0/alertmanager-0.20.0.linux-amd64.tar.gz
tar xvf alertmanager-0.20.0.linux-amd64.tar.gz
mv alertmanager-0.20.0.linux-amd64 /usr/local/bin/alertmanager

3. Edit the main Alertmanager configuration file (email alerting)

cd /usr/local/bin/alertmanager
cat > alertmanager.yml << 'EOF'
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:465'
  smtp_from: <sender email address>
  smtp_auth_username: <sender email address>
  smtp_auth_password: <password>
  smtp_require_tls: false

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1m
  receiver: 'mail'
receivers:
- name: 'mail'
  email_configs:
  - to: <recipient email address>
EOF

Check that the configuration file is valid:

./amtool check-config alertmanager.yml

Start Alertmanager:

nohup /usr/local/bin/alertmanager/alertmanager --config.file=/usr/local/bin/alertmanager/alertmanager.yml &
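
Because the process is started in the background with nohup, a startup failure is easy to miss. The following is a minimal sketch of a readiness check that polls Alertmanager's /-/healthy endpoint. The function name wait_for_alertmanager, the default URL http://localhost:9093, and the retry count are assumptions for illustration, not part of the original setup.

```shell
# Poll Alertmanager's health endpoint until it answers or the attempts run out.
# Assumed defaults: URL http://localhost:9093, 10 attempts, 1 second apart.
wait_for_alertmanager() {
  local url="${1:-http://localhost:9093}"
  local attempts="${2:-10}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
    if curl -fs "${url}/-/healthy" >/dev/null; then
      echo "Alertmanager is healthy at ${url}"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "Alertmanager did not become healthy at ${url}" >&2
  return 1
}
```

A call such as `wait_for_alertmanager http://localhost:9093 30` right after the nohup line would block until the service answers, or return non-zero after 30 attempts.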

4. Configure Prometheus to talk to Alertmanager in prometheus.yml
vim prometheus.yml

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - ip:9093   # replace ip with the Alertmanager host address

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # first_rules.yml sits in the same directory as prometheus.yml
  - "first_rules.yml"
  # - "second_rules.yml"

./promtool check config prometheus.yml
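
Once Prometheus has been restarted with this configuration, its HTTP API can confirm that the Alertmanager target was actually picked up. The sketch below queries /api/v1/alertmanagers and looks for the target address in the response. The function name, the default addresses, and the plain-text grep match are assumptions chosen to avoid a dependency on a JSON parser; the match is deliberately crude and does not distinguish active from dropped Alertmanagers.

```shell
# Succeeds if the Prometheus API response mentions the given Alertmanager target.
# Assumed defaults: Prometheus on localhost:9090, Alertmanager on localhost:9093.
prometheus_sees_alertmanager() {
  local prom_url="${1:-http://localhost:9090}"
  local am_target="${2:-localhost:9093}"
  # -F treats the pattern as a fixed string, -q suppresses output.
  curl -fs "${prom_url}/api/v1/alertmanagers" | grep -qF "http://${am_target}/"
}
```

For example, `prometheus_sees_alertmanager http://ip:9090 ip:9093 && echo linked` prints "linked" only when the target appears in the response.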

5. Write the alerting rules
Official examples: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/

The heredoc delimiter is quoted ('EOF') so that the shell does not expand $labels inside the template expressions.

cat > first_rules.yml << 'EOF'
groups:
- name: general.rules
  rules:

  # Alert for any instance that is unreachable for more than 1 minute.
  - alert: InstanceDown   # alert name
    expr: up == 0
    for: 1m
    labels:
      severity: error
    annotations:
      summary: "Instance {{ $labels.instance }} has stopped working"
      description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute"
EOF
./promtool check config prometheus.yml
systemctl restart prometheus
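
One detail of writing rule files through a heredoc is worth checking: with an unquoted delimiter the shell performs variable expansion inside the body, so a template such as {{ $labels.instance }} can reach the file altered. The sketch below compares the two forms. The variable is given an artificial value (OOPS) only to make the substitution visible; in a normal session $labels is usually unset and would silently expand to nothing.

```shell
# Artificial value so the expansion is visible; normally $labels is unset.
labels="OOPS"

# Unquoted delimiter: the shell substitutes $labels before cat sees the text.
unquoted="$(cat << EOF
summary: "Instance {{ $labels.instance }} down"
EOF
)"

# Quoted delimiter: the body is passed through literally.
quoted="$(cat << 'EOF'
summary: "Instance {{ $labels.instance }} down"
EOF
)"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Only the quoted form leaves the Prometheus template intact, which is what a rule file needs.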

The rules you wrote can then be viewed in a browser:

http://ip:9090/rules
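6. Verify the alert

On 172.16.38.238, stop the target for the node job.
The node target then shows as down.

After about two minutes, the alert email arrives.

The alert state changes to FIRING.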
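
The email path can also be exercised without stopping a real target by posting a hand-made alert straight to Alertmanager's v2 API. The following is a minimal sketch under stated assumptions: the function names, the alert name TestAlert, the summary text, and the default URL http://localhost:9093 are all hypothetical and not part of the original setup.

```shell
# Build a one-element JSON array in the shape accepted by POST /api/v2/alerts.
# Assumed defaults: alert name TestAlert, severity error.
build_test_alert() {
  local name="${1:-TestAlert}"
  local severity="${2:-error}"
  printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"manual test alert"}}]' \
    "$name" "$severity"
}

# Send the payload to Alertmanager; --data @- makes curl read the body from stdin.
send_test_alert() {
  local url="${1:-http://localhost:9093}"
  build_test_alert "$2" "$3" |
    curl -fs -X POST -H 'Content-Type: application/json' --data @- "${url}/api/v2/alerts"
}
```

Running `send_test_alert http://ip:9093` should lead to an email once the configured group_wait has elapsed, provided the SMTP settings are correct.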
