DingTalk alerts
Sending alerts to DingTalk or Feishu requires a third-party webhook plugin.
Plugin repository: https://github.com/timonwong/prometheus-webhook-dingtalk
# Download the plugin
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz
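# Extract the archive and symlink it under /usr/local/
# In the configuration file config.yml, add the DingTalk robot settings
targets:
  webhook1:  # the name webhook1 is used in the url in the Alertmanager configuration
    # url: the robot's webhook address
    url: https://oapi.dingtalk.com/robot/send?access_token=*****a8b6ebd14d1bd23600b9653287bebb9f5128*********&&&*****
    # secret for signature; if signing is enabled on the robot, put the signing secret here
    secret: SEC6****fbbf850ecfab220b4ae7dfc58b994953c50d687fe**********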
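When signing is enabled, DingTalk expects each request to carry a timestamp and a sign parameter. The plugin computes these itself from the secret above. The sketch below only illustrates the calculation so the robot can be tested by hand. The function name dingtalk_sign is hypothetical, and the sketch assumes openssl, base64 and sed are installed.

```shell
# Sketch: compute the DingTalk "sign" query parameter for a signed robot.
# dingtalk_sign is a hypothetical helper name, not part of the plugin.
dingtalk_sign() {
  local timestamp_ms="$1" secret="$2"
  # The string to sign is "<timestamp>\n<secret>", keyed with the secret itself
  # (HMAC-SHA256), then Base64-encoded and URL-encoded.
  printf '%s\n%s' "$timestamp_ms" "$secret" \
    | openssl dgst -sha256 -hmac "$secret" -binary \
    | base64 \
    | sed -e 's/+/%2B/g' -e 's,/,%2F,g' -e 's/=/%3D/g'
}

# Example use (not executed here; <token> and <secret> are placeholders):
# ts=$(date +%s%3N)   # milliseconds, GNU date
# curl -s -H 'Content-Type: application/json' \
#   -d '{"msgtype":"text","text":{"content":"test"}}' \
#   "https://oapi.dingtalk.com/robot/send?access_token=<token>&timestamp=${ts}&sign=$(dingtalk_sign "$ts" "<secret>")"
```

If the hand-built request is rejected, check first that the timestamp is in milliseconds and close to the current time.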
Start dingtalk
cat /usr/lib/systemd/system/dingtalk-webhook.service
[Unit]
Description=Prometheus-Server
After=network.target
[Service]
ExecStart=/usr/local/dingtalk/prometheus-webhook-dingtalk --config.file=/usr/local/dingtalk/config.yml
User=root
[Install]
WantedBy=multi-user.target
systemctl start dingtalk-webhook.service
If the service fails to start, you may need to create the prometheus user: useradd prometheus
Verify: the service listens on port 8060 by default.
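The source does not show the verification command. One way to check the port is sketched below, assuming bash and the coreutils timeout command are available. The function name check_port is hypothetical. On the server itself, ss -lnt | grep 8060 gives the same information.

```shell
# Sketch: report whether a TCP port accepts connections.
# check_port is a hypothetical helper; it relies on bash's /dev/tcp feature.
check_port() {
  local host="$1" port="$2"
  # Try to open a TCP connection, giving up after 3 seconds.
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# Example use on the host running the webhook service:
# check_port 127.0.0.1 8060
```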
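Add the dingtalk configuration to Alertmanager:
route:
  group_by: ['dingtalk']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 10m  # interval between repeated notifications for an alert that is still firing
  receiver: 'dingtalk'  # receiver name; must match a name under receivers below
  routes:
    - receiver: 'dingtalk'
      match_re:
        alertname: ".*"  # match all alert rules
receivers:
  - name: 'dingtalk'
    webhook_configs:
      - url: 'http://10.19.*.*:8060/dingtalk/webhook1/send'  # address and port of the dingtalk webhook service; webhook1 must match the target name in the dingtalk configuration
        send_resolved: true  # also send a notification when the alert resolves
Restart Alertmanager after changing the configuration.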
Here we simulate a MySQL Down alert.
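The source does not say how the alert was simulated. One option that leaves MySQL running is to post a hand-built alert to Alertmanager's v2 API. The sketch below assumes Alertmanager is reachable at http://localhost:9093. The function names and the label values are hypothetical.

```shell
# Sketch: build and send a test alert to Alertmanager's /api/v2/alerts endpoint.
# build_test_alert and send_test_alert are hypothetical helper names.
build_test_alert() {
  local alertname="$1" instance="$2" severity="$3"
  # startsAt and endsAt are omitted, so Alertmanager treats the alert as firing
  # now and resolves it once its resolve_timeout passes without a refresh.
  printf '[{"labels":{"alertname":"%s","instance":"%s","severity":"%s"},"annotations":{"summary":"%s test alert","description":"Manually injected test alert"}}]\n' \
    "$alertname" "$instance" "$severity" "$alertname"
}

send_test_alert() {
  local am_url="$1"; shift
  build_test_alert "$@" \
    | curl -sS -X POST -H 'Content-Type: application/json' \
        --data-binary @- "${am_url}/api/v2/alerts"
}

# Example use (assumed address and label values):
# send_test_alert http://localhost:9093 MySQLDown 10.0.0.5:9104 critical
```

With the group_wait of 30s configured above, the notification should arrive roughly half a minute after the request.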
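DingTalk receives the alert:
Recovery notification:
The alerts above were rendered with a custom template. To use a custom template, reference it in the dingtalk configuration: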
cat dingtalk/config.yml
templates:
- /usr/local/dingtalk/template.tmpl
…………
# Template content: template.tmpl
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}
{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**Alert summary**: {{ .Annotations.summary }}
**Alert type**: {{ .Labels.alertname }}
**Severity**: {{ .Labels.severity }}
**Host**: {{ .Labels.instance }}
**Description**: {{ index .Annotations "description" }}
**Start time**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**Alert summary**: {{ .Annotations.summary }}
**Alert type**: {{ .Labels.alertname }}
**Severity**: {{ .Labels.severity }}
**Host**: {{ .Labels.instance }}
**Description**: {{ index .Annotations "description" }}
**Start time**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
**Resolved time**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}
{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**===={{ .Alerts.Firing | len }} alert(s) firing====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
**===={{ .Alerts.Resolved | len }} alert(s) resolved====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
Feishu alerts
Feishu alerts can be sent through PrometheusAlert.
# Download the release package
wget https://github.com/feiyu563/PrometheusAlert/releases/download/v4.9/linux.zip
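# Extract the archive and move it under /usr/local/
unzip linux.zip
mv linux /usr/local/prometheusAlert
# Edit the PrometheusAlert configuration file. It has many default options; only the ones that need changing are shown here, and unused ones can be deleted.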
cat conf/app.conf
#---------------------↓global settings-----------------------
appname = PrometheusAlert
# Login username
login_user=admin
# Login password
login_password=prometheusalert
# Listen address
httpaddr = "0.0.0.0"
# Listen port
httpport = 8020
#---------------------↓webhook-----------------------
# Enable the DingTalk alert channel (several channels can be enabled at once); 0 = off, 1 = on
open-dingding=0
# Default DingTalk robot address
#ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxx
ddurl=https://oapi.dingtalk.com/robot/send?access_token=109b5a8b6ebd14d1bd23600b9653287bebb9f5128b682d0248664e5d322eb959&secret=SEC6db70fbbf850ecfab2***********************
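# Whether to @ everyone (0 = off, 1 = on)
dd_isatall=1
# Enable DingTalk robot signing; 0 = off, 1 = on
open-dingding-secret=1
# Enable the WeChat alert channel (several channels can be enabled at once); 0 = off, 1 = on
open-weixin=0
# Default WeCom (Enterprise WeChat) robot address
wxurl=https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxx
# Enable the Feishu alert channel (several channels can be enabled at once); 0 = off, 1 = on
open-feishu=1
# Default Feishu robot address
fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/2bf5654f-4a0a-****-####****
# Content type of the HTTP request sent by the webhook, e.g. application/json or application/x-www-form-urlencoded; defaults to application/json when unset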
wh_contenttype=application/json
Service configuration and startup
cat << EOF > /usr/lib/systemd/system/prometheusalert.service
[Service]
ExecStart=/usr/local/prometheusAlert/PrometheusAlert
WorkingDirectory=/usr/local/prometheusAlert
Restart=always
[Install]
WantedBy=multi-user.target
[Unit]
Description=Prometheus Alerting Service
After=network.target
EOF
systemctl start prometheusalert
systemctl enable prometheusalert
Log in to the web UI to verify:
The Feishu robot can be tested from the alert test page:
Test succeeded:
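Besides the web UI, the Feishu robot can be tested directly by posting a text message to its webhook address. The sketch below builds the JSON body that Feishu custom bots accept for text messages. The function name feishu_text_payload is hypothetical, and the escaping only covers backslashes and double quotes. If keyword or signature checks are enabled on the robot, the message must also satisfy them.

```shell
# Sketch: build the JSON body for a Feishu custom-bot text message.
# feishu_text_payload is a hypothetical helper name.
feishu_text_payload() {
  local text
  # Minimal JSON escaping: backslashes first, then double quotes.
  text=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"msg_type":"text","content":{"text":"%s"}}\n' "$text"
}

# Example use (replace <hook-id> with the robot's own webhook id):
# feishu_text_payload 'PrometheusAlert test' \
#   | curl -sS -X POST -H 'Content-Type: application/json' \
#       --data-binary @- "https://open.feishu.cn/open-apis/bot/v2/hook/<hook-id>"
```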
Change the Alertmanager configuration to send notifications through Feishu:
route:
  group_by: ['dingtalk']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 10m
  receiver: 'feishu'
  routes:
    - receiver: 'feishu'
      match_re:
        alertname: ".*"
receivers:
  - name: 'dingtalk'
    webhook_configs:
      - url: 'http://*****:8060/dingtalk/webhook1/send'
        send_resolved: true
  - name: 'feishu'
    webhook_configs:
      # url: the PrometheusAlert address and port; fsurl: the Feishu webhook address; leave the other parameters unchanged
      - url: 'http://*****:8020/prometheusalert?type=fs&tpl=prometheus-fs&fsurl=https://open.feishu.cn/open-apis/bot/v2/hook/2bf56**-****-43aa-**ee-*******'
        send_resolved: true
inhibit_rules:  # alert inhibition
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
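The url of the feishu receiver has three variable parts: the PrometheusAlert host, its port (httpport in app.conf), and the Feishu webhook passed as fsurl. A small sketch of how the string is assembled, using the query parameters shown in the configuration above. The function name prometheusalert_url and the example values are hypothetical.

```shell
# Sketch: assemble the Alertmanager webhook URL for PrometheusAlert's Feishu channel.
# prometheusalert_url is a hypothetical helper name.
prometheusalert_url() {
  local host="$1" port="$2" fsurl="$3"
  # type=fs and tpl=prometheus-fs are taken from the receiver configuration above.
  printf 'http://%s:%s/prometheusalert?type=fs&tpl=prometheus-fs&fsurl=%s\n' \
    "$host" "$port" "$fsurl"
}

# Example use (assumed host and a placeholder hook id):
# prometheusalert_url 10.0.0.8 8020 "https://open.feishu.cn/open-apis/bot/v2/hook/<hook-id>"
```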
Restart Alertmanager and simulate the MySQL down alert again.
Feishu receives the alert:
To modify the alert template, see: https://github.com/feiyu563/PrometheusAlert/issues/30