一:创建钉钉告警机器人
一:创建钉钉告警机器人
1.在PC版钉钉上打开您想要添加报警机器人的钉钉群,并单击右上角的群设置图标。
2.在群设置面板中单击智能群助手。
3.在智能群助手面板单击添加机器人。
4.在群机器人对话框单击添加机器人区域的+图标,然后选择添加自定义。
5. 在机器人详情对话框单击添加。
6. 在添加机器人对话框中编辑机器人头像和名称,选中必要的安全设置(至少选择一种),选中我已阅读并同意《自定义机器人服务及免责条款》。单击完成。
二:安装钉钉告警插件
1. 下载插件
https://github.com/timonwong/prometheus-webhook-dingtalk/releases/
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v2.1.0/prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz
2. 安装
tar -xvf prometheus-webhook-dingtalk-2.1.0.linux-amd64.tar.gz -C /usr/local
cd /usr/local
mv prometheus-webhook-dingtalk-2.1.0.linux-amd64 prometheus-webhook-dingtalk
3.修改钉钉告警插件配置文件
## Request timeout
# timeout: 5s
## Uncomment following line in order to write template from scratch (be careful!)
#no_builtin_template: true
## Customizable templates path
templates:
- contrib/templates/*.tmpl # 这里指向你生成的模板
## You can also override default template using `default_message`
## The following example to use the 'legacy' template from v0.3.0
#default_message:
# title: '{{ template "legacy.title" . }}'
# text: '{{ template "legacy.content" . }}'
## Targets, previously was known as "profiles"
targets:
webhook1:
# 钉钉机器人的webhook
url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# secret for signature 加签后得到的值
secret: xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# webhook2:
# url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
# webhook_legacy:
# url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
# # Customize template content
# message:
# # Use legacy template
# title: '{{ template "legacy.title" . }}'
# text: '{{ template "legacy.content" . }}'
# webhook_mention_all:
# url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
# mention:
# all: true
# webhook_mention_users:
# url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
# mention:
# mobiles: ['156xxxx8827', '189xxxx8325']
钉钉告警模板(这个模板是在钉钉报警插件中使用的)
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}
{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**告警名称**: {{ index .Annotations "title" }}
**告警级别**: {{ .Labels.severity }}
**告警主机**: {{ .Labels.instance }}
**告警信息**: {{ index .Annotations "description" }}
**告警时间**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**告警名称**: {{ index .Annotations "title" }}
**告警级别**: {{ .Labels.severity }}
**告警主机**: {{ .Labels.instance }}
**告警信息**: {{ index .Annotations "description" }}
**告警时间**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
**恢复时间**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}
{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**====侦测到{{ .Alerts.Firing | len }}个故障====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
**====恢复{{ .Alerts.Resolved | len }}个故障====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
4.启动钉钉报警插件
./prometheus-webhook-dingtalk --config.file=config.yml >dingtalk.log 2>&1 &
#默认使用 8060 端口
netstat -ntlp|grep 8060
三. 修改alertmanager配置文件(添加钉钉告警渠道)
vi /usr/local/alertmanager/alertmanager.yml
1.routes 部分添加如下部分
#通过正则表达式指定告警名称为Mysql开头,或者告警名称为 Memory Usage 的告警通过 dingding.webhook1 发送
routes:
- receiver: 'dingding.webhook1'
match_re:
alertname: "Mysql.*|Memory Usage"
2. receivers 添加如下部分
receivers:
- name: 'dingding.webhook1'
webhook_configs:
- url: 'http://119.8.238.94:8060/dingtalk/webhook1/send' #这里的webhook1,根据我们在钉钉告警插件配置文件中targets中指定的值做修改
send_resolved: true
3.重新加载 alertmanager 参数
curl -lv -X POST http://localhost:9093/-/reload
4.测试告警
1) 查看是否生成告警
http://119.8.238.94:9090/alerts?search=
2) 查看告警是否通过指定渠道发送
#告警通过钉钉发送成功(prometheus 的告警规则中指定了默认告警方式为邮件告警,但是如果告警名称符合Mysql开头的告警,或者告警名称为Memory Usage则通过dingding.webhook1渠道发送)
告警发送
告警恢复