I. Grafana 9 alerting setup:
1. Open the alert message template page
2. Configure the Grafana message template
Template name: API_msg_tpl #any name works
{{ define "myalert" }}
**Alert time:** {{ .StartsAt.Format "2006-01-02 15:04:05 " }}
{{ if gt (len .Labels) 0 }}**API name:** {{.Labels.alertname}}{{ range .Labels.SortedPairs }}{{ if and (ne (.Name) "alertname") (ne (.Name) "severity") (ne (.Name) "grafana_folder")}}**{{ .Name }}:** {{ .Value }}{{ end }}{{ end }}{{ end }}
{{ if gt (len .Annotations) 0 }}{{ range .Annotations.SortedPairs }}
**{{ .Name }}:** {{ .Value }}{{ end }}{{ end }}
{{ if gt (len .DashboardURL ) 0 }}**[Alert chart]({{ .DashboardURL }})**{{ end }}{{ end }}
{{ define "mymessage" }}
{{ if gt (len .Alerts.Firing) 0 }}# <font color="warning">Alert firing</font>{{ range .Alerts.Firing }}{{ template "myalert" .}}
-------{{ end }}{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}## <font color="info">Resolved</font>{{ range .Alerts.Resolved }}{{ template "myalert" .}}
**Resolved at:** {{ .EndsAt.Format "2006-01-02 15:04:05" }}
-------{{ end }}{{ end }}{{ end }}
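3. Configure the contact point
#Create a new contact point
#For WeCom (企业微信) alerts, choose the "WeCom" type
Name: WeCom
Webhook URL: https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxxxxxxxxxxxxxxx #create a new group robot in WeCom to get this URL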
Message: {{ template "mymessage" . }}
Title: API alert, please check! (production)
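Before wiring the contact point into a policy, it can help to confirm that the webhook URL itself works. The sketch below builds the JSON body for a WeCom group-robot markdown message and posts it. `WECOM_WEBHOOK_URL` is a hypothetical environment variable introduced for this example, and nothing is sent unless it is set. This is a manual test only, not what Grafana does internally.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// markdownMessage follows the WeCom group-robot "markdown" message format.
type markdownMessage struct {
	MsgType  string `json:"msgtype"`
	Markdown struct {
		Content string `json:"content"`
	} `json:"markdown"`
}

// buildPayload wraps markdown text in the JSON body the webhook expects.
func buildPayload(content string) ([]byte, error) {
	var msg markdownMessage
	msg.MsgType = "markdown"
	msg.Markdown.Content = content
	return json.Marshal(msg)
}

func main() {
	payload, err := buildPayload("**Alert time:** 2023-01-01 08:00:00")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))

	// WECOM_WEBHOOK_URL is an assumption of this sketch; without it, stop here.
	url := os.Getenv("WECOM_WEBHOOK_URL")
	if url == "" {
		return
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("HTTP status:", resp.Status)
}
```

If the test message shows up in the group chat, the same URL should be safe to paste into the contact point.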
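4. Configure Notification policies
5. Create an alert rule (it does not have to be linked to a dashboard panel)
#Fill in the query and the trigger condition
Query used for testing: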
SELECT toStartOfInterval(timestamp, INTERVAL 60 second) as time, 100 from access_smartgate.access_smartgate_local where $__timeFilter(timestamp) GROUP BY time ORDER by time
#Add annotations manually
Rule name: API transcoding endpoint
Alert level: Critical
Alert message: API success rate is below 90%
API path: /ebus/test/login
Current success rate: {{ with $values }}{{ range $k, $v := . }}{{ $v }}{{ end }}{{ end }}
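One thing to watch with the range-over-`$values` annotation: it prints every value with no separator, so a rule with more than one query or expression produces a run-together string. The sketch below reproduces that with plain Go `text/template`. It assumes `$values` behaves like a map keyed by RefID. The `float64` values and the RefIDs `B` and `C` are stand-ins for illustration; Grafana's real value type is richer.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// rangeTpl mirrors the annotation: every value is printed with no separator.
const rangeTpl = `{{ with $values }}{{ range $k, $v := . }}{{ $v }}{{ end }}{{ end }}`

// labelledTpl prints each RefID next to its value so they stay distinguishable.
const labelledTpl = `{{ range $k, $v := $values }}{{ $k }}={{ $v }} {{ end }}`

// renderValues executes tpl with $values bound to the given map.
// In Grafana $values is predefined; here it has to be declared explicitly.
func renderValues(tpl string, values map[string]float64) string {
	full := `{{ $values := .Values }}` + tpl
	t := template.Must(template.New("annotation").Parse(full))
	var sb strings.Builder
	data := struct{ Values map[string]float64 }{values}
	if err := t.Execute(&sb, data); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	values := map[string]float64{"B": 95.5, "C": 1}
	fmt.Println(renderValues(rangeTpl, values))    // values run together
	fmt.Println(renderValues(labelledTpl, values)) // one RefID=value pair each
}
```

Go templates visit map keys in sorted order, so the output order is stable, but with several RefIDs the unlabelled form is hard to read.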
#The following message arrives in WeCom
------- divider -------
Template tips
#How to read ValueString
{{ if gt (len .ValueString) 0 }}
**Alert details:**
{{ .ValueString }}{{ end }}
#Read every label under Labels, skipping the ones you name
{{ if gt (len .Labels) 0 }}
**Host labels:** {{ range .Labels.SortedPairs }}{{ if ne .Name "alertname" }}
{{ .Name }}: {{ .Value }}{{ end }}{{ end }}{{ end }}
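The label filter can be tried outside Grafana with Go's `text/template`. In the sketch below, the `labels` and `pair` types are hypothetical stand-ins for Grafana's label set, and `SortedPairs` simply sorts by name, which is a simplification of the real method. The sample label values are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"text/template"
)

// pair and labels are hypothetical stand-ins for Grafana's label types.
type pair struct{ Name, Value string }

type labels map[string]string

// SortedPairs returns the labels ordered by name (simplified for this demo).
func (l labels) SortedPairs() []pair {
	names := make([]string, 0, len(l))
	for name := range l {
		names = append(names, name)
	}
	sort.Strings(names)
	pairs := make([]pair, 0, len(l))
	for _, name := range names {
		pairs = append(pairs, pair{Name: name, Value: l[name]})
	}
	return pairs
}

// labelTpl prints every label except alertname and grafana_folder, one per line.
const labelTpl = `{{ range .Labels.SortedPairs }}{{ if and (ne .Name "alertname") (ne .Name "grafana_folder") }}{{ .Name }}: {{ .Value }}
{{ end }}{{ end }}`

// renderLabels executes labelTpl against the given label set.
func renderLabels(l labels) string {
	t := template.Must(template.New("labels").Parse(labelTpl))
	var sb strings.Builder
	data := struct{ Labels labels }{l}
	if err := t.Execute(&sb, data); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	fmt.Print(renderLabels(labels{
		"alertname":      "API transcoding endpoint",
		"grafana_folder": "prod",
		"host":           "gw-01",
		"env":            "prod",
	}))
}
```

To exclude more labels, add another `(ne .Name "...")` clause inside the `and`.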
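#Time formatting: the expression below adds 8 hours (28800e9 ns) to the timestamp
**Alert time:** {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05 " }}
#Unshifted time, as stored in the alert
**Alert time:** {{ .StartsAt.Format "2006-01-02 15:04:05 " }}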
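Why `28800e9` works: `Add` takes a `time.Duration`, which counts nanoseconds, and 28800 s × 1e9 ns is 8 hours. The sketch below checks both forms with plain Go `text/template`. The sample timestamp and the struct that exposes only `StartsAt` are assumptions made for the demo.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
	"time"
)

// sampleStart is an arbitrary UTC timestamp used only for this demo.
var sampleStart = time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)

// renderTime executes a one-line template against a hypothetical alert
// that exposes only a StartsAt field.
func renderTime(tpl string) string {
	t := template.Must(template.New("time").Parse(tpl))
	var sb strings.Builder
	data := struct{ StartsAt time.Time }{sampleStart}
	if err := t.Execute(&sb, data); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	// 28800e9 nanoseconds is exactly 8 hours.
	fmt.Println(time.Duration(28800e9) == 8*time.Hour)
	fmt.Println(renderTime(`{{ .StartsAt.Format "2006-01-02 15:04:05" }}`))
	fmt.Println(renderTime(`{{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}`))
}
```

The fixed offset only gives local time if the alert timestamp is in UTC and the target zone is UTC+8 with no daylight saving.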
#Extract the value of B for use in annotations; this worked before but still needs re-verification
{{ $values.B }}
Note: query the window from 2 minutes ago to 1 minute ago (local time)
SELECT toStartOfInterval(timestamp, INTERVAL 60 second) as time,
round(sum(if(statusCode >= 0 and statusCode <= 600, 1, 0)) / count(1) *100, 5) as `fail_percent`
from access_smartgate.access_smartgate_local
where (timestamp >= DATE_SUB(NOW(),INTERVAL 2 MINUTE) AND timestamp <= DATE_SUB(NOW(),INTERVAL 1 MINUTE)) AND fieldType='apigate' AND orgPathName='/xxxx/xxx//xxx'
GROUP BY time ORDER by time
Note: query the window from 2 minutes ago to 1 minute ago, here by shifting the $__fromTime / $__toTime range back 60 seconds
SELECT toStartOfInterval(timestamp + 60 , INTERVAL 60 second) as time,
count(1) as total,
sum(if(statusCode >= 0 and statusCode <= 400, 1, 0)) as fail,
round(fail / total *100, 5) as `fail_percent`
from access_smartgate.access_smartgate_local
where (timestamp >= toDateTime($__fromTime - 60)) AND (timestamp < toDateTime($__toTime - 60)) AND ( 1 = '1' )
GROUP BY time ORDER by time