Adding DingTalk Alerting to Prometheus

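This document describes how to configure and deploy Alertmanager with Prometheus so that alerts are sent to DingTalk. First, edit alertmanager.yml and alert_rule.yml to set up alert routing and alert rules. Then update prometheus.yml with the related configuration, and finally restart the Prometheus service to apply the changes.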

Configuration

# Deploy the DingTalk webhook; set $dingding_token to your DingTalk robot's access token first
docker run -d -p 8060:8060 --name webhook timonwong/prometheus-webhook-dingtalk --ding.profile="webhook1=https://oapi.dingtalk.com/robot/send?access_token=$dingding_token"
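
Before wiring in Alertmanager, it can help to confirm that the webhook container forwards messages at all. The following is a minimal sketch, not part of the original setup: it assumes the container is reachable at localhost:8060, that curl is installed, and that the service accepts a payload in Alertmanager's webhook format. The `TestAlert` name and all payload values are made up for illustration.

```shell
#!/bin/sh
# Assumption: the webhook container started above is reachable on localhost:8060.
WEBHOOK_URL="${WEBHOOK_URL:-http://localhost:8060/dingtalk/webhook1/send}"

# Print a minimal Alertmanager-style webhook payload (hypothetical test alert).
build_test_payload() {
  cat <<'EOF'
{
  "version": "4",
  "groupKey": "{}:{alertname=\"TestAlert\"}",
  "status": "firing",
  "receiver": "stos_ops",
  "groupLabels": {"alertname": "TestAlert"},
  "commonLabels": {"alertname": "TestAlert", "severity": "warning"},
  "commonAnnotations": {"summary": "manual webhook test"},
  "externalURL": "http://localhost:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": {"alertname": "TestAlert", "severity": "warning"},
      "annotations": {"summary": "manual webhook test"},
      "startsAt": "2024-01-01T00:00:00Z",
      "endsAt": "0001-01-01T00:00:00Z",
      "generatorURL": "http://localhost:9090"
    }
  ]
}
EOF
}

# POST the test payload to the webhook; curl prints the service's response.
send_test_alert() {
  build_test_payload | curl -sS -X POST \
    -H 'Content-Type: application/json' \
    --data-binary @- "$WEBHOOK_URL"
}
```

After sourcing the script, running `send_test_alert` should make a test message appear in the DingTalk group if the token is valid.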

Edit ops/alert/alertmanager.yml

global:
  resolve_timeout: 5m
route:
  receiver: stos_ops
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 5h
  group_by: [alertname]
  routes:
  - receiver: stos_ops
    group_wait: 30s
receivers:
- name: stos_ops
  webhook_configs:
  - url: http://XXX:8060/dingtalk/webhook1/send
    send_resolved: true
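
The configuration above can be checked for syntax errors before Alertmanager is restarted. This sketch assumes that `amtool`, the command-line tool shipped with Alertmanager, is installed and on the PATH. The wrapper function name is hypothetical.

```shell
#!/bin/sh
# Assumption: amtool (shipped with Alertmanager) is installed and on PATH.
ALERTMANAGER_CONFIG="${ALERTMANAGER_CONFIG:-ops/alert/alertmanager.yml}"

# Validate an Alertmanager config file; defaults to the path used in this document.
check_alertmanager_config() {
  config_file="${1:-$ALERTMANAGER_CONFIG}"
  if ! command -v amtool >/dev/null 2>&1; then
    echo "amtool not found; cannot validate $config_file" >&2
    return 127
  fi
  # amtool check-config parses the file and exits non-zero on errors.
  amtool check-config "$config_file"
}
```

Running `check_alertmanager_config` from the directory that contains `ops/` reports whether the file parses and exits non-zero if it does not.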

Edit the alert rules in alert/alert_rule.yml

groups:
- name: hs_f0158133_alert_group
  rules:
  - alert: CPU_负载告警  # CPU load alert
    expr: node_load1{job="worker_seal_system_monitor"} > 80
    for: 30m
    labels:
      severity: "warning"
    annotations:
      summary: "{{ $labels.instance }}: CPU load too high"
      description: "Host {{ $labels.instance }} has a CPU load above 80 cores (1-minute load average), current value is {{ $value }}"
  - alert: 设备挂机告警  # host down alert
    expr: up{job=~"worker_seal_system_monitor|worker_store_system_monitor|miner_system_monitor|lotus_system_monitor"} == 0
    for: 1m
    labels:
      severity: "critical"
    annotations:
      summary: "{{ $labels.instance }} is down"

  - alert: /data1使用率告警  # /data1 disk usage alert
    expr: ceil((node_filesystem_size_bytes{mountpoint =~"/rootfs/data1",job!="worker_store_system_monitor"}-node_filesystem_free_bytes{mountpoint 
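
Once alert/alert_rule.yml is complete and prometheus.yml references it, both files can be validated before the Prometheus service is restarted. This sketch assumes `promtool`, which ships with Prometheus, is on the PATH. The `prometheus.yml` location and the function name are assumptions, since the source does not give them.

```shell
#!/bin/sh
# Assumption: promtool (shipped with Prometheus) is installed and on PATH.

# Validate the rule file and the main Prometheus config before restarting.
# $1: rule file (default alert/alert_rule.yml)
# $2: Prometheus config (default prometheus.yml, an assumed location)
check_prometheus_files() {
  rule_file="${1:-alert/alert_rule.yml}"
  prom_config="${2:-prometheus.yml}"
  # Check rule syntax and PromQL expressions first.
  promtool check rules "$rule_file" || return 1
  # Then check the main config, including the rule files it references.
  promtool check config "$prom_config" || return 1
  echo "OK: $rule_file and $prom_config passed promtool checks"
}
```

If either check fails, the function returns a non-zero status and the restart can be skipped until the file is fixed.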