普罗米修斯-集成钉钉推送

杰夫贾维斯

已于 2022-08-09 09:17:32 修改

阅读量709

点赞数

分类专栏：服务监控文章标签：大数据

于 2022-07-14 18:37:55 首次发布

本文链接：https://blog.csdn.net/qq_34936628/article/details/125790444

版权

服务监控专栏收录该内容

2 篇文章 0 订阅

订阅专栏

安装并且启动alertmanager

alertmanager.yml

global:
    resolve_timeout: 5m
route:
    group_by: ['alertname']
    group_wait: 30s
    group_interval: 1m
    repeat_interval: 2m
    receiver: webhook1
receivers:
  - name: 'webhook1'
    webhook_configs:
      - url: 'http://10.31.18.157:8060/dingtalk/webhook1/send'
        send_resolved: true     # 表示服务恢复后会收到恢复告警
#    当已经发送的告警通知匹配到target_match和target_match_re规则，当有新的告警规则如果满足source_match或者定义的匹配规则，并且已发送的告警与新产生的告警中equal定义的标签完全相同，则启动抑制机制，新的告警不会发送。

docker run -d -p 9093:9093 --name my_alertmanager  \
-v /opt/prometheus/alertmanager.yml:/etc/alertmanager/alertmanager.yml \
-v /etc/localtime:/etc/localtime:ro \
 prom/alertmanager

安装钉钉插件

#可以选择两个版本  最新版 和1.40版本  两种启动方式有点不太一样
docker pull timonwong/prometheus-webhook-dingtalk
docker pull timonwong/prometheus-webhook-dingtalk:v1.4.0

对新版配置钉钉规则模板 config.yml 1.4不需要

## Request timeout
# timeout: 5s

## Customizable templates path
#    此处进行告警模板的指定，不要时使用 --template.file= 进行指定，否则会报错，只会识别tempaltefile 而不去识别指定dingdingwebhook的 配置文件
# templates:
#    - /usr/local/prometheus-webhook-dingtalk/template/*.tmp
targets:
    webhook1:
        url: https://oapi.dingtalk.com/robot/send?access_token=f50b6f2f31f6192fcfa0f27ddceaac07f1b7b5d27c3289cf04301014fa1b386d
        # secret for signature
        secret: SEC5691a8d16e0fc926f81f0348d4bd4ce0cce5c207e1b71cadca79598b3905018c

启动钉钉插件

新版本将 tempaltefile 指定集成到了配置文件中，只需要在配置文件中指定template file 的位置就行了


docker run -d --restart always -p 8060:8060  \
--name webhook-dingding  \
-v /opt/prometheus/config.yml:/etc/prometheus/config.yml  \
-v /etc/localtime:/etc/localtime  \
 timonwong/prometheus-webhook-dingtalk

1.4版本启动

docker run -d -p 8060:8060 \
--name webhook1  timonwong/prometheus-webhook-dingtalk:v1.4.0 \
--ding.profile="webhook1=https://oapi.dingtalk.com/robot/send?access_token=f50b6f2f31f6192fcfa0f27ddceaac07f1b7b5d27c3289cf04301014fa1b386d"

配置推送规则 :rule_fileshttps://download.csdn.net/download/qq_34936628/86393710

rules:注意使用时修改是自己需要监控的job_name

prometheus 配置报警和识别抓取对象实例

# my global config
global:
  # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  # 将抓取间隔设置为每 15 秒。 默认为每 1 分钟。
  scrape_interval: 15s
  # Evaluate rules every 15 seconds. The default is every 1 minute.
  # 每 15 秒评估一次规则。 默认值为每 1 分钟。
  evaluation_interval: 15s
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
# 警报管理器配置
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["10.**.**.***:9093"]
#           - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
# 加载规则一次并根据全局“evaluation_interval”定期评估它们。
rule_files:
  - "rules/*.yml"

  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# 一种抓取配置，只包含一个要抓取的端点
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]
       # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  
  - job_name: 'cadvisor'

    static_configs:
      - targets: ['10.**.**.***:8081']
      # scheme defaults to 'http'.

  - job_name: 'prometheus-example'
    scrape_interval: 5s
    # 抓取的端点 这个是springboot项目集成监控的默认请求路径
    metrics_path: '/actuator/prometheus'
    static_configs:
       # 目标机器，数组，也就是说支持集群拉取
      - targets: [ '10.**.**.***:8082' ]

启动prometheus容器

#配置规则路径  用-v 实现统一配置管理
docker run -itd --name prometheus -p 9090:9090 \
-v /opt/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
-v /opt/prometheus/rules/:/etc/prometheus/rules/ \
-v /etc/localtime:/etc/localtime:ro \
prom/prometheus