Single-node Prometheus

Lori_jishumeng123

Original article: https://blog.csdn.net/Lori_jishumeng123/article/details/84381424

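Single-node setup:

Get familiar with the Zabbix server setup process.
Scraped data is stored in ./data by default, and by default every 2h of data is stored as one block; see https://www.ctolib.com/docs/sfile/prometheus-book/ha/prometheus-local-storage.html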
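To make the 2h block idea concrete, here is a minimal Python sketch that works out which block a sample timestamp would fall into. It assumes blocks are aligned to 2-hour boundaries counted from the Unix epoch in UTC; the function name block_range is made up for this illustration and is not part of Prometheus.

```python
from datetime import datetime, timezone
from typing import Tuple

# Assumption: one block covers 2 hours, aligned to the Unix epoch (UTC).
BLOCK_SECONDS = 2 * 60 * 60


def block_range(ts: float) -> Tuple[int, int]:
    """Return (start, end) in Unix seconds of the 2h window containing ts."""
    start = int(ts) // BLOCK_SECONDS * BLOCK_SECONDS
    return start, start + BLOCK_SECONDS


if __name__ == "__main__":
    sample = datetime(2018, 10, 19, 8, 42, 36, tzinfo=timezone.utc).timestamp()
    start, end = block_range(sample)
    print(datetime.fromtimestamp(start, timezone.utc),
          datetime.fromtimestamp(end, timezone.utc))
```

Under that assumption, a sample scraped at 08:42:36 UTC lands in the window from 08:00 to 10:00 UTC.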
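How do alerting rules take effect? How do you work out what is wrong with the current alert configuration?
Reasons the alerts did not take effect, and the key configuration points:
- Everything written in the rules file (the annotations) is available to the notification that gets sent. The link configured for the Slack message lets the recipients click straight through from the alert in Slack to where the alert fired, or to any other page you want them to see.
- The username in the rule file can be written in Chinese.
- In the Slack section of alertmanager.yml, api_url does not need quotes, but channel must be specified; otherwise an error like the following is reported: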

level=error ts=2018-10-19T08:42:36.63691218Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="cancelling noretry for \"slack\" due to unrecoverable error: unexpected status code 404"
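One way to narrow down a 404 like this is to post a test message to the webhook directly, outside Alertmanager. The following is a rough Python sketch using only the standard library. The function names and the default username are invented for the example, and it assumes a legacy incoming webhook that accepts the channel and username overrides; newer Slack app webhooks may ignore them.

```python
import json
import urllib.error
import urllib.request


def build_payload(channel, text, username="alert-test"):
    """Return the JSON body for a Slack incoming-webhook test message."""
    body = {"channel": channel, "username": username, "text": text}
    # ensure_ascii=False keeps non-ASCII names (e.g. Chinese) readable.
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


def post_test_message(webhook_url, channel, text):
    """POST a test message and return the HTTP status code Slack answers with."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(channel, text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as the 404 in the log above) end up here.
        return err.code
```

Calling post_test_message with the api_url and channel from alertmanager.yml should return 200 if both are accepted. If it returns 404, the webhook URL or the channel name is the likely culprit rather than the alert rule.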

Example:

# prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets: ["localhost:9093"]

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
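  - "/usr/local/prometheus-2.4.3/rules/test.yml"  # use an absolute path, or a path relative to the directory that contains prometheus.yml; relative paths are resolved against that directory, not the working directory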

scrape_configs:

  - job_name: 'prometheus'
    static_configs:
      - targets: ['127.0.0.1:9090']
        labels:
          instance: localhost

  - job_name: 'linux'
    static_configs:
      - targets: ['127.0.0.1:9100']
        labels:
          instance: node1

      - targets: ['172.18.2.28:9090']
        labels:
          instance: node2

      - targets: ['172.18.2.28:1234']
        labels:
          instance: node3



# rules/test.yml 
groups:
- name: test
  rules:

  - alert: InstanceDown
    expr: up == 0
    for: 1m
    labels:
      severity: page
    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
      summary: 'Instance {{ $labels.instance }} down'
      link: 'http://172.18.2.27:9090/alerts'
      color: "#D00000"  # color shown in the notification; #D00000 is red
      username: "刘蓉"


# alertmanager.yml

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: 'lori_liurong@163.com'
  smtp_auth_username: 'lori_liurong@163.com'
  smtp_auth_password: 'liurong199686'
  smtp_require_tls: false

route:
  group_by: ['ip','id','type']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 2h  # how long to wait before re-sending an alert that was already sent successfully
  receiver: 'liurong'

receivers:
  - name: 'liurong'
    email_configs:
      - to: 'lori_liurong@163.com'
        headers: { Subject: "[WARN] alert email test" }

    slack_configs:
      - send_resolved: true
        api_url: https://hooks.slack.com/services/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  # Slack webhook URL; each channel has its own webhook URL
        channel: "#test-alermanager"
        text: "{{ range .Alerts }} {{ .Annotations.description}}\n {{end}} @{{ .CommonAnnotations.username}} <{{.CommonAnnotations.link}}| click here>"
        title: "{{.CommonAnnotations.summary}}"
        title_link: "{{.CommonAnnotations.link}}"
        color: "{{.CommonAnnotations.color}}"

When the alerting rules are evaluated, any alerts that currently have a problem show up. For a detailed explanation, see http://blog.51cto.com/xujpxm/2055970

Log output
where can I find prometheus logs?
https://github.com/prometheus/prometheus/issues/2363

Start Prometheus from a script, and specify the log output path in that script.
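A start script of this kind could look roughly like the Python sketch below. A shell script with output redirection would do the same job. The helper names are invented for the example, and any paths passed in are assumptions about the install layout. It relies on Prometheus writing its log lines to stderr and on the --config.file and --storage.tsdb.path flags of Prometheus 2.x.

```python
import subprocess


def build_command(binary, config_file, data_dir):
    """Assemble the Prometheus command line (flags as in Prometheus 2.x)."""
    return [
        binary,
        "--config.file=" + config_file,
        "--storage.tsdb.path=" + data_dir,
    ]


def start_prometheus(binary, config_file, data_dir, log_path):
    """Start Prometheus in the background, appending its output to log_path."""
    log_file = open(log_path, "ab")
    # Prometheus logs to stderr, so stderr is the stream that matters here;
    # stdout is sent to the same file.
    return subprocess.Popen(
        build_command(binary, config_file, data_dir),
        stdout=log_file,
        stderr=subprocess.STDOUT,
    )
```

For instance, start_prometheus("/usr/local/prometheus-2.4.3/prometheus", "/usr/local/prometheus-2.4.3/prometheus.yml", "/usr/local/prometheus-2.4.3/data", "/var/log/prometheus.log") would append the logs to /var/log/prometheus.log; all four paths are hypothetical and should be adjusted to the actual installation.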
