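I. Architecture diagram
II. Procedure
Environment setup:
Start consul locally (default port 8500) and start Prometheus locally (default port 9090).
1. Prepare the Prometheus configuration file content in consul
In the consul KV store, create a key named prometheus/my_prometheus_yaml and set its value to the following: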
## Example value for testing, deliberately made a little more complex
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
# Modified version of the configuration file, read from consul. 1
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - localhost:9093
          # - alertmanager:9093
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "rule/*.yml"
  # - "second_rules.yml"
remote_write:
  - url: "http://localhost:9090/api/v1/write"
# remote_read:
#   - url: http://localhost:9090/api/v1/read
#     read_recent: true
#     remote_timeout: 30s
#     basic_auth:
#       username: xx
#       password: xx
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "hqmjob"
    static_configs:
      - targets: ["localhost:8080"]
# Alerting rule example, kept commented out (rule groups belong in a file listed under rule_files, not in prometheus.yml):
# groups:
#   # Name of the alert group
#   - name: DirectMemoryAlert
#     # Rules in the alert group
#     rules:
#       # Alert name, must be unique
#       - alert: DirectMemoryAlert
#         # PromQL expression
#         expr: gateway_direct_memory > 5
#         # The alert fires only after the expression has held for longer than the `for` duration
#         for: 1m
#         labels:
#           # Severity level
#           severity: page
#         annotations:
#           # Title of the alert that is sent
#           summary: "Instance {{ $labels.instance }} direct memory usage exceeds 5M"
#           # Body of the alert that is sent
#           description: "Instance {{ $labels.instance }} direct memory usage exceeds 5M (current value: {{ $value }}M)"
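The key can be created in the consul web UI, but for repeatable setups it may be handier to write it over consul's KV HTTP API (PUT /v1/kv/<key>). The following is a minimal sketch under stated assumptions: consul listens on http://localhost:8500 without ACLs, the YAML above has been saved to a local file named prometheus.yml, and the helper names kv_url and put_config are made up for illustration.

```python
import urllib.parse
import urllib.request

CONSUL_ADDR = "http://localhost:8500"     # assumption: local consul agent, no ACL token
KV_KEY = "prometheus/my_prometheus_yaml"  # key name used in this article


def kv_url(key, base=CONSUL_ADDR):
    """Build the KV endpoint URL for a key; slashes in the key are kept as path separators."""
    return f"{base}/v1/kv/{urllib.parse.quote(key, safe='/')}"


def put_config(yaml_text, key=KV_KEY, base=CONSUL_ADDR):
    """PUT the YAML text as the value of the key. consul answers with the body 'true' on success."""
    req = urllib.request.Request(
        kv_url(key, base), data=yaml_text.encode("utf-8"), method="PUT"
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().strip() == b"true"


# Example call (needs a running consul agent, so it is left commented out):
# with open("prometheus.yml", encoding="utf-8") as f:
#     print(put_config(f.read()))
```

If the consul CLI is installed, `consul kv put prometheus/my_prometheus_yaml @prometheus.yml` should achieve the same result.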
2. Configure the ctmpl template and start consul-template
1) Create the ctmpl template file
# In the consul-template directory, create a file named prometheus.yml.ctmpl with the following content:
{{ key "prometheus/my_prometheus_yaml" }}
2) Start consul-template and point it at the template
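## consul-template is started here from a docker image.
## To run it directly on Linux or Windows, pass the same -consul-addr and -template arguments to the consul-template binary.
## A comment placed between backslash-continued lines breaks the command, so the notes are collected here:
##   - the second -v mounts the Prometheus configuration directory into the consul-template container
##   - consul-template:0.19.4-alpine is the image used here; substitute the image you need
##   - -template takes source:destination:command. The ctmpl file from the previous step watches the consul KV,
##     the rendered result replaces /etc/prometheus/prometheus.yml, and the command then asks Prometheus to reload
##   - inside the container, localhost is the container itself; unless the container shares the host network,
##     replace localhost with addresses reachable from the container. The reload endpoint also requires
##     Prometheus to be started with --web.enable-lifecycle, and curl must exist in the image
docker run -d \
  --restart=always \
  -u root \
  --privileged=true \
  --name consul-template \
  -v /data/consul-template/region-recording-templates:/etc/consul-template:rw \
  -v /data/prometheus/conf:/etc/prometheus:rw \
  consul-template:0.19.4-alpine \
  -consul-addr localhost:8500 \
  -log-level="info" \
  -template /etc/consul-template/prometheus.yml.ctmpl:/etc/prometheus/prometheus.yml:'curl -XPOST http://localhost:9090/-/reload'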
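The command part of -template only helps if the reload request actually succeeds, so it can be worth sending the same POST by hand once. The sketch below assumes Prometheus at http://localhost:9090; the function names are hypothetical, and the meaning attached to status codes 403 and 500 reflects commonly observed Prometheus behaviour rather than a guarantee for every version.

```python
import urllib.error
import urllib.request

PROMETHEUS_ADDR = "http://localhost:9090"  # assumption: local Prometheus


def interpret_reload_status(status):
    """Map the HTTP status of POST /-/reload to a human-readable hint."""
    if status == 200:
        return "reloaded"
    if status == 403:
        # Usually seen when the lifecycle API is switched off.
        return "lifecycle API disabled? start Prometheus with --web.enable-lifecycle"
    if status == 500:
        # Usually seen when the new configuration could not be loaded.
        return "reload failed, check the rendered prometheus.yml for errors"
    return f"unexpected HTTP status {status}"


def reload_prometheus(base=PROMETHEUS_ADDR):
    """Send the same request as the curl command in -template and explain the outcome."""
    req = urllib.request.Request(f"{base}/-/reload", data=b"", method="POST")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return interpret_reload_status(resp.status)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still informative.
        return interpret_reload_status(err.code)


# Example call (needs a running Prometheus, so it is left commented out):
# print(reload_prometheus())
```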
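Check the consul-template startup log (for example with docker logs consul-template) to confirm that it started and rendered the template.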
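3. Verification
Modify the value under prometheus/my_prometheus_yaml in consul and observe that the contents of prometheus.yml change accordingly.
4. Extension
In the same way, you can add further ctmpl template files at any time, let consul-template watch the relevant KV entries, and use -template to dynamically update the alerting rule files or recording rule files under the Prometheus rules directory. For example, the following template renders every key under prometheus/rules/myrecordings into one recording rule group: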
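The check can also be scripted. The sketch below compares the raw KV value (GET /v1/kv/<key>?raw) with the rendered file, and then asks Prometheus for its loaded configuration (GET /api/v1/status/config) to see whether a given job is present. The host path /data/prometheus/conf/prometheus.yml comes from the docker mount above; the default addresses and all function names are assumptions for illustration. Prometheus returns a normalized copy of its configuration, so only a substring check is made against it.

```python
import json
import urllib.parse
import urllib.request


def consul_raw_url(key, base="http://localhost:8500"):
    """URL that returns the raw (not base64-encoded) value of a KV key."""
    return f"{base}/v1/kv/{urllib.parse.quote(key, safe='/')}?raw"


def same_content(consul_value, rendered_text):
    """Compare KV value and rendered file, ignoring leading/trailing whitespace."""
    return consul_value.strip() == rendered_text.strip()


def loaded_yaml(status_config_body):
    """Extract the YAML string from a /api/v1/status/config response body."""
    payload = json.loads(status_config_body)
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus reported status {payload.get('status')!r}")
    return payload["data"]["yaml"]


def http_get(url):
    """Fetch a URL and decode the body as UTF-8."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def verify(rendered_path="/data/prometheus/conf/prometheus.yml",
           key="prometheus/my_prometheus_yaml",
           job="hqmjob",
           prometheus="http://localhost:9090"):
    """Return two booleans: file equals KV value, and the job name is in the loaded config."""
    consul_value = http_get(consul_raw_url(key))
    with open(rendered_path, encoding="utf-8") as f:
        rendered = f.read()
    loaded = loaded_yaml(http_get(f"{prometheus}/api/v1/status/config"))
    return {
        "file_matches_consul": same_content(consul_value, rendered),
        "job_loaded": job in loaded,
    }


# Example call (needs consul, Prometheus and the rendered file, so it is left commented out):
# print(verify())
```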
groups:{{ range ls "prometheus/rules/myrecordings" }}
- name: {{ .Key }}{{ with $d := .Value | parseJSON }}
  interval: {{ if $d.interval }}{{ $d.interval }}{{ else }}1m{{ end }}
  rules:{{ range $value := $d.rules }}
  - record: {{ $value.record }}
    expr: {{ $value.expr }}
    labels:{{ range $k, $v := $value.labels }}
      {{ $k }}: {{ $v }}{{ end }}
      {{ if not $value.labels }}job: xcustom{{ end }}
  {{ end }}{{ end }}{{ end }}
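The template expects each key under prometheus/rules/myrecordings to hold a JSON object with an optional interval and a rules list whose entries carry record, expr and optional labels. To make that shape concrete, the sketch below emulates the template in plain Python. It is not consul-template itself: the function name, the example key node_rules and the example rules are invented, and blank lines may differ from the real output.

```python
import json


def render_recording_groups(kv_pairs):
    """Emulate the ctmpl above: kv_pairs maps key name -> JSON string stored in consul."""
    lines = ["groups:"]
    for key, raw in kv_pairs.items():
        d = json.loads(raw)
        lines.append(f"- name: {key}")
        # Same fallback as the template: use 1m when no interval is given.
        lines.append(f"  interval: {d.get('interval') or '1m'}")
        lines.append("  rules:")
        for rule in d.get("rules", []):
            lines.append(f"  - record: {rule['record']}")
            lines.append(f"    expr: {rule['expr']}")
            lines.append("    labels:")
            # Same fallback as the template: a default job label when none are given.
            labels = rule.get("labels") or {"job": "xcustom"}
            for name, value in sorted(labels.items()):  # Go templates range over maps in key order
                lines.append(f"      {name}: {value}")
    return "\n".join(lines) + "\n"


# Hypothetical KV content: one key with two recording rules.
example = {
    "node_rules": json.dumps({
        "interval": "30s",
        "rules": [
            {"record": "job:up:sum", "expr": "sum by (job) (up)", "labels": {"team": "ops"}},
            {"record": "job:up:count", "expr": "count by (job) (up)"},
        ],
    })
}
print(render_recording_groups(example))
```

Because the template writes expr and label values into the YAML unquoted, expressions that begin with a brace or contain a colon followed by a space would likely need quoting inside the stored JSON.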