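Download
prometheus, blackbox_exporter: https://prometheus.io/download/
After extracting the two archives you will find prometheus.yml (in the prometheus directory) and blackbox.yml (in the blackbox_exporter directory).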
Start
nohup ./prometheus --config.file=prometheus.yml &
nohup ./blackbox_exporter --config.file=blackbox.yml &
Configuration
vim prometheus.yml
  # Add these jobs under scrape_configs:
  - job_name: 'blackbox'              # custom job name
    scrape_interval: 10s              # probe interval (the shipped prometheus.yml sets the global value to 15s)
    metrics_path: /probe              # default is /metrics; blackbox_exporter serves probes at /probe
    params:
      module: [http_2xx]              # module defined in blackbox.yml (HTTP GET)
    static_configs:
      - targets:
          - https://baidu.com         # URL to probe
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target  # pass the target URL as the ?target= query parameter
      - source_labels: [__param_target]
        target_label: instance        # keep the probed URL as the instance label
      - target_label: __address__
        replacement: 192.168.223.146:9115  # address of the blackbox_exporter service
  - job_name: 'blackbox-post'
    scrape_interval: 30s
    metrics_path: /probe
    params:
      module: [http_post_2xx]         # this module probes with POST requests
    static_configs:
      - targets:
          - https://test.com/
        labels:
          url_name: "POST user query" # label used to tell endpoints apart
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.223.146:9115
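Alert rules are only evaluated if prometheus.yml lists the rules file. A minimal sketch, assuming the rules are stored under /prometheus/rules/ (the directory used for the rules file in the alert rules section) and that no rule_files entry exists yet:

```yaml
# Top-level key in prometheus.yml, alongside scrape_configs.
rule_files:
  # Assumption: every *.yml file in this directory is a rules file.
  - /prometheus/rules/*.yml
```

Prometheus reads this list at startup and on configuration reload, so a restart or reload is needed after changing it.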
vim blackbox.yml
Modify the http_post_2xx module so that it probes with POST requests:
  # Defined under the top-level modules: key
  http_post_2xx:
    prober: http
    timeout: 30s
    http:
      method: POST
      headers:
        Content-Type: application/json
        token: eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpYXQiOjE2Nzk4OTYzMTEsImV4cCI6MTY3OTk4MjcxMSwicGlkIjoxLCJwbmFtZSI6Iui_kOiQpeaAu-mDqCIsImVpZCI6MSwidWlkIjoxLCJyaWQiOjEsInV0eXBlIjoiMSIsIm5hbWUiOiLov5DokKXmgLvpg6giLCJjbGllbnQiOjF9.4ev-OoYsfW5DZ9xaDFMyfQw6qTbuah99yNgoo26FKMg
      body: "isnew=1"   # alternative JSON body: {"isnew":"1"}
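Since the module sends Content-Type: application/json, the endpoint may expect a JSON payload rather than the form-style string. The following is a sketch of a JSON-body variant; the module name http_post_json_2xx is hypothetical, and the token header is omitted here only for brevity:

```yaml
modules:
  http_post_json_2xx:              # hypothetical module name, not from the original config
    prober: http
    timeout: 30s
    http:
      method: POST
      headers:
        Content-Type: application/json
      # Single quotes keep the inner double quotes intact in YAML.
      body: '{"isnew":"1"}'
```

To use it, the blackbox-post job would reference this name in params.module instead of http_post_2xx.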
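Configure alert rules
The rules file below contains four alert rules. Some URL endpoints are simply slow without affecting the business, so a uniform 1-second threshold makes them alert frequently at certain times. To avoid this, use the = and =~ label matchers to give a specific endpoint its own rule, for example one that only alerts above 1.5 seconds.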
vim /prometheus/rules/http_export-alert-rules.yml
groups:
  - name: nginx-status-monitoring-alerts
    rules:
      - alert: HTTPStatusCodeCheck
        expr: probe_http_status_code{job="blackbox"} != 200
        for: 0m
        labels:
          severity: warning
          status: very severe
        annotations:
          summary: "Probed URL returned a non-200 status code"
          description: "Request to {{ $labels.instance }} returned a non-200 status code"
      - alert: CertificateExpiryCheck
        expr: probe_ssl_earliest_cert_expiry{job="blackbox"} - time() < 86400 * 30
        for: 5m
        labels:
          severity: warning
          status: warning
        annotations:
          summary: "Certificate expires in less than 30 days"
          description: "The certificate for {{ $labels.instance }} expires in less than 30 days; please renew it in time"
      - alert: PageResponseTimeCheck
        expr: probe_duration_seconds{job="blackbox"} >= 1
        for: 1m
        labels:
          severity: warning
          status: warning
        annotations:
          summary: "Page response time exceeds 1 second"
          description: "{{ $labels.instance }} page response time exceeds 1 second"
      - alert: PostPageResponseTimeCheck
        expr: probe_duration_seconds{job="blackbox-post"} >= 1
        for: 1m
        labels:
          severity: warning
          status: warning
        annotations:
          summary: "{{ $labels.instance }} page response time exceeds 1 second"
          description: "Service: {{ $labels.url_name }} response time >= 1s (current: {{ $value }})"
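The per-endpoint threshold idea described at the start of this section could be sketched as follows. This is an illustration under assumptions, not the author's actual rules: the group and alert names are hypothetical, and https://test.com/ stands in for the known-slow endpoint. The default rule excludes that endpoint with !~, and a second rule gives it a relaxed 1.5-second threshold with =~ (use = instead when a single exact URL is enough).

```yaml
groups:
  - name: per-url-latency-sketch          # hypothetical group name
    rules:
      # Default threshold for every blackbox-post target except the slow endpoint.
      - alert: PostResponseTimeDefault
        expr: probe_duration_seconds{job="blackbox-post", instance!~"https://test.com/.*"} >= 1
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} response time >= 1s (current: {{ $value }})"
      # Relaxed threshold for the endpoint that is known to be slow.
      - alert: PostResponseTimeSlowEndpoint
        expr: probe_duration_seconds{job="blackbox-post", instance=~"https://test.com/.*"} > 1.5
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} response time > 1.5s (current: {{ $value }})"
```

PromQL regex matchers are fully anchored, so the pattern must match the whole instance value; the trailing .* covers any path under the host. If this sketch were adopted, it would replace the single blackbox-post response time rule rather than sit beside it, otherwise the slow endpoint would still trigger the 1-second alert.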