Download elasticsearch_exporter
Download the binary package with wget, then unpack it and run it:
tar -xvf elasticsearch_exporter-1.3.0.linux-amd64.tar.gz
mv elasticsearch_exporter-1.3.0.linux-amd64/ elasticsearch_exporter
cd elasticsearch_exporter
Run elasticsearch_exporter:
nohup ./elasticsearch_exporter --es.all --es.indices --es.cluster_settings --es.indices_settings --es.shards --es.snapshots --es.timeout=10s --web.listen-address=":9114" --web.telemetry-path="/metrics" --es.uri http://192.168.11.139:9200 &
Check the output log:
vim nohup.out
View the metrics by opening this URL in a browser:
http://192.168.11.139:9114/metrics
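The same check can be done from a terminal. This is a minimal sketch, assuming curl and awk are available. The helper name metric_value is made up for this note, and it assumes the exporter writes no trailing timestamp, so the value is the last field on each sample line:

```shell
#!/bin/sh
# metric_value NAME: read Prometheus text-format metrics on stdin and
# print the value of the first sample whose metric name equals NAME.
# (Hypothetical helper, not part of elasticsearch_exporter.)
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                  # skip "# HELP" and "# TYPE" lines
        {
            metric = $1
            sub(/[{].*/, "", metric)   # strip the {label="..."} part, if any
            if (metric == name) { print $NF; found = 1; exit }
        }
        END { exit found ? 0 : 1 }     # non-zero status when NAME is absent
    '
}

# Example (needs network access to the exporter started above):
#   curl -s http://192.168.11.139:9114/metrics | metric_value elasticsearch_cluster_health_up
```

A printed value of 1 for elasticsearch_cluster_health_up would mean the exporter's last scrape of ES succeeded.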
Once the exporter is running, add a job for it under scrape_configs in the Prometheus configuration:
  - job_name: "es"
    static_configs:
      - targets: ["192.168.11.139:9114"]
Open http://192.168.11.141:9090/targets and check the target state. If it shows UP, the scrape configuration is working.
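The target state can also be checked through the Prometheus HTTP API by querying the built-in up metric for the job. This is a rough sketch, assuming curl is available and relying on the compact JSON that Prometheus returns. The function name target_is_up is made up here, and a JSON-aware tool such as jq would be more robust than grep:

```shell
#!/bin/sh
# target_is_up: read the JSON reply of an instant query for up{job="es"}
# on stdin; succeed if any returned sample has the value "1".
# (Hypothetical helper; it pattern-matches the JSON instead of parsing it.)
target_is_up() {
    grep -q '"value":\[[0-9.]*,"1"]'
}

# Example (needs network access to Prometheus):
#   curl -s -G http://192.168.11.141:9090/api/v1/query \
#        --data-urlencode 'query=up{job="es"}' | target_is_up && echo "es target is UP"
```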
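Once that works, download an ES monitoring dashboard template from the Grafana website (Dashboards | Grafana Labs).
After downloading it, open the Grafana web UI at 192.168.221.25:3000 and import the template.
Setup is complete.
To use this together with alertmanager, add alerting rules such as the following: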
# Fires when the ES service goes down
- alert: ES status
  expr: elasticsearch_cluster_health_up == 0
  for: 10s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} ES service"
    description: "{{ $labels.instance }} ES service is unavailable, please check"
- alert: ES cluster health status
  expr: elasticsearch_cluster_health_status{color="red"} == 1
  for: 10s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} ES cluster health status"
    description: "{{ $labels.instance }} ES cluster health status is red, please check"
systemctl restart prometheus
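A rule file with a syntax error can keep Prometheus from starting, so it may be worth validating the file before the restart. This is a minimal sketch using promtool check rules (promtool ships with Prometheus). The function name, the PROMTOOL and RESTART_CMD override variables, and the rule file path in the usage line are assumptions for illustration:

```shell
#!/bin/sh
# check_and_restart FILE: validate FILE with promtool and restart
# Prometheus only if validation succeeds.
# PROMTOOL and RESTART_CMD are hypothetical overrides, handy for dry runs.
check_and_restart() {
    rules_file="$1"
    if "${PROMTOOL:-promtool}" check rules "$rules_file"; then
        ${RESTART_CMD:-systemctl restart prometheus}
    else
        echo "rule file $rules_file failed validation; not restarting" >&2
        return 1
    fi
}

# Example (the path is a placeholder for wherever the rules above are saved):
#   check_and_restart /path/to/es_rules.yml
```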
After the rules are loaded, stop the ES service and check that the alert fires as expected.