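Monitoring Elasticsearch
How Elasticsearch monitoring works
Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine with a RESTful web interface. Elasticsearch is written in Java and exposes its monitoring data through API endpoints that external tools can query. The following commands retrieve monitoring metrics: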
curl "http://localhost:9200/_cluster/health"                  # cluster health
curl "http://192.168.6.21:9200/_cluster/state?pretty"         # cluster state information
curl "http://localhost:9200/_nodes/_local/stats?all=true"     # statistics of the local node
After fetching the data, we write some code that outputs it in the format Zabbix expects.
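As a rough illustration of that conversion step, the sketch below pulls a single value out of the pretty-printed health response by field name. The function name health_field and the shortened sample response are assumptions made for this example; they are not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper (not from the original script): read pretty-printed
# cluster health JSON on stdin and print the value of the field named in $1.
health_field() {
    awk -v k="$1" -F' : ' '
        BEGIN { key = "\"" k "\"" }        # match the quoted field name exactly
        {
            name = $1
            gsub(/^[ \t]+/, "", name)      # strip the indentation
            if (name == key) {
                val = $2
                gsub(/[",]/, "", val)      # strip quotes and the trailing comma
                print val
                exit
            }
        }'
}

# Shortened sample response, used here instead of a live cluster.
sample='{
  "cluster_name" : "my-elk-cluster",
  "status" : "green",
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_shards_percent_as_number" : 100.0
}'

printf '%s\n' "$sample" | health_field status    # prints: green
```

Against a live node the same function could be fed from curl, for example: curl -s 'http://localhost:9200/_cluster/health?pretty' | health_field status. Zabbix only needs the bare value on standard output, which is why the quotes and comma are removed.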
For details on the Elasticsearch monitoring metrics, see the official documentation:
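Explanation of the response
Content                                         Explanation
"cluster_name" : "my-elk-cluster",              Cluster name
"status" : "green",                             Cluster health status: "green" when healthy, "yellow" when replica shards are missing, "red" when primary shards are missing
"timed_out" : false,
"number_of_nodes" : 1,                          Number of nodes in the cluster
"number_of_data_nodes" : 1,                     Number of data nodes
"active_primary_shards" : 0,                    Number of active primary shards
"active_shards" : 0,                            Number of active shards
"relocating_shards" : 0,                        Number of shards being relocated
"initializing_shards" : 0,                      Number of shards being initialized
"unassigned_shards" : 0,                        Number of shards that exist in the cluster but are not assigned to a node
"delayed_unassigned_shards" : 0,                Number of shards whose assignment to a node is being delayed
"number_of_pending_tasks" : 0,                  Number of pending tasks, i.e. tasks such as creating indices and allocating shards that the master node still has to process
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0       Active shards as a percentage of all shards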
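Because "status" is a string, it is often convenient to also expose it as a number so that thresholds can be compared numerically. The item list in this document contains a status_int key, but the source does not say how that value is computed; the mapping below (green=0, yellow=1, red=2, anything else=3) and the function name status_to_int are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical mapping from the health status string to an integer.
# The numeric values are an assumption, not taken from the original text.
status_to_int() {
    case "$1" in
        green)  echo 0 ;;
        yellow) echo 1 ;;
        red)    echo 2 ;;
        *)      echo 3 ;;   # unknown or empty status
    esac
}

status_to_int green     # prints: 0
status_to_int red       # prints: 2
```

Whatever mapping is chosen, it has to match the thresholds used in the trigger expressions.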
Elasticsearch monitoring script
Place the script in /etc/zabbix/scripts/ or /usr/local/zabbix/scripts
key_elasticsearch.sh
#!/bin/bash
# Print one field of the cluster health response, selected by the first argument.
# Apart from cluster_name, fields are picked by line number (NR) in the pretty-printed output.
case "$1" in
    cluster_name)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F'"' '/cluster_name/ {print $4}' ;;
    status)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F'"' 'NR==3 {print $4}' ;;
    timed_out)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==4 {print $1}' |awk -F: '{print $2}' ;;
    number_nodes)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==5 {print $1}' |awk -F: '{print $2}' ;;
    data_nodes)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==6 {print $1}' |awk -F: '{print $2}' ;;
    active_primary_shards)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==7 {print $1}' |awk -F: '{print $2}' ;;
    active_shards)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==8 {print $1}' |awk -F: '{print $2}' ;;
    relocating_shards)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==9 {print $1}' |awk -F: '{print $2}' ;;
    initializing_shards)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==10 {print $1}' |awk -F: '{print $2}' ;;
    unassigned_shards)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==11 {print $1}' |awk -F: '{print $2}' ;;
    delayed_unassigned_shards)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==12 {print $1}' |awk -F: '{print $2}' ;;
    number_of_pending_tasks)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==13 {print $1}' |awk -F: '{print $2}' ;;
    active_shards_percent_as_number)
        curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==16 {print $1}' |awk -F: '{print $2}' ;;
    *)
        echo "Usage: $0 { cluster_name | status | timed_out | number_nodes | data_nodes | active_primary_shards | active_shards | relocating_shards | initializing_shards | unassigned_shards | delayed_unassigned_shards | number_of_pending_tasks | active_shards_percent_as_number }" ;;
esac
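In the shell script, for a command such as "curl -s -XGET 'http://192.168.6.21:9200/_cluster/health?pretty' |awk -F, 'NR==16 {print $1}' |awk -F: '{print $2}'", "NR==16" refers to line 16 of the page returned when you open http://192.168.6.21:9200/_cluster/health?pretty in a browser (counting from the "{" on line 1). These line numbers depend on the field order of the response, so check them against the output of your own Elasticsearch version.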
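One way to check which line number a field occupies is to let awk report it. The sketch below is an assumption-laden illustration: the function name field_line is made up for this example, and the sample is the response shown earlier in this document rather than live output.

```shell
#!/bin/bash
# Hypothetical helper: print the line number (NR) on which the field named
# in $1 appears in pretty-printed JSON read from stdin.
field_line() {
    awk -v k="$1" '
        BEGIN { key = "\"" k "\"" }            # quoted name, so prefixes do not match
        index($0, key) { print NR; exit }'
}

# Sample response copied from the field table in this document.
sample='{
  "cluster_name" : "my-elk-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}'

printf '%s\n' "$sample" | field_line active_shards_percent_as_number   # prints: 16
```

If the number printed for your cluster differs from the NR value used in the script, that branch of the script needs to be adjusted.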
Grant the scripts execute permission
chmod +x /etc/zabbix/scripts/*.sh
The owner and group may also need to be set
chown zabbix:zabbix /etc/zabbix/scripts/*.sh
Configuration file
Path: /etc/zabbix/zabbix_agentd.d/userparameter_elasticsearch.conf or /usr/local/zabbix/etc/zabbix_agentd.conf.d/userparameter_elasticsearch.conf
UserParameter=elasticsearch.cluster_health[*],/etc/zabbix/scripts/key_elasticsearch.sh $1 $2
Restart the Zabbix agent
systemctl restart zabbix_agentd
For a zabbix-agent installed with yum:
systemctl restart zabbix-agent.service
Test fetching the Elasticsearch data
zabbix_get -s 192.168.6.21 -k 'elasticsearch.cluster_health[active_primary_shards]'
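To check every argument the script accepts in one pass, a small loop around zabbix_get can help. This is only a sketch: the variables ZABBIX_GET and AGENT_HOST and the function name check_all_keys are hypothetical, and the key format assumes the UserParameter elasticsearch.cluster_health[*] defined in the configuration file.

```shell
#!/bin/bash
# Hypothetical smoke test: query each argument accepted by key_elasticsearch.sh.
# ZABBIX_GET and AGENT_HOST are illustrative variables, not Zabbix settings.
ZABBIX_GET="${ZABBIX_GET:-zabbix_get}"
AGENT_HOST="${AGENT_HOST:-192.168.6.21}"

check_all_keys() {
    for arg in cluster_name status timed_out number_nodes data_nodes \
               active_primary_shards active_shards relocating_shards \
               initializing_shards unassigned_shards delayed_unassigned_shards \
               number_of_pending_tasks active_shards_percent_as_number; do
        printf '%s = ' "$arg"
        # Ask the agent for one item; the value is printed after the label.
        "$ZABBIX_GET" -s "$AGENT_HOST" -k "elasticsearch.cluster_health[$arg]"
    done
}
```

Run check_all_keys on the Zabbix server. A branch that prints the usage text or an empty value points at a wrong argument name or a wrong line number in the script.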
Elasticsearch monitoring items
Application: Elasticsearch
Name                                               Key
Elasticsearch cluster active primary shards        elasticsearch.cluster_health[active_primary_shards]
Elasticsearch cluster active shards                elasticsearch.cluster_health[active_shards]
Elasticsearch cluster active shards percent        elasticsearch.cluster_health[active_shards_percent]
Elasticsearch cluster delayed unassigned shards    elasticsearch.cluster_health[delayed_unassigned_shards]
Elasticsearch cluster health                       elasticsearch.cluster_health[status]
Elasticsearch cluster health int                   elasticsearch.cluster_health[status_int]
Elasticsearch cluster initializing shards          elasticsearch.cluster_health[initializing_shards]
Elasticsearch cluster name                         elasticsearch.cluster_health[cluster_name]
Elasticsearch cluster number of data nodes         elasticsearch.cluster_health[number_of_data_nodes]
Elasticsearch cluster number of in flight fetch    elasticsearch.cluster_health[number_of_in_flight_fetch]
Elasticsearch cluster number of nodes              elasticsearch.cluster_health[number_of_nodes]
Elasticsearch cluster number of pending tasks      elasticsearch.cluster_health[number_of_pending_tasks]
Elasticsearch cluster relocating shards            elasticsearch.cluster_health[relocating_shards]
Elasticsearch cluster task max waiting in queue    elasticsearch.cluster_health[task_max_waiting_in_queue_millis]
Elasticsearch cluster timeout                      elasticsearch.cluster_health[timed_out]
Elasticsearch cluster unassigned shards            elasticsearch.cluster_health[unassigned_shards]
Elasticsearch get stats                            elasticsearch.stats[{$ZABBIX_SERVER_IP},{$ES_ADDRESS},{$ES_ZBX_PREFIX},{$ES_USER},{$ES_PASSWORD},{$CA_PATH},{$ZABBIX_HOSTNAME}]
Elasticsearch port listen                          net.tcp.service[tcp,,{$ES_PORT}]
Elasticsearch process num                          proc.num[,,,bootstrap.Elasticsearch]
Elasticsearch monitoring triggers