ES Monitoring Metrics


Category: Infrastructure. Tags: elasticsearch, jvm, java

Article link: https://blog.csdn.net/baidu_19620507/article/details/125777168


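Red marks the key metrics that indicate an abnormal cluster state.

Blue marks the performance metrics that deserve close attention.

All alert thresholds are defined as macro variables and can be customized per cluster; the values shown in the tables are the defaults.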
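As a rough illustration of thresholds stored as overridable macros, the sketch below keeps defaults in one mapping and lets a cluster override individual values. The macro names and the `resolve_macro` helper are hypothetical (only the `{$NAME}` user-macro syntax follows Zabbix conventions); the default of 80 mirrors the heap usage threshold listed in the node table.

```python
# Hypothetical macro names; the real template may use different ones.
DEFAULT_MACROS = {
    "{$ES_HEAP_USED_PCT_MAX}": 80,      # heap_used_percent threshold (>80%)
    "{$ES_SEARCH_REJECTED_MAX}": 0,     # thread_pool_search_rejected threshold (>0)
}


def resolve_macro(name, cluster_overrides=None):
    """Return the cluster-specific value if one is set, otherwise the default."""
    overrides = cluster_overrides or {}
    return overrides.get(name, DEFAULT_MACROS[name])
```

A cluster with a larger heap could, for example, pass `{"{$ES_HEAP_USED_PCT_MAX}": 90}` as its overrides while inheriting every other default.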

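ES process monitoring template

| Metric | Meaning | Interval | Alert thresholds (Warning / High / Disaster) | Notes |
| --- | --- | --- | --- | --- |
| proc.num[,,,bootstrap.Elasticsearch] | Checks whether the ES process is alive | 30s | <1, and the previous value was >0 | |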
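The liveness condition above fires only on the transition from running to stopped, so a host that never ran ES does not alert. A minimal Python sketch of that logic follows; the function name `process_down` is hypothetical and this is not the Zabbix trigger expression itself.

```python
def process_down(previous_count, current_count):
    """True only when the process count drops below 1 after having been above 0."""
    return current_count < 1 and previous_count > 0
```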

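ES node monitoring template

Cluster summary metrics

| Metric | Meaning | Interval | Alert thresholds (Warning / High / Disaster) | Notes |
| --- | --- | --- | --- | --- |
| cluster_status | Cluster status (0 = green, 1 = yellow, 2 = red) | | yellow (value = 1); red (value = 2) | |
| cluster_nodes_count | Total number of nodes in the cluster | | A node has left the cluster (current value < previous value) | |
| cluster_indices_count | Number of open indices in the cluster | | | |
| cluster_indices_indexing_index_total | Total cluster indexing TPS | | Business cluster: current value 20% above or below the average from 5 minutes / 1 day earlier; total indexing < 20 | Converted to a rate in Zabbix; the same applies to every total value below |
| cluster_indices_search_query_total | Total cluster query QPS | | Business cluster: current value 20% above or below the average from 5 minutes / 1 day earlier; total queries < 20 | |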
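The cumulative totals above only become TPS/QPS once they are turned into a per-second rate, and the 20% rule compares that rate with an earlier average. The sketch below shows one way to express both steps in Python; the helper names are hypothetical, and treating a counter decrease as a reset (for example after a node restart) is an assumption rather than something the template states.

```python
def counter_rate(prev_total, curr_total, interval_seconds):
    """Per-second rate between two samples of a cumulative counter.

    Returns None when the counter went backwards (assumed to be a reset).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        return None
    return delta / interval_seconds


def deviates(current_rate, baseline_avg, ratio=0.20):
    """True when the current rate is at least `ratio` above or below the baseline."""
    if baseline_avg == 0:
        return current_rate != 0
    return abs(current_rate - baseline_avg) / baseline_avg >= ratio
```

With a 30s interval, a counter moving from 1000 to 1600 gives a rate of 20 operations per second, which sits exactly at the "total < 20" boundary and does not fall below it.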

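Per-node metrics

| Metric | Meaning | Interval | Alert thresholds (Warning / High / Disaster) | Notes |
| --- | --- | --- | --- | --- |
| es_roles | ES node role | | | |
| heap_committed_in_bytes | Committed JVM heap size | | | |
| heap_used_percent | JVM heap usage percentage | | >80% | |
| http_current_open | Number of currently open HTTP connections | | | |
| http_total_opened | Total number of HTTP connections opened | | | |
| indices_indexing_flush_total | Number of flushes | | | |
| indices_indexing_flush_total_time_in_millis | Total flush time | | | |
| indices_indexing_index_current | Number of indexing operations currently in progress | | | |
| indices_indexing_index_time_in_millis | Total indexing time | | | |
| indices_indexing_index_total | Indexing count (TPS) | | Business cluster >5000; log cluster >20000 | |
| indexing_latency | Indexing latency | | Business cluster >10ms | Total indexing time / indexing count |
| indices_indexing_refresh_total | Total number of refreshes performed after writes to the index | | | |
| indices_indexing_refresh_total_time_in_millis | Total time spent on refreshes performed after writes to the index | | | |
| indices_search_fetch_current | Number of search fetch-phase operations currently in progress | | | |
| indices_search_fetch_time_in_millis | Total time spent in the search fetch phase | | | |
| indices_search_fetch_total | Total number of search fetch-phase operations | | | |
| indices_search_query_current | Number of search query-phase operations currently in progress | | | |
| indices_search_query_time_in_millis | Total query time | | | |
| indices_search_query_total | Query count (QPS) | | Log cluster: none; business cluster >700 | |
| search_latency | Query latency | | Business cluster >10ms | Total query time / query count |
| old_collection_count | Number of old-generation GCs | | Log cluster >100; business cluster >0 | |
| old_collection_time_in_millis | Old-generation GC time | | | |
| thread_pool_bulk_queue | Length of the bulk request queue | | Log cluster >100; business cluster >10 | Available in ES 5 |
| thread_pool_bulk_rejected | Number of rejected bulk requests | | Log cluster >0 | Available in ES 5 |
| thread_pool_write_queue | Length of the write request queue | | Log cluster >100; business cluster >10 | Available in ES 6 and later |
| thread_pool_write_rejected | Number of rejected write requests | | Log cluster >0 | Available in ES 6 and later |
| thread_pool_get_completed | Number of completed get requests | | | |
| thread_pool_index_queue | Length of the index request queue | | | |
| thread_pool_index_rejected | Number of rejected index requests | | | |
| thread_pool_search_completed | Number of completed search requests | | | |
| thread_pool_search_queue | Length of the search request queue | | Log cluster >100; business cluster >0 | |
| thread_pool_search_rejected | Number of rejected search requests | | Log cluster >0; business cluster >0 | |
| young_collection_count | Number of young-generation GCs | | | |
| young_collection_time_in_millis | Young-generation GC time | | | |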
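The two latency metrics are derived values: total time divided by operation count. Because both inputs are cumulative counters, one reasonable reading is to divide the change in time by the change in count between two samples, which gives the average per-operation latency over that interval. The Python sketch below assumes that reading; `latency_ms` and the sample dictionaries are hypothetical, while the keys reuse the metric names from the table.

```python
def latency_ms(prev, curr, time_key, count_key):
    """Average milliseconds per operation between two samples of cumulative counters."""
    ops = curr[count_key] - prev[count_key]
    if ops <= 0:
        # No new operations (or a counter reset): report zero instead of dividing by zero.
        return 0.0
    return (curr[time_key] - prev[time_key]) / ops


prev_sample = {"indices_indexing_index_time_in_millis": 1000,
               "indices_indexing_index_total": 100}
curr_sample = {"indices_indexing_index_time_in_millis": 1600,
               "indices_indexing_index_total": 200}

indexing_latency = latency_ms(
    prev_sample, curr_sample,
    "indices_indexing_index_time_in_millis",
    "indices_indexing_index_total",
)
```

The same helper covers search_latency when called with `indices_search_query_time_in_millis` and `indices_search_query_total`.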

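ES index monitoring template

Cluster summary metrics

| Metric | Meaning | Interval | Alert thresholds (Warning / High / Disaster) | Notes |
| --- | --- | --- | --- | --- |
| cluster_no_hidden_indices_count | Total number of indices, excluding those whose names start with "." | | | |
| cluster_primaries_xxx | Every per-index metric has a corresponding cluster summary metric | | | |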
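The two summary rows can be pictured as a filter and a roll-up over the per-index data. The Python sketch below counts indices whose names do not start with "." and sums numeric per-index metrics into `cluster_`-prefixed totals. Summation is an assumption (the source only says a corresponding summary metric exists), and the function names and sample index names are hypothetical.

```python
def count_non_hidden(index_names):
    """Count indices whose names do not start with a dot."""
    return sum(1 for name in index_names if not name.startswith("."))


def cluster_summary(per_index_metrics):
    """Sum numeric per-index metrics into cluster_<metric> totals (assumed roll-up)."""
    totals = {}
    for metrics in per_index_metrics.values():
        for key, value in metrics.items():
            # Skip non-numeric fields such as index_type.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                name = "cluster_" + key
                totals[name] = totals.get(name, 0) + value
    return totals
```

Summing suits counts and sizes; derived values such as latency would need to be recomputed from the summed totals rather than added together.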

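Per-index metrics

| Metric | Meaning | Interval | Alert thresholds (Warning / High / Disaster) | Notes |
| --- | --- | --- | --- | --- |
| index_type | Index type (index or alias) | | | |
| primaries_docs_count | Number of documents in the index | | | |
| primaries_size_in_bytes | Index size | | | |
| primaries_segments_count | Number of segments | | | |
| primaries_segments_memory_in_bytes | Memory used by segments | | | |
| primaries_indexing_index_total | Indexing rate | | | |
| primaries_indexing_index_time_in_millis | Total indexing time | | | |
| indexing_latency | Indexing latency | | | Total indexing time / indexing rate |
| primaries_search_query_total | Query rate | | | |
| primaries_search_query_time_in_millis | Total query time | | | |
| search_latency | Query latency | | | Total query time / query rate |
| primaries_search_fetch_total | Fetch rate | | | |
| primaries_search_fetch_time_in_millis | Total fetch time | | | |
| primaries_search_scroll_total | Scroll rate | | | |
| primaries_search_scroll_time_in_millis | Total scroll time | | | |
| primaries_indexing_delete_total | Delete rate | | | |
| primaries_indexing_delete_time_in_millis | Total delete time | | | |
| primaries_merges_total | Merge rate | | | |
| primaries_merges_total_time_in_millis | Total merge time | | | |
| primaries_refresh_total | Refresh rate | | | |
| primaries_refresh_total_time_in_millis | Total refresh time | | | |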
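Many of the metric names above look like flattened paths into the `primaries` section of an index stats response (for example `docs.count` becoming `primaries_docs_count`). The sketch below shows that flattening in Python as an assumption about how a collector might build the names; the source does not describe the collector, some names in the table (such as primaries_size_in_bytes) may not follow this exact rule, and the sample payload is invented for illustration.

```python
def flatten(section, prefix):
    """Flatten a nested stats dictionary into underscore-joined metric names."""
    flat = {}
    for key, value in section.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Invented sample shaped like the "primaries" part of an index stats response.
sample_primaries = {
    "docs": {"count": 42},
    "segments": {"count": 3, "memory_in_bytes": 2048},
    "indexing": {"index_total": 7, "index_time_in_millis": 21},
    "search": {"query_total": 5, "query_time_in_millis": 15},
}

metrics = flatten(sample_primaries, "primaries")
```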
