ES Query Optimization: the Profile API

Tags: elasticsearch

Article link: https://blog.csdn.net/zhinengyunwei/article/details/103975169

Author: 石文
Date: 2019-04-01

The Profile API is used to pinpoint which part of a query is taking abnormally long. For usage details, see: https://www.elastic.co/guide/en/elasticsearch/reference/5.2/search-profile.html

When a query is run with the Profile API enabled, the response has the following format:

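{
    "took":4,               # time taken by the query
    "timed_out":false,      # whether the query timed out
    "_shards":Object{...},  # shards the query ran on
    "hits":Object{...},     # query results
    "profile":Object{...}   # detailed analysis of this query, present when profiling is enabled
}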
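The "profile" section only appears when the search request body sets "profile": true at the top level. Below is a minimal sketch using only the Python standard library. The helper names, the base URL, and the match_all query in the usage note are illustrative assumptions, not taken from the original article.

```python
import json
import urllib.request


def build_profiled_body(query, size=10):
    """Return a search request body with profiling switched on."""
    return {"profile": True, "size": size, "query": query}


def profiled_search(base_url, index, query, size=10):
    """POST the profiled search to <base_url>/<index>/_search and decode the JSON reply."""
    data = json.dumps(build_profiled_body(query, size)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call might look like profiled_search("http://localhost:9200", "logmwaf-search-2019-03-10", {"match_all": {}}), where the host is a placeholder for your own cluster. Profiling adds overhead to the query, so it is better suited to debugging than to production traffic.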

Field-by-field details

"took":4
Total time taken by the query, in milliseconds.

"timed_out":false
Whether the query timed out.

"_shards":{
        "total":5,
        "successful":5,
        "skipped":0,
        "failed":0
    }
The shards the query ran on.

"hits": {
    "total": 22,         # total number of matching documents
    "max_score": null,   # highest score among the matches
    "hits": [            # the first `size` results
      {                  # each object is one hit
        "_index": "logmwaf-search-2019-03-10",
        "_type": "data",
        "_id": "219a4cdbfe073728637b834c89c7837304467175",
        "_score": null,
        "_source": {
          "remote_addr": "77.34.167.43",
          "waf_hit": "-",
          "urule_hit": "-",
          "host": "183.2.168.203",
          "waf_id": "waf-vocujyg01u",
          "cc_hit": "-",
          "request_method": "GET",
          "time": 1552254980000,
          "weblock_hit": "-",
          "server_addr": "10.0.0.9",
          "request_uri": "http://183.2.168.203/"
        },
        "sort": [
          1552254980000
        ]
      }
    ]
  }
The query results.

"profile":{
        "shards":[
            Object{...},
            Object{...},
            Object{...},
            Object{...},
            Object{...}
        ]
    }
The "profile" section reports, shard by shard, the details of how the query executed on each shard.

For each Object{...}:

{
                "id":"[5jSBvBPQRa6rbuBIUO1n0w][logmwaf-search-2019-03-10][1]", # shard identifier: [node ID][index name][shard number]
                "searches":[      # details of the query execution
                    Object{...}
                ],
                "aggregations":Array[0]  # details of the aggregations
            },
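The shard "id" packs three values into one string, so when comparing shards it can help to split it apart. A small sketch, assuming the bracketed [node ID][index name][shard number] layout seen in the example above (other Elasticsearch versions may format it differently). The function name is made up for illustration.

```python
import re

# Three bracketed groups: node ID, index name, numeric shard number.
SHARD_ID_PATTERN = re.compile(
    r"^\[(?P<node>[^\]]+)\]\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\]$"
)


def parse_shard_id(shard_id):
    """Split a profile shard id into (node_id, index_name, shard_number)."""
    match = SHARD_ID_PATTERN.match(shard_id)
    if match is None:
        raise ValueError(f"unexpected shard id format: {shard_id!r}")
    return match["node"], match["index"], int(match["shard"])


node, index, shard = parse_shard_id(
    "[5jSBvBPQRa6rbuBIUO1n0w][logmwaf-search-2019-03-10][1]"
)
print(node, index, shard)
```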

For the "searches" entry in each shard object:

"searches":[
                    {
                        "query":Array[1],      # query information
                        "rewrite_time":35058,  # time spent rewriting the query, in nanoseconds
                        "collector":Array[1]   # collector information
                    }
                ]

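Start with "rewrite_time". While a query runs on a shard, Lucene keeps rewriting it into simpler forms until it stops changing; "rewrite_time" is the total time spent on this rewriting, in nanoseconds.

Inside "query":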

"query":[
    {
       "type":"BooleanQuery",  # query type
       "description":"#time:[1552212510000 TO 1552471710000] #ConstantScore(pin:jcloud_monitor)",
       "time":"1.133225000ms", # time taken
       "time_in_nanos":1133225, # time taken, in nanoseconds
       "breakdown":Object{...},  # breakdown of where the time went
       "children":Array[2]       # sub-queries
    }
]

The query type that was triggered:

BooleanQuery

Contents of the "breakdown" section:

"breakdown":{
                                    "score":0,  # time spent scoring individual documents via the Scorer
                                    "build_scorer_count":45,
                                    "match_count":0,
                                    "create_weight":59301,
                                    "next_doc":8403,
                                    "match":0,     # time spent matching
                                    "create_weight_count":1,
                                    "next_doc_count":9,
                                    "score_count":0,
                                    "build_scorer":1065466, # time spent building the scorer
                                    "advance":0,
                                    "advance_count":0
                                }
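A breakdown like this is easier to read once the timing fields are paired with their call counts and sorted. The sketch below does that for the values shown above. The function name and output layout are illustrative choices, and it assumes every counter key is the timing key plus a "_count" suffix, as in this example.

```python
def rank_breakdown(breakdown):
    """Return (phase, nanos, calls) tuples, slowest phase first."""
    phases = [key for key in breakdown if not key.endswith("_count")]
    rows = [
        (phase, breakdown[phase], breakdown.get(phase + "_count", 0))
        for phase in phases
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Values copied from the breakdown above.
breakdown = {
    "score": 0, "build_scorer_count": 45, "match_count": 0,
    "create_weight": 59301, "next_doc": 8403, "match": 0,
    "create_weight_count": 1, "next_doc_count": 9, "score_count": 0,
    "build_scorer": 1065466, "advance": 0, "advance_count": 0,
}

for phase, nanos, calls in rank_breakdown(breakdown):
    print(f"{phase:<14}{nanos:>10} ns{calls:>6} calls")
```

For this query, build_scorer dominates, which suggests the cost lies in setting up the scorers rather than in iterating over or scoring documents.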

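The timing values in "breakdown" are in nanoseconds, while the fields ending in _count record how many times each method was called. These timings are measured at the Lucene level.

The "children" section: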

"children":[
      {
        "type":"IndexOrDocValuesQuery",   # query type
        "description":"time:[1552212510000 TO 1552471710000]", # description
        "time":"0.3486950000ms",
        "time_in_nanos":348695,
        "breakdown":Object{...}
      },
      Object{...} # same structure as the object above
]

Here the children have the types "IndexOrDocValuesQuery" and "ConstantScoreQuery", and the "ConstantScoreQuery" in turn contains a "TermQuery".
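Because each node's reported time includes the time of its children, it can be useful to walk the tree and work out how much time each node spent on its own. A rough sketch follows. The BooleanQuery and IndexOrDocValuesQuery timings come from the example above, but the ConstantScoreQuery and TermQuery timings are hypothetical values added only so the tree is complete.

```python
def flatten_query_tree(nodes, depth=0):
    """Yield (depth, type, total_nanos, self_nanos) for every node, parents first."""
    for node in nodes:
        children = node.get("children", [])
        child_nanos = sum(child["time_in_nanos"] for child in children)
        self_nanos = node["time_in_nanos"] - child_nanos
        yield depth, node["type"], node["time_in_nanos"], self_nanos
        yield from flatten_query_tree(children, depth + 1)


query_tree = [
    {
        "type": "BooleanQuery",
        "time_in_nanos": 1133225,
        "children": [
            {"type": "IndexOrDocValuesQuery", "time_in_nanos": 348695},
            {
                "type": "ConstantScoreQuery",
                "time_in_nanos": 700000,  # hypothetical, not from the article
                "children": [
                    {"type": "TermQuery", "time_in_nanos": 650000},  # hypothetical
                ],
            },
        ],
    }
]

for depth, query_type, total, own in flatten_query_tree(query_tree):
    print(f"{'  ' * depth}{query_type}: total={total} ns, self={own} ns")
```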

Next, look at "collector":

"collector":[
                            {
                                "name":"CancellableCollector",   # collector type
                                "reason":"search_cancelled",
                                "time":"0.02791200000ms",
                                "time_in_nanos":27912,
                                "children":[
                                    {
                                        "name":"TotalHitCountCollector",
                                        "reason":"search_count",
                                        "time":"0.01132200000ms",
                                        "time_in_nanos":11322
                                    }
                                ]
                            }
                        ]

Lucene works by defining a "Collector", which is responsible for coordinating the traversal, scoring, and collection of matching documents.
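Collectors nest in the same way as queries, so a short recursive helper can turn the "collector" array into an indented summary. This is only a sketch; the function name and the output format are illustrative choices, and the sample data is copied from the collector example above.

```python
def describe_collectors(collectors, depth=0):
    """Return one indented line per collector, with its time in milliseconds."""
    lines = []
    for collector in collectors:
        millis = collector["time_in_nanos"] / 1_000_000
        lines.append(
            f"{'  ' * depth}{collector['name']} ({collector['reason']}): {millis:.3f} ms"
        )
        lines.extend(describe_collectors(collector.get("children", []), depth + 1))
    return lines


collectors = [
    {
        "name": "CancellableCollector",
        "reason": "search_cancelled",
        "time_in_nanos": 27912,
        "children": [
            {
                "name": "TotalHitCountCollector",
                "reason": "search_count",
                "time_in_nanos": 11322,
            }
        ],
    }
]

print("\n".join(describe_collectors(collectors)))
```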

"aggregations":[

                ]

Profiling details for aggregations (empty for this query).
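Putting the pieces together, one way to start an investigation is to rank the shards by how much time their searches took. The sketch below adds each search's "rewrite_time" to the time of its top-level queries; this is a simplification chosen for illustration, and collector time is left out. The first shard uses the timings shown in this article, while the second shard is entirely hypothetical.

```python
def shard_query_nanos(shard):
    """Sum rewrite time and top-level query time over all searches of one shard."""
    total = 0
    for search in shard["searches"]:
        total += search["rewrite_time"]
        total += sum(query["time_in_nanos"] for query in search["query"])
    return total


def slowest_shards(response, top=3):
    """Return up to `top` (shard_id, nanos) pairs, slowest shard first."""
    shards = response["profile"]["shards"]
    ranked = sorted(shards, key=shard_query_nanos, reverse=True)
    return [(shard["id"], shard_query_nanos(shard)) for shard in ranked[:top]]


response = {
    "profile": {
        "shards": [
            {
                "id": "[5jSBvBPQRa6rbuBIUO1n0w][logmwaf-search-2019-03-10][1]",
                "searches": [
                    {"rewrite_time": 35058, "query": [{"time_in_nanos": 1133225}]}
                ],
            },
            {
                # Hypothetical second shard, for illustration only.
                "id": "[exampleNodeId][logmwaf-search-2019-03-10][2]",
                "searches": [
                    {"rewrite_time": 20000, "query": [{"time_in_nanos": 500000}]}
                ],
            },
        ]
    }
}

for shard_id, nanos in slowest_shards(response):
    print(f"{shard_id}: {nanos / 1_000_000:.3f} ms")
```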
