Elasticsearch: Explaining Queries

Explaining how a query will be executed

GET /_validate/query?explain
{
  "query": {
    "multi_match": {
      "query": "周杰伦",
      "type": "most_fields", 
      "fields": ["singer","wordAuthor"]
    }
  }
}

Response

{
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "valid": true,
  "explanations": [
    {
      "index": ".kibana",
      "valid": true,
      "explanation": """(MatchNoDocsQuery("unmapped field [singer]") | MatchNoDocsQuery("unmapped field [wordAuthor]"))~1.0"""
    },
    {
      "index": "music",
      "valid": true,
      "explanation": "((singer:周杰伦 singer:周杰 singer:伦) | (wordAuthor:周杰伦 wordAuthor:周杰 wordAuthor:伦))~1.0"
    }
  ]
}
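
The request above names no index, so every index is validated, including .kibana, where neither field is mapped. Running the same request as GET /music/_validate/query?explain would limit it to the music index. As a minimal sketch, the snippet below picks out the indices whose rewritten query falls back to MatchNoDocsQuery because of an unmapped field. The function name indices_without_mapping is hypothetical, and the response is an abbreviated copy written as a Python dict.

```python
# Abbreviated copy of the _validate response, written as a Python dict.
response = {
    "valid": True,
    "explanations": [
        {
            "index": ".kibana",
            "valid": True,
            "explanation": '(MatchNoDocsQuery("unmapped field [singer]") | '
                           'MatchNoDocsQuery("unmapped field [wordAuthor]"))~1.0',
        },
        {
            "index": "music",
            "valid": True,
            # Term list shortened here; see the full response for all terms.
            "explanation": "((singer:周杰伦 ...) | (wordAuthor:周杰伦 ...))~1.0",
        },
    ],
}


def indices_without_mapping(resp):
    """Return indices whose rewritten query mentions an unmapped field.

    Hypothetical helper: it only does a substring check on the explanation text.
    """
    return [
        item["index"]
        for item in resp["explanations"]
        if "unmapped field" in item["explanation"]
    ]


print(indices_without_mapping(response))
```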

Finding out how the score of a search hit was calculated

GET /music/doc/_search
{
  "query": {
    "match_phrase": {
      "singer": "周杰伦"
    }
  },
  "explain": true
}

The _explanation section of one hit in the response:

"_explanation": {
          "value": 17.777544,
          "description": """weight(singer:"周杰伦 周杰 伦" in 7695) [PerFieldSimilarity], result of:""",
          "details": [
            {
              "value": 17.777544,
              "description": "score(doc=7695,freq=1.0 = phraseFreq=1.0\n), product of:",
              "details": [
                {
                  "value": 19.915054,
                  "description": "idf(), sum of:",
                  "details": [
                    {
                      "value": 7.076377,
                      "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                      "details": [
                        {
                          "value": 30,
                          "description": "docFreq",
                          "details": []
                        },
                        {
                          "value": 36101,
                          "description": "docCount",
                          "details": []
                        }
                      ]
                    },
                    {
                      "value": 7.076377,
                      "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                      "details": [
                        {
                          "value": 30,
                          "description": "docFreq",
                          "details": []
                        },
                        {
                          "value": 36101,
                          "description": "docCount",
                          "details": []
                        }
                      ]
                    },
                    {
                      "value": 5.7623005,
                      "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                      "details": [
                        {
                          "value": 113,
                          "description": "docFreq",
                          "details": []
                        },
                        {
                          "value": 36101,
                          "description": "docCount",
                          "details": []
                        }
                      ]
                    }
                  ]
                },
                {
                  "value": 0.8926686,
                  "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
                  "details": [
                    {
                      "value": 1,
                      "description": "phraseFreq=1.0",
                      "details": []
                    },
                    {
                      "value": 1.2,
                      "description": "parameter k1",
                      "details": []
                    },
                    {
                      "value": 0.75,
                      "description": "parameter b",
                      "details": []
                    },
                    {
                      "value": 2.3185508,
                      "description": "avgFieldLength",
                      "details": []
                    },
                    {
                      "value": 3,
                      "description": "fieldLength",
                      "details": []
                    }
                  ]
                }
              ]
            }
          ]
        }
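
The nested _explanation is easier to read as an indented outline. The following is a small sketch of a recursive walker. The name flatten_explanation is hypothetical, and the input is an abridged copy of the explanation above with the inner details left out.

```python
def flatten_explanation(node, depth=0):
    """Yield (depth, value, description) for a node and all of its nested details."""
    yield depth, node["value"], node["description"]
    for child in node.get("details", []):
        yield from flatten_explanation(child, depth + 1)


# Abridged copy of the _explanation above (inner details omitted).
explanation = {
    "value": 17.777544,
    "description": "score(doc=7695,freq=1.0 = phraseFreq=1.0\n), product of:",
    "details": [
        {"value": 19.915054, "description": "idf(), sum of:", "details": []},
        {"value": 0.8926686, "description": "tfNorm, computed as ...", "details": []},
    ],
}

for depth, value, description in flatten_explanation(explanation):
    # Keep only the first line of multi-line descriptions.
    print("  " * depth + f"{value}  {description.splitlines()[0]}")
```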

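Scores are computed with the BM25 algorithm. BM25 is a probabilistic relevance model: the score can be read as an estimate of how likely a given document is to be relevant to the query, although it is not itself a probability. The explanation has two parts, which are multiplied together.
The first part is the IDF (inverse document frequency), computed as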

log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))

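docCount is the number of documents in the shard that holds the document (Lucene counts only documents that have a value for the field),
docFreq is the number of documents in that shard that contain the term, not the number of times the term occurs. The IDF is computed for each term and the results are summed.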
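
As a quick check, the IDF values in the explanation above can be reproduced from the docFreq and docCount numbers it reports. This is a sketch that assumes the natural logarithm. The function name idf is chosen for illustration, and small differences in the last digits are expected because Elasticsearch reports single-precision floats.

```python
import math


def idf(doc_freq, doc_count):
    """BM25 IDF as printed in the explanation: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# (docFreq, docCount) for the three terms of the phrase, taken from the explanation.
term_stats = [(30, 36101), (30, 36101), (113, 36101)]

per_term = [idf(df, n) for df, n in term_stats]
idf_sum = sum(per_term)  # corresponds to "idf(), sum of:"

print(per_term)
print(idf_sum)
```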
The second part is tfNorm, computed as

(freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))

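freq is the number of times the term occurs in the document's field (for a phrase query, the number of times the phrase occurs, shown as phraseFreq),
k1 controls how strongly term frequency affects the result; the default is 1.2.
b controls how strongly the length of the field affects the result; the default is 0.75.
fieldLength is the length of the field in this document, counted in terms, and avgFieldLength is the average length of that field across the shard.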
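
The tfNorm value and the final score can be recomputed the same way from the numbers in the explanation. This is only a sketch: tf_norm is an illustrative name, and idf_sum is copied from the "idf(), sum of:" entry rather than recomputed.

```python
def tf_norm(freq, k1, b, field_length, avg_field_length):
    """tfNorm as printed in the explanation output."""
    return (freq * (k1 + 1)) / (
        freq + k1 * (1 - b + b * field_length / avg_field_length)
    )


# Values taken from the tfNorm details in the explanation.
tf = tf_norm(freq=1.0, k1=1.2, b=0.75, field_length=3, avg_field_length=2.3185508)

idf_sum = 19.915054   # value of "idf(), sum of:" in the explanation
score = idf_sum * tf  # "product of:" the two parts

print(tf)
print(score)
```

Because fieldLength (3) is larger than avgFieldLength (about 2.32), the length normalization pushes tfNorm below 1, so the final score is slightly lower than the IDF sum.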
Reference: https://blog.csdn.net/hellozhxy/article/details/89387550

Why a particular document was not returned by a query

GET /music/doc/52892/_explain
{
  "query": {
    "match_phrase": {
      "singer": "周杰伦"
    }
  }
}

To be continued.
