Setting the similarity algorithm for queries in ES

similarity

Elasticsearch allows you to configure a scoring algorithm, or similarity, per field. The similarity setting provides a simple way of choosing a similarity algorithm other than the default BM25, such as TF/IDF.

Similarities are mostly useful for text fields, but can also apply to other field types.

Custom similarities can be configured by tuning the parameters of the built-in similarities. For more details about these expert options, see the similarity module.
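As a rough illustration of tuning a built-in similarity, the request below defines a custom similarity based on BM25 and assigns it to a field. This is a sketch, not a recommended configuration: the index name my_custom_index, the similarity name my_bm25, the field name title and the k1 and b values are all assumptions chosen for the example.

```console
PUT my_custom_index
{
  "settings": {
    "index": {
      "similarity": {
        "my_bm25": {
          "type": "BM25",
          "k1": 1.5,
          "b": 0.6
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "title": {
          "type": "text",
          "similarity": "my_bm25"
        }
      }
    }
  }
}
```

Here k1 controls how quickly the effect of term frequency saturates, and b controls how strongly field length is normalized.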

The only similarities that can be used out of the box, without any further configuration, are:

BM25
The Okapi BM25 algorithm. This is the algorithm used by default in Elasticsearch and Lucene. See Pluggable Similarity Algorithms for more information.
classic
The TF/IDF algorithm, which used to be the default in Elasticsearch and Lucene. See Lucene’s Practical Scoring Function for more information.
boolean
A simple boolean similarity, which is used when full-text ranking is not needed and the score should only be based on whether the query terms match or not. Boolean similarity gives terms a score equal to their query boost.

The similarity can be set on the field level when a field is first created, as follows:

PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "default_field": { 
          "type": "text"
        },
        "classic_field": {
          "type": "text",
          "similarity": "classic" 
        },
        "boolean_sim_field": {
          "type": "text",
          "similarity": "boolean" 
        }
      }
    }
  }
}

The default_field uses the BM25 similarity.

The classic_field uses the classic similarity (i.e. TF/IDF).

The boolean_sim_field uses the boolean similarity.
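To see the boolean similarity in action, one could index a document and run a boosted match query against boolean_sim_field. The following is a minimal sketch that assumes the my_index mapping above; the document ID, the field value and the boost of 2.0 are illustrative only.

```console
PUT my_index/my_type/1
{
  "boolean_sim_field": "quick brown fox"
}

GET my_index/_search
{
  "query": {
    "match": {
      "boolean_sim_field": {
        "query": "quick",
        "boost": 2.0
      }
    }
  }
}
```

Because the boolean similarity gives a matching term a score equal to its query boost, the matching document should come back with a score of 2.0, regardless of how often the term occurs or how long the field is.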

 

Default and Base Similarities

By default, Elasticsearch will use whatever similarity is configured as default. However, the similarity functions queryNorm() and coord() are not per-field. Consequently, for expert users wanting to change the implementation used for these two methods, while not changing the default, it is possible to configure a similarity with the name base. This similarity will then be used for the two methods.

You can change the default similarity for all fields in an index when it is created:

PUT /my_index
{
  "settings": {
    "index": {
      "similarity": {
        "default": {
          "type": "classic"
        }
      }
    }
  }
}

If you want to change the default similarity after creating the index, you must close your index, send the following request, and open it again afterwards:

PUT /my_index/_settings
{
  "settings": {
    "index": {
      "similarity": {
        "default": {
          "type": "classic"
        }
      }
    }
  }
}
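Put together, the full close, update and reopen sequence might look like the sketch below. It reuses the settings body shown above and assumes the standard close and open index endpoints. The final GET is only a suggested check that the setting was applied.

```console
POST /my_index/_close

PUT /my_index/_settings
{
  "settings": {
    "index": {
      "similarity": {
        "default": {
          "type": "classic"
        }
      }
    }
  }
}

POST /my_index/_open

GET /my_index/_settings
```

While the index is closed it cannot serve searches or accept writes, so this is best done in a maintenance window.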

From: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/index-modules-similarity.html

This article is reposted from the cnblogs blog of 张昺华-sky. Original link: http://www.cnblogs.com/bonelee/p/7451929.html (to repost it, please contact the original author).

