How Well Do You Know Elasticsearch Syntax: The multi_match Query

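Contents

Goal

ES version

Official documentation

Adding test data

Basic syntax in practice

Basic format

Matching multiple fields with wildcards

Logical operators

Setting score weights

multi_match types in practice

best_fields (default)

most_fields

Cross-field matching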


Goal

Master the multi_match query, including an analysis of its types and how to apply each of them.


ES version

7.17.5


Official documentation

Multi-match query: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/query-dsl-multi-match-query.html


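Adding test data

Both indices below use ik_max_word as the default analyzer, so the IK analysis plugin must be installed on the cluster.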

PUT /boss_db
{
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word"
    }
  }
}

PUT /boss_db/_bulk
{"index":{"_id":"1"}}
{"company":"星耀科技有限公司","min_num":0,"max_num":20,"province":"广东省","city":"深圳市","county":"南山区","post":"前端开发实习生","min_salary":10,"max_salary":16,"qualification":"本科","min_work_time":3,"max_work_time":5,"skill":["html","css","vue","js"]}
{"index":{"_id":"2"}}
{"company":"恒和科技有限公司","min_num":100,"max_num":500,"province":"广东省","city":"广州市","county":"天河区","post":"JAVA开发工程师","min_salary":20,"max_salary":30,"qualification":"硕士","min_work_time":1,"max_work_time":3,"skill":["k8s","springboot","mybatis","微服务"]}
{"index":{"_id":"3"}}
{"company":"天心科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"广州市","county":"天河区","post":"JAVA架构师","min_salary":40,"max_salary":50,"qualification":"博士","min_work_time":3,"max_work_time":5,"skill":["mybatis","spring","kafka","微服务"]}
{"index":{"_id":"4"}}
{"company":"黄河科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"广州市","county":"天河区","post":"JAVA","min_salary":40,"max_salary":50,"qualification":"博士","min_work_time":3,"max_work_time":5,"skill":["es","mysql","分布式","soa"]}
{"index":{"_id":"5"}}
{"company":"长江科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"深圳市","county":"龙岗区","post":"资深大数据开发工程师","min_salary":40,"max_salary":50,"qualification":"博士","min_work_time":0,"max_work_time":5,"skill":["redis","kafka","mq","数据结构"]}
{"index":{"_id":"6"}}
{"company":"黄山科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"深圳市","county":"龙岗区","post":"前端开发","min_salary":20,"max_salary":30,"qualification":"大专","min_work_time":0,"max_work_time":5,"skill":["html","css","js","vue"]}
{"index":{"_id":"7"}}
{"company":"黄山科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"深圳市","county":"龙岗区","post":"前端开发实习生","min_salary":10,"max_salary":13,"qualification":"不限","min_work_time":0,"max_work_time":5}
{"index":{"_id":"8"}}
{"company":"银河大数据科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"深圳市","county":"龙岗区","post":"大数据实习生","min_salary":10,"max_salary":13,"qualification":"不限","min_work_time":0,"max_work_time":5,"skill":["电商","spring","容器技术","微服务技术"]}
{"index":{"_id":"9"}}
{"company":"银河大数据科技有限公司","min_num":2000,"max_num":5000,"province":"广东省","city":"深圳市","county":"龙岗区","post":"JAVA实习生","min_salary":30,"max_salary":60,"qualification":"本科","min_work_time":0,"max_work_time":5,"skill":["数据结构","k8s","云原生技术","电商"]}

PUT /blog_db
{
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word"
    }
  }
}

PUT /blog_db/_bulk
{"index":{"_id":"1"}}
{"title":"kafka入门手册","content":"kafka命令、集群、优化"}
{"index":{"_id":"2"}}
{"title":"kafka命令手册","content":"命令详情、命令实战"}

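Basic syntax in practice

Basic format

Requirement: match a keyword typed into a job site's search bar against both the job title (post) and the company name (company). The two queries below search for "天心" and "大数据" respectively.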

GET boss_db/_search
{
  "query": {
    "multi_match" : {
      "query":    "天心", 
      "fields": [ "company", "post" ] 
    }
  }
}

GET boss_db/_search
{
  "query": {
    "multi_match" : {
      "query":    "大数据", 
      "fields": [ "company", "post" ] 
    }
  }
}
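
The two requests above differ only in the keyword, so the body can be generated rather than typed by hand. The following is a minimal sketch, assuming the request is assembled in Python before being sent by whatever HTTP client you use; build_multi_match is a hypothetical helper name, not part of any Elasticsearch library.

```python
import json


def build_multi_match(query, fields, **options):
    """Return a search request body holding a single multi_match query.

    Hypothetical helper for illustration; extra keyword arguments
    (for example operator="and") are copied into the clause as-is.
    """
    clause = {"query": query, "fields": list(fields)}
    clause.update(options)
    return {"query": {"multi_match": clause}}


# Same body as the first request above, ready to send to boss_db/_search.
body = build_multi_match("天心", ["company", "post"])
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted into the console after GET boss_db/_search.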

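Matching multiple fields with wildcards

Requirement: use every field whose name matches the pattern "*work_time" (that is, ends with "work_time") as a search field.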

GET boss_db/_search
{
  "query": {
    "multi_match" : {
      "query":    5, 
      "fields": [ "*work_time" ] 
    }
  }
}
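
To see which fields a pattern such as "*work_time" would select, the expansion can be imitated locally. This is only a sketch: Elasticsearch expands the pattern against the index mapping itself, and Python's fnmatch is used here as a stand-in that behaves the same for simple "*" patterns. The field list is copied from the boss_db test documents; expand_fields is a hypothetical name.

```python
from fnmatch import fnmatchcase

# Field names taken from the boss_db test documents above.
BOSS_DB_FIELDS = [
    "company", "min_num", "max_num", "province", "city", "county", "post",
    "min_salary", "max_salary", "qualification",
    "min_work_time", "max_work_time", "skill",
]


def expand_fields(pattern, field_names):
    """Return the field names that match a simple '*' wildcard pattern."""
    return [name for name in field_names if fnmatchcase(name, pattern)]


print(expand_fields("*work_time", BOSS_DB_FIELDS))
```

With this data the pattern selects min_work_time and max_work_time, so the query above is checked against both.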

Logical operators

Requirement 1: search for "前端开发实习生" and require every analyzed term to match.

GET boss_db/_search
{
  "query": {
    "multi_match": {
      "query": "前端开发实习生",
      "fields": [ "post" ],
      "operator": "and"
    }
  }
}

Requirement 2: search for "前端开发实习生" and accept a document as soon as any one analyzed term matches.

GET boss_db/_search
{
  "query": {
    "multi_match": {
      "query": "前端开发实习生",
      "fields": [ "post" ],
      "operator": "or"
    }
  }
}
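
The difference between the two operators can be modelled with plain token sets. This is a simplified sketch, not how Elasticsearch is implemented, and the token lists are assumptions: the real output of ik_max_word may contain more or different terms (the _analyze API shows the actual ones).

```python
def field_matches(query_tokens, field_tokens, operator="or"):
    """Toy model of a single-field match: 'and' needs every query token,
    'or' needs at least one."""
    present = set(field_tokens)
    hits = [token in present for token in query_tokens]
    return all(hits) if operator == "and" else any(hits)


# Assumed tokenization of the query "前端开发实习生".
QUERY_TOKENS = ["前端", "开发", "实习生"]

# Assumed tokens of the post field for three of the test documents.
POST_TOKENS = {
    "1": ["前端", "开发", "实习生"],  # 前端开发实习生
    "6": ["前端", "开发"],            # 前端开发
    "9": ["java", "实习生"],          # JAVA实习生
}

for op in ("and", "or"):
    matched = [doc_id for doc_id, tokens in POST_TOKENS.items()
               if field_matches(QUERY_TOKENS, tokens, op)]
    print(op, matched)
```

Under these assumed tokens, "and" keeps only document 1, while "or" also lets documents 6 and 9 through because each shares at least one term with the query.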

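Setting score weights

Requirement: search for the keyword "大数据" in the company and post fields, with the post field weighted more heavily than the company field when scoring.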

# No weighting
GET boss_db/_search
{
  "query": {
    "multi_match": {
      "query": "大数据",
      "fields": [ "company", "post" ]
    }
  }
}

# Multiply the score of the post field by 4
GET boss_db/_search
{
  "query": {
    "multi_match": {
      "query": "大数据",
      "fields": [ "company", "post^4" ]
    }
  }
}
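
How "post^4" can change the ranking is easiest to see with numbers. The sketch below is a simplified model under stated assumptions: the per-field scores are made up for illustration (they are not real BM25 values from the cluster), and the default best_fields behaviour is reduced to "take the highest boosted field score".

```python
def best_field_score(field_scores, boosts=None):
    """Toy model: multiply each matching field's score by its boost
    and keep the highest result (default best_fields, tie_breaker 0)."""
    boosts = boosts or {}
    return max(score * boosts.get(field, 1.0)
               for field, score in field_scores.items())


# Hypothetical per-field scores for the keyword "大数据".
doc_5 = {"post": 1.5}                  # only the post field matches
doc_8 = {"company": 2.0, "post": 1.0}  # both fields match

for label, boosts in (("no boost", {}), ("post^4", {"post": 4.0})):
    print(label,
          "doc 5:", best_field_score(doc_5, boosts),
          "doc 8:", best_field_score(doc_8, boosts))
```

With these made-up numbers document 8 ranks first without a boost (2.0 against 1.5), and document 5 ranks first once post is boosted (6.0 against 4.0).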

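multi_match types in practice

best_fields (default)

Purpose: among all the fields being searched, find the one that matches best. For example, with the keyword "brown fox": field a contains "brown fox", field b contains only "brown", and field c contains only "fox". Elasticsearch then treats a as the best field. tie_breaker takes a value in the range [0, 1] and defaults to 0, which means only the best field's score is counted. Its settings work as follows:

  1. tie_breaker = 0: total score = score of the best field.
  2. 0 < tie_breaker < 1: total score = score of the best field + tie_breaker * (sum of the scores of the other matching fields).
  3. tie_breaker = 1: all fields carry the same weight, so there is effectively no best field. Total score = sum of the scores of all fields.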
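
The three cases above reduce to one formula. The following sketch applies it to hypothetical per-field scores; combined_score is an illustrative name, and the numbers are not taken from a real index.

```python
def combined_score(field_scores, tie_breaker=0.0):
    """best_fields combination: best score plus tie_breaker times
    the sum of the remaining matching fields' scores."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


# Hypothetical scores of two matching fields in one document.
scores = [2.0, 1.0]

print(combined_score(scores, 0.0))  # only the best field counts
print(combined_score(scores, 0.5))  # half of the other field is added
print(combined_score(scores, 1.0))  # plain sum of all fields
```

A small tie_breaker therefore leaves the best field dominant while still separating documents whose best-field scores are equal.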

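Requirement: search for the keyword "kafka命令" in both the title and the content, giving priority to the title.

Analysis: without tie_breaker, the two documents end up with the same final score for "kafka命令", so the one with the smaller id is listed first. From a business point of view, however, the document with id=2 is clearly the better match, so part of the other field's score should be counted as well. Here that share is set to 0.1.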

GET /blog_db/_search
{
  "query": {
    "multi_match": {
      "query": "kafka命令",
      "type": "best_fields",
      "fields": [
        "title",
        "content"
      ],
      "tie_breaker": 0.1
    }
  }
}

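most_fields

This type behaves roughly like best_fields with tie_breaker set to 1, which makes it better suited to cases where every field should carry the same weight in the score. No separate demonstration is given here; see the best_fields example above with tie_breaker set to 1.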


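Cross-field matching (cross_fields)

Requirement: find records where the company's district (county) is "天河区" and the required qualification is "硕士".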

GET boss_db/_search
{
  "query": {
    "multi_match": {
      "query": "天河区硕士",
      "type": "cross_fields",
      "fields": [
        "county",
        "qualification"
      ],
      "operator": "and"
    }
  }
}

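Analysis: every analyzed term must appear in at least one of the fields for a document to match. This is similar to copy_to, but copy_to requires extra storage, whereas cross_fields needs none and also lets you set per-field weights. In my view this approach is best suited to searching place names and English personal names.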
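
The matching rule can be modelled by treating the listed fields as one combined field. This is a simplified sketch that covers matching only (real cross_fields also blends term statistics across fields when scoring), and the token lists are assumptions about what the analyzer produces.

```python
def cross_fields_and(query_terms, doc_fields):
    """Toy model of cross_fields with operator 'and': every query term
    must occur in at least one of the document's fields."""
    combined = set()
    for tokens in doc_fields.values():
        combined.update(tokens)
    return all(term in combined for term in query_terms)


# Assumed tokens for the query "天河区硕士".
QUERY_TERMS = ["天河区", "硕士"]

# Assumed tokens of county and qualification for three test documents.
DOCS = {
    "1": {"county": ["南山区"], "qualification": ["本科"]},
    "2": {"county": ["天河区"], "qualification": ["硕士"]},
    "3": {"county": ["天河区"], "qualification": ["博士"]},
}

matched = [doc_id for doc_id, fields in DOCS.items()
           if cross_fields_and(QUERY_TERMS, fields)]
print(matched)
```

Under these assumptions only document 2 matches: document 3 has the right district but not the qualification, and no single field has to contain both terms.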
