7_ES Intermediate and Advanced Search (Query)

果将如此

Category: Elasticsearch. Tags: elasticsearch

By_Code27

Article link: https://blog.csdn.net/XJ0927/article/details/111999742

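Search methods: _search

Elasticsearch officially provides two ways to search: one passes the search parameters in the URL, and the other uses the query DSL (Domain Specific Language). The second is the officially recommended one. It talks to ES by sending JSON as the request body, which makes it both more powerful and more concise.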

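Syntax:

URL query: GET /<index>/<type>/_search?<parameters>
DSL query: GET /<index>/<type>/_search {}

The first form is rarely used; a basic understanding of it is enough. The examples in this article use Elasticsearch 6.x syntax with a mapping type in the path; mapping types were deprecated in 7.x and removed in 8.x.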

Test data

# 1. Delete the index
DELETE /ems

# 2. Create the index and specify the type
PUT /ems
{
  "mappings":{
    "emp":{
      "properties":{
        "name":{
          "type":"text"
        },
        "age":{
          "type":"integer"
        },
        "bir":{
          "type":"date"
        },
        "content":{
          "type":"text"
        },
        "address":{
          "type":"keyword"
        }
      }
    }
  }
}

# 3. Insert the test data
PUT /ems/emp/_bulk
  {"index":{}}
  {"name":"小黑","age":23,"bir":"2012-12-12","content":"为开发团队选择一款优秀的MVC框架是件难事儿，在众多可行的方案中决择需要很高的经验和水平","address":"北京"}
  {"index":{}}
  {"name":"王小黑","age":24,"bir":"2012-12-12","content":"Spring 框架是一个分层架构，由 7 个定义良好的模块组成。Spring 模块构建在核心容器之上，核心容器定义了创建、配置和管理 bean 的方式","address":"上海"}
  {"index":{}}
  {"name":"张小五","age":8,"bir":"2012-12-12","content":"Spring Cloud 作为Java 语言的微服务框架，它依赖于Spring Boot，有快速开发、持续交付和容易部署等特点。Spring Cloud 的组件非常多，涉及微服务的方方面面，井在开源社区Spring 和Netflix 、Pivotal 两大公司的推动下越来越完善","address":"无锡"}
  {"index":{}}
  {"name":"win7","age":9,"bir":"2012-12-12","content":"Spring的目标是致力于全方位的简化Java开发。 这势必引出更多的解释， Spring是如何简化Java开发的？","address":"南京"}
  {"index":{}}
  {"name":"梅超风","age":43,"bir":"2012-12-12","content":"Redis是一个开源的使用ANSI C语言编写、支持网络、可基于内存亦可持久化的日志型、Key-Value数据库，并提供多种语言的API","address":"杭州"}
  {"index":{}}
  {"name":"张无忌","age":59,"bir":"2012-12-12","content":"ElasticSearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎，基于RESTful web接口","address":"北京"}

URL search

NOTE: a basic understanding is enough; it helps in understanding the DSL.

GET /ems/emp/_search?q=*&sort=age:desc&size=5&from=0&_source=name,age,bir

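_search: the search API

q=*: matches all documents

sort=age: sorts by the given field; ascending by default, append :desc for descending order

size: how many documents to return

from: the offset of the first document to return (0 starts at the first hit); it is an offset, not a page number

_source: which fields of each document to return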
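
The parameters above can be assembled programmatically. The following is a minimal Python sketch, assuming a hypothetical helper named build_search_url (it is not part of any Elasticsearch client); it only builds the request path and does not contact a cluster.

```python
from urllib.parse import urlencode


def build_search_url(index, doc_type, q="*", sort=None, size=10, offset=0, source=None):
    """Assemble a URL-style search path (hypothetical helper, for illustration)."""
    params = {"q": q}
    if sort:
        params["sort"] = sort              # e.g. "age:desc"
    params["size"] = size                  # number of documents to return
    params["from"] = offset                # offset of the first document
    if source:
        params["_source"] = ",".join(source)  # fields to return
    # Keep '*', ':' and ',' readable instead of percent-encoding them.
    return f"/{index}/{doc_type}/_search?" + urlencode(params, safe="*:,")


url = build_search_url("ems", "emp", sort="age:desc", size=5,
                       source=["name", "age", "bir"])
print(url)
```

With the arguments shown, the helper reproduces the request path used in the URL example above.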

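DSL search

NOTE: this is the key part.

0. Match all (match_all)

match_all keyword: matches every document in the index (only the first 10 hits are returned unless size is set)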

GET /ems/emp/_search
{
  "query": {
    "match_all": {}
  }
}

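1. Limit the number of returned documents (size)

size keyword: sets how many documents the query returns. The default is 10.

It is applied to the result of the query.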

GET /ems/emp/_search
{
  "query": {
    "match_all": {}
  },
  "size": 5
}

2. Paging (from)

from keyword: sets the offset of the first returned document; combined with size, it implements paging

GET /ems/emp/_search
{
  "query": {
    "match_all": {}
  },
  "size": 5,
  "from": 0
}
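
Because from is an offset and not a page number, a page number has to be converted before it goes into the request body. A minimal Python sketch of that conversion, assuming 1-based page numbers and a hypothetical helper named page_request:

```python
def page_request(page, size=5):
    """Build a match_all request body for a 1-based page number (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "size": size,
        "from": (page - 1) * size,  # offset of the first hit on this page
    }


first = page_request(1)      # hits 0..4
third = page_request(3, 5)   # hits 10..14
print(first["from"], third["from"])
```

The resulting dictionary has the same shape as the JSON body shown above and could be serialized with json.dumps before sending.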

3. Return only selected fields (_source)

_source keyword: a single field name or an array of field names that specifies which fields to show

# Show a single field
GET /ems/emp/_search
{
  "query": {
    "match_all": {}
  },
  "_source": "name"
}


# Show multiple fields
GET /ems/emp/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["name","age"]
}

4. Term query (term)

term keyword: queries by an exact term

# name is a text field and is analyzed, so every document whose name contains "张" matches
GET /ems/emp/_search
{
  "query": {
    "term": {
      "name": {
        "value": "张"
      }
    }
  }
}

# bir is a date field and is not analyzed, so the value is matched as a whole and this query finds no data
GET /ems/emp/_search
{
  "query": {
    "term": {
      "bir": {
        "value": "2012-12"
      }
    }
  }
}

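Summary:

NOTE 1: The term queries show that ES uses the standard analyzer (StandardAnalyzer) by default. The standard analyzer splits English text into words and Chinese text into single characters.

NOTE 2: The term queries also show that, among the ES mapping field types, keyword, date, integer, long, double, boolean and ip are not analyzed; only text is analyzed.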
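
The two notes can be illustrated without a cluster. The sketch below is a simplified stand-in for the standard analyzer, not its real implementation: it assumes one token per Han character and one lowercased token per run of ASCII letters and digits. The names analyze and term_matches are hypothetical.

```python
import re

# Simplified model of the standard analyzer (an assumption for illustration):
# each Han character is a token, each ASCII alphanumeric run is a token.
TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def analyze(text):
    """Return the lowercased tokens that would be written to the index."""
    return [t.lower() for t in TOKEN.findall(text)]


def term_matches(field_type, stored_value, term):
    """text fields match on any single token; other types match the whole value."""
    if field_type == "text":
        return term in analyze(stored_value)
    return stored_value == term


names = ["小黑", "王小黑", "张小五", "win7", "梅超风", "张无忌"]
zhang_hits = [n for n in names if term_matches("text", n, "张")]
print(analyze("张无忌"), zhang_hits)
print(term_matches("keyword", "北京", "北"))  # keyword values are not split
```

In this model a term query for "张" finds both names containing that character, while a keyword value such as "北京" only matches when the whole value is given.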

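5. Range query (range)

range keyword: matches documents whose field value falls within a given range

It is typically used on fields with ordered values, such as the numeric field age.

# Find documents with age >= 5 and age <= 10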
GET /ems/emp/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 5,
        "lte": 10
      }
    }
  }
}

6. Prefix query (prefix)

prefix keyword: retrieves documents that contain a term starting with the given prefix

GET /ems/emp/_search
{
  "query": {
    "prefix": {
      "name": {
        "value": "无"
      }
    }
  }
}

# Result:
{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "emp",
        "_id" : "BVSspHYBh-o7eO8i7bUf",
        "_score" : 1.0,
        "_source" : {
          "name" : "张无忌",
          "age" : 59,
          "bir" : "2012-12-12",
          "content" : "ElasticSearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎，基于RESTful web接口",
          "address" : "北京"
        }
      }
    ]
  }
}

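Note: the prefix is not matched against the name value in the source document (the name "张无忌" does not start with "无"). It is matched against the terms that the analyzer wrote to the inverted index. Here "张无忌" is analyzed into "张", "无" and "忌", and a match on any of these terms points to that document.

7. Wildcard query (wildcard)

wildcard keyword: wildcard query; ? matches exactly one arbitrary character, * matches zero or more arbitrary characters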

GET /ems/emp/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "张?"
      }
    }
  }
}

Here, too, the pattern is matched against the terms in the inverted index.
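
To see what matching against indexed terms means for prefix and wildcard queries, the following Python sketch applies both to the tokens of each name. It assumes a simplified tokenizer (one token per Han character, one per ASCII alphanumeric run) in place of the real standard analyzer, and uses fnmatch from the standard library to approximate the ? and * wildcards. The function names are hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Simplified tokenizer standing in for the standard analyzer (an assumption).
TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def analyze(text):
    return [t.lower() for t in TOKEN.findall(text)]


def prefix_hits(names, prefix):
    """Names with at least one indexed term that starts with the prefix."""
    return [n for n in names if any(t.startswith(prefix) for t in analyze(n))]


def wildcard_hits(names, pattern):
    """Names with at least one indexed term that matches the whole pattern."""
    return [n for n in names if any(fnmatchcase(t, pattern) for t in analyze(n))]


names = ["小黑", "王小黑", "张小五", "win7", "梅超风", "张无忌"]
print(prefix_hits(names, "无"))     # matches via the middle character's term
print(wildcard_hits(names, "张?"))  # needs a two-character term
print(wildcard_hits(names, "wi??"))
```

In this model the pattern "张?" finds nothing, because every Chinese term is a single character and ? requires exactly one more; "张*" or a plain term query for "张" would match instead.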

8. Multi-ID query (ids)

ids keyword: takes an array of IDs and returns the corresponding documents

GET /ems/emp/_search
{
  "query": {
    "ids": {
      "values": ["AlSspHYBh-o7eO8i7bUf","BVSspHYBh-o7eO8i7bUf"]
    }
  }
}

9. Fuzzy query (fuzzy)

fuzzy keyword: finds documents containing terms similar to the given keyword

GET /ems/emp/_search
{
  "query": {
      "fuzzy": {
        "content": "sprin"
      }
  }
}

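# The search keyword has length 5, so one edit is allowed; the indexed term spring differs from it by exactly 1 and therefore matches

Fuzzy query rule: the maximum allowed edit distance is between 0 and 2.

Keyword length 0-2: no fuzziness allowed (0 edits)
Keyword length 3-5: one edit allowed (0 or 1)
Keyword length greater than 5: at most 2 edits allowed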
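
The length rule above can be written down directly. The sketch below is a minimal Python illustration, assuming plain Levenshtein distance (insertions, deletions, substitutions) as the edit measure; Elasticsearch additionally counts a transposition of two adjacent characters as one edit, which this simplified version does not. The function names are hypothetical.

```python
def auto_fuzziness(keyword):
    """Maximum number of edits allowed for a keyword, following the length rule."""
    n = len(keyword)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2


def edit_distance(a, b):
    """Plain Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def fuzzy_matches(keyword, term):
    return edit_distance(keyword, term) <= auto_fuzziness(keyword)


print(auto_fuzziness("sprin"), edit_distance("sprin", "spring"))
print(fuzzy_matches("sprin", "spring"), fuzzy_matches("sp", "spr"))
```

The keyword "sprin" has length 5 and is one insertion away from the indexed term spring, so it matches; a two-character keyword must match exactly.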

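10. Boolean query (bool)

bool keyword: combines multiple conditions into a complex query

must: like &&, every clause must match
should: like ||, at least one clause must match
must_not: like !, none of the clauses may match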

GET /ems/emp/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "age": {
              "gte": 5,
              "lte": 10
            }
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "address": {
              "value": "南"
            }
          }
        }
      ]
    }
  }
}
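
The effect of the query above on the test data can be worked through in plain Python. This is a sketch under simplifying assumptions: clauses are modeled as predicate functions, scoring is ignored, and should is only enforced when no must clause is present. The names bool_filter, age_between and address_is are hypothetical.

```python
docs = [
    {"name": "小黑", "age": 23, "address": "北京"},
    {"name": "王小黑", "age": 24, "address": "上海"},
    {"name": "张小五", "age": 8, "address": "无锡"},
    {"name": "win7", "age": 9, "address": "南京"},
    {"name": "梅超风", "age": 43, "address": "杭州"},
    {"name": "张无忌", "age": 59, "address": "北京"},
]


def bool_filter(docs, must=(), should=(), must_not=()):
    """Keep documents that satisfy the combined clauses (scoring is ignored)."""
    hits = []
    for d in docs:
        if not all(clause(d) for clause in must):
            continue
        if any(clause(d) for clause in must_not):
            continue
        # Simplification: should is required only when there is no must clause.
        if should and not must and not any(clause(d) for clause in should):
            continue
        hits.append(d)
    return hits


def age_between(low, high):
    return lambda d: low <= d["age"] <= high


def address_is(value):
    # address is a keyword field, so the whole value has to be equal.
    return lambda d: d["address"] == value


as_written = [d["name"] for d in bool_filter(
    docs, must=[age_between(5, 10)], must_not=[address_is("南")])]
full_value = [d["name"] for d in bool_filter(
    docs, must=[age_between(5, 10)], must_not=[address_is("南京")])]
print(as_written, full_value)
```

Since address is a keyword field, the term "南" does not equal "南京", so in this model the must_not clause as written excludes nothing; excluding the Nanjing document requires the full value "南京".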

11. Highlighting (highlight)

highlight keyword: highlights the matching keywords in the returned documents

GET /ems/emp/_search
{
  "query": {
    "term": {
      "name": {
        "value": "五"
      }
    }
  },
  "highlight": {
    "fields": {
      "name":{}
    }
  }
}

# Result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "emp",
        "_id" : "AlSspHYBh-o7eO8i7bUf",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "张小五",
          "age" : 8,
          "bir" : "2012-12-12",
          "content" : "Spring Cloud 作为Java 语言的微服务框架，它依赖于Spring Boot，有快速开发、持续交付和容易部署等特点。Spring Cloud 的组件非常多，涉及微服务的方方面面，井在开源社区Spring 和Netflix 、Pivotal 两大公司的推动下越来越完善",
          "address" : "无锡"
        },
        "highlight" : {
          "name" : [
            "张小<em>五</em>"
          ]
        }
      }
    ]
  }
}

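highlight is applied to the results of the query, so it is written next to "query" in the request body. It does not modify the original data: a separate highlight field is added to each hit, and the matched term in it is wrapped in <em> tags by default.

Custom highlight HTML tags: use pre_tags and post_tags inside highlight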

GET /ems/emp/_search
{
  "query": {
    "term": {
      "name": {
        "value": "五"
      }
    }
  },
  "highlight": {
    "pre_tags": ["<span style='color:red'>"], 
    "post_tags": ["</span>"], 
    "fields": {
      "name":{}
    }
  }
}

# Result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "ems",
        "_type" : "emp",
        "_id" : "AlSspHYBh-o7eO8i7bUf",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "张小五",
          "age" : 8,
          "bir" : "2012-12-12",
          "content" : "Spring Cloud 作为Java 语言的微服务框架，它依赖于Spring Boot，有快速开发、持续交付和容易部署等特点。Spring Cloud 的组件非常多，涉及微服务的方方面面，井在开源社区Spring 和Netflix 、Pivotal 两大公司的推动下越来越完善",
          "address" : "无锡"
        },
        "highlight" : {
          "name" : [
            "张小<span style='color:red'>五</span>"
          ]
        }
      }
    ]
  }
}
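
The shape of the two results above can be mimicked in a few lines of Python. This is only a sketch of the idea that the highlighted fragment is stored next to _source without changing it; the function highlight is hypothetical and uses simple string replacement, which is not how Elasticsearch builds fragments internally.

```python
def highlight(value, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap every occurrence of term in the given tags (illustration only)."""
    return value.replace(term, f"{pre_tag}{term}{post_tag}")


hit = {"_source": {"name": "张小五"}}

# Default tags, added as a separate field next to _source.
hit["highlight"] = {"name": [highlight(hit["_source"]["name"], "五")]}

# Custom tags, corresponding to pre_tags and post_tags in the request.
custom = highlight(hit["_source"]["name"], "五",
                   pre_tag="<span style='color:red'>", post_tag="</span>")

print(hit["highlight"]["name"], custom, hit["_source"]["name"])
```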

12. Multi-field query (multi_match)

Matching on a single field does not always give good results, so a query can match several fields at once

# "query" is the search keyword; "fields" lists the fields to search
GET /ems/emp/_search
{
  "query": {
    "multi_match": {
      "query": "中国",
      "fields": ["name","content"]
    }
  }
}

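Note: whether the keyword is split into terms before searching depends on whether each specified field is analyzed

13. Multi-field analyzed query (query_string)

This query lets you specify an analyzer

# ik_max_word is provided by the IK analysis plugin, which has to be installed separately
# the dangdang index used here is not created in this article
GET /dangdang/book/_search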
{
  "query": {
    "query_string": {
      "query": "中国声音",
      "analyzer": "ik_max_word", 
      "fields": ["name","content"]
    }
  }
}