The Elasticsearch Search API

一户董

Last modified: 2024-03-27 16:07:33

First published: 2024-02-08 17:50:14

Article link: https://blog.csdn.net/wang0907/article/details/136078494

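Preface

This article looks at querying, the part of Elasticsearch we touch most often in day-to-day work, so it is worth studying carefully.

1: Classification by HTTP request style

By HTTP request style, searches fall into 2 categories:

1: URI search, based on GET query parameters
2: request body search, based on a POST body, which uses the ES DSL (domain specific language)

Whichever style you use, you must specify what to search in the request path: a single index, a comma-separated list of indices, a wildcard pattern, or no index at all to search every index.

Let's look at URI search and request body search in turn.

For the test data, see the link in the original post.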

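2: URI search

This style runs a query by setting parameters on the URI. The available parameters are:

1: q, the value to search for, written in query string syntax; this is relatively complex and is covered separately below
2: df, the default field; if it is omitted, all fields are searched
3: sort, sorting
4: from and size, pagination
5: profile, shows how the query was executed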
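
To make the parameter list concrete, here is a minimal sketch that assembles a URI search URL with the Python standard library. The host http://localhost:9200, the helper name build_uri_search, and the sort value year:desc are assumptions for illustration; the movies index and the title and year fields come from the examples in this article. The profile flag is left out because it is commonly sent in the request body as "profile": true.

```python
from urllib.parse import urlencode

def build_uri_search(base, indices, q, df=None, sort=None, from_=None, size=None):
    """Build a URI search URL. `indices` is a list of index names or patterns."""
    # An empty list means "search every index", i.e. the path is just /_search.
    if indices:
        path = "/" + ",".join(indices) + "/_search"
    else:
        path = "/_search"
    params = {"q": q}
    if df is not None:
        params["df"] = df          # default field for unscoped terms
    if sort is not None:
        params["sort"] = sort      # e.g. "year:desc"
    if from_ is not None:
        params["from"] = from_     # offset of the first hit
    if size is not None:
        params["size"] = size      # page size
    # urlencode percent-encodes reserved characters such as ":".
    return base + path + "?" + urlencode(params)

url = build_uri_search("http://localhost:9200", ["movies"], "2012",
                       df="title", sort="year:desc", from_=0, size=10)
print(url)
```

Passing several names in the list produces the comma-separated form, and a pattern such as movies* is passed through unchanged.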

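2.1: Unscoped query

Query for documents that contain 2012 (q=2012).

To see how the query was executed, look under "profile" -> "shards" -> "searches" in the response.

Unscoped query with df specified

The executed query can be checked in the profile output in the same way.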

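2.2: Querying a specific field

Specify the field inside q, in the form field:value.

The executed query can again be checked in the profile output.

2.3: Term vs Phrase

Suppose we want the documents whose title contains "Beautiful Mind". The first idea might be to query with q=title:Beautiful Mind.

However, this does not give the result we expect. The profile output shows why: only Beautiful is scoped to title, while Mind becomes an unscoped query against all fields.

This is roughly the same as the following SQL: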

select * from t where title like "%Beautiful%"
union
(
    select * from t where title like "%Mind%"
    union
    select * from t where id like "%Mind%"
    union
    select * from t where year like  "%Mind%"
    ...
)

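So how do we get ES to treat "Beautiful Mind" as a single phrase? This calls for a PhraseQuery, which is simple to request: wrap the phrase in double quotes, as in q=title:"Beautiful Mind".

The executed query can again be confirmed in the profile output.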

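2.4: Grouping

Parentheses group several terms under one field, as in title:(Last OR Christmas).

2.4.1: AND, OR

AND

An AND query such as q=title:(Beautiful AND Mind) requires both terms. An equivalent way to write it is q=title:(+Beautiful +Mind); in a URL the + sign has to be encoded as %2B.

That is, each AND operand is handled as if it carried the + prefix, i.e. as a must clause. To find titles that contain Beautiful but not Mind, put - in front of Mind.

Conceptually, the query first runs the TermQuery title:beautiful to get result set A (titles containing Beautiful), then runs the TermQuery title:mind to get result set B (titles containing Mind), and finally takes the difference A minus B as the result.

OR

OR works much like AND, so the two are best studied side by side.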
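
The set view of AND, OR and - described above can be sketched with plain Python sets. The tiny inverted index below is hypothetical data, not the article's test data, and real Lucene execution is more involved than three set operations; this only illustrates the semantics.

```python
# Toy inverted index for the title field: term -> set of document ids (hypothetical data).
title_index = {
    "beautiful": {1, 2},
    "mind": {2, 3},
}

def docs_with(term):
    """Look up a term the way a TermQuery would, after lowercasing it."""
    return title_index.get(term.lower(), set())

# title:(Beautiful AND Mind): both terms are required, so intersect.
both = docs_with("Beautiful") & docs_with("Mind")

# title:(Beautiful OR Mind): either term is enough, so take the union.
either = docs_with("Beautiful") | docs_with("Mind")

# title:(Beautiful -Mind): result set A minus result set B.
only_beautiful = docs_with("Beautiful") - docs_with("Mind")

print(sorted(both), sorted(either), sorted(only_beautiful))
```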

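2.4.2: Range queries

Query string syntax writes ranges with brackets: square brackets for inclusive bounds and curly braces for exclusive bounds. For illustration: year:[2002 TO 2018], year:{2002 TO 2018}, year:>=1980.

2.4.3: Wildcards, fuzzy matching, regular expressions

For illustration: title:b* (wildcard), title:beautifl~1 (fuzzy match within edit distance 1), title:/[bt]oy/ (regular expression).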
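
Expressions in query string syntax contain characters such as brackets, spaces and slashes that need percent-encoding when sent as the q parameter. The sketch below encodes a few expressions of the kinds this section covers; the concrete expressions are illustrative assumptions built on the movies index with its title and year fields, not the author's original examples.

```python
from urllib.parse import quote, unquote

# Illustrative query string expressions (assumed examples).
expressions = {
    "inclusive range": "year:[2002 TO 2018]",
    "exclusive range": "year:{2002 TO 2018}",
    "open-ended range": "year:>=1980",
    "wildcard": "title:b*",
    "fuzzy": "title:beautifl~1",
    "regular expression": "title:/[bt]oy/",
}

# safe="" makes quote() encode every reserved character, including "/" and ":".
encoded = {label: quote(expr, safe="") for label, expr in expressions.items()}

for label, value in encoded.items():
    print(f"{label}: GET /movies/_search?q={value}")
```

Tools such as Kibana Dev Tools usually accept the unencoded form; the encoding matters mainly when the URL is built by hand or in code.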

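3: request body search

3.1: Match all

A match_all query returns every document. The profile output shows that the executed query is a MatchAllDocsQuery.

3.2: Pagination

Pagination uses from and size. Although this is a paged query, it is executed in the same way as the full query, as a MatchAllDocsQuery, and the further back the requested page is, the less efficient the query becomes. It is therefore best to avoid deep from/size pagination in ES and to use another approach for it.

3.3: Sorting

The executed query is again a MatchAllDocsQuery, so sorting is not cheap either and should be used with care.

3.4: Returning only selected fields

A normal query is like SQL select *; restricting the returned fields is like SQL select order_date, order_id.

3.5: Script fields

This is like SQL select concat(order_date, '_hello') as my_custom_field.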
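
Since the request bodies for sections 3.1 to 3.5 are not shown in text form, here is a hedged sketch of one body that combines them. The index name kibana_sample_data_ecommerce, the paging values, and the exact Painless expression are assumptions; order_date, order_id and my_custom_field are the names used above.

```python
import json

body = {
    # 3.2: pagination, skip 10 hits and return the next 20
    "from": 10,
    "size": 20,
    # 3.3: sorting, newest orders first
    "sort": [{"order_date": "desc"}],
    # 3.4: only return these fields, like "select order_date, order_id"
    "_source": ["order_date", "order_id"],
    # 3.5: a computed field, like "concat(order_date, '_hello') as my_custom_field"
    "script_fields": {
        "my_custom_field": {
            "script": {
                "lang": "painless",
                "source": "doc['order_date'].value + '_hello'",
            }
        }
    },
    # 3.1: match every document
    "query": {"match_all": {}},
}

print("POST /kibana_sample_data_ecommerce/_search")  # assumed index name
print(json.dumps(body, indent=2))
```

The printed text can be pasted into a console such as Kibana Dev Tools; each key can also be used on its own.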

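3.6: match query

This has the same effect as the URI search GET movies/_search?q=title:(Last OR Christmas), that is, the default operator is OR. The operator parameter sets the operator explicitly, for example "operator": "and".

3.7: match_phrase (phrase query)

This is equivalent to the URI search GET movies/_search?q=title:"Last Christmas".
To allow other words in between, set slop, for example "slop": 1 for the phrase "one love", which is equivalent to the URI search GET movies/_search?q=title:"one love"~1.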
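
The three variants discussed in 3.6 and 3.7 can be written out as request bodies. This is a sketch built from the URI equivalents quoted above; the variable names are only for illustration.

```python
import json

# match with the default operator (OR): title contains Last or Christmas
match_or = {"query": {"match": {"title": "Last Christmas"}}}

# match with an explicit operator: title must contain both terms
match_and = {
    "query": {
        "match": {"title": {"query": "Last Christmas", "operator": "and"}}
    }
}

# match_phrase with slop 1: "one" and "love" may have one other word between them
phrase_with_slop = {
    "query": {"match_phrase": {"title": {"query": "one love", "slop": 1}}}
}

for name, body in [("match (OR)", match_or),
                   ("match (AND)", match_and),
                   ("match_phrase (slop 1)", phrase_with_slop)]:
    print(f"# {name}")
    print("POST /movies/_search")
    print(json.dumps(body, indent=2))
```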

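3.8: query_string query

Writing query_string under query in the DSL is a bit like the unscoped query.

It is the same as the URI search GET movies/_search?q=on. It is simply another way of writing the same thing; one more option that fits more situations.
A field can also be specified by adding default_field.

3.9: simple_query_string query

Writing simple_query_string under query in the DSL is also a bit like the unscoped query.

If you need this kind of query, query_string is usually enough; it is sufficient to know that simple_query_string exists.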
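
For comparison, here is a minimal sketch of both bodies. The query text and the choice of title as the field are assumptions for illustration. One difference worth knowing: simple_query_string does not treat AND and OR as operators (it uses symbols such as + and |), so the default_operator setting is used below instead.

```python
import json

# query_string: full query string syntax, restricted to one field via default_field
query_string_body = {
    "query": {
        "query_string": {
            "default_field": "title",
            "query": "Last AND Christmas",
        }
    }
}

# simple_query_string: a more forgiving syntax; fields are listed in "fields"
simple_query_string_body = {
    "query": {
        "simple_query_string": {
            "query": "Last Christmas",
            "fields": ["title"],
            "default_operator": "AND",
        }
    }
}

for body in (query_string_body, simple_query_string_body):
    print("POST /movies/_search")
    print(json.dumps(body, indent=2))
```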

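4: Term-level queries vs full-text queries

The examples analysed above all used full-text queries.

When a query is actually executed, it works in one of two ways: as a term query, i.e. a term-level query, or as a full-text query. A term-level query searches for exactly what was entered, with no processing at all. A full-text query first runs the input through an analyzer to produce a set of terms, then looks up each term and combines the results.

Let's look at both in detail.

4.1: Term-level queries

The defining feature of these queries is that the search input is not analysed: what you type is what gets looked up. The term-level queries covered here are term, range, exists, prefix, wildcard and regexp.
Let's go through them one by one.

4.1.1: term query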

POST /products/_bulk
{"index": {"_id": 1}}
{"productID": "XHDK-A-1293-#fJ3","desc": "iPhone"}
{"index": {"_id": 2}}
{"productID": "KDKE-B-9947-#kL5","desc": "iPad"}
{"index": {"_id": 3}}
{"productID": "JODL-X-1937-#pV7","desc": "MBP"}


# 1: No hits, because the term stored in the inverted index after analysis is ipad
POST products/_search
{
  "query": {
    "term": {
      "desc": "iPad"
    }
  }
}

# 2: The lowercase form is found
POST products/_search
{
  "query": {
    "term": {
      "desc": "ipad"
    }
  }
}
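
Why the first search misses and the second hits can be reproduced with a toy model. The analyze function below is a rough stand-in for the default analyzer (split on non-alphanumeric characters, then lowercase), not its real implementation, and the helper names are hypothetical.

```python
import re

def analyze(text):
    """Toy analyzer: split on non-alphanumeric characters and lowercase each token."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

# The desc values from the bulk request above.
docs = {1: "iPhone", 2: "iPad", 3: "MBP"}

# Index time: the field value is analysed, so only lowercase terms are stored.
inverted = {}
for doc_id, desc in docs.items():
    for term in analyze(desc):
        inverted.setdefault(term, set()).add(doc_id)

def term_query(value):
    """Term-level lookup: the input is used as-is, with no analysis."""
    return inverted.get(value, set())

def match_query(text):
    """Full-text lookup: the input is analysed first, then each term is looked up."""
    hits = set()
    for term in analyze(text):
        hits |= inverted.get(term, set())
    return hits

print(term_query("iPad"))   # the raw input "iPad" is not a stored term
print(term_query("ipad"))   # the lowercase term is stored
print(match_query("iPad"))  # analysis lowercases the input before the lookup
```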

4.1.2: range query

POST /testrange/_bulk
{"index": {"_id": 1}}
{"price": 25}
{"index": {"_id": 2}}
{"price": 30}
{"index": {"_id": 3}}
{"price": 36}

POST /testrange/_search
{
  "query": {
    "range": {
      "price": {
        "gt": 25,
        "lte": 36
      }
    }
  }
}
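
The bound keywords of a range query map directly onto comparison operators. The sketch below applies them to the three prices indexed above, using one lower bound (gt 25) and one upper bound (lte 36), which is an assumed combination; the helper name in_range is hypothetical.

```python
import operator

# Range keywords and the comparison each one stands for.
OPS = {
    "gt": operator.gt,   # greater than
    "gte": operator.ge,  # greater than or equal to
    "lt": operator.lt,   # less than
    "lte": operator.le,  # less than or equal to
}

def in_range(value, bounds):
    """Return True when `value` satisfies every bound in `bounds`."""
    return all(OPS[name](value, limit) for name, limit in bounds.items())

prices = {1: 25, 2: 30, 3: 36}  # _id -> price, as in the bulk request

hits = [doc_id for doc_id, price in prices.items()
        if in_range(price, {"gt": 25, "lte": 36})]
print(hits)
```

Document 1 is excluded because gt is a strict comparison, while document 3 is included because lte is inclusive.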

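4.1.3: exists query

Matches documents in which the field has an indexed value. According to the Elasticsearch documentation, a field whose value is null or [] counts as missing, while an empty string counts as present:

# 1: Delete the old index so it does not interfere
DELETE /products

# 2: Insert test data
POST /products/_bulk
{"index": {"_id": 1}}
{"name":"sam","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}
{"index": {"_id": 2}}
{"name":"","age":52,"man":true,"child":["jhon","lily"],"address":{"country":"CN"}}
{"index": {"_id": 3}}
{"name":null,"age":30,"man":true,"child":["kobe","lily"],"address":{"country":"CN"}}

# 3: Requires the name field to exist with a value; null counts as no value but an empty string counts as a value, so _id=1 and _id=2 are expected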
POST /products/_search
{
  "query": {
    "exists": {
      "field": "name"
    }
  }
}
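
The documented rules for what counts as an existing field can be modelled in a few lines. This is a sketch of those rules only (null and [] are missing, an empty string is present); it was not run against a cluster, and field_exists is a hypothetical helper.

```python
def field_exists(doc, field):
    """Model of the documented exists rules for a single top-level field."""
    value = doc.get(field)
    if value is None:
        return False  # field absent or explicitly null
    if isinstance(value, list):
        # [] and [null] are missing; a list with at least one non-null value exists
        return any(item is not None for item in value)
    return True       # any other value, including "", counts as existing

docs = {
    1: {"name": "sam"},
    2: {"name": ""},
    3: {"name": None},
}

hits = [doc_id for doc_id, doc in docs.items() if field_exists(doc, "name")]
print(hits)
```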

4.1.4: prefix query

Matches by prefix, similar to SQL like "xx%":

# 1: Delete the old index so it does not interfere
DELETE /products

# 2: Insert test data
POST /products/_bulk
{"index": {"_id": 1}}
{"name":"sam","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}
{"index": {"_id": 2}}
{"name":"","age":52,"man":true,"child":["jhon","lily"],"address":{"country":"CN"}}
{"index": {"_id": 3}}
{"name":null,"age":30,"man":true,"child":["kobe","lily"],"address":{"country":"CN"}}
{"index": {"_id": 4}}
{"name":"Sack","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}

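# 3: Because this is a term-level query, Sa is not lowercased before the lookup, while the terms in the inverted index are lowercase, so nothing is found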
POST /products/_search
{
  "query": {
    "prefix": {
      "name": "Sa"
    }
  }
}

# 4: Lowercase sa finds the documents with _id=1 and _id=4
POST /products/_search
{
  "query": {
    "prefix": {
      "name": "sa"
    }
  }
}

4.1.5: wildcard query

A query based on wildcards:

# 1: Delete the old index so it does not interfere
DELETE /products

# 2: Insert test data
POST /products/_bulk
{"index": {"_id": 1}}
{"name":"sam","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}
{"index": {"_id": 2}}
{"name":"","age":52,"man":true,"child":["jhon","lily"],"address":{"country":"CN"}}
{"index": {"_id": 3}}
{"name":null,"age":30,"man":true,"child":["kobe","lily"],"address":{"country":"CN"}}
{"index": {"_id": 4}}
{"name":"Sack","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}

# 3: Finds the document whose name is Sack
POST /products/_search
{
  "query": {
    "wildcard": {
      "name": "*ac*"
    }
  }
}

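4.1.6: regexp query

A regular expression query is more flexible than a wildcard query, but it can easily cause performance problems.

# 1: Delete the old index so it does not interfere
DELETE /products

# 2: Insert test data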
POST /products/_bulk
{"index": {"_id": 1}}
{"name":"sam","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}
{"index": {"_id": 2}}
{"name":"","age":52,"man":true,"child":["jhon","lily"],"address":{"country":"CN"}}
{"index": {"_id": 3}}
{"name":null,"age":30,"man":true,"child":["kobe","lily"],"address":{"country":"CN"}}
{"index": {"_id": 4}}
{"name":"Sack","age":30,"man":true,"child":["jhon","lily"],"address":{"country":"US"}}

# 3: Finds documents whose name term consists only of letters (either case) and ends with k
POST /products/_search
{
  "query": {
    "regexp": {
      "name": "[a-zA-Z]*k"
    }
  }
}
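
The prefix, wildcard and regexp examples all operate on the lowercase terms in the inverted index, which is why Sa finds nothing while sa does. The sketch below imitates the three lookups over the two non-empty name terms from the test data. It uses fnmatch and re as stand-ins, so pattern details differ from Lucene's own syntax; re.fullmatch is used because these patterns are matched against the whole term.

```python
import re
from fnmatch import fnmatchcase

# Lowercase terms for the name field -> document ids (from "sam" and "Sack").
name_terms = {"sam": {1}, "sack": {4}}

def collect(predicate):
    """Union of the document ids of every term accepted by `predicate`."""
    hits = set()
    for term, doc_ids in name_terms.items():
        if predicate(term):
            hits |= doc_ids
    return hits

def prefix_query(prefix):
    return collect(lambda term: term.startswith(prefix))

def wildcard_query(pattern):
    return collect(lambda term: fnmatchcase(term, pattern))

def regexp_query(pattern):
    return collect(lambda term: re.fullmatch(pattern, term) is not None)

print(prefix_query("Sa"))            # no stored term starts with uppercase "Sa"
print(prefix_query("sa"))            # both "sam" and "sack" start with "sa"
print(wildcard_query("*ac*"))        # only "sack" contains "ac"
print(regexp_query("[a-zA-Z]*k"))    # only "sack" ends with "k"
```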

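4.2: Full-text queries

A full-text search first hands the query text to an analyzer, which splits it into a number of terms; each term is then looked up, and the results are combined into the final result. The full-text queries include the following:

Match Query:
    {
        "query": {
            "match": {
                "title": "Last Christmas"
            }
        }
    }
Match Phrase Query:
    {
        "query": {
            "match_phrase": {
                "title": "Last Christmas"
            }
        }
    }
Query String Query (the unscoped query; default_field is optional and restricts the search to one field):
    {
        "query": {
            "query_string": {
                "default_field": "title",
                "query": "text to search for"
            }
        }
    }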