Simple but Fancy Queries in Elasticsearch

SiuMu_

Article link: https://blog.csdn.net/SiuMu_/article/details/109551374

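This article shows how to create an index and insert documents in Elasticsearch, and how to run basic queries, multi-field queries, fuzzy queries, and bool queries. Worked examples search documents by gender, by description, and across multiple fields, so beginners can get started quickly.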

If heaven hasn't given you what you want, it isn't because you deserve something better; it's because you don't deserve it.

Preparation

Last time I covered how to install Elasticsearch; this time we'll look at how to use it. There isn't much to prepare, just the following:

Elasticsearch is up and running
Kibana is up and running

Then open Kibana and go to the Dev Tools page. That way we can read the official documentation and practice queries in the same place.

Indices and Documents

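First we need to get two concepts straight: indices and documents. An index is roughly like a database, and a document is roughly like one record in that database. That understanding is good enough for now.
Next, let's create an index called xiumu_user by running this command in Kibana: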

PUT xiumu_user

ES returns a response like this:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "xiumu_user"
}

The index has been created. A GET request shows its details:

GET xiumu_user

{
  "xiumu_user" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1604752426171",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "wtR8dESSSaGygWfRItccAw",
        "version" : {
          "created" : "7090299"
        },
        "provided_name" : "xiumu_user"
      }
    }
  }
}

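That is all there is to an index for now. Every record in a database has a unique identifier, the primary key; likewise, every document in ES has a document ID that uniquely identifies it. So inserting a document looks like this: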

PUT xiumu_user/_doc/1
{
  "id": 1,
  "username": "xiumu",
  "age": 22,
  "gender": "男",
  "description": "一个学习Java的菜鸟"
}

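This request inserts a document into the xiumu_user index. The document's type is _doc and its ID is 1. Note that the ID is the 1 that follows _doc in the path, not the id field inside the braces. The content inside the braces is the document itself, written as JSON.
One more point: the three parts of xiumu_user/_doc/1 are index/type/ID. Mapping types are deprecated in Elasticsearch 7.x and removed in 8.0, so we simply use _doc as the type everywhere and only care about the index and the ID. ES returns the following: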

{
  "_index" : "xiumu_user",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Next, let's add some more data to prepare for our fancy queries. A bulk insert looks like this:

PUT xiumu_user/_bulk
{"index":{"_index":"xiumu_user","_id":2}}
{"id": 2,"username": "亚索","age": 200,"gender": "男","description":"风一样的男人"}
{"index":{"_index":"xiumu_user","_id":3}}
{"id": 3,"username": "伊泽瑞尔","age": 100,"gender": "男","description": "有位移的ADC"}
{"index":{"_index":"xiumu_user","_id":4}}
{"id": 5,"username": "寒冰射手","age": 20,"gender": "女","description": "大招会拐弯"}
{"index":{"_index":"xiumu_user","_id":5}}
{"id": 6,"username": "九尾妖狐","age": 18,"gender": "女","description": "我们心有灵犀"}
{"index":{"_index":"xiumu_user","_id":6}}
{"id": 6,"username": "赵信","age": 28,"gender": "男","description": "一点寒芒先到，随后枪出如龙"}

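Note that each JSON object must sit on a single line with no line breaks inside it. The body alternates: one line for the action, one line for the data, and so on. On success, ES returns JSON like the following; the took field is the execution time in milliseconds.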
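The alternating layout is easy to get wrong by hand. As a rough sketch, the body could be generated from Python as shown here. The helper name build_bulk_body is made up for illustration and is not part of any Elasticsearch client; the sketch only builds the text and does not send it anywhere.

```python
import json


def build_bulk_body(index_name, docs):
    """Return a bulk body: one action line, then one source line, per document.

    build_bulk_body is a hypothetical helper written for this example.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: names the index and _id that the next line belongs to.
        action = {"index": {"_index": index_name, "_id": doc_id}}
        lines.append(json.dumps(action, ensure_ascii=False))
        # Source line: json.dumps without indent never emits raw newlines,
        # so each document stays on a single line.
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    (2, {"id": 2, "username": "亚索", "age": 200, "gender": "男", "description": "风一样的男人"}),
    (3, {"id": 3, "username": "伊泽瑞尔", "age": 100, "gender": "男", "description": "有位移的ADC"}),
]
bulk_body = build_bulk_body("xiumu_user", docs)
print(bulk_body)
```

Serializing each object with json.dumps, rather than formatting the lines by hand, is what guarantees the one-object-per-line rule.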

{
  "took" : 25,
  "errors" : false,
  "items" : [
    {
      "index" : {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "2",
        "_version" : 3,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 12,
        "_primary_term" : 6,
        "status" : 201
      }
    },
    {
      "index" : {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "3",
        "_version" : 3,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 13,
        "_primary_term" : 6,
        "status" : 201
      }
    },
    ......

The full response is rather long, so I've replaced the rest with an ellipsis.

Querying

How do we retrieve the data we just added? It's very simple:

GET xiumu_user/_search
{
  "query": {
    "match_all": {}
  }
}

As the name suggests, match_all returns everything. The response looks like this:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 6,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "username" : "xiumu",
          "age" : 22,
          "gender" : "男",
          "description" : "一个学习Java的菜鸟"
        }
      },
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "id" : 2,
          "username" : "亚索",
          "age" : 200,
          "gender" : "男",
          "description" : "风一样的男人"
        }
      },
      ......

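Again, I've cut the response short with an ellipsis.
1. Querying documents by a single field
There are several ways to do this, and we'll try them one by one. Suppose we want to query by gender, that is, by the gender field.

(1) match query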

GET xiumu_user/_search
{
  "query": {
    "match": {
      "gender": "女"
    }
  }
}

(2) query_string query

GET xiumu_user/_search
{
  "query": {
    "query_string": {
      "default_field": "gender",
      "query": "女"
    }
  }
}

(3) term query

GET xiumu_user/_search
{
  "query": {
    "term": {
      "gender": {
        "value": "女"
      }
    }
  }
}

We expect to get 九尾妖狐 and 寒冰射手 back, and that is exactly what happens. All three queries return this result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.1631508,
    "hits" : [
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.1631508,
        "_source" : {
          "id" : 5,
          "username" : "寒冰射手",
          "age" : 20,
          "gender" : "女",
          "description" : "大招会拐弯"
        }
      },
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.1631508,
        "_source" : {
          "id" : 6,
          "username" : "九尾妖狐",
          "age" : 18,
          "gender" : "女",
          "description" : "我们心有灵犀"
        }
      }
    ]
  }
}
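The three request bodies differ only in their shape, which is easier to see side by side. The following sketch builds them as plain Python dictionaries; single_field_queries is a hypothetical helper named for this example, and nothing is sent to Elasticsearch.

```python
import json


def single_field_queries(field, value):
    """Return the match, query_string, and term bodies for one field and value.

    single_field_queries is a hypothetical helper written for this example.
    """
    return {
        "match": {"query": {"match": {field: value}}},
        "query_string": {
            "query": {"query_string": {"default_field": field, "query": value}}
        },
        "term": {"query": {"term": {field: {"value": value}}}},
    }


bodies = single_field_queries("gender", "女")
for name, body in bodies.items():
    # ensure_ascii=False keeps the Chinese value readable in the output.
    print(name, json.dumps(body, ensure_ascii=False))
```

Each printed body can be pasted under GET xiumu_user/_search in Dev Tools.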

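2. Fuzzy queries
Elasticsearch can tokenize Chinese text and then match the tokens against the keywords you search for. Let's try a fuzzy search on the description field. ("Fuzzy" here means loose full-text matching, not the Elasticsearch query type named fuzzy.)
(1) match query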

GET xiumu_user/_search
{
  "query": {
    "match": {
      "description": "有位移"
    }
  }
}

(2) query_string query

GET xiumu_user/_search
{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "有位移"
    }
  }
}

We search for the three characters "有位移". In theory this should find 伊泽瑞尔, and it does, but 九尾妖狐 comes back as well.

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 5.242115,
    "hits" : [
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 5.242115,
        "_source" : {
          "id" : 3,
          "username" : "伊泽瑞尔",
          "age" : 100,
          "gender" : "男",
          "description" : "有位移的ADC"
        }
      },
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.2199264,
        "_source" : {
          "id" : 6,
          "username" : "九尾妖狐",
          "age" : 18,
          "gender" : "女",
          "description" : "我们心有灵犀"
        }
      }
    ]
  }
}

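This touches on how Elasticsearch tokenizes text and scores matches. It splits each document field into tokens and stores them in an inverted index, which maps each token to the documents that contain it (the real structure is more involved than this summary, so it is worth reading up on). At search time the keywords are tokenized too and looked up in that index; the more of the query tokens a document matches, the higher its relevance score tends to be. All three characters match 伊泽瑞尔, so its "_score" is above 5, while 九尾妖狐 matches only the single character "有", so its "_score" is only a little above 1.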
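To make the idea concrete, here is a toy inverted index in Python. It is only a sketch under two loud assumptions: every Chinese character becomes its own token (real analyzers handle Latin words such as ADC differently), and the score is simply the number of distinct query tokens a document contains, which is not the scoring formula Elasticsearch actually uses.

```python
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): one token per letter, digit, or Chinese character."""
    return [ch for ch in text if ch.isalnum()]


def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def match(index, query):
    """Rank documents by how many distinct query tokens they contain (not real scoring)."""
    hits = defaultdict(int)
    for token in set(analyze(query)):
        for doc_id in index.get(token, ()):
            hits[doc_id] += 1
    # Highest count first; ties broken by document id.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


descriptions = {
    "2": "风一样的男人",
    "3": "有位移的ADC",
    "5": "我们心有灵犀",
}
inverted = build_inverted_index(descriptions)
ranked = match(inverted, "有位移")
print(ranked)  # document 3 matches three tokens, document 5 matches one
```

The ordering mirrors the response above: the document that matches all three characters ranks first, and the one that matches only "有" ranks second.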
(3) term query
This one behaves differently and needs its own explanation.

GET xiumu_user/_search
{
  "query": {
    "term": {
      "description": {
        "value": "有位移"
      }
    }
  }
}

It returns this result:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

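Nothing was found at all. Surprising? As I understand it, this comes down to the inverted index mentioned above. Let's see what tokens the six characters "有位移的男人" are split into: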

GET _analyze
{
  "text": ["有位移的男人"]
}

The result is:

{
  "tokens" : [
    {
      "token" : "有",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<IDEOGRAPHIC>",
      "position" : 0
    },
    {
      "token" : "位",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "<IDEOGRAPHIC>",
      "position" : 1
    },
    {
      "token" : "移",
      "start_offset" : 2,
      "end_offset" : 3,
      "type" : "<IDEOGRAPHIC>",
      "position" : 2
    },
    {
      "token" : "的",
      "start_offset" : 3,
      "end_offset" : 4,
      "type" : "<IDEOGRAPHIC>",
      "position" : 3
    },
    {
      "token" : "男",
      "start_offset" : 4,
      "end_offset" : 5,
      "type" : "<IDEOGRAPHIC>",
      "position" : 4
    },
    {
      "token" : "人",
      "start_offset" : 5,
      "end_offset" : 6,
      "type" : "<IDEOGRAPHIC>",
      "position" : 5
    }
  ]
}

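The text is split into individual Chinese characters. The match and query_string queries tokenize the keywords in the same way before matching them against documents. The term query is different. It is often called an exact query, but that does not mean the field value must match exactly; it means the keyword must match a token exactly. A term query does not tokenize the keyword. It treats the keyword as a single term, and a document matches only if that term exists in the inverted index. We searched for "有位移", but the inverted index holds only single characters and no three-character term, so nothing is found. A term query for a single character does return results.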
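The difference can be sketched with a toy inverted index in Python. This is an illustration, not how Elasticsearch is implemented: it assumes one token per Chinese character, and term_query and match_query are hypothetical function names chosen for this example.

```python
def analyze(text):
    """Toy analyzer (assumption): one token per letter, digit, or Chinese character."""
    return [ch for ch in text if ch.isalnum()]


descriptions = {"3": "有位移的ADC", "5": "我们心有灵犀"}

# Inverted index: token -> set of ids of the documents that contain it.
inverted = {}
for doc_id, text in descriptions.items():
    for token in analyze(text):
        inverted.setdefault(token, set()).add(doc_id)


def term_query(index, value):
    """Look the value up as ONE term, without analyzing it first."""
    return sorted(index.get(value, set()))


def match_query(index, text):
    """Analyze the text first, then return documents containing any of its tokens."""
    found = set()
    for token in analyze(text):
        found |= index.get(token, set())
    return sorted(found)


term_three_chars = term_query(inverted, "有位移")  # no such term in the index
term_one_char = term_query(inverted, "有")         # a single-character term exists
match_three_chars = match_query(inverted, "有位移")
print(term_three_chars, term_one_char, match_three_chars)
```

The term lookup for "有位移" comes back empty because the index only has single-character keys, while the same lookup for "有" and the analyzed match both find documents 3 and 5.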
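3. Multi-field queries
Sometimes we want to search for some keywords and get a document back if the keywords appear in the "username" field or in the "description" field, so that one set of keywords is matched against several fields.
(1) multi_match query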

GET xiumu_user/_search
{
  "query": {
    "multi_match": {
      "query": "伊泽 风 寒芒",
      "fields": ["username","description"]
    }
  }
}

In theory, this query returns every document whose username or description field contains any of the keywords.

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 2.955389,
    "hits" : [
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 2.955389,
        "_source" : {
          "id" : 3,
          "username" : "伊泽瑞尔",
          "age" : 100,
          "gender" : "男",
          "description" : "有位移的ADC"
        }
      },
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 2.1834397,
        "_source" : {
          "id" : 6,
          "username" : "赵信",
          "age" : 28,
          "gender" : "男",
          "description" : "一点寒芒先到，随后枪出如龙"
        }
      },
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.4900502,
        "_source" : {
          "id" : 2,
          "username" : "亚索",
          "age" : 200,
          "gender" : "男",
          "description" : "风一样的男人"
        }
      },
      {
        "_index" : "xiumu_user",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.4776945,
        "_source" : {
          "id" : 5,
          "username" : "寒冰射手",
          "age" : 20,
          "gender" : "女",
          "description" : "大招会拐弯"
        }
      }
    ]
  }
}

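We got the results we wanted. Note that 寒冰射手 is included because the single character "寒" from "寒芒" matches its username.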
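To see which field and which characters are responsible for each hit, the matching can be imitated in plain Python. This is a simplified sketch that assumes one token per Chinese character and ignores scoring entirely; multi_match here is a hypothetical local function, not the Elasticsearch implementation.

```python
def analyze(text):
    """Toy analyzer (assumption): one token per letter, digit, or Chinese character."""
    return [ch for ch in text if ch.isalnum()]


users = {
    "1": {"username": "xiumu", "description": "一个学习Java的菜鸟"},
    "2": {"username": "亚索", "description": "风一样的男人"},
    "3": {"username": "伊泽瑞尔", "description": "有位移的ADC"},
    "4": {"username": "寒冰射手", "description": "大招会拐弯"},
    "5": {"username": "九尾妖狐", "description": "我们心有灵犀"},
    "6": {"username": "赵信", "description": "一点寒芒先到，随后枪出如龙"},
}


def multi_match(docs, query, fields):
    """Return, per matching document, the query tokens found in each field."""
    query_tokens = set(analyze(query))
    result = {}
    for doc_id, source in docs.items():
        matched = {}
        for field in fields:
            overlap = query_tokens & set(analyze(source[field]))
            if overlap:
                matched[field] = sorted(overlap)
        if matched:  # a hit in any one of the fields is enough
            result[doc_id] = matched
    return result


hits = multi_match(users, "伊泽 风 寒芒", ["username", "description"])
for doc_id, matched in hits.items():
    print(doc_id, matched)
```

The same four documents come back as in the response above, and the output shows that document 4 is matched only through "寒" in its username.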

Other Queries

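Elasticsearch offers many other kinds of queries. Take the bool query: must, must_not, and should mean roughly what their names suggest (must match, must not match, and should match). A little practice makes them clear.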

GET xiumu_user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "username": "寒冰"
          }
        }
      ]
    }
  }
}

GET xiumu_user/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "username": "寒冰"
          }
        }
      ]
    }
  }
}

GET xiumu_user/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "username": "寒冰"
          }
        },
        {
          "query_string": {
            "default_field": "description",
            "query": "寒芒"
          }
        }
      ]
    }
  }
}
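The way the three clause types combine can be sketched in Python. This is a simplified model under stated assumptions: one token per Chinese character, a clause matches when the field shares at least one token with the keywords, and scoring is ignored. bool_query and field_matches are hypothetical names used only in this example.

```python
def analyze(text):
    """Toy analyzer (assumption): one token per letter, digit, or Chinese character."""
    return [ch for ch in text if ch.isalnum()]


users = {
    "1": {"username": "xiumu", "description": "一个学习Java的菜鸟"},
    "2": {"username": "亚索", "description": "风一样的男人"},
    "3": {"username": "伊泽瑞尔", "description": "有位移的ADC"},
    "4": {"username": "寒冰射手", "description": "大招会拐弯"},
    "5": {"username": "九尾妖狐", "description": "我们心有灵犀"},
    "6": {"username": "赵信", "description": "一点寒芒先到，随后枪出如龙"},
}


def field_matches(source, field, text):
    """True if the field shares at least one token with the query text."""
    return bool(set(analyze(text)) & set(analyze(source[field])))


def bool_query(docs, must=(), must_not=(), should=()):
    """Each clause is a (field, text) pair; returns the ids of matching documents."""
    hits = []
    for doc_id, source in docs.items():
        if not all(field_matches(source, f, t) for f, t in must):
            continue  # every must clause has to match
        if any(field_matches(source, f, t) for f, t in must_not):
            continue  # no must_not clause may match
        if should and not must and not any(
            field_matches(source, f, t) for f, t in should
        ):
            continue  # with no must clause, at least one should clause has to match
        hits.append(doc_id)
    return sorted(hits)


must_hits = bool_query(users, must=[("username", "寒冰")])
must_not_hits = bool_query(users, must_not=[("username", "寒冰")])
should_hits = bool_query(
    users, should=[("username", "寒冰"), ("description", "寒芒")]
)
print(must_hits, must_not_hits, should_hits)
```

In this model must selects only 寒冰射手, must_not selects everyone else, and should selects both 寒冰射手 (through its username) and 赵信 (through its description).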

What if you want to query several indices at once? Just list them, separated by commas. A multi-index query:

GET xiumu_user,xiumu_user0,xiumu_user1/_search
{
  "query": {
    "match_all": {}
  }
}
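If the list of indices is built in code, the path is just the names joined with commas. A minimal sketch follows; search_path is a hypothetical helper, and xiumu_user0 and xiumu_user1 are the example index names from the request above.

```python
def search_path(index_names):
    """Join index names with commas and append the _search endpoint."""
    names = [name.strip() for name in index_names]
    if not names or any(not name for name in names):
        raise ValueError("every index name must be non-empty")
    return ",".join(names) + "/_search"


path = search_path(["xiumu_user", "xiumu_user0", "xiumu_user1"])
print(path)
```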

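There is also highlighting. When we search on Baidu, the keywords in the results are shown in red. Elasticsearch can wrap matched keywords in tags of your choice in the same way: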

GET xiumu_user/_search
{
  "query": {
    "multi_match": {
      "query": "伊泽 风 寒芒",
      "fields": ["username","description"]
    }
  },
  "highlight": {
    "pre_tags": "<span>",
    "post_tags": "</span>",
    "fields": {
      "username": {},
      "description": {}
    }
  }
}
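What the pre_tags and post_tags settings do can be imitated in a few lines of Python. This is only a sketch: it assumes one token per Chinese character and wraps every matching character on its own, so the exact fragments Elasticsearch returns may differ. The highlight function here is a hypothetical local helper.

```python
def highlight(text, query, pre_tag="<span>", post_tag="</span>"):
    """Wrap each character of text that also appears in the query with the tags."""
    # Assumption: each letter, digit, or Chinese character of the query is one token.
    query_tokens = {ch for ch in query if ch.isalnum()}
    return "".join(
        pre_tag + ch + post_tag if ch in query_tokens else ch for ch in text
    )


highlighted_name = highlight("伊泽瑞尔", "伊泽 风 寒芒")
untouched = highlight("大招会拐弯", "伊泽 风 寒芒")
print(highlighted_name)
print(untouched)
```

A plain span tag does not change the color by itself; the page that displays the result still needs a style for it, for example a CSS rule that makes the span red.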