Elasticsearch Distributed Search Engine: Basic Queries

Most recently recommended article published on 2024-02-19 18:00:25

JunSIr_#

Category: Middleware. Tags: elasticsearch, search engine, java, es, database

Article link: https://blog.csdn.net/JunSIrhl/article/details/106062192

In the previous post, I summarized the basic operations on Elasticsearch indices and data:

An Introduction to the Elasticsearch Distributed Search Engine and Its CRUD Operations

This post covers querying in Elasticsearch, which is the core of working with it.

1.1 Basic queries

GET /index_name/_search
{
    "query":{
        "query_type":{
            "query_condition":"condition_value"
        }
    }
}

Here, query represents a query object, which can hold different query attributes.

Query type:
- for example match_all, match, term, range, and so on
Query condition: how the condition is written depends on the query type; the details are covered below
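
As a minimal sketch of the structure above, the request body can be assembled in Python before it is sent to Elasticsearch. The helper name build_query is hypothetical and not part of any Elasticsearch client; it only shows how the query type and its condition nest under query.

```python
import json


def build_query(query_type, condition):
    """Nest a condition under its query type inside the top-level "query" object."""
    return {"query": {query_type: condition}}


# match_all takes an empty condition; match takes a field name and the text to search for.
match_all_body = build_query("match_all", {})
match_body = build_query("match", {"title": "小米电视"})

print(json.dumps(match_all_body))
print(json.dumps(match_body, ensure_ascii=False))
```

The printed JSON is the same body that follows the GET /index_name/_search line in the console examples of this post.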

1.1.1 Querying all data in an index:

GET /testindex511/_search
{
    "query":{
        "match_all": {}
    }
}

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "fTzbAnIBL2UUkbPHjt8u",
        "_score": 1,
        "_source": {
          "title": "小米手机",
          "images": "http://image.com/12479122.jpg",
          "price": 2699,
          "stock": 200
        }
      },
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "ezzYAnIBL2UUkbPHEN9K",
        "_score": 1,
        "_source": {
          "title": "小米手机",
          "images": "http://image.com/12479122.jpg",
          "price": 2699
        }
      }
    ]
  }
}

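Some of the key fields:

took: time the query took, in milliseconds
timed_out: whether the query timed out
_shards: shard information
hits: overview object for the search results
    total: total number of matching documents
    max_score: highest document score among all results
    hits: array of matching documents; each element describes one document found
        _index: the index
        _type: document type
        _id: document id
        _score: document score
        _source: the document's source data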
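
To make the nesting of these fields concrete, here is a small Python sketch that reads them out of a trimmed copy of the response above. It assumes hits.total is a plain number, as it is in the response shown in this post.

```python
import json

# A trimmed copy of the match_all response shown above.
response_text = """
{
  "took": 3,
  "timed_out": false,
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {"_index": "testindex511", "_type": "goods", "_id": "fTzbAnIBL2UUkbPHjt8u",
       "_score": 1, "_source": {"title": "小米手机", "price": 2699, "stock": 200}},
      {"_index": "testindex511", "_type": "goods", "_id": "ezzYAnIBL2UUkbPHEN9K",
       "_score": 1, "_source": {"title": "小米手机", "price": 2699}}
    ]
  }
}
"""

response = json.loads(response_text)

# The outer "hits" is the overview object; the inner "hits" is the document array.
total = response["hits"]["total"]
ids = [hit["_id"] for hit in response["hits"]["hits"]]
titles = [hit["_source"]["title"] for hit in response["hits"]["hits"]]

print(total, ids, titles)
```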

1.1.2 Match query (match)

First, add one more document to make testing easier:

PUT /testindex511/goods/3
{
    "title":"小米电视4A",
    "images":"http://image.leyou.com/12479122.jpg",
    "price":3899.00
}

The or relationship:

A match query first analyzes the query text into terms and then searches for them; multiple terms are combined with or.

GET /testindex511/_search
{
    "query":{
        "match":{
            "title":"小米电视"
        }
    }
}

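This query returns every document that contains either term. Here all three documents are returned, because each of them contains "小米":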

{
  "took": 28,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.77041245,
    "hits": [
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "3",
        "_score": 0.77041245,
        "_source": {
          "title": "小米电视4A",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 3899
        }
      },
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "fTzbAnIBL2UUkbPHjt8u",
        "_score": 0.2876821,
        "_source": {
          "title": "小米手机",
          "images": "http://image.com/12479122.jpg",
          "price": 2699,
          "stock": 200
        }
      },
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "ezzYAnIBL2UUkbPHEN9K",
        "_score": 0.21110918,
        "_source": {
          "title": "小米手机",
          "images": "http://image.com/12479122.jpg",
          "price": 2699
        }
      }
    ]
  }
}

The and relationship:

In some cases we need a more precise search and want the relationship to be and. This can be done as follows:

GET /testindex511/_search
{
    "query":{
        "match": {
          "title": {
            "query": "小米电视",
            "operator": "and"
          }
        }
    }
}

Result:

{
  "took": 25,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.77041245,
    "hits": [
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "3",
        "_score": 0.77041245,
        "_source": {
          "title": "小米电视4A",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 3899
        }
      }
    ]
  }
}

In this example, only documents that contain both 小米 and 电视 are returned.
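
The difference between the two operators can be illustrated with a small Python sketch. It is only a model of the matching rule, not how Elasticsearch works internally, and the term lists below are a hypothetical analysis result for the three titles (the real terms depend on the analyzer configured for the title field).

```python
def matches(doc_terms, query_terms, operator="or"):
    """Return True if the document satisfies the query under the given operator."""
    found = [term in doc_terms for term in query_terms]
    return all(found) if operator == "and" else any(found)


# Hypothetical terms produced for each document's title.
docs = {
    "3": ["小米", "电视", "4a"],
    "fTzbAnIBL2UUkbPHjt8u": ["小米", "手机"],
    "ezzYAnIBL2UUkbPHEN9K": ["小米", "手机"],
}
query_terms = ["小米", "电视"]

or_hits = [doc_id for doc_id, terms in docs.items() if matches(terms, query_terms, "or")]
and_hits = [doc_id for doc_id, terms in docs.items() if matches(terms, query_terms, "and")]

print(or_hits)   # all three documents contain at least one query term
print(and_hits)  # only document 3 contains both query terms
```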

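Choosing between or and and is rather black and white. Suppose the user's input yields 5 query terms after analysis, and we want to find documents that contain only 4 of them. What should we do?

Setting the operator parameter to and would simply exclude such documents.

In most full-text search scenarios, we want to include documents that may be relevant while excluding those that are less relevant. In other words, we want something in between.

The match query supports the minimum_should_match parameter, which lets us specify how many terms must match for a document to be considered relevant. It can be set to a specific number, but more commonly it is set to a percentage, because we cannot control how many words the user types: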

GET /testindex511/_search
{
    "query":{
        "match":{
            "title":{
                "query":"小米曲面电视",
                "minimum_should_match": "75%"
            }
        }
    }
}

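In this example, the query text is split into 3 terms. With the and relationship, all 3 terms would have to match. Here we use a minimum match of 75%, meaning that matching 75% of the total number of terms is enough. 3 × 75% = 2.25, which is rounded down to 2, so a document that contains any 2 of the terms satisfies the condition.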

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.77041245,
    "hits": [
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "3",
        "_score": 0.77041245,
        "_source": {
          "title": "小米电视4A",
          "images": "http://image.leyou.com/12479122.jpg",
          "price": 3899
        }
      }
    ]
  }
}

Even though "曲面" did not match, "小米" and "电视" both did, so 2 of the 3 terms matched and the 75% requirement is met.
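
The arithmetic behind the 75% setting can be written out as a short Python sketch. The function name required_matches is hypothetical, and it models only the two forms discussed here, a positive integer and a positive percentage that is rounded down; other forms that Elasticsearch accepts for minimum_should_match are not covered. The term lists are again an assumed analysis result.

```python
def required_matches(term_count, minimum_should_match):
    """Return how many query terms must match.

    Handles only a positive integer ("2") or a positive percentage ("75%").
    """
    if minimum_should_match.endswith("%"):
        percent = int(minimum_should_match[:-1])
        # Integer division rounds the computed value down.
        return term_count * percent // 100
    return int(minimum_should_match)


needed = required_matches(3, "75%")  # 3 * 75% = 2.25, rounded down

# Hypothetical analysis results for the query and for the title of document 3.
query_terms = ["小米", "曲面", "电视"]
title_terms = ["小米", "电视", "4a"]
matched = sum(term in title_terms for term in query_terms)

print(needed, matched, matched >= needed)
```

With 5 query terms the same setting would require 3 matches, since 5 × 75% = 3.75 is also rounded down.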

1.1.3 Multi-field query (multi_match)

multi_match is similar to match, except that it can search across multiple fields.

GET /testindex511/_search
{
    "query":{
        "multi_match": {
            "query":    "小米",
            "fields":   [ "title", "subTitle" ]
        }
    }
}

This query returns every document whose title or subTitle field contains 小米.

1.1.4 Term query (term)

The term query is used for exact-value matching. Exact values may be numbers, dates, booleans, or strings that are not analyzed.

GET /testindex511/_search
{
    "query":{
        "term":{
            "price":2699.00
        }
    }
}

This query returns every document in the index whose price field equals 2699.00.

1.1.5 Multi-term exact match (terms)

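The terms query works like the term query, but lets you specify multiple values to match. A document satisfies the condition if the field contains any one of the specified values: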

GET /testindex511/_search
{
    "query":{
        "terms":{
            "price":[2699.00,2899.00,3899.00]
        }
    }
}

Any document whose price equals one of the values 2699.00, 2899.00, or 3899.00 is a hit. The list is a set of exact values, not a range.
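
As a rough illustration of the difference between term and terms, the following Python sketch applies the same rules to the three sample documents. The functions term and terms here are plain Python stand-ins written for this example, not Elasticsearch client calls.

```python
# The price values of the three documents indexed in this post.
docs = [
    {"_id": "3", "price": 3899},
    {"_id": "fTzbAnIBL2UUkbPHjt8u", "price": 2699},
    {"_id": "ezzYAnIBL2UUkbPHEN9K", "price": 2699},
]


def term(docs, field, value):
    """Ids of documents whose field equals exactly one value."""
    return [doc["_id"] for doc in docs if doc.get(field) == value]


def terms(docs, field, values):
    """Ids of documents whose field equals any of the listed values."""
    return [doc["_id"] for doc in docs if doc.get(field) in values]


term_hits = term(docs, "price", 2699.00)
terms_hits = terms(docs, "price", [2699.00, 2899.00, 3899.00])

print(term_hits)
print(terms_hits)
```

A document priced at 3000 would not be matched by the terms call, even though 3000 lies between the listed values.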

2.1.1 Result filtering

By default, Elasticsearch returns every field stored in _source for each document in the search results.

If we only want some of the fields, we can add a _source filter:

GET /testindex511/_search
{
  "_source": ["title","price"],
  "query": {
    "term": {
      "price": 2699
    }
  }
}

_source specifies "title" and "price".

The result therefore does not include the other fields, such as "images":

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "fTzbAnIBL2UUkbPHjt8u",
        "_score": 1,
        "_source": {
          "price": 2699,
          "title": "小米手机"
        }
      },
      {
        "_index": "testindex511",
        "_type": "goods",
        "_id": "ezzYAnIBL2UUkbPHEN9K",
        "_score": 1,
        "_source": {
          "price": 2699,
          "title": "小米手机"
        }
      }
    ]
  }
}

2.1.2 Specifying includes and excludes

includes: specifies the fields to return
excludes: specifies the fields to leave out

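Example:

The two queries below both drop the images field. They are equivalent only for documents that have no fields other than title, price, and images; the second query also returns any other field that is not excluded, such as stock.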

GET /testindex511/_search
{
  "_source": {
    "includes":["title","price"]
  },
  "query": {
    "term": {
      "price": 2699
    }
  }
}

GET /testindex511/_search
{
  "_source": {
     "excludes": ["images"]
  },
  "query": {
    "term": {
      "price": 2699
    }
  }
}
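
The following Python sketch models how includes and excludes act on the sample document that has a stock field. The function filter_source is a hypothetical helper written for this example, and it matches exact field names only.

```python
def filter_source(source, includes=None, excludes=None):
    """Keep the fields listed in includes (all fields if omitted), then drop those in excludes."""
    kept = {k: v for k, v in source.items() if includes is None or k in includes}
    return {k: v for k, v in kept.items() if not excludes or k not in excludes}


doc = {
    "title": "小米手机",
    "images": "http://image.com/12479122.jpg",
    "price": 2699,
    "stock": 200,
}

with_includes = filter_source(doc, includes=["title", "price"])
with_excludes = filter_source(doc, excludes=["images"])

print(with_includes)  # only title and price remain
print(with_excludes)  # images is dropped, stock is still returned
```

For this document the two forms give different results, which is why includes is the safer choice when the goal is to return a fixed set of fields.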