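[Elasticsearch Study Notes 4] DSL: the difference between term and match queries; aggs aggregations; sorting with sort and range queries; paging with from and size; selecting returned fields with _source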

等到鸡吃完米

Category: Es. Tags: elasticsearch, study

Article link: https://blog.csdn.net/wangxiaozhonga/article/details/127924134

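1. Term-level matching: term and terms queries

Use match for full-text (text) fields and term for all other, non-text fields. A term query matches exact values: numbers, dates, booleans, or strings that have not been analyzed.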

#term
GET sku/_search
{
  "query": {
    "term":{
      "price":1000
    }
  }
}

The terms query works like term, but lets you specify several values to match. A document matches if the field contains any one of the given values:

#terms
GET sku/_search
{
  "query": {
    "terms":{
      "price":[1000,2000,3000]
    }
  }
}
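
As a rough illustration of the matching rule only (not of how Elasticsearch evaluates it internally), the Python sketch below treats each document as a plain dict. The helper names term_match and terms_match and the sample prices are assumptions made up for this example.

```python
def term_match(doc, field, value):
    """True if the field holds exactly this value (for arrays: any element)."""
    stored = doc.get(field)
    values = stored if isinstance(stored, list) else [stored]
    return value in values


def terms_match(doc, field, candidates):
    """True if the field holds at least one of the candidates (OR semantics)."""
    return any(term_match(doc, field, c) for c in candidates)


# Hypothetical sample documents.
docs = [{"price": 1000}, {"price": 1500}, {"price": 3000}]

print([d for d in docs if term_match(d, "price", 1000)])
print([d for d in docs if terms_match(d, "price", [1000, 2000, 3000])])
```

The first line keeps only the document priced 1000; the second also keeps the one priced 3000, because one matching value is enough.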

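2. Differences between term and match

Query	Description	Typical use
term	For non-text fields; the search value is not analyzed	price, id, username
match	Full-text search on text fields; the search text is analyzed	text fields
match on field.keyword	Append .keyword to the field name for an exact, unanalyzed match	exact values of a text field
match_phrase	Phrase query: the search text is analyzed, and all of its terms must appear adjacent and in the same order	phrases in text fields

In real projects, term and match are the two most frequently used queries. The examples below show how they differ. First, index some data:

# Index some sample data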
PUT /test
POST /test/_doc
{
  "title": "love HuBei",
  "content": "people very love HuBei",
  "tags": ["HuBei","love"]
}
POST /test/_doc
{
    "title": "love China",
    "content": "people very love China",
    "tags": ["China", "love"]
}

1) term (exact match)

term means an exact match: the search value is not analyzed into tokens before searching. Let's try a term query:

GET /test/_search
{
  "query": {
    "term": {
      "title": {
        "value": "love"
      }
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": ["HuBei","love"]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": ["China","love"]
        }
      }
    ]
  }
}

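# Every document whose title contains the token love was returned.
# Suppose I only want an exact match on "love China": written as below, the query returns nothing.
# When "love China" was indexed, the title was analyzed into the tokens love and china; term does not analyze the search value, so no indexed token equals "love China".
GET /test/_search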
{
  "query": {
    "term": {
      "title": "love China"
    }
  }
}
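
To make the reason concrete, here is a minimal Python sketch of an inverted index. The analyze function is only a crude stand-in for the standard analyzer (lowercase, split on non-alphanumerics), and analyze, build_index and term_query are hypothetical names, not Elasticsearch APIs.

```python
import re


def analyze(text):
    """Crude stand-in for the standard analyzer: lowercase and split into words."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def build_index(titles):
    """Map each token to the set of document ids whose title contains it."""
    index = {}
    for doc_id, title in titles.items():
        for token in analyze(title):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, value):
    """term: look the value up as-is, without analyzing it."""
    return index.get(value, set())


titles = {7: "love China", 8: "love HuBei"}
index = build_index(titles)
print(sorted(index))                    # ['china', 'hubei', 'love']
print(term_query(index, "love"))        # both documents
print(term_query(index, "love China"))  # set(): no such token was indexed
```

In this sketch term_query(index, "China") is empty as well, because only the lowercase token china was stored.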

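2) terms (exact match on several values, OR semantics)

term is an exact match and can only look up a single term. To match several terms, use terms: the values inside [ ] are combined with OR, so a document matches as soon as one of them matches. Note that the indexed tokens are lowercase here, so in the query below it is "love" that actually matches; "China" with a capital C matches nothing on its own.

GET /test/_search
{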
  "query": {
    "terms": {
      "title": ["love", "China"]
    }
  }
}

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "VHJA4YMBhvMSKKYkMlpJ",
        "_score" : 1.0,
        "_source" : {
          "title" : "love China",
          "content" : "people very love China",
          "tags" : [
            "China",
            "love"
          ]
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "VXJA4YMBhvMSKKYkQFov",
        "_score" : 1.0,
        "_source" : {
          "title" : "love HuBei",
          "content" : "people very love HuBei",
          "tags" : [
            "HuBei",
            "love"
          ]
        }
      }
    ]
  }
}

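3) match (analyzed match)

A match query first analyzes the search text into tokens and then matches on those tokens. For the two documents above, the indexed title terms are love, china and hubei. The search text "love China" is analyzed into the two tokens love and china, which are combined with OR: a document matches if it contains either one.

GET /test/_search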
{
  "query": {
    "match": {
      "title": "love China"
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": [
            "China",
            "love"
          ]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": [
            "HuBei",
            "love"
          ]
        }
      }
    ]
  }
}
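
The OR behaviour can be sketched in a few lines of Python. This is an illustration under simplifying assumptions: analyze is a crude stand-in for the standard analyzer, match_query is a hypothetical helper, and hits are ranked by the number of shared tokens rather than by the BM25 scores Elasticsearch actually computes.

```python
import re


def analyze(text):
    """Crude stand-in for the standard analyzer: lowercase and split into words."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def match_query(titles, query_text):
    """match with the default OR operator: keep docs sharing at least one token.

    Hits are ordered by how many query tokens they contain, a rough proxy
    for relevance (real scoring uses BM25).
    """
    query_tokens = set(analyze(query_text))
    hits = []
    for doc_id, title in titles.items():
        overlap = len(query_tokens & set(analyze(title)))
        if overlap:
            hits.append((overlap, doc_id))
    hits.sort(key=lambda hit: -hit[0])
    return [doc_id for _, doc_id in hits]


titles = {7: "love China", 8: "love HuBei"}
print(match_query(titles, "love China"))  # [7, 8]
```

Document 7 comes first because it contains both tokens, which mirrors the higher _score it gets in the response above.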

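4) match_phrase (phrase match)

If you want love and China to match together, use match_phrase. A match_phrase query is a phrase search: every token of the search text must appear in the document, adjacent to each other and in the same order.

GET /test/_search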
{
  "query": {
    "match_phrase": {
      "title": "love china"
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": [
            "China",
            "love"
          ]
        }
      }
    ]
  }
}
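
The adjacency requirement can be sketched as a sliding-window comparison over token lists. This is a simplified illustration: analyze approximates the standard analyzer, match_phrase is a hypothetical helper, and options such as slop are ignored.

```python
import re


def analyze(text):
    """Crude stand-in for the standard analyzer: lowercase and split into words."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def match_phrase(field_text, phrase):
    """True if the phrase tokens occur consecutively, in order, in the field."""
    tokens, wanted = analyze(field_text), analyze(phrase)
    if not wanted:
        return False
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


print(match_phrase("love China", "love china"))              # True
print(match_phrase("people very love China", "love china"))  # True
print(match_phrase("China love", "love china"))              # False
print(match_phrase("love HuBei", "love china"))              # False
```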

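3. Running aggregations with aggs

Aggregations let you group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions.

In Elasticsearch, a search can return hits and, at the same time, aggregation results, which are kept separate from the hits in the response. You can run a query together with several aggregations and get each of their results back from a single request, which keeps the API concise and avoids extra network round trips.

The aggregation syntax is as follows:

terms: shows the distribution of values; documents are grouped by the value of the chosen field and a count is returned for each value
avg: shows the average of the values

"aggs":{ # aggregations
    "aggs_name":{ # name of this aggregation, used to label it in the results
        "AGG_TYPE":{} # aggregation type (avg, terms, ...)
     }
}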

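DSL example 1:

Find the age distribution, average age and average balance of everyone whose address contains mill, without returning the matching documents themselves:

# Match addresses containing mill, then compute the age distribution, average age and average balance
GET bank/_search
{
  "query": { # find documents whose address contains mill
    "match": {
      "address": "Mill"
    }
  },
  "aggs": { # aggregate over the query results
    "ageAgg": {  # aggregation name, chosen freely
      "terms": { # value distribution
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": { # average of age
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": { # average of balance
        "field": "balance"
      }
    }
  },
  "size": 0  # do not return the hits themselves
}
Result: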
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4, // 4 hits
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : { // result of the first aggregation
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 38,  # 2 documents have age 38
          "doc_count" : 2
        },
        {
          "key" : 28,
          "doc_count" : 1
        },
        {
          "key" : 32,
          "doc_count" : 1
        }
      ]
    },
    "ageAvg" : { // result of the second aggregation
      "value" : 34.0  # the average of the age field is 34
    },
    "balanceAvg" : {
      "value" : 25208.0
    }
  }
}
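
For intuition, the two age aggregations can be reproduced in plain Python. The four ages are read off the buckets in the response above; terms_agg and avg_agg are hypothetical helper names that only mimic the shape of the result.

```python
from collections import Counter

# Ages of the four matching documents, taken from the ageAgg buckets above.
ages = [38, 38, 28, 32]


def terms_agg(values, size=10):
    """Count documents per distinct value, most frequent first (like a terms agg)."""
    return [
        {"key": key, "doc_count": count}
        for key, count in Counter(values).most_common(size)
    ]


def avg_agg(values):
    """Average of the values (like an avg agg)."""
    return {"value": sum(values) / len(values)}


print(terms_agg(ages))
print(avg_agg(ages))  # {'value': 34.0}
```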

DSL example 2:

Sub-aggregation (aggs/aggName/aggs/aggName): bucket by age, then compute the average balance of the people in each age bucket:

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": { # value distribution
        "field": "age",
        "size": 100
      },
      "aggs": { # sibling of terms, nested inside ageAgg
        "ageAvg": { # average
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
Output:
{
  "took" : 49,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 31,
          "doc_count" : 61,
          "ageAvg" : {
            "value" : 28312.918032786885
          }
        },
        {
          "key" : 39,
          "doc_count" : 60,
          "ageAvg" : {
            "value" : 25269.583333333332
          }
        },
        {
          "key" : 26,
          "doc_count" : 59,
          "ageAvg" : {
            "value" : 23194.813559322032
          }
        },
        {
          "key" : 32,
          "doc_count" : 52,
          "ageAvg" : {
            "value" : 23951.346153846152
          }
        },
        {
          "key" : 35,
          "doc_count" : 52,
          "ageAvg" : {
            "value" : 22136.69230769231
          }
        },
        {
          "key" : 36,
          "doc_count" : 52,
          "ageAvg" : {
            "value" : 22174.71153846154
          }
        },
        {
          "key" : 22,
          "doc_count" : 51,
          "ageAvg" : {
            "value" : 24731.07843137255
          }
        },
        {
          "key" : 28,
          "doc_count" : 51,
          "ageAvg" : {
            "value" : 28273.882352941175
          }
        },
        {
          "key" : 33,
          "doc_count" : 50,
          "ageAvg" : {
            "value" : 25093.94
          }
        },
        {
          "key" : 34,
          "doc_count" : 49,
          "ageAvg" : {
            "value" : 26809.95918367347
          }
        },
        {
          "key" : 30,
          "doc_count" : 47,
          "ageAvg" : {
            "value" : 22841.106382978724
          }
        },
        {
          "key" : 21,
          "doc_count" : 46,
          "ageAvg" : {
            "value" : 26981.434782608696
          }
        },
        {
          "key" : 40,
          "doc_count" : 45,
          "ageAvg" : {
            "value" : 27183.17777777778
          }
        },
        {
          "key" : 20,
          "doc_count" : 44,
          "ageAvg" : {
            "value" : 27741.227272727272
          }
        },
        {
          "key" : 23,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 27314.214285714286
          }
        },
        {
          "key" : 24,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 28519.04761904762
          }
        },
        {
          "key" : 25,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 27445.214285714286
          }
        },
        {
          "key" : 37,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 27022.261904761905
          }
        },
        {
          "key" : 27,
          "doc_count" : 39,
          "ageAvg" : {
            "value" : 21471.871794871793
          }
        },
        {
          "key" : 38,
          "doc_count" : 39,
          "ageAvg" : {
            "value" : 26187.17948717949
          }
        },
        {
          "key" : 29,
          "doc_count" : 35,
          "ageAvg" : {
            "value" : 29483.14285714286
          }
        }
      ]
    }
  }
}
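
A sub-aggregation is essentially "group, then aggregate inside each group". The Python sketch below shows that idea on three made-up accounts (the real bank index holds 1000 documents); avg_balance_by_age is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical accounts, invented for this illustration.
accounts = [
    {"age": 31, "balance": 30000},
    {"age": 31, "balance": 20000},
    {"age": 39, "balance": 10000},
]


def avg_balance_by_age(docs):
    """Bucket docs by age, then average balance inside each bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc["age"]].append(doc["balance"])
    result = [
        {"key": age, "doc_count": len(b), "ageAvg": {"value": sum(b) / len(b)}}
        for age, b in buckets.items()
    ]
    # Like terms buckets, order by doc_count with the largest bucket first.
    return sorted(result, key=lambda bucket: -bucket["doc_count"])


for bucket in avg_balance_by_age(accounts):
    print(bucket)
```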

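DSL example 3:

Nested sub-aggregations: find the full age distribution and, within each age bucket, the average balance of M (male) and F (female) account holders as well as the overall average balance for that bucket. (The overall average is taken over everyone in the bucket, men and women together; it is not the two gender averages added up and divided by 2.)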

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
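      "terms": {  # age distribution
        "field": "age",
        "size": 100
      },
      "aggs": { # sub-aggregation
        "genderAgg": {
          "terms": { # gender distribution
            "field": "gender.keyword" # note: aggregate on a text field through its .keyword sub-field
          },
          "aggs": { # sub-aggregation
            "balanceAvg": {
              "avg": { # average balance within each gender bucket
                "field": "balance"
              }
            }
          }
        },
        "ageBalanceAvg": {
          "avg": { # average balance for the whole age bucket (both genders)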
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
Output:
{
  "took" : 119,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 31,
          "doc_count" : 61,
          "genderAgg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "M",
                "doc_count" : 35,
                "balanceAvg" : {
                  "value" : 29565.628571428573
                }
              },
              {
                "key" : "F",
                "doc_count" : 26,
                "balanceAvg" : {
                  "value" : 26626.576923076922
                }
              }
            ]
          },
          "ageBalanceAvg" : {
            "value" : 28312.918032786885
          }
        }
      ]
        ....... // remaining buckets omitted
    }
  }
}
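
As a sanity check on what ageBalanceAvg reports, the figures of the age-31 bucket above can be recombined in Python. The sketch suggests that the bucket-wide average is the gender averages weighted by their doc_count, not their plain mean; overall_avg and mean_of_avgs are hypothetical helper names.

```python
# Figures copied from the age-31 bucket in the response above.
gender_buckets = [
    {"key": "M", "doc_count": 35, "avg": 29565.628571428573},
    {"key": "F", "doc_count": 26, "avg": 26626.576923076922},
]


def overall_avg(buckets):
    """Average over all documents: weight each sub-bucket average by its doc_count."""
    total = sum(b["doc_count"] * b["avg"] for b in buckets)
    count = sum(b["doc_count"] for b in buckets)
    return total / count


def mean_of_avgs(buckets):
    """Unweighted mean of the sub-bucket averages (not what ageBalanceAvg reports)."""
    return sum(b["avg"] for b in buckets) / len(buckets)


print(overall_avg(gender_buckets))   # close to ageBalanceAvg, 28312.918...
print(mean_of_avgs(gender_buckets))  # a different number, about 28096.1
```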

4. Sorting with sort and range queries with range

1) Sorting with sort

1. Default sort by _score:

GET /newbank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_score": {
      "order": "desc" # descending
    }
  }
}

2. Sort by a field:

GET /bank/_search
{
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}
GET bank/_search
{
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ]
}

3. Sort by multiple fields:

GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {"create_time": {"order": "asc"}},
    {"age": {"order": "desc"}}
  ]
}

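4. Sorting on multi-valued fields:

For numeric or date fields that hold several values, the mode option chooses which value to sort by: min, max, avg or sum.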

PUT /my-index-000001/_doc/1?refresh
{
   "product": "chocolate",
   "price": [20, 4]
}

GET /my-index-000001/_search
{
    "query": {
        "term": {
            "product": "chocolate"
        }
    },
    "sort": [
        {
            "price": {
                "order": "asc",
                "mode": "avg"
            }
        }
    ]
}
# Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "product" : "chocolate",
          "price" : [
            20,
            4
          ]
        },
        "sort" : [
          12
        ]
      }
    ]
  }
}
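
The sort value 12 in the response is simply the average of the two prices. A small Python sketch of how each mode collapses a multi-valued field into one sort key (sort_value is a hypothetical helper, not an Elasticsearch API):

```python
def sort_value(values, mode):
    """Collapse a multi-valued field into the single value used for sorting."""
    reducers = {
        "min": min,
        "max": max,
        "sum": sum,
        "avg": lambda vs: sum(vs) / len(vs),
    }
    return reducers[mode](values)


price = [20, 4]
for mode in ("min", "max", "sum", "avg"):
    print(mode, sort_value(price, mode))
```

For price [20, 4] this gives 4 for min, 20 for max, 24 for sum and 12 for avg.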

2) Range queries with range

gt: greater than
gte: greater than or equal to
lt: less than
lte: less than or equal to

# Filter by price range
GET bank/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1000,
        "lte": 2000
      }
    }
  }
}
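
The four operators map directly onto ordinary comparisons. A minimal Python sketch, assuming numeric values only; OPERATORS and in_range are hypothetical names and the prices are made up:

```python
import operator

# Maps each range operator to the comparison it stands for.
OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 1000, "lte": 2000}."""
    return all(OPERATORS[name](value, limit) for name, limit in bounds.items())


prices = [999, 1000, 1500, 2000, 2001]  # hypothetical values
print([p for p in prices if in_range(p, {"gte": 1000, "lte": 2000})])  # [1000, 1500, 2000]
print([p for p in prices if in_range(p, {"gt": 1000, "lt": 2000})])    # [1500]
```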

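5. Paging with from and size

from is the offset of the first hit to return (not a page number), and size is the number of hits to return.

# Paging: skip the first 50 hits and return the next 50 (page 2 at 50 hits per page)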
GET bank/_search
{
  "from": 50,
  "size":50
}

GET bank/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1000,
        "lte": 2000
      }
    }
  },
  "from": 50,
  "size":50
}
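
Because from is an offset rather than a page number, a UI page number has to be converted first. A small Python sketch, assuming 1-based page numbers; page_to_from and paginate are hypothetical helpers, and the list of hits is made up:

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the from offset."""
    if page < 1 or size < 0:
        raise ValueError("page must be >= 1 and size must be >= 0")
    return (page - 1) * size


def paginate(hits, from_, size):
    """Return the slice of hits that from/size would select."""
    return hits[from_:from_ + size]


hits = list(range(1, 121))  # 120 hypothetical hits, already sorted
offset = page_to_from(page=2, size=50)
print(offset)                                        # 50
print(paginate(hits, offset, 50)[0])                 # 51
print(len(paginate(hits, page_to_from(3, 50), 50)))  # 20
```

Keep in mind that by default Elasticsearch rejects requests where from + size exceeds 10,000 (the index.max_result_window setting), so deep paging needs a different approach such as search_after.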

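6. Highlighting with highlight

# highlight is a top-level key of the request body, a sibling of query (not nested inside it)
"query": { ... },
"highlight": {
  "fields": {"skuTitle": {}},
  "pre_tags": "<b style='color:red'>",
  "post_tags": "</b>"
}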
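
What the pre_tags and post_tags settings do can be imitated in Python: every word of the field that also occurs in the query is wrapped in the two tags. This is only a sketch of the visible effect; the highlight function is hypothetical and word matching is simplified to case-insensitive equality.

```python
import re


def highlight(text, query, pre_tag="<b style='color:red'>", post_tag="</b>"):
    """Wrap every word of text that also appears in query with the given tags."""
    wanted = {word.lower() for word in re.findall(r"\w+", query)}

    def wrap(match):
        word = match.group(0)
        return f"{pre_tag}{word}{post_tag}" if word.lower() in wanted else word

    return re.sub(r"\w+", wrap, text)


print(highlight("people very love China", "love china", "<em>", "</em>"))
# people very <em>love</em> <em>China</em>
```

The example passes <em> and </em> explicitly, which are also the tags Elasticsearch uses when none are configured.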

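7. Selecting returned fields with _source

Use include and exclude inside _source to specify which fields the results should contain and which they should leave out. (The example below comes from an older cluster; recent Elasticsearch versions spell these keys includes and excludes.)
Example: query ES by policy relation number, setting include and exclude in _source

Example query: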

{
  "_source":{
    "include":[
      "policyNo",
      "policyRelationNo",
      "policyStatus"
    ],
    "exclude":[
       "salesType"
    ]
  },
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "policyRelationNo": "KR01435021"
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  },
  "from": 0,
  "size": 10
}

Query result:

{
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 19,
        "max_score": 11.391884,
        "hits": [
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407038",
                "_score": 11.391884,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120319",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407046",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120323",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407044",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120322",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407066",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120333",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407058",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120329",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407070",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120335",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407056",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120328",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407052",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120326",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407062",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120331",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407064",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120332",
                    "policyStatus": 11
                }
            }
        ]
    },
    "took": 5,
    "timed_out": false
}
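
The effect of source filtering on a single document can be sketched in Python. This is a simplified model that assumes flat, top-level fields; filter_source is a hypothetical helper, the field names come from the query above, and the salesType value is made up.

```python
from fnmatch import fnmatchcase


def filter_source(source, include=(), exclude=()):
    """Keep fields matching include (all if empty), then drop exclude matches."""
    def matches(name, patterns):
        return any(fnmatchcase(name, pattern) for pattern in patterns)

    return {
        name: value
        for name, value in source.items()
        if (not include or matches(name, include)) and not matches(name, exclude)
    }


# Field names from the query above; the salesType value is invented.
doc = {
    "policyNo": "B609120319",
    "policyRelationNo": "KR01435021",
    "policyStatus": 11,
    "salesType": "hypothetical",
}
print(filter_source(
    doc,
    include=["policyNo", "policyRelationNo", "policyStatus"],
    exclude=["salesType"],
))
```

Patterns with * work too in this sketch, for example include=["policy*"] keeps the same three fields.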
