[Elasticsearch Study Notes, Part 4] DSL: term vs. match queries; aggs aggregations; sort and range queries; paging with from and size; selecting returned fields with _source

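Contents

1. Term-level matching: term and terms queries

2. The difference between term and match

1) term (exact query)

2) terms (exact match on multiple values, OR semantics)

3) match (analyzed full-text match)

4) match_phrase (phrase match)

3. Aggregation queries with aggs

4. Sorting with sort and range queries with range

1) Sorting with sort

2) Range queries with range

5. Paging: from (offset) and size (number of hits returned)

6. Highlighting with highlight

7. Returning only the needed fields with _source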


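1. Term-level matching: term and terms queries

        Use match for full-text (text) fields and term for other, non-text fields. A term query matches exact values: numbers, dates, booleans, or strings that were not analyzed (keyword fields).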

#term
GET sku/_search
{
  "query": {
    "term":{
      "price":1000
    }
  }
}

        A terms query works like term but accepts several values. A document matches if the field contains any one of the given values.

#terms
GET sku/_search
{
  "query": {
    "terms":{
      "price":[1000,2000,3000]
    }
  }
}

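2. The difference between term and match

Query              Description
term               For non-text fields; the search value is not analyzed. Use it for fields such as price, id, or username.
match              Full-text search on text fields; the query string is analyzed into tokens.
match + .keyword   Append .keyword to the field name to match the whole value exactly, without analysis.
match_phrase       Phrase query; the query string is analyzed, and all of its tokens must appear in the document in the same order and adjacent to each other.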

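        In real projects, term and match are the two most commonly used queries. The examples below show how they differ. First, index some data:

# Index some sample data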
PUT /test
POST /test/_doc
{
  "title": "love HuBei",
  "content": "people very love HuBei",
  "tags": ["HuBei","love"]
}
POST /test/_doc
{
    "title": "love China",
    "content": "people very love China",
    "tags": ["China", "love"]
}

1) term (exact query)

        term means an exact match: the search value is not analyzed (split into tokens) before the lookup. Try a term query:

GET /test/_search
{
  "query": {
    "term": {
      "title": {
        "value": "love"
      }
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": ["HuBei","love"]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 0.6931472,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": ["China","love"]
        }
      }
    ]
  }
}

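# Every document whose title contains the token love is returned.
# To match only "love China", the query below does not work: it returns no hits.
# "love China" was analyzed into the tokens love and china at index time; term does not
# analyze the search value, so it looks for the single token "love China", which does not exist.
GET /test/_search
{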
  "query": {
    "term": {
      "title": "love China"
    }
  }
}
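
To see why the term query for "love" hits both documents while "love China" hits none, here is a toy model in Python. It assumes title is a text field analyzed with the default standard analyzer, approximated here by lowercasing and splitting on whitespace. The analyze and term_query helpers are illustrative names, not Elasticsearch APIs.

```python
# Toy model of an inverted index; not how Elasticsearch is implemented.
def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on whitespace."""
    return text.lower().split()

docs = {7: "love China", 8: "love HuBei"}

# Index time: each title is analyzed and its tokens are stored.
index = {}
for doc_id, title in docs.items():
    for token in analyze(title):
        index.setdefault(token, set()).add(doc_id)

def term_query(value):
    """term looks the value up as-is, without analyzing it."""
    return sorted(index.get(value, set()))

print(term_query("love"))        # both documents contain the token "love"
print(term_query("love China"))  # no single token equals "love China"
print(term_query("China"))       # the indexed token is lowercase "china"
```

The same reasoning explains why exact matching on the whole title needs the title.keyword sub-field, which stores the value unanalyzed.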

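2) terms (exact match on multiple values, OR semantics)

        term is an exact match against a single value. To match any of several values, use terms: the values inside [ ] are combined with OR, so a document matches as soon as it contains one of them. The values are still not analyzed: with the default analyzer the indexed tokens are lowercase, so "China" below matches nothing and both documents are returned because of "love".

GET /test/_search
{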
  "query": {
    "terms": {
      "title": ["love", "China"]
    }
  }
}

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "VHJA4YMBhvMSKKYkMlpJ",
        "_score" : 1.0,
        "_source" : {
          "title" : "love China",
          "content" : "people very love China",
          "tags" : [
            "China",
            "love"
          ]
        }
      },
      {
        "_index" : "test",
        "_type" : "_doc",
        "_id" : "VXJA4YMBhvMSKKYkQFov",
        "_score" : 1.0,
        "_source" : {
          "title" : "love HuBei",
          "content" : "people very love HuBei",
          "tags" : [
            "HuBei",
            "love"
          ]
        }
      }
    ]
  }
}
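
A small Python sketch of the OR semantics of terms, under the same simplifying assumption as before (the default analyzer is approximated by lowercasing and splitting on whitespace). The terms_query helper is a made-up name for illustration only.

```python
# Toy model: terms = OR over exact (unanalyzed) values.
def analyze(text):
    return text.lower().split()

titles = ["love China", "love HuBei"]
indexed = [set(analyze(t)) for t in titles]

def terms_query(values):
    """Return titles whose token set contains at least one of the raw values."""
    return [t for t, tokens in zip(titles, indexed) if tokens & set(values)]

print(terms_query(["love", "China"]))   # both match, through "love" only
print(terms_query(["China"]))           # nothing: the indexed token is "china"
print(terms_query(["china", "hubei"]))  # both match, one value each
```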

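3) match (analyzed full-text match)

        A match query analyzes the query string first and then matches the resulting tokens. The titles of the two documents above produce the tokens love, china, and hubei. The query "love China" is analyzed into love and china, which are combined with OR by default: a document matches if it contains either token.

GET /test/_search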
{
  "query": {
    "match": {
      "title": "love China"
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": [
            "China",
            "love"
          ]
        }
      },
      {
        "_index": "test",
        "_type": "doc",
        "_id": "8",
        "_score": 0.6931472,
        "_source": {
          "title": "love HuBei",
          "content": "people very love HuBei",
          "tags": [
            "HuBei",
            "love"
          ]
        }
      }
    ]
  }
}
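
The ranking in the response above (the "love China" document first) can be imitated with a toy Python model. It assumes whitespace-and-lowercase analysis and ranks by the number of shared tokens; real Elasticsearch scores with BM25, so the count here is only a stand-in, and match_query is a hypothetical helper name.

```python
def analyze(text):
    return text.lower().split()

titles = ["love China", "love HuBei"]

def match_query(query):
    """Analyze the query, keep titles sharing any token, rank by shared-token count."""
    wanted = set(analyze(query))
    scored = [(len(wanted & set(analyze(t))), t) for t in titles]
    hits = [(n, t) for n, t in scored if n > 0]
    return [t for n, t in sorted(hits, key=lambda h: -h[0])]

print(match_query("love China"))  # both titles; the one sharing two tokens ranks first
print(match_query("China"))       # only the title containing "china"
```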

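4) match_phrase (phrase match)

        To require both love and China, use match_phrase. A phrase query requires every token of the analyzed query to appear in the document, in the same order and in adjacent positions.

GET /test/_search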
{
  "query": {
    "match_phrase": {
      "title": "love china"
    }
  }
}

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.3862944,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "7",
        "_score": 1.3862944,
        "_source": {
          "title": "love China",
          "content": "people very love China",
          "tags": [
            "China",
            "love"
          ]
        }
      }
    ]
  }
}
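
The adjacency requirement can be sketched in Python as a search for a consecutive run of tokens. This is a simplified model assuming whitespace-and-lowercase analysis and no slop; the match_phrase function below is an illustration, not the Elasticsearch implementation.

```python
def analyze(text):
    return text.lower().split()

def match_phrase(field_text, phrase):
    """True if the analyzed phrase occurs as consecutive tokens in the analyzed field."""
    tokens, wanted = analyze(field_text), analyze(phrase)
    n = len(wanted)
    return n > 0 and any(
        tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1)
    )

print(match_phrase("love China", "love china"))      # tokens adjacent and in order
print(match_phrase("love HuBei", "love china"))      # "china" is missing
print(match_phrase("China love", "love china"))      # both present, wrong order
```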

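3. Aggregation queries with aggs

        Aggregations group documents and extract statistics from them. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions.

        In Elasticsearch, a single search can return hits and, separately from the hits in the same response, aggregation results. You can run a query together with several aggregations and get all of their results back at once, through one concise API, which avoids extra network round trips.

The aggregation syntax is as follows:

  1. terms: shows the distribution of a field's values; documents are grouped by the field and a count is returned for each value
  2. avg: the average of a field's values
"aggs":{ # aggregations
    "aggs_name":{ # name of this aggregation, used to label it in the response
        "AGG_TYPE":{} # aggregation type (avg, terms, ...)
     }
}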

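DSL example 1:

        Find everyone whose address contains mill, and return the age distribution, the average age, and the average balance of those people, without returning the individual documents:

# match on mill, then age distribution, average age, average balance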
GET bank/_search
{
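  "query": { # find documents whose address contains mill
    "match": {
      "address": "Mill"
    }
  },
  "aggs": { # aggregations run over the query's hits
    "ageAgg": {  # aggregation name, chosen freely
      "terms": { # distribution of age values
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": { # average of age
        "field": "age"
      }
    },
    "balanceAvg": {
      "avg": { # average of balance
        "field": "balance"
      }
    }
  },
  "size": 0  # do not return the hits themselves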
}
Result:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4, // 4 hits
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : { // result of the first aggregation
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 38,  # 2 documents have age 38
          "doc_count" : 2
        },
        {
          "key" : 28,
          "doc_count" : 1
        },
        {
          "key" : 32,
          "doc_count" : 1
        }
      ]
    },
    "ageAvg" : { // result of the second aggregation
      "value" : 34.0  # the average of the age field is 34
    },
    "balanceAvg" : {
      "value" : 25208.0
    }
  }
}
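
For comparison, the terms and avg results above can be reproduced in plain Python. The four ages are read back from the ageAgg buckets in the response; the balances are not shown individually in the response, so they are left out of this sketch.

```python
from collections import Counter

# Ages of the four hits, read back from the ageAgg buckets in the response above.
ages = [38, 38, 28, 32]

# terms aggregation: one bucket per distinct value, largest count first.
buckets = sorted(Counter(ages).items(), key=lambda kv: (-kv[1], kv[0]))

# avg aggregation: arithmetic mean over the same hits.
age_avg = sum(ages) / len(ages)

print(buckets)   # (value, doc_count) pairs
print(age_avg)   # matches the ageAvg value in the response
```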

DSL example 2:

        Sub-aggregations (aggs/aggName/aggs/aggName): bucket by age and compute the average balance of the people in each age bucket:

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": { # distribution of age values
        "field": "age",
        "size": 100
      },
      "aggs": { # sub-aggregation, declared as a sibling of terms
        "ageAvg": { # average balance within each age bucket
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
Output:
{
  "took" : 49,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 31,
          "doc_count" : 61,
          "ageAvg" : {
            "value" : 28312.918032786885
          }
        },
        {
          "key" : 39,
          "doc_count" : 60,
          "ageAvg" : {
            "value" : 25269.583333333332
          }
        },
        {
          "key" : 26,
          "doc_count" : 59,
          "ageAvg" : {
            "value" : 23194.813559322032
          }
        },
        {
          "key" : 32,
          "doc_count" : 52,
          "ageAvg" : {
            "value" : 23951.346153846152
          }
        },
        {
          "key" : 35,
          "doc_count" : 52,
          "ageAvg" : {
            "value" : 22136.69230769231
          }
        },
        {
          "key" : 36,
          "doc_count" : 52,
          "ageAvg" : {
            "value" : 22174.71153846154
          }
        },
        {
          "key" : 22,
          "doc_count" : 51,
          "ageAvg" : {
            "value" : 24731.07843137255
          }
        },
        {
          "key" : 28,
          "doc_count" : 51,
          "ageAvg" : {
            "value" : 28273.882352941175
          }
        },
        {
          "key" : 33,
          "doc_count" : 50,
          "ageAvg" : {
            "value" : 25093.94
          }
        },
        {
          "key" : 34,
          "doc_count" : 49,
          "ageAvg" : {
            "value" : 26809.95918367347
          }
        },
        {
          "key" : 30,
          "doc_count" : 47,
          "ageAvg" : {
            "value" : 22841.106382978724
          }
        },
        {
          "key" : 21,
          "doc_count" : 46,
          "ageAvg" : {
            "value" : 26981.434782608696
          }
        },
        {
          "key" : 40,
          "doc_count" : 45,
          "ageAvg" : {
            "value" : 27183.17777777778
          }
        },
        {
          "key" : 20,
          "doc_count" : 44,
          "ageAvg" : {
            "value" : 27741.227272727272
          }
        },
        {
          "key" : 23,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 27314.214285714286
          }
        },
        {
          "key" : 24,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 28519.04761904762
          }
        },
        {
          "key" : 25,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 27445.214285714286
          }
        },
        {
          "key" : 37,
          "doc_count" : 42,
          "ageAvg" : {
            "value" : 27022.261904761905
          }
        },
        {
          "key" : 27,
          "doc_count" : 39,
          "ageAvg" : {
            "value" : 21471.871794871793
          }
        },
        {
          "key" : 38,
          "doc_count" : 39,
          "ageAvg" : {
            "value" : 26187.17948717949
          }
        },
        {
          "key" : 29,
          "doc_count" : 35,
          "ageAvg" : {
            "value" : 29483.14285714286
          }
        }
      ]
    }
  }
}
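
The bucket-then-metric structure of a sub-aggregation can be sketched in Python as a group-by followed by a per-group average. The three records below are hypothetical (the real bank index holds 1000 documents), and terms_with_avg is an illustrative helper, not an Elasticsearch API.

```python
from collections import defaultdict

# Hypothetical records, for illustration only.
accounts = [
    {"age": 31, "balance": 1000},
    {"age": 31, "balance": 3000},
    {"age": 39, "balance": 500},
]

def terms_with_avg(records, bucket_field, metric_field):
    """Group records by bucket_field, then average metric_field inside each bucket."""
    groups = defaultdict(list)
    for record in records:
        groups[record[bucket_field]].append(record[metric_field])
    buckets = [
        {"key": key, "doc_count": len(vals), "ageAvg": {"value": sum(vals) / len(vals)}}
        for key, vals in groups.items()
    ]
    # Largest buckets first, like the terms aggregation's default ordering.
    return sorted(buckets, key=lambda b: (-b["doc_count"], b["key"]))

result = terms_with_avg(accounts, "age", "balance")
for bucket in result:
    print(bucket)
```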

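DSL example 3:

        Nested sub-aggregations: return the age distribution and, for each age bucket, the average balance of the men (M), the average balance of the women (F), and the overall average balance of the bucket. (The overall average is taken over everyone in the bucket, men and women together; it is not the two gender averages added and divided by 2.)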

GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
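      "terms": {  # distribution of age values
        "field": "age",
        "size": 100
      },
      "aggs": { # sub-aggregations
        "genderAgg": {
          "terms": { # distribution of gender values
            "field": "gender.keyword" # note: aggregate a text field through its .keyword sub-field
          },
          "aggs": { # sub-aggregation
            "balanceAvg": {
              "avg": { # average balance within each gender bucket
                "field": "balance"
              }
            }
          }
        },
        "ageBalanceAvg": {
          "avg": { # average balance of the whole age bucket (men and women)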
            "field": "balance"
          }
        }
      }
    }
  },
  "size": 0
}
Output:
{
  "took" : 119,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1000,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "ageAgg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 31,
          "doc_count" : 61,
          "genderAgg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "M",
                "doc_count" : 35,
                "balanceAvg" : {
                  "value" : 29565.628571428573
                }
              },
              {
                "key" : "F",
                "doc_count" : 26,
                "balanceAvg" : {
                  "value" : 26626.576923076922
                }
              }
            ]
          },
          "ageBalanceAvg" : {
            "value" : 28312.918032786885
          }
        }
      ]
        ....... // remaining buckets omitted
    }
  }
}
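
A quick check in Python, using only the numbers from the age-31 bucket above, shows that ageBalanceAvg is the average over all 61 people in the bucket, that is, the per-gender averages weighted by their doc_count, and not the plain mean of the two gender averages.

```python
# Per-gender figures copied from the age-31 bucket in the response above.
gender_buckets = [
    {"key": "M", "doc_count": 35, "avg": 29565.628571428573},
    {"key": "F", "doc_count": 26, "avg": 26626.576923076922},
]

total_docs = sum(b["doc_count"] for b in gender_buckets)
total_balance = sum(b["doc_count"] * b["avg"] for b in gender_buckets)

# Average over everyone in the bucket: weight each gender by its doc_count.
weighted_avg = total_balance / total_docs
# Plain mean of the two gender averages, shown only for contrast.
naive_avg = sum(b["avg"] for b in gender_buckets) / len(gender_buckets)

print(total_docs, weighted_avg)  # agrees with ageBalanceAvg up to rounding
print(naive_avg)                 # a different number
```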

4. Sorting with sort and range queries with range

1) Sorting with sort

1. Default sort, by _score:

GET /newbank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_score": {
      "order": "desc" # descending
    }
  }
}

2. Sort by a specific field:

GET /bank/_search
{
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}
GET bank/_search
{
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ]
}

3. Sort by multiple fields:

GET /bank/_search
{
  "query":{
    "match_all": {
    }
  },
  "sort": [
    {"create_time": { "order": "asc" }},
    {"age": { "order": "desc" }}
  ]
}

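4. Sorting on multi-valued fields:

        When a numeric or date field holds several values, the sort mode selects which value the document is sorted by: min, max, avg, or sum.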

PUT /my-index-000001/_doc/1?refresh
{
   "product": "chocolate",
   "price": [20, 4]
}

GET /my-index-000001/_search
{
    "query": {
        "term": {
            "product": "chocolate"
        }
    },
    "sort": [
        {
            "price": {
                "order": "asc",
                "mode": "avg"
            }
        }
    ]
}
# Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : null,
        "_source" : {
          "product" : "chocolate",
          "price" : [
            20,
            4
          ]
        },
        "sort" : [
          12
        ]
      }
    ]
  }
}
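
The sort value 12 in the response is the avg of the two prices. A short Python sketch of how each of the four modes named above reduces a multi-valued field to one sort key (the SORT_MODES table and sort_value helper are illustrative names):

```python
from statistics import mean

# One reducer per sort mode named above.
SORT_MODES = {"min": min, "max": max, "sum": sum, "avg": mean}

def sort_value(values, mode):
    """Collapse a multi-valued field into the single value used for sorting."""
    return SORT_MODES[mode](values)

price = [20, 4]  # the price array of the chocolate document
for mode in SORT_MODES:
    print(mode, sort_value(price, mode))
```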

2) Range queries with range

  1. gt: greater than
  2. gte: greater than or equal to
  3. lt: less than
  4. lte: less than or equal to
# filter by price range
GET bank/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1000,
        "lte": 2000
      }
    }
  }
}
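
The four bounds map directly onto comparison operators. The Python sketch below applies a range clause to a list of sample prices; the prices are made up to show the inclusive and exclusive edges, and in_range is a hypothetical helper.

```python
import operator

# Each range keyword corresponds to one comparison operator.
RANGE_OPS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}

def in_range(value, bounds):
    """True if value satisfies every bound, e.g. {"gte": 1000, "lte": 2000}."""
    return all(RANGE_OPS[name](value, limit) for name, limit in bounds.items())

prices = [999, 1000, 1500, 2000, 2001]  # sample values, not from the index
inclusive = [p for p in prices if in_range(p, {"gte": 1000, "lte": 2000})]
exclusive = [p for p in prices if in_range(p, {"gt": 1000, "lt": 2000})]
print(inclusive)  # the edges 1000 and 2000 are kept
print(exclusive)  # the edges are dropped
```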

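5. Paging: from (offset) and size (number of hits returned)

# pagination: skip the first 50 hits, return the next 50 (the second page of 50)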
GET bank/_search
{
  "from": 50,
  "size":50
}

GET bank/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1000,
        "lte": 2000
      }
    }
  },
  "from": 50,
  "size":50
}
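
Note that from is an offset into the hit list, not a page number. A minimal Python helper for turning a 1-based page number into the from/size pair (page_to_from_size is a hypothetical name used only here):

```python
def page_to_from_size(page, page_size):
    """Translate a 1-based page number into the from/size pair of a search body."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size}

print(page_to_from_size(1, 50))  # first page starts at offset 0
print(page_to_from_size(2, 50))  # second page, the from/size used in the queries above
```

With default index settings, from + size may not exceed 10,000 (index.max_result_window), so this style of paging suits shallow pages only.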

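6. Highlighting with highlight

# highlight is a top-level key next to "query", not nested inside it
# (the sku index and the search text below are assumptions for illustration)
GET sku/_search
{
  "query": {
    "match": {"skuTitle": "phone"}
  },
  "highlight": {
    "fields":{"skuTitle":{}},
    "pre_tags": "<b style='color:red'>",
    "post_tags": "</b>"
  }
}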

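7. Returning only the needed fields with _source

        The includes and excludes keys under _source specify which fields the response contains and which it leaves out (older releases also accept the singular spellings include and exclude).
Example: query Elasticsearch by policy relation number, setting includes and excludes under _source.

Example query:

{
  "_source":{
    "includes":[
      "policyNo",
      "policyRelationNo",
      "policyStatus"
    ],
    "excludes":[
       "salesType"
    ]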
  },
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "policyRelationNo": "KR01435021"
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  },
  "from": 0,
  "size": 10
}

Query result:

{
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 19,
        "max_score": 11.391884,
        "hits": [
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407038",
                "_score": 11.391884,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120319",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407046",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120323",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407044",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120322",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407066",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120333",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407058",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120329",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407070",
                "_score": 10.713255,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120335",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407056",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120328",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407052",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120326",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407062",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120331",
                    "policyStatus": 11
                }
            },
            {
                "_index": "search4policy-msad-dev3_20200520000000",
                "_type": "policy-msad-dev3",
                "_id": "4407064",
                "_score": 10.294733,
                "_source": {
                    "policyRelationNo": "KR01435021",
                    "policyNo": "B609120332",
                    "policyStatus": 11
                }
            }
        ]
    },
    "took": 5,
    "timed_out": false
}
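
The effect of source filtering on a single document can be imitated in Python. This toy version handles only exact top-level field names (Elasticsearch also supports wildcards and nested paths); the salesType value is hypothetical because the original never shows that field, and filter_source is an illustrative helper.

```python
def filter_source(source, includes=None, excludes=None):
    """Keep fields listed in includes (all fields if empty), then drop those in excludes."""
    kept = {k: v for k, v in source.items() if not includes or k in includes}
    return {k: v for k, v in kept.items() if k not in (excludes or [])}

doc = {
    "policyNo": "B609120319",
    "policyRelationNo": "KR01435021",
    "policyStatus": 11,
    "salesType": "A",  # hypothetical value; the original does not show this field
}

filtered = filter_source(
    doc,
    includes=["policyNo", "policyRelationNo", "policyStatus"],
    excludes=["salesType"],
)
print(filtered)  # the three requested fields, as in the first hit above
```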
