[Elasticsearch] aggs (bucket) aggregation queries and secondary aggregation in Elasticsearch

  • Bucket aggregations
    Bucket aggregations don’t calculate metrics over fields like the metrics aggregations do, but instead, they create buckets of documents. Each bucket is associated with a criterion (depending on the aggregation type) which determines whether or not a document in the current context “falls” into it. In other words, the buckets effectively define document sets. In addition to the buckets themselves, the bucket aggregations also compute and return the number of documents that “fell into” each bucket.
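    Bucket aggregations, as opposed to metrics aggregations, can hold sub-aggregations. These sub-aggregations will be aggregated for the buckets created by their “parent” bucket aggregation. (Roughly speaking: the data you specify is placed into buckets, and the response returns each bucket's document count along with the fields you asked for.)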
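
To make the idea of a bucket concrete, here is a minimal Python sketch that mimics what a terms-style bucket aggregation does locally. It is only an illustration under assumptions: the helper name terms_buckets and the four-document list are made up for this example, and nothing here calls Elasticsearch.

```python
from collections import Counter

# Hypothetical miniature of the product data used later in this article.
docs = [
    {"name": "曼克顿全麦面包", "type": "面包"},
    {"name": "果子面包", "type": "面包"},
    {"name": "XS Max", "type": "手机"},
    {"name": "红米电视", "type": "电视"},
]

def terms_buckets(docs, field):
    """Put each document into the bucket named by its `field` value and count them."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    # Largest buckets first; ties are broken by key so the output is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]

buckets = terms_buckets(docs, "type")
print(buckets)
```

Each returned entry carries a key (the criterion) and a doc_count (how many documents fell into the bucket), which is the same shape the responses later in this article have.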

Official documentation

Here are a few simple examples.

First, prepare some data:

POST /product/_bulk
{"index":{"_id":1}}
{"name":"曼克顿全麦面包","desc":"曼克顿全麦面包,超级松软","price":9.00,"lv":"高端面包","type":"面包","createtime":"2022-11-01T08:00:00Z","tags":["松软","全麦","健康"]}
{"index":{"_id":2}}
{"name":"曼克顿高钙面包","desc":"曼克顿高钙面包,每片含0.01g钙","price":10.29,"lv":"高端面包","type":"面包","createtime":"2022-05-21T08:00:00Z","tags":["营养","高钙","儿童"]}
{"index":{"_id":3}}
{"name":"果子面包","desc":"百年义利,老字号品牌面包","price":9.99,"lv":"高端面包","type":"面包","createtime":"2022-06-20","tags":["品牌","老字号","百年义利用"]}
{"index":{"_id":4}}
{"name":"曼克顿巧克力","desc":"曼克顿巧克力,代脂巧克力","price":30,"lv":"代脂巧克力","type":"糖果,巧克力","createtime":"2022-06-23","tags":["代脂巧克力","巧克力","奶油巧克力"]}
{"index":{"_id":5}}
{"name":"百醇巧克力","desc":"百醇巧克力,纯享可香可","price":55,"type":"糖果,巧克力","lv":"可可巧克力","createtime":"2022-07-20","tags":["可可巧克力","巧克力","奶油巧克力"]}
{"index":{"_id":6}}
{"name":"小米手机10","desc":"充电贼快掉电更快,超级无敌望远镜,高刷电竞屏","price":"","lv":"旗舰机","type":"手机","createtime":"2021-07-27","tags":["120HZ刷新率","120W快充","120倍变焦"]}
{"index":{"_id":7}}
{"name":"飞翔SE2","desc":"除了飞行模式,一无是处","price":"3299","lv":"旗舰机","type":"手机","createtime":"2020-07-21","tags":["割韭菜","割韭菜","割新韭菜"]}
{"index":{"_id":8}}
{"name":"XS Max","desc":"听说要出新款12手机了,终于可以换掉手中的Xs了","price":4399,"lv":"旗舰机","type":"手机","createtime":"2020-08-19","tags":["5V1A","4G全网通","大"]}
{"index":{"_id":9}}
{"name":"小米电视","desc":"70寸性价比只选,不要一万八,要不要八千八,只要两千九百九十八","price":2998,"lv":"高端机","type":"耳机","createtime":"2020-08-16","tags":["巨馍","家庭影院","游戏"]}
{"index":{"_id":10}}
{"name":"红米电视","desc":"我比上边那个更划算,我也2998,我也70寸,但是我更好看","price":2999,"type":"电视","lv":"高端机","createtime":"2020-08-28","tags":["大片","蓝光8K","超薄"]}
{"index":{"_id":11}}
{"name":"红米电视","desc":"我比上边那个更划算,我也2998,我也70寸,但是我更好看","price":2998,"type":"电视","lv":"高端机","createtime":"2020-08-28","tags":["大片","蓝光8K","超薄"]}
  • A basic aggregation query (aggregate by tag and sort the buckets by document count in ascending order)
# Aggregation query (aggs)
GET product/_search
{
  "size": 0, 
  "aggs": {
    "aggs_tag":{
      "terms": {
        "field": "tags.keyword",
        "order": {
          "_count": "asc"
        }
      }
    }
  }
}

Search result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "aggs_tag": {
      "doc_count_error_upper_bound": -1,
      "sum_other_doc_count": 22,
      "buckets": [
        {
          "key": "120HZ刷新率",
          "doc_count": 1
        },
        {
          "key": "120W快充",
          "doc_count": 1
        },
        {
          "key": "120倍变焦",
          "doc_count": 1
        },
        {
          "key": "4G全网通",
          "doc_count": 1
        },
        {
          "key": "5V1A",
          "doc_count": 1
        },
        {
          "key": "代脂巧克力",
          "doc_count": 1
        },
        {
          "key": "健康",
          "doc_count": 1
        },
        {
          "key": "儿童",
          "doc_count": 1
        },
        {
          "key": "全麦",
          "doc_count": 1
        },
        {
          "key": "割新韭菜",
          "doc_count": 1
        }
      ]
    }
  }
}
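
The response above can be reproduced locally, which helps explain two details: a document is counted once per distinct tag (document 7 repeats 割韭菜 but contributes only 1), and sum_other_doc_count is the total count of the buckets that were not returned. The sketch below is a plain-Python approximation under assumptions: the function name terms_by_count_asc is hypothetical, and ties are broken by comparing keys as Python strings.

```python
from collections import Counter

# Tags of the 11 sample documents, in _id order (copied from the bulk request above).
doc_tags = [
    ["松软", "全麦", "健康"],
    ["营养", "高钙", "儿童"],
    ["品牌", "老字号", "百年义利用"],
    ["代脂巧克力", "巧克力", "奶油巧克力"],
    ["可可巧克力", "巧克力", "奶油巧克力"],
    ["120HZ刷新率", "120W快充", "120倍变焦"],
    ["割韭菜", "割韭菜", "割新韭菜"],
    ["5V1A", "4G全网通", "大"],
    ["巨馍", "家庭影院", "游戏"],
    ["大片", "蓝光8K", "超薄"],
    ["大片", "蓝光8K", "超薄"],
]

def terms_by_count_asc(doc_tags, size=10):
    """Return the `size` smallest buckets plus the summed count of the rest."""
    # set(tags): a document is counted once per distinct tag, even if the tag repeats.
    counts = Counter(tag for tags in doc_tags for tag in set(tags))
    # Ascending by count, then by key.
    ordered = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))
    top = ordered[:size]
    other = sum(n for _, n in ordered[size:])
    return top, other

top, other = terms_by_count_asc(doc_tags)
for key, doc_count in top:
    print(key, doc_count)
print("sum_other_doc_count:", other)
```

The default size of 10 explains why only ten buckets come back even though the sample data contains more distinct tags.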
  • Aggregation query for the highest price
GET product/_search
{
  "size": 0, 
  "aggs": {
    "max_price":{
      "max": {
        "field": "price"
      }
    }
  }
}

The search result is as follows:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "max_price": {
      "value": 4399
    }
  }
}

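Now for the focus of this article: using aggs to perform a secondary aggregation, that is, aggregating over the results of another aggregation.

  • Find the product type with the lowest average price
  • Use a pipeline aggregation for the query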
GET product/_search
{
  "size": 0,
  "aggs": {
    "product_type": {
      "terms": {
        "field": "type.keyword"
      },
      "aggs": {
        "price_type_avg": {
          "avg": {
            "field": "price"
          }
        }
      }
    },
    "min_avg_type":{
      "min_bucket": {
        "buckets_path": "product_type>price_type_avg"
      }
    }
  }
}

Search result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "product_type": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "手机",
          "doc_count": 3,
          "price_type_avg": {
            "value": 3849
          }
        },
        {
          "key": "面包",
          "doc_count": 3,
          "price_type_avg": {
            "value": 9.759999910990397
          }
        },
        {
          "key": "电视",
          "doc_count": 2,
          "price_type_avg": {
            "value": 2998.5
          }
        },
        {
          "key": "糖果,巧克力",
          "doc_count": 2,
          "price_type_avg": {
            "value": 42.5
          }
        },
        {
          "key": "耳机",
          "doc_count": 1,
          "price_type_avg": {
            "value": 2998
          }
        }
      ]
    },
    "min_avg_type": {
      "value": 9.759999910990397,
      "keys": [
        "面包"
      ]
    }
  }
}
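
The two-step logic of this request (an average per type bucket, then the minimum across those averages) can be sketched in plain Python. This is an illustration under assumptions, not how Elasticsearch is implemented: the helpers avg and min_bucket are hypothetical names, document 6 is modelled as having no price, and the string price "3299" of document 7 is written as a number.

```python
# Prices grouped by type, taken from the sample data; None marks a missing price.
prices_by_type = {
    "手机": [None, 3299, 4399],
    "面包": [9.0, 10.29, 9.99],
    "电视": [2999, 2998],
    "糖果,巧克力": [30, 55],
    "耳机": [2998],
}

def avg(values):
    """Average of the present values; documents without a price are skipped."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None

def min_bucket(metric_by_key):
    """Return the smallest metric and every bucket key that reaches it."""
    usable = {k: v for k, v in metric_by_key.items() if v is not None}
    lowest = min(usable.values())
    return {"value": lowest, "keys": [k for k, v in usable.items() if v == lowest]}

# Step 1: metric per bucket. Step 2: pipeline step over the bucket metrics.
price_type_avg = {key: avg(values) for key, values in prices_by_type.items()}
min_avg_type = min_bucket(price_type_avg)
print(min_avg_type)
```

The sketch works in double precision, so its average for 面包 will not match the response digit for digit; the response value 9.759999910990397 most likely reflects the price field being stored as a single-precision float.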
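  • Aggregate by type, then by level (lv) within each type, compute price statistics and tag buckets for each level, and finally use max_bucket to find, within each type, the level bucket with the highest maximum price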
GET product/_search
{
  "size": 0,
  "aggs": {
    "product_type": {
      "terms": {
        "field": "type.keyword"
      },
      "aggs": {
        "lv_aggs": {
          "terms": {
            "field": "lv.keyword"
          },
          "aggs": {
            "lv_price": {
              "stats": {
                "field": "price"
              }
            },
            "tages_buckets": {
              "terms": {
                "field": "tags.keyword"
              }
            }
          }
        },
        "max_type": {
          "max_bucket": {
            "buckets_path": "lv_aggs>lv_price.max"
          }
        }
      }
    }
  }
}

The search result is as follows:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "product_type": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "手机",
          "doc_count": 3,
          "lv_aggs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "旗舰机",
                "doc_count": 3,
                "tages_buckets": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "120HZ刷新率",
                      "doc_count": 1
                    },
                    {
                      "key": "120W快充",
                      "doc_count": 1
                    },
                    {
                      "key": "120倍变焦",
                      "doc_count": 1
                    },
                    {
                      "key": "4G全网通",
                      "doc_count": 1
                    },
                    {
                      "key": "5V1A",
                      "doc_count": 1
                    },
                    {
                      "key": "割新韭菜",
                      "doc_count": 1
                    },
                    {
                      "key": "割韭菜",
                      "doc_count": 1
                    },
                    {
                      "key": "大",
                      "doc_count": 1
                    }
                  ]
                },
                "lv_price": {
                  "count": 2,
                  "min": 3299,
                  "max": 4399,
                  "avg": 3849,
                  "sum": 7698
                }
              }
            ]
          },
          "max_type": {
            "value": 4399,
            "keys": [
              "旗舰机"
            ]
          }
        },
        {
          "key": "面包",
          "doc_count": 3,
          "lv_aggs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "高端面包",
                "doc_count": 3,
                "tages_buckets": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "健康",
                      "doc_count": 1
                    },
                    {
                      "key": "儿童",
                      "doc_count": 1
                    },
                    {
                      "key": "全麦",
                      "doc_count": 1
                    },
                    {
                      "key": "品牌",
                      "doc_count": 1
                    },
                    {
                      "key": "松软",
                      "doc_count": 1
                    },
                    {
                      "key": "百年义利用",
                      "doc_count": 1
                    },
                    {
                      "key": "老字号",
                      "doc_count": 1
                    },
                    {
                      "key": "营养",
                      "doc_count": 1
                    },
                    {
                      "key": "高钙",
                      "doc_count": 1
                    }
                  ]
                },
                "lv_price": {
                  "count": 3,
                  "min": 9,
                  "max": 10.289999961853027,
                  "avg": 9.759999910990397,
                  "sum": 29.27999973297119
                }
              }
            ]
          },
          "max_type": {
            "value": 10.289999961853027,
            "keys": [
              "高端面包"
            ]
          }
        },
        {
          "key": "电视",
          "doc_count": 2,
          "lv_aggs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "高端机",
                "doc_count": 2,
                "tages_buckets": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "大片",
                      "doc_count": 2
                    },
                    {
                      "key": "蓝光8K",
                      "doc_count": 2
                    },
                    {
                      "key": "超薄",
                      "doc_count": 2
                    }
                  ]
                },
                "lv_price": {
                  "count": 2,
                  "min": 2998,
                  "max": 2999,
                  "avg": 2998.5,
                  "sum": 5997
                }
              }
            ]
          },
          "max_type": {
            "value": 2999,
            "keys": [
              "高端机"
            ]
          }
        },
        {
          "key": "糖果,巧克力",
          "doc_count": 2,
          "lv_aggs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "代脂巧克力",
                "doc_count": 1,
                "tages_buckets": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "代脂巧克力",
                      "doc_count": 1
                    },
                    {
                      "key": "奶油巧克力",
                      "doc_count": 1
                    },
                    {
                      "key": "巧克力",
                      "doc_count": 1
                    }
                  ]
                },
                "lv_price": {
                  "count": 1,
                  "min": 30,
                  "max": 30,
                  "avg": 30,
                  "sum": 30
                }
              },
              {
                "key": "可可巧克力",
                "doc_count": 1,
                "tages_buckets": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "可可巧克力",
                      "doc_count": 1
                    },
                    {
                      "key": "奶油巧克力",
                      "doc_count": 1
                    },
                    {
                      "key": "巧克力",
                      "doc_count": 1
                    }
                  ]
                },
                "lv_price": {
                  "count": 1,
                  "min": 55,
                  "max": 55,
                  "avg": 55,
                  "sum": 55
                }
              }
            ]
          },
          "max_type": {
            "value": 55,
            "keys": [
              "可可巧克力"
            ]
          }
        },
        {
          "key": "耳机",
          "doc_count": 1,
          "lv_aggs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "高端机",
                "doc_count": 1,
                "tages_buckets": {
                  "doc_count_error_upper_bound": 0,
                  "sum_other_doc_count": 0,
                  "buckets": [
                    {
                      "key": "家庭影院",
                      "doc_count": 1
                    },
                    {
                      "key": "巨馍",
                      "doc_count": 1
                    },
                    {
                      "key": "游戏",
                      "doc_count": 1
                    }
                  ]
                },
                "lv_price": {
                  "count": 1,
                  "min": 2998,
                  "max": 2998,
                  "avg": 2998,
                  "sum": 2998
                }
              }
            ]
          },
          "max_type": {
            "value": 2998,
            "keys": [
              "高端机"
            ]
          }
        }
      ]
    }
  }
}
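
The buckets_path value "lv_aggs>lv_price.max" reads as: go into the lv_aggs buckets, take each bucket's lv_price metric, and pick its max field. The sketch below shows one way to resolve such a path against a response fragment. It is a simplified, hypothetical helper that handles a single ">" step only, not the full syntax Elasticsearch accepts.

```python
def resolve_buckets_path(aggs, path):
    """Return (bucket key, metric value) pairs for a path like 'lv_aggs>lv_price.max'."""
    bucket_agg, metric_path = path.split(">", 1)
    metric_name, _, field = metric_path.partition(".")
    pairs = []
    for bucket in aggs[bucket_agg]["buckets"]:
        metric = bucket[metric_name]
        # Multi-value metrics such as stats need a field after the dot;
        # single-value metrics expose their result under "value".
        pairs.append((bucket["key"], metric[field] if field else metric["value"]))
    return pairs

def max_bucket(aggs, path):
    """Return the largest resolved value and the bucket keys that reach it."""
    pairs = resolve_buckets_path(aggs, path)
    highest = max(value for _, value in pairs)
    return {"value": highest, "keys": [key for key, value in pairs if value == highest]}

# Sub-aggregations of the "糖果,巧克力" type bucket, trimmed from the response above.
candy_aggs = {
    "lv_aggs": {
        "buckets": [
            {"key": "代脂巧克力", "lv_price": {"min": 30, "max": 30}},
            {"key": "可可巧克力", "lv_price": {"min": 55, "max": 55}},
        ]
    }
}

max_type = max_bucket(candy_aggs, "lv_aggs>lv_price.max")
print(max_type)
```

Because max_type sits next to lv_aggs inside each type bucket, it is evaluated once per type, which is why every type bucket in the response has its own max_type entry.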

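The following small test shows that the query is applied before the aggregations: aggregations only run over the documents that match the query. You can therefore filter out part of the data in the query first and aggregate afterwards, which reduces Elasticsearch's memory use and improves performance.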

GET product/_search
{
  "size": 10,
  "query": {
    "range": {
      "price": {
        "gte": 10,
        "lte": 60
      }
    }
  },
  "aggs": {
    "tags_bucket": {
      "terms": {
        "field": "tags.keyword"
      }
    }
  }
}

The search result is as follows (only the bread and candy products priced between 10 and 60 are shown):

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "曼克顿高钙面包",
          "desc": "曼克顿高钙面包,每片含0.01g钙",
          "price": 10.29,
          "lv": "高端面包",
          "type": "面包",
          "createtime": "2022-05-21T08:00:00Z",
          "tags": [
            "营养",
            "高钙",
            "儿童"
          ]
        }
      },
      {
        "_index": "product",
        "_id": "4",
        "_score": 1,
        "_source": {
          "name": "曼克顿巧克力",
          "desc": "曼克顿巧克力,代脂巧克力",
          "price": 30,
          "lv": "代脂巧克力",
          "type": "糖果,巧克力",
          "createtime": "2022-06-23",
          "tags": [
            "代脂巧克力",
            "巧克力",
            "奶油巧克力"
          ]
        }
      },
      {
        "_index": "product",
        "_id": "5",
        "_score": 1,
        "_source": {
          "name": "百醇巧克力",
          "desc": "百醇巧克力,纯享可香可",
          "price": 55,
          "type": "糖果,巧克力",
          "lv": "可可巧克力",
          "createtime": "2022-07-20",
          "tags": [
            "可可巧克力",
            "巧克力",
            "奶油巧克力"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "tags_bucket": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "奶油巧克力",
          "doc_count": 2
        },
        {
          "key": "巧克力",
          "doc_count": 2
        },
        {
          "key": "代脂巧克力",
          "doc_count": 1
        },
        {
          "key": "儿童",
          "doc_count": 1
        },
        {
          "key": "可可巧克力",
          "doc_count": 1
        },
        {
          "key": "营养",
          "doc_count": 1
        },
        {
          "key": "高钙",
          "doc_count": 1
        }
      ]
    }
  }
}
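
The order of operations in this test (filter first, then bucket only what survived) can be mirrored in a few lines of Python. This is a local sketch under assumptions: the search function is hypothetical, and only six of the sample documents are included to keep it short.

```python
from collections import Counter

# A subset of the sample documents: _id, price and tags only.
docs = [
    {"_id": 1, "price": 9.0, "tags": ["松软", "全麦", "健康"]},
    {"_id": 2, "price": 10.29, "tags": ["营养", "高钙", "儿童"]},
    {"_id": 3, "price": 9.99, "tags": ["品牌", "老字号", "百年义利用"]},
    {"_id": 4, "price": 30, "tags": ["代脂巧克力", "巧克力", "奶油巧克力"]},
    {"_id": 5, "price": 55, "tags": ["可可巧克力", "巧克力", "奶油巧克力"]},
    {"_id": 8, "price": 4399, "tags": ["5V1A", "4G全网通", "大"]},
]

def search(docs, gte, lte):
    """Apply the range query first, then build tag buckets from the hits only."""
    hits = [d for d in docs if gte <= d["price"] <= lte]
    counts = Counter(tag for d in hits for tag in set(d["tags"]))
    # Descending by count, then by key.
    buckets = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return hits, buckets

hits, buckets = search(docs, gte=10, lte=60)
print([d["_id"] for d in hits])
print(buckets)
```

Tags that belong only to filtered-out documents, such as 松软 or 5V1A, never reach the bucketing step, which matches their absence from the response above.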
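  • This query uses a global aggregation. Because of global, all_min_price is computed over every document in the index and ignores the query, while the other aggregations only see the documents that match the query. The official documentation notes that a global aggregation can only be placed at the top level, because embedding it inside another bucket aggregation would not make sense.

Official documentation for the global aggregation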

GET product/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "price": {
            "gte": 3000
          }
        }
      },
      "boost": 1.2
    }
  },
  "aggs": {
    "stat_data": {
      "stats": {
        "field": "price"
      }
    },
    "all_min_price": {
      "global": {},
      "aggs": {
        "global_min_price": {
          "min": {
            "field": "price"
          }
        }
      }
    },
    "multi_range_price": {
      "filter": {
        "range": {
          "price": {
            "lte": 4000
          }
        }
      },
      "aggs": {
        "avg_4000_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
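
The difference in scope between the three aggregations in this request can be sketched in plain Python. The original article shows no response for it, so the numbers below are only what this sketch derives from the sample data, not captured Elasticsearch output. The variable names are hypothetical, and document 6 is modelled as having no price.

```python
# Prices of the 11 sample documents in _id order; None marks the missing price.
prices = [9.0, 10.29, 9.99, 30, 55, None, 3299, 4399, 2998, 2999, 2998]

# Query scope: only documents with price >= 3000.
matched = [p for p in prices if p is not None and p >= 3000]

# stats over the query scope.
stat_data = {
    "count": len(matched),
    "min": min(matched),
    "max": max(matched),
    "avg": sum(matched) / len(matched),
    "sum": sum(matched),
}

# global: ignores the query and looks at every document in the index.
all_min_price = {
    "doc_count": len(prices),
    "min": min(p for p in prices if p is not None),
}

# filter aggregation: narrows the query scope further to price <= 4000.
in_filter = [p for p in matched if p <= 4000]
filtered_avg = sum(in_filter) / len(in_filter)

print(stat_data)
print(all_min_price)
print(filtered_avg)
```

Only the global branch sees the cheap bread items, so its minimum is far below the minimum reported by the stats aggregation.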
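  • Multi-level aggregation with sorting
    With a second-level aggregation, the resulting buckets can be sorted by a value computed inside them. This example sorts the type buckets in descending order by the sum of the product prices in each bucket.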
GET product/_search
{
  "size": 0,
  "aggs": {
    "type_args": {
      "terms": {
        "field": "type.keyword",
        "order": {
          "stat_data.sum": "desc"
        }
      },
      "aggs": {
        "stat_data": {
          "stats": {
            "field": "price"
          }
        }
      }
    }
  }
}

The result is as follows:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "type_args": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "手机",
          "doc_count": 3,
          "stat_data": {
            "count": 2,
            "min": 3299,
            "max": 4399,
            "avg": 3849,
            "sum": 7698
          }
        },
        {
          "key": "电视",
          "doc_count": 2,
          "stat_data": {
            "count": 2,
            "min": 2998,
            "max": 2999,
            "avg": 2998.5,
            "sum": 5997
          }
        },
        {
          "key": "耳机",
          "doc_count": 1,
          "stat_data": {
            "count": 1,
            "min": 2998,
            "max": 2998,
            "avg": 2998,
            "sum": 2998
          }
        },
        {
          "key": "糖果,巧克力",
          "doc_count": 2,
          "stat_data": {
            "count": 2,
            "min": 30,
            "max": 55,
            "avg": 42.5,
            "sum": 85
          }
        },
        {
          "key": "面包",
          "doc_count": 3,
          "stat_data": {
            "count": 3,
            "min": 9,
            "max": 10.289999961853027,
            "avg": 9.759999910990397,
            "sum": 29.27999973297119
          }
        }
      ]
    }
  }
}
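
Ordering buckets by a sub-aggregation value, as "stat_data.sum": "desc" does here, amounts to computing the statistic per bucket and sorting on it. A minimal Python sketch, assuming the same sample prices (the helper name stats is hypothetical and document 6 is modelled as having no price):

```python
# Prices grouped by type, taken from the sample data; None marks a missing price.
prices_by_type = {
    "手机": [None, 3299, 4399],
    "面包": [9.0, 10.29, 9.99],
    "电视": [2999, 2998],
    "糖果,巧克力": [30, 55],
    "耳机": [2998],
}

def stats(values):
    """Count, min, max, avg and sum over the present values of one bucket."""
    present = [v for v in values if v is not None]
    total = sum(present)
    return {
        "count": len(present),
        "min": min(present),
        "max": max(present),
        "avg": total / len(present),
        "sum": total,
    }

buckets = [
    {"key": key, "doc_count": len(values), "stat_data": stats(values)}
    for key, values in prices_by_type.items()
]
# Equivalent of "order": {"stat_data.sum": "desc"}.
buckets.sort(key=lambda b: b["stat_data"]["sum"], reverse=True)

for b in buckets:
    print(b["key"], b["stat_data"]["sum"])
```

Note that doc_count and stat_data.count can differ: the 手机 bucket holds three documents, but only two of them contribute a price.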