[ElasticSearch] A Collection of Common Aggregation and Statistics Techniques


太阳下的兰花草



Category: ElasticSearch

Article link: https://blog.csdn.net/starlywang/article/details/105442073


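This article covers bucket pagination, an efficient pagination strategy in ElasticSearch; pipeline aggregations for second-pass statistics; and filtering before and after aggregation. It also discusses Top_Hits, which bundles fields into an aggregation much like a struct in Spark, so that other field values are retained when grouping.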



I. Bucket Pagination

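Scroll retrieves large volumes of data and is heavy on memory, and SearchAfter requires maintaining rolling sort parameters, which is cumbersome. Bucket pagination offers a better approach to paged retrieval: all buckets are sorted in memory on the ElasticSearch side and only one page of them is returned, which reduces network transfer and removes the need to maintain sort parameters.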
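To make the idea concrete, here is a minimal sketch in plain Python of what "sort everything, return one page" means. It does not call ElasticSearch; it only imitates the sort-then-slice behaviour on a small in-memory list. The function names `bucket_sort_page` and `total_pages` and the sample buckets are hypothetical and exist only for illustration.

```python
import math


def bucket_sort_page(buckets, sort_key, page, page_size):
    """Imitate sort-then-slice paging: order all buckets by a metric
    (descending), then keep a single page of them."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    ordered = sorted(buckets, key=lambda b: b[sort_key], reverse=True)
    start = (page - 1) * page_size            # plays the role of "from"
    return ordered[start:start + page_size]   # slice length plays the role of "size"


def total_pages(total_buckets, page_size):
    """Number of pages needed for a given total bucket count."""
    return math.ceil(total_buckets / page_size)


# Hypothetical sample data: one entry per bucket, with one summed metric.
buckets = [
    {"key": "a", "exposure": 10.0},
    {"key": "b", "exposure": 30.0},
    {"key": "c", "exposure": 20.0},
]

first_page = bucket_sort_page(buckets, "exposure", page=1, page_size=2)
print([b["key"] for b in first_page])   # ['b', 'c']
print(total_pages(len(buckets), 2))     # 2
```

In the sketch, the page offset corresponds to `from`, the page length to `size`, and the total bucket count to the distinct-value count that a query would need to report alongside the page.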

{
    "size":0,
    "aggregations":{
        # total_num: total number of results, i.e. the approximate count of distinct product_name values
        # precision_threshold: defaults to 3000 (maximum 40000); larger values are more accurate but use more memory
        "total_num":{
            "cardinality":{
                "field":"product_name",
                "precision_threshold":100
            }
        },
        "product_name":{
            "terms":{
                "field":"product_name",
                # size: to sort over the full set of buckets, set the terms size to the maximum Integer value
                "size":2147483647,
                "min_doc_count":1,
                "shard_min_doc_count":0,
                "show_term_doc_count_error":false,
                "execution_hint":"map",
                "order":[
                    {
                        "_count":"desc"
                    },
                    {
                        "_key":"asc"
                    }
                ],
                # collect_mode: breadth_first defers sub-aggregations until the top buckets are known,
                # which is usually cheaper than depth_first on high-cardinality fields
                "collect_mode":"breadth_first"
            },
            "aggregations":{
                "material_estimate_exposure":{
                    "sum":{
                        "field":"goods_material_estimate_exposure"
                    }
                },
                # bucket_sort: sorts the buckets, then uses from and size to take one page of the sorted full set
                "score_bucket_sort":{
                    "bucket_sort":{
                        "sort":[
                            {
                                # sort by one of the aggregations in the aggregation tree
                                "material_estimate_exposure":{
                                    "order":"desc"
                                }
                            }
                        ],
                        "from":0,
                        "size":50,
                        "gap_policy":"SKIP"