4.elasticsearch聚合分析

【README】

1.本文介绍了elasticsearch聚合分析的开发方式;


【1】聚合分析介绍

聚合提供了从数据中分组和提取数据的能力。

  • 最简单的聚合方法类似于 group by 和 sql 聚合函数了;
  • 在es中,搜索结果返回 hits(命中结果),并且同时返回聚合结果;
  • 可以执行查询和多个聚合,并且在一次请求中得到各自的结果

2)聚合开发文档:

Aggregations | Elasticsearch Guide [7.2] | Elastic


 【1.1】agg-terms统计年龄分布及avg统计平均年龄

1)搜索address中包含 mill的所有人的年龄分布以及平均年龄;

  • terms聚合结果是数据分布,如年龄为28的有3人,年龄为29的有5人;
  • terms 类似于 sql中的 group by 关键字

场景:查询address包含mill的文档,并基于此统计age分布(统计每个年龄值的个数),年龄均值,余额均值;(这就是一次查询请求,可以执行多种聚合的例子)

注意: match中的address是包含mill,而不是等于mill ,则符合条件

// 【sql】
Select count(1) , avg(age), avg(balance)
From bank
Where address like '%mill%'
Group by age

// [ES AGG] elasticsearch聚合
Post  localhost:9200/bank/_search 
{
    "query":{
        "match":{
            "address":"mill"
        }
    }
    , "aggs":{ // 聚合关键字 aggs 
        "ageAgg":{ 
            "terms":{
                "field":"age"
                , "size":10 // 仅统计出10种不同可能的结果  
            }
        }
        , "ageAvgAgg":{ 
            "avg":{"field":"age"} // 求age均值 
        }
        , "balanceAvgAgg":{
            "avg":{"field":"balance"} // 求 balance余额 均值  
        }
    }
    , "size":0
}

// 聚合结果 
{
    "took": 20,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 4,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "ageAgg": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [ // 年龄分组结果 
                {
                    "key": 38,
                    "doc_count": 2
                },
                {
                    "key": 28,
                    "doc_count": 1
                },
                {
                    "key": 32,
                    "doc_count": 1
                }
            ]
        },
        "ageAvgAgg": { // 年龄均值 
            "value": 34.0
        },
        "balanceAvgAgg": { // 余额均值
            "value": 25208.0
        }
    }
}

【1.2】嵌套聚合统计各个年龄值的人的平均薪资

1)场景:按照年龄聚合,并且基于此,统计这些年龄段的人的平均薪资;
如统计 年龄为31的61个人的平均薪资;

Post localhost:9200/bank/_search 
{
    "query":{
        "match_all":{}
    }
    , "aggs":{
        "ageAgg":{
            "terms":{
                "field":"age"
                , "size":3
            }
            , "aggs":{ // 嵌套聚合,聚合中的聚合 
                "balanceAvgAggForEveryAge":{
                    "avg":{"field":"balance"}
                }
            }
        }        
    }
    , "size":0
}

// 聚合结果 
{
    "took": 52,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1000,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "ageAgg": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 820,
            "buckets": [
                {
                    "key": 31,
                    "doc_count": 61,
                    "balanceAvgAggForEveryAge": {
                        "value": 28312.918032786885
                    }
                },
                {
                    "key": 39,
                    "doc_count": 60,
                    "balanceAvgAggForEveryAge": {
                        "value": 25269.583333333332
                    }
                },
                {
                    "key": 26,
                    "doc_count": 59,
                    "balanceAvgAggForEveryAge": {
                        "value": 23194.813559322032
                    }
                }
            ]
        }
    }
}

【1.3】terms分组嵌套

场景:查询所有年龄分布,并基于此统计每个年龄段中性别为M和F的平均薪资

Post localhost:9200/bank/_searchlocalhost:9200/bank/_search
{
    "query":{
        "match_all":{}
    }
    , "aggs":{
        "ageAgg":{
            "terms":{
                "field":"age"
                , "size":3
            }
           , "aggs":{
               "wholeBalanceAvgAgg" :{
                   "avg":{"field":"balance"}
               }
               , "genderAgg":{
                    "terms":{
                        "field":"gender.keyword"
                    }
                    , "aggs":{
                        "balanceAvgAgg":{
                                "avg":{"field":"balance"}
                            }
                    }
                }
           }
        }        
    }
    , "size":0
}

// 聚合结果 
{
    "took": 12,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1000,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "ageAgg": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 820,
            "buckets": [
                {
                    "key": 31,
                    "doc_count": 61,
                    "wholeBalanceAvgAgg": {
                        "value": 28312.918032786885
                    },
                    "genderAgg": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "M",
                                "doc_count": 35,
                                "balanceAvgAgg": {
                                    "value": 29565.628571428573
                                }
                            },
                            {
                                "key": "F",
                                "doc_count": 26,
                                "balanceAvgAgg": {
                                    "value": 26626.576923076922
                                }
                            }
                        ]
                    }
                },
                {
                    "key": 39,
                    "doc_count": 60,
                    "wholeBalanceAvgAgg": {
                        "value": 25269.583333333332
                    },
                    "genderAgg": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "F",
                                "doc_count": 38,
                                "balanceAvgAgg": {
                                    "value": 26348.684210526317
                                }
                            },
                            {
                                "key": "M",
                                "doc_count": 22,
                                "balanceAvgAgg": {
                                    "value": 23405.68181818182
                                }
                            }
                        ]
                    }
                },
                {
                    "key": 26,
                    "doc_count": 59,
                    "wholeBalanceAvgAgg": {
                        "value": 23194.813559322032
                    },
                    "genderAgg": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "M",
                                "doc_count": 32,
                                "balanceAvgAgg": {
                                    "value": 25094.78125
                                }
                            },
                            {
                                "key": "F",
                                "doc_count": 27,
                                "balanceAvgAgg": {
                                    "value": 20943.0
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
}

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
co.elastic.clients.elasticsearch.core.aggregations 是 Java 客户端 ElasticSearch 的一个聚合Aggregation)方法,用于对数据进行分析和统计。 具体使用方法可以参考以下示例: ```java import co.elastic.clients.base.*; import co.elastic.clients.elasticsearch.*; import co.elastic.clients.elasticsearch.core.*; import co.elastic.clients.elasticsearch.core.aggregations.*; import co.elastic.clients.elasticsearch.core.aggregations.bucket.*; import co.elastic.clients.elasticsearch.core.aggregations.metrics.*; import java.io.IOException; import java.util.*; public class ElasticSearchAggregationExample { public static void main(String[] args) throws IOException, ApiException { RestClientBuilder restClientBuilder = RestClient.builder( new HttpHost("localhost", 9200, "http") ); ElasticSearch client = new ElasticSearch(restClientBuilder); SearchRequest request = new SearchRequest() .index("my_index") .source(new SearchSource() .query(new MatchAllQuery()) .aggregations(new TermsAggregation("my_terms_agg") .field("my_field") .size(10) .subAggregations(new AvgAggregation("my_avg_agg") .field("my_other_field") ) ) ); SearchResponse response = client.search(request); TermsAggregationResult myTermsAggResult = response.aggregations().terms("my_terms_agg"); for (TermsAggregationEntry entry : myTermsAggResult.buckets()) { String term = entry.keyAsString(); long count = entry.docCount(); AvgAggregationResult myAvgAggResult = entry.aggregations().avg("my_avg_agg"); double avg = myAvgAggResult.value(); System.out.println(term + ": " + count + ", avg: " + avg); } client.close(); } } ``` 这个例子展示了如何使用 co.elastic.clients.elasticsearch.core.aggregations 方法来进行聚合查询。在这个例子中,我们使用了 TermsAggregation 和 AvgAggregation 两个聚合方法,对数据进行了分组和统计。具体步骤为: 1. 创建一个 SearchRequest 对象,并设置索引名称和查询条件。 2. 在查询条件中添加聚合条件。这里使用了 TermsAggregation 来对数据进行分组,然后使用 AvgAggregation 来统计每个分组的平均值。 3. 执行查询,并获取查询结果。 4. 使用聚合结果对象的方法来获取聚合结果,然后对结果进行处理。 需要注意的是,聚合方法的具体参数和用法可以参考 ElasticSearch 官方文档。同时,Java 客户端的版本和 ElasticSearch 的版本也需要匹配,否则可能会出现兼容性问题。

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值