Elasticsearch in Practice (14): Multi-Level Nested Aggregations with aggs

Elasticsearch in Practice: multi-level nested grouping and statistics with aggs

1. Prepare the data
POST /testcopy/_bulk
{"index":{"_id": 1}}
{"empId" : "111","name" : "员工1","age" : 20,"sex" : "男","mobile" : "19000001111","salary":1333,"deptName" : "技术部","provice" : "湖北省","city":"武汉","area":"光谷大道","address":"湖北省武汉市洪山区光谷大厦","content" : "i like to write best elasticsearch article"}
{"index":{"_id": 2}}
{"empId" : "222","name" : "员工2","age" : 25,"sex" : "男","mobile" : "19000002222","salary":15963,"deptName" : "销售部","provice" : "湖北省","city":"武汉","area":"江汉区","address" : "湖北省武汉市江汉路","content" : "i think java is the best programming language"}
{"index":{"_id": 3}}
{ "empId" : "333","name" : "员工3","age" : 30,"sex" : "男","mobile" : "19000003333","salary":20000,"deptName" : "技术部","provice" : "湖北省","city":"武汉","area":"经济技术开发区","address" : "湖北省武汉市经济开发区","content" : "i am only an elasticsearch beginner"}
{"index":{"_id": 4}}
{"empId" : "444","name" : "员工4","age" : 20,"sex" : "女","mobile" : "19000004444","salary":5600,"deptName" : "销售部","provice" : "湖北省","city":"武汉","area":"沌口开发区","address" : "湖北省武汉市沌口开发区","content" : "elasticsearch and hadoop are all very good solution, i am a beginner"}
{"index":{"_id": 5}}
{ "empId" : "555","name" : "员工5","age" : 20,"sex" : "男","mobile" : "19000005555","salary":9665,"deptName" : "测试部","provice" : "湖北省","city":"高新开发区","area":"武汉","address" : "湖北省武汉市东湖隧道","content" : "spark is best big data solution based on scala ,an programming language similar to java"}
{"index":{"_id": 6}}
{"empId" : "666","name" : "员工6","age" : 30,"sex" : "女","mobile" : "19000006666","salary":30000,"deptName" : "技术部","provice" : "武汉市","city":"湖北省","area":"江汉区","address" : "湖北省武汉市江汉路","content" : "i like java developer"}
{"index":{"_id": 7}}
{"empId" : "777","name" : "员工7","age" : 60,"sex" : "女","mobile" : "19000007777","salary":52130,"deptName" : "测试部","provice" : "湖北省","city":"黄冈市","area":"边城区","address" : "湖北省黄冈市边城区","content" : "i like elasticsearch developer"}
{"index":{"_id": 8}}
{"empId" : "888","name" : "员工8","age" : 19,"sex" : "女","mobile" : "19000008888","salary":60000,"deptName" : "技术部","provice" : "湖北省","city":"武汉","area":"汉阳区","address" : "湖北省武汉市江汉大学","content" : "i like spark language"}
{"index":{"_id": 9}}
{"empId" : "999","name" : "员工9","age" : 40,"sex" : "男","mobile" : "19000009999","salary":23000,"deptName" : "销售部","provice" : "河南省","city":"郑州市","area":"二七区","address" : "河南省郑州市郑州大学","content" : "i like java developer"}
{"index":{"_id": 10}}
{"empId" : "101010","name" : "张湖北","age" : 35,"sex" : "男","mobile" : "19000001010","salary":18000,"deptName" : "测试部","provice" : "湖北省","city":"武汉","area":"高新开发区","address" : "湖北省武汉市东湖高新","content" : "i like java developer i also like  elasticsearch"}
{"index":{"_id": 11}}
{"empId" : "111111","name" : "王河南","age" : 61,"sex" : "男","mobile" : "19000001011","salary":10000,"deptName" : "销售部","provice" : "河南省","city":"开封市","area":"金明区","address" : "河南省开封市河南大学","content" : "i am not like  java "}
{"index":{"_id": 12}}
{"empId" : "121212","name" : "张大学","age" : 26,"sex" : "女","mobile" : "19000001012","salary":1321,"deptName" : "测试部","provice" : "河南省","city":"开封市","area":"金明区","address" : "河南省开封市河南大学","content" : "i am java developer  thing java is good"}
{"index":{"_id": 13}}
{"empId" : "131313","name" : "李江汉","age" : 36,"sex" : "男","mobile" : "19000001013","salary":1125,"deptName" : "销售部","provice" : "河南省","city":"郑州市","area":"二七区","address" : "河南省郑州市二七区","content" : "i like java and java is very best i like it do you like java "}
{"index":{"_id": 14}}
{"empId" : "141414","name" : "王技术","age" : 45,"sex" : "女","mobile" : "19000001014","salary":6222,"deptName" : "测试部","provice" : "河南省","city":"郑州市","area":"金水区","address" : "河南省郑州市金水区","content" : "i like c++"}
{"index":{"_id": 15}}
{"empId" : "151515","name" : "张测试","age" : 18,"sex" : "男","mobile" : "19000001015","salary":20000,"deptName" : "技术部","provice" : "河南省","city":"郑州市","area":"高新开发区","address" : "河南省郑州高新开发区","content" : "i think spark is good"}
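The _bulk body is newline-delimited JSON, so a single malformed line (a doubled comma, for instance) means that document cannot be indexed. Below is a minimal sketch of a pre-flight check that parses each line with the standard json module before the request is sent. The function name find_bad_bulk_lines and the two-document SAMPLE are hypothetical and only serve as illustration; they are not part of the original data set.

```python
import json


def find_bad_bulk_lines(ndjson_text):
    """Return (line_number, parser_message) for every line that is not valid JSON."""
    problems = []
    for number, line in enumerate(ndjson_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
    return problems


# Hypothetical sample: the second document (line 4) contains a doubled comma.
SAMPLE = "\n".join([
    '{"index":{"_id": 1}}',
    '{"empId": "111", "deptName": "技术部", "age": 20}',
    '{"index":{"_id": 2}}',
    '{"empId": "222", "deptName": "销售部",, "age": 25}',
])

bad = find_bad_bulk_lines(SAMPLE)
for number, message in bad:
    print(f"line {number}: {message}")
```

Running the same function over the real bulk body before posting it is a cheap way to make sure every document reaches the index.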
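2. Nested grouping queries with count and avg
2.1 Group by department and compute each department's average age; within each department, group by province, compute each province's average age, and order by that per-province average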

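How does this nested grouping differ from section 3.3 (avg inside nested groups) of the previous article, Elasticsearch in Practice (13): Aggs aggregations with count and avg?
The previous article grouped by department first, then by province, and computed the average age of the employees in each province within each department. The average was computed only once: the inner aggs sits at the same level as the province terms block, so the only average is the one per province within a department. No department-level average was computed.

The steps were:
1. Group by department (deptName).
2. Within each department, group by province (provice).
3. Inside each province bucket, use aggs to compute the average age.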

GET /testcopy/_search
{
  "size":0,
  "aggs":{
    "group_dept":{
      "terms": {
        "field": "deptName.keyword",
        "size": 10
      },
      // After the deptName terms block ends, nest the next grouping inside the department bucket
      "aggs": {
        "group_provice": {
          "terms": {
            "field": "provice.keyword",
            "size": 10
          },
          // After the provice terms block ends, compute the average inside the province bucket
          "aggs": {
            "provice_avg_age": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      }
    }
  }
}

Result of the previous query
[Figure: screenshot of the previous query's result]
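To make the bucket structure of the previous query concrete, here is a small pure-Python sketch of the same computation: department buckets, province buckets inside them, and one average per province bucket. It is not Elasticsearch code. It assumes all 15 documents from section 1 were indexed successfully, and the ROWS tuples (doc id, deptName, provice, age) were copied by hand from the bulk data; the function name is hypothetical.

```python
from collections import defaultdict

# (doc id, deptName, provice, age), copied from the bulk data in section 1.
ROWS = [
    (1, "技术部", "湖北省", 20), (2, "销售部", "湖北省", 25),
    (3, "技术部", "湖北省", 30), (4, "销售部", "湖北省", 20),
    (5, "测试部", "湖北省", 20), (6, "技术部", "武汉市", 30),
    (7, "测试部", "湖北省", 60), (8, "技术部", "湖北省", 19),
    (9, "销售部", "河南省", 40), (10, "测试部", "湖北省", 35),
    (11, "销售部", "河南省", 61), (12, "测试部", "河南省", 26),
    (13, "销售部", "河南省", 36), (14, "测试部", "河南省", 45),
    (15, "技术部", "河南省", 18),
]


def avg_age_by_dept_and_province(rows):
    """Mimic terms(deptName) > terms(provice) > avg(age) using nested dicts."""
    buckets = defaultdict(lambda: defaultdict(list))
    for _doc_id, dept, province, age in rows:
        buckets[dept][province].append(age)
    return {
        dept: {prov: sum(ages) / len(ages) for prov, ages in provinces.items()}
        for dept, provinces in buckets.items()
    }


result = avg_age_by_dept_and_province(ROWS)
for dept, provinces in result.items():
    print(dept, provinces)
```

Note that document 6 stores "武汉市" in provice, so 技术部 ends up with three province buckets rather than two.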

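This time we want more: besides the inner average age, we also want the outer average age per department, so avg is computed twice.
Group by department and compute avg; then, within each department, group by province and compute avg again.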

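# In this query the province grouping is not wrapped in a further level of aggs; group_provice is declared directly alongside dept_avg_age, at the same level, inside the department's aggs.
# Then, at the same level as the group_provice terms block, a nested aggs computes provice_avg_age, the average age within each province.
# Each order clause refers to the sub-aggregation of its own bucket by name (dept_avg_age, provice_avg_age).

GET /testcopy/_search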
{
  "size":0,
  "aggs":{
    "group_dept":{
      "terms": {
        "field": "deptName.keyword",
        "size": 10,
        "order": {
          "dept_avg_age": "desc"
        }
      },
      // After the deptName terms block, aggs first computes the department's average age
      "aggs": {
        "dept_avg_age": {
          "avg": {
            "field": "age"
          }
        },
        // Next to dept_avg_age, at the same level, declare the province grouping
        "group_provice":{
          "terms": {
            "field": "provice.keyword",
            "size": 10,
            "order": {
              "provice_avg_age": "desc"
            }
          },
          // Inside group_provice, after its terms block, aggs computes the province's average age
          "aggs": {
            "provice_avg_age": {
              "avg": {
                "field": "age"
              }
            }
          }
        }
      }
    }
  }
}

Result of this query
[Figure: screenshot of this query's result]
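The two-level query can be mirrored in plain Python to see which values each avg produces and how the two order clauses sort the buckets. This is only a sketch of the arithmetic, not Elasticsearch code; it assumes all 15 documents from section 1 were indexed, and the ROWS tuples and function names are illustrative.

```python
from collections import defaultdict

# (doc id, deptName, provice, age), copied from the bulk data in section 1.
ROWS = [
    (1, "技术部", "湖北省", 20), (2, "销售部", "湖北省", 25),
    (3, "技术部", "湖北省", 30), (4, "销售部", "湖北省", 20),
    (5, "测试部", "湖北省", 20), (6, "技术部", "武汉市", 30),
    (7, "测试部", "湖北省", 60), (8, "技术部", "湖北省", 19),
    (9, "销售部", "河南省", 40), (10, "测试部", "湖北省", 35),
    (11, "销售部", "河南省", 61), (12, "测试部", "河南省", 26),
    (13, "销售部", "河南省", 36), (14, "测试部", "河南省", 45),
    (15, "技术部", "河南省", 18),
]


def avg(values):
    return sum(values) / len(values)


def dept_and_province_avg(rows):
    """Return [(dept, dept_avg_age, [(province, provice_avg_age), ...]), ...].

    Departments and provinces are both sorted by their average age, descending,
    which mirrors the two "order" clauses in the query.
    """
    members = defaultdict(list)
    for _doc_id, dept, province, age in rows:
        members[dept].append((province, age))

    report = []
    for dept, people in members.items():
        by_province = defaultdict(list)
        for province, age in people:
            by_province[province].append(age)
        provinces = sorted(
            ((prov, avg(ages)) for prov, ages in by_province.items()),
            key=lambda item: item[1],
            reverse=True,
        )
        report.append((dept, avg([age for _, age in people]), provinces))

    report.sort(key=lambda item: item[1], reverse=True)
    return report


report = dept_and_province_avg(ROWS)
for dept, dept_avg, provinces in report:
    print(dept, round(dept_avg, 2), [(p, round(a, 2)) for p, a in provinces])
```

The department average is computed over all employees of the department, not as the mean of the province averages, which is also what the sibling dept_avg_age aggregation does.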

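2.2 Sibling aggs for several metrics at once: max, min, sum, and avg per department

For each department, compute the maximum age, minimum age, sum of ages, and average age of its employees. Several aggregation types, such as max, min, sum, and avg, can sit side by side at the same level inside aggs.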

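# Four named aggregations at the same level inside aggs compute four metrics in a single request
GET /testcopy/_search
{
  "size":0, // do not return the source documents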
  "aggs":{
    "group_dept":{
      "terms": {
        "field": "deptName.keyword",
        "size": 10
      },
      // Inside the department bucket, at the same level as terms, compute max, min, sum, and avg
      "aggs": {
        "max_age": {
          "max": {
            "field": "age"
          }
        },
        // Inside aggs, min_age is declared at the same level as max_age
        "min_age":{
          "min": {
            "field": "age"
          }
        },
        // Inside aggs, sum_age is declared at the same level as max_age and min_age
        "sum_age":{
          "sum": {
            "field": "age"
          }
        },
        // Inside aggs, avg_age is declared at the same level as max_age, min_age, and sum_age
        "avg_age":{
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}

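Results
技术部 (technology department): 4 employees, max 30, avg 24.75, sum 99, min 19
销售部 (sales department): 4 employees, max 40, avg 30.25, sum 121, min 20
These figures match the bulk data only if documents 11 and 15 are absent from the index, which is what happens when their bulk lines are rejected as malformed JSON. With all 15 documents indexed, each department has 5 employees: 技术部 gives max 30, min 18, sum 117, avg 23.4, and 销售部 gives max 61, min 20, sum 182, avg 36.4.
[Figure: screenshot of the query result]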
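As a cross-check of the four sibling metrics, the following pure-Python sketch computes count, max, min, sum, and avg of age per department from the bulk data. It is not Elasticsearch code. The skip_ids parameter is a hypothetical helper that simulates documents missing from the index; passing {11, 15} reproduces the reported 4-employee figures, while the default uses all 15 documents.

```python
from collections import defaultdict

# (doc id, deptName, age), copied from the bulk data in section 1.
ROWS = [
    (1, "技术部", 20), (2, "销售部", 25), (3, "技术部", 30),
    (4, "销售部", 20), (5, "测试部", 20), (6, "技术部", 30),
    (7, "测试部", 60), (8, "技术部", 19), (9, "销售部", 40),
    (10, "测试部", 35), (11, "销售部", 61), (12, "测试部", 26),
    (13, "销售部", 36), (14, "测试部", 45), (15, "技术部", 18),
]


def dept_stats(rows, skip_ids=frozenset()):
    """Mimic terms(deptName) with sibling max/min/sum/avg sub-aggregations on age."""
    ages = defaultdict(list)
    for doc_id, dept, age in rows:
        if doc_id not in skip_ids:
            ages[dept].append(age)
    return {
        dept: {
            "count": len(values),
            "max": max(values),
            "min": min(values),
            "sum": sum(values),
            "avg": sum(values) / len(values),
        }
        for dept, values in ages.items()
    }


all_docs = dept_stats(ROWS)                     # every document indexed
reported = dept_stats(ROWS, skip_ids={11, 15})  # documents 11 and 15 missing

print("all 15 documents:", all_docs["技术部"], all_docs["销售部"])
print("without 11 and 15:", reported["技术部"], reported["销售部"])
```

Comparing the two dictionaries is a quick way to tell whether a surprising aggregation result comes from the query or from documents that never made it into the index.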


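That covers the basic usage of aggs aggregations, several forms of nested grouping, and how they compare with the grouped query from the previous article. The next article shows how to combine query and filter with aggs to compute scoped and global aggregation statistics.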

