Sorting Multivalue Buckets in Elasticsearch
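- Multivalue bucket aggregations (`terms`, `histogram`, and `date_histogram`) generate many buckets dynamically. By default, Elasticsearch returns `terms` buckets in descending order of `doc_count`, while `histogram` and `date_histogram` buckets come back in ascending key order.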
Built-in Sorts
- Requirement: run a `terms` aggregation, but sort the buckets by `doc_count` in ascending order.
- Request:

      GET /cars/transactions/_search
      {
        "size": 0,
        "aggs": {
          "colors": {
            "terms": {
              "field": "color",
              "order": { "_count": "asc" }
            }
          }
        }
      }
- Response:

      "aggregations": {
        "colors": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            { "key": "green", "doc_count": 2 },
            { "key": "blue",  "doc_count": 4 },
            { "key": "red",   "doc_count": 4 }
          ]
        }
      }
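- The `_count` keyword sorts the buckets by their `doc_count`; the value `"asc"` selects ascending order.
- The `order` object accepts the following keys:
  - `_count`: sort by document count. Works with `terms`, `histogram`, and `date_histogram`.
  - `_term`: sort alphabetically by the string value of the term. Works only with `terms`.
  - `_key`: sort by the numeric value of each bucket's key (conceptually similar to `_term`). Works only with `histogram` and `date_histogram`.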
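- The same request through the Java API:

      import java.util.List;
      import org.elasticsearch.action.search.SearchResponse;
      import org.elasticsearch.search.aggregations.AggregationBuilders;
      import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;
      import org.elasticsearch.search.aggregations.bucket.terms.Terms;
      import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
      import org.junit.Test;

      @Test
      public void test10() {
          // terms aggregation on "color", ordered by doc_count ascending
          TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders
                  .terms("colors")
                  .field("color")
                  .order(Terms.Order.count(true));

          // size 0: return only the aggregation, no hits
          SearchResponse searchResponse = elasticsearchTemplate.getClient()
                  .prepareSearch("cars")
                  .setTypes("transactions")
                  .setSize(0)
                  .addAggregation(termsAggregationBuilder)
                  .execute()
                  .actionGet();

          StringTerms stringTerms = searchResponse.getAggregations().get("colors");
          List<StringTerms.Bucket> buckets = stringTerms.getBuckets();
          for (StringTerms.Bucket bucket : buckets) {
              String keyAsString = bucket.getKeyAsString();
              long docCount = bucket.getDocCount();
              System.out.println(keyAsString + "---" + docCount);
          }
      }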
- Output:

      green---2
      blue---4
      red---4
- Source of the `order` method:

      /**
       * Sets the order in which the buckets will be returned.
       */
      public TermsAggregationBuilder order(Terms.Order order) {
          if (order == null) {
              throw new IllegalArgumentException("[order] must not be null: [" + name + "]");
          }
          if (order instanceof CompoundOrder || InternalOrder.isTermOrder(order)) {
              this.order = order; // if order already contains a tie-breaker we are good to go
          } else { // otherwise add a tie-breaker by using a compound order
              this.order = Terms.Order.compound(order);
          }
          return this;
      }
- `Terms.Order` can sort by `count`, by `term`, or by a named sub-aggregation.
- Source of the `count`, `term`, and `aggregation` factory methods:

      public static Order count(boolean asc) {
          return asc ? InternalOrder.COUNT_ASC : InternalOrder.COUNT_DESC;
      }

      /**
       * @return a bucket ordering strategy that sorts buckets by their terms (ascending or descending)
       */
      public static Order term(boolean asc) {
          return asc ? InternalOrder.TERM_ASC : InternalOrder.TERM_DESC;
      }

      /**
       * Creates a bucket ordering strategy which sorts buckets based on a single-valued calc get
       *
       * @param path the name of the get
       * @param asc  The direction of the order (ascending or descending)
       */
      public static Order aggregation(String path, boolean asc) {
          return new InternalOrder.Aggregation(path, asc);
      }
- Passing `true` sorts in ascending order and `false` in descending order, as the two predefined count orders show:

      public static final InternalOrder COUNT_DESC = new InternalOrder(COUNT_DESC_ID, "_count", false, new Comparator<Terms.Bucket>() {
          @Override
          public int compare(Terms.Bucket o1, Terms.Bucket o2) {
              return Long.compare(o2.getDocCount(), o1.getDocCount());
          }
      });

      /**
       * Order by the (lower) count of each term.
       */
      public static final InternalOrder COUNT_ASC = new InternalOrder(COUNT_ASC_ID, "_count", true, new Comparator<Terms.Bucket>() {
          @Override
          public int compare(Terms.Bucket o1, Terms.Bucket o2) {
              return Long.compare(o1.getDocCount(), o2.getDocCount());
          }
      });
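The `order` method shown earlier wraps a plain count order in a compound order so that ties are broken deterministically. The following self-contained sketch imitates that idea with standard comparators. It assumes the tie-breaker is ascending term order, which is consistent with `blue` preceding `red` in the earlier response but is not confirmed by the source excerpts here; `TieBreakerSketch`, `Bucket`, and `compound` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of a count order with a tie-breaker; not Elasticsearch code.
final class TieBreakerSketch {

    static final class Bucket {
        final String key;
        final long docCount;

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    // Same comparison logic as the COUNT_ASC comparator quoted above.
    static final Comparator<Bucket> COUNT_ASC =
            (o1, o2) -> Long.compare(o1.docCount, o2.docCount);

    // Assumed tie-breaker: ascending order of the term.
    static final Comparator<Bucket> TERM_ASC =
            (o1, o2) -> o1.key.compareTo(o2.key);

    // Applies the primary order first and falls back to the term on equal values.
    static Comparator<Bucket> compound(Comparator<Bucket> primary) {
        return primary.thenComparing(TERM_ASC);
    }

    static List<String> sortedKeys(List<Bucket> buckets, Comparator<Bucket> order) {
        List<Bucket> copy = new ArrayList<>(buckets);
        copy.sort(order);
        List<String> keys = new ArrayList<>();
        for (Bucket b : copy) {
            keys.add(b.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        List<Bucket> buckets = Arrays.asList(
                new Bucket("red", 4), new Bucket("green", 2), new Bucket("blue", 4));

        // Equal counts (blue and red) are resolved by the term, so blue comes first.
        System.out.println(sortedKeys(buckets, compound(COUNT_ASC))); // [green, blue, red]
    }
}
```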
Sorting by a Metric
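- Sometimes we want to sort the buckets by the value of a metric computed for each bucket.
- Requirement: build a bar chart of sales by car color, with the bars sorted by average price in ascending order.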
- Request:

      GET /cars/transactions/_search
      {
        "size": 0,
        "aggs": {
          "colors": {
            "terms": {
              "field": "color",
              "order": { "avg_price": "asc" }
            },
            "aggs": {
              "avg_price": {
                "avg": { "field": "price" }
              }
            }
          }
        }
      }
- Response:

      "aggregations": {
        "colors": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            { "key": "blue",  "doc_count": 4, "avg_price": { "value": 17000 } },
            { "key": "green", "doc_count": 2, "avg_price": { "value": 21000 } },
            { "key": "red",   "doc_count": 4, "avg_price": { "value": 32500 } }
          ]
        }
      }
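- The same request through the Java API:

      import java.util.List;
      import org.elasticsearch.action.search.SearchResponse;
      import org.elasticsearch.search.aggregations.AggregationBuilders;
      import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;
      import org.elasticsearch.search.aggregations.bucket.terms.Terms;
      import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
      import org.elasticsearch.search.aggregations.metrics.avg.AvgAggregationBuilder;
      import org.elasticsearch.search.aggregations.metrics.avg.InternalAvg;
      import org.junit.Test;

      @Test
      public void test11() {
          // metric to sort by: average price per color bucket
          AvgAggregationBuilder avgAggregationBuilder = AggregationBuilders
                  .avg("avg_price")
                  .field("price");

          // the order references the sub-aggregation by name
          TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders
                  .terms("colors")
                  .field("color")
                  .order(Terms.Order.aggregation("avg_price", true));
          termsAggregationBuilder.subAggregation(avgAggregationBuilder);

          SearchResponse searchResponse = elasticsearchTemplate.getClient()
                  .prepareSearch("cars")
                  .setTypes("transactions")
                  .setSize(0)
                  .addAggregation(termsAggregationBuilder)
                  .execute()
                  .actionGet();

          StringTerms stringTerms = searchResponse.getAggregations().get("colors");
          List<StringTerms.Bucket> buckets = stringTerms.getBuckets();
          for (StringTerms.Bucket bucket : buckets) {
              String keyAsString = bucket.getKeyAsString();
              long docCount = bucket.getDocCount();
              InternalAvg internalAvg = bucket.getAggregations().get("avg_price");
              System.out.println(keyAsString + "---" + docCount + "---" + internalAvg.getValue());
          }
      }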
- Output:

      blue---4---17000.0
      green---2---21000.0
      red---4---32500.0
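What the metric-based order computes can be reproduced locally: group prices by color, average each group, and sort the groups by that average. The sketch below is a plain-Java illustration, not Elasticsearch code. The per-document prices are not listed in this section, so the values in `samplePrices` are an assumption chosen to match the document counts and averages in the response above; `AvgOrderSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical local imitation of "order by avg_price ascending".
final class AvgOrderSketch {

    // Assumed sample data: prices per color, consistent with the averages shown above.
    static Map<String, double[]> samplePrices() {
        Map<String, double[]> prices = new LinkedHashMap<>();
        prices.put("red", new double[] {10000, 20000, 20000, 80000});
        prices.put("green", new double[] {12000, 30000});
        prices.put("blue", new double[] {8000, 15000, 20000, 25000});
        return prices;
    }

    // The metric: arithmetic mean of one bucket's prices.
    static double average(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    // Returns the colors sorted by their average price, ascending.
    static List<String> colorsByAveragePrice(Map<String, double[]> pricesByColor) {
        List<String> colors = new ArrayList<>(pricesByColor.keySet());
        colors.sort(Comparator.comparingDouble((String c) -> average(pricesByColor.get(c))));
        return colors;
    }

    public static void main(String[] args) {
        Map<String, double[]> prices = samplePrices();
        for (String color : colorsByAveragePrice(prices)) {
            System.out.println(color + "---" + prices.get(color).length
                    + "---" + average(prices.get(color)));
        }
        // blue---4---17000.0
        // green---2---21000.0
        // red---4---32500.0
    }
}
```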
- Any metric can be used for sorting in this way, simply by referencing its name. Some metrics, however, emit several values; `extended_stats` is one of them.
- Request:

      GET /cars/transactions/_search
      {
        "size": 0,
        "aggs": {
          "colors": {
            "terms": {
              "field": "color",
              "order": { "stats.variance": "asc" }
            },
            "aggs": {
              "stats": {
                "extended_stats": { "field": "price" }
              }
            }
          }
        }
      }
- Response:

      "aggregations": {
        "colors": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "blue",
              "doc_count": 4,
              "stats": {
                "count": 4, "min": 8000, "max": 25000, "avg": 17000, "sum": 68000,
                "sum_of_squares": 1314000000,
                "variance": 39500000,
                "std_deviation": 6284.902544988267,
                "std_deviation_bounds": { "upper": 29569.805089976537, "lower": 4430.194910023465 }
              }
            },
            {
              "key": "green",
              "doc_count": 2,
              "stats": {
                "count": 2, "min": 12000, "max": 30000, "avg": 21000, "sum": 42000,
                "sum_of_squares": 1044000000,
                "variance": 81000000,
                "std_deviation": 9000,
                "std_deviation_bounds": { "upper": 39000, "lower": 3000 }
              }
            },
            {
              "key": "red",
              "doc_count": 4,
              "stats": {
                "count": 4, "min": 10000, "max": 80000, "avg": 32500, "sum": 130000,
                "sum_of_squares": 7300000000,
                "variance": 768750000,
                "std_deviation": 27726.341266023544,
                "std_deviation_bounds": { "upper": 87952.6825320471, "lower": -22952.68253204709 }
              }
            }
          ]
        }
      }
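- Use dot notation (`.`) to select which value of a multi-valued metric to sort by, as in `stats.variance`.
- The same approach through the Java API. Note that this test sorts by `stats.avg` rather than `stats.variance`; for this data set both paths produce the same bucket order.

      import java.util.List;
      import org.elasticsearch.action.search.SearchResponse;
      import org.elasticsearch.search.aggregations.AggregationBuilders;
      import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;
      import org.elasticsearch.search.aggregations.bucket.terms.Terms;
      import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
      import org.elasticsearch.search.aggregations.metrics.stats.InternalStats;
      import org.elasticsearch.search.aggregations.metrics.stats.extended.ExtendedStatsAggregationBuilder;
      import org.junit.Test;

      @Test
      public void test12() {
          // multi-valued metric; one of its values is picked in the order path below
          ExtendedStatsAggregationBuilder extendedStatsAggregationBuilder = AggregationBuilders
                  .extendedStats("stats")
                  .field("price");

          // "stats.avg" = value "avg" of the sub-aggregation named "stats"
          TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders
                  .terms("colors")
                  .field("color")
                  .order(Terms.Order.aggregation("stats.avg", true));
          termsAggregationBuilder.subAggregation(extendedStatsAggregationBuilder);

          SearchResponse searchResponse = elasticsearchTemplate.getClient()
                  .prepareSearch("cars")
                  .setTypes("transactions")
                  .setSize(0)
                  .addAggregation(termsAggregationBuilder)
                  .execute()
                  .actionGet();

          StringTerms stringTerms = searchResponse.getAggregations().get("colors");
          List<StringTerms.Bucket> buckets = stringTerms.getBuckets();
          for (StringTerms.Bucket bucket : buckets) {
              String keyAsString = bucket.getKeyAsString();
              long docCount = bucket.getDocCount();
              InternalStats internalStats = bucket.getAggregations().get("stats");
              System.out.println(keyAsString + "---" + docCount + "---" + internalStats.getAvg());
          }
      }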
- Output:

      blue---4---17000.0
      green---2---21000.0
      red---4---32500.0
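To make the dot-notation path concrete, the sketch below resolves a path such as `stats.variance` against locally computed statistics and sorts the colors by the selected value. It is a plain-Java illustration under stated assumptions: `StatsPathSketch`, `Stats`, and `resolve` are hypothetical names, only three properties are supported, and the prices are reconstructed so that they reproduce the `count`, `min`, `max`, `sum`, and `sum_of_squares` values in the response above. The variance is computed as `sum_of_squares / count - avg^2`, which matches the figures shown there.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical local imitation of ordering by "stats.<property>".
final class StatsPathSketch {

    // A small subset of what extended_stats reports for one bucket.
    static final class Stats {
        final long count;
        final double sum;
        final double sumOfSquares;

        Stats(double[] values) {
            double s = 0;
            double sq = 0;
            for (double v : values) {
                s += v;
                sq += v * v;
            }
            this.count = values.length;
            this.sum = s;
            this.sumOfSquares = sq;
        }

        double avg() {
            return sum / count;
        }

        // Population variance: mean of squares minus square of mean.
        double variance() {
            return sumOfSquares / count - avg() * avg();
        }
    }

    // Assumed sample data, reconstructed from the statistics in the response.
    static Map<String, double[]> samplePrices() {
        Map<String, double[]> prices = new LinkedHashMap<>();
        prices.put("red", new double[] {10000, 20000, 20000, 80000});
        prices.put("green", new double[] {12000, 30000});
        prices.put("blue", new double[] {8000, 15000, 20000, 25000});
        return prices;
    }

    // Resolves "<aggName>.<property>" to one value; the part before the dot is ignored here.
    static double resolve(Stats stats, String path) {
        int dot = path.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("expected <aggName>.<property>: " + path);
        }
        switch (path.substring(dot + 1)) {
            case "avg":
                return stats.avg();
            case "sum":
                return stats.sum;
            case "variance":
                return stats.variance();
            default:
                throw new IllegalArgumentException("unsupported property in path: " + path);
        }
    }

    // Returns the colors sorted by the value the path selects.
    static List<String> sortedKeys(Map<String, double[]> pricesByColor, String path, boolean asc) {
        Map<String, Double> sortValues = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : pricesByColor.entrySet()) {
            sortValues.put(e.getKey(), resolve(new Stats(e.getValue()), path));
        }
        List<String> keys = new ArrayList<>(sortValues.keySet());
        Comparator<String> cmp = Comparator.comparingDouble((String k) -> sortValues.get(k));
        keys.sort(asc ? cmp : cmp.reversed());
        return keys;
    }

    public static void main(String[] args) {
        Map<String, double[]> prices = samplePrices();
        System.out.println(sortedKeys(prices, "stats.variance", true));  // [blue, green, red]
        System.out.println(sortedKeys(prices, "stats.variance", false)); // [red, green, blue]
        System.out.println(new Stats(prices.get("blue")).variance());    // 3.95E7
    }
}
```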