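[Repost] The previous post summarized how to use count, distinct, and count(distinct) in Elasticsearch 6. This post continues with aggregation queries, specifically the Elasticsearch equivalent of GROUP BY in MySQL.
Common entity
All of the queries described below return their results as the same entity, defined as follows: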
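import java.io.Serializable;
import lombok.Data;

/**
 * Result of grouping by a single field.
 *
 * @date : 2020-11-18 15:02
 */
@Data
public class AggregationForOneDTO implements Serializable {
    /**
     * Value of the grouping field (the bucket key).
     */
    private String key;
    /**
     * Count computed for that group.
     */
    private Integer count;
}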
- GROUP BY with count
The corresponding MySQL statement is:
select field1,count(field2) from table_name group by field1;
The Elasticsearch code corresponding to the SQL above is as follows:
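// Imports needed by this snippet (Elasticsearch 6.x package names; JSON is assumed to be fastjson, @Test JUnit 4)
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Test;

/**
 * Counts the documents of the given index, grouped by one field.
 * restHighLevelClient and log are fields of the enclosing test class.
 */
@Test
public void testCountGroupBy() {
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("indexName").types("indexType");
    TermsAggregationBuilder aggregation = AggregationBuilders
            // aggregation name (alias), used later to read the result
            .terms("uid")
            // field to group by
            .field("uid.keyword")
            // order buckets by document count, descending
            .order(BucketOrder.count(false))
            // number of buckets to return; only the top ten are returned by default
            .size(100);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.aggregation(aggregation);
    searchRequest.source(searchSourceBuilder);
    List<AggregationForOneDTO> result = new ArrayList<>();
    try {
        // run the query
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        log.info("response is {}", response);
        // look the aggregation up by the name given in terms(...)
        Terms uidTerms = response.getAggregations().get("uid");
        for (Terms.Bucket bucket : uidTerms.getBuckets()) {
            AggregationForOneDTO aggregationForOne = new AggregationForOneDTO();
            // doc_count is a long; it is narrowed here to fit the DTO's Integer field
            aggregationForOne.setCount((int) bucket.getDocCount());
            aggregationForOne.setKey(bucket.getKeyAsString());
            result.add(aggregationForOne);
        }
    } catch (IOException e) {
        log.error("[EsClientConfig.groupByField][error][fail to query]", e);
    }
    log.info("result is {}", JSON.toJSONString(result));
}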
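To make the result easier to read, the original post attaches a screenshot of it. In the result, key is a value of the grouping field and count is the number of documents that have that value (the bucket's doc_count):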
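To make the semantics of this query concrete without a running cluster, the following is a minimal in-memory sketch in plain Java of what the terms aggregation above computes: group the values, count each group, sort by count descending, and keep the first size groups. The class name GroupByCountSketch, the method countByKey, and the sample uid values are illustrative assumptions, not part of the original post or of the Elasticsearch API. Ties are broken by key here only to keep the output deterministic.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupByCountSketch {

    /**
     * In-memory analogue of "group by key, count, order by count desc, limit size".
     * Returns an insertion-ordered map from key to count, largest count first.
     */
    public static Map<String, Long> countByKey(List<String> keys, int size) {
        // Step 1: group equal keys and count the members of each group.
        Map<String, Long> counts = keys.stream()
                .collect(Collectors.groupingBy(k -> k, Collectors.counting()));

        // Step 2: sort by count descending (ties by key ascending), keep the top "size" groups.
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(size)
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        (a, b) -> a,
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Hypothetical uid values, one per document.
        List<String> uids = Arrays.asList("u1", "u2", "u1", "u3", "u1", "u2");
        // prints {u1=3, u2=2}
        System.out.println(countByKey(uids, 2));
    }
}
```

With size set to 2, the group u3 is dropped even though it exists, which mirrors why the size setting of the aggregation matters when there are more than ten distinct keys.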
- GROUP BY with distinct count
The corresponding MySQL statement is:
select field1,count(distinct (field2)) from table_name group by field1;
The corresponding Elasticsearch query code is as follows:
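// Imports needed by this snippet (Elasticsearch 6.x package names; in 7.x the cardinality
// classes live directly in org.elasticsearch.search.aggregations.metrics)
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import com.alibaba.fastjson.JSON;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.BucketOrder;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.cardinality.Cardinality;
import org.elasticsearch.search.aggregations.metrics.cardinality.CardinalityAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

/**
 * Counts the distinct values of one field within each group of another field.
 * restHighLevelClient and log are fields of the enclosing test class.
 */
@Test
public void testCountDistinctGroupBy() {
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("indexName").types("indexType");
    // deduplicated field: cardinality(...) sets the aggregation name, field(...) the field name
    CardinalityAggregationBuilder aggregationBuilder =
            AggregationBuilders.cardinality("field_distinct").field("field_distinct");
    // grouping field: terms(...) sets the aggregation name, field(...) the field name
    TermsAggregationBuilder aggregation = AggregationBuilders.terms("field_group")
            .field("field_group")
            .subAggregation(aggregationBuilder)
            .size(100)
            // order buckets by distinct count, descending; the path is the sub-aggregation's name
            .order(BucketOrder.aggregation("field_distinct", false));
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.aggregation(aggregation);
    searchRequest.source(searchSourceBuilder);
    List<AggregationForOneDTO> result = new ArrayList<>();
    try {
        // run the query
        SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // the names used here must match the names given to terms(...) and cardinality(...)
        Terms groupTerms = response.getAggregations().get("field_group");
        for (Terms.Bucket bucket : groupTerms.getBuckets()) {
            // read the sub-aggregation through its typed interface instead of a JSON round trip
            Cardinality distinctCount = bucket.getAggregations().get("field_distinct");
            AggregationForOneDTO aggregationForOne = new AggregationForOneDTO();
            aggregationForOne.setCount((int) distinctCount.getValue());
            aggregationForOne.setKey(bucket.getKeyAsString());
            result.add(aggregationForOne);
        }
    } catch (IOException e) {
        log.error("[EsClientConfig.groupByField][error][fail to query]", e);
    }
    log.info("result is {}", JSON.toJSONString(result));
}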
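The result has the same shape as that of the first query, except that count is now the number of distinct values of the deduplicated field within each group.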
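As with the first query, the semantics can be illustrated with a small in-memory sketch in plain Java: for each group, collect the distinct values of the second field, count them, sort by that count descending, and keep the first size groups. The class name GroupByDistinctCountSketch, the method distinctCountByKey, and the sample rows are illustrative assumptions and not part of the original post. Note one difference: this sketch counts exactly, whereas the Elasticsearch cardinality aggregation returns an approximate count for high-cardinality fields.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class GroupByDistinctCountSketch {

    /**
     * In-memory analogue of
     * "select field1, count(distinct field2) ... group by field1 order by 2 desc limit size".
     * Each row is a two-element array: {field1, field2}.
     */
    public static Map<String, Integer> distinctCountByKey(List<String[]> rows, int size) {
        // Step 1: per group, collect the set of distinct field2 values.
        Map<String, Set<String>> distinctValues = rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row[0],
                        Collectors.mapping(row -> row[1], Collectors.toSet())));

        // Step 2: order by set size descending, ties by key ascending.
        Comparator<Map.Entry<String, Set<String>>> byDistinctCountDesc = Comparator
                .comparingInt((Map.Entry<String, Set<String>> e) -> e.getValue().size())
                .reversed();
        Comparator<Map.Entry<String, Set<String>>> order =
                byDistinctCountDesc.thenComparing(Map.Entry.<String, Set<String>>comparingByKey());

        // Step 3: keep the top "size" groups and replace each set by its size.
        return distinctValues.entrySet().stream()
                .sorted(order)
                .limit(size)
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> e.getValue().size(),
                        (a, b) -> a,
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Hypothetical (field1, field2) pairs, one per document.
        List<String[]> rows = Arrays.asList(
                new String[] {"a", "x"}, new String[] {"a", "x"}, new String[] {"a", "y"},
                new String[] {"b", "z"}, new String[] {"b", "z"},
                new String[] {"c", "p"}, new String[] {"c", "q"}, new String[] {"c", "r"});
        // prints {c=3, a=2}
        System.out.println(distinctCountByKey(rows, 2));
    }
}
```

Group a contains three rows but only two distinct values, so its count is 2 rather than 3, which is exactly the difference between the two queries in this post.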