Elasticsearch conditional deduplication: aggregation plus distinct-count queries

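Using customer data as an example, this article shows how to run a conditional distinct-count query in Elasticsearch by bucketing documents by date and applying a cardinality aggregation, so that the number of unique customers per day can be computed. Java code demonstrates the query and the aggregation, and the output shows the deduplicated customer count next to the raw document count.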

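Take customer data as the example.

I want to query the number of customers per day.

First bucket the documents by date, then deduplicate by name inside each bucket to get the customer count. (In a real system, customers would be distinguished by customer id rather than by name.)

The test data is listed at the end of this article.

There are 9 documents in total, with the following names:

river, Lucy 1, Lucy, frank, tom, lily, lily, tom, tom

That gives 6 distinct names.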
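As a quick sanity check on that count, here is a minimal sketch in plain Java that deduplicates the nine names with a set. The class name `DistinctNames` and its helper are made up for illustration; nothing here talks to Elasticsearch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctNames {
    // The nine firstName values from the test data at the end of the article.
    static final List<String> NAMES = Arrays.asList(
            "river", "Lucy 1", "Lucy", "frank", "tom", "lily", "lily", "tom", "tom");

    // Returns the distinct names, preserving first-seen order.
    static Set<String> distinct(List<String> names) {
        return new LinkedHashSet<>(names);
    }

    public static void main(String[] args) {
        Set<String> unique = distinct(NAMES);
        // prints: 9 names, 6 distinct: [river, Lucy 1, Lucy, frank, tom, lily]
        System.out.println(NAMES.size() + " names, " + unique.size() + " distinct: " + unique);
    }
}
```

Note that "Lucy 1" and "Lucy" count as different names because the comparison is on the whole string.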

First, here is how the Elasticsearch query is written:

GET /es-customer/_search
{
  "size": 0,
  "aggs": {
    "days": {
      "date_histogram": {
        "field": "createTime",
        "interval": "day"
      },
      "aggs": {
        "distinct_name": {
          "cardinality": {
            "field": "firstName"
          }
        }
      }
    }
  }
}

The response is:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "days": {
      "buckets": [
        {
          "key_as_string": "2019-04-10 00:00:00",
          "key": 1554854400000,
          "doc_count": 9,
          "distinct_name": {
            "value": 6
          }
        }
      ]
    }
  }
}

On 2019-04-10 the query matched 9 documents, and 6 remain after deduplication.
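To make clear what the two nested aggregations compute, the following sketch reproduces the same numbers in memory: group the rows by calendar day (the role of `date_histogram`), then count distinct `firstName` values per group (the role of `cardinality`). The class name `DailyDistinctSketch` and its helper are hypothetical, and the rows are copied from the test data. One difference worth keeping in mind: this sketch counts exactly, whereas the Elasticsearch cardinality aggregation is an approximate count that is effectively exact only for small sets like this one.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class DailyDistinctSketch {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Each row is {firstName, createTime}, copied from the test data.
    static final List<String[]> ROWS = Arrays.asList(
            new String[]{"Lucy 1", "2019-04-10 07:55:55"},
            new String[]{"tom", "2019-04-10 07:55:56"},
            new String[]{"lily", "2019-04-10 07:55:56"},
            new String[]{"lily", "2019-04-10 07:55:56"},
            new String[]{"river", "2019-04-10 07:55:55"},
            new String[]{"Lucy", "2019-04-10 07:55:55"},
            new String[]{"frank", "2019-04-10 07:55:56"},
            new String[]{"tom", "2019-04-10 07:55:56"},
            new String[]{"tom", "2019-04-10 07:55:56"});

    // Returns day -> {document count, distinct firstName count}.
    static Map<LocalDate, long[]> dailyCounts(List<String[]> rows) {
        Map<LocalDate, Set<String>> namesPerDay = new TreeMap<>();
        Map<LocalDate, long[]> result = new TreeMap<>();
        for (String[] row : rows) {
            LocalDate day = LocalDateTime.parse(row[1], FMT).toLocalDate();
            namesPerDay.computeIfAbsent(day, d -> new HashSet<>()).add(row[0]);
            result.computeIfAbsent(day, d -> new long[2])[0]++; // doc_count
        }
        for (Map.Entry<LocalDate, Set<String>> e : namesPerDay.entrySet()) {
            result.get(e.getKey())[1] = e.getValue().size(); // distinct names
        }
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<LocalDate, long[]> e : dailyCounts(ROWS).entrySet()) {
            // prints: 2019-04-10 total=9 distinct=6
            System.out.println(e.getKey() + " total=" + e.getValue()[0] + " distinct=" + e.getValue()[1]);
        }
    }
}
```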

With the query in hand, we can now write the Java code.

@Test
public void testAggAndDistinct() {
    // Read the @Document annotation to obtain the index name and type.
    Document document = Customer.class.getAnnotation(Document.class);

    // dateHistogram buckets documents by day: "dataAgg" is the aggregation name, "createTime" the field.
    // The cardinality sub-aggregation counts distinct firstName values inside each bucket.
    // matchAllQuery() is statically imported from org.elasticsearch.index.query.QueryBuilders.
    SearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(matchAllQuery())
            .withSearchType(SearchType.QUERY_THEN_FETCH)
            .withIndices(document.indexName()).withTypes(document.type())
            .addAggregation(AggregationBuilders.dateHistogram("dataAgg")
                    .field("createTime")
                    .dateHistogramInterval(DateHistogramInterval.DAY)
                    .subAggregation(AggregationBuilders.cardinality("nameAgg").field("firstName")))
            .build();

    // Run the query and keep only the aggregation results.
    Aggregations aggregations = elasticsearchTemplate.query(searchQuery, response -> response.getAggregations());
    Histogram histogram = aggregations.get("dataAgg");

    // For each daily bucket, print the date, the document count, and the distinct-name count.
    for (Histogram.Bucket bucket : histogram.getBuckets()) {
        Cardinality cardinality = bucket.getAggregations().get("nameAgg");
        System.out.println("Time: " + bucket.getKeyAsString());
        System.out.println("Total: " + bucket.getDocCount());
        System.out.println("Distinct count: " + cardinality.getValue());
    }
}

The printed result is:

Time: 2019-04-10 00:00:00

Total: 9

Distinct count: 6

This is the result we expected.
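As noted at the start, a real system would deduplicate on a customer id rather than on a name. The sketch below builds the same request body with the field names passed in as parameters, so the distinct field can be swapped without touching the structure. Everything in it is an assumption for illustration: the class `DailyDistinctRequest`, the aggregation name `distinct_customers`, and the field name `customerId` do not come from the article. The interval key is also a parameter because the cluster in this article accepts `interval`, while Elasticsearch 7.x and later use `calendar_interval` for calendar units such as `day`.

```java
public class DailyDistinctRequest {
    // Builds a date_histogram + cardinality request body as a JSON string.
    // dateField:     the date field to bucket on (createTime in the article)
    // intervalKey:   "interval" on the cluster used in the article,
    //                "calendar_interval" on Elasticsearch 7.x and later
    // distinctField: the field whose distinct values are counted per bucket
    static String body(String dateField, String intervalKey, String distinctField) {
        return "{\n"
                + "  \"size\": 0,\n"
                + "  \"aggs\": {\n"
                + "    \"days\": {\n"
                + "      \"date_histogram\": { \"field\": \"" + dateField + "\", \""
                + intervalKey + "\": \"day\" },\n"
                + "      \"aggs\": {\n"
                + "        \"distinct_customers\": { \"cardinality\": { \"field\": \""
                + distinctField + "\" } }\n"
                + "      }\n"
                + "    }\n"
                + "  }\n"
                + "}";
    }

    public static void main(String[] args) {
        // Same shape as the query in the article (distinct by firstName).
        System.out.println(body("createTime", "interval", "firstName"));
        // Hypothetical variant: distinct by a customerId field on a newer cluster.
        System.out.println(body("createTime", "calendar_interval", "customerId"));
    }
}
```

The field names are concatenated without escaping, so this is only suitable for trusted, fixed names; a JSON library would be the safer choice for anything user-supplied.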

Test data

GET /es-customer/_search
{
  "query": {
    "match_all": {}
  }
}

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 9,
    "max_score": 1,
    "hits": [
      {
        "_index": "es-customer", "_type": "customer", "_id": "_z8_BmoB7Iqmj8bUCgie", "_score": 1,
        "_source": { "id": null, "firstName": "Lucy 1", "lastName": "001", "valid": null, "age": 13, "createTime": "2019-04-10 07:55:55" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "Aj8_BmoB7Iqmj8bUCwkY", "_score": 1,
        "_source": { "id": null, "firstName": "tom", "lastName": "001", "valid": null, "age": 44, "createTime": "2019-04-10 07:55:56" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "Az8_BmoB7Iqmj8bUCwk6", "_score": 1,
        "_source": { "id": null, "firstName": "lily", "lastName": "002", "valid": null, "age": 56, "createTime": "2019-04-10 07:55:56" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "BD8_BmoB7Iqmj8bUCwlc", "_score": 1,
        "_source": { "id": null, "firstName": "lily", "lastName": "004", "valid": null, "age": 53, "createTime": "2019-04-10 07:55:56" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "_j8_BmoB7Iqmj8bUCghV", "_score": 1,
        "_source": { "id": null, "firstName": "river", "lastName": "007", "valid": null, "age": 12, "createTime": "2019-04-10 07:55:55" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "AD8_BmoB7Iqmj8bUCgnH", "_score": 1,
        "_source": { "id": null, "firstName": "Lucy", "lastName": "002", "valid": null, "age": 22, "createTime": "2019-04-10 07:55:55" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "AT8_BmoB7Iqmj8bUCgnv", "_score": 1,
        "_source": { "id": null, "firstName": "frank", "lastName": "001", "valid": null, "age": 33, "createTime": "2019-04-10 07:55:56" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "BT8_BmoB7Iqmj8bUCwmC", "_score": 1,
        "_source": { "id": null, "firstName": "tom", "lastName": "002", "valid": null, "age": 66, "createTime": "2019-04-10 07:55:56" }
      },
      {
        "_index": "es-customer", "_type": "customer", "_id": "Bj8_BmoB7Iqmj8bUCwmp", "_score": 1,
        "_source": { "id": null, "firstName": "tom", "lastName": "005", "valid": null, "age": 33, "createTime": "2019-04-10 07:55:56" }
      }
    ]
  }
}
