Elasticsearch in detail: aggregate then filter vs. filter then aggregate

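In Elasticsearch, the result of combining an aggregation with a filter does not depend on the order in which you write them. In other words, whether you put the filter condition before the aggregation or after it, the result is the same: Elasticsearch always filters first and aggregates second, just as a relational database applies WHERE before GROUP BY.
If you want a filter condition that leaves the aggregation (agg) result untouched and only changes the hits, use the setPostFilter() method.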
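One way to see why call order cannot matter is that a request builder only records what you ask for; the server decides the execution order afterwards. The sketch below is a minimal model of that idea in plain Java. MiniSearchRequest and its string-valued fields are hypothetical stand-ins invented for illustration, not the Elasticsearch SearchRequestBuilder class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a search request builder; NOT the Elasticsearch class.
final class MiniSearchRequest {
    private String query;       // main query: scopes both hits and aggregations
    private String postFilter;  // post filter: applied to hits only
    private final List<String> aggregations = new ArrayList<>();

    // Each setter only stores a value and returns the builder for chaining.
    MiniSearchRequest setQuery(String q) { this.query = q; return this; }
    MiniSearchRequest setPostFilter(String f) { this.postFilter = f; return this; }
    MiniSearchRequest addAggregation(String a) { this.aggregations.add(a); return this; }

    // The request is rendered in a fixed section order, whatever the call order was.
    String toBody() {
        return "{query=" + query + ", aggs=" + aggregations
                + ", post_filter=" + postFilter + "}";
    }

    public static void main(String[] args) {
        String queryFirst = new MiniSearchRequest()
                .setQuery("30 < age < 40")
                .addAggregation("terms(age)")
                .toBody();
        String aggFirst = new MiniSearchRequest()
                .addAggregation("terms(age)")
                .setQuery("30 < age < 40")
                .toBody();
        // Both chains describe the same request.
        System.out.println(queryFirst.equals(aggFirst)); // prints true
    }
}
```

Because the two chains produce an identical request description, nothing downstream can tell which setter was called first.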

Example: all data (no filter)
Code:
// Search the "employee" type of the "company" index, returning up to 250 hits.
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by "age".
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
SearchResponse response = responsebuilder
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result: aggregation only, no filter (note the contents of hits and agg)
{
    "took":100,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":7,
        "max_score":1,
        "hits":[
            {
                "_shard":1,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"5",
                "_score":1,
                "_source":{
                    "name":"Fresh",
                    "age":22
                },
                "_explanation":Object{...}
            },
            {
                "_shard":1,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"10",
                "_score":1,
                "_source":{
                    "name":"Henrry",
                    "age":30
                },
                "_explanation":Object{...}
            },
            {
                "_shard":1,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"9",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"jiangsu",
                        "city":"nanjing",
                        "area":{
                            "pos":"10001"
                        }
                    }
                },
                "_explanation":Object{...}
            },
            {
                "_shard":2,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"2",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"jiangsu",
                        "city":"nanjing"
                    },
                    "name":"jack_1",
                    "age":19,
                    "join_date":"2016-01-01"
                },
                "_explanation":Object{...}
            },
            {
                "_shard":2,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"4",
                "_score":1,
                "_source":{
                    "name":"willam",
                    "age":18
                },
                "_explanation":Object{...}
            },
            {
                "_shard":2,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"6",
                "_score":1,
                "_source":{
                    "name":"Avivi",
                    "age":30
                },
                "_explanation":Object{...}
            },
            {
                "_shard":4,
                "_node":"K7qK1ncMQUuIe0K6VSVMJA",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":30,
                    "doc_count":2
                },
                {
                    "key":18,
                    "doc_count":1
                },
                {
                    "key":19,
                    "doc_count":1
                },
                {
                    "key":22,
                    "doc_count":1
                },
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}
1. setQuery() placed before addAggregation()
Code:
// Search the "employee" type of the "company" index, returning up to 250 hits.
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by "age".
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
SearchResponse response = responsebuilder
        // Query first: only documents with 30 < age < 40 match.
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
    "took":538,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"anlkGjjuQ0G6DODpZgiWrQ",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

2. setQuery() placed after addAggregation()
Code:
// Search the "employee" type of the "company" index, returning up to 250 hits.
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by "age".
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
SearchResponse response = responsebuilder
        .addAggregation(aggregation)
        // Query last: same condition, 30 < age < 40.
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
    "took":538,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"anlkGjjuQ0G6DODpZgiWrQ",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                 "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
             "buckets":[
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}


3. setPostFilter() placed after addAggregation()
Code:
// Search the "employee" type of the "company" index, returning up to 250 hits.
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by "age".
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
SearchResponse response = responsebuilder
        .addAggregation(aggregation)
        // Post filter last: narrows the hits to 30 < age < 40.
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
    "took":7,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                 "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
             "buckets":[
                {
                    "key":30,
                    "doc_count":2
                },
                {
                    "key":18,
                    "doc_count":1
                },
                {
                    "key":19,
                    "doc_count":1
                },
                {
                    "key":22,
                    "doc_count":1
                },
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

4. setPostFilter() placed before addAggregation()
Code:
// Search the "employee" type of the "company" index, returning up to 250 hits.
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by "age".
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
SearchResponse response = responsebuilder
        // Post filter first: narrows the hits to 30 < age < 40.
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");
Result:
{
    "took":5115,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"b8cNIO5cQr2MmsnsuluoNQ",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":30,
                    "doc_count":2
                },
                {
                    "key":18,
                    "doc_count":1
                },
                {
                    "key":19,
                    "doc_count":1
                },
                {
                    "key":22,
                    "doc_count":1
                },
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

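Summary:
The results above show clearly that, for both setPostFilter() and setQuery(), the position of the call in the chain does not affect the result. They also show that the condition passed to setQuery() affects both the hits and the aggregation (agg) result, whereas setPostFilter() affects only the hits and leaves the aggregation (agg) result unchanged.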
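The difference between the two methods can be reproduced without a cluster. The following is a small in-memory sketch in plain Java, assuming the seven sample documents shown in the results above (document 9 has no age field, modeled here as null). The class QueryVsPostFilter and its helpers are hypothetical names for illustration; this models the observed behavior and is not how Elasticsearch is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical in-memory model of setQuery() vs. setPostFilter(); not Elasticsearch code.
final class QueryVsPostFilter {
    // Ages from the sample index; null models document 9, which has no "age" field.
    static final List<Integer> AGES = Arrays.asList(22, 30, null, 19, 18, 30, 35);

    // Same condition as rangeQuery("age").gt(30).lt(40).
    static final Predicate<Integer> RANGE = age -> age != null && age > 30 && age < 40;

    // Terms-style aggregation: count documents per age, skipping documents without the field.
    static Map<Integer, Long> termsAgg(List<Integer> docs) {
        Map<Integer, Long> buckets = new TreeMap<>();
        for (Integer age : docs) {
            if (age != null) {
                buckets.merge(age, 1L, Long::sum);
            }
        }
        return buckets;
    }

    // Keep only the documents that satisfy the condition.
    static List<Integer> filter(List<Integer> docs, Predicate<Integer> condition) {
        List<Integer> out = new ArrayList<>();
        for (Integer age : docs) {
            if (condition.test(age)) {
                out.add(age);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // setQuery(): the condition narrows the document set; hits and buckets both use it.
        List<Integer> matched = filter(AGES, RANGE);
        System.out.println("query:       hits=" + matched.size()
                + " buckets=" + termsAgg(matched));

        // setPostFilter(): buckets are computed over every matched document,
        // and only the returned hits are narrowed afterwards.
        Map<Integer, Long> buckets = termsAgg(AGES);
        List<Integer> hits = filter(AGES, RANGE);
        System.out.println("post_filter: hits=" + hits.size() + " buckets=" + buckets);
    }
}
```

Both cases return one hit (age 35), but the query case yields a single bucket while the post-filter case keeps all five, matching the results above. The TreeMap prints buckets sorted by key, whereas the real responses list them by descending doc_count.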