Elasticsearch in detail: aggregate-then-filter versus filter-then-aggregate

龙大.

Article link: https://blog.csdn.net/u014745465/article/details/78338096

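In Elasticsearch, the result of combining a filter with an aggregation does not depend on the order in which you write them. Whether the filter condition comes before the aggregation or after it in the request builder chain, the result is the same.
Elasticsearch always filters first and aggregates afterwards, just as a relational database evaluates WHERE before GROUP BY.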
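
The WHERE-before-GROUP BY analogy can be mimicked in plain Java without a cluster. What follows is only a minimal sketch: the class `FilterThenAggregate` and its method `countByAge` are hypothetical names, the list holds the six `age` values from the sample documents used in this post, and Java streams stand in for Elasticsearch. It illustrates the order of evaluation, not how Elasticsearch is implemented.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in; this is not Elasticsearch code.
final class FilterThenAggregate {

    // Keep ages with low < age < high (the "WHERE" step), then count
    // documents per age (the "GROUP BY" step, similar to a terms aggregation).
    static Map<Integer, Long> countByAge(List<Integer> ages, int low, int high) {
        return ages.stream()
                .filter(age -> age > low && age < high)
                .collect(Collectors.groupingBy(age -> age, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // The six age values present in the sample "employee" documents.
        List<Integer> ages = List.of(22, 30, 19, 18, 30, 35);
        System.out.println(countByAge(ages, 30, 40));
    }
}
```

With the range 30 < age < 40 only the age 35 survives the filter, so the grouping step sees a single document. That mirrors the single `35` bucket in the setQuery() results later in this post.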

 
If you want a filter condition to change only the hits and leave the aggregation (agg) results unaffected, use the setPostFilter() method.
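
To make the difference concrete, here is a small in-memory sketch of what a post filter means. The class `PostFilterSketch`, its `Result` holder, and the `search` method are hypothetical names chosen for illustration, and the model is an assumption that reduces each document to its `age` value: the query decides which documents are searched, aggregations run over everything the query matched, and the post filter trims only the hits afterwards.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical in-memory model of query vs. post filter; not Elasticsearch code.
final class PostFilterSketch {

    // Holds the two parts of a search response that this post compares.
    static final class Result {
        final List<Integer> hits;
        final Map<Integer, Long> buckets;

        Result(List<Integer> hits, Map<Integer, Long> buckets) {
            this.hits = hits;
            this.buckets = buckets;
        }
    }

    static Result search(List<Integer> ages, Predicate<Integer> query, Predicate<Integer> postFilter) {
        // 1. The query selects the documents that take part in the search.
        List<Integer> matched = ages.stream().filter(query).collect(Collectors.toList());
        // 2. Aggregations are computed over everything the query matched.
        Map<Integer, Long> buckets = matched.stream()
                .collect(Collectors.groupingBy(a -> a, TreeMap::new, Collectors.counting()));
        // 3. The post filter trims only the hits, after aggregation.
        List<Integer> hits = matched.stream().filter(postFilter).collect(Collectors.toList());
        return new Result(hits, buckets);
    }

    public static void main(String[] args) {
        List<Integer> ages = List.of(22, 30, 19, 18, 30, 35);
        Predicate<Integer> range = a -> a > 30 && a < 40;
        Predicate<Integer> all = a -> true;

        Result asQuery = search(ages, range, all);       // range used as the query
        Result asPostFilter = search(ages, all, range);  // range used as the post filter
        System.out.println("query:       hits=" + asQuery.hits + " buckets=" + asQuery.buckets);
        System.out.println("post filter: hits=" + asPostFilter.hits + " buckets=" + asPostFilter.buckets);
    }
}
```

In both runs the hits contain only the age 35. The buckets differ: used as a query, the range leaves a single bucket; used as a post filter, all five age buckets remain.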

 
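Example: the full data set

Code:

// Search index "company", type "employee", returning up to 250 hits.
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by the "age" field.
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// No query and no post filter: every document is returned and aggregated.
SearchResponse response = responsebuilder
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result: aggregation only, no filtering (compare the contents of hits and agg)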

{
  "took":100,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":7,
    "max_score":1,
    "hits":[
      {
        "_shard":1,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"5",
        "_score":1,
        "_source":{
          "name":"Fresh",
          "age":22
        },
        "_explanation":Object{...}
      },
      {
        "_shard":1,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"10",
        "_score":1,
        "_source":{
          "name":"Henrry",
          "age":30
        },
        "_explanation":Object{...}
      },
      {
        "_shard":1,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"9",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"jiangsu",
            "city":"nanjing",
            "area":{
              "pos":"10001"
            }
          }
        },
        "_explanation":Object{...}
      },
      {
        "_shard":2,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"2",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"jiangsu",
            "city":"nanjing"
          },
          "name":"jack_1",
          "age":19,
          "join_date":"2016-01-01"
        },
        "_explanation":Object{...}
      },
      {
        "_shard":2,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"4",
        "_score":1,
        "_source":{
          "name":"willam",
          "age":18
        },
        "_explanation":Object{...}
      },
      {
        "_shard":2,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"6",
        "_score":1,
        "_source":{
          "name":"Avivi",
          "age":30
        },
        "_explanation":Object{...}
      },
      {
        "_shard":4,
        "_node":"K7qK1ncMQUuIe0K6VSVMJA",
        "_index":"company",
        "_type":"employee",
        "_id":"3",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"shanxi",
            "city":"xian"
          },
          "name":"marry",
          "age":35,
          "join_date":"2015-01-01"
        },
        "_explanation":Object{...}
      }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":[
        {
          "key":30,
          "doc_count":2
        },
        {
          "key":18,
          "doc_count":1
        },
        {
          "key":19,
          "doc_count":1
        },
        {
          "key":22,
          "doc_count":1
        },
        {
          "key":35,
          "doc_count":1
        }
      ]
    }
  }
}

 
1. setQuery() written first (before addAggregation())

Code:

SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The range query 30 < age < 40 is set before the aggregation is added.
SearchResponse response = responsebuilder
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
  "took":538,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":1,
    "max_score":1,
    "hits":[
      {
        "_shard":4,
        "_node":"anlkGjjuQ0G6DODpZgiWrQ",
        "_index":"company",
        "_type":"employee",
        "_id":"3",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"shanxi",
            "city":"xian"
          },
          "name":"marry",
          "age":35,
          "join_date":"2015-01-01"
        },
        "_explanation":Object{...}
      }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":[
        {
          "key":35,
          "doc_count":1
        }
      ]
    }
  }
}

 
2. setQuery() written last (after addAggregation())

Code:

SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The same range query, this time set after the aggregation is added.
SearchResponse response = responsebuilder
        .addAggregation(aggregation)
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
  "took":538,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":1,
    "max_score":1,
    "hits":[
      {
        "_shard":4,
        "_node":"anlkGjjuQ0G6DODpZgiWrQ",
        "_index":"company",
        "_type":"employee",
        "_id":"3",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"shanxi",
            "city":"xian"
          },
          "name":"marry",
          "age":35,
          "join_date":"2015-01-01"
        },
        "_explanation":Object{...}
      }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":[
        {
          "key":35,
          "doc_count":1
        }
      ]
    }
  }
}

 
3. setPostFilter() after the .addAggregation() call

Code:

SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The range condition is applied as a post filter, set after the aggregation.
SearchResponse response = responsebuilder
        .addAggregation(aggregation)
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
  "took":7,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":1,
    "max_score":1,
    "hits":[
      {
        "_shard":4,
        "_node":"fvp3NBT5R5i6CqN3y2LU4g",
        "_index":"company",
        "_type":"employee",
        "_id":"3",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"shanxi",
            "city":"xian"
          },
          "name":"marry",
          "age":35,
          "join_date":"2015-01-01"
        },
        "_explanation":Object{...}
      }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":[
        {
          "key":30,
          "doc_count":2
        },
        {
          "key":18,
          "doc_count":1
        },
        {
          "key":19,
          "doc_count":1
        },
        {
          "key":22,
          "doc_count":1
        },
        {
          "key":35,
          "doc_count":1
        }
      ]
    }
  }
}

 
4. setPostFilter() before the .addAggregation() call

Code:

SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The same post filter, this time set before the aggregation is added.
SearchResponse response = responsebuilder
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
  "took":5115,
  "timed_out":false,
  "_shards":{
    "total":5,
    "successful":5,
    "failed":0
  },
  "hits":{
    "total":1,
    "max_score":1,
    "hits":[
      {
        "_shard":4,
        "_node":"b8cNIO5cQr2MmsnsuluoNQ",
        "_index":"company",
        "_type":"employee",
        "_id":"3",
        "_score":1,
        "_source":{
          "address":{
            "country":"china",
            "province":"shanxi",
            "city":"xian"
          },
          "name":"marry",
          "age":35,
          "join_date":"2015-01-01"
        },
        "_explanation":Object{...}
      }
    ]
  },
  "aggregations":{
    "agg":{
      "doc_count_error_upper_bound":0,
      "sum_other_doc_count":0,
      "buckets":[
        {
          "key":30,
          "doc_count":2
        },
        {
          "key":18,
          "doc_count":1
        },
        {
          "key":19,
          "doc_count":1
        },
        {
          "key":22,
          "doc_count":1
        },
        {
          "key":35,
          "doc_count":1
        }
      ]
    }
  }
}

 
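Summary:

The results show that neither setPostFilter() nor setQuery() is sensitive to where it appears in the builder chain. They also show that a setQuery() condition restricts both the hits and the aggregation (agg) results, whereas a setPostFilter() condition restricts only the hits and leaves the aggregation (agg) results unchanged.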
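
One way to picture why the call order cannot matter: a fluent request builder only records each setting in its own slot, and the search runs later in a fixed sequence. The sketch below uses a hypothetical `SketchRequestBuilder` that is loosely modelled on the style of the calls in this post; it is not the real SearchRequestBuilder, and the fixed query, aggregation, post filter sequence is an assumption of the model. It reduces each document to its `age` value.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical builder for illustration; it is not the real SearchRequestBuilder.
final class SketchRequestBuilder {
    private Predicate<Integer> query = age -> true;
    private Predicate<Integer> postFilter = age -> true;
    private boolean aggregate = false;

    // Each setter only stores a value in a named slot and returns the builder.
    SketchRequestBuilder setQuery(Predicate<Integer> q) { this.query = q; return this; }
    SketchRequestBuilder setPostFilter(Predicate<Integer> f) { this.postFilter = f; return this; }
    SketchRequestBuilder addAggregation() { this.aggregate = true; return this; }

    // Execution always runs query -> aggregation -> post filter,
    // regardless of the order in which the setters were called.
    String execute(List<Integer> ages) {
        List<Integer> matched = ages.stream().filter(query).collect(Collectors.toList());
        long aggregated = aggregate ? matched.size() : 0;
        long hits = matched.stream().filter(postFilter).count();
        return "hits=" + hits + ", aggregatedDocs=" + aggregated;
    }

    public static void main(String[] args) {
        List<Integer> ages = List.of(22, 30, 19, 18, 30, 35);
        Predicate<Integer> range = age -> age > 30 && age < 40;

        String filterFirst = new SketchRequestBuilder().setPostFilter(range).addAggregation().execute(ages);
        String filterLast = new SketchRequestBuilder().addAggregation().setPostFilter(range).execute(ages);
        System.out.println(filterFirst);
        System.out.println(filterFirst.equals(filterLast));
    }
}
```

Both chains produce the same string because the setters never run the search themselves. Swapping setPostFilter(range) for setQuery(range) changes the aggregated document count, but again identically for both orderings.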
