Elasticsearch: aggregate-then-filter versus filter-then-aggregate, explained

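In Elasticsearch, the result of combining an aggregation with a filter does not depend on the order in which you write them. In other words, whether you add the filter condition before or after the aggregation in the request builder chain, the result is the same: Elasticsearch always filters first and then aggregates, just as a relational database applies WHERE before GROUP BY. If, however, you want the filter condition to leave the aggregation (agg) results untouched and change only the hits, you can use the setPostFilter() method.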
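The difference between the two kinds of filter can be pictured with a small plain-Java simulation. This is only a sketch under stated assumptions: it does not call Elasticsearch at all, the class FilterOrderSketch and its search() method are hypothetical names made up for illustration, and the ages are copied from the sample documents used in the results that follow (null stands for the one document that has no "age" field).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical simulation, NOT the Elasticsearch API.
class FilterOrderSketch {
    // Ages of the seven sample documents; null = document without an "age" field.
    public static final List<Integer> AGES = Arrays.asList(22, 30, null, 19, 18, 30, 35);

    // The range used in the article: 30 < age < 40.
    public static final Predicate<Integer> RANGE = age -> age != null && age > 30 && age < 40;
    // A filter that lets every document through.
    public static final Predicate<Integer> ALL = age -> true;

    /** What a search returns in this sketch: the hits and the buckets (age -> doc_count). */
    public static final class Result {
        public final List<Integer> hits;
        public final Map<Integer, Integer> buckets;

        Result(List<Integer> hits, Map<Integer, Integer> buckets) {
            this.hits = hits;
            this.buckets = buckets;
        }
    }

    public static Result search(Predicate<Integer> query, Predicate<Integer> postFilter) {
        // Step 1: the query narrows the document set (like WHERE).
        List<Integer> matched = new ArrayList<>();
        for (Integer age : AGES) {
            if (query.test(age)) {
                matched.add(age);
            }
        }
        // Step 2: aggregate over the matched documents (like GROUP BY).
        // Documents without an age are skipped in this sketch.
        Map<Integer, Integer> buckets = new TreeMap<>();
        for (Integer age : matched) {
            if (age != null) {
                buckets.merge(age, 1, Integer::sum);
            }
        }
        // Step 3: the post filter runs last and trims only the hits.
        List<Integer> hits = new ArrayList<>();
        for (Integer age : matched) {
            if (postFilter.test(age)) {
                hits.add(age);
            }
        }
        return new Result(hits, buckets);
    }

    public static void main(String[] args) {
        Result asQuery = search(RANGE, ALL);
        Result asPostFilter = search(ALL, RANGE);
        System.out.println("query:       hits=" + asQuery.hits + " buckets=" + asQuery.buckets);
        System.out.println("post filter: hits=" + asPostFilter.hits + " buckets=" + asPostFilter.buckets);
    }
}
```

With the range passed as the query, both the hits and the buckets shrink to the single document aged 35. With the same range passed as the post filter, the hits still contain only that document, but the buckets cover all ages (18, 19, 22, 30 twice, 35). Skipping the document without an age is consistent with the sample output, where 7 hits produce bucket counts that sum to 6.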

Example: all data, no filter. Code:


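import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;

// `client` is assumed to be an already-connected Elasticsearch client.
SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
// Terms aggregation named "agg" that buckets documents by their "age" value.
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
response = responsebuilder
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");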

Result: aggregation only, no filter (note the contents of hits and agg):

{
    "took":100,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":7,
        "max_score":1,
        "hits":[
            {
                "_shard":1,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"5",
                "_score":1,
                "_source":{
                    "name":"Fresh",
                    "age":22
                },
                "_explanation":Object{...}
            },
            {
                "_shard":1,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"10",
                "_score":1,
                "_source":{
                    "name":"Henrry",
                    "age":30
                },
                "_explanation":Object{...}
            },
            {
                "_shard":1,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"9",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"jiangsu",
                        "city":"nanjing",
                        "area":{
                            "pos":"10001"
                        }
                    }
                },
                "_explanation":Object{...}
            },
            {
                "_shard":2,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"2",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"jiangsu",
                        "city":"nanjing"
                    },
                    "name":"jack_1",
                    "age":19,
                    "join_date":"2016-01-01"
                },
                "_explanation":Object{...}
            },
            {
                "_shard":2,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"4",
                "_score":1,
                "_source":{
                    "name":"willam",
                    "age":18
                },
                "_explanation":Object{...}
            },
            {
                "_shard":2,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"6",
                "_score":1,
                "_source":{
                    "name":"Avivi",
                    "age":30
                },
                "_explanation":Object{...}
            },
            {
                "_shard":4,
                "_node":"K7qK1ncMQUuIe0K6VSVMJA",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":30,
                    "doc_count":2
                },
                {
                    "key":18,
                    "doc_count":1
                },
                {
                    "key":19,
                    "doc_count":1
                },
                {
                    "key":22,
                    "doc_count":1
                },
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

1. setQuery() before addAggregation(). Code:


SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The range query (30 < age < 40) is set before the aggregation is added.
response = responsebuilder
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
    "took":538,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"anlkGjjuQ0G6DODpZgiWrQ",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

2. setQuery() after addAggregation(). Code:


SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The same range query, this time set after the aggregation is added.
response = responsebuilder
        .addAggregation(aggregation)
        .setQuery(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:


{
  "took":538,
  "timed_out":false,
  "_shards":{
      "total":5,
      "successful":5,
      "failed":0
  },
  "hits":{
      "total":1,
      "max_score":1,
      "hits":[
          {
              "_shard":4,
              "_node":"anlkGjjuQ0G6DODpZgiWrQ",
              "_index":"company",
              "_type":"employee",
              "_id":"3",
              "_score":1,
              "_source":{
                  "address":{
                      "country":"china",
                      "province":"shanxi",
                      "city":"xian"
                  },
                  "name":"marry",
                  "age":35,
                  "join_date":"2015-01-01"
              },
              "_explanation":Object{...}
          }
      ]
  },
  "aggregations":{
      "agg":{
          "doc_count_error_upper_bound":0,
          "sum_other_doc_count":0,
          "buckets":[
              {
                  "key":35,
                  "doc_count":1
              }
          ]
      }
  }
}

3. setPostFilter() after addAggregation(). Code:


SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The range is applied as a post filter, after the aggregation is added.
response = responsebuilder
        .addAggregation(aggregation)
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
    "took":7,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"fvp3NBT5R5i6CqN3y2LU4g",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":30,
                    "doc_count":2
                },
                {
                    "key":18,
                    "doc_count":1
                },
                {
                    "key":19,
                    "doc_count":1
                },
                {
                    "key":22,
                    "doc_count":1
                },
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

4. setPostFilter() before addAggregation(). Code:

SearchResponse response = null;
SearchRequestBuilder responsebuilder = client.prepareSearch("company")
        .setTypes("employee").setFrom(0).setSize(250);
AggregationBuilder aggregation = AggregationBuilders
        .terms("agg")
        .field("age");
// The same post filter, this time set before the aggregation is added.
response = responsebuilder
        .setPostFilter(QueryBuilders.rangeQuery("age").gt(30).lt(40))
        .addAggregation(aggregation)
        .setExplain(true).execute().actionGet();
SearchHits hits = response.getHits();
Terms agg = response.getAggregations().get("agg");

Result:

{
    "took":5115,
    "timed_out":false,
    "_shards":{
        "total":5,
        "successful":5,
        "failed":0
    },
    "hits":{
        "total":1,
        "max_score":1,
        "hits":[
            {
                "_shard":4,
                "_node":"b8cNIO5cQr2MmsnsuluoNQ",
                "_index":"company",
                "_type":"employee",
                "_id":"3",
                "_score":1,
                "_source":{
                    "address":{
                        "country":"china",
                        "province":"shanxi",
                        "city":"xian"
                    },
                    "name":"marry",
                    "age":35,
                    "join_date":"2015-01-01"
                },
                "_explanation":Object{...}
            }
        ]
    },
    "aggregations":{
        "agg":{
            "doc_count_error_upper_bound":0,
            "sum_other_doc_count":0,
            "buckets":[
                {
                    "key":30,
                    "doc_count":2
                },
                {
                    "key":18,
                    "doc_count":1
                },
                {
                    "key":19,
                    "doc_count":1
                },
                {
                    "key":22,
                    "doc_count":1
                },
                {
                    "key":35,
                    "doc_count":1
                }
            ]
        }
    }
}

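Summary: The results show that for both setPostFilter() and setQuery(), the position of the call in the builder chain does not affect the result. They also show that the filter condition passed to setQuery() affects both the hits and the aggregation (agg) results, whereas the condition passed to setPostFilter() affects only the hits and leaves the aggregation (agg) results unchanged.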
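One way to picture why call order is irrelevant is that each builder call only records a setting on a single request object, and nothing is sent until execute() is called. The sketch below illustrates that idea with a hypothetical MiniSearchRequest class. It is an assumption-laden stand-in written for this explanation, not the Elasticsearch SearchRequestBuilder, and the strings it stores are placeholders rather than real query objects.

```java
// Hypothetical stand-in for a search request builder, NOT the Elasticsearch class.
class BuilderOrderSketch {

    public static final class MiniSearchRequest {
        private String query;       // would narrow both hits and aggregations
        private String postFilter;  // would narrow hits only
        private String aggregation; // a single aggregation, for simplicity

        // Each setter only stores its argument and returns the builder for chaining.
        public MiniSearchRequest setQuery(String query) {
            this.query = query;
            return this;
        }

        public MiniSearchRequest setPostFilter(String postFilter) {
            this.postFilter = postFilter;
            return this;
        }

        public MiniSearchRequest addAggregation(String aggregation) {
            this.aggregation = aggregation;
            return this;
        }

        /** Renders the request in a fixed field order, whatever order the setters were called in. */
        public String build() {
            return "{query=" + query + ", aggs=" + aggregation + ", post_filter=" + postFilter + "}";
        }
    }

    public static void main(String[] args) {
        String range = "30 < age < 40";
        String queryFirst = new MiniSearchRequest()
                .setQuery(range).addAggregation("terms(age)").build();
        String aggFirst = new MiniSearchRequest()
                .addAggregation("terms(age)").setQuery(range).build();
        // Both chains describe the same request.
        System.out.println(queryFirst.equals(aggFirst));
    }
}
```

What changes the outcome in this picture is which slot the condition lands in (query or post filter), not when the setter is called.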

Reposted from: https://my.oschina.net/u/2367628/blog/1555872
