Elasticsearch: Grouped, Paginated, and Sorted Queries

Background: paginating the result of an aggregation is a very common requirement in Elasticsearch.
 

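Approach:

        Combine the terms bucket aggregation, the bucket_sort pipeline aggregation, and the cardinality metric aggregation: terms groups the documents, bucket_sort slices one page out of the groups, and cardinality supplies the total number of groups.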
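To make the roles of the three aggregations concrete, here is a minimal in-memory sketch in plain Java with no Elasticsearch client involved. The `Visit` type and the method names are hypothetical, and the sketch assumes the intended order is "most recently visited member first".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory model of "group by key, keep the latest record, return one page of groups". */
final class GroupPageSketch {

    /** Hypothetical record type holding only the two fields the grouping needs. */
    static final class Visit {
        final String customerNo;
        final long visitTimeLong;

        Visit(String customerNo, long visitTimeLong) {
            this.customerNo = customerNo;
            this.visitTimeLong = visitTimeLong;
        }
    }

    /** Plays the role of terms + top_hits(size = 1, sort desc): the latest visit per customer. */
    static Map<String, Visit> latestPerCustomer(List<Visit> visits) {
        Map<String, Visit> latest = new LinkedHashMap<>();
        for (Visit v : visits) {
            Visit seen = latest.get(v.customerNo);
            if (seen == null || v.visitTimeLong > seen.visitTimeLong) {
                latest.put(v.customerNo, v);
            }
        }
        return latest;
    }

    /** Plays the role of bucket_sort(from, size): order the groups, then slice out one page. */
    static List<Visit> page(List<Visit> visits, int pageNo, int pageSize) {
        List<Visit> groups = new ArrayList<>(latestPerCustomer(visits).values());
        // Assumption: newest group first.
        groups.sort(Comparator.comparingLong((Visit v) -> v.visitTimeLong).reversed());
        int from = Math.min((pageNo - 1) * pageSize, groups.size());
        int to = Math.min(from + pageSize, groups.size());
        return new ArrayList<>(groups.subList(from, to));
    }

    /** Plays the role of cardinality: distinct customers (exact here, approximate in Elasticsearch). */
    static int totalGroups(List<Visit> visits) {
        return latestPerCustomer(visits).size();
    }
}
```

Elasticsearch does the same work across shards, which is why the real query needs a terms size large enough to cover the requested page and a separate cardinality aggregation for the total.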

Implementation (example: group by member code, page through the groups, and show each member's most recent record by time):

1. Query implementation



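// bucket_sort pipeline aggregation: skips (pageNo - 1) * pageSize buckets and keeps the next pageSize.
// The sort list is empty (java.util.Collections), so buckets keep the order produced by the parent terms aggregation.
BucketSortPipelineAggregationBuilder bucketSortAggregation = PipelineAggregatorBuilders.bucketSort(
                "sortCustomer", Collections.emptyList())
        .from((pageNo.intValue() - 1) * pageSize.intValue())
        .size(pageSize.intValue());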


// Paging metric: number of distinct customer_no values, used as the paging total (cardinality is an approximate count)
CardinalityAggregationBuilder cardinalityAggregation = AggregationBuilders.cardinality("custCard").field("customer_no.keyword");


// Return only the most recent record of each group, restricted to the listed source fields
TopHitsAggregationBuilder topHitsAggregation = AggregationBuilders.topHits("latestCust")
        .fetchSource(new String[]{"customer_no", "customer_name", "identify_no", "visit_time", "service_item_names", "organization_name", "id", "type"}, null)
        .size(1)
        .sort("visit_time_long", SortOrder.DESC);


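// Group by member code. size must reach the end of the requested page,
// because bucket_sort only sees the buckets that terms returns.
TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders.terms("topCustomer").field("customer_no.keyword")
        .size(pageNo.intValue() * pageSize.intValue())
        .subAggregation(bucketSortAggregation)
        .subAggregation(topHitsAggregation);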


// Build the search request; the cardinality metric and the terms grouping must sit at the same level

// Return no top-level hits, only the aggregations
searchSourceBuilder.size(0);
searchSourceBuilder.from(0);
// Sort (applies to top-level hits only)
searchRequest.source(searchSourceBuilder.sort("visit_time_long", SortOrder.DESC));
// Query: the usual filter conditions
searchRequest.source(searchSourceBuilder.query(boolBuilder));
// Aggregations: grouping + paging metric
searchRequest.source(searchSourceBuilder.aggregation(termsAggregationBuilder));
searchRequest.source(searchSourceBuilder.aggregation(cardinalityAggregation));
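The two paging values used above are easy to get wrong by one page, so here is a small sketch of the arithmetic on its own. The class and method names are hypothetical, and `pageNo` is assumed to be 1-based, as in the code above.

```java
/** Paging arithmetic for a terms + bucket_sort query (pageNo is 1-based). */
final class AggPaging {

    /** Number of buckets bucket_sort should skip: its "from" value. */
    static int from(int pageNo, int pageSize) {
        requirePositive(pageNo, pageSize);
        return Math.multiplyExact(pageNo - 1, pageSize);
    }

    /** Number of buckets the terms aggregation must return so the requested page is covered. */
    static int termsSize(int pageNo, int pageSize) {
        requirePositive(pageNo, pageSize);
        return Math.multiplyExact(pageNo, pageSize);
    }

    private static void requirePositive(int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
    }
}
```

Because the terms size grows with the page number, deep pages get progressively more expensive, and the cluster's search.max_buckets setting limits how many buckets a single response may hold.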

2. ES query


GET /xxxxx/_search
{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "del_flag": {
              "value": false,
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "sort": [
    {
      "visit_time_long": {
        "order": "desc"
      }
    }
  ],
  "aggregations": {
    "topCustomer": {
      "terms": {
        "field": "customer_no.keyword",
        "size": 5,
        "min_doc_count": 1,
        "shard_min_doc_count": 0,
        "show_term_doc_count_error": false,
        "order": [
          {
            "_count": "desc"
          },
          {
            "_key": "asc"
          }
        ]
      },
      "aggregations": {
        "latestCust": {
          "top_hits": {
            "from": 0,
            "size": 1,
            "version": false,
            "seq_no_primary_term": false,
            "explain": false,
            "_source": {
              "includes": [
                "customer_no",
                "customer_name",
                "identify_no",
                "visit_time",
                "service_item_names",
                "id",
                "type"
              ],
              "excludes": []
            },
            "sort": [
              {
                "visit_time_long": {
                  "order": "desc"
                }
              }
            ]
          }
        },
        "sortCustomer": {
          "bucket_sort": {
            "sort": [],
            "from": 0,
            "size": 5,
            "gap_policy": "SKIP"
          }
        }
      }
    },
    "custCard": {
      "cardinality": {
        "field": "customer_no.keyword"
      }
    }
  }
}
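Note that in the query above the bucket_sort sort list is empty, so the groups come back in the default terms order (document count descending, then key). If the groups themselves should be ordered by each member's latest visit, one possible variant is to add a max sub-aggregation and order the terms buckets by it. The sketch below only builds that aggregations body as a string, without the client library; the aggregation name `latestVisit` and the class name are hypothetical, and on a multi-shard index an order based on a sub-aggregation can be approximate.

```java
/** Builds an "aggregations" body for one page, ordering groups by their latest visit. */
final class LatestVisitAggSketch {

    static String aggregations(int pageNo, int pageSize) {
        int from = (pageNo - 1) * pageSize;   // buckets skipped by bucket_sort
        int termsSize = pageNo * pageSize;    // buckets requested from terms
        return String.join("\n",
            "{",
            "  \"topCustomer\": {",
            "    \"terms\": {",
            "      \"field\": \"customer_no.keyword\",",
            "      \"size\": " + termsSize + ",",
            "      \"order\": { \"latestVisit\": \"desc\" }",
            "    },",
            "    \"aggregations\": {",
            "      \"latestVisit\": { \"max\": { \"field\": \"visit_time_long\" } },",
            "      \"latestCust\": { \"top_hits\": { \"size\": 1, \"sort\": [ { \"visit_time_long\": { \"order\": \"desc\" } } ] } },",
            "      \"sortCustomer\": { \"bucket_sort\": { \"from\": " + from + ", \"size\": " + pageSize + " } }",
            "    }",
            "  },",
            "  \"custCard\": { \"cardinality\": { \"field\": \"customer_no.keyword\" } }",
            "}");
    }
}
```

Since terms already delivers the buckets newest first, bucket_sort is left without a sort of its own and only skips and truncates.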

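ES query result: because size is 0 the response contains no top-level hits. Each bucket under aggregations.topCustomer.buckets carries one document in latestCust.hits.hits, and aggregations.custCard.value is the number of distinct customer_no values, which serves as the paging total.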

3. Reading the result in Java
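No code is shown for this step. With the Java client, the values would be read from the response aggregations by the names used above (topCustomer, latestCust, custCard). The sketch below leaves the client out and only shows one way the extracted values could be packaged for the caller; `PageResult` and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical page wrapper for the rows taken from each bucket's latestCust hit. */
final class PageResult {
    final long total;                           // custCard value: distinct customer_no count
    final int pageNo;                           // 1-based page number
    final int pageSize;
    final List<Map<String, Object>> records;    // one _source map per returned bucket

    PageResult(long total, int pageNo, int pageSize, List<Map<String, Object>> records) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be >= 1");
        }
        this.total = total;
        this.pageNo = pageNo;
        this.pageSize = pageSize;
        this.records = Collections.unmodifiableList(new ArrayList<>(records));
    }

    /** Total number of pages, using ceiling division without floating point. */
    long totalPages() {
        return (total + pageSize - 1) / pageSize;
    }

    /** True when at least one more page follows the current one. */
    boolean hasNext() {
        return pageNo < totalPages();
    }
}
```

Since cardinality is approximate for large numbers of distinct values, the derived page count is an estimate and the last page may be shorter, or emptier, than expected.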

This completes the grouped, paginated, and sorted query.


Reference: Bucket aggregations | Elasticsearch Guide [8.4] | Elastic
