【ElasticSearch】Deep paging and its solutions

Latest recommended article published on 2024-03-29 10:22:17

种下星星的日子

Category: 【ElasticSearch】 Tags: elasticsearch

Article link: https://blog.csdn.net/hongwei15732623364/article/details/110873751

Paginated queries in ES:
Search syntax:

GET /_search?size=10
GET /_search?from=0&size=10
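
The from offset follows directly from the page number. A minimal sketch in Python, assuming 1-based page numbers; page_params is a hypothetical helper name for illustration, not part of any Elasticsearch client:

```python
def page_params(page: int, size: int = 10) -> dict:
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {"from": (page - 1) * size, "size": size}


print(page_params(1))     # {'from': 0, 'size': 10}
print(page_params(1000))  # {'from': 9990, 'size': 10}
```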

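What is the deep paging problem? Why does it occur, and how does it work under the hood?
Deep paging:
Put simply, deep paging means requesting a page very far into the result set. Suppose an index holds 60,000 documents spread over 3 shards of 20,000 documents each, the page size is 10, and we request page 1000, that is from=9990 and size=10 (results 9991 to 10000). A shard cannot know which of its documents fall into that global range, so every shard must return its own top 10,000 entries (from + size) to the coordinating node. The coordinating node therefore receives 30,000 entries in total, sorts them by _score (the relevance score), skips the first 9,990, and keeps the next 10, which are the 10 results of page 1000.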
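
The merge step can be simulated without a cluster. The sketch below is a simplification under stated assumptions: each document is represented only by a distinct random score, and shard_top and coordinate are hypothetical names. Real shards send document IDs and sort values rather than bare scores. With 3 shards, from=9990 and size=10, each shard sends its top from + size = 10000 entries, so the coordinating node sorts 30000 candidates to produce 10 results.

```python
import heapq
import random


def shard_top(shard_scores, from_, size):
    """Query phase on one shard: return its top (from_ + size) scores, best first."""
    return heapq.nlargest(from_ + size, shard_scores)


def coordinate(shards, from_, size):
    """Coordinating node: merge every shard's candidates, then cut out the page."""
    candidates = []
    for shard_scores in shards:
        candidates.extend(shard_top(shard_scores, from_, size))
    candidates.sort(reverse=True)
    return candidates[from_:from_ + size], len(candidates)


random.seed(0)
scores = random.sample(range(1_000_000), 60_000)  # 60,000 distinct scores
shards = [scores[i::3] for i in range(3)]         # 3 shards x 20,000 documents

page, received = coordinate(shards, from_=9990, size=10)
print(received)   # 30000
print(len(page))  # 10
```

The work done by the coordinating node scales with (from + size) times the number of shards, not with the 10 results that are finally returned.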

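Performance problems:
Sorting that much data consumes network bandwidth, memory, and CPU, and the cost grows with the page depth, so deep paging should be avoided.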
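
Elasticsearch itself limits how deep from/size paging can go through the index.max_result_window index setting, which defaults to 10000: a request whose from + size exceeds it is rejected. A small client-side check that mirrors this rule might look as follows; check_window is a hypothetical helper, not an Elasticsearch API:

```python
MAX_RESULT_WINDOW = 10_000  # default value of the index.max_result_window setting


def check_window(from_: int, size: int, limit: int = MAX_RESULT_WINDOW) -> bool:
    """Return True if from_ + size stays within the result window."""
    return from_ + size <= limit


print(check_window(9990, 10))    # True: page 1000 sits exactly at the default limit
print(check_window(10000, 10))   # False: page 1001 would be rejected
```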

Scroll-based search:

GET /test_index/test_type/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "sort": [ "_doc" ],
  "size": 3
}

Running the request above returns the following result:

{
  "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3",
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 10,
    "max_score": null,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "8",
        "_score": null,
        "_source": {
          "test_field": "test client 2"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "6",
        "_score": null,
        "_source": {
          "test_field": "tes test"
        },
        "sort": [
          0
        ]
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "AVp4RN0bhjxldOOnBxaE",
        "_score": null,
        "_source": {
          "test_content": "my test"
        },
        "sort": [
          0
        ]
      }
    ]
  }
}

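The result contains a _scroll_id. The next scroll request must include this scroll_id, together with a scroll parameter that renews the search context timeout: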

GET /_search/scroll
{
    "scroll": "1m", 
    "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3"
}
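
Repeating the scroll request until a batch comes back empty walks through the whole result set. The loop can be sketched in Python as below. This is only an illustration under assumptions: scroll_all is a hypothetical helper, and next_page stands for whatever function sends the scroll request to the cluster; here it is replaced by canned responses so the sketch runs without Elasticsearch.

```python
from typing import Callable, Iterator


def scroll_all(first_page: dict, next_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every hit, following _scroll_id until a batch comes back empty."""
    page = first_page
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Always pass the most recent _scroll_id; it may change between calls.
        page = next_page(page["_scroll_id"])


# Stand-in for the cluster: three canned batches, the last one empty.
batches = [
    {"_scroll_id": "id-1", "hits": {"hits": [{"_id": "8"}, {"_id": "6"}]}},
    {"_scroll_id": "id-2", "hits": {"hits": [{"_id": "AVp4RN0bhjxldOOnBxaE"}]}},
    {"_scroll_id": "id-3", "hits": {"hits": []}},
]
calls = []


def fake_next_page(scroll_id: str) -> dict:
    """Record the scroll_id that was sent and return the next canned batch."""
    calls.append(scroll_id)
    return batches[len(calls)]


ids = [hit["_id"] for hit in scroll_all(batches[0], fake_next_page)]
print(ids)    # ['8', '6', 'AVp4RN0bhjxldOOnBxaE']
print(calls)  # ['id-1', 'id-2']
```

Each shard only has to continue from where the previous batch stopped, so the cost per request stays bounded by the batch size instead of growing with the depth of the page.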