Paginating with search_after in Python elasticsearch_dsl

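To use search_after, the results must be sorted on a key that is unique for every document, so that the ordering is unambiguous. This example uses _id.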

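As shown below, every query includes the sort parameter. Take the sort values of the last hit returned by the previous query and pass them as the search_after parameter of the next query.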

GET books/_search
{
  
    "size": 10,
    "query": {
        "match_all" : {
           
        }
    },
    "sort": [
        {"_id": "asc"},
        {"_score": "desc"},
        {"title.keyword":"desc"}
    ]
}



GET books/_search
{
  
    "size": 10,
    "query": {
        "match_all" : {
           
        }
    },
    "search_after": [ "-55upIIB7UkVIfnN6-JF",
          1.0,
          "Learn Git in a Month of Lunches"],
    "sort": [
        {"_id": "asc"},
        {"_score": "desc"},
        {"title.keyword":"desc"}
    ]
}
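The step from the first request to the second can be sketched in plain Python. This is only an illustration: next_page_body is a hypothetical helper name, not part of Elasticsearch or elasticsearch_dsl, and the last hit is assumed to be a dict carrying the "sort" array that Elasticsearch returns when a sort clause is present.

```python
import copy

def next_page_body(body, last_hit):
    """Return a copy of `body` whose search_after points just past `last_hit`.

    Hypothetical helper for illustration; `last_hit` is assumed to be one
    entry of hits.hits from a sorted search response.
    """
    if "sort" not in body:
        raise ValueError("search_after requires a sort clause in the request")
    new_body = copy.deepcopy(body)  # leave the original request untouched
    new_body["search_after"] = list(last_hit["sort"])
    return new_body

# The first request from the example above.
first_body = {
    "size": 10,
    "query": {"match_all": {}},
    "sort": [
        {"_id": "asc"},
        {"_score": "desc"},
        {"title.keyword": "desc"},
    ],
}

# Last hit of the first page, reduced to the fields that matter here.
last_hit = {
    "_id": "-55upIIB7UkVIfnN6-JF",
    "sort": ["-55upIIB7UkVIfnN6-JF", 1.0, "Learn Git in a Month of Lunches"],
}

second_body = next_page_body(first_body, last_hit)
```

The search_after array has one value per sort clause, in the same order, which is why the sort section has to stay identical between requests.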

Using the elasticsearch_dsl package

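import time
from elasticsearch_dsl import connections
from elasticsearch_dsl import Search, Q

# Connect to the cluster (host and credentials as in the original example).
conn = connections.create_connection(hosts=['192.168.214.131'], port=9200, http_auth="elastic:ellischen")
search = Search(using=conn, index='books')

# Example regexp clause; it is built here but not applied to the search below.
querys = {}
querys['must'] = [Q('regexp', **{'logical_address.keyword': {"value": "{}(.)*".format('2')}})]
search.query = Q({"match_all": {}})

res = search.count()  # total number of matching documents
hits_source = []
start = time.time()
# First page: sorting on _id makes every hit carry its sort values.
search_data = search.extra(size=1000).sort('-_id').execute()
while len(search_data['hits']['hits']):
    hits_source.extend(search_data['hits']['hits'])
    # Next page: pass the sort values of the last hit as search_after.
    last_sort = list(search_data['hits']['hits'][-1]['sort'])
    search_data = search.extra(size=1000, search_after=last_sort).sort('-_id').execute()
print(time.time() - start)  # elapsed time in seconds
print(len(hits_source))     # number of hits collected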
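The loop above can also be written as a generator that is independent of the client library. The sketch below is an assumption-laden illustration, not the elasticsearch_dsl API: iter_hits and fake_fetch are hypothetical names, and fake_fetch is an in-memory stand-in that imitates a search sorted by _id in descending order so the paging logic can be run without a cluster.

```python
def iter_hits(fetch_page):
    """Yield every hit, requesting pages until an empty page comes back.

    `fetch_page` is a hypothetical callable: it takes the search_after
    values (or None for the first page) and returns a list of hits,
    each carrying a "sort" list.
    """
    search_after = None
    while True:
        hits = fetch_page(search_after)
        if not hits:
            return
        yield from hits
        # The sort values of the last hit mark where the next page starts.
        search_after = hits[-1]["sort"]

# In-memory stand-in for an index of 25 documents (illustrative data only).
DOCS = [{"_id": "doc{:02d}".format(i), "sort": ["doc{:02d}".format(i)]}
        for i in range(25)]

def fake_fetch(search_after, size=10):
    """Imitate a search sorted by _id descending with search_after paging."""
    ordered = sorted(DOCS, key=lambda d: d["sort"], reverse=True)
    if search_after is not None:
        # Descending order: keep only documents strictly after the marker.
        ordered = [d for d in ordered if d["sort"] < search_after]
    return ordered[:size]

all_hits = list(iter_hits(fake_fetch))
```

With a real cluster, fetch_page would wrap the search.extra(...).sort('-_id').execute() call from the example and return its hits. Yielding hits one by one avoids holding every page in memory unless the caller chooses to collect them.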

https://github.com/elastic/elasticsearch-dsl-py/issues/1329
