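1. Why is there a 10,000-result limit?
By default, Elasticsearch limits how deep a search can page into an index: from + size may not exceed 10,000 (the index.max_result_window setting). The limit exists to avoid performance problems. To serve a page, every shard has to collect and sort its top from + size hits, and the coordinating node then merges the candidates from all shards, so memory use and latency grow with the page depth.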
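To make that cost concrete, here is a rough sketch of how many hits have to be held and sorted for a single page request. It assumes the usual two-phase search, where each shard returns up to from + size candidates to the coordinating node. The class and method names are made up for illustration and are not part of any Elasticsearch API.

```java
// Rough arithmetic for deep-pagination cost (illustrative names, not an ES API).
class DeepPagingCost {

    // Each shard must collect and sort its own top (from + size) hits.
    static long hitsPerShard(int from, int size) {
        return (long) from + size;
    }

    // The coordinating node merges the candidates returned by every shard.
    static long hitsMergedByCoordinator(int shards, int from, int size) {
        return shards * hitsPerShard(from, size);
    }

    public static void main(String[] args) {
        int shards = 5; // assumed shard count
        // First page of 10 results.
        System.out.println(hitsMergedByCoordinator(shards, 0, 10));       // 50
        // Last page allowed by the default limit.
        System.out.println(hitsMergedByCoordinator(shards, 9_990, 10));   // 50000
        // Same page size after raising the window to 1,000,000.
        System.out.println(hitsMergedByCoordinator(shards, 999_990, 10)); // 5000000
    }
}
```

The page returned to the caller is 10 documents in all three cases; only the amount of work behind it changes.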
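2. Ways to get past the 10,000 limit
Use the scroll API: the scroll API retrieves a large result set in batches instead of loading everything in one response. The initial search returns the first batch together with a scroll ID, and each follow-up request passes that scroll ID to fetch the next batch, until a batch comes back empty. This lets you walk through every matching document without running into the limit.
Raise the default setting: change the limit from 10,000 to, say, 100,000 or 500,000. Deep pages then need much more heap, so you have to watch for out-of-memory errors in ES. Not recommended.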
3. How to do it
Using the scroll API
import java.util.concurrent.ConcurrentLinkedDeque;
import org.elasticsearch.action.search.ClearScrollRequest;
import org.elasticsearch.action.search.ClearScrollResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.common.unit.TimeValue; // package used by the 6.x client; later versions moved it
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// elasticsearchClient is assumed to be a RestHighLevelClient field;
// batchSize and scrollTime are fields of the enclosing class.
public ConcurrentLinkedDeque<FramePickingInfoVo> getSearchRequest(FramePickingPageVo vo) {
    long start = System.currentTimeMillis();
    ConcurrentLinkedDeque<FramePickingInfoVo> list = new ConcurrentLinkedDeque<>();
    String scrollId = null;
    try {
        // Build the query; batchSize is the number of hits fetched per round trip.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        BoolQueryBuilder boolQuery = CloudBaseQueryBuilder.getBaseFramePickBoolQuery(vo, true);
        sourceBuilder.size(batchSize);
        sourceBuilder.query(boolQuery);
        getSort(vo.getOrderByType(), sourceBuilder);
        SearchRequest searchRequest = new SearchRequest(BASE_SUB_ORDER_DIMENSION_INDEX);
        searchRequest.source(sourceBuilder);
        // Keep the search context alive for scrollTime minutes between requests.
        searchRequest.scroll(TimeValue.timeValueMinutes(scrollTime));

        // The initial search returns the first batch plus a scroll ID.
        SearchResponse searchResponse = elasticsearchClient.search(searchRequest, RequestOptions.DEFAULT);
        scrollId = searchResponse.getScrollId();
        SearchHit[] searchHits = searchResponse.getHits().getHits();
        list.addAll(getFramPickDateInfos(searchResponse));

        // Keep requesting batches until one comes back empty.
        while (searchHits != null && searchHits.length > 0) {
            SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
            scrollRequest.scroll(TimeValue.timeValueMinutes(scrollTime));
            searchResponse = elasticsearchClient.scroll(scrollRequest, RequestOptions.DEFAULT);
            scrollId = searchResponse.getScrollId();
            searchHits = searchResponse.getHits().getHits();
            list.addAll(getFramPickDateInfos(searchResponse));
        }
        long end = System.currentTimeMillis();
        System.out.println("Total execution time: " + (end - start) / 1000 + " s");
    } catch (Exception e) {
        System.out.println("===error==" + e.getMessage());
        e.printStackTrace();
    } finally {
        // Release the server-side search context even if the scroll failed part-way.
        if (scrollId != null) {
            try {
                ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
                clearScrollRequest.addScrollId(scrollId);
                ClearScrollResponse clearScrollResponse = elasticsearchClient.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
                if (!clearScrollResponse.isSucceeded()) {
                    System.out.println("===error== failed to clear scroll " + scrollId);
                }
            } catch (Exception e) {
                System.out.println("===error==" + e.getMessage());
            }
        }
    }
    return list;
}
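The control flow of the method above (first search, repeat until an empty batch, then clear the scroll) can also be shown without the Elasticsearch client, which makes it easy to try out and test. The sketch below is client-agnostic: Page, drain and drainFake are hypothetical names, and the "backend" is just an in-memory counter standing in for the real search calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Client-agnostic sketch of the scroll loop; all names here are hypothetical.
class ScrollLoop {

    // One batch of results plus the ID needed to request the next batch.
    static final class Page<T> {
        final String scrollId;
        final List<T> hits;

        Page(String scrollId, List<T> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    // Runs the first search, then keeps fetching until a batch is empty.
    // The scroll is cleared in finally so the context is released on failure too.
    static <T> List<T> drain(Supplier<Page<T>> firstSearch,
                             Function<String, Page<T>> nextBatch,
                             Consumer<String> clearScroll) {
        List<T> all = new ArrayList<>();
        String scrollId = null;
        try {
            Page<T> page = firstSearch.get();
            scrollId = page.scrollId;
            while (!page.hits.isEmpty()) {
                all.addAll(page.hits);
                page = nextBatch.apply(scrollId);
                scrollId = page.scrollId; // the ID may change between batches
            }
        } finally {
            if (scrollId != null) {
                clearScroll.accept(scrollId);
            }
        }
        return all;
    }

    // Fake backend: serves the integers 0..total-1 in batches of batchSize.
    static List<Integer> drainFake(int total, int batchSize, List<String> cleared) {
        int[] cursor = {0};
        Function<String, Page<Integer>> next = id -> {
            List<Integer> hits = new ArrayList<>();
            while (hits.size() < batchSize && cursor[0] < total) {
                hits.add(cursor[0]++);
            }
            return new Page<>("fake-scroll-id", hits);
        };
        return drain(() -> next.apply(null), next, cleared::add);
    }

    public static void main(String[] args) {
        List<String> cleared = new ArrayList<>();
        List<Integer> docs = drainFake(25, 10, cleared);
        System.out.println(docs.size());    // 25
        System.out.println(cleared.size()); // 1
    }
}
```

With a real client, the three function arguments would wrap the search, scroll and clearScroll calls respectively.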
Changing the ES default setting
# Raise the maximum result window, for example to 1,000,000
PUT index/_settings
{
  "index.max_result_window": "1000000"
}
# Check the current maximum result window
GET index/_settings
---
{
  "demo_scroll": {
    "settings": {
      "index": {
        "number_of_shards": "5",
        "provided_name": "demo_scroll",
        "max_result_window": "1000000",
        "creation_date": "1680832840425",
        "number_of_replicas": "1",
        "uuid": "OLV5W_D9R-WBUaZ_QbGeWA",
        "version": {
          "created": "6082399"
        }
      }
    }
  }
}
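If you would rather apply the setting from Java than from the console, the same PUT can be built with the JDK's own HTTP client (Java 11 or later). This is only a sketch: the base URL http://localhost:9200 is an assumption, the index name demo_scroll is taken from the response above, and the class and method names are invented for illustration. The example builds the request but does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: builds the PUT <index>/_settings request with the JDK HTTP client.
// Class and method names are illustrative; the base URL is an assumption.
class MaxResultWindowRequest {

    // JSON body that sets index.max_result_window to the given value.
    static String settingsBody(int maxResultWindow) {
        return "{\"index.max_result_window\": " + maxResultWindow + "}";
    }

    // Builds (but does not send) the settings update request.
    static HttpRequest build(String baseUrl, String index, int maxResultWindow) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(settingsBody(maxResultWindow)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:9200", "demo_scroll", 1_000_000);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(settingsBody(1_000_000));
    }
}
```

To actually apply the change, pass the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and check the response status.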