1. Elasticsearch reports an index.max_result_window error
If you really need deep pagination, raise max_result_window on the index:
curl -XPUT "http://localhost:9200/my_index/_settings" -H 'Content-Type: application/json' -d '{ "index" : { "max_result_window" : 500000 } }'
On success it returns: {"acknowledged": true}
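For context, Elasticsearch rejects a search when from + size exceeds index.max_result_window, whose default is 10000. Below is a minimal sketch of that check and of the settings body sent by the curl call, using only the standard library. The function names are hypothetical helpers for illustration, not part of any Elasticsearch client:

```python
import json

DEFAULT_MAX_RESULT_WINDOW = 10000  # Elasticsearch default for index.max_result_window


def exceeds_result_window(from_, size, window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size page reaches past the given window."""
    return from_ + size > window


def settings_body(window):
    """Build the JSON body for PUT /<index>/_settings (same shape as the curl call)."""
    return json.dumps({"index": {"max_result_window": window}})


print(exceeds_result_window(9990, 10))          # last page that still fits
print(exceeds_result_window(9991, 10))          # one hit too deep
print(exceeds_result_window(9991, 10, 500000))  # fits once the window is raised
print(settings_body(500000))
```

A larger window makes each deep page more expensive for the cluster, so for bulk exports a scroll is usually the safer choice.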
2. Elasticsearch timeouts:
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))
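This means the request did not finish within the allowed time (the default is timeout=10 seconds).
In elasticsearch-py the per-request client timeout is request_timeout, so this call allows up to 30 seconds:
res = es.index(index="test-index", doc_type='tweet', id=1, body=doc, request_timeout=30)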
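If a single longer timeout is not enough, one option is to retry with progressively larger timeouts. The sketch below is a hypothetical helper, not part of elasticsearch-py. It assumes the wrapped call accepts a request_timeout keyword, and it uses a fake function in place of a real cluster so it runs on its own:

```python
import time


def call_with_timeouts(request, timeouts=(10, 30, 60), retry_on=(TimeoutError,), pause=0.0):
    """Call request(request_timeout=t) for each t in turn until one succeeds."""
    if not timeouts:
        raise ValueError("timeouts must contain at least one value")
    last_error = None
    for t in timeouts:
        try:
            return request(request_timeout=t)
        except retry_on as exc:
            last_error = exc   # remember the failure and try the next timeout
            time.sleep(pause)
    raise last_error


# Stand-in for a slow Elasticsearch call: it "needs" 20 seconds to finish.
def fake_index(request_timeout):
    if request_timeout < 20:
        raise TimeoutError("read timeout=%s" % request_timeout)
    return {"result": "created", "used_timeout": request_timeout}


print(call_with_timeouts(fake_index))
```

With a real client, the wrapped callable would forward the value to whatever per-request timeout keyword your client version accepts, and retry_on would be set to the client's own timeout exception, such as the ConnectionTimeout shown in the traceback above.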
3. scan and scroll
result = helpers.scan(client=es, index=IndexName, query=bodyDic, scroll="3m", size=1000, preserve_order=False)
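- Parameter size: controls the number of results per shard, not per request. With size set to 10 and 5 shards hit, each scroll request returns at most 50 results. This describes the scan search type of older Elasticsearch releases; newer versions, which dropped scan, generally apply size to the whole request.
- scroll="3m": keeps the search context alive for 3 minutes.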
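The per-shard arithmetic for size can be written out as a tiny sketch. The function name is hypothetical, and the shard count of 5 for the second call is an assumption for illustration only:

```python
def max_hits_per_scroll_request(size, shards_hit):
    """Upper bound on hits per scroll request when size applies to each shard."""
    return size * shards_hit


# size=10 across 5 shards, as in the description of size
print(max_hits_per_scroll_request(10, 5))
# size=1000 as in the helpers.scan call, assuming the index has 5 shards
print(max_hits_per_scroll_request(1000, 5))
```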