ES Series: Cluster, Index, and Search Configuration Optimization

Article link: https://blog.csdn.net/weixin_35940511/article/details/112106097

Cluster optimization

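Raise the maximum number of open files on the server (check the current limit with ulimit -a).
Set the startup heap size: edit bin/elasticsearch and add ES_HEAP_SIZE=4g (the heap must not exceed 32 GB).
Disable swapping of physical memory: set bootstrap.memory_lock: true in config/elasticsearch.yml.
Disable monitoring by setting marvel.agent.enabled to false (monitoring is CPU-intensive).
Configure the write and read thread pools in elasticsearch.yml: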

#---------------------------------thread pool-----------------------------------
# Use one prefix consistently: threadpool.* on Elasticsearch 2.x, thread_pool.* on 5.x and later.
threadpool.index.type: fixed
threadpool.index.size: 500
threadpool.index.queue_size: 2000
threadpool.bulk.type: fixed
threadpool.bulk.size: 100
threadpool.bulk.queue_size: 500

Give each node a single job: configure nodes as dedicated master nodes or dedicated data nodes. Client nodes can be configured as well.
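The first two items of the checklist above (open-file limit and heap size) can be checked from a script before a node is started. The following is a minimal sketch, not part of the original article: the function names are hypothetical, the `resource` module is available on Unix-like systems only, and the 32 GB ceiling is taken directly from the list above.

```python
import re
import resource  # standard library, Unix-like systems only

# The 32 GB ceiling stated in the checklist above.
MAX_HEAP_BYTES = 32 * 1024**3


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process.

    This is the same information that `ulimit -a` reports as "open files".
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def heap_bytes(size):
    """Convert a heap string such as '4g' or '512m' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmg])", size.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized heap size: {size!r}")
    number, unit = match.groups()
    return int(number) * 1024 ** {"k": 1, "m": 2, "g": 3}[unit]


def heap_within_limit(size):
    """Return True when the heap size does not exceed the 32 GB ceiling."""
    return heap_bytes(size) <= MAX_HEAP_BYTES


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open files: soft={soft} hard={hard}")
    print("ES_HEAP_SIZE=4g within limit:", heap_within_limit("4g"))
```

Run it as the same user that starts Elasticsearch, since open-file limits are applied per user and per process.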

Index optimization

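Adjust the number of shards and replicas; neither too many nor too few is appropriate (index.number_of_shards).
Periodically merge index segments with the _forcemerge API.
Remove documents that are marked as deleted, either by merging down to a single segment or by expunging deletes only:
curl -XPOST 'localhost:9200/uploaddata/_forcemerge?max_num_segments=1'
curl -XPOST 'localhost:9200/uploaddata/_forcemerge?only_expunge_deletes=true'
Choose the storage compression codec to balance speed against disk space (index.codec).
Set the refresh interval with index.refresh_interval; a longer interval speeds up indexing.
Set the translog policy with index.translog.durability to reduce how often data is flushed to disk. If some data loss is tolerable, async mode can be enabled.
Set how long shard reallocation is delayed after a node goes down with index.unassigned.node_left.delayed_timeout.
Set the number of background merge threads with index.merge.scheduler.max_thread_count.
Limit the number of shards per machine with index.routing.allocation.total_shards_per_node. (Note: do not set it to exactly (pri_shard_num + rep_shard_num) / data_node_num, because that leaves no room to reassign shards when a node fails.)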
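Several of the settings above can be grouped into a single request body, and the warning about total_shards_per_node comes down to simple arithmetic. The sketch below is illustrative only: the function names, the default values ("30s", "5m"), and the rule of leaving headroom for the loss of one node are assumptions, not recommendations from the original article. The body is meant for the index settings update endpoint (PUT /uploaddata/_settings); nothing is sent here.

```python
import json
import math


def tuning_settings(refresh_interval="30s", delayed_timeout="5m", async_translog=False):
    """Build a settings body from the index settings named above.

    The default values are placeholders chosen for illustration.
    """
    settings = {
        "index.refresh_interval": refresh_interval,
        "index.unassigned.node_left.delayed_timeout": delayed_timeout,
    }
    if async_translog:
        # Only acceptable when some data loss can be tolerated.
        settings["index.translog.durability"] = "async"
    return {"settings": settings}


def even_spread(primaries, replicas_per_primary, data_nodes):
    """Shards per node if all shard copies were spread perfectly evenly.

    This is the value the note above warns against using as the limit.
    """
    total = primaries * (1 + replicas_per_primary)
    return math.ceil(total / data_nodes)


def shards_per_node_with_headroom(primaries, replicas_per_primary, data_nodes):
    """Assumed rule: size the limit so all shards still fit after losing one node."""
    if data_nodes < 2:
        raise ValueError("need at least two data nodes to leave headroom")
    total = primaries * (1 + replicas_per_primary)
    return math.ceil(total / (data_nodes - 1))


if __name__ == "__main__":
    print(json.dumps(tuning_settings(async_translog=True), indent=2))
    # 5 primaries with 1 replica each on 5 data nodes: 10 shard copies in total.
    print("even spread:", even_spread(5, 1, 5))
    print("with headroom:", shards_per_node_with_headroom(5, 1, 5))
```

With 5 primaries, 1 replica each, and 5 data nodes, the even spread is 2 shards per node. A limit of 2 would leave the shards of a failed node with nowhere to go, while a limit of 3 lets the remaining 4 nodes hold all 10 shard copies.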
For frequent top-N queries, the index can be sorted by a chosen field so that queries avoid scanning all the data: