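Run GET /_cluster/allocation/explain
to view the allocation details of a shard (with no request body, Elasticsearch explains an arbitrary unassigned shard). Example decider output: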
"deciders": [{
"decider": "throttling",
"decision": "THROTTLE",
"explanation": "reached the limit of outgoing shard recoveries [2] on the node [6gauyszJRDKhUo7clOCNsw] which holds the primary, cluster setting [cluster.routing.allocation.node_concurrent_outgoing_recoveries=2] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
}]
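If a decider returns "throttling", it usually means the node has reached its limit of concurrent recoveries. If cluster resource utilization is low, you can raise the recovery concurrency settings to speed up shard allocation. If utilization is already high, consider lowering them instead.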
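Related settings (value, then meaning):
cluster.routing.allocation.node_initial_primaries_recoveries : 2, number of unassigned primary shards a node recovers in parallel from local disk (the Elasticsearch default is 4)
cluster.routing.allocation.cluster_concurrent_rebalance : 2, number of concurrent shard rebalances allowed across the whole cluster
cluster.routing.allocation.node_concurrent_recoveries : 2, shortcut that sets both node_concurrent_incoming_recoveries and node_concurrent_outgoing_recoveries
cluster.routing.allocation.node_concurrent_incoming_recoveries : 2, number of concurrent incoming recoveries on a node, i.e. recoveries where the node holds the target shard
cluster.routing.allocation.node_concurrent_outgoing_recoveries : 2, number of concurrent outgoing recoveries on a node, i.e. recoveries where the node holds the source shard
indices.recovery.max_bytes_per_sec : 40mb, per-node limit on recovery traffic (bandwidth)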
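To check an allocation explain response for throttling without reading it by hand, the decider entries can be scanned programmatically. The following is a minimal sketch in Python, not an official client routine. It assumes the response has already been parsed into Python objects, and it walks the structure recursively so it does not depend on where the "deciders" array sits. The function name find_throttled and the second sample entry are hypothetical, and the explanation text is shortened.

```python
import json
from typing import Any, Iterator


def find_throttled(node: Any) -> Iterator[dict]:
    """Yield every decider entry whose decision is THROTTLE, at any depth."""
    if isinstance(node, dict):
        if node.get("decision") == "THROTTLE" and "decider" in node:
            yield node
        for value in node.values():
            yield from find_throttled(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_throttled(item)


# Fragment shaped like the decider output shown above (explanation shortened).
# The second entry is a made-up example of a non-throttling decider.
sample = json.loads("""
{
  "deciders": [
    {
      "decider": "throttling",
      "decision": "THROTTLE",
      "explanation": "reached the limit of outgoing shard recoveries [2] ..."
    },
    {
      "decider": "same_shard",
      "decision": "NO",
      "explanation": "hypothetical entry for contrast"
    }
  ]
}
""")

throttled = list(find_throttled(sample))
for entry in throttled:
    print(entry["decider"], "->", entry["explanation"])
```

An empty result means no decider reported THROTTLE, so the recovery concurrency limits are probably not what is delaying allocation.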
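Solution
Adjust the settings as needed. Raising the initial primary recovery count is recommended. Raising the rebalance count is generally not recommended, because it can affect business reads and writes. For the remaining recovery concurrency settings, a value less than or equal to the number of CPU cores on a single node is generally recommended.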
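The CPU-core guideline above can be expressed as a small helper. This is an illustrative sketch only: the function names capped_concurrency and recovery_settings are hypothetical, and the core count must be supplied by the caller (for example from the node's hardware specification), since nothing here queries the cluster.

```python
# Settings covered by the "no more than CPU cores per node" guideline.
RECOVERY_KEYS = (
    "cluster.routing.allocation.node_concurrent_recoveries",
    "cluster.routing.allocation.node_concurrent_incoming_recoveries",
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries",
)


def capped_concurrency(requested: int, cpu_cores_per_node: int) -> int:
    """Clamp a requested concurrency to the range [1, cpu_cores_per_node]."""
    if cpu_cores_per_node < 1:
        raise ValueError("cpu_cores_per_node must be at least 1")
    return max(1, min(requested, cpu_cores_per_node))


def recovery_settings(requested: int, cpu_cores_per_node: int) -> dict:
    """Build a flat settings map with the capped value for each recovery key."""
    value = capped_concurrency(requested, cpu_cores_per_node)
    return {key: value for key in RECOVERY_KEYS}


# Asking for 8 on a 4-core node yields 4 for every key.
print(recovery_settings(requested=8, cpu_cores_per_node=4))
```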
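Persistent settings: these remain in effect after they are set, including across a full cluster restart. They are stored in the cluster state (not in elasticsearch.yml) and are reapplied when the cluster starts.
Transient settings: these are temporary and are lost on a full cluster restart. After the restart each setting falls back to its persistent value, its elasticsearch.yml value, or its default, in that order.
If the concurrency only needs to be changed temporarily, set only the transient block. The request below sets both: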
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": 8,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": 8,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 8,
    "indices.recovery.max_bytes_per_sec": "60mb"
  },
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_incoming_recoveries": 8,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8,
    "cluster.routing.allocation.node_concurrent_outgoing_recoveries": 8,
    "cluster.routing.allocation.cluster_concurrent_rebalance": 8,
    "indices.recovery.max_bytes_per_sec": "60mb"
  }
}
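For a temporary change, the request body can contain only the transient block, and a setting can later be returned to its fallback value by assigning it null. The sketch below builds both bodies in Python and prints them as JSON suitable for PUT _cluster/settings. It does not send anything to a cluster. The helper names settings_body and reset_body are hypothetical, and the values are copied from the request above rather than being recommendations.

```python
import json

# Values taken from the example request above.
TEMPORARY_SETTINGS = {
    "cluster.routing.allocation.node_concurrent_recoveries": 8,
    "cluster.routing.allocation.node_initial_primaries_recoveries": 8,
    "indices.recovery.max_bytes_per_sec": "60mb",
}


def settings_body(scope: str, settings: dict) -> dict:
    """Wrap a flat settings map in a single scope block."""
    if scope not in ("persistent", "transient"):
        raise ValueError("scope must be 'persistent' or 'transient'")
    return {scope: dict(settings)}


def reset_body(scope: str, keys) -> dict:
    """Build a body that assigns null to each key, which resets the setting."""
    if scope not in ("persistent", "transient"):
        raise ValueError("scope must be 'persistent' or 'transient'")
    return {scope: {key: None for key in keys}}


temporary = settings_body("transient", TEMPORARY_SETTINGS)
undo = reset_body("transient", TEMPORARY_SETTINGS)

print(json.dumps(temporary, indent=2))  # body for the temporary change
print(json.dumps(undo, indent=2))       # body that removes it again
```

Python's None is serialized as JSON null, so the second body clears the transient overrides once the recovery has finished instead of waiting for a full cluster restart.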