Fixing Unassigned Shards that cannot be rerouted in an Elasticsearch cluster

Latest recommended article published on 2024-09-27 11:08:05

冬天里的懒猫

Category: ElasticSearch. Tags: elasticsearch, java, es, linux, experience sharing

Article link: https://blog.csdn.net/dhaibo1986/article/details/107629908

1. Background & problem description

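This post continues the previous article: https://blog.csdn.net/dhaibo1986/article/details/107564968
In that article, a system crash left a large number of indices with shards in the Unassigned state. We used the reroute API there, and the indices whose primary shards were missing got their primaries allocated. In the next step, however, reroute failed for the replica shards!
For example, in the index alarm-2017.08.12 the replica of shard 0 has not been allocated.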
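One way to find shards in this state is the `_cat/shards` API, for example `GET /_cat/shards?h=index,shard,prirep,state,unassigned.reason`. The sketch below only parses the text such a request returns; it does not talk to a cluster. The function name `find_unassigned` and the sample rows are hypothetical and made up for illustration (only the index name and the ALLOCATION_FAILED reason come from this article).

```python
def find_unassigned(cat_shards_text):
    """Return (index, shard, kind, reason) tuples for rows in state UNASSIGNED.

    Expects the column order index, shard, prirep, state, unassigned.reason.
    """
    result = []
    for line in cat_shards_text.splitlines():
        parts = line.split()
        # Skip blank lines, a header row, and shards that are assigned.
        if len(parts) < 4 or parts[3] != "UNASSIGNED":
            continue
        kind = "primary" if parts[2] == "p" else "replica"
        # The reason column is empty for assigned shards, so it may be missing.
        reason = parts[4] if len(parts) > 4 else ""
        result.append((parts[0], int(parts[1]), kind, reason))
    return result


# Hypothetical sample output, not captured from the cluster in this article.
SAMPLE = """\
alarm-2017.08.12 0 p STARTED
alarm-2017.08.12 0 r UNASSIGNED ALLOCATION_FAILED
alarm-2017.08.12 1 p STARTED
"""

for row in find_unassigned(SAMPLE):
    print(row)
```

Filtering on the state column rather than on the reason keeps the helper usable for shards that are unassigned for other reasons as well.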

Execute the following request:

  POST /_cluster/reroute
  {
    "commands": [
      {
        "allocate_replica": {
          "index": "alarm-2017.08.12",
          "shard": 0,
          "node": "node4-1"
        }
      }
    ]
  }
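For scripting the same request outside the console, here is a minimal sketch using only the Python standard library. It builds the request object but does not send it. The helper name `build_allocate_replica_request` and the base URL `http://localhost:9200` are assumptions; the article does not show the cluster's HTTP address.

```python
import json
import urllib.request


def build_allocate_replica_request(base_url, index, shard, node):
    """Build (but do not send) a POST /_cluster/reroute request
    carrying a single allocate_replica command."""
    body = {
        "commands": [
            {"allocate_replica": {"index": index, "shard": shard, "node": node}}
        ]
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/reroute",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed HTTP endpoint; replace it with the address of a real node.
req = build_allocate_replica_request(
    "http://localhost:9200", "alarm-2017.08.12", 0, "node4-1"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it; an error response like the one in this article arrives as an `urllib.error.HTTPError` whose body holds the JSON error.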

The request fails!

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[node3-2][192.168.21.88:9301][cluster:admin/reroute]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "[allocate_replica] allocation of [alarm-2017.08.12][0] on node {node4-1}{u47KtJGgQw60T_xm9hmepw}{UbaCHI4KRveQeTAnJvGFEQ}{192.168.21.89}{192.168.21.89:9301}{rack=r4, ml.enabled=true} is not allowed, reason: [NO(shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2017-08-16T00:54:47.088Z], failed_attempts[5], delayed=false, details[failed recovery, failure RecoveryFailedException[[alarm-2017.08.12][0]: Recovery failed from {node8}{Bpd3y--EQsag1u1NTmtZfA}{4T_McpmjSXqLowRoXztssQ}{192.168.21.89}{192.168.21.89:9301}{rack=r4} into {node5}{i4oG4VcaSdKVeNEvStXwAw}{w4nAITEOR9u7liR55qDsVA}{192.168.21.88}{192.168.21.88:9300}{rack=r3}]; nested: RemoteTransportException[[node8][192.168.21.89:9301][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [0] files with total size of [0b]]; nested: FileSystemException[/opt/elasticsearch/elasticsearch-node8/data/nodes/0/indices/FgLdgYTmTfazlP8i5K0Knw/0/index: Too many open files in system]; ], allocation_status[no_attempt]]])][YES(primary shard for this replica is already active)][YES(explicitly ignoring any disabling of allocation due to manual allocation commands via the reroute API)][YES(target node version [5.5.1] is the same or newer than source node version [5.5.1])][YES(the shard is not being snapshotted)][YES(node passes include/ex

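The error reason is long, but two parts stand out. The shard has used up its allocation retries (`failed_attempts[5]` against a maximum of `[5]`), and the message itself suggests calling `/_cluster/reroute?retry_failed=true`. The nested cause of the earlier failures is `Too many open files in system`, so a retry will probably only succeed once that limit has been dealt with. The sketch below pulls those facts out of a reason string and builds the suggested retry request without sending it. The helper names, the shortened excerpt and the base URL `http://localhost:9200` are assumptions for illustration, and this is not necessarily how the rest of the article resolves the problem.

```python
import re
import urllib.request


def parse_allocation_failure(reason):
    """Extract a few facts from an allocation failure reason string."""
    attempts = re.search(r"failed_attempts\[(\d+)\]", reason)
    limit = re.search(r"maximum number of retries \[(\d+)\]", reason)
    return {
        "failed_attempts": int(attempts.group(1)) if attempts else None,
        "max_retries": int(limit.group(1)) if limit else None,
        "retry_hint": "retry_failed=true" in reason,
        "too_many_open_files": "Too many open files" in reason,
    }


def build_retry_failed_request(base_url):
    """Build (but do not send) POST /_cluster/reroute?retry_failed=true."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/reroute?retry_failed=true",
        method="POST",
    )


# Shortened excerpt of the reason above; "..." marks text left out here.
REASON_EXCERPT = (
    "NO(shard has exceeded the maximum number of retries [5] on failed "
    "allocation attempts - manually call [/_cluster/reroute?retry_failed=true] "
    "to retry, [unassigned_info[[reason=ALLOCATION_FAILED], "
    "at[2017-08-16T00:54:47.088Z], failed_attempts[5], delayed=false, "
    "details[failed recovery, ... Too many open files in system]"
)

info = parse_allocation_failure(REASON_EXCERPT)
print(info)

# Assumed HTTP endpoint; replace it with the address of a real node.
retry = build_retry_failed_request("http://localhost:9200")
print(retry.get_method(), retry.full_url)
```

Checking `too_many_open_files` before retrying is a deliberate choice: if the file descriptor limit is still in place, the retried recovery is likely to fail the same way and use up the retry budget again.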