最近遇到了一个特殊的情况,我们所使用的一个Elasticsearch集群的数据节点磁盘空间耗尽(out of space),啥事会发生呢? 数据损坏,集群RED。下面是相关的日志信息,我将其中一些关键点日志信息重点小时。这里 ES-Data_IN_11是当时的Master节点,ES-Data_IN_12是出现的磁盘耗尽的数据节点,出事儿的index名字为raw_v3.2017_03_22,我们仍然使用的是Elasticsearch 1.7.2。
[2017-03-22 11:57:30,503][WARN ][index.merge.scheduler ] [ES-Data_IN_12] [raw_v3.2017_03_22][1] failed to merge
[2017-03-22 11:57:30,646][WARN ][index.engine ] [ES-Data_IN_12] [raw_v3.2017_03_22][1] failed engine [merge exception]
[2017-03-22 11:57:30,663][WARN ][indices.cluster ] [ES-Data_IN_12] [[raw_v3.2017_03_22][1]] marking and sending shard failed due to [engine failure, reason [merge exception]]
[2017-03-22 11:57:31,677][WARN ][cluster.action.shard ] [ES-Data_IN_11] [raw_v3.2017_03_22][1] received shard failed for [raw_v3.2017_03_22][1], node[h9d1tKJFRtmg1aG-VMkA4A], [P], s[STARTED], indexUUID [2Z-UtAr8Qx2h6Xe9BkxvAA], reason [shard failure [engine failure, reason [merge exception