While testing Elasticsearch, I ran into a translog corruption exception. This post records how I repaired it.
1. Problem
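A single node held more than 800 million records in one index with 20+ fields. Data was being written continuously through bulk requests (bulk.size = 50,000) when the machine unexpectedly lost power and went down.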
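For context, a write load of this shape can be sketched roughly as below. This is only an illustration, not the loader actually used in the test: the helper names `chunked` and `bulk_body` and the document type `record` are hypothetical, and only the index name `ips` (seen in the log) and the batch size of 50,000 come from this post. The sketch builds the newline-delimited payloads that the bulk API expects and leaves out the HTTP transport.

```python
import json
from itertools import islice

BULK_SIZE = 50_000  # documents per bulk request, matching the setup described above


def chunked(docs, size=BULK_SIZE):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_body(index, doc_type, batch):
    """Build the newline-delimited JSON payload for one _bulk request.

    `doc_type` is a hypothetical type name; ES 1.x required one per document.
    """
    lines = []
    for doc in batch:
        # Action line first, then the document source on its own line.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk payload must end with a newline
```

Each payload would then be POSTed to the node's `_bulk` endpoint. With batches this large, many operations can still be in flight or only partially recorded when power is cut.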
After the machine was repaired and ES was restarted, a TranslogCorruptedException appeared:
[2015-01-06 16:12:34,061][WARN ][indices.cluster ] [node_141] [ips][4] failed to start shard
org.elasticsearch.index.gateway.IndexShardGatewayRecoveryException: [ips][4] failed to recover shard
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:287)
at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:132)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.translog.TranslogCorruptedException: translog corruption while reading from stream
at org.elasticsearch.index.translog.ChecksummedTranslogStream.read(ChecksummedTranslogStream.java:70)
at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:257)
... 4 more
Caused by: java.io.EOFException
at org.elasticsearch.common.io.stream.InputStreamStreamInput.readBytes(InputStreamStreamInput.java:53)
at org.elasticsearch.index.translog.BufferedChecksumStreamInput.readBytes(BufferedChecksumStreamInput.java:55)
at org.elasticsearch.common.io.stream.StreamInput.readBytesReference(StreamInput.java:86)
at org.elasticsearch.common.io.stream.StreamInput.readBytesReference(StreamInput.java:74)
at org.elasticsearch.index.translog.Translog$Create.readFrom(Tran