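2016 is about to slip away in all the busyness, so I am using this post for a short review. Every year is busy, and this one was especially so, ha! With the year ending there is plenty to sum up. For example, I started reading a technical book, "Modern Authentication with Azure Active Directory for Web Applications", but unfortunately have not finished it yet; next year I need to push harder, spend more time reading and less time on my phone! Also, since moving to the 28th floor I have taken up a new popular sport, billiards (桌球): stay fit and work well.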
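By default, the Elasticsearch Azure plugin can write cluster snapshot data to, and read it from, only a single Azure storage account. The snapshot data of each index is stored as block blobs in that account's Blob storage; I analyzed the relevant code in another post, "Which kind of Azure blob does the Elasticsearch-cloud-azure plugin use?" (《Elasticsearch-cloud-azure插件使用哪种Azure blob?》). For a large Elasticsearch cluster (for example, terabytes of data and more than 30 data nodes), this limitation can overload the single storage account, so that snapshots either fail outright or finish in the PARTIAL state. When that happens, IndexShardSnapshotFailedException shows up in the Elasticsearch log file or in the snapshot status information, as in the following examples: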
"state": "PARTIAL",
"start_time": "2016-09-21T00:10:09.180Z",
"start_time_in_millis": 1474416609180,
"end_time": "2016-09-21T02:14:36.642Z",
"end_time_in_millis": 1474424076642,
"duration_in_millis": 7467462,
"failures": [
{
"node_id": "SVT4jVpiTVmiH8K7ctWrOQ",
"index": "my_index_20160913d",
"reason": "IndexShardSnapshotFailedException[[my_index_20160913d][1] Failed to perform snapshot (index files)]; nested: IOException; nested: StorageException[The server encountered an unknown failure: ]; nested: IOException[Error writing to server]; ",
"shard_id": 1,
"status": "INTERNAL_SERVER_ERROR"
},
...
]
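To see at a glance which shards were affected, the snapshot status shown above can be summarized programmatically. The following is only a minimal sketch: the function name summarize_snapshot_failures is hypothetical, and it assumes the snapshot information has already been fetched and parsed into a dict with the "state" and "failures" fields seen in the fragment above. The sample data is an abbreviated copy of that fragment, with the "reason" text shortened.

```python
from collections import defaultdict


def summarize_snapshot_failures(snapshot_info):
    """Return (state, {index name: [failed shard ids]}) for one snapshot.

    `snapshot_info` is assumed to be a dict shaped like the status fragment
    above; this helper is illustrative, not part of any Elasticsearch client.
    """
    failed_shards = defaultdict(list)
    for failure in snapshot_info.get("failures", []):
        failed_shards[failure["index"]].append(failure["shard_id"])
    return snapshot_info.get("state"), dict(failed_shards)


# Abbreviated sample taken from the status fragment above.
sample = {
    "state": "PARTIAL",
    "failures": [
        {
            "node_id": "SVT4jVpiTVmiH8K7ctWrOQ",
            "index": "my_index_20160913d",
            "reason": "IndexShardSnapshotFailedException[...]",  # shortened
            "shard_id": 1,
            "status": "INTERNAL_SERVER_ERROR",
        }
    ],
}

state, failed = summarize_snapshot_failures(sample)
print(state, failed)
```

Grouping by index makes it easier to tell whether the failures cluster on a few large indices or are spread across the whole snapshot, which is what one would expect when the single storage account is overloaded.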
In the Elasticsearch log file, the same kind of failure looks like this:
[2016-07-22 01:27:36,168][WARN ][snapshots ] [ESNode-ElasticSearchData_IN_53] [[myindex.2016_07_10][4]] [snapshot:001008] failed to create snapshot
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: [myindex.2016_07_10][4] Failed to perform snapshot (index files)
at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshot(BlobStoreIndexShardRepository.java:509)
at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.snapshot(BlobStoreIndexShardRepository.java:140)
at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.snapshot(IndexShardSnapshotAndRestoreService.java:85)
at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:871)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException
at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Caused by: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure:
at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:101)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:199)
at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlo