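My ES cluster status had been yellow for ages. I ignored it at first, but the more I looked the stranger it seemed: three nodes, 5 primary shards and 5 replica shards, yet node-3 never held a single shard. The primaries were on node-1 and node-2, and the replicas were on...
Oh. The replicas were all UNASSIGNED. I only just noticed...
So, time for a reroute.
First, cat the shard status: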
>>> curl -XGET 'http://localhost:9200/_cat/shards'
my_index 4 p STARTED 104917 157.2mb 127.0.0.1 node-1
my_index 4 r UNASSIGNED
my_index 3 p STARTED 104892 156.7mb 127.0.0.1 node-1
my_index 3 r UNASSIGNED
my_index 2 p STARTED 104714 155.6mb 127.0.0.1 node-1
my_index 2 r UNASSIGNED
my_index 1 p STARTED 104874 156.5mb 127.0.0.1 node-2
my_index 1 r UNASSIGNED
my_index 0 p STARTED 105933 156.5mb 127.0.0.1 node-1
my_index 0 r UNASSIGNED
All 5 replica (r) shards, 0 through 4, are down (fine, you could already see that in the head plugin...).
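With more indices the listing gets long, so filtering on the state column is quicker than reading every row. A minimal sketch, assuming the column order shown above (index, shard, prirep, state, ...); the function name `unassigned_shards` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: read `_cat/shards` output on stdin and print
# "index shard prirep" for every row whose state column is UNASSIGNED.
# Assumes the column order shown above: index shard prirep state ...
unassigned_shards() {
    awk '$4 == "UNASSIGNED" { print $1, $2, $3 }'
}

# Usage (needs a reachable cluster):
#   curl -s 'http://localhost:9200/_cat/shards' | unassigned_shards
```

Matching on the fourth column rather than grepping the whole line avoids false hits if an index name happens to contain the word UNASSIGNED.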
Next, do the reroute:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "my_index",
"shard" : <shard number goes here>,
"node" : "<node name goes here>",
"allow_primary" : true
}
}
]
}'
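Typing that body by hand for every shard gets old fast, so it can help to generate it. A rough sketch, not the only way to do it: `allocate_body` is a hypothetical helper name, and it keeps the same `allow_primary` value as the command above.

```shell
#!/bin/sh
# Hypothetical helper: print the JSON body for a single "allocate" command.
# Arguments: $1 = index name, $2 = shard number, $3 = node name.
allocate_body() {
    printf '{"commands":[{"allocate":{"index":"%s","shard":%s,"node":"%s","allow_primary":true}}]}\n' "$1" "$2" "$3"
}

# Usage (dry_run asks the cluster to simulate the command without applying it):
#   curl -XPOST 'http://localhost:9200/_cluster/reroute?dry_run' -d "$(allocate_body my_index 0 node-3)"
```

Running with `dry_run` first is a cheap way to see whether the allocation would be accepted before actually changing anything.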
To make life easier, write a script to do it:
#!/bin/sh
# Allocate every UNASSIGNED shard to node-3 through the reroute API.
ES='http://localhost:9200'
# List unique "index shard" pairs whose state column is UNASSIGNED.
curl -s "$ES/_cat/shards" | awk '$4 == "UNASSIGNED" {print $1, $2}' | sort -u |
while read -r index shard; do
    echo "$index $shard"
    # The body must be valid JSON: double-quoted keys and a quoted index name.
    curl -XPOST "$ES/_cluster/reroute" -d "{\"commands\":[{\"allocate\":{\"index\":\"$index\",\"shard\":$shard,\"node\":\"node-3\",\"allow_primary\":true}}]}"
    sleep 5
done
OK, let's try it... and of course it errors out:
"type": "illegal_argument_exception",
"reason": "[allocate] allocation of [my_index][0] on node {node-3}{rjG_j423SpejhzmAAUCcqA}{127.0.0.1}{127.0.0.1:9320} is not allowed, reason: [NO(more than allowed [85.0%] used disk on node, free: [6.234234497791846%])][YES(node passes include/exclude/require filters)][YES(allocation disabling is ignored)][YES(shard is not allocated to same node or host)][YES(target node version [2.4.4] is same or newer than source node version [2.4.4])][YES(shard not primary or relocation disabled)][YES(below shard recovery limit of [2])][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(primary is already active)][YES(allocation disabling is ignored)][YES(no allocation awareness enabled)]"
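Well... my eyes glazed over. I went straight to searching for why a reroute can fail and spotted an explanation in this blog post: http://blog.csdn.net/xiangcheng001/article/details/51133364
Ah, I did seem to see an 85 somewhere in the error. Looking back at it, there is indeed a matching entry: NO(more than allowed [85.0%] used disk on node, free: [6.234234497791846%]). That is the disk-based allocation threshold (cluster.routing.allocation.disk.watermark.low, 85% by default). With only about 6% of its disk free, node-3 was far over it, so ES refused to put any shard there.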
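For next time, the one decider that said NO can be pulled out of that wall of text mechanically. A small sketch, assuming the reason string keeps the `[NO(...)][YES(...)]` layout seen in the error above; `failed_deciders` is a name invented here:

```shell
#!/bin/sh
# Hypothetical helper: read a reroute error message on stdin and print only
# the allocation decisions that start with "NO(".
# Assumes decisions are chained as [NO(...)][YES(...)]... as in the error above.
failed_deciders() {
    awk '{
        n = split($0, parts, /\]\[/)          # decisions are separated by "]["
        for (i = 1; i <= n; i++) {
            p = parts[i]
            if (p ~ /NO\(/) {
                sub(/^.*NO\(/, "NO(", p)      # drop everything before the decision
                sub(/\)\][^)]*$/, ")", p)     # drop the closing bracket/quote, if any
                print p
            }
        }
    }'
}

# Usage: paste or pipe the error response in, for example
#   curl -s -XPOST 'http://localhost:9200/_cluster/reroute' -d "$body" | failed_deciders
```

In the usage line, `$body` stands for whatever reroute request body is being sent.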
Check with df -h.
Whoa, when did it get this full?
Well then, goodbye, my beloved movies, big movies and little movies alike.
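To catch this before ES does, the same check can be scripted against the 85% figure from the error message. A minimal sketch, assuming POSIX `df -P` output; `over_limit` is a hypothetical name, and the percentage `df` reports may differ slightly from the one ES computes for its own data path.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print every mount
# point whose used percentage is above the given limit (default 85).
over_limit() {
    limit="${1:-85}"
    awk -v limit="$limit" 'NR > 1 {
        used = $5
        sub(/%/, "", used)                     # "94%" -> "94"
        if (used + 0 > limit + 0) print $6, used "%"
    }'
}

# Usage:
#   df -P | over_limit 85
```

No output means every mount is at or under the limit. Mount points containing spaces are not handled by this sketch.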
After deleting them, reroute again:
{
  "acknowledged": true,
  "state": {
    "version": 17,
    "state_uuid": "WrYBhVr5T7aem4uReXaRCA",
    "master_node": "6gDDoI_OS32VjAxmCGTSsg",
    "blocks": { },
    "nodes": {
      "71GSL-osQeaLv-cDU9bygA": {
        "name": "node-2",
        "transport_address": "127.0.0.1:9310",
        "attributes": { }
      },
      "6gDDoI_OS32VjAxmCGTSsg": {
        "name": "node-1",
        "transport_address": "127.0.0.1:9300",
        "attributes": { }
      },
      "rjG_j423SpejhzmAAUCcqA": {
        "name": "node-3",
        "transport_address": "127.0.0.1:9320",
        "attributes": { }
      }
    },
    "routing_table": {
      "indices": {
        "enterprise_data_gov_20170324": {
          "shards": {
            "0": [
              {
                "state": "STARTED",
                "primary": true,
                "node": "6gDDoI_OS32VjAxmCGTSsg",
                "relocating_node": null,
                "shard": 0,
                ...
and so on, blah blah blah.
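Later I found that it had actually repaired itself automatically, no reroute needed, presumably because ES allocates the replicas on its own once node-3's disk usage is back under the threshold.
Good. Everything is glowing green now, and everyone is happy.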
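To confirm the green without opening head, the cluster health endpoint can be polled from the shell. A small sketch, assuming the response is the usual single-line JSON with a "status" field; `health_status` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read a `_cluster/health` JSON response on stdin and
# print just the value of its "status" field (green, yellow or red).
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Usage (needs a reachable cluster):
#   curl -s 'http://localhost:9200/_cluster/health' | health_status
```

A proper JSON parser would be more robust than `sed`; this version only avoids extra dependencies.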