Slow ES inserts: losing an ES node makes real-time data ingestion extremely slow

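One node crashed and could not restart automatically. Data was being ingested through Logstash, and the index receiving that day's data had 0 replicas, so when the node was lost some shards were lost with it and the cluster stayed in the red state. After the node was lost, the ingestion speed for that index dropped sharply.

Testing showed that the cause was Logstash. Its input stage runs in one thread, and filter and output share another thread, with a synchronous queue buffering data between them. If something goes wrong during output, the failed data is put back into the synchronous queue without limit. The queued data is then assigned to shards again for import; data assigned to a lost shard fails again and goes back into the queue again. The data therefore keeps cycling between the synchronous queue and the ES bulk requests, which slows down ingestion for the whole index.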
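The retry loop described above can be sketched with a toy model. This is not Logstash's actual implementation: the function name `simulate`, the batch size, the round-robin shard assignment, and the time costs are all assumptions made for illustration. In particular, charging 60 time units to any batch that contains a failed event is an assumption loosely based on the `Timeout: [1m]` that appears in the log further down.

```python
from collections import deque


def simulate(time_budget, dead_shards, num_shards=8, batch_size=16,
             ok_cost=1, timeout_cost=60):
    """Toy model of an output stage that requeues failed events.

    Returns the number of events indexed within `time_budget` time units.
    All parameters are hypothetical and only illustrate the feedback loop.
    """
    queue = deque()
    next_event = 0   # id of the next fresh event produced by the input stage
    next_shard = 0   # round-robin counter: every attempt gets a new shard
    indexed = 0
    elapsed = 0
    while elapsed < time_budget:
        # The input stage tops the queue up to one full batch.
        while len(queue) < batch_size:
            queue.append(next_event)
            next_event += 1
        batch = [queue.popleft() for _ in range(batch_size)]
        failed = []
        for event in batch:
            shard = next_shard % num_shards
            next_shard += 1
            if shard in dead_shards:
                failed.append(event)   # will be retried on a later batch
            else:
                indexed += 1
        # Failed events go back into the same queue as fresh ones.
        queue.extend(failed)
        # Assumption: a batch with any failure blocks until the timeout.
        elapsed += timeout_cost if failed else ok_cost
    return indexed


print("healthy :", simulate(600, dead_shards=set()))    # 9600
print("degraded:", simulate(600, dead_shards={1, 5}))   # 120
```

In this model only a quarter of each batch targets a missing shard, yet every batch pays the timeout, so throughput for the healthy shards collapses as well. That matches the behaviour observed in the test below, though the real slowdown factor depends on settings not given here.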

The results from a test machine are as follows:

1. Normal ingestion:

xxx-20170925 1 p STARTED 24713 24.7mb xxx.7.67 node-xxx.7.67-performance_test

xxx-20170925 5 p STARTED 24256 33.7mb xxx.7.67 node-xxx.7.67-performance_test

xxx-20170925 2 p STARTED 24702 24.2mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 3 p STARTED 24626 24.2mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 7 p STARTED 24916 34.2mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 4 p STARTED 23970 38.2mb xxx.6.105 node-xxx.6.105-performance_test

xxx-20170925 6 p STARTED 24786 24mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 0 p STARTED 24824 34.4mb xxx.6.105 node-xxx.6.105-performance_test

2. Shut down one node:

xxx-20170925 6 p STARTED 128179 110.8mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 1 p UNASSIGNED

xxx-20170925 4 p STARTED 128263 108.1mb xxx.6.105 node-xxx.6.105-performance_test

xxx-20170925 7 p STARTED 128593 109.3mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 2 p STARTED 128613 112.8mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 5 p UNASSIGNED

xxx-20170925 3 p STARTED 127969 115.6mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 0 p STARTED 128322 110.3mb xxx.6.105 node-xxx.6.105-performance_test

3. Checking the shards again after some time shows that the remaining shards are growing extremely slowly:

xxx-20170925 6 p STARTED 128436 111.1mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 5 p UNASSIGNED

xxx-20170925 3 p STARTED 128231 110.9mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 7 p STARTED 128814 109.6mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 1 p UNASSIGNED

xxx-20170925 2 p STARTED 128871 182.6mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 4 p STARTED 128502 108.5mb xxx.6.105 node-xxx.6.105-performance_test

xxx-20170925 0 p STARTED 128568 109.1mb xxx.6.105 node-xxx.6.105-performance_test
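To put a number on "growing extremely slowly", the two listings from steps 2 and 3 can be compared shard by shard. The sketch below is a minimal example: the helper `parse_cat_shards` is a hypothetical name, and it assumes the column order shown above (index, shard, prirep, state, docs, ...). The elapsed time between the two listings is not stated, so only document-count deltas are computed, not rates.

```python
def parse_cat_shards(text):
    """Map shard number -> doc count for STARTED rows of a shard listing."""
    docs = {}
    for line in text.splitlines():
        fields = line.split()
        # UNASSIGNED rows have no docs column and are skipped.
        if len(fields) >= 5 and fields[3] == "STARTED":
            docs[int(fields[1])] = int(fields[4])
    return docs


# Rows copied from step 2 (right after the node was shut down).
after_shutdown = """
xxx-20170925 6 p STARTED 128179 110.8mb xxx.11.131 node-xxx.11.131-performance_test
xxx-20170925 1 p UNASSIGNED
xxx-20170925 4 p STARTED 128263 108.1mb xxx.6.105 node-xxx.6.105-performance_test
xxx-20170925 7 p STARTED 128593 109.3mb xxx.7.81 node-xxx.7.81-performance_test
xxx-20170925 2 p STARTED 128613 112.8mb xxx.11.131 node-xxx.11.131-performance_test
xxx-20170925 5 p UNASSIGNED
xxx-20170925 3 p STARTED 127969 115.6mb xxx.7.81 node-xxx.7.81-performance_test
xxx-20170925 0 p STARTED 128322 110.3mb xxx.6.105 node-xxx.6.105-performance_test
"""

# Rows copied from step 3 (some time later).
later = """
xxx-20170925 6 p STARTED 128436 111.1mb xxx.11.131 node-xxx.11.131-performance_test
xxx-20170925 5 p UNASSIGNED
xxx-20170925 3 p STARTED 128231 110.9mb xxx.7.81 node-xxx.7.81-performance_test
xxx-20170925 7 p STARTED 128814 109.6mb xxx.7.81 node-xxx.7.81-performance_test
xxx-20170925 1 p UNASSIGNED
xxx-20170925 2 p STARTED 128871 182.6mb xxx.11.131 node-xxx.11.131-performance_test
xxx-20170925 4 p STARTED 128502 108.5mb xxx.6.105 node-xxx.6.105-performance_test
xxx-20170925 0 p STARTED 128568 109.1mb xxx.6.105 node-xxx.6.105-performance_test
"""

before = parse_cat_shards(after_shutdown)
after = parse_cat_shards(later)
growth = {s: after[s] - before[s] for s in sorted(before) if s in after}
total_growth = sum(growth.values())

for shard, delta in growth.items():
    print(f"shard {shard}: +{delta} docs")
print(f"total: +{total_growth} docs across {len(growth)} live shards")
```

Each live shard gained only a few hundred documents between the two listings (1483 in total), whereas between steps 1 and 2 each shard had gained roughly 100,000.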

The Logstash log is as follows:

[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})

[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})

[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][1] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [15] requests]"})

[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})

[2017-11-21T11:04:26,784][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Retrying individual actions

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action

[2017-11-21T11:04:26,784][ERROR][logstash.outputs.elasticsearch] Action
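When there are many such lines, it can help to tally which shards the retries are failing on. The following is a minimal sketch that assumes the message format shown in the log above; `count_unavailable_shards` is a hypothetical helper, not part of Logstash or Elasticsearch.

```python
import re
from collections import Counter

# Matches e.g. "[xxx-20170925][5] primary shard is not active".
UNAVAILABLE = re.compile(
    r"\[(?P<index>[^\]\[]+)\]\[(?P<shard>\d+)\] primary shard is not active"
)


def count_unavailable_shards(log_text):
    """Count 'primary shard is not active' messages per (index, shard)."""
    counts = Counter()
    for match in UNAVAILABLE.finditer(log_text):
        counts[(match.group("index"), int(match.group("shard")))] += 1
    return counts


# The five "retrying failed action" lines copied from the log above.
sample_log = """
[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})
[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})
[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][1] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [15] requests]"})
[2017-11-21T11:04:26,780][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})
[2017-11-21T11:04:26,784][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[xxx-20170925][5] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [xxx-20170925] containing [19] requests]"})
"""

counts = count_unavailable_shards(sample_log)
for (index, shard), n in sorted(counts.items()):
    print(f"{index} shard {shard}: {n} retried bulk shard requests")
```

On the excerpt above, every failure points at shard 1 or shard 5, the two shards listed as UNASSIGNED in steps 2 and 3.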

4. After the data was recovered:

xxx-20170925 4 p STARTED 154764 125.3mb xxx.6.105 node-xxx.6.105-performance_test

xxx-20170925 5 p STARTED 157936 126.4mb xxx.7.67 node-xxx.7.67-performance_test

xxx-20170925 2 p STARTED 154945 138.9mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 7 p STARTED 155224 156.8mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 1 p STARTED 158080 124.8mb xxx.7.67 node-xxx.7.67-performance_test

xxx-20170925 3 p STARTED 154243 153.8mb xxx.7.81 node-xxx.7.81-performance_test

xxx-20170925 6 p STARTED 154909 146.9mb xxx.11.131 node-xxx.11.131-performance_test

xxx-20170925 0 p STARTED 154681 127mb xxx.6.105 node-xxx.6.105-performance_test
