Best Practices for Tuning and Upgrading an ES Cluster

小小无敌无悔

Original link: https://www.cnblogs.com/huangpeng1990/p/5760210.html

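In day-to-day operations we often tune parameters on an ES cluster or upgrade its version. Anyone who has shut a node down and brought it back up knows how painful the data synchronization that follows can be: I/O or network congestion can set off a vicious cycle. The official documentation describes a procedure worth trying:

1. Disable automatic shard allocation for the cluster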

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "none"
  }
}

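2. Stop non-essential indexing and perform a synced flush so that shards recover faster (optional in the official guide; synced flush was removed in Elasticsearch 8.0, where a plain POST /_flush is used instead)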

POST /_flush/synced

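3. Stop the node that needs upgrading, then reconfigure or upgrade it. When upgrading, it is recommended to carry over the previous elasticsearch.yml by copying it over the new one, and remember to copy the data folder.
4. Start the upgraded node and check its status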

GET _cat/nodes

5. Re-enable automatic shard allocation

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "all"
  }
}

6. Wait for the node to recover

GET _cat/health

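If step 2 was skipped, recovery will probably take considerably longer. Use the API below to check the recovery status of individual shards; I also recommend the kopf plugin.

GET _cat/recovery

Repeat the steps above for every node that needs upgrading.
PS: Be very careful. Data can only move from a lower version to a higher version, never the other way round. Keep this in mind.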

Official reference:
https://www.elastic.co/guide/en/elasticsearch/reference/2.2/rolling-upgrades.html#_step_2_stop_non_essential_indexing_and_perform_a_synced_flush_optional
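
The six steps above can be sketched as two ordered lists of requests, one sent before the node is stopped and one sent after it is started again. This is a minimal illustration using only the Python standard library. The function names `allocation_body`, `pre_restart_requests`, `post_restart_requests` and `send`, and the base URL `http://localhost:9200`, are assumptions made for this example and do not come from the official guide.

```python
import json
import urllib.request


def allocation_body(mode):
    """Transient cluster-settings body for cluster.routing.allocation.enable."""
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def pre_restart_requests():
    """Steps 1-2: requests to send before stopping the node."""
    return [
        ("PUT", "/_cluster/settings", allocation_body("none")),
        ("POST", "/_flush/synced", None),
    ]


def post_restart_requests():
    """Steps 4-6: requests to send once the node has been started again."""
    return [
        ("GET", "/_cat/nodes", None),
        ("PUT", "/_cluster/settings", allocation_body("all")),
        ("GET", "/_cat/health", None),
    ]


def send(method, path, body=None, base_url="http://localhost:9200"):
    """Send one request to the cluster and return the raw response text.

    base_url is an assumed default; point it at any node of the cluster.
    """
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"} if body is not None else {}
    req = urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Step 3 (stopping, upgrading and starting the node) happens between the two lists and is left to whatever service manager is in use. Each list can then be replayed with a loop such as `for method, path, body in pre_restart_requests(): print(send(method, path, body))`.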

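Problems with an unsafe restart

Killing a node outright can cause data loss.
The cluster assumes the node is down and starts reallocating its data (shard rebalance), which causes large volumes of data to be transferred between the remaining nodes.
After the node restarts it recovers its data, which again generates heavy disk and network traffic and consumes machine and network resources.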

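Safe restart procedure

Pause the programs that write data
Disable cluster shard allocation
Manually run POST /_flush/synced
Restart the node
Re-enable cluster shard allocation
Wait for recovery to finish and for the cluster health status to turn green
Resume the programs that write data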
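
The "wait until green" step can be automated with a small polling loop. The sketch below reads `GET /_cluster/health`, which reports the same status as `_cat/health` but as JSON. The names `fetch_health` and `wait_for_green`, their parameters, the injectable `fetch` and `sleep` arguments, and the base URL are all hypothetical choices for this example.

```python
import json
import time
import urllib.request


def fetch_health(base_url="http://localhost:9200"):
    """Return the parsed JSON body of GET /_cluster/health (assumed host)."""
    with urllib.request.urlopen(base_url + "/_cluster/health") as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_green(fetch=fetch_health, timeout=600.0, interval=5.0,
                   sleep=time.sleep):
    """Poll cluster health until the status is green.

    Returns True once the status is green, or False if `timeout` seconds
    pass first. `fetch` and `sleep` are injectable so the loop can be
    exercised without a live cluster.
    """
    deadline = time.monotonic() + timeout
    while True:
        if fetch().get("status") == "green":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Writes should only be resumed once `wait_for_green()` returns True; a False result means the timeout was reached and the cluster still needs attention.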

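Speed tuning

Temporarily increase max_bytes_per_sec, then set it back afterwards
Multiple nodes can be operated on at the same time
Temporarily set the replica count of historical indices to 0, then restore it after the cluster has recovered
Use _forcemerge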
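
The tuning options above map onto a few settings requests. Here is a sketch of the request bodies, assuming the standard settings `indices.recovery.max_bytes_per_sec` and `index.number_of_replicas`. The helper names, the index pattern `logs-2016.*` and the restored replica count of 1 are made up for illustration.

```python
import json


def recovery_throttle_body(limit):
    """Transient body for indices.recovery.max_bytes_per_sec.

    Passing None produces JSON null, which resets the setting.
    """
    return {"transient": {"indices.recovery.max_bytes_per_sec": limit}}


def replicas_request(index, count):
    """(method, path, body) that sets the replica count of `index`."""
    if count < 0:
        raise ValueError("replica count must be >= 0")
    return ("PUT", f"/{index}/_settings",
            {"index": {"number_of_replicas": count}})


# Before the restart: raise the throttle and drop replicas on old indices.
print(json.dumps(recovery_throttle_body("200mb")))
print(replicas_request("logs-2016.*", 0))  # hypothetical index pattern

# After recovery: reset the throttle and restore the replicas
# (1 is only an example value; use whatever the indices had before).
print(json.dumps(recovery_throttle_body(None)))
print(replicas_request("logs-2016.*", 1))
```

The throttle body is sent to `PUT /_cluster/settings`. Note that the size is a JSON string (`"200mb"`), not a bare token.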

Related APIs

synced flush: curl -XPOST localhost:9200/_flush/synced

_forcemerge: curl -XPOST 'localhost:9200/{index}/_forcemerge?max_num_segments=1'
Disable shard allocation: curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{"persistent": {"cluster.routing.allocation.enable": "none"}}'
Enable shard allocation: curl -XPUT localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{"persistent": {"cluster.routing.allocation.enable": "all"}}'
Increase max_bytes_per_sec: curl -XPUT 'http://localhost:port/_cluster/settings?flat_settings=true' -H 'Content-Type: application/json' -d '{"transient": {"indices.recovery.max_bytes_per_sec": "200mb"}}'
Restore max_bytes_per_sec: curl -XPUT 'http://localhost:port/_cluster/settings?flat_settings=true' -H 'Content-Type: application/json' -d '{"transient": {"indices.recovery.max_bytes_per_sec": null}}'
Some APIs for checking recovery speed:
curl 'localhost:9200/{index}/_stats?level=shards&pretty'
curl 'localhost:9200/{index}/_recovery?pretty&human&detailed=true'
curl localhost:9200/_cat/recovery
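
The JSON returned by `{index}/_recovery` can be summarized to see which shards are still recovering. The sketch below assumes the usual response shape (index name, then `shards`, each with `id`, `stage` and `index.size.percent`). The function name `unfinished_shards` and the sample response are hand-written for illustration and are not real cluster output.

```python
def unfinished_shards(recovery):
    """List (index, shard id, stage, percent) for shards not yet DONE.

    `recovery` is the parsed JSON body of GET /{index}/_recovery.
    """
    pending = []
    for index_name, info in sorted(recovery.items()):
        for shard in info.get("shards", []):
            # Compare case-insensitively in case the stage casing differs.
            stage = str(shard.get("stage", "")).upper()
            if stage != "DONE":
                size = shard.get("index", {}).get("size", {})
                pending.append(
                    (index_name, shard.get("id"), stage, size.get("percent"))
                )
    return pending


# Hand-written sample in the assumed response shape.
sample = {
    "logs-a": {
        "shards": [
            {"id": 0, "stage": "DONE", "index": {"size": {"percent": "100.0%"}}},
            {"id": 1, "stage": "INDEX", "index": {"size": {"percent": "42.0%"}}},
        ]
    }
}
print(unfinished_shards(sample))
```

An empty list means every shard in the response has finished recovering.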

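Author: A_You
Link: https://www.jianshu.com/p/22a712a657bf
Source: Jianshu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.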
