ES Shard Optimization

Latest recommended article published on 2024-08-09 13:23:37

凤舞飘伶

Original link: https://mp.weixin.qq.com/s/a0PM5TzyIHHS2cFNieyawA

ES Optimization

The jvm.options configuration file is as follows:

    -Xms30g
    -Xmx30g

    # GC configuration
    8-13:-XX:+UseConcMarkSweepGC
    8-13:-XX:CMSInitiatingOccupancyFraction=75
    8-13:-XX:+UseCMSInitiatingOccupancyOnly

    # G1GC configuration
    # NOTE: G1GC is only supported on JDK version 10 or later
    # to use G1GC, uncomment the next two lines and update the version on the
    # following three lines to your version of the JDK
    # 10-13:-XX:-UseConcMarkSweepGC
    # 10-13:-XX:-UseCMSInitiatingOccupancyOnly
    14-:-XX:+UseG1GC

    # JVM temporary directory
    -Djava.io.tmpdir=${ES_TMPDIR}

    # heap dumps
    # generate a heap dump when an allocation from the Java heap fails; heap dumps
    # are created in the working directory of the JVM unless an alternative path is
    # specified
    -XX:+HeapDumpOnOutOfMemoryError

    # exit right after heap dump on out of memory error. Recommended to also
    # use on Java 8 for supported versions
    9-:-XX:+ExitOnOutOfMemoryError

    # specify an alternative path for JVM fatal error logs
    -XX:ErrorFile=logs/hs_err_pid%p.log

    # JDK 8 GC logging
    8:-XX:+PrintGCDetails
    8:-XX:+PrintGCDateStamps
    8:-XX:+PrintTenuringDistribution
    8:-XX:+PrintGCApplicationStoppedTime
    8:-Xloggc:logs/gc.log
    8:-XX:+UseGCLogFileRotation
    8:-XX:NumberOfGCLogFiles=32
    8:-XX:GCLogFileSize=64m

Keep the JVM heap size under 32 GB:

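ES is written in Java. A 32-bit JVM can address at most 4 GB, while a 64-bit JVM is, in theory, effectively unlimited (see the header structure of a Java object). In practice, however, 64-bit pointers cause the following problems:

Higher GC overhead: 64-bit object references take up more heap space, which leaves less room for other data, so GC is triggered sooner and runs more often.

Lower CPU cache hit rate: because 64-bit object references are larger, the CPU can cache fewer oops (ordinary object pointers), which reduces cache efficiency.

Pointer compression addresses both problems; see "How Java pointer compression works" (https://blog.csdn.net/liujianyangbj/article/details/108049482). The theoretical maximum heap that pointer compression supports is 32 GB, and it is best to stay below that value. Above it, pointer compression is disabled and references fall back to ordinary 64-bit pointers, which is slower. This is why ES recommends a heap below 32 GB: beyond that, efficiency drops.

If the machine has a lot of memory, you can run several data nodes on it. That creates another problem, though: the number of operating-system file handles is limited, and multiple ES instances compete for system resources. In short, choose the configuration that fits your actual situation.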
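The 4 GB and 32 GB figures above can be reproduced with a little arithmetic. The sketch below assumes the HotSpot default of 8-byte object alignment, under which a 32-bit compressed reference counts 8-byte units instead of single bytes; the function name is made up for illustration.

```python
GIB = 2 ** 30


def max_addressable_gib(reference_bits: int, alignment_bytes: int = 1) -> int:
    """Largest heap (in GiB) reachable with references of the given width.

    Each reference value is an offset counted in units of `alignment_bytes`.
    """
    return (2 ** reference_bits) * alignment_bytes // GIB


# Plain 32-bit pointers address individual bytes.
plain_32bit = max_addressable_gib(32)

# Compressed oops: 32-bit references to 8-byte-aligned objects (assumed default).
compressed = max_addressable_gib(32, alignment_bytes=8)

print(plain_32bit, compressed)  # prints: 4 32
```

A heap of 30g, as in the jvm.options file above, sits safely under that 32 GB ceiling.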

CMS

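Elasticsearch is configured to use the Concurrent Mark Sweep (CMS) collector on JDK 8-13, as the 8-13: lines in jvm.options above show. This type of garbage collector is designed for short pauses. As the name suggests, CMS traces reachable object references concurrently with the application threads. Because the GC threads compete with the application for CPU, and a few phases still stop the application briefly, some pause time remains. Each CMS cycle collects the old generation.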

Shard Optimization

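ES has two kinds of shards: primary shards and replicas. By default (in versions before 7.0), ES creates 5 primary shards for every index, even in a single-machine environment. This redundancy is called over-allocation. It now looks entirely unnecessary: it adds complexity merely in distributing documents across shards and in processing queries, although the strong performance of ES hides this. Over-allocation increases the cost of merging per-shard query results and therefore the query latency, so we should use as few shards as practical.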

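The number of primary shards, the number of replicas, and the maximum number of nodes are related as follows:

number of nodes <= number of primary shards * (number of replicas + 1)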
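The relation above can be checked with a few lines of Python. This is only an illustrative sketch; the function name is hypothetical and not part of any Elasticsearch API.

```python
def max_useful_nodes(primary_shards: int, replicas: int) -> int:
    """Upper bound on nodes that can each hold at least one shard copy."""
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    # Every primary shard exists once, plus once per replica.
    return primary_shards * (replicas + 1)


print(max_useful_nodes(4, 1))  # prints: 8
```

With 4 primary shards and 1 replica there are 8 shard copies in total, so adding more than 8 nodes would leave some nodes without any shard of that index.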

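As a rule of thumb, keep each shard at 10-20 GB, and preferably no larger than 20 GB. Estimate how many shards each index needs from its data volume and the number of ES nodes, then adjust accordingly.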

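For example, in the user's environment the idxpoc index is small, under 1 GB per day, so it can be changed to one shard with one replica.

The Weblog index is large: 80 GB per day, or 160 GB with replicas. With 4 nodes, it can be changed to 4 shards with one replica.

That gives 8 shard copies in total, so each one holds 160 GB / 8 = 20 GB.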
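The sizing arithmetic in this example can be sketched as a small helper. The 20 GB target comes from the rule of thumb above; the function names are hypothetical, and the sketch ignores node count, which you would still need to weigh by hand.

```python
import math


def plan_primary_shards(daily_primary_gb: float, target_shard_gb: float = 20.0) -> int:
    """Smallest number of primary shards that keeps each shard at or under the target."""
    return max(1, math.ceil(daily_primary_gb / target_shard_gb))


def shard_size_gb(daily_primary_gb: float, primary_shards: int) -> float:
    """Size of one shard copy; replicas are the same size as their primary."""
    return daily_primary_gb / primary_shards


weblog_shards = plan_primary_shards(80)   # 80 GB/day before replication
idxpoc_shards = plan_primary_shards(1)    # under 1 GB/day

print(weblog_shards, shard_size_gb(80, weblog_shards), idxpoc_shards)
```

For the Weblog index this yields 4 primary shards of 20 GB each, matching the figures in the text; the small idxpoc index needs only a single shard.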

Index Optimization

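As time goes on, the indices and shards created every day keep piling up, which also costs ES performance. It is recommended to close indices older than 60 days, or even 30 days, either periodically by hand or with a script.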
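A cleanup script of this kind first has to decide which indices are old enough. The sketch below assumes daily indices whose names end in a `-YYYY.MM.DD` suffix (a common Logstash-style convention, not something the article specifies); the function name and the sample index names are made up for illustration.

```python
from datetime import date, datetime, timedelta


def indices_to_close(index_names, today, max_age_days=60, date_format="%Y.%m.%d"):
    """Return the names whose date suffix is more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for name in index_names:
        suffix = name.rsplit("-", 1)[-1]
        try:
            index_day = datetime.strptime(suffix, date_format).date()
        except ValueError:
            continue  # no parseable date suffix: leave this index alone
        if index_day < cutoff:
            stale.append(name)
    return stale


names = ["weblog-2024.01.01", "weblog-2024.03.01", "idxpoc-2024.02.20", ".kibana"]
print(indices_to_close(names, today=date(2024, 3, 10)))
# prints: ['weblog-2024.01.01']
```

Each returned name could then be closed with a `POST /<index>/_close` request against the cluster.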

Keep the number of shards on each node below 20 to 25 per 1 GB of heap memory.
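As a quick check of that guideline, the shard budget for a node follows directly from its heap size. This is a minimal sketch with hypothetical function names; it uses the conservative end of the range (20 shards per GB) by default.

```python
def max_shards_per_node(heap_gb: float, shards_per_gb_heap: int = 20) -> int:
    """Shard budget for one node under the shards-per-GB-of-heap guideline."""
    return int(heap_gb * shards_per_gb_heap)


def within_budget(shards_on_node: int, heap_gb: float, shards_per_gb_heap: int = 20) -> bool:
    """True if the node holds fewer shards than its budget."""
    return shards_on_node < max_shards_per_node(heap_gb, shards_per_gb_heap)


print(max_shards_per_node(30), max_shards_per_node(30, 25))  # prints: 600 750
```

For a node running with a 30 GB heap, the budget is therefore somewhere between 600 and 750 shards.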

Also, try to avoid aggregation searches that span a large time range in day-to-day use, because they consume a lot of ES resources.

Operations with curl

    # Check ES version information
    curl -u <user>:<passwd> http://hostname:9200

    # Check the cluster status
    curl -XGET 'http://hostname:9200/_cluster/health?pretty'

    # List all nodes
    curl -XGET 'http://10.50.30.147:9200/_cat/nodes'

    # Check cluster health
    curl -XGET 'http://10.50.30.147:9200/_cat/health'

    # Show the master node
    curl -XGET 'http://10.50.30.147:9200/_cat/master'

    # List all indices
    curl -XGET 'http://10.50.30.147:9200/_cat/indices'

    # Create an index
    curl -X PUT "hostname:9200/test_index?pretty" -H 'Content-Type: application/json' -d'
    {
        "settings": {
            "number_of_shards": 5,
            "number_of_replicas": 0
        }
    }'

    # Modify an index (the number of shards cannot be changed once the index is created)
    curl -X PUT "hostname:9200/test_index/_settings?pretty" -H 'Content-Type: application/json' -d'
    {
        "number_of_replicas": 1
    }'

    # Check the status of each index in the cluster
    curl 'http://hostname:9200/_cat/indices?pretty'

    # List unassigned shards
    curl -XGET http://hostname:9200/_cat/shards | grep UNASSIGNED

    # Delete an index
    curl -X DELETE "hostname:9200/test_index"

    # Check pending tasks
    curl -XGET http://hostname:9200/_cat/pending_tasks

    # List all indices and their storage size
    curl 'http://hostname:9200/_cat/indices?v'

    # Create an index named XX; by default (before 7.0) it gets 5 shards and 1 replica
    curl -XPUT 'http://hostname:9200/XX?pretty'

    # Show the size of each shard
    curl -s -XGET "http://hostname:9200/_cat/shards?v&h=index,shard,docs,store" -H 'Content-Type: application/json'
    index      shard docs store
    test_index 2     3819 181.9kb
    test_index 2     3819 171.4kb
    test_index 1     3834 182.7kb
    test_index 1     3834 184.4kb
    test_index 4     3832 173.5kb
    test_index 4     3832 173.5kb
    test_index 3     3807 173.9kb
    test_index 3     3807 173.9kb
    test_index 0     3908 189.9kb
    test_index 0     3908 189kb

    # Create an index named qsh_test with 10 shards and 2 replicas
    curl -XPUT http://hostname:9200/qsh_test -H 'Content-Type: application/json' -d '{"settings": {"number_of_shards": 10, "number_of_replicas": 2}}'

    # Show shard information for all indices
    curl -X GET "hostname:9200/_cat/shards?v&pretty"

    # Add a document (id 2) under the type external
    curl -XPUT 'http://hostname:9200/XX/external/2?pretty' -H 'Content-Type: application/json' -d '{"gwyy": "John"}'

    # Update a document (id 1) under the type external
    curl -XPOST 'http://hostname:9200/XX/external/1/_update?pretty' -H 'Content-Type: application/json' -d '{"doc": {"name": "Jaf"}}'

    # Delete a specific index (_index stands for the index name)
    curl -XDELETE 'http://hostname:9200/_index?pretty'

    # Delete all data in the index booklist
    curl -XPOST 'http://<ip>:<port>/booklist/_delete_by_query?pretty' -H 'Content-Type: application/json' -d '{"query": {"match_all": {}}}'

    # Set minimum_master_nodes to 2 so that one node is active and one is on standby
    # (this setting applies to versions before 7.0)
    curl -XPUT 'http://hostname:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"persistent": {"discovery.zen.minimum_master_nodes": 2}}'

    # Remove a node or instance from the cluster
    curl -XPUT 'http://hostname:9200/_cluster/settings?pretty' -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.exclude._name": "{node.name}"}}'

The last command triggers the shard allocation mechanism. The setting involved is cluster.routing.allocation.exclude.{attribute}, where {attribute} is how the node is matched. Three forms are supported:

_name: matches the node name; separate multiple names with commas.
_ip: matches the node IP address; separate multiple addresses with commas.
_host: matches the node hostname; separate multiple hostnames with commas.

After the command runs, the shards on the {node.name} node gradually migrate to other nodes. This may take a few minutes or longer, and normal traffic is not affected in the meantime. Progress can be followed with:

    curl -s "http://hostname:9200/_cat/shards" | grep RELOCATING
    curl 'http://hostname:9200/_cluster/health?pretty'
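When the `grep UNASSIGNED` check above needs to feed a script rather than a terminal, the same filtering can be done in Python on the text returned by `_cat/shards`. The sketch below assumes the default column order of that endpoint (index, shard, prirep, state, ...) and no header row; the function name, the node name, and the sample rows are made up for illustration.

```python
def unassigned_shards(cat_shards_output: str):
    """Return (index, shard, prirep) for every row whose state is UNASSIGNED."""
    found = []
    for line in cat_shards_output.splitlines():
        fields = line.split()
        # Default columns: index shard prirep state docs store ip node
        if len(fields) >= 4 and fields[3] == "UNASSIGNED":
            found.append((fields[0], int(fields[1]), fields[2]))
    return found


# Hypothetical sample of `curl -s http://hostname:9200/_cat/shards` output.
SAMPLE = """\
test_index 0 p STARTED 3908 189.9kb 10.50.30.147 node-1
test_index 0 r UNASSIGNED
test_index 1 p STARTED 3834 182.7kb 10.50.30.147 node-1
test_index 1 r UNASSIGNED
"""

print(unassigned_shards(SAMPLE))
# prints: [('test_index', 0, 'r'), ('test_index', 1, 'r')]
```

In this sample only the replicas are unassigned, which is what a single-node cluster with `number_of_replicas` set to 1 would typically show.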