Elastic Certified Engineer Review Notes - Practice Questions Explained - Cluster Administration (1)

CLUSTER ADMINISTRATION

GOAL: Allocate the shards in a way that satisfies a given set of requirements

REQUIRED SETUP: /

Suggested docker-compose file: 1m2d1k_normal_cluster.yml

Preparation:

  1. Download the exam version of Elasticsearch
  2. Deploy the cluster eoc-06-cluster, with three nodes named node1, node2, and node3
  3. Configure the Zen Discovery module of each node so that they can communicate (a sample node configuration is sketched after this list)
  4. Start the cluster
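For step 3, a minimal per-node configuration might look like the sketch below (an assumption based on my docker-compose setup, not part of the exercise; the hostnames node1/node2/node3 must match whatever your containers actually resolve to). In Elasticsearch 7.x the discovery settings are discovery.seed_hosts and cluster.initial_master_nodes:

    cluster.name: eoc-06-cluster
    node.name: node1                                        # node2 / node3 on the other nodes
    network.host: 0.0.0.0
    discovery.seed_hosts: ["node1", "node2", "node3"]
    cluster.initial_master_nodes: ["node1", "node2", "node3"]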

Question 0: Initialize the indices and data

  1. Create the index hamlet-1 with two primary shards and one replica

  2. Add some documents to hamlet-1 by running the command below
    PUT hamlet-1/_doc/_bulk
    {"index":{"_index":"hamlet-1","_id":0}}  
    {"line_number":"1","speaker":"BERNARDO","text_entry":"Whos there?"}
    {"index":{"_index":"hamlet-1","_id":1}} 
    {"line_number":"2","speaker":"FRANCISCO","text_entry":"Nay, answer me: stand, and unfold yourself."}
    {"index":{"_index":"hamlet-1","_id":2}}
    {"line_number":"3","speaker":"BERNARDO","text_entry":"Long live the king!"}
    {"index":{"_index":"hamlet-1","_id":3}}
    {"line_number":"4","speaker":"FRANCISCO","text_entry":"Bernardo?"}
    {"index":{"_index":"hamlet-1","_id":4}}
    {"line_number":"5","speaker":"BERNARDO","text_entry":"He."}
    
  3. Create the index hamlet-2 with two primary shards and one replica

  4. Add some documents to hamlet-2 by running the command below
    PUT hamlet-2/_doc/_bulk
    {"index":{"_index":"hamlet-2","_id":5}}
    {"line_number":"6","speaker":"FRANCISCO","text_entry":"You come most carefully upon your hour."}
    {"index":{"_index":"hamlet-2","_id":6}}
    {"line_number":"7","speaker":"BERNARDO","text_entry":"Tis now struck twelve; get thee to bed, Francisco."}
    {"index":{"_index":"hamlet-2","_id":7}}
    {"line_number":"8","speaker":"FRANCISCO","text_entry":"For this relief much thanks: tis bitter cold,"}
    {"index":{"_index":"hamlet-2","_id":8}}
    {"line_number":"9","speaker":"FRANCISCO","text_entry":"And I am sick at heart."}
    {"index":{"_index":"hamlet-2","_id":9}}
    {"line_number":"10","speaker":"BERNARDO","text_entry":"Have you had quiet guard?"}
    

Question 0: Solution

(Since I started the cluster from my own docker-compose file, things will not match the question exactly; for example, the question's node1 is es721Node1 in my outputs. The same applies below.)

  1. Create the index by running the following command in Kibana:
     PUT hamlet-1
     {
       "settings": {
         "number_of_shards":2,
         "number_of_replicas": 1
       }
     }
    
  2. Copy the bulk commands above into Kibana as-is and run them

Question 0: Notes

  • This question mainly tests creating an index with initial settings. Apart from the index names, the two rounds of index creation and data loading are essentially identical.
    1. Primary shards and replicas map to the number_of_shards and number_of_replicas keys in the index settings (_settings)
      1. Pitfall: if you do not set the shard/replica counts first, ES 7.X creates the index with 1 shard and 1 replica by default, so running the bulk command directly leaves the index with the wrong settings (a quick settings check is sketched after this list)
      2. Reference link
      3. Page path: Indices APIs => Create Index
    2. Nothing special about inserting the data; just run the commands as given. The main thing to notice is that the first line of each pair carries the action and index metadata, and the second line carries the document
      1. Reference link
      2. Page path: Document APIs => Bulk API
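As a sanity check (a minimal sketch, not part of the original exercise), you can read back the settings after creating the index. Note that number_of_replicas can still be changed later with a settings update, whereas number_of_shards is fixed at creation time:

    GET hamlet-1/_settings

    PUT hamlet-1/_settings
    {
      "number_of_replicas": 1
    }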

Question 1: Check shard allocation

  1. Check that the replicas of indices hamlet-1 and hamlet-2 have been allocated
  2. Check the distribution of primary shards and replicas of indices hamlet-1 and hamlet-2 across the nodes of the cluster

Question 1: Solution

  1. Check each node's resource allocation with GET /_cat/allocation?v&h=shards,disk.indices,disk.used,disk.avail,disk.total,disk.percent,host,ip,node
shards disk.indices disk.used disk.avail disk.total disk.percent host       ip         node
    6      179.8kb    26.1gb     32.2gb     58.4gb           44 172.18.0.2 172.18.0.2 es721Node1
    5       65.1kb    26.1gb     32.2gb     58.4gb           44 172.18.0.5 172.18.0.5 es721Node3
    5      134.8kb    26.1gb     32.2gb     58.4gb           44 172.18.0.4 172.18.0.4 es721Node2
  2. Check the node attributes with GET /_cat/nodeattrs?v&h=node,id,pid,host,ip,port,attr,value
node       id   pid host       ip         port attr              value
es721Node2 q0pL 1   172.18.0.4 172.18.0.4 9300 name              es721Node2
es721Node2 q0pL 1   172.18.0.4 172.18.0.4 9300 xpack.installed   true
es721Node3 4ym8 1   172.18.0.5 172.18.0.5 9300 name              es721Node3
es721Node3 4ym8 1   172.18.0.5 172.18.0.5 9300 xpack.installed   true
es721Node1 PGDN 1   172.18.0.2 172.18.0.2 9300 ml.machine_memory 12564148224
es721Node1 PGDN 1   172.18.0.2 172.18.0.2 9300 xpack.installed   true
es721Node1 PGDN 1   172.18.0.2 172.18.0.2 9300 name              es721Node1
es721Node1 PGDN 1   172.18.0.2 172.18.0.2 9300 ml.max_open_jobs  20
  3. Check the indices' shard allocation with GET /_cat/shards/hamlet-1,hamlet-2?v&h=index,shard,prirep,state,docs,store,ip,node
index    shard prirep state   docs store ip         node
hamlet-1 1     r      STARTED    2 4.8kb 172.18.0.4 es721Node2
hamlet-1 1     p      STARTED    2 4.8kb 172.18.0.5 es721Node3
hamlet-1 0     p      STARTED    3 5.2kb 172.18.0.4 es721Node2
hamlet-1 0     r      STARTED    3 5.2kb 172.18.0.5 es721Node3
hamlet-2 1     p      STARTED    1 4.8kb 172.18.0.2 es721Node1
hamlet-2 1     r      STARTED    1 4.8kb 172.18.0.5 es721Node3
hamlet-2 0     r      STARTED    4 5.7kb 172.18.0.4 es721Node2
hamlet-2 0     p      STARTED    4 5.7kb 172.18.0.2 es721Node1
  4. (If any shard could not be placed) use GET /_cluster/allocation/explain to see why it cannot be allocated

Question 1: Notes

  • This question mainly tests cluster attributes and shard allocation.
    1. Most of this state can be inspected through the GET /_cat/${api} and GET /_cluster/${api} endpoints
      1. GET /_cat/allocation?v shows per-node resource allocation
        1. ?help lists the columns an endpoint supports, and ?h=${header} selects the columns you want (see the sketch after this list)
        2. Reference link
        3. Page path: cat APIs => cat allocation
      2. GET /_cat/nodeattrs?v shows the nodes' own attributes
        1. Reference link
        2. Page path: cat APIs => cat nodeattrs
      3. GET /_cat/shards shows the shard allocation of the specified (or all) indices
        1. Reference link
        2. Page path: cat APIs => cat shards
    2. When a shard fails to allocate, GET /_cluster/allocation/explain tells you why it cannot be assigned
      1. Reference link
      2. Page path: Cluster APIs => Cluster Allocation Explain API
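A small sketch of those query-string options (the column names come from the cat shards help output; the s sort parameter is an assumption worth checking against your version's docs):

    GET /_cat/shards?help

    GET /_cat/shards/hamlet-*?v&h=index,shard,prirep,state,node&s=index,shard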

Question 2: Index-level shard allocation settings

  1. Configure hamlet-1 to allocate both primary shards to node2, using the node name
  2. Configure hamlet-2 so that no primary shard is allocated to node3
  3. Remove any allocation filter setting associated with hamlet-1 and hamlet-2
  4. Verify the success of the last action by using the _cat API

Question 2: Solution

  1. Put both hamlet-1 primary shards on node2
    PUT hamlet-1/_settings
    {
       "index.routing.allocation.include._name": "es721Node2"
    }
    

Verification command: GET /_cat/shards/hamlet-1?v&h=index,shard,prirep,state,docs,store,ip,node

Before:

index    shard prirep state   docs store ip         node
hamlet-1 1     p      STARTED    2 4.8kb 172.18.0.5 es721Node3
hamlet-1 0     p      STARTED    3 5.2kb 172.18.0.4 es721Node2

After:

index    shard prirep state   docs store ip         node
hamlet-1 1     p      STARTED    2 4.8kb 172.18.0.4 es721Node2
hamlet-1 0     p      STARTED    3 5.2kb 172.18.0.4 es721Node2
  2. Keep hamlet-2's primary shards off node3
PUT hamlet-2/_settings
{
  "index.routing.allocation.exclude._name": "es721Node3"
}

Verification command:

GET /_cat/shards/hamlet-2?v&h=index,shard,prirep,state,docs,store,ip,node

Before:

index    shard prirep state   docs store ip         node
hamlet-2 1     p      STARTED    1 4.8kb 172.18.0.2 es721Node1
hamlet-2 1     r      STARTED    1 4.8kb 172.18.0.5 es721Node3
hamlet-2 0     r      STARTED    4 5.7kb 172.18.0.4 es721Node2
hamlet-2 0     p      STARTED    4 5.7kb 172.18.0.2 es721Node1

After:

index    shard prirep state   docs store ip         node
hamlet-2 1     r      STARTED    1 4.8kb 172.18.0.4 es721Node2
hamlet-2 1     p      STARTED    1 4.8kb 172.18.0.2 es721Node1
hamlet-2 0     r      STARTED    4 5.7kb 172.18.0.4 es721Node2
hamlet-2 0     p      STARTED    4 5.7kb 172.18.0.2 es721Node1
  3. Remove all allocation filter settings from hamlet-1 and hamlet-2
PUT hamlet-1,hamlet-2/_settings
{
  "index.routing.allocation.include._name": null,
  "index.routing.allocation.exclude._name": null
}

Verification command:

GET /_cat/shards/hamlet-1,hamlet-2?v&h=index,shard,prirep,state,docs,store,ip,node

After:

index    shard prirep state   docs store ip         node
hamlet-2 1     r      STARTED    1 4.8kb 172.18.0.4 es721Node2
hamlet-2 1     p      STARTED    1 4.8kb 172.18.0.2 es721Node1
hamlet-2 0     r      STARTED    4 5.7kb 172.18.0.4 es721Node2
hamlet-2 0     p      STARTED    4 5.7kb 172.18.0.5 es721Node3
hamlet-1 1     p      STARTED    2 4.8kb 172.18.0.4 es721Node2
hamlet-1 0     p      STARTED    3 5.2kb 172.18.0.2 es721Node1

Question 2: Notes

  • This question mainly tests the per-index allocation filters, namely the index.routing.allocation.include._name and index.routing.allocation.exclude._name keys in the index _settings.
    1. The question uses the node name (_name), but nodes can also be filtered by other identifiers, such as the built-in _ip and _host, or by custom ${attr} node attributes (a sketch follows after this list).
  • A likely pitfall (and exam point): some allocation filters, once applied, collide with the cluster's own allocation constraints, leaving some shards (primary or replica) with nowhere to go and marked UNASSIGNED, for example:
    1. A shard's primary and its replica can never sit on the same node (e.g. shard 1's primary and replica cannot both be on node1)
    2. Conflicting include and exclude filters can leave no eligible node in the whole cluster, and so on
  • Official docs for this question:
    1. Reference link
    2. Page path: Index modules => Index Shard Allocation => Index-level shard allocation filtering
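For illustration, a minimal sketch of the other filter variants (require means every listed rule must match; the IP below is es721Node2's address from the earlier node listing, used only as an example):

    PUT hamlet-1/_settings
    {
      "index.routing.allocation.require._ip": "172.18.0.4",
      "index.routing.allocation.exclude._name": "es721Node3"
    }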

Question 3: Cluster-level shard allocation settings

  1. Let’s assume that we deployed the eoc-06-cluster cluster across two availability zones, named earth and mars. Add the attribute AZ to the nodes configuration, and set its value to “earth” for node1 and node2, and to “mars” for node3
  2. Restart the cluster
  3. Configure the cluster to force shard allocation awareness based on the two availability zones, and persist such configuration across cluster restarts
  4. Verify the success of the last action by using the _cat API

Question 3: Solution

  1. Edit each node's configuration file $ES_HOME/config/elasticsearch.yml

    1. Add the attribute node.attr.AZ: earth or node.attr.AZ: mars
      Node1
    node.name: node1
    node.attr.AZ: earth
    

    Node2

    node.name: node2
    node.attr.AZ: earth
    

    Node3

    node.name: node3
    node.attr.AZ: mars
    
  2. Restart the cluster

  3. Run GET /_cat/nodeattrs?v&h=node,id,pid,host,ip,port,attr,value to check that the attribute took effect

node       id   pid host       ip         port attr              value
es721Node2 q0pL 1   172.18.0.4 172.18.0.4 9300 name              es721Node2
es721Node2 q0pL 1   172.18.0.4 172.18.0.4 9300 AZ                earth
es721Node2 q0pL 1   172.18.0.4 172.18.0.4 9300 xpack.installed   true
es721Node1 PGDN 1   172.18.0.3 172.18.0.3 9300 ml.machine_memory 12564148224
es721Node1 PGDN 1   172.18.0.3 172.18.0.3 9300 xpack.installed   true
es721Node1 PGDN 1   172.18.0.3 172.18.0.3 9300 name              es721Node1
es721Node1 PGDN 1   172.18.0.3 172.18.0.3 9300 AZ                earth
es721Node1 PGDN 1   172.18.0.3 172.18.0.3 9300 ml.max_open_jobs  20
es721Node3 4ym8 1   172.18.0.5 172.18.0.5 9300 name              es721Node3
es721Node3 4ym8 1   172.18.0.5 172.18.0.5 9300 AZ                mars
es721Node3 4ym8 1   172.18.0.5 172.18.0.5 9300 xpack.installed   true
  4. There are two ways to configure allocation awareness for the cluster
    1. Specify the attribute to use (and, if needed, the required values) in the configuration file of the master-eligible nodes
    cluster.routing.allocation.awareness.attributes: AZ
    cluster.routing.allocation.awareness.force.AZ.values: earth,mars
    
    2. Or set it through the cluster settings API
    PUT /_cluster/settings
    {
       "persistent" : {
          "cluster.routing.allocation.awareness.attributes": "AZ",
          "cluster.routing.allocation.awareness.force.AZ.values": "earth,mars"
       }
    }
    
  5. Run GET /_cluster/settings to check the cluster settings
     {
       "persistent" : {
         "cluster" : {
           "routing" : {
             "allocation" : {
               "awareness" : {
                 "attributes" : "AZ"
               }
             }
           }
         }
       },
       "transient" : { }
     }
    

Question 3: Notes

  • The previous question's settings live in each index's settings, so they can be changed and take effect at any time. The settings in this question mostly live in the configuration files as node attributes, so changing them (and having the change take effect) requires restarting the nodes.
    • Trying to change them through the cluster settings API instead simply returns an error; besides, these are node-level settings, and the cluster settings API has no way to target a single node anyway.
      PUT _cluster/settings
      {
        "persistent": {
          "node.attr.AZ":"earth"
        }
      }
    
    Result:
    {
      "error": {
        "root_cause": [
          {
            "type": "illegal_argument_exception",
            "reason": "persistent setting [node.attr.AZ], not dynamically updateable"
          }
        ],
        "type": "illegal_argument_exception",
        "reason": "persistent setting [node.attr.AZ], not dynamically updateable"
      },
      "status": 400
    }
    
    1. Reference link
    2. Page path: Modules => Shard allocation and cluster-level routing => Cluster level shard allocation
  • Cluster allocation settings can be changed either in the configuration files or through the cluster settings API, but note that API changes come in two flavors: persistent and transient. persistent settings survive a full cluster restart; transient ones are gone once the cluster restarts (see the sketch after this list).
    1. Settings applied through the API can be removed by setting them to null
       PUT /_cluster/settings
       {
          "persistent": {
             "cluster.routing.allocation.awareness.attributes": null,
             "cluster.routing.allocation.awareness.force.AZ.values": null
          }
       }
    
    2. Also note that the two awareness settings are spelled differently; I misspelled them the first time too. They share the cluster.routing.allocation. prefix, but one ends in awareness.attributes, which names the attribute (key) used for awareness, while the other ends in awareness.force.${key}.values, which lists the values allowed for that attribute (key).
    3. Reference link (awareness); reference link (cluster settings update API)
    4. Page path: Modules => Shard allocation and cluster-level routing => Cluster level shard allocation
    5. Page path: Modules => Cluster APIs => Cluster Update Settings
  • Cluster settings are read with GET /_cluster/settings; the URL parameter include_defaults controls whether the default settings are shown as well
    1. GET /_cluster/settings?include_defaults=true returns the system defaults along with the explicit settings
    2. Reference link
    3. Page path: Modules => Cluster APIs => Cluster Get Settings
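To make the persistent/transient difference concrete, a minimal sketch (the transient setting below, cluster.routing.allocation.enable, is only an example and not part of the exercise): while the cluster is running, transient values take precedence over persistent ones, but after a full cluster restart only the persistent value is still there:

    PUT /_cluster/settings
    {
      "persistent": {
        "cluster.routing.allocation.awareness.attributes": "AZ"
      },
      "transient": {
        "cluster.routing.allocation.enable": "primaries"
      }
    }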

Question 4: Hot/warm architecture

  1. Configure the cluster to reflect a hot/warm architecture, with node1 as the only hot node
  2. Configure the hamlet-1 index to allocate its shards only to warm nodes
  3. Verify the success of the last action by using the _cat API
  4. Remove the hot/warm shard filtering configuration from the hamlet-1 configuration

Question 4: Solution

  1. Edit each node's configuration file $ES_HOME/config/elasticsearch.yml

    1. Add the attribute node.attr.hot_warm_type: hot or node.attr.hot_warm_type: warm
      Node1
    node.name: node1
    node.attr.hot_warm_type: hot
    

    Node2

    node.name: node2
    node.attr.hot_warm_type: warm
    

    Node3

    node.name: node3
    node.attr.hot_warm_type: warm
    
  2. Check the node attributes: GET /_cat/nodeattrs?v&h=node,id,pid,host,ip,port,attr,value

    node       id   pid host       ip         port attr              value
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 name              es721Node2
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 AZ                earth
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 xpack.installed   true
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 hot_warm_type     warm
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 ml.machine_memory 12564156416
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 xpack.installed   true
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 name              es721Node1
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 AZ                earth
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 ml.max_open_jobs  20
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 hot_warm_type     hot
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 name              es721Node3
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 AZ                mars
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 xpack.installed   true
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 hot_warm_type     warm
    
  3. Delete and recreate hamlet-1 (to avoid interference from the earlier settings)

    1. Delete the index: DELETE hamlet-1
    2. Recreate it:
    PUT hamlet-1
    {
       "settings": {
          "index.routing.allocation.include.hot_warm_type": "warm"
       }
    }
    

    Or:

     PUT hamlet-1
     {
        "settings": {
           "index": {
              "number_of_shards": 2,
              "number_of_replicas": 0,
              "routing": {
                 "allocation": {
                    "include": {
                       "hot_warm_type": "warm"
                    }
                 }
              }
           }
        }
     }
    
  4. Re-run the earlier bulk command that loads data into hamlet-1

  5. Run GET /_cat/shards/hamlet-1?v&h=index,shard,prirep,state,docs,store,ip,node to check the index's shard distribution

    index    shard prirep state   docs store ip         node
    hamlet-1 1     p      STARTED    2 4.7kb 172.18.0.2 es721Node2
    hamlet-1 0     p      STARTED    3 5.1kb 172.18.0.2 es721Node2
    
  6. Remove the hot/warm shard filter from the hamlet-1 settings

    PUT hamlet-1/_settings
    {
       "index.routing.allocation.include.hot_warm_type": null
    }
    
  7. Run GET /_cat/shards/hamlet-1?v&h=index,shard,prirep,state,docs,store,ip,node again to check the shard distribution

    index    shard prirep state   docs store ip         node
    hamlet-1 1     p      STARTED    2 4.8kb 172.18.0.2 es721Node2
    hamlet-1 0     p      STARTED    3 5.2kb 172.18.0.4 es721Node1
    

Question 4: Notes

  • This question mainly tests using ES node attributes to build a hot/warm deployment. Technically it is the same as the earlier exercises, again driven by extra node.attr. settings, but it reflects a real production pattern: cluster nodes are not all equal, and some high-spec nodes with more CPU and SSD storage index, store and retrieve data faster, while nodes with less memory and slower disks serve searches over less frequently accessed data.
    1. As in the previous question, first add the node attribute to every node; this means editing the node configuration/startup command, so it only takes effect after a restart
      1. Reference link
      2. Page path: Index modules => Index Shard Allocation => Index-level shard allocation filtering
    2. Then set the index allocation filter, which can be changed at any time through PUT ${index}/_settings (a typical hot-to-warm migration is sketched after this list)
      1. Reference link
      2. Page path: Index modules => Index Shard Allocation => Index-level shard allocation filtering
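In a typical hot/warm workflow (a sketch built from the same settings; the index name logs-2020.11.10 is made up and not part of the exercise), a new index is pinned to the hot nodes and later migrated to the warm nodes just by updating the same filter, after which Elasticsearch relocates the shards on its own:

    PUT logs-2020.11.10
    {
      "settings": {
        "index.routing.allocation.include.hot_warm_type": "hot"
      }
    }

    # later, move the index to the warm nodes
    PUT logs-2020.11.10/_settings
    {
      "index.routing.allocation.include.hot_warm_type": "warm"
    }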

Question 5: Allocation based on a node storage attribute

  1. Let’s assume that the nodes have either a “large” or “small” local storage. Add the attribute storage to the nodes config, and set its value so that node2 is the only with a “small” storage
  2. Configure the hamlet-2 index to allocate its shards only to nodes with a large storage size
  3. Verify the success of the last action by using the _cat API

Question 5: Solution

  1. Edit each node's configuration file $ES_HOME/config/elasticsearch.yml

    1. Add the attribute node.attr.storage: large or node.attr.storage: small
      Node1
    node.name: node1
    node.attr.storage: large
    

    Node2

    node.name: node2
    node.attr.storage: small
    

    Node3

    node.name: node3
    node.attr.storage: large
    
  2. Update the hamlet-2 index settings

    PUT hamlet-2
    {
       "settings": {
          "index.routing.allocation.include.storage": "large"
       }
    }
    

    Or:

     PUT hamlet-2/_settings
     {
        "routing": {
           "allocation": {
              "include": {
                 "storage": "small",
                 "AZ": "earth",
                 "hot_warm_type": "hot"
              }
           }
        }
     }
    
  3. Check the node attributes, the index settings, and the shard placement with GET /_cat/nodeattrs?v&h=node,id,pid,host,ip,port,attr,value, GET hamlet-2, and GET /_cat/shards/hamlet-2?v&h=index,shard,prirep,state,docs,store,ip,node

    1. GET /_cat/nodeattrs?v&h=node,id,pid,host,ip,port,attr,value
    node       id   pid host       ip         port attr              value
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 name              es721Node2
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 AZ                earth
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 xpack.installed   true
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 storage           small
    es721Node2 q0pL 1   172.18.0.2 172.18.0.2 9300 hot_warm_type     warm
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 name              es721Node3
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 AZ                mars
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 xpack.installed   true
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 storage           large
    es721Node3 4ym8 1   172.18.0.3 172.18.0.3 9300 hot_warm_type     warm
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 ml.machine_memory 12564156416
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 xpack.installed   true
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 name              es721Node1
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 AZ                earth
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 ml.max_open_jobs  20
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 storage           large
    es721Node1 PGDN 1   172.18.0.4 172.18.0.4 9300 hot_warm_type     hot
    
    2. GET hamlet-2
     {
        "hamlet-2" : {
           "aliases" : { },
           "mappings" : { },
           "settings" : {
              "index" : {
                 "routing" : {
                    "allocation" : {
                       "include" : {
                          "AZ" : "earth",
                          "hot_warm_type" : "hot",
                          "storage" : "small"
                       }
                    }
                 },
                 "number_of_shards" : "1",
                 "provided_name" : "hamlet-2",
                 "creation_date" : "1604987394421",
                 "number_of_replicas" : "1",
                 "uuid" : "9sVWeaDTTFS9XeQWzMJu4w",
                 "version" : {
                    "created" : "7020199"
                 }
              }
           }
        }
     }
    
    3. GET /_cat/shards/hamlet-2?v&h=index,shard,prirep,state,docs,store,ip,node
     index    shard prirep state      docs store ip         node
     hamlet-2 0     p      STARTED       0  283b 172.18.0.2 es721Node2
     hamlet-2 0     r      UNASSIGNED
    
  4. (If present) use GET /_cluster/allocation/explain to inspect the shards/replicas stuck in the UNASSIGNED state

    1. GET /_cluster/allocation/explain
    {
       "index" : "hamlet-2",
       "shard" : 0,
       "primary" : false,
       "current_state" : "unassigned",
       "unassigned_info" : {
          "reason" : "INDEX_CREATED",
          "at" : "2020-11-10T05:49:54.426Z",
          "last_allocation_status" : "no_attempt"
       },
       "can_allocate" : "no",
       "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
       "node_allocation_decisions" : [
          {
             "node_id" : "PGDN7jTpTPes0ICgNb19Bw",
             "node_name" : "es721Node1",
             "transport_address" : "172.18.0.4:9300",
             "node_attributes" : {
                "ml.machine_memory" : "12564156416",
                "xpack.installed" : "true",
                "name" : "es721Node1",
                "AZ" : "earth",
                "ml.max_open_jobs" : "20",
                "storage" : "large",
                "hot_warm_type" : "hot"
             },
             "node_decision" : "no",
             "weight_ranking" : 1,
             "deciders" : [
                {
                   "decider" : "awareness",
                   "decision" : "NO",
                   "explanation" : "there are too many copies of the shard allocated to nodes with attribute [AZ], there are [2] total configured shard copies for this shard id and [3] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
                }
             ]
          },
          {
             "node_id" : "4ym8nm8WS7yu1RcN1v7mcg",
             "node_name" : "es721Node3",
             "transport_address" : "172.18.0.3:9300",
             "node_attributes" : {
                "name" : "es721Node3",
                "AZ" : "mars",
                "xpack.installed" : "true",
                "storage" : "large",
                "hot_warm_type" : "warm"
             },
             "node_decision" : "no",
             "weight_ranking" : 2,
             "deciders" : [
                {
                   "decider" : "filter",
                   "decision" : "NO",
                   "explanation" : """node does not match index setting [index.routing.allocation.include] filters [AZ:"earth",storage:"small",hot_warm_type:"hot"]"""
                }
             ]
          },
          {
             "node_id" : "q0pL9eXcSviCphv0EhV52g",
             "node_name" : "es721Node2",
             "transport_address" : "172.18.0.2:9300",
             "node_attributes" : {
                "name" : "es721Node2",
                "AZ" : "earth",
                "xpack.installed" : "true",
                "storage" : "small",
                "hot_warm_type" : "warm"
             },
             "node_decision" : "no",
             "weight_ranking" : 3,
             "deciders" : [
                {
                   "decider" : "same_shard",
                   "decision" : "NO",
                   "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[hamlet-2][0], node[q0pL9eXcSviCphv0EhV52g], [P], s[STARTED], a[id=KBmldvUvQfGulBhS2fQ9Ug]]"
                },
                {
                   "decider" : "awareness",
                   "decision" : "NO",
                   "explanation" : "there are too many copies of the shard allocated to nodes with attribute [AZ], there are [2] total configured shard copies for this shard id and [3] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
                }
             ]
          }
       ]
    }
    

Question 5: Notes

  • The first half of this question works exactly like the previous one: an extra node attribute node.attr.${attribute}, combined with the index setting index.routing.allocation.include.${attribute}, controls where the index's shards may go. The second half covers something to watch for whenever you do this kind of shard management: the filters can be mutually exclusive (no node satisfies all of them at once), so some shards cannot be placed anywhere and end up marked UNASSIGNED.
    • To actually reach that state, I tweaked some of the question's requirements.
    1. Setting the node attributes and adding the index allocation filter work exactly as in the previous question, so they are omitted here
    2. The API that lists unassigned shards and the reasons they cannot be allocated is GET /_cluster/allocation/explain (a sketch for targeting one specific shard copy follows after this list)
      1. Its response mainly contains the following fields
        1. "index" : "hamlet-2": the index the shard belongs to
        2. "shard" : 0: which shard (numbered from 0)
        3. "primary" : false: whether it is a primary (true: primary, false: replica)
        4. "current_state" : "unassigned": the current state: unassigned
        5. "unassigned_info": details about the unassigned state
          1. "reason" : "INDEX_CREATED": why the shard became unassigned; here, it was newly created along with the index
          2. "at" : "2020-11-10T05:49:54.426Z": when that happened
          3. "last_allocation_status" : "no_attempt": status of the last allocation attempt; here, no attempt has been made yet
        6. "can_allocate" : "no": whether it can be allocated: no
        7. "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes": the explanation: no node in the cluster is permitted to hold this shard
        8. "node_allocation_decisions": the per-node decision details
      2. I will not spell out the rest; the keys and values are fairly self-explanatory
      3. Reference link
      4. Page path: Cluster APIs => Cluster Allocation Explain API
    3. UNASSIGNED分片存在的时候,通过接口GET _cat/health可以看出来集群是不健康的
      1. GET _cat/health?v
        epoch      timestamp cluster        status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
        1604989564 06:26:04  docker-cluster yellow          3         3     11   7    0    0        1             0                  -                 91.7%
        
        1. 主要看statusactive_shards_percent这两列
          1. status
            1. yellow:有副本无法分配或缺失,集群不健康,但是可以支持搜索,几乎无数据缺失
            2. red:有主分片无法分配或缺失,集群严重不健康,部分数据不能搜索,有数据缺失
          2. active_shards_percent
            1. 活跃(能正常使用)分片数占比
      2. 参考连接
      3. 页面路径:cat APIs =》 cat health
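One more detail on the explain API (a sketch; the body below targets the unassigned replica from the output above): called without a body it explains the first unassigned shard it finds, but you can also point it at a specific shard copy:

    GET /_cluster/allocation/explain
    {
      "index": "hamlet-2",
      "shard": 0,
      "primary": false
    }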