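While running my project today, I noticed that both the ES cluster health and the index health had turned yellow.
At first I assumed the Linux virtual machine was running out of disk space.
I check the status with the Elasticvue browser extension; Kibana works as well.
After expanding the disk of the CentOS virtual machine in VMware, the cluster health was still yellow, so I looked at the ES log and found a clue: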
{"type": "server", "timestamp": "2024-08-14T03:05:39,901Z", "level": "INFO",
"component": "o.e.c.r.a.AllocationService", "cluster.name": "docker-cluster",
"node.name": "b029464c5838", "message": "Cluster health status changed from
[RED] to [YELLOW] (reason: [shards started [[user][0]]]).", "cluster.uuid":
"fw7MBctsRhKs7ZXVdwIuBA", "node.id": "kVueuBHqQ9uazalV9JNEfA" }
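This entry only says that the cluster health changed from red to yellow once shard [user][0] had started, which suggests the problem is related to shards.
Next I called the cluster health API, GET /_cluster/health.
It returned: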
{
  "cluster_name": "docker-cluster",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 2,
  "active_shards": 2,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 2,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 50
}
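The key field is **"unassigned_shards": 2**: two shards are not assigned to any node. A yellow status means every primary shard is assigned but at least one replica shard is not, so these unassigned shards are the likely cause of the yellow status.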
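The reasoning above can be sketched as a small Python helper. This is only an illustration: `explain_health` is a hypothetical name, and its input is assumed to be the parsed JSON of `GET /_cluster/health`, reduced here to the three fields it reads.

```python
def explain_health(health: dict) -> str:
    """Summarize a parsed GET /_cluster/health response (hypothetical helper)."""
    status = health["status"]
    unassigned = health["unassigned_shards"]
    data_nodes = health["number_of_data_nodes"]
    if status == "green":
        return "green: all primary and replica shards are assigned"
    if status == "yellow":
        # Yellow means every primary is assigned, so the unassigned shards are replicas.
        msg = f"yellow: all primaries are active, but {unassigned} replica shard(s) are unassigned"
        if data_nodes == 1:
            msg += "; with a single data node there is nowhere to place a replica"
        return msg
    return f"red: at least one primary shard is unassigned ({unassigned} unassigned in total)"


# Values taken from the response shown above.
health = {"status": "yellow", "number_of_data_nodes": 1, "unassigned_shards": 2}
print(explain_health(health))
```

The single-node hint is the part that matters for this case: one data node plus unassigned replicas already points at the replica setting rather than at disk space.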
Then I called GET /_cluster/allocation/explain to find out why the shard is unassigned, and got:
{
  "index": "user",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "REPLICA_ADDED",
    "at": "2024-08-14T03:54:24.668Z",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "kVueuBHqQ9uazalV9JNEfA",
      "node_name": "b029464c5838",
      "transport_address": "172.19.0.2:9300",
      "node_attributes": {
        "ml.machine_memory": "6067675136",
        "xpack.installed": "true",
        "transform.node": "true",
        "ml.max_open_jobs": "20",
        "ml.max_jvm_size": "536870912"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "same_shard",
          "decision": "NO",
          "explanation": "a copy of this shard is already allocated to this node [[user][0], node[kVueuBHqQ9uazalV9JNEfA], [P], s[STARTED], a[id=Jj5rl1j_ROGqw3Fn4tjJxQ]]"
        }
      ]
    }
  ]
}
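"primary": false: this is a replica shard, not a primary shard. Writes go to the primary shard first and are then replicated; replica shards spread the read load and provide high availability.
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes": the shard cannot be allocated because no node is allowed to take it.
"explanation": "a copy of this shard is already allocated to this node [[user][0], node[kVueuBHqQ9uazalV9JNEfA], [P], s[STARTED], a[id=Jj5rl1j_ROGqw3Fn4tjJxQ]]": this is the detailed reason from the same_shard decider. The primary copy ([P]) of this shard is already allocated to this node and is in the STARTED state, and Elasticsearch does not place two copies of the same shard on one node.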
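To read such a response programmatically, one could collect every decider that answered NO. The sketch below is an assumption-laden illustration: `blocking_deciders` is a hypothetical helper, and `sample` is an abridged copy of the response above (the explanation string is shortened).

```python
def blocking_deciders(explain: dict) -> list:
    """Return (node_name, decider, explanation) for every NO decision.

    `explain` is assumed to be the parsed JSON of GET /_cluster/allocation/explain.
    """
    blocked = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider["decision"] == "NO":
                blocked.append((node["node_name"], decider["decider"], decider["explanation"]))
    return blocked


# Abridged from the response shown above.
sample = {
    "index": "user",
    "shard": 0,
    "primary": False,
    "node_allocation_decisions": [
        {
            "node_name": "b029464c5838",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "same_shard",
                    "decision": "NO",
                    "explanation": "a copy of this shard is already allocated to this node",
                }
            ],
        }
    ],
}

for node_name, decider, why in blocking_deciders(sample):
    print(f"{node_name}: {decider} -> {why}")
```

On a cluster with several nodes this gives one line per node and blocking decider, which makes it easier to see whether every node rejects the shard for the same reason.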
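In other words, the only node in the cluster already holds the started primary of this shard, so the replica has no other node to go to. Next, check the settings of the user index: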
GET /user/_settings
{
  "user": {
    "settings": {
      "index": {
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_content"
            }
          }
        },
        "number_of_shards": "1",
        "provided_name": "user",
        "creation_date": "1723366659772",
        "number_of_replicas": "1",
        "uuid": "JFWB2NceRouaAEAum7ZTzg",
        "version": {
          "created": "7120199"
        }
      }
    }
  }
}
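**"number_of_replicas": "1"** means the index is configured with one replica of each primary shard.
I run only a single ES node, and the primary shard already lives on it, so this replica can never be allocated to the same node. The fix is to set number_of_replicas to 0: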
PUT /user/_settings
{
  "number_of_replicas": 0
}
After the change, the index status turned green.
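The same change can also be sent from a script. The following is a minimal sketch using only the Python standard library; the address `http://localhost:9200`, the absence of authentication, and the helper names `build_replica_request` and `max_assignable_replicas` are all assumptions made for illustration. The request is only built here, not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local single node, no authentication


def max_assignable_replicas(number_of_data_nodes: int) -> int:
    """Each copy of a shard needs its own node, so replicas <= data nodes - 1."""
    return max(number_of_data_nodes - 1, 0)


def build_replica_request(index: str, replicas: int, base_url: str = ES_URL) -> urllib.request.Request:
    """Build (but do not send) PUT /<index>/_settings with a new replica count."""
    body = json.dumps({"number_of_replicas": replicas}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# One data node, as reported by number_of_data_nodes in the health response.
req = build_replica_request("user", max_assignable_replicas(1))
print(req.get_method(), req.full_url, req.data.decode("utf-8"))

# To apply it against a running cluster, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Setting replicas to 0 is reasonable for a single-node development setup; if more data nodes are added later, the replica count can be raised again in the same way.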