Resolving the master_not_discovered_exception error in an Elasticsearch cluster

Latest recommended article published on 2023-05-09 16:49:31

外星喵

Article link: https://blog.csdn.net/c15158032319/article/details/125424331

Error description

Querying the cluster health returns the following error:

{
	"error": {
		"root_cause": [{
			"type": "master_not_discovered_exception",
			"reason": null
		}],
		"type": "master_not_discovered_exception",
		"reason": null
	},
	"status": 503
}

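I started Elasticsearch with docker commands on three separate machines. Each node could be reached over the network on its own, but the nodes could not communicate with each other, so the master election failed and no master node was discovered.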
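To tell this failure apart from other 503 responses in a script, the error body shown above can be inspected directly. The following is only a minimal sketch: the function name `is_master_not_discovered` is made up for illustration, and the sample string is simply the response from this article.

```python
import json


def is_master_not_discovered(body: str) -> bool:
    """Return True if an Elasticsearch error body reports master_not_discovered_exception."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all, so it cannot be the error we are looking for.
        return False
    if not isinstance(payload, dict):
        return False
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    return error.get("type") == "master_not_discovered_exception"


# The error response from this article, condensed onto fewer lines.
sample = (
    '{"error": {"root_cause": [{"type": "master_not_discovered_exception", '
    '"reason": null}], "type": "master_not_discovered_exception", '
    '"reason": null}, "status": 503}'
)
print(is_master_not_discovered(sample))  # prints True
```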

Troubleshooting

The Elasticsearch log showed:

java.net.NoRouteToHostException: No route to host (Host unreachable)

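The key setting here is network.publish_host. It is the address a node publishes to the other nodes, that is, the address the other nodes must be able to reach when the node joins the cluster.

The firewall is one possible cause:

service iptables status  # check the firewall status

If the firewall is still running, stop and disable it with: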

systemctl stop firewalld
systemctl disable firewalld
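Before and after changing the firewall, it can help to confirm whether the transport port is reachable at all from each machine. Here is a rough sketch using plain TCP connections; the helper names `can_connect` and `check_seed_hosts` are hypothetical, and port 9300 is taken from the configuration in this article.

```python
import socket


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Try a plain TCP connection; True only means something accepted it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "No route to host".
        return False


def check_seed_hosts(hosts, port: int = 9300, timeout: float = 3.0) -> dict:
    """Map each host to whether its transport port accepted a connection."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling `check_seed_hosts(["192.168.2.90", "192.168.2.91", "192.168.2.92"])` on each of the three machines shows which pairs cannot reach each other. A successful connection only proves the port is open; it does not prove that the node publishes that same address to the cluster.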

Stopping the firewall did not help. Here is the Elasticsearch configuration of the node at the time of the error:

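# CORS settings
http.cors.enabled: true
http.cors.allow-origin: "*"
# Cluster name
cluster.name: elasticsearch-cluster
# Node name
node.name: node-1
# Whether this node is eligible to be elected master
node.master: true
# Whether this node stores data
node.data: true
# Maximum number of nodes allowed to share the same data path on one machine
node.max_local_storage_nodes: 3
# Network address
network.host: 0.0.0.0
# HTTP port
http.port: 9200
# Port for communication between nodes
transport.tcp.port: 9300
# Added in ES 7.x: addresses of the master-eligible nodes, used to discover the cluster at startup
discovery.seed_hosts: ["192.168.2.90","192.168.2.91","192.168.2.92"]
# Added in ES 7.x: required when bootstrapping a new cluster, to elect the first master
cluster.initial_master_nodes: ["node-1", "node-2","node-3"]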

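What finally solved the problem was adding the setting below on every node, each with its own IP. A likely explanation, since the nodes run in Docker containers: with network.host: 0.0.0.0 and no explicit publish address, a node advertises one of the container's own addresses, which the other machines cannot route to. That would match the No route to host error in the log.

# IP address this node advertises to the other nodes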
network.publish_host: 192.168.2.90
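To verify which address a node actually publishes, one option is the node info API (`GET /_nodes/transport`), which reports a `publish_address` per node. The sketch below only inspects an already parsed response and assumes each node entry has the shape `{"name": ..., "transport": {"publish_address": "ip:port"}}`. The function names and the sample data are invented for illustration, and `172.17.0.2` is only an example of a container-internal address, not a value from the original cluster.

```python
import ipaddress


def published_host(publish_address: str) -> str:
    """Extract the host from 'ip:port', '[ipv6]:port' or 'name/ip:port'."""
    address = publish_address.split("/")[-1]
    host = address.rsplit(":", 1)[0]
    return host.strip("[]")


def mismatched_publish_hosts(nodes_info: dict, expected_hosts: list) -> dict:
    """Return {node name: published host} for nodes publishing an unexpected address."""
    expected = {ipaddress.ip_address(h) for h in expected_hosts}
    mismatches = {}
    for node in nodes_info.get("nodes", {}).values():
        publish = node.get("transport", {}).get("publish_address", "")
        host = published_host(publish)
        try:
            ok = ipaddress.ip_address(host) in expected
        except ValueError:
            # Missing or unparsable address counts as a mismatch.
            ok = False
        if not ok:
            mismatches[node.get("name", "?")] = host
    return mismatches


# Illustrative data only: node-1 publishes a container-internal address.
sample = {
    "nodes": {
        "id-1": {"name": "node-1", "transport": {"publish_address": "172.17.0.2:9300"}},
        "id-2": {"name": "node-2", "transport": {"publish_address": "192.168.2.91:9300"}},
    }
}
seed_hosts = ["192.168.2.90", "192.168.2.91", "192.168.2.92"]
print(mismatched_publish_hosts(sample, seed_hosts))  # prints {'node-1': '172.17.0.2'}
```

Any node listed in the result publishes an address that is not one of the seed hosts, which is a hint that network.publish_host needs to be set on that node.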

Correct result

{
  "cluster_name" : "elasticsearch-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 3,
  "active_shards" : 6,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
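A deployment script could check the same fields automatically instead of reading the response by eye. The following is a small sketch; the function name `cluster_is_ready` and the choice of fields to check are my own assumptions about what "healthy" should mean here, using only fields that appear in the response above.

```python
def cluster_is_ready(health: dict, expected_nodes: int) -> bool:
    """Check a parsed cluster health response for a fully formed, green cluster."""
    return (
        health.get("status") == "green"
        and health.get("timed_out") is False
        and health.get("number_of_nodes") == expected_nodes
        and health.get("unassigned_shards") == 0
    )


# A subset of the health response shown in this article.
healthy = {
    "cluster_name": "elasticsearch-cluster",
    "status": "green",
    "timed_out": False,
    "number_of_nodes": 3,
    "number_of_data_nodes": 3,
    "unassigned_shards": 0,
}
print(cluster_is_ready(healthy, expected_nodes=3))  # prints True
```

A node count lower than expected is the case to watch for: a single node can report a status of its own while the other nodes never joined.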

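How it works

The network.bind_host setting controls which address the node's network components bind to. With network.host: 0.0.0.0, as in the configuration above, the node binds to every local interface (0.0.0.0 for IPv4, :: for IPv6). Unless they are set separately, network.host sets both network.bind_host and network.publish_host to the same value.

Consider a machine with two interfaces: the internal IP (eth1) is used by the Elasticsearch nodes to discover and talk to each other, while the external IP (eth0) is the address that a web application on another network sends its requests to. bind_host (0.0.0.0 in my example, meaning all interfaces) is where Elasticsearch listens, and publish_host (the internal IP in this example) is the address Elasticsearch advertises to the other members of the cluster.

This way, clients can reach the ES cluster through the bind_host address, while Elasticsearch uses the publish_host address for communication inside the cluster.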