Note: for cluster stability, an Elasticsearch cluster should keep its master-eligible nodes separate from its data nodes wherever possible, i.e. the master node should not also store data.
Cluster environment setup
The setup is the same as the single-node install described earlier; only the configuration differs. So we just distribute the Elasticsearch directory from one machine to the other nodes and change the key settings in each node's elasticsearch.yml.
Distributing to the other nodes
This assumes the four machines have already been configured for passwordless SSH login to one another.
scp -r elasticsearch-7.8.0 node02:`pwd`
scp -r elasticsearch-7.8.0 node03:`pwd`
scp -r elasticsearch-7.8.0 node04:`pwd`
Per-node configuration
node01:
# Cluster name
cluster.name: my-es-cluster
# Node name
node.name: node-1
# Bind address
network.host: 192.168.221.66
# Hosts of the master-eligible nodes used for discovery
discovery.seed_hosts: ["node01"]
# Master nodes to elect at first cluster bootstrap (must match node.name exactly)
cluster.initial_master_nodes: ["node-1"]
# Master-eligible?
node.master: true
# Data node?
node.data: false
node02:
# Cluster name
cluster.name: my-es-cluster
# Node name
node.name: node-2
# Bind address
network.host: 192.168.221.68
# Hosts of the master-eligible nodes used for discovery
discovery.seed_hosts: ["node01"]
# Master nodes to elect at first cluster bootstrap (must match node.name exactly)
cluster.initial_master_nodes: ["node-1"]
# Master-eligible?
node.master: false
# Data node?
node.data: true
node03:
# Cluster name
cluster.name: my-es-cluster
# Node name
node.name: node-3
# Bind address
network.host: 192.168.221.70
# Hosts of the master-eligible nodes used for discovery
discovery.seed_hosts: ["node01"]
# Master nodes to elect at first cluster bootstrap (must match node.name exactly)
cluster.initial_master_nodes: ["node-1"]
# Master-eligible?
node.master: false
# Data node?
node.data: true
node04:
# Cluster name
cluster.name: my-es-cluster
# Node name
node.name: node-4
# Bind address
network.host: 192.168.221.72
# Hosts of the master-eligible nodes used for discovery
discovery.seed_hosts: ["node01"]
# Master nodes to elect at first cluster bootstrap (must match node.name exactly)
cluster.initial_master_nodes: ["node-1"]
# Master-eligible?
node.master: false
# Data node?
node.data: true
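The four files above differ only in node.name, network.host, and the two role flags, so rendering them from a template avoids copy-paste mistakes. The script below is a small sketch; the node names, IPs, and roles are exactly the ones used in this example:

```python
# Render the per-node elasticsearch.yml fragments used above.
NODES = [
    # (node.name, network.host, master-eligible, data node)
    ("node-1", "192.168.221.66", True, False),
    ("node-2", "192.168.221.68", False, True),
    ("node-3", "192.168.221.70", False, True),
    ("node-4", "192.168.221.72", False, True),
]

TEMPLATE = """cluster.name: my-es-cluster
node.name: {name}
network.host: {host}
discovery.seed_hosts: ["node01"]
cluster.initial_master_nodes: ["node-1"]
node.master: {master}
node.data: {data}
"""

def render(name, host, master, data):
    # YAML booleans are lower-case true/false
    return TEMPLATE.format(name=name, host=host,
                           master=str(master).lower(),
                           data=str(data).lower())

for node in NODES:
    print(render(*node))
```

Note that cluster.initial_master_nodes is the same on every node and lists node.name values ("node-1"), while discovery.seed_hosts lists reachable hostnames ("node01"); the two settings deliberately use different identifiers.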
Starting each ES instance
When starting the nodes, you may hit an error saying the node already exists:
org.elasticsearch.transport.RemoteTransportException: [node-1][192.168.221.66:9300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalArgumentException: can't add node {node-2}{5pHIl5zLRZqap07Gh2ykBA}{iY5QeDlsSB6b9PZabxVVNw}
{192.168.221.68}{192.168.221.68:9300}{dilrt}{ml.machine_memory=1952313344, ml.max_open_jobs=20, xpack.installed=true, transform.node=true},
found existing node {node-1}{5pHIl5zLRZqap07Gh2ykBA}{9UnlM15gS4G43NRJF2s2JQ}{192.168.221.66}{192.168.221.66:9300}{ilmr}
{ml.machine_memory=1952313344, xpack.installed=true, transform.node=false, ml.max_open_jobs=20} with the same id but is a different node instance
at org.elasticsearch.cluster.node.DiscoveryNodes$Builder.add(DiscoveryNodes.java:619) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.coordination.JoinTaskExecutor.execute(JoinTaskExecutor.java:152) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.coordination.JoinHelper$1.execute(JoinHelper.java:123) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:702) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:324) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:219) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:73) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:636) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252) ~[elasticsearch-7.8.0.jar:7.8.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215) ~[elasticsearch-7.8.0.jar:7.8.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_211]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_211]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
The cause is that we distributed the same copy of the Elasticsearch files to every machine, so each node's data directory already contains the original node's identity (the node ID). Delete the data directory on node02, node03, and node04 so that each node regenerates its own ID on the next start.
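On node02, node03, and node04, stop the node and wipe the copied data directory. A minimal sketch, assuming the install lives at /opt/elasticsearch-7.8.0 (adjust ES_HOME to your actual layout):

```shell
# ES_HOME is an assumed install path; adjust to your layout.
ES_HOME=${ES_HOME:-/opt/elasticsearch-7.8.0}
# The data directory carries the node ID copied over from node01;
# removing it lets this node mint a fresh ID on the next start.
rm -rf "$ES_HOME/data"
echo "cleared $ES_HOME/data"
```

Run this on the non-master nodes only; node01 keeps its original data directory.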
Verifying the cluster
Open each node's REST port (9200) in a browser; all of them respond successfully.
Done!
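Besides the browser check, you can query GET /_cluster/health on any node. The helper below only inspects the response shape; the field names (status, number_of_nodes) are the standard Elasticsearch health fields, and the expected node count of 4 matches this example:

```python
def cluster_ready(health: dict, expected_nodes: int = 4) -> bool:
    """True if every node has joined and the cluster is not red."""
    return (health.get("number_of_nodes") == expected_nodes
            and health.get("status") in ("green", "yellow"))

# A response from GET http://node01:9200/_cluster/health might look like:
sample = {"cluster_name": "my-es-cluster", "status": "green",
          "number_of_nodes": 4, "number_of_data_nodes": 3}
print(cluster_ready(sample))  # True
```

With one dedicated master and three data nodes, number_of_nodes should be 4 and number_of_data_nodes should be 3 once every node has joined.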