I had already finished setting up the Elasticsearch cluster earlier, but then I accidentally wiped it out with `rm -rf` under /var/lib. I had only meant to delete a single folder under /var/lib, but I deleted the wrong thing and the whole cluster was gone. Fortunately the cluster only held test data, but rebuilding it still took a fair bit of effort. I discuss ways to guard against `rm -rf` accidents in another post (link to be added), which you may find useful as a reference.
I described the steps for setting up an Elasticsearch cluster in this post: https://blog.csdn.net/m0_49984184/article/details/108181812 (building an EFK log-collection system on a Kubernetes cluster). Here is what happened (I like to explain the background in my posts, because it helps trace the cause and effect of an error, and makes it easier for readers to compare against their own situation): after installing the cluster following my earlier steps, I checked port 9200 on each node in the cluster, and this is what was returned:
At first I thought this meant the es service had started successfully, but it turned out this only proves that the node's elasticsearch service started for the moment (meaning port 9200 is now occupied by the elasticsearch process, while the startup log may still be reporting errors). After starting elasticsearch on every node this way, I checked the node information of the es cluster:
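To make explicit what the port-9200 response does and does not prove, here is a minimal sketch. The response body below is a canned sample standing in for a live node (its field values are assumptions); in practice you would run `curl -s http://<node-ip>:9200/` against each node.

```shell
# A single node's root endpoint returns this even when the node has NOT
# joined any other node, so it only proves the elasticsearch process is
# up and bound to the port -- nothing about cluster formation.
cat > root.json <<'EOF'
{
  "name" : "node-a",
  "cluster_name" : "es-cluster",
  "tagline" : "You Know, for Search"
}
EOF
# Extract the node name from the response.
node_name=$(sed -n 's/.*"name" *: *"\([^"]*\)".*/\1/p' root.json)
echo "process responding as: $node_name"
```

The same response shape comes back from a node that has formed its own one-node cluster, which is exactly the trap described above.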
This returned a master_not_discovered_exception, i.e. no master node could be discovered in the es cluster. Note that the cluster only counts as successfully deployed when `curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'` reports a healthy status on every node, not when the browser can reach ip:9200. A successful response on a node's port 9200 only shows that that node's es (elasticsearch) service started; it does not show that the es cluster itself was deployed successfully, because all nodes must discover each other and join the cluster before the setup is actually complete.
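As a minimal sketch of that health check, the sample response below is canned so the extraction can be demonstrated (its field values are assumptions); in practice you would curl each node's address instead of using the heredoc.

```shell
# In practice, capture one node's health first, e.g.:
#   curl -s 'http://localhost:9200/_cluster/health?pretty=true' > health.json
# Here a canned sample (assumed values) stands in for the live response.
cat > health.json <<'EOF'
{
  "cluster_name" : "es-cluster",
  "status" : "green",
  "number_of_nodes" : 3
}
EOF

# The deployment is only complete when status is green (or yellow) AND
# number_of_nodes equals the number of nodes you actually configured.
status=$(grep '"status"' health.json | sed 's/.*: *"\([a-z]*\)".*/\1/')
nodes=$(grep '"number_of_nodes"' health.json | sed 's/[^0-9]*//g')
echo "status=$status number_of_nodes=$nodes"
```

Repeat this on every node; if any node reports a different cluster_name or a smaller node count than expected, that node has not joined the cluster.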
I checked the elasticsearch startup log on one of the nodes. By the look of it, the elasticsearch service had indeed started, which is why my earlier requests to port 9200 successfully returned the cluster-related information.
But after a while, errors started showing up in the log. The error is as follows:
Error message (screenshot):
Error message (text):
{
"type": "server", "timestamp": "2020-08-27T06:30:51,828Z", "level": "WARN", "component": "o.e.c.c.Coordinator", "cluster.name": "es-cluster", "node.name": "node-a", "message": "failed to validate incoming join request from node [{node-c}{YmHns882QzChKkOGKR0Rcw}{hNqhxLpVTkGqxWHtMXXcFQ}{10.24.2.222}{10.24.2.222:9300}{dilm}{ml.machine_memory=8127139840, ml.max_open_jobs=20, xpack.installed=true}]", "cluster.uuid": "qdaqityTTEKuXwA4cRCclw", "node.id": "FR7TLa4TRbePpKFrF4FVtg" ,
"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [node-c][172.17.0.2:9300][internal:cluster/coordination/join/validate]",
"Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid qdaqityTTEKuXwA4cRCclw than local cluster uuid cWcEtWIISVuWM700WsmBZQ, rejecting",
"at org.elasticsearch.cluster.coordination.JoinHelper.lambda$new$4(JoinHelper.java:148) ~[elasticsearch-7.5.0.jar:7.5.0]",
"at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler$1.doRun(SecurityServerTransportInterceptor.java:257) ~[?:?]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.5.0.jar:7.5.0]",
"at org.elasticsearch.xpack.security.transport.SecurityServerTransportInterceptor$ProfileSecuredRequestHandler.messageReceived(SecurityServerTransportInterceptor.java:315) ~[?:?]",
"at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.5.0.jar:7.5.0]",
"at org.elasticsearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:264) ~[elasticsearch-7.5.0.jar:7.5.0]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:773) ~[elasticsearch-7.5.0.jar:7.5.0]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.5.0.jar:7.5.0]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",
"at java.lang.Thread.run(Thread.java:830) [?:?]"] }