Fixing the "could not find leader nimbus" exception on a Storm 1.1.1 cluster

Applications and versions

  • Storm 1.1.1
  • ZooKeeper 3.4.10

Cluster layout

  • nimbus, ui: CentOS65App
  • supervisor: CentOS65M1, CentOS65M2, CentOS65M3

Configuration files

  • zoo.cfg (ZooKeeper is installed only on the three supervisor servers)

server.6=CentOS65M1:2888:3888
server.7=CentOS65M2:2888:3888
server.8=CentOS65M3:2888:3888
  • storm.yaml (identical on all four servers)

storm.zookeeper.servers:
     - "CentOS65M1"
     - "CentOS65M2"
     - "CentOS65M3"

nimbus.seeds: ["CentOS65App"]

storm.zookeeper.port: 2181
storm.local.dir: "/app/data"
ui.port: 9099

supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703

Start the services in this order: zookeeper, nimbus, ui, supervisor.

The Storm UI then opens normally in a browser:

http://centos65app:9099/index.html

As long as no topology is deployed, the cluster looks healthy. As soon as a topology is deployed, the supervisors throw an exception.

./bin/storm jar /opt/storm_jars/storm-starter-1.1.1.jar org.apache.storm.starter.WordCountTopology WordCountTopology
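After a minute or two, the Storm UI shows the Supervisors count dropping from 3 to 0. Logging in to each of the three supervisor servers and running jps confirms that the supervisor process is indeed gone. The file [storm_home]/logs/supervisor.log on the supervisor servers contains the following exception: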

2018-01-17 11:18:45.329 o.a.s.d.s.Slot main [WARN] SLOT CentOS65M1:6700 Starting in state EMPTY - assignment null
2018-01-17 11:18:45.330 o.a.s.d.s.Slot main [WARN] SLOT CentOS65M1:6701 Starting in state EMPTY - assignment null
2018-01-17 11:18:45.331 o.a.s.d.s.Slot main [WARN] SLOT CentOS65M1:6702 Starting in state EMPTY - assignment null
2018-01-17 11:18:45.331 o.a.s.d.s.Slot main [WARN] SLOT CentOS65M1:6703 Starting in state EMPTY - assignment null
2018-01-17 11:18:45.337 o.a.s.l.AsyncLocalizer main [INFO] Cleaning up unused topologies in /app/data/supervisor/stormdist
2018-01-17 11:18:45.344 o.a.s.d.s.Supervisor main [INFO] Starting supervisor with id 7cb2d4a8-a09c-4bb3-9449-16cb1d6b96da at host CentOS65M1.
2018-01-17 11:18:45.354 o.a.s.d.m.MetricsUtils main [INFO] Using statistics reporter plugin:org.apache.storm.daemon.metrics.reporters.JmxPreparableReporter
2018-01-17 11:18:45.356 o.a.s.d.m.r.JmxPreparableReporter main [INFO] Preparing...
2018-01-17 11:18:45.368 o.a.s.m.StormMetricsRegistry main [INFO] Started statistics report plugin...
2018-01-17 11:18:47.359 o.a.s.d.s.Slot SLOT_6701 [INFO] STATE EMPTY msInState: 2034 -> WAITING_FOR_BASIC_LOCALIZATION msInState: 6
2018-01-17 11:18:47.360 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE EMPTY msInState: 2031 -> WAITING_FOR_BASIC_LOCALIZATION msInState: 0
2018-01-17 11:18:47.476 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2001ms (NOT MAX)
2018-01-17 11:18:49.480 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2002ms (NOT MAX)
2018-01-17 11:18:51.485 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2006ms (NOT MAX)
2018-01-17 11:18:53.493 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2015ms (NOT MAX)
2018-01-17 11:18:55.510 o.a.s.u.StormBoundedExponentialBackoffRetry Async Localizer [WARN] WILL SLEEP FOR 2024ms (NOT MAX)
2018-01-17 11:18:57.537 o.a.s.u.NimbusClient Async Localizer [WARN] Ignoring exception while trying to get leader nimbus info from CentOS65App. will retry with a different seed host.
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.NoRouteToHostException: 没有到主机的路由 (Host unreachable)
        at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:108) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:69) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:127) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:83) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:57) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.1.jar:1.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.NoRouteToHostException: 没有到主机的路由 (Host unreachable)
        at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.1.jar:1.1.1]
        ... 13 more
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.NoRouteToHostException: 没有到主机的路由 (Host unreachable)
        at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.1.jar:1.1.1]
        ... 13 more
Caused by: java.net.NoRouteToHostException: 没有到主机的路由 (Host unreachable)
        at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_121]
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_121]
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_121]
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_121]
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_121]
        at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_121]
        at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:81) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:100) ~[storm-core-1.1.1.jar:1.1.1]
        ... 13 more
2018-01-17 11:18:57.537 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Failed to download basic resources for topology-id WordCountTopology-1-1516157915
2018-01-17 11:18:57.537 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /app/data/supervisor/tmp/27e1b9d2-9993-43d1-8d1d-9156b96a8bd2
2018-01-17 11:18:57.543 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /app/data/supervisor/stormdist/WordCountTopology-1-1516157915
2018-01-17 11:18:57.543 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Caught Exception While Downloading (rethrowing)...
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [CentOS65App]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
        at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:111) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:57) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.1.jar:1.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
2018-01-17 11:18:57.544 o.a.s.d.s.Slot SLOT_6700 [ERROR] Error when processing event
java.util.concurrent.ExecutionException: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [CentOS65App]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
        at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_121]
        at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_121]
        at org.apache.storm.localizer.LocalDownloadedResource$NoCancelFuture.get(LocalDownloadedResource.java:63) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBasicLocalization(Slot.java:413) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:273) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:741) ~[storm-core-1.1.1.jar:1.1.1]
Caused by: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [CentOS65App]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
        at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:111) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:57) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.1.jar:1.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
2018-01-17 11:18:57.544 o.a.s.u.Utils SLOT_6700 [ERROR] Halting process: Error when processing an event
java.lang.RuntimeException: Halting process: Error when processing an event
        at org.apache.storm.utils.Utils.exitProcess(Utils.java:1773) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:774) ~[storm-core-1.1.1.jar:1.1.1]
2018-01-17 11:18:57.544 o.a.s.d.s.Slot SLOT_6701 [ERROR] Error when processing event
java.util.concurrent.ExecutionException: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [CentOS65App]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
        at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_121]
        at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_121]
        at org.apache.storm.localizer.LocalDownloadedResource$NoCancelFuture.get(LocalDownloadedResource.java:63) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.handleWaitingForBasicLocalization(Slot.java:413) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:273) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:741) ~[storm-core-1.1.1.jar:1.1.1]
Caused by: org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [CentOS65App]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
        at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:111) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:57) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:538) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.1.jar:1.1.1]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]
2018-01-17 11:18:57.544 o.a.s.u.Utils SLOT_6701 [ERROR] Halting process: Error when processing an event
java.lang.RuntimeException: Halting process: Error when processing an event
        at org.apache.storm.utils.Utils.exitProcess(Utils.java:1773) ~[storm-core-1.1.1.jar:1.1.1]
        at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:774) ~[storm-core-1.1.1.jar:1.1.1]
2018-01-17 11:18:57.562 o.a.s.d.s.Supervisor Thread-5 [INFO] Shutting down supervisor 7cb2d4a8-a09c-4bb3-9449-16cb1d6b96da
2018-01-17 11:18:57.565 o.a.s.e.EventManagerImp Thread-4 [INFO] Event manager interrupted
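The log above nests several exceptions, and the innermost "Caused by:" line is the one that matters: java.net.NoRouteToHostException (the Chinese text in the message is the JVM's localized "No route to host"). That is a network-level failure between the supervisor and CentOS65App, not a mistake in nimbus.seeds. As a minimal sketch, assuming a supervisor.log in the format shown above, the following Python helper pulls out the innermost cause of each stack trace. The function name root_causes is made up for this illustration and is not part of Storm.

```python
def root_causes(log_text):
    """Return the innermost 'Caused by:' message of each stack trace.

    Assumes the layout shown in the log above: frames start with 'at ',
    elided frames start with '...', and any other line ends the trace.
    """
    causes = []
    last = None  # most recent 'Caused by:' seen in the current trace
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            last = stripped[len("Caused by:"):].strip()
        elif stripped.startswith("at ") or stripped.startswith("..."):
            continue  # stack frame, still inside the same trace
        elif last is not None:
            # a new log line or exception header closes the trace
            causes.append(last)
            last = None
    if last is not None:
        causes.append(last)
    return causes
```

Calling root_causes(open("logs/supervisor.log", encoding="utf-8").read()) on the supervisor host would list one entry per trace that has a "Caused by:" chain; traces without one are skipped.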
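There are four possible fixes; try them in order, depending on your situation:
  • ping the nimbus host. If it does not respond, check /etc/hosts on all four servers (mine listed both an office IP and a home IP for the same hosts; I changed it to a single IP per host)
  • In ZooKeeper, delete the storm node
./bin/zkCli.sh
ls /
rmr /storm
  • Turn off the firewall
  • Or keep the firewall on and open only the required ports
Log in to the Storm UI; the Nimbus Summary shows that nimbus uses port 6627. Open this port in the firewall (9099, 2181, 6700 and the other ports had already been opened; the nimbus port has to be opened separately)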
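
A blocked port can be confirmed before touching the firewall. On CentOS, a firewall rule that rejects traffic with icmp-host-prohibited commonly surfaces in Java as "No route to host" even when ping works, which matches the trace in supervisor.log. The following is a minimal Python sketch, not a Storm tool: the host names and ports come from the configuration in this post, while PORTS_TO_CHECK, can_connect and report are names invented for the example. Run it from a supervisor server.

```python
import socket

# Hosts and ports from the storm.yaml in this post, plus the nimbus
# port 6627 shown in the Storm UI. Adjust to your own cluster.
PORTS_TO_CHECK = {
    "CentOS65App": [6627, 9099],  # nimbus, ui
    "CentOS65M1": [2181, 6700, 6701, 6702, 6703],  # zookeeper, worker slots
    "CentOS65M2": [2181, 6700, 6701, 6702, 6703],
    "CentOS65M3": [2181, 6700, 6701, 6702, 6703],
}


def can_connect(host, port, timeout=3.0):
    """Try a TCP connection; return (True, "") or (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:  # covers refused, timed out, no route, DNS errors
        return False, str(exc)


def report(targets, timeout=3.0):
    """Check every (host, port) pair and return a dict of results."""
    results = {}
    for host, ports in targets.items():
        for port in ports:
            results[(host, port)] = can_connect(host, port, timeout)
    return results


if __name__ == "__main__":
    for (host, port), (ok, reason) in report(PORTS_TO_CHECK).items():
        print(f"{host}:{port} {'open' if ok else 'FAILED: ' + reason}")
```

Worker slot ports only accept connections while a worker is running on them, so a failure on 6700 to 6703 with no topology deployed is expected. A failure on CentOS65App:6627 while nimbus is running points to the firewall or to /etc/hosts.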