Flink cluster error (Could not resolve ResourceManager address akka.tcp)

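The TaskManager and JobManager went down for no obvious reason.
The log file flink-flink-taskexecutor-0-node2.log reads as follows (拒绝连接 is the JVM's localized "Connection refused" message):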


2021-04-04 10:03:15,058 INFO  org.apache.flink.runtime.io.network.netty.NettyConfig         - NettyConfig [server address: /192.168.11.132, server port: 0, ssl enabled: false, memory segment size (bytes): 32768, transport type: NIO, number of server threads: 1 (manual), number of client threads: 1 (manual), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
2021-04-04 10:03:15,297 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerServices     - Temporary file directory '/tmp': total 26 GB, usable 22 GB (84.62% usable)
2021-04-04 10:03:16,123 INFO  org.apache.flink.runtime.io.network.buffer.NetworkBufferPool  - Allocated 102 MB for network buffer pool (number of memory segments: 3278, bytes per segment: 32768).
2021-04-04 10:03:16,197 INFO  org.apache.flink.runtime.io.network.NetworkEnvironment        - Starting the network environment and its components.
2021-04-04 10:03:16,252 INFO  org.apache.flink.runtime.io.network.netty.NettyClient         - Successful initialization (took 52 ms).
2021-04-04 10:03:16,309 INFO  org.apache.flink.runtime.io.network.netty.NettyServer         - Successful initialization (took 56 ms). Listening on SocketAddress /192.168.11.132:37718.
2021-04-04 10:03:16,310 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerServices     - Limiting managed memory to 0.7 of the currently free heap space (641 MB), memory will be allocated lazily.
2021-04-04 10:03:16,314 INFO  org.apache.flink.runtime.io.disk.iomanager.IOManager          - I/O manager uses directory /tmp/flink-io-5cb46d08-d7bd-41bb-91d0-e67a2ca8ab47 for spill files.
2021-04-04 10:03:16,409 INFO  org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration  - Messages have a max timeout of 10000 ms
2021-04-04 10:03:16,421 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at akka://flink/user/taskmanager_0 .
2021-04-04 10:03:16,438 INFO  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Starting ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
2021-04-04 10:03:16,439 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Start job leader service.
2021-04-04 10:03:16,441 INFO  org.apache.flink.runtime.filecache.FileCache                  - User file cache uses directory /tmp/flink-dist-cache-9bd42cb9-9f68-419a-9381-95693ff61ac5
2021-04-04 10:03:16,452 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Connecting to ResourceManager akka.tcp://flink@localhost:46715/user/resourcemanager(97844b5c0749ea747b4749fffa964081).
2021-04-04 10:03:16,570 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: 拒绝连接: localhost/127.0.0.1:46715
2021-04-04 10:03:16,577 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@localhost:46715] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@localhost:46715]] Caused by: [拒绝连接: localhost/127.0.0.1:46715]
2021-04-04 10:03:16,583 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@localhost:46715/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@localhost:46715/user/resourcemanager..
2021-04-04 10:03:26,617 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: 拒绝连接: localhost/127.0.0.1:46715
2021-04-04 10:03:26,623 WARN  akka.remote.ReliableDeliverySupervisor

......

2021-04-04 10:08:07,454 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: 拒绝连接: localhost/127.0.0.1:46715
2021-04-04 10:08:07,455 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@localhost:46715] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@localhost:46715]] Caused by: [拒绝连接: localhost/127.0.0.1:46715]
2021-04-04 10:08:07,456 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@localhost:46715/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@localhost:46715/user/resourcemanager..
2021-04-04 10:08:16,468 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor            - Fatal error occurred in TaskExecutor akka.tcp://flink@192.168.11.132:45382/user/taskmanager_0.
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
	at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1034)
	at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$3(TaskExecutor.java:1020)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:392)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:185)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:147)
	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
	at akka.actor.Actor.aroundReceive(Actor.scala:502)
	at akka.actor.Actor.aroundReceive$(Actor.scala:500)
	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
2021-04-04 10:08:16,472 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Fatal error occurred while executing the TaskManager. Shutting it down...
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
	at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1034)
	at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$3(TaskExecutor.java:1020)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:392)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:185)
	at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:147)
	at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
	at akka.actor.Actor.aroundReceive(Actor.scala:502)
	at akka.actor.Actor.aroundReceive$(Actor.scala:500)
	at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
	at akka.actor.ActorCell.invoke(ActorCell.scala:495)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
	at akka.dispatch.Mailbox.run(Mailbox.scala:224)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
2021-04-04 10:08:16,478 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Stopping TaskExecutor akka.tcp://flink@192.168.11.132:45382/user/taskmanager_0.
2021-04-04 10:08:16,478 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Stop job leader service.
2021-04-04 10:08:16,507 INFO  org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - Stopping ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
2021-04-04 10:08:16,507 INFO  org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager  - Shutting down TaskExecutorLocalStateStoresManager.
2021-04-04 10:08:16,514 INFO  org.apache.flink.runtime.io.disk.iomanager.IOManager          - I/O manager removed spill file directory /tmp/flink-io-5cb46d08-d7bd-41bb-91d0-e67a2ca8ab47
2021-04-04 10:08:16,514 INFO  org.apache.flink.runtime.io.network.NetworkEnvironment        - Shutting down the network environment and its components.
2021-04-04 10:08:16,515 INFO  org.apache.flink.runtime.io.network.netty.NettyClient         - Successful shutdown (took 0 ms).
2021-04-04 10:08:16,518 INFO  org.apache.flink.runtime.io.network.netty.NettyServer         - Successful shutdown (took 1 ms).
2021-04-04 10:08:16,532 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Stop job leader service.
2021-04-04 10:08:16,532 INFO  org.apache.flink.runtime.filecache.FileCache                  - removed file cache directory /tmp/flink-dist-cache-9bd42cb9-9f68-419a-9381-95693ff61ac5
2021-04-04 10:08:16,539 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Stopped TaskExecutor akka.tcp://flink@192.168.11.132:45382/user/taskmanager_0.
2021-04-04 10:08:16,540 INFO  org.apache.flink.runtime.blob.PermanentBlobCache              - Shutting down BLOB cache
2021-04-04 10:08:16,540 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down BLOB cache
2021-04-04 10:08:16,553 INFO  org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl  - backgroundOperationsLoop exiting
2021-04-04 10:08:16,565 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper  - Session: 0x10000007e9d0008 closed
2021-04-04 10:08:16,565 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2021-04-04 10:08:16,583 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2021-04-04 10:08:16,594 INFO  org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn  - EventThread shut down for session: 0x10000007e9d0008
2021-04-04 10:08:16,597 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2021-04-04 10:08:16,601 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2021-04-04 10:08:16,611 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2021-04-04 10:08:16,640 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2021-04-04 10:08:16,641 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2021-04-04 10:08:16,661 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
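
One detail worth pulling out of the log above: the TaskExecutor on node2 (192.168.11.132) is told to reach the ResourceManager at localhost:46715, so it dials its own loopback interface and the connection is refused. Below is a minimal sketch, in Python, of a helper that scans a TaskManager log for that line and flags loopback addresses. The function name and the regular expression are my own, written against the log format shown here; they are not part of Flink.

```python
import re

# Matches the TaskExecutor line that reports which ResourceManager address it is dialing.
# The pattern is an assumption based on the log format shown in this post.
RM_LINE = re.compile(
    r"Connecting to ResourceManager akka\.tcp://flink@([^:/\s]+):(\d+)/"
)

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def find_resource_manager_addresses(log_text):
    """Return (host, port, is_loopback) for every ResourceManager address in the log."""
    results = []
    for match in RM_LINE.finditer(log_text):
        host = match.group(1)
        port = int(match.group(2))
        results.append((host, port, host in LOOPBACK_HOSTS))
    return results


if __name__ == "__main__":
    import sys

    # Usage: python check_rm_address.py flink-flink-taskexecutor-0-node2.log
    with open(sys.argv[1], encoding="utf-8") as handle:
        for host, port, is_loopback in find_resource_manager_addresses(handle.read()):
            note = "loopback address, unreachable from other nodes" if is_loopback else "ok"
            print(f"{host}:{port} -> {note}")
```

If the address is flagged as loopback, the problem is on the side that published the address, not in the TaskManager that is retrying.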

Cause: the ZooKeeper setting in flink-conf.yaml was wrong. After correcting it to:

high-availability.zookeeper.quorum: node1:2181,node2:2181,node3:2181
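
Before restarting the cluster, it can help to confirm that every server listed in high-availability.zookeeper.quorum is actually reachable from each node. The following is a rough sketch, not a Flink tool: parse_quorum and is_reachable are hypothetical helper names, and the fallback to port 2181 assumes ZooKeeper's default client port when an entry omits one.

```python
import socket


def parse_quorum(quorum):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    servers = []
    for entry in quorum.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep:
            # No explicit port: fall back to ZooKeeper's default client port (assumption).
            host, port = entry, "2181"
        servers.append((host, int(port)))
    return servers


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    quorum = "node1:2181,node2:2181,node3:2181"
    for host, port in parse_quorum(quorum):
        state = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{host}:{port} is {state}")
```

This only checks that the TCP port accepts connections; it does not verify that the ZooKeeper ensemble itself is healthy.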

I also changed the permissions of the jars in the lib directory to 755, and after that everything worked.
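
For reference, here is a small sketch of the permission change in Python, equivalent to running chmod 755 on each jar by hand. The function name is hypothetical, and the location of the lib directory (typically under the Flink installation directory) is an assumption you should adjust to your setup.

```python
import stat
from pathlib import Path


def fix_jar_permissions(lib_dir):
    """Set mode 755 on every *.jar directly inside lib_dir; return the names changed."""
    changed = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        current = stat.S_IMODE(jar.stat().st_mode)
        if current != 0o755:
            jar.chmod(0o755)
            changed.append(jar.name)
    return changed


if __name__ == "__main__":
    import sys

    # Usage: python fix_jar_permissions.py /path/to/flink/lib
    for name in fix_jar_permissions(sys.argv[1]):
        print(f"set 755 on {name}")
```

Run it on every node, since each machine has its own copy of the lib directory.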

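In addition, I found that rebooting the virtual machines directly, or starting the TaskManagers on all three machines at the same time, can also produce the error above. My guess is that it happens when several TaskManagers start up too close together.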
