Setting up a three-node Flink 1.10 cluster in standalone mode

Prepare a Java environment (JDK 1.8 or later) on every machine in advance, and configure passwordless SSH login.

Cluster environment

flink1: 172.21.89.128 (jobmanager)
flink2: 172.21.89.129 (taskmanager)
flink3: 172.21.89.130 (taskmanager)

Configure Flink on flink1. The main files are flink-conf.yaml, masters, and slaves.

flink-conf.yaml:

jobmanager.rpc.address: flink1
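# Number of slots provided by each taskmanager machine
taskmanager.numberOfTaskSlots: 2
# Default parallelism
parallelism.default: 2
# Storage path for temporary files. It must be created in advance, otherwise starting the cluster fails.
io.tmp.dirs: /root/flink/tmp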
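
Because the directory named by io.tmp.dirs has to exist before startup, it can help to create it straight from the configuration file instead of typing the path by hand. The following is a minimal sketch, not part of the original setup: the function names tmp_dirs_from_conf and ensure_tmp_dirs are hypothetical, and it assumes that multiple directories, if any, are separated by ':' or ','.

```shell
#!/bin/sh
# Print the value of io.tmp.dirs from the given flink-conf.yaml.
# Commented-out lines are ignored; the last matching line wins.
tmp_dirs_from_conf() {
  sed -n 's/^[[:space:]]*io\.tmp\.dirs:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Create every directory listed in io.tmp.dirs on the local machine.
# Assumption: multiple directories are separated by ':' or ','.
# The body runs in a subshell so IFS and "set -f" do not leak out.
ensure_tmp_dirs() (
  dirs=$(tmp_dirs_from_conf "$1")
  [ -n "$dirs" ] || { echo "io.tmp.dirs not set in $1" >&2; exit 1; }
  set -f          # disable globbing while splitting the value
  IFS=':,'
  for dir in $dirs; do
    mkdir -p "$dir" && echo "ensured $dir"
  done
)
```

Run on each machine, for example as `ensure_tmp_dirs conf/flink-conf.yaml` from the Flink directory, this would create /root/flink/tmp for the configuration shown above.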

slaves:

flink2

After the configuration is done, copy the Flink directory to flink2 with scp -r ./flink-1.10.0 flink2:/opt. The cluster can then be started on flink1: bin/start-cluster.sh
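
With more than one worker, the copy step and the temp-directory step are easy to forget on one of the machines. The sketch below only prints the commands for review rather than running them. The function names list_workers and print_deploy_commands are hypothetical, and it assumes the slaves file holds one hostname per line with optional '#' comments.

```shell
#!/bin/sh
# List worker hostnames from a slaves file, skipping comments and blank lines.
list_workers() {
  sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e 's/^[[:space:]]*//' -e '/^$/d' "$1"
}

# Print (do not run) the commands that prepare each worker:
# create the temp directory, then copy the Flink directory.
# Arguments: slaves file, local Flink dir, remote parent dir, temp dir.
print_deploy_commands() (
  slaves=$1 flink_dir=$2 remote_parent=$3 tmp_dir=$4
  for host in $(list_workers "$slaves"); do
    echo "ssh $host mkdir -p $tmp_dir"
    echo "scp -r $flink_dir $host:$remote_parent"
  done
)
```

For the slaves file above, `print_deploy_commands conf/slaves ./flink-1.10.0 /opt /root/flink/tmp` prints one ssh line and one scp line for flink2. After checking the output, it can be piped to `sh` to execute.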

In my case the temp directory /root/flink/tmp had been created only on flink1 and not on flink2, so at startup the jobmanager started successfully but the taskmanager failed:

2020-05-19 16:03:32,276 INFO  org.apache.flink.runtime.security.modules.HadoopModuleFactory  - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
2020-05-19 16:03:32,645 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - TaskManager initialization failed.
java.lang.Exception: unable to establish the security context
        at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:73)
        at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerSecurely(TaskManagerRunner.java:319)
        at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:287)
Caused by: java.lang.RuntimeException: unable to generate a JAAS configuration file
        at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:170)
        at org.apache.flink.runtime.security.modules.JaasModule.install(JaasModule.java:94)
        at org.apache.flink.runtime.security.SecurityUtils.install(SecurityUtils.java:67)
        ... 2 more
Caused by: java.nio.file.NoSuchFileException: /root/flink/tmp/jaas-7614411253117836328.conf
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.createFile(Files.java:632)
        at java.nio.file.TempFileHelper.create(TempFileHelper.java:138)
        at java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:161)
        at java.nio.file.Files.createTempFile(Files.java:852)
        at org.apache.flink.runtime.security.modules.JaasModule.generateDefaultConfigFile(JaasModule.java:163)
        ... 4 more

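Conveniently, this makes it possible to test adding a taskmanager dynamically. First create the temp directory /root/flink/tmp on flink2, then run ./taskmanager.sh start on flink2 (from Flink's bin directory) to add the taskmanager node dynamically.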
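
The two manual steps can be wrapped into one small helper so that the directory is always created before the taskmanager starts. This is only a sketch: add_taskmanager is a hypothetical name, and the Flink home path passed to it is whatever directory Flink was copied to on that machine.

```shell
#!/bin/sh
# Create the temp directory, then start a TaskManager from the given Flink home.
# Arguments: Flink home directory, temp directory (the io.tmp.dirs value).
# The body runs in a subshell, so a failure of mkdir stops before the start.
add_taskmanager() (
  flink_home=$1 tmp_dir=$2
  mkdir -p "$tmp_dir" || exit 1
  "$flink_home/bin/taskmanager.sh" start
)
```

On flink2 this might be called as `add_taskmanager /opt/flink-1.10.0 /root/flink/tmp`, assuming Flink ended up under /opt as in the scp command earlier.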

After startup, check the taskmanager log:

At this point the jobmanager log shows that this taskmanager registered successfully, and the ResourceID matches the one in the taskmanager log:

2020-05-19 08:17:54,099 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Registering TaskManager with ResourceID f6fad8c42b36967dabf400ffb52da4df (akka.tcp://flink@172.21.89.129:44783/user/taskmanager_0) at ResourceManager

flink3 is also added dynamically (it is not declared in the slaves configuration file). First create the temp directory /root/flink/tmp on flink3, then run ./taskmanager.sh start on flink3 to add the taskmanager node dynamically.

The jobmanager log now shows the successful registration of this taskmanager as well, again with a ResourceID that matches the taskmanager log:

2020-05-19 08:17:54,099 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Registering TaskManager with ResourceID f6fad8c42b36967dabf400ffb52da4df (akka.tcp://flink@172.21.89.129:44783/user/taskmanager_0) at ResourceManager
2020-05-19 08:46:03,676 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - Registering TaskManager with ResourceID e008aba19fdb81984b393f19a24df5f1 (akka.tcp://flink@172.21.89.130:36784/user/taskmanager_0) at ResourceManager
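
To compare ResourceIDs without reading the whole log, the registration lines can be reduced to an ID and an address. The sketch below relies only on the log line format shown above. The function name registered_taskmanagers is hypothetical, and the log file path in the usage note is an assumption about where the jobmanager log lives.

```shell
#!/bin/sh
# Print "<ResourceID> <host:port>" for every TaskManager registration
# found in a JobManager log file.
registered_taskmanagers() {
  sed -n 's|.*Registering TaskManager with ResourceID \([0-9a-f]*\) (akka\.tcp://flink@\([^/]*\)/.*|\1 \2|p' "$1"
}
```

For example, `registered_taskmanagers log/flink-root-standalonesession-0-flink1.log` (file name assumed) would list one line per registered taskmanager, which can then be matched against the ResourceID in each taskmanager's own log.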

The web UI shows the configuration and resource usage of the taskmanagers and the jobmanager.

 

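Summary:

  1. This article walked through setting up a non-high-availability three-node cluster in standalone mode.
  2. It also covered adding taskmanagers dynamically. Note that a dynamically added machine is not listed in the slaves file, so stop-cluster.sh does not shut it down properly. The log below is the taskmanager log on flink3 after running stop-cluster.sh: the taskmanager on flink3 fails to reach the ResourceManager on flink1 and eventually shuts itself down.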
2020-05-19 16:43:42,100 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Successful registration at resource manager akka.tcp://flink@flink1:6123/user/resourcemanager under registration id e09e4d6b3c15e36d26f7f6569cdecd76.
2020-05-19 18:05:56,030 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@flink1:6123] has failed, address is now gated for [50] ms. Reason: [Disassociated]
2020-05-19 18:11:25,751 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@flink1:6123/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink1:6123/user/resourcemanager..
2020-05-19 18:11:35,769 WARN  akka.remote.transport.netty.NettyTransport                    - Remote connection to [null] failed with java.net.ConnectException: 拒绝连接: flink1/172.21.89.128:6123
2020-05-19 18:11:35,770 WARN  akka.remote.ReliableDeliverySupervisor                        - Association with remote system [akka.tcp://flink@flink1:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@flink1:6123]] Caused by: [java.net.ConnectException: 拒绝连接: flink1/172.21.89.128:6123]
2020-05-19 18:11:35,771 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not resolve ResourceManager address akka.tcp://flink@flink1:6123/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@flink1:6123/user/resourcemanager..
2020-05-19 18:11:45,086 ERROR org.apache.flink.runtime.taskexecutor.TaskExecutor            - Fatal error occurred in TaskExecutor akka.tcp://flink@172.21.89.130:36784/user/taskmanager_0.
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1120)
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$9(TaskExecutor.java:1106)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
        at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
        at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at akka.actor.Actor.aroundReceive(Actor.scala:517)
        at akka.actor.Actor.aroundReceive$(Actor.scala:515)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2020-05-19 18:11:45,088 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner       - Fatal error occurred while executing the TaskManager. Shutting it down...
org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException: Could not register at the ResourceManager within the specified maximum registration duration 300000 ms. This indicates a problem with this instance. Terminating now.
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1120)
        at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$9(TaskExecutor.java:1106)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
        at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
        at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at akka.actor.Actor.aroundReceive(Actor.scala:517)
        at akka.actor.Actor.aroundReceive$(Actor.scala:515)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
        at akka.actor.ActorCell.invoke(ActorCell.scala:561)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
        at akka.dispatch.Mailbox.run(Mailbox.scala:225)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
        at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2020-05-19 18:11:45,093 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Stopping TaskExecutor akka.tcp://flink@172.21.89.130:36784/user/taskmanager_0.
2020-05-19 18:11:45,094 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Terminating registration attempts towards ResourceManager akka.tcp://flink@flink1:6123/user/resourcemanager.
2020-05-19 18:11:45,097 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Stop job leader service.
2020-05-19 18:11:45,097 INFO  org.apache.flink.runtime.state.TaskExecutorLocalStateStoresManager  - Shutting down TaskExecutorLocalStateStoresManager.
2020-05-19 18:11:45,108 INFO  org.apache.flink.runtime.io.disk.FileChannelManagerImpl       - FileChannelManager removed spill file directory /root/flink/tmp/flink-io-e748c002-6239-47e7-9dea-2901689d750b
2020-05-19 18:11:45,108 INFO  org.apache.flink.runtime.io.network.NettyShuffleEnvironment   - Shutting down the network environment and its components.
2020-05-19 18:11:45,109 INFO  org.apache.flink.runtime.io.network.netty.NettyClient         - Successful shutdown (took 1 ms).
2020-05-19 18:11:45,111 INFO  org.apache.flink.runtime.io.network.netty.NettyServer         - Successful shutdown (took 2 ms).
2020-05-19 18:11:45,113 INFO  org.apache.flink.runtime.io.disk.FileChannelManagerImpl       - FileChannelManager removed spill file directory /root/flink/tmp/flink-netty-shuffle-57c232aa-407c-4a78-9eaa-5d2a233b2f06
2020-05-19 18:11:45,114 INFO  org.apache.flink.runtime.taskexecutor.KvStateService          - Shutting down the kvState service and its components.
2020-05-19 18:11:45,114 INFO  org.apache.flink.runtime.taskexecutor.JobLeaderService        - Stop job leader service.
2020-05-19 18:11:45,114 INFO  org.apache.flink.runtime.filecache.FileCache                  - removed file cache directory /root/flink/tmp/flink-dist-cache-665f8fb6-d15f-4fea-8d63-7e9fe9031f5f
2020-05-19 18:11:45,117 INFO  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Stopped TaskExecutor akka.tcp://flink@172.21.89.130:36784/user/taskmanager_0.
2020-05-19 18:11:45,117 INFO  org.apache.flink.runtime.blob.PermanentBlobCache              - Shutting down BLOB cache
2020-05-19 18:11:45,117 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Shutting down BLOB cache
2020-05-19 18:11:45,117 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2020-05-19 18:11:45,134 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopping Akka RPC service.
2020-05-19 18:11:45,138 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2020-05-19 18:11:45,164 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2020-05-19 18:11:45,140 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
2020-05-19 18:11:45,222 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
2020-05-19 18:11:45,270 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2020-05-19 18:11:45,277 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
2020-05-19 18:11:45,318 INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Stopped Akka RPC service.
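
Rather than waiting for the registration timeout (300000 ms in the log above), the dynamically added taskmanagers can be stopped explicitly. The sketch below finds the hosts that stop-cluster.sh will not reach. The function name unmanaged_workers is hypothetical, and it assumes the slaves file contains bare hostnames, one per line, without surrounding whitespace.

```shell
#!/bin/sh
# Print each given host that is NOT listed in the slaves file; these are the
# TaskManagers that stop-cluster.sh will not reach.
# Arguments: slaves file, then the hosts that currently run a TaskManager.
unmanaged_workers() (
  slaves=$1
  shift
  for host in "$@"; do
    # -x: whole-line match, -F: fixed string, -q: quiet
    grep -qxF "$host" "$slaves" || echo "$host"
  done
)
```

With the slaves file from this article, `unmanaged_workers conf/slaves flink2 flink3` prints flink3. Each printed host could then be stopped with something like `ssh flink3 /opt/flink-1.10.0/bin/taskmanager.sh stop`, where the install path on flink3 is an assumption.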

 

