Versions
- dolphinscheduler 1.3.6
- hadoop 3.2.1
- flink 1.13.1
Main issues
classloader.check-leaked-classloader
With the default setup (the Flink distribution copied to /opt/soft/flink on the worker nodes), you will hit this error:
Exception in thread "Thread-5" java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
    at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:164)
    at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResource(FlinkUserCodeClassLoaders.java:183)
    at org.apache.hadoop.conf.Configuration.getResource(Configuration.java:2780)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3036)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2995)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2968)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2848)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1200)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1812)
    at org.apache.hadoop.conf.Configuration.getTimeDuration(Configuration.java:1789)
As the message suggests, adding "classloader.check-leaked-classloader: false" to flink-conf.yaml is enough:
[root@37e3e6d56452 flink]# tail conf/flink*.yaml
# The port under which the web-based HistoryServer listens.
#historyserver.web.port: 8082
# Comma separated list of directories to monitor for completed jobs.
#historyserver.archive.fs.dir: hdfs:///completed-jobs/
# Interval in milliseconds for refreshing the monitored directories.
#historyserver.archive.fs.refresh-interval: 10000
classloader.check-leaked-classloader: false
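The change can be scripted as well. A minimal sketch that appends the setting only if it is not already present (the default path assumes you run it from the Flink home directory; the CONF variable is just for illustration):

```shell
# Append the setting to flink-conf.yaml, but only if the key is not
# already there, so re-running the script is harmless (idempotent).
CONF="${CONF:-conf/flink-conf.yaml}"
mkdir -p "$(dirname "$CONF")"
grep -q '^classloader\.check-leaked-classloader' "$CONF" 2>/dev/null || \
  echo 'classloader.check-leaked-classloader: false' >> "$CONF"
```

Restart the Flink session (or let the next per-job cluster pick up the config) for the setting to take effect.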
HADOOP_CLASSPATH environment
With Flink 1.13.1, HADOOP_CLASSPATH must be set before jobs can be submitted to YARN. Two steps are needed:
- Install Hadoop on every DolphinScheduler worker node and configure it correctly
- Set HADOOP_CLASSPATH before submitting; a convenient way is to edit the bin/flink script and add export HADOOP_CLASSPATH=`hadoop classpath` near the top
The modified bin/flink looks like this:
[root@37e3e6d56452 flink]# head -n 30 bin/flink
#!/usr/bin/env bash
################################################################################
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
export HADOOP_CLASSPATH=`hadoop classpath`
target="$0"
# For the case, the executable has been directly symlinked, figure out
# the correct bin path by following its symlink up to an upper bound.
# Note: we can't use the readlink utility here if we want to be POSIX
# compatible.
iteration=0
while [ -L "$target" ]; do
    if [ "$iteration" -gt 100 ]; then
        echo "Cannot resolve path: You have a cyclic symlink in $target."
        break
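Rather than editing every worker's bin/flink by hand, the export line can be inserted right after the shebang with sed. A sketch, assuming GNU sed; the FLINK_BIN variable and the stub-file fallback (which lets the sketch run standalone outside a Flink install) are illustration only:

```shell
# Insert export HADOOP_CLASSPATH=`hadoop classpath` after the shebang of
# bin/flink, skipping the edit if the script already mentions HADOOP_CLASSPATH.
FLINK_BIN="${FLINK_BIN:-bin/flink}"
# Stub fallback so the sketch can be tried outside a real Flink home (assumption).
if [ ! -f "$FLINK_BIN" ]; then
  mkdir -p "$(dirname "$FLINK_BIN")"
  printf '#!/usr/bin/env bash\ntarget="$0"\n' > "$FLINK_BIN"
fi
grep -q 'HADOOP_CLASSPATH' "$FLINK_BIN" || \
  sed -i '1a export HADOOP_CLASSPATH=`hadoop classpath`' "$FLINK_BIN"
```

The single-quoted sed script keeps the backticks literal, so they are written into the file unexpanded and `hadoop classpath` is only evaluated when bin/flink actually runs.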
non-leaf queue
You may also hit the following error:
Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1626093777623_0001 to YARN : Application application_1626093777623_0001 submitted by user : root to non-leaf queue : root
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:327)
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1178)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployJobCluster(YarnClusterDescriptor.java:474)
Inspecting the log reveals the flink command that was executed:
flink run -m yarn-cluster -ys 1 -ynm WordCount -yjm 1G -ytm 1G -yqu root -p 1 -sae -c org.apache.flink.examples.java.wordcount.WordCount WordCount.jar --input hdfs:///griffin/env.json --output hdfs:///opt/dol-wcoutput/
Note the -yqu option; running bin/flink run -h shows its meaning:
-yqu,--yarnqueue <arg> Specify YARN queue.
Check the YARN queue configuration
The root queue has child queues, which is exactly why YARN rejects the submission with the non-leaf error: applications can only be submitted to leaf queues. The fix is to point the job at a leaf queue via a DolphinScheduler custom parameter.
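To see which leaf queues are available under root, you can read yarn.scheduler.capacity.root.queues from capacity-scheduler.xml. A sketch, assuming the Capacity Scheduler; the SCHED_CONF variable, the default /etc/hadoop/conf path, and the stub-file fallback (so the sketch runs outside a Hadoop node) are assumptions:

```shell
# Print the child queues of root from capacity-scheduler.xml.
SCHED_CONF="${SCHED_CONF:-${HADOOP_CONF_DIR:-/etc/hadoop/conf}/capacity-scheduler.xml}"
# Stub fallback for trying the sketch outside a Hadoop node (assumption).
if [ ! -f "$SCHED_CONF" ]; then
  SCHED_CONF=./capacity-scheduler.xml
  printf '<property>\n  <name>yarn.scheduler.capacity.root.queues</name>\n  <value>default</value>\n</property>\n' > "$SCHED_CONF"
fi
grep -A1 '<name>yarn.scheduler.capacity.root.queues</name>' "$SCHED_CONF" \
  | grep -o '<value>[^<]*</value>' \
  | sed 's/<\/\?value>//g'
```

Any queue name printed here (for example default) is a valid leaf target for -yqu, as long as it has no children of its own.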
After setting the parameter, the log shows the YARN submission command has changed to:
flink run -m yarn-cluster -ys 1 -ynm WordCount -yjm 1G -ytm 1G -p 1 -sae -yqu default -c org.apache.flink.examples.java.wordcount.WordCount WordCount.jar --input hdfs:///griffin/env.json --output hdfs:///opt/dol-wcoutput/