Some Pitfalls We Hit Running Flink on YARN

Background

We recently started doing real-time computation at my company, using Flink as the engine and scheduling jobs with Flink on YARN. We hit a number of pitfalls along the way. I didn't keep a complete record, so I'll write up the problems that left the deepest impression, in the hope they help others too. The problems:

1. Manage jars in the cluster's lib directory, or build a fat jar with Maven?

2. The job fails right at startup: Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath.

3. The JobManager reports: Could not resolve ResourceManager address akka.tcp://flink@xxxxx, retrying in 10000 ms : Could not connect to rpc endpoint under address akka.tcp://flink@xxxxx

Let's go through them one by one.

1. Manage jars in the cluster's lib directory, or build a fat jar with Maven?

Since this was our first time building real-time computation on Flink, we weren't clear on some of the best practices. Our initial approach was to dump every jar into Flink's lib directory. After thinking it over during later work, and after asking some experienced Flink people, we settled on the actual best practice: build a fat jar with Maven. Here is the official position:

The official documentation on this question: the Flink docs' notes on project dependencies.

[Screenshot: the Flink documentation's discussion of the two approaches]

The docs weigh both approaches and recommend the fat jar. A moment's thought shows why: if every dependency goes into lib, one cluster can end up with several copies of the same component at different versions. At our company, for example, HBase 1.x and 2.x coexist, so conflicts are inevitable. If lib holds only Flink's common components, and each application ships its own specialized dependencies itself, you get dependency isolation.
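Concretely, this convention shows up in the pom as scopes (the full pom at the end of this post follows the same pattern): Flink's own modules are marked provided, since the cluster's lib already supplies them, while connectors keep the default compile scope so the shade plugin bundles them into the fat jar. A trimmed-down sketch:

<!-- Provided by the cluster's flink/lib at runtime; kept out of the fat jar -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>

<!-- Default (compile) scope: bundled into the application's fat jar -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-sql-connector-kafka_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>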


2. The job fails right at startup: Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath.

These links discuss the same problem:
1. https://blog.csdn.net/zhanghuolei/article/details/105767190
2. https://stackoverflow.com/questions/52500048/flink-could-not-find-a-suitable-table-factory-for-org-apache-flink-table-facto

The fix is to add the line below to the shade plugin section of the pom. Flink discovers table sources and sinks via Java SPI, and by default shading lets identically named META-INF/services files from different connector jars overwrite one another; this transformer merges them instead:

<transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>

P.S. Strangely, the official Flink quickstart archetype doesn't generate this for you. Can anyone explain why?
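To make the mechanism concrete: flink-json and the Kafka connector each ship a file named META-INF/services/org.apache.flink.table.factories.TableFactory. Without the transformer, only one copy of that file survives shading, and the factory listed in the other can no longer be discovered, hence the exception. With the transformer, the fat jar's merged copy of that file lists the factories from both, roughly like this (the exact entries depend on your connectors and Flink version; shown purely for illustration):

org.apache.flink.formats.json.JsonRowFormatFactory
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory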


3. The JobManager reports: Could not resolve ResourceManager address akka.tcp://flink@xxxxx, retrying in 10000 ms : Could not connect to rpc endpoint under address akka.tcp://flink@xxxxx

The concrete symptom: in Flink on YARN per-job mode the job submits successfully but stays in the ACCEPTED state forever. The JobManager log contains some errors, but nothing conclusive, and Flink's ResourceManager never comes up.

The JobManager error looks like this:
[Screenshot: JobManager error log]

At first I assumed we had deployed Flink wrong, but the same problem showed up when I ran the job with Flink on Kubernetes, while the official Flink demo ran fine on YARN. That pointed to the pom. After a day of searching and asking around, I consulted the Flink expert zhisheng, who solved it in seconds, haha. The root cause: the pom depended on flink-connector-hbase, and when Flink starts up it also puts the cluster's Hadoop jars on the classpath via the environment, which conflict with the Hadoop jars that flink-connector-hbase pulls in transitively. After I excluded the Hadoop dependencies, the job ran normally. (In hindsight, mvn dependency:tree -Dincludes=org.apache.hadoop is a quick way to see which dependency drags in Hadoop.) Lucky that zhisheng was around; the error message gives little away, and without real experience this one is hard to track down.
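In pom terms, the fix is to exclude the transitive Hadoop artifacts from flink-connector-hbase so that only the cluster-provided Hadoop is on the classpath. Abbreviated here; the full exclusion list is in the pom below:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-hbase_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
    <exclusions>
        <!-- The cluster already provides Hadoop; drop the connector's transitive copies -->
        <exclusion>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
        </exclusion>
        <!-- likewise for hadoop-client, hadoop-auth, hadoop-yarn-common, etc. -->
    </exclusions>
</dependency>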


Finally, here is our Flink on YARN pom, for reference:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.getui.flink</groupId>
    <artifactId>flink-sql-submit</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <flink.version>1.11.2</flink.version>
        <java.version>1.8</java.version>
        <scala.binary.version>2.12</scala.binary.version>
        <maven.compiler.source>${java.version}</maven.compiler.source>
        <maven.compiler.target>${java.version}</maven.compiler.target>
    </properties>

    <dependencies>
        <!-- Flink modules -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-api-java</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner-blink_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table-planner_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-json</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>


        <!-- Add logging framework, to produce console output when running in the IDE. -->
        <!-- These dependencies are excluded from the application JAR by default. -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.7</version>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.17</version>
        </dependency>


        <!-- CLI dependencies -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-clients_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>



        <!-- connector -->

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-hbase_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
            <!-- Exclude transitive Hadoop jars; the cluster provides Hadoop (see problem 3) -->
            <exclusions>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-core</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-common</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-client</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-yarn-common</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-mapreduce-client-core</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.apache.hadoop</groupId>
                    <artifactId>hadoop-auth</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-sql-connector-kafka_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-jdbc_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.38</version>
        </dependency>

    </dependencies>


    <build>
        <plugins>
            <!-- Java Compiler -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.1</version>
                <configuration>
                    <source>${java.version}</source>
                    <target>${java.version}</target>
                </configuration>
            </plugin>

            <!-- We use the maven-shade plugin to create a fat jar that contains all necessary dependencies. -->
            <!-- Change the value of <mainClass>...</mainClass> if your program entry point changes. -->
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.0.0</version>
                <executions>
                    <!-- Run shade goal on package phase -->
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <artifactSet>
                                <excludes>
                                    <exclude>org.apache.flink:force-shading</exclude>
                                    <exclude>com.google.code.findbugs:jsr305</exclude>
                                    <exclude>org.slf4j:*</exclude>
                                    <exclude>log4j:*</exclude>
                                </excludes>
                            </artifactSet>
                            <filters>
                                <filter>
                                    <!-- Do not copy the signatures in the META-INF folder.
                                    Otherwise, this might cause SecurityExceptions when using the JAR. -->
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <mainClass>com.getui.flink.core.SqlSubmit</mainClass>
                                </transformer>
                                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>

        <pluginManagement>
            <plugins>

                <!-- This improves the out-of-the-box experience in Eclipse by resolving some warnings. -->
                <plugin>
                    <groupId>org.eclipse.m2e</groupId>
                    <artifactId>lifecycle-mapping</artifactId>
                    <version>1.0.0</version>
                    <configuration>
                        <lifecycleMappingMetadata>
                            <pluginExecutions>
                                <pluginExecution>
                                    <pluginExecutionFilter>
                                        <groupId>org.apache.maven.plugins</groupId>
                                        <artifactId>maven-shade-plugin</artifactId>
                                        <versionRange>[3.0.0,)</versionRange>
                                        <goals>
                                            <goal>shade</goal>
                                        </goals>
                                    </pluginExecutionFilter>
                                    <action>
                                        <ignore/>
                                    </action>
                                </pluginExecution>
                                <pluginExecution>
                                    <pluginExecutionFilter>
                                        <groupId>org.apache.maven.plugins</groupId>
                                        <artifactId>maven-compiler-plugin</artifactId>
                                        <versionRange>[3.1,)</versionRange>
                                        <goals>
                                            <goal>testCompile</goal>
                                            <goal>compile</goal>
                                        </goals>
                                    </pluginExecutionFilter>
                                    <action>
                                        <ignore/>
                                    </action>
                                </pluginExecution>
                            </pluginExecutions>
                        </lifecycleMappingMetadata>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>

    </build>

</project>