Flink Job Submission (Overview): Overall Analysis of the Execution Logic

Latest recommended article published on 2024-06-12 08:15:51

hxcaifly

Categories: Flink, Flink Principles and Applications

Article link: https://blog.csdn.net/hxcaifly/article/details/87864154

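Analysis of the Logic Executed When the Flink Client Submits a Job

This article targets the Flink 1.7 release version.

Preface

Flink's source tree is large, and diving straight in makes it easy to get lost without knowing where to start reading. Following the source with a concrete goal taken from day-to-day development, tracking down a specific problem and understanding the execution details behind it, works much better.

In a recent Flink project, product constraints meant that only Flink standalone mode could be deployed, so it was necessary to test its performance.

Deploying Flink in standalone mode is simple. Set the basic global configuration parameters, such as jobmanager.heap.size, taskmanager.heap.size, parallelism.default and taskmanager.numberOfTaskSlots, and then run ./bin/start-cluster.sh to start the standalone cluster.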

However, when I ran:

./bin/flink run -c chx.demo.FirstDemo /demo/chx.jar

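to submit my job, a problem appeared. With a batch of 20 million records everything ran normally, but with a batch of 38 million records the following exception was reported: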

Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/$a#183984057]] after [10000ms]

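An error like this calls for some familiarity with how Akka works. This article does not cover Akka's principles and usage in detail; I plan to analyze and summarize Akka in a later article.

This article focuses on what the program actually does when a job is submitted through ./bin/flink run, and analyzes the execution logic of the code behind it.

1. Overall Logic

The entry point for submitting a job through the Flink client is org.apache.flink.client.cli.CliFrontend. The logic of its main method is as follows: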

public static void main(final String[] args) {
    // 1. Log basic environment information
    EnvironmentInformation.logEnvironmentInfo(LOG, "Command Line Client", args);

    // 2. Get the configuration directory, usually the conf directory under the Flink installation directory
    final String configurationDirectory = getConfigurationDirectoryFromEnv();

    // 3. Load the global configuration (read the YAML configuration file and parse it)
    final Configuration configuration = GlobalConfiguration.loadConfiguration(configurationDirectory);

    // 4. Load the custom command lines (two kinds: the YARN-mode command line and the default command line)
    final List<CustomCommandLine<?>> customCommandLines = loadCustomCommandLines(
        configuration,
        configurationDirectory);

    try {
        // 5. Initialize the command-line frontend
        final CliFrontend cli = new CliFrontend(
            configuration,
            customCommandLines);
        // 6. Install the security context
        SecurityUtils.install(new SecurityConfiguration(cli.configuration));
        // 7. Run the callback and obtain the status code retCode; this is where the main logic lives
        int retCode = SecurityUtils.getInstalledContext()
            .runSecured(() -> cli.parseParameters(args));
        System.exit(retCode);
    }
    catch (Throwable t) {
        final Throwable strippedThrowable = ExceptionUtils.stripException(t, UndeclaredThrowableException.class);
        LOG.error("Fatal error while running command line interface.", strippedThrowable);
        strippedThrowable.printStackTrace();
        System.exit(31);
    }
}
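
The control flow of main can be condensed into a small, self-contained sketch. The names used here (CliMainSketch, runSecured, parseParameters, FATAL_EXIT_CODE) are hypothetical stand-ins and not Flink classes; only the exit code 31 and the overall shape (run the real work inside a wrapper, then turn the outcome into a process exit code) are taken from the listing above.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the main() pattern above; this is not Flink code.
class CliMainSketch {

    // Same fatal exit code that the listing passes to System.exit.
    static final int FATAL_EXIT_CODE = 31;

    // Stand-in for a security context: here it simply invokes the action.
    static <T> T runSecured(Callable<T> action) throws Exception {
        return action.call();
    }

    // Stand-in for argument parsing: returns 0 on success, 1 when no action is given.
    static int parseParameters(String[] args) {
        if (args.length == 0) {
            System.out.println("Please specify an action.");
            return 1;
        }
        System.out.println("Running action: " + args[0]);
        return 0;
    }

    // Runs the CLI and maps any unexpected Throwable to the fatal exit code.
    static int run(String[] args) {
        try {
            return runSecured(() -> parseParameters(args));
        } catch (Throwable t) {
            System.err.println("Fatal error while running command line interface: " + t);
            return FATAL_EXIT_CODE;
        }
    }

    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

Keeping the exit-code decision in one place (run) makes the wrapper easy to exercise without actually terminating the JVM.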

2. Detailed Analysis

2.1. Logging Basic Environment Information

The first step of main is to log basic environment information. The main logic is:

/**
 * Logs information about the environment, such as the code revision, current user, Java version, and JVM parameters.
 *
 * @param log The logger to log the information to.
 * @param componentName The component name to mention in the log.
 * @param commandLineArgs The arguments accompanying the start of the component.
 */
public static void logEnvironmentInfo(Logger log, String componentName, String[] commandLineArgs) {
    if (log.isInfoEnabled()) {
        // 1. Get the commit id and date of the last git commit of the code
        RevisionInformation rev = getRevisionInformation();
        // 2. Code version
        String version = getVersion();
        // 3. JVM version, obtained through the JDK's ManagementFactory class
        String jvmVersion = getJvmVersion();
        // 4. JVM startup options, also obtained through the JDK's ManagementFactory class
        String[] options = getJvmStartupOptionsArray();
        // 5. JAVA_HOME directory
        String javaHome = System.getenv("JAVA_HOME");
        // 6. Maximum JVM heap size in MiB (the byte count shifted right by 20 bits)
        long maxHeapMegabytes = getMaxJvmHeapMemory() >>> 20;

        // 7. Log the basic information
        log.info("--------------------------------------------------------------------------------");
        log.info(" Starting " + componentName + " (Version: " + version + ", "
                 + "Rev:" + rev.commitId + ", " + "Date:" + rev.commitDate + ")");
        log.info(" OS current user: " + System.getProperty("user.name"));
        log.info(" Current Hadoop/Kerberos user: " + getHadoopUser());
        log.info(" JVM: " + jvmVersion);
        log.info(" Maximum heap size: " + maxHeapMegabytes + " MiBytes");
        log.info(" JAVA_HOME: " + (javaHome == null ? "(not set)" : javaHome));
        // Log the Hadoop version information
        String hadoopVersionString = getHadoopVersionString();
        if (hadoopVersionString != null) {
            log.info(" Hadoop version: " + hadoopVersionString);
        } else {
            log.info(" No Hadoop Dependency available");
        }
        // Log the JVM options
        if (options.length == 0) {
            log.info(" JVM Options: (none)");
        }
        else {
            log.info(" JVM Options:");
            for (String s: options) {
                log.info("    " + s);
            }
        }
        // Log the program arguments
        if (commandLineArgs == null || commandLineArgs.length == 0) {
            log.info(" Program Arguments: (none)");
        }
        else {
            log.info(" Program Arguments:");
            for (String s: commandLineArgs) {
                log.info("    " + s);
            }
        }

        log.info(" Classpath: " + System.getProperty("java.class.path"));

        log.info("--------------------------------------------------------------------------------");
    }
}
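
The comments in the listing mention that the JVM version and startup options come from the JDK's ManagementFactory. As a rough illustration of what such helpers might look like, the sketch below uses only standard JDK calls; the class and method names (JvmInfoSketch, jvmVersion, jvmStartupOptions, toMebibytes) are hypothetical, and the exact string format is an assumption rather than Flink's implementation.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.util.List;

// Hypothetical sketch; class and method names are illustrative only.
class JvmInfoSketch {

    // Converts a byte count to MiB with the same unsigned shift used in the listing.
    static long toMebibytes(long bytes) {
        return bytes >>> 20;
    }

    // Builds a JVM description from the runtime MXBean.
    static String jvmVersion() {
        RuntimeMXBean bean = ManagementFactory.getRuntimeMXBean();
        return bean.getVmName() + " - " + bean.getVmVendor() + " - "
            + bean.getSpecVersion() + "/" + bean.getVmVersion();
    }

    // Returns the arguments passed to the JVM itself (for example -Xmx settings).
    static List<String> jvmStartupOptions() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println(" JVM: " + jvmVersion());
        System.out.println(" Maximum heap size: "
            + toMebibytes(Runtime.getRuntime().maxMemory()) + " MiBytes");
        List<String> options = jvmStartupOptions();
        if (options.isEmpty()) {
            System.out.println(" JVM Options: (none)");
        } else {
            System.out.println(" JVM Options:");
            for (String option : options) {
                System.out.println("    " + option);
            }
        }
    }
}
```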

2.2. Obtaining the Configuration Directory

The code is as follows:

public static String getConfigurationDirectoryFromEnv() {
    // 1. Read the value of the FLINK_CONF_DIR environment variable
    String location = System.getenv(ConfigConstants.ENV_FLINK_CONF_DIR);

    if (location != null) {
        if (new File(location).exists()) {
            return location;
        }
        else {
            throw new RuntimeException("The configuration directory '" + location + "', specified in the '" +
                ConfigConstants.ENV_FLINK_CONF_DIR + "' environment variable, does not exist.");
        }
    }
    // 2. Otherwise fall back to the first candidate (CONFIG_DIRECTORY_FALLBACK_1, "../conf" in the Flink source)
    else if (new File(CONFIG_DIRECTORY_FALLBACK_1).exists()) {
        location = CONFIG_DIRECTORY_FALLBACK_1;
    }
    // 3. Otherwise fall back to the second candidate (CONFIG_DIRECTORY_FALLBACK_2, the conf directory)
    else if (new File(CONFIG_DIRECTORY_FALLBACK_2).exists()) {
        location = CONFIG_DIRECTORY_FALLBACK_2;
    }
    else {
        throw new RuntimeException("The configuration directory was not specified. " +
            "Please specify the directory containing the configuration file through the '" +
            ConfigConstants.ENV_FLINK_CONF_DIR + "' environment variable.");
    }
    return location;
}
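
The lookup order above (an explicit environment variable first, then a list of fallback directories) can be written as a small pure function, which makes it easy to try out in isolation. This is only a sketch: ConfigDirSketch and resolve are hypothetical names, and the fallback values passed in main ("../conf" and "conf") are assumptions about the constants rather than something this listing shows.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the lookup order; names are illustrative, not Flink's.
class ConfigDirSketch {

    // envValue is the value of the environment variable, or null when it is unset.
    static String resolve(String envValue, List<String> fallbacks) {
        if (envValue != null) {
            // An explicitly configured directory must exist; it is never silently replaced.
            if (new File(envValue).exists()) {
                return envValue;
            }
            throw new RuntimeException(
                "The configuration directory '" + envValue + "' does not exist.");
        }
        // Without an explicit value, the first existing fallback wins.
        for (String candidate : fallbacks) {
            if (new File(candidate).exists()) {
                return candidate;
            }
        }
        throw new RuntimeException("The configuration directory was not specified.");
    }

    public static void main(String[] args) {
        try {
            // The fallback values are assumed for illustration.
            String dir = resolve(System.getenv("FLINK_CONF_DIR"), Arrays.asList("../conf", "conf"));
            System.out.println("Using configuration directory: " + dir);
        } catch (RuntimeException e) {
            System.out.println("Could not resolve configuration directory: " + e.getMessage());
        }
    }
}
```

Note the asymmetry: a wrong explicit value is an error, while a missing fallback just moves on to the next candidate.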

2.3. Loading the Global Configuration

The configuration path obtained in step 2 is passed to GlobalConfiguration.loadConfiguration to load the global configuration. The logic is:

public static Configuration loadConfiguration(final String configDir) {
    return loadConfiguration(configDir, null);
}

This delegates to the two-argument overload of loadConfiguration:

public static Configuration loadConfiguration(final String configDir, @Nullable final Configuration dynamicProperties) {

    if (configDir == null) {
        throw new IllegalArgumentException("Given configuration directory is null, cannot load configuration");
    }

    final File confDirFile = new File(configDir);
    if (!(confDirFile.exists())) {
        throw new IllegalConfigurationException(
            "The given configuration directory name '" + configDir +
            "' (" + confDirFile.getAbsolutePath() + ") does not describe an existing directory.");
    }

    // 1. Locate the flink-conf.yaml configuration file
    final File yamlConfigFile = new File(confDirFile, FLINK_CONF_FILENAME);

    if (!yamlConfigFile.exists()) {
        throw new IllegalConfigurationException(
            "The Flink config file '" + yamlConfigFile +
            "' (" + confDirFile.getAbsolutePath() + ") does not exist.");
    }

    // 2. Core logic: parse the YAML configuration file
    Configuration configuration = loadYAMLResource(yamlConfigFile);

    // 3. Dynamic properties, when given, are added on top of the values from the file
    if (dynamicProperties != null) {
        configuration.addAll(dynamicProperties);
    }

    return configuration;
}

As the code shows, loading the global configuration means parsing the conf/flink-conf.yaml file and storing the settings it contains in a Configuration object.
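
To make the "parse the file, then layer dynamic properties on top" step concrete, here is a minimal sketch that reads flat "key: value" lines into a map. It is a simplified stand-in and not Flink's loadYAMLResource: FlatYamlSketch, parse and load are hypothetical names, the parsing rules (strip "#" comments, split at the first ": ") are assumptions, and the sample values are made up. Only the parameter names come from the preface.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a flat "key: value" loader; not Flink's parser.
class FlatYamlSketch {

    // Parses flat "key: value" lines, ignoring comments and malformed lines.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> config = new LinkedHashMap<>();
        for (String line : lines) {
            // Drop everything after a '#' comment marker.
            String content = line.split("#", 2)[0].trim();
            if (content.isEmpty()) {
                continue;
            }
            // Split into key and value at the first ": ".
            String[] keyValue = content.split(": ", 2);
            if (keyValue.length < 2) {
                continue; // not a "key: value" line
            }
            String key = keyValue[0].trim();
            String value = keyValue[1].trim();
            if (!key.isEmpty() && !value.isEmpty()) {
                config.put(key, value);
            }
        }
        return config;
    }

    // Parses the file content, then lets dynamic properties override file values.
    static Map<String, String> load(List<String> lines, Map<String, String> dynamicProperties) {
        Map<String, String> config = parse(lines);
        if (dynamicProperties != null) {
            config.putAll(dynamicProperties);
        }
        return config;
    }

    public static void main(String[] args) {
        // Sample values are illustrative only.
        List<String> lines = Arrays.asList(
            "# basic settings",
            "jobmanager.heap.size: 1024m",
            "taskmanager.numberOfTaskSlots: 4  # slots per TaskManager",
            "parallelism.default: 2");
        System.out.println(load(lines, Collections.singletonMap("parallelism.default", "8")));
    }
}
```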

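2.4. Loading the Custom Command Lines

There are two ways to submit a job: through the YARN command line and through the default command line. The logic is: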

/**
 * Loads the custom command lines.
 * @param configuration the configuration
 * @param configurationDirectory the configuration directory
 * @return the loaded custom command lines
 */
public static List<CustomCommandLine<?>> loadCustomCommandLines(Configuration configuration, String configurationDirectory) {
    // 1. Create a list with an initial capacity of 2 to hold the command lines
    List<CustomCommandLine<?>> customCommandLines = new ArrayList<>(2);

    // 2. Command-line interface for the YARN session; all of its options are prefixed with y/yarn
    final String flinkYarnSessionCLI = "org.apache.flink.yarn.cli.FlinkYarnSessionCli";
    try {
        // 3. Add the YARN-mode command line
        customCommandLines.add(
            loadCustomCommandLine(flinkYarnSessionCLI,
                                  configuration,
                                  configurationDirectory,
                                  "y",
                                  "yarn"));
    } catch (NoClassDefFoundError | Exception e) {
        LOG.warn("Could not load CLI class {}.", flinkYarnSessionCLI, e);
    }

    // 4. Add the default-mode command line
    customCommandLines.add(new DefaultCLI(configuration));

    return customCommandLines;
}
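
The pattern in this method (try to load an optional implementation by class name, log and continue if it is missing, then always append a default) can be tried out with the self-contained sketch below. Every type in it (CommandLineLoaderSketch, CommandLine, DefaultCommandLine, PrefixedCommandLine) is hypothetical, and matching the constructor by the runtime types of the arguments is an assumption made for illustration, not a statement about how Flink's loadCustomCommandLine is written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "optional plugin plus default"; none of these types are Flink's.
class CommandLineLoaderSketch {

    interface CommandLine {
        String id();
    }

    // Always available, so the resulting list is never empty.
    static class DefaultCommandLine implements CommandLine {
        public String id() {
            return "default";
        }
    }

    // Optional implementation, constructed reflectively with two option prefixes.
    public static class PrefixedCommandLine implements CommandLine {
        private final String shortPrefix;
        private final String longPrefix;

        public PrefixedCommandLine(String shortPrefix, String longPrefix) {
            this.shortPrefix = shortPrefix;
            this.longPrefix = longPrefix;
        }

        public String id() {
            return shortPrefix + "/" + longPrefix;
        }
    }

    // Loads a class by name and calls the public constructor matching the argument types.
    static CommandLine loadByName(String className, Object... params)
            throws ReflectiveOperationException {
        Class<? extends CommandLine> cliClass =
            Class.forName(className).asSubclass(CommandLine.class);
        Class<?>[] types = new Class<?>[params.length];
        for (int i = 0; i < params.length; i++) {
            types[i] = params[i].getClass();
        }
        return cliClass.getConstructor(types).newInstance(params);
    }

    // Tries the optional implementation first, then always appends the default.
    static List<CommandLine> loadAll(String optionalClassName) {
        List<CommandLine> commandLines = new ArrayList<>(2);
        try {
            commandLines.add(loadByName(optionalClassName, "y", "yarn"));
        } catch (ReflectiveOperationException | LinkageError | ClassCastException e) {
            System.err.println("Could not load CLI class " + optionalClassName + ": " + e);
        }
        commandLines.add(new DefaultCommandLine());
        return commandLines;
    }

    public static void main(String[] args) {
        for (CommandLine cli : loadAll(PrefixedCommandLine.class.getName())) {
            System.out.println(cli.id());
        }
        System.out.println(loadAll("no.such.CommandLine").size());
    }
}
```

Loading by name keeps the optional dependency off the compile-time classpath, which is why a missing class only produces a warning instead of a startup failure.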

The following sections look at how the YARN-mode command line and the default-mode command line are added.

Adding the YARN-mode command line

/**
 * Builds a command line through reflection.
 * @param className fully qualified name of the class to load.
 * @param params constructor parameters
 */
private static CustomCommandLine<?> loadCustomCommandLine(String className, Object... params) throws IllegalAccessException, InvocationTargetException, InstantiationException, ClassNotFoundException, NoSuchMethodException {

    // 1. Load the class from the classpath; the loaded class implements the CustomCommandLine interface
    Class<? extends CustomCommandLine> customCliClass =
        Class.forName(
