Understanding the Flink Job Submission Process through the Flink CLI in Flink 1.11

Our data platform's job submission has long been a two-phase process: task status is reported across two phases of communication, which has caused duplicate submissions in the task queue, slow status updates, and inconsistent state. Starting with Flink 1.11, the Flink CLI improves the startup modes of flink run and adds a new run-application mode. So let's explore the Flink CLI startup flow and the run-application mode in the Flink 1.11 source code, and see whether anything new there can optimize our platform's job submission process.

After we write and submit a Flink job, the jps command shows a background process, org.apache.flink.client.cli.CliFrontend, that keeps running until our job is up in the cluster, at which point it disappears. In essence, the Flink installation directory ships a job submission program that is launched through a plain Java main method. So let's start with the main method of org.apache.flink.client.cli.CliFrontend and see what it actually does.

// 1. Logs information about the environment, like code revision, current user,
//    Java version, and JVM parameters (only printed if INFO-level logging is enabled).
EnvironmentInformation.logEnvironmentInfo(LOG, "Command Line Client", args);

// 2. find the configuration directory
final String configurationDirectory = getConfigurationDirectoryFromEnv();

// 3. load the global configuration: parse flink-conf.yaml into a Configuration object
final Configuration configuration = GlobalConfiguration.loadConfiguration(configurationDirectory);

// 4. load the custom command lines
//    First try flinkYarnSessionCLI = "org.apache.flink.yarn.cli.FlinkYarnSessionCli";
//    if it is not on the classpath, fall back to
//    errorYarnSessionCLI = "org.apache.flink.yarn.cli.FallbackYarnSessionCli".
//    Whichever CLI is loaded, a default CLI (new DefaultCLI(configuration)) is created last.
//    DefaultCLI must be added last, because getActiveCustomCommandLine(..) picks the active
//    CustomCommandLine in order, and DefaultCLI.isActive() always returns true.
final List<CustomCommandLine> customCommandLines = loadCustomCommandLines(
    configuration,
    configurationDirectory);

try {
    // 5. Create the CliFrontend object. The constructor initializes the file system and
    //    registers the org.apache.commons.cli.Options of every CustomCommandLine.
    final CliFrontend cli = new CliFrontend(
        configuration,
        customCommandLines);

    // 6. Load the security-related settings from the configuration.
    SecurityUtils.install(new SecurityConfiguration(cli.configuration));

    // 7. Run the client program.
    int retCode = SecurityUtils.getInstalledContext()
            .runSecured(() -> cli.parseParameters(args));
    System.exit(retCode);
}
catch (Throwable t) {
    final Throwable strippedThrowable = ExceptionUtils.stripException(t, UndeclaredThrowableException.class);
    LOG.error("Fatal error while running command line interface.", strippedThrowable);
    strippedThrowable.printStackTrace();
    System.exit(31);
}

Let's analyze it step by step.

Part 1: Logging environment information

/**
 * Logs information about the environment, like code revision, current user, Java version,
 * and JVM parameters.
 *
 * @param log The logger to log the information to.
 * @param componentName The component name to mention in the log.
 * @param commandLineArgs The arguments accompanying the start of the component.
 */
public static void logEnvironmentInfo(Logger log, String componentName, String[] commandLineArgs) {
    // Only log if the INFO level is enabled.
    if (log.isInfoEnabled()) {
        // 1. Get the last git commit id and date:
        //    public static RevisionInformation getRevisionInformation() {
        //        return new RevisionInformation(getGitCommitIdAbbrev(), getGitCommitTimeString());
        //    }
        RevisionInformation rev = getRevisionInformation();
        // The Flink version.
        String version = getVersion();
        // The Scala version.
        String scalaVersion = getScalaVersion();
        // The JVM version, obtained via the JDK's ManagementFactory.
        String jvmVersion = getJvmVersion();
        // The JVM startup options, also obtained via ManagementFactory.
        String[] options = getJvmStartupOptionsArray();
        // JAVA_HOME from the environment.
        String javaHome = System.getenv("JAVA_HOME");
        // Pre-configured Flink logs.
        String inheritedLogs = System.getenv("FLINK_INHERITED_LOGS");

        // The maximum JVM heap size, in MiB.
        long maxHeapMegabytes = getMaxJvmHeapMemory() >>> 20;

        if (inheritedLogs != null) {
            log.info("--------------------------------------------------------------------------------");
            log.info(" Preconfiguration: ");
            log.info(inheritedLogs);
        }

        log.info("--------------------------------------------------------------------------------");
        log.info(" Starting " + componentName + " (Version: " + version + ", Scala: " + scalaVersion + ", "
                + "Rev:" + rev.commitId + ", " + "Date:" + rev.commitDate + ")");
        log.info(" OS current user: " + System.getProperty("user.name"));
        log.info(" Current Hadoop/Kerberos user: " + getHadoopUser());
        log.info(" JVM: " + jvmVersion);
        log.info(" Maximum heap size: " + maxHeapMegabytes + " MiBytes");
        log.info(" JAVA_HOME: " + (javaHome == null ? "(not set)" : javaHome));

        String hadoopVersionString = getHadoopVersionString();
        // Print the Hadoop version, if available.
        if (hadoopVersionString != null) {
            log.info(" Hadoop version: " + hadoopVersionString);
        } else {
            log.info(" No Hadoop Dependency available");
        }

        if (options.length == 0) {
            log.info(" JVM Options: (none)");
        } else {
            log.info(" JVM Options:");
            for (String s : options) {
                log.info("    " + s);
            }
        }

        if (commandLineArgs == null || commandLineArgs.length == 0) {
            log.info(" Program Arguments: (none)");
        } else {
            log.info(" Program Arguments:");
            for (String s : commandLineArgs) {
                log.info("    " + s);
            }
        }

        log.info(" Classpath: " + System.getProperty("java.class.path"));
        log.info("--------------------------------------------------------------------------------");
    }
}
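One small detail worth pausing on: the maximum heap size is converted from bytes to MiB with an unsigned right shift by 20 (dividing by 2^20). A minimal sketch of that conversion (class and method names are mine; Runtime.maxMemory() is a simplification of what Flink's getMaxJvmHeapMemory consults):

```java
public class HeapMiB {

    // Convert a byte count to whole MiB, as logEnvironmentInfo does with `>>> 20`.
    // An unsigned shift by 20 divides by 2^20 (one MiB) and can never go negative.
    static long toMiB(long bytes) {
        return bytes >>> 20;
    }

    public static void main(String[] args) {
        // Approximation of Flink's getMaxJvmHeapMemory().
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + toMiB(maxHeap) + " MiB");
    }
}
```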

Part 2: Locating the configuration directory

public static String getConfigurationDirectoryFromEnv() {
    // Read the configuration directory from the FLINK_CONF_DIR environment variable.
    String location = System.getenv(ConfigConstants.ENV_FLINK_CONF_DIR);

    if (location != null) {
        // If set, the directory must exist; otherwise fail.
        if (new File(location).exists()) {
            return location;
        } else {
            throw new RuntimeException("The configuration directory '" + location + "', specified in the '" +
                ConfigConstants.ENV_FLINK_CONF_DIR + "' environment variable, does not exist.");
        }
    }
    // If not set, fall back to the conf directory under the parent directory...
    else if (new File(CONFIG_DIRECTORY_FALLBACK_1).exists()) {
        location = CONFIG_DIRECTORY_FALLBACK_1;
    }
    // ...and then to the conf directory under the current directory.
    else if (new File(CONFIG_DIRECTORY_FALLBACK_2).exists()) {
        location = CONFIG_DIRECTORY_FALLBACK_2;
    } else {
        throw new RuntimeException("The configuration directory was not specified. " +
                "Please specify the directory containing the configuration file through the '" +
            ConfigConstants.ENV_FLINK_CONF_DIR + "' environment variable.");
    }
    return location;
}
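The lookup order above boils down to: the FLINK_CONF_DIR value wins if set (and it must exist), otherwise the first existing fallback directory is used. A testable sketch with the file-system check abstracted into a predicate (resolve and the predicate are illustrative helpers, not Flink API):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class ConfDirResolver {

    // Mirrors getConfigurationDirectoryFromEnv: an explicit env value must exist;
    // otherwise try each fallback in order; otherwise fail.
    static String resolve(String envValue, List<String> fallbacks, Predicate<String> exists) {
        if (envValue != null) {
            if (exists.test(envValue)) {
                return envValue;
            }
            throw new RuntimeException("Configuration directory '" + envValue + "' does not exist.");
        }
        for (String dir : fallbacks) {
            if (exists.test(dir)) {
                return dir;
            }
        }
        throw new RuntimeException("The configuration directory was not specified.");
    }

    public static void main(String[] args) {
        // Pretend only "conf" exists on disk.
        String dir = resolve(null, Arrays.asList("../conf", "conf"), "conf"::equals);
        System.out.println(dir); // prints "conf"
    }
}
```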

Part 3: Loading the configuration

// Load the configuration; the dynamic properties default to null.
public static Configuration loadConfiguration(final String configDir) {
    return loadConfiguration(configDir, null);
}

/**
 * Loads the configuration files from the specified directory. If the dynamic properties
 * configuration is not null, then it is added to the loaded configuration.
 *
 * @param configDir directory to load the configuration from
 * @param dynamicProperties configuration file containing the dynamic properties. Null if none.
 * @return The configuration loaded from the given configuration directory
 */
public static Configuration loadConfiguration(final String configDir, @Nullable final Configuration dynamicProperties) {
    // The configuration directory must be given...
    if (configDir == null) {
        throw new IllegalArgumentException("Given configuration directory is null, cannot load configuration");
    }

    // ...and must exist.
    final File confDirFile = new File(configDir);
    if (!(confDirFile.exists())) {
        throw new IllegalConfigurationException(
            "The given configuration directory name '" + configDir +
                "' (" + confDirFile.getAbsolutePath() + ") does not describe an existing directory.");
    }

    // Get the Flink yaml configuration file: flink-conf.yaml must exist in that directory.
    final File yamlConfigFile = new File(confDirFile, FLINK_CONF_FILENAME);
    if (!yamlConfigFile.exists()) {
        throw new IllegalConfigurationException(
            "The Flink config file '" + yamlConfigFile +
                "' (" + confDirFile.getAbsolutePath() + ") does not exist.");
    }

    // Parse the configuration file into a Configuration object.
    Configuration configuration = loadYAMLResource(yamlConfigFile);

    // Add the dynamically passed properties, if any.
    if (dynamicProperties != null) {
        configuration.addAll(dynamicProperties);
    }

    return configuration;
}
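A detail worth knowing: loadYAMLResource is not a full YAML parser. It reads the file line by line, drops comments after "#", and splits each remaining line on the first ": ". A rough, hypothetical re-implementation of that behavior (parse is my helper; the real method also logs warnings for malformed lines):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MiniYamlLoader {

    // Line-based "key: value" parsing, roughly what loadYAMLResource does:
    // strip trailing comments, skip blank lines, split on the first ": ".
    static Map<String, String> parse(String[] lines) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String line : lines) {
            String stripped = line.split("#", 2)[0].trim(); // drop comments
            if (stripped.isEmpty()) {
                continue;
            }
            String[] kv = stripped.split(": ", 2);
            if (kv.length == 2) {
                conf.put(kv[0].trim(), kv[1].trim());
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = parse(new String[] {
            "jobmanager.rpc.address: localhost",
            "taskmanager.numberOfTaskSlots: 2   # inline comment",
            "# full-line comment"
        });
        System.out.println(conf);
    }
}
```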

Part 4: Wrapping the user-supplied command-line options

public static List<CustomCommandLine> loadCustomCommandLines(Configuration configuration, String configurationDirectory) {
    List<CustomCommandLine> customCommandLines = new ArrayList<>();
    customCommandLines.add(new GenericCLI(configuration, configurationDirectory));

    // Command line interface of the YARN session, with a special initialization here
    // to prefix all options with y/yarn.
    final String flinkYarnSessionCLI = "org.apache.flink.yarn.cli.FlinkYarnSessionCli";
    try {
        // Load flinkYarnSessionCLI into memory via reflection; if the class is not on the
        // classpath this throws. The YARN session options are prefixed with "y" / "yarn".
        customCommandLines.add(
            loadCustomCommandLine(flinkYarnSessionCLI,
                configuration,
                configurationDirectory,
                "y",
                "yarn"));
    } catch (NoClassDefFoundError | Exception e) {
        // Fall back to an errorYarnSessionCLI instance.
        final String errorYarnSessionCLI = "org.apache.flink.yarn.cli.FallbackYarnSessionCli";
        try {
            LOG.info("Loading FallbackYarnSessionCli");
            customCommandLines.add(
                    loadCustomCommandLine(errorYarnSessionCLI, configuration));
        } catch (Exception exception) {
            // If that fails as well, continue without a YARN CLI.
            LOG.warn("Could not load CLI class {}.", flinkYarnSessionCLI, e);
        }
    }

    // Tips: DefaultCLI must be added last, because getActiveCustomCommandLine(..) gets the
    //       active CustomCommandLine in order and DefaultCLI.isActive() always returns true.
    customCommandLines.add(new DefaultCLI(configuration));
    return customCommandLines;
}

private static CustomCommandLine loadCustomCommandLine(String className, Object... params) throws Exception {
    // Instantiate the class by name; it must implement org.apache.flink.client.cli.CustomCommandLine.
    Class<? extends CustomCommandLine> customCliClass =
        Class.forName(className).asSubclass(CustomCommandLine.class);

    // construct class types from the parameters
    Class<?>[] types = new Class<?>[params.length];
    for (int i = 0; i < params.length; i++) {
        checkNotNull(params[i], "Parameters for custom command-lines may not be null.");
        types[i] = params[i].getClass();
    }

    // Look up the constructor matching the parameter types...
    Constructor<? extends CustomCommandLine> constructor = customCliClass.getConstructor(types);
    // ...and invoke it; this calls the constructor of the CLI implementation class.
    return constructor.newInstance(params);
}
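loadCustomCommandLine is plain reflection: resolve the class by name, verify the subtype, match a constructor by the exact runtime types of the arguments, and invoke it. The same pattern as a generic, self-contained sketch (load is a made-up helper; note that, like the Flink original, getConstructor requires an exact parameter-type match, so a constructor declared with an interface parameter would not be found):

```java
import java.lang.reflect.Constructor;

public class ReflectiveLoader {

    // Generic version of loadCustomCommandLine: look the class up by name, check that
    // it is a subtype of the expected interface, pick the constructor matching the
    // runtime types of the parameters, and instantiate it.
    static <T> T load(String className, Class<T> expectedType, Object... params) throws Exception {
        Class<? extends T> clazz = Class.forName(className).asSubclass(expectedType);
        Class<?>[] types = new Class<?>[params.length];
        for (int i = 0; i < params.length; i++) {
            types[i] = params[i].getClass();
        }
        Constructor<? extends T> ctor = clazz.getConstructor(types);
        return ctor.newInstance(params);
    }

    public static void main(String[] args) throws Exception {
        // StringBuilder implements CharSequence and has a (String) constructor.
        CharSequence cs = load("java.lang.StringBuilder", CharSequence.class, "hello");
        System.out.println(cs); // prints "hello"
    }
}
```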

The constructor of org.apache.flink.yarn.cli.FlinkYarnSessionCli:

// This constructor is invoked via reflection.
public FlinkYarnSessionCli(
        Configuration configuration,
        String configurationDirectory,
        String shortPrefix,
        String longPrefix) throws FlinkException {
    this(configuration, new DefaultClusterClientServiceLoader(), configurationDirectory, shortPrefix, longPrefix, true);
}

// It delegates to this final constructor.
public FlinkYarnSessionCli(
        Configuration configuration,
        ClusterClientServiceLoader clusterClientServiceLoader,
        String configurationDirectory,
        String shortPrefix,
        String longPrefix,
        boolean acceptInteractiveInput) throws FlinkException { // whether interactive input is accepted
    super(configuration);
    this.clusterClientServiceLoader = checkNotNull(clusterClientServiceLoader);
    this.configurationDirectory = checkNotNull(configurationDirectory);
    this.acceptInteractiveInput = acceptInteractiveInput;

    // Create the command line options (the supported option set).
    query = new Option(shortPrefix + "q", longPrefix + "query", false, "Display available YARN resources (memory, cores)");
    applicationId = new Option(shortPrefix + "id", longPrefix + "applicationId", true, "Attach to running YARN session");
    queue = new Option(shortPrefix + "qu", longPrefix + "queue", true, "Specify YARN queue.");
    shipPath = new Option(shortPrefix + "t", longPrefix + "ship", true, "Ship files in the specified directory (t for transfer)");
    flinkJar = new Option(shortPrefix + "j", longPrefix + "jar", true, "Path to Flink jar file");
    jmMemory = new Option(shortPrefix + "jm", longPrefix + "jobManagerMemory", true, "Memory for JobManager Container with optional unit (default: MB)");
    tmMemory = new Option(shortPrefix + "tm", longPrefix + "taskManagerMemory", true, "Memory per TaskManager Container with optional unit (default: MB)");
    slots = new Option(shortPrefix + "s", longPrefix + "slots", true, "Number of slots per TaskManager");
    dynamicproperties = Option.builder(shortPrefix + "D")
        .argName("property=value")
        .numberOfArgs(2)
        .valueSeparator()
        .desc("use value for given property")
        .build();
    name = new Option(shortPrefix + "nm", longPrefix + "name", true, "Set a custom name for the application on YARN");
    applicationType = new Option(shortPrefix + "at", longPrefix + "applicationType", true, "Set a custom application type for the application on YARN");
    zookeeperNamespace = new Option(shortPrefix + "z", longPrefix + "zookeeperNamespace", true, "Namespace to create the Zookeeper sub-paths for high availability mode");
    nodeLabel = new Option(shortPrefix + "nl", longPrefix + "nodeLabel", true, "Specify YARN node label for the YARN application");
    help = new Option(shortPrefix + "h", longPrefix + "help", false, "Help for the Yarn session CLI.");

    allOptions = new Options();
    allOptions.addOption(flinkJar);
    allOptions.addOption(jmMemory);
    allOptions.addOption(tmMemory);
    allOptions.addOption(queue);
    allOptions.addOption(query);
    allOptions.addOption(shipPath);
    allOptions.addOption(slots);
    allOptions.addOption(dynamicproperties);
    allOptions.addOption(DETACHED_OPTION);
    allOptions.addOption(YARN_DETACHED_OPTION);
    allOptions.addOption(name);
    allOptions.addOption(applicationId);
    allOptions.addOption(applicationType);
    allOptions.addOption(zookeeperNamespace);
    allOptions.addOption(nodeLabel);
    allOptions.addOption(help);

    // Try loading a potential yarn properties file.
    this.yarnPropertiesFileLocation = configuration.getString(YarnConfigOptions.PROPERTIES_FILE_LOCATION);
    final File yarnPropertiesLocation = getYarnPropertiesLocation(yarnPropertiesFileLocation);

    yarnPropertiesFile = new Properties();

    if (yarnPropertiesLocation.exists()) {
        LOG.info("Found Yarn properties file under {}.", yarnPropertiesLocation.getAbsolutePath());
        try (InputStream is = new FileInputStream(yarnPropertiesLocation)) {
            yarnPropertiesFile.load(is);
        } catch (IOException ioe) {
            throw new FlinkException("Could not read the Yarn properties file " + yarnPropertiesLocation +
                ". Please delete the file at " + yarnPropertiesLocation.getAbsolutePath() + '.', ioe);
        }

        // Read the application id string from the YARN properties.
        final String yarnApplicationIdString = yarnPropertiesFile.getProperty(YARN_APPLICATION_ID_KEY);
        if (yarnApplicationIdString == null) {
            throw new FlinkException("Yarn properties file found but doesn't contain a " +
                "Yarn application id. Please delete the file at " + yarnPropertiesLocation.getAbsolutePath());
        }

        try {
            // Try converting the id string into an ApplicationId object.
            yarnApplicationIdFromYarnProperties = ConverterUtils.toApplicationId(yarnApplicationIdString);
        } catch (Exception e) {
            throw new FlinkException("YARN properties contain an invalid entry for " +
                "application id: " + yarnApplicationIdString + ". Please delete the file at " +
                yarnPropertiesLocation.getAbsolutePath(), e);
        }
    } else {
        yarnApplicationIdFromYarnProperties = null;
    }
}
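The application id stored in the YARN properties file is a string of the form application_<clusterTimestamp>_<sequenceNumber>, which ConverterUtils.toApplicationId parses back into an ApplicationId. A simplified, hypothetical parser showing the shape of that validation (the real parsing lives in Hadoop, not Flink):

```java
public class AppIdParser {

    // Parse "application_<clusterTimestamp>_<sequenceNumber>",
    // e.g. "application_1610612740000_0042", into its two numeric parts.
    static long[] parse(String s) {
        String[] parts = s.split("_");
        if (parts.length != 3 || !"application".equals(parts[0])) {
            throw new IllegalArgumentException("Invalid ApplicationId: " + s);
        }
        return new long[] { Long.parseLong(parts[1]), Long.parseLong(parts[2]) };
    }

    public static void main(String[] args) {
        long[] id = parse("application_1610612740000_0042");
        System.out.println("clusterTimestamp=" + id[0] + ", sequenceNumber=" + id[1]);
    }
}
```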

Part 5: Creating the CliFrontend object (the current class)

public CliFrontend(
        Configuration configuration,
        ClusterClientServiceLoader clusterClientServiceLoader,
        List<CustomCommandLine> customCommandLines) {
    this.configuration = checkNotNull(configuration);
    this.customCommandLines = checkNotNull(customCommandLines);
    this.clusterClientServiceLoader = checkNotNull(clusterClientServiceLoader);

    // Initialize the file system.
    FileSystem.initialize(configuration, PluginUtils.createPluginManagerFromRootFolder(configuration));

    this.customCommandLineOptions = new Options();

    // Collect the options of every custom command line.
    for (CustomCommandLine customCommandLine : customCommandLines) {
        // Add its general options...
        customCommandLine.addGeneralOptions(customCommandLineOptions);
        // ...and its run options.
        customCommandLine.addRunOptions(customCommandLineOptions);
    }

    // Set the client timeout.
    this.clientTimeout = configuration.get(ClientOptions.CLIENT_TIMEOUT);
    // Set the default parallelism.
    this.defaultParallelism = configuration.getInteger(CoreOptions.DEFAULT_PARALLELISM);
}

Part 6: Installing the security configuration

SecurityUtils.install(new SecurityConfiguration(cli.configuration));

Part 7: Running the client according to the command-line arguments

// Invoke the callback and return the status code;
// the callback calls parseParameters on the CliFrontend instance:
int retCode = SecurityUtils.getInstalledContext()
        .runSecured(() -> cli.parseParameters(args));

/**
 * Parses the command line arguments and starts the requested action.
 *
 * @param args command line arguments of the client.
 * @return The return code of the program
 *  ACTION_RUN = "run";
 *  ACTION_RUN_APPLICATION = "run-application";
 *  ACTION_INFO = "info";
 *  ACTION_LIST = "list";
 *  ACTION_CANCEL = "cancel";
 *  ACTION_STOP = "stop";
 *  ACTION_SAVEPOINT = "savepoint";
 */
public int parseParameters(String[] args) {
    // check for action
    if (args.length < 1) {
        CliFrontendParser.printHelp(customCommandLines);
        System.out.println("Please specify an action.");
        return 1;
    }

    // Get the action, i.e. the first argument. For example:
    // flink run -m yarn-cluster -ynm datasync-item-itemcommon -yqu common -ys 2 -ytm 2048 -d -c com.wwdz.bigdata.flink.streaming.ItemDataSyncJob /home/flink/submitjar/trace-etl/datasync/master/20210114/datasync-0.1.jar
    String action = args[0];

    // remove action from parameters; keep the rest
    final String[] params = Arrays.copyOfRange(args, 1, args.length);

    try {
        // do action
        switch (action) {
            case ACTION_RUN:
                // the ordinary run mode; submits the client program
                run(params);
                return 0;
            case ACTION_RUN_APPLICATION:
                // the application mode; submits the client program
                runApplication(params);
                return 0;
            case ACTION_LIST:
                // list the running jobs
                list(params);
                return 0;
            case ACTION_INFO:
                info(params);
                return 0;
            case ACTION_CANCEL:
                cancel(params);
                return 0;
            case ACTION_STOP:
                stop(params);
                return 0;
            case ACTION_SAVEPOINT:
                savepoint(params);
                return 0;
            case "-h":
            case "--help":
                CliFrontendParser.printHelp(customCommandLines);
                return 0;
            case "-v":
            case "--version":
                String version = EnvironmentInformation.getVersion();
                String commitID = EnvironmentInformation.getRevisionInformation().commitId;
                System.out.print("Version: " + version);
                System.out.println(commitID.equals(EnvironmentInformation.UNKNOWN) ? "" : ", Commit ID: " + commitID);
                return 0;
            default:
                System.out.printf("\"%s\" is not a valid action.\n", action);
                System.out.println();
                System.out.println("Valid actions are \"run\", \"list\", \"info\", \"savepoint\", \"stop\", or \"cancel\".");
                System.out.println();
                System.out.println("Specify the version option (-v or --version) to print Flink version.");
                System.out.println();
                System.out.println("Specify the help option (-h or --help) to get help on the command.");
                return 1;
        }
    } catch (CliArgsException ce) {
        return handleArgException(ce);
    } catch (ProgramParametrizationException ppe) {
        return handleParametrizationException(ppe);
    } catch (ProgramMissingJobException pmje) {
        return handleMissingJobException();
    } catch (Exception e) {
        return handleError(e);
    }
}
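Stripped of the Flink-specific calls, parseParameters is a classic first-token dispatch: args[0] selects the action, the remainder is forwarded, and unknown actions map to exit code 1. A self-contained sketch of that pattern (dispatch and the printed messages are illustrative, not the Flink code):

```java
import java.util.Arrays;

public class ActionDispatcher {

    // First token selects the action; everything after it is forwarded as parameters.
    // Unknown or missing actions return exit code 1, recognized ones return 0.
    static int dispatch(String[] args) {
        if (args.length < 1) {
            System.out.println("Please specify an action.");
            return 1;
        }
        String action = args[0];
        String[] params = Arrays.copyOfRange(args, 1, args.length);
        switch (action) {
            case "run":
            case "run-application":
            case "list":
            case "info":
            case "cancel":
            case "stop":
            case "savepoint":
                System.out.println(action + " invoked with " + params.length + " parameter(s)");
                return 0;
            default:
                System.out.printf("\"%s\" is not a valid action.%n", action);
                return 1;
        }
    }

    public static void main(String[] args) {
        dispatch(new String[] {"run", "-c", "com.example.Job", "job.jar"});
    }
}
```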
1. The run action

 
/**
 * Executes the run action.
 *
 * @param args Command line arguments for the run action.
 */
protected void run(String[] args) throws Exception {
    LOG.info("Running 'run' command.");

    // Get the options of the run command...
    final Options commandOptions = CliFrontendParser.getRunCommandOptions();
    // ...and merge the user arguments into them, wrapped in a CommandLine object.
    final CommandLine commandLine = getCommandLine(commandOptions, args, true);

    // evaluate help flag; if help was requested, print it and return
    if (commandLine.hasOption(HELP_OPTION.getOpt())) {
        CliFrontendParser.printHelpForRun(customCommandLines);
        return;
    }

    // Validate and return the CustomCommandLine to use; the method called here is:
    //    /**
    //     * Gets the custom command-line for the arguments.
    //     * @param commandLine The input to the command-line.
    //     * @return custom command-line which is active (may only be one at a time)
    //     */
    //    public CustomCommandLine validateAndGetActiveCommandLine(CommandLine commandLine) {
    //        LOG.debug("Custom commandlines: {}", customCommandLines);
    //        // customCommandLines is the list of CustomCommandLine objects initialized in main()
    //        for (CustomCommandLine cli : customCommandLines) {
    //            LOG.debug("Checking custom commandline {}, isActive: {}", cli, cli.isActive(commandLine));
    //            if (cli.isActive(commandLine)) {
    //                return cli;
    //            }
    //        }
    //        throw new IllegalStateException("No valid command-line found.");
    //    }
    final CustomCommandLine activeCommandLine =
            validateAndGetActiveCommandLine(checkNotNull(commandLine));

    // Parse the submission parameters:
    //    public static ProgramOptions create(CommandLine line) throws CliArgsException {
    //        if (isPythonEntryPoint(line) || containsPythonDependencyOptions(line)) {
    //            // a PyFlink submission: create the Python program options
    //            return createPythonProgramOptions(line);
    //        } else {
    //            // a Java/Scala submission: create the ordinary program options
    //            return new ProgramOptions(line);
    //        }
    //    }
    final ProgramOptions programOptions = ProgramOptions.create(commandLine);

    // Build the program entry point...
    final PackagedProgram program =
            getPackagedProgram(programOptions);

    // ...collect the URLs of all dependency libraries...
    final List<URL> jobJars = program.getJobJarAndDependencies();

    // ...and assemble the effective configuration from the valid parameters.
    final Configuration effectiveConfiguration = getEffectiveConfiguration(
            activeCommandLine, commandLine, programOptions, jobJars);

    LOG.debug("Effective executor configuration: {}", effectiveConfiguration);

    try {
        executeProgram(effectiveConfiguration, program);
    } finally {
        // Clean up the libraries extracted into the temporary directory.
        program.deleteExtractedLibraries();
    }
}

1.1 The new ProgramOptions(line) constructor

protected ProgramOptions(CommandLine line) throws CliArgsException {
    super(line);

    // The entry point main class of the program (-c / --class).
    this.entryPointClass = line.hasOption(CLASS_OPTION.getOpt()) ?
        line.getOptionValue(CLASS_OPTION.getOpt()) : null;

    // The jar file location (-j / --jarfile).
    this.jarFilePath = line.hasOption(JAR_OPTION.getOpt()) ?
        line.getOptionValue(JAR_OPTION.getOpt()) : null;

    // The user-submitted program arguments, as an array.
    this.programArgs = extractProgramArgs(line);

    // Additional classpath URLs for the jar files (-C / --classpath).
    List<URL> classpaths = new ArrayList<>();
    if (line.hasOption(CLASSPATH_OPTION.getOpt())) {
        for (String path : line.getOptionValues(CLASSPATH_OPTION.getOpt())) {
            try {
                classpaths.add(new URL(path));
            } catch (MalformedURLException e) {
                throw new CliArgsException("Bad syntax for classpath: " + path);
            }
        }
    }
    this.classpaths = classpaths;

    // The parallelism (-p / --parallelism).
    if (line.hasOption(PARALLELISM_OPTION.getOpt())) {
        String parString = line.getOptionValue(PARALLELISM_OPTION.getOpt());
        try {
            parallelism = Integer.parseInt(parString);
            if (parallelism <= 0) {
                throw new NumberFormatException();
            }
        } catch (NumberFormatException e) {
            throw new CliArgsException("The parallelism must be a positive number: " + parString);
        }
    } else {
        // If absent, the parallelism is -1, meaning the default parallelism.
        parallelism = ExecutionConfig.PARALLELISM_DEFAULT;
    }

    // Detached (background) mode if -d or -yd is given.
    detachedMode = line.hasOption(DETACHED_OPTION.getOpt()) || line.hasOption(YARN_DETACHED_OPTION.getOpt());
    // With -sae, shut the cluster down when the attached client exits, e.g. on Ctrl+C.
    shutdownOnAttachedExit = line.hasOption(SHUTDOWN_IF_ATTACHED_OPTION.getOpt());

    // Savepoint restore settings; when the user passes -n, state that cannot be
    // mapped is skipped. See the option definitions below.
    this.savepointSettings = CliFrontendParser.createSavepointRestoreSettings(line);
}
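The -p handling above is a good example of the CLI's validation style: absent means the -1 sentinel (PARALLELISM_DEFAULT, i.e. fall back to the configured default), present means it must parse as a positive integer. Extracted into a standalone sketch (the class is mine; the sentinel value matches ExecutionConfig.PARALLELISM_DEFAULT):

```java
public class ParallelismParser {

    // Same sentinel that ExecutionConfig.PARALLELISM_DEFAULT uses.
    static final int PARALLELISM_DEFAULT = -1;

    // Absent option -> the default sentinel; present -> must be a positive integer,
    // anything else is rejected the way the CLI rejects bad arguments.
    static int parse(String value) {
        if (value == null) {
            return PARALLELISM_DEFAULT;
        }
        try {
            int p = Integer.parseInt(value);
            if (p <= 0) {
                throw new NumberFormatException();
            }
            return p;
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("The parallelism must be a positive number: " + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("4"));  // 4
        System.out.println(parse(null)); // -1 (use the configured default)
    }
}
```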

1.2 Common run options:

org.apache.flink.client.cli.CliFrontendParser

static final Option HELP_OPTION = new Option("h", "help", false,
        "Show the help message for the CLI Frontend or the action.");

static final Option JAR_OPTION = new Option("j", "jarfile", true, "Flink program JAR file.");

static final Option CLASS_OPTION = new Option("c", "class", true,
        "Class with the program entry point (\"main()\" method). Only needed if the " +
        "JAR file does not specify the class in its manifest.");

static final Option CLASSPATH_OPTION = new Option("C", "classpath", true, "Adds a URL to each user code " +
        "classloader  on all nodes in the cluster. The paths must specify a protocol (e.g. file://) and be " +
        "accessible on all nodes (e.g. by means of a NFS share). You can use this option multiple " +
        "times for specifying more than one URL. The protocol must be supported by the " +
        "{@link java.net.URLClassLoader}.");

public static final Option PARALLELISM_OPTION = new Option("p", "parallelism", true,
        "The parallelism with which to run the program. Optional flag to override the default value " +
        "specified in the configuration.");

/**
 * @deprecated This has no effect anymore, we're keeping it to not break existing bash scripts.
 */
@Deprecated
static final Option LOGGING_OPTION = new Option("q", "sysoutLogging", false, "If present, " +
        "suppress logging output to standard out.");

public static final Option DETACHED_OPTION = new Option("d", "detached", false, "If present, runs " +
        "the job in detached mode");

public static final Option SHUTDOWN_IF_ATTACHED_OPTION = new Option(
    "sae", "shutdownOnAttachedExit", false,
    "If the job is submitted in attached mode, perform a best-effort cluster shutdown " +
        "when the CLI is terminated abruptly, e.g., in response to a user interrupt, such as typing Ctrl + C.");

/**
 * @deprecated use non-prefixed variant {@link #DETACHED_OPTION} for both YARN and non-YARN deployments
 */
@Deprecated
public static final Option YARN_DETACHED_OPTION = new Option("yd", "yarndetached", false, "If present, runs " +
    "the job in detached mode (deprecated; use non-YARN specific option instead)");

public static final Option ARGS_OPTION = new Option("a", "arguments", true,
        "Program arguments. Arguments can also be added without -a, simply as trailing parameters.");

public static final Option ADDRESS_OPTION = new Option("m", "jobmanager", true,
        "Address of the JobManager to which to connect. " +
        "Use this flag to connect to a different JobManager than the one specified in the configuration.");

public static final Option SAVEPOINT_PATH_OPTION = new Option("s", "fromSavepoint", true,
        "Path to a savepoint to restore the job from (for example hdfs:///flink/savepoint-1537).");

public static final Option SAVEPOINT_ALLOW_NON_RESTORED_OPTION = new Option("n", "allowNonRestoredState", false,
        "Allow to skip savepoint state that cannot be restored. " +
        "You need to allow this if you removed an operator from your " +
        "program that was part of the program when the savepoint was triggered.");

static final Option SAVEPOINT_DISPOSE_OPTION = new Option("d", "dispose", true,
        "Path of savepoint to dispose.");

// list specific options
static final Option RUNNING_OPTION = new Option("r", "running", false,
        "Show only running programs and their JobIDs");

static final Option SCHEDULED_OPTION = new Option("s", "scheduled", false,
        "Show only scheduled programs and their JobIDs");

static final Option ALL_OPTION = new Option("a", "all", false,
    "Show all programs and their JobIDs");

static final Option ZOOKEEPER_NAMESPACE_OPTION = new Option("z", "zookeeperNamespace", true,
        "Namespace to create the Zookeeper sub-paths for high availability mode");

// ... (further options omitted for brevity)

1.3 Building the program entry point

 
// The call chain starts here:
private PackagedProgram getPackagedProgram(ProgramOptions programOptions) throws ProgramInvocationException, CliArgsException {
    PackagedProgram program;
    try {
        LOG.info("Building program from JAR file");
        program = buildProgram(programOptions);
    } catch (FileNotFoundException e) {
        throw new CliArgsException("Could not build the program from JAR file: " + e.getMessage(), e);
    }
    return program;
}

// ...and ends up in this constructor:
private PackagedProgram(
        @Nullable File jarFile,
        List<URL> classpaths,
        @Nullable String entryPointClassName,
        Configuration configuration,
        SavepointRestoreSettings savepointRestoreSettings,
        String... args) throws ProgramInvocationException {
    this.classpaths = checkNotNull(classpaths);
    this.savepointSettings = checkNotNull(savepointRestoreSettings);
    this.args = checkNotNull(args);

    // At least one of the two must be non-null.
    checkArgument(jarFile != null || entryPointClassName != null, "Either the jarFile or the entryPointClassName needs to be non-null.");

    // whether the job is a Python job.
    this.isPython = isPython(entryPointClassName);

    // load the jar file if it exists, and assign it to this.jarFile
    this.jarFile = loadJarFile(jarFile);

    assert this.jarFile != null || entryPointClassName != null;

    // now that we have an entry point, we can extract the nested jar files (if any)
    // from the runtime lib into a temporary directory
    this.extractedTempLibraries = this.jarFile == null ?
        Collections.emptyList() : extractContainedLibraries(this.jarFile);

    // Build the user-code class loader.
    this.userCodeClassLoader = ClientUtils.buildUserCodeClassLoader(
        getJobJarAndDependencies(),
        classpaths,
        getClass().getClassLoader(),
        configuration);

    // load the entry point class, i.e. the user's main class
    this.mainClass = loadMainClass(
        // if no entryPointClassName was given, we try and look one up through the manifest
        entryPointClassName != null ? entryPointClassName : getEntryPointClassNameFromJar(this.jarFile),
        userCodeClassLoader);

    // Fail if the class has no main method.
    if (!hasMainMethod(mainClass)) {
        throw new ProgramInvocationException("The given program class does not have a main(String[]) method.");
    }
}

1.4 Preparing the effective configuration before execution

    private Configuration getEffectiveConfiguration(
            final CustomCommandLine activeCustomCommandLine,
            final CommandLine commandLine,
            final ProgramOptions programOptions,
            final List<String> jobJars) throws FlinkException {

        // validate and wrap the program options and job jars
        final ExecutionConfigAccessor executionParameters = ExecutionConfigAccessor.fromProgramOptions(
                checkNotNull(programOptions),
                checkNotNull(jobJars));

        // validate and map the command-line options into a Configuration; below we use
        // org.apache.flink.yarn.cli.FlinkYarnSessionCli#applyCommandLineOptionsToConfiguration as the example
        final Configuration executorConfig = checkNotNull(activeCustomCommandLine)
                .applyCommandLineOptionsToConfiguration(commandLine);

        final Configuration effectiveConfiguration = new Configuration(executorConfig);
        executionParameters.applyToConfiguration(effectiveConfiguration);

        LOG.debug("Effective executor configuration: {}", effectiveConfiguration);
        return effectiveConfiguration;
    }

    @Override
    public Configuration applyCommandLineOptionsToConfiguration(CommandLine commandLine) throws FlinkException {
        // we ignore the addressOption because it can only contain "yarn-cluster"
        final Configuration effectiveConfiguration = new Configuration(configuration);
        applyDescriptorOptionToConfig(commandLine, effectiveConfiguration);

        // resolve the ApplicationId, if one was passed
        final ApplicationId applicationId = getApplicationId(commandLine);
        if (applicationId != null) {
            final String zooKeeperNamespace;
            if (commandLine.hasOption(zookeeperNamespace.getOpt())) {
                // take the ZooKeeper namespace from the command line
                zooKeeperNamespace = commandLine.getOptionValue(zookeeperNamespace.getOpt());
            } else {
                zooKeeperNamespace = effectiveConfiguration.getString(HA_CLUSTER_ID, applicationId.toString());
            }

            // set the HA cluster id
            effectiveConfiguration.setString(HA_CLUSTER_ID, zooKeeperNamespace);
            // set the yarn ApplicationId
            effectiveConfiguration.setString(YarnConfigOptions.APPLICATION_ID, ConverterUtils.toString(applicationId));
            // deployment target: attach to the existing yarn session
            effectiveConfiguration.setString(DeploymentOptions.TARGET, YarnSessionClusterExecutor.NAME);
        } else {
            effectiveConfiguration.setString(DeploymentOptions.TARGET, YarnJobClusterExecutor.NAME);
        }

        // JobManager memory (-jm); append the default unit "m" if none was given
        if (commandLine.hasOption(jmMemory.getOpt())) {
            String jmMemoryVal = commandLine.getOptionValue(jmMemory.getOpt());
            if (!MemorySize.MemoryUnit.hasUnit(jmMemoryVal)) {
                jmMemoryVal += "m";
            }
            effectiveConfiguration.set(JobManagerOptions.TOTAL_PROCESS_MEMORY, MemorySize.parse(jmMemoryVal));
        }

        // TaskManager memory (-tm); append the default unit "m" if none was given
        if (commandLine.hasOption(tmMemory.getOpt())) {
            String tmMemoryVal = commandLine.getOptionValue(tmMemory.getOpt());
            if (!MemorySize.MemoryUnit.hasUnit(tmMemoryVal)) {
                tmMemoryVal += "m";
            }
            effectiveConfiguration.set(TaskManagerOptions.TOTAL_PROCESS_MEMORY, MemorySize.parse(tmMemoryVal));
        }

        // number of task slots (-s)
        if (commandLine.hasOption(slots.getOpt())) {
            effectiveConfiguration.setInteger(TaskManagerOptions.NUM_TASK_SLOTS, Integer.parseInt(commandLine.getOptionValue(slots.getOpt())));
        }

        // apply any dynamic properties passed on the command line
        dynamicPropertiesEncoded = encodeDynamicProperties(commandLine);
        if (!dynamicPropertiesEncoded.isEmpty()) {
            Map<String, String> dynProperties = getDynamicProperties(dynamicPropertiesEncoded);
            for (Map.Entry<String, String> dynProperty : dynProperties.entrySet()) {
                effectiveConfiguration.setString(dynProperty.getKey(), dynProperty.getValue());
            }
        }

        // if a yarn properties file is in use, merge it in; finally return the effective configuration
        if (isYarnPropertiesFileMode(commandLine)) {
            return applyYarnProperties(effectiveConfiguration);
        } else {
            return effectiveConfiguration;
        }
    }
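One easy-to-miss detail above is how the `-jm`/`-tm` values are handled: if the value carries no unit, "m" (megabytes) is appended before `MemorySize.parse` is called. A minimal, self-contained sketch of that rule follows; the `hasUnit` check here is simplified to a trailing-letter test, which is an assumption for illustration, not Flink's exact `MemorySize.MemoryUnit.hasUnit` implementation:

```java
public class MemoryUnitSketch {

    // Simplified stand-in for MemorySize.MemoryUnit.hasUnit:
    // does the value end with a unit letter such as 'm' or 'g'?
    static boolean hasUnit(String value) {
        return !value.isEmpty() && Character.isLetter(value.charAt(value.length() - 1));
    }

    // Mirrors the defaulting rule in applyCommandLineOptionsToConfiguration:
    // a bare number is interpreted as megabytes.
    static String withDefaultUnit(String value) {
        return hasUnit(value) ? value : value + "m";
    }

    public static void main(String[] args) {
        System.out.println(withDefaultUnit("1024")); // bare number: becomes "1024m"
        System.out.println(withDefaultUnit("2g"));   // explicit unit is kept: "2g"
    }
}
```

So `flink run -m yarn-cluster -jm 1024 ...` and `-jm 1024m` end up as the same `JobManagerOptions.TOTAL_PROCESS_MEMORY` value.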

1.5 Executing the user program

    protected void executeProgram(final Configuration configuration, final PackagedProgram program) throws ProgramInvocationException {
        ClientUtils.executeProgram(new DefaultExecutorServiceLoader(), configuration, program, false, false);
    }

    public static void executeProgram(
            PipelineExecutorServiceLoader executorServiceLoader,
            Configuration configuration,
            PackagedProgram program,
            boolean enforceSingleJobExecution,
            boolean suppressSysout) throws ProgramInvocationException {
        checkNotNull(executorServiceLoader);

        // get the class loader built inside the PackagedProgram in the previous step
        final ClassLoader userCodeClassLoader = program.getUserCodeClassLoader();
        final ClassLoader contextClassLoader = Thread.currentThread().getContextClassLoader();
        try {
            Thread.currentThread().setContextClassLoader(userCodeClassLoader);

            LOG.info("Starting program (detached: {})", !configuration.getBoolean(DeploymentOptions.ATTACHED));

            // install the batch context environment
            ContextEnvironment.setAsContext(
                executorServiceLoader,
                configuration,
                userCodeClassLoader,
                enforceSingleJobExecution,
                suppressSysout);

            // install the streaming context environment
            StreamContextEnvironment.setAsContext(
                executorServiceLoader,
                configuration,
                userCodeClassLoader,
                enforceSingleJobExecution,
                suppressSysout);

            try {
                // invoke the user's main class, which finally executes the program
                program.invokeInteractiveModeForExecution();
            } finally {
                // restore the batch context environment
                ContextEnvironment.unsetAsContext();
                // restore the streaming context environment
                StreamContextEnvironment.unsetAsContext();
            }
        } finally {
            Thread.currentThread().setContextClassLoader(contextClassLoader);
        }
    }

This concludes the run-mode flow based on org.apache.flink.yarn.cli.FlinkYarnSessionCli. Walking through the client-side execution shows that 1.11 already differs substantially from earlier versions. On top of that, 1.11 introduces the run-application mode, which is very similar to the existing run mode.
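Both modes enter through the same dispatch in CliFrontend#parseParameters (the method invoked from main, as seen earlier), which switches on the first CLI argument. The following self-contained sketch only illustrates that dispatch idea; the returned strings and the class name `CliDispatchSketch` are illustrative assumptions, not the real Flink source:

```java
public class CliDispatchSketch {

    static final String ACTION_RUN = "run";
    static final String ACTION_RUN_APPLICATION = "run-application";

    // Simplified stand-in for CliFrontend#parseParameters:
    // choose the handler based on the first command-line argument.
    static String dispatch(String[] args) {
        if (args.length < 1) {
            return "print help";
        }
        String action = args[0];
        switch (action) {
            case ACTION_RUN:
                return "runProgram: classic submission via the active CustomCommandLine";
            case ACTION_RUN_APPLICATION:
                return "runApplication: deploy in Application Mode (new in 1.11)";
            default:
                return "unknown action: " + action;
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new String[]{ACTION_RUN_APPLICATION}));
    }
}
```

With that routing in mind, we can read the runApplication handler itself.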

    protected void runApplication(String[] args) throws Exception {
        LOG.info("Running 'run-application' command.");

        // get the options supported by the run command
        final Options commandOptions = CliFrontendParser.getRunCommandOptions();
        // merge the user arguments into the run options and wrap them in a CommandLine object
        final CommandLine commandLine = getCommandLine(commandOptions, args, true);

        // if it is a help request, print the help and return
        if (commandLine.hasOption(HELP_OPTION.getOpt())) {
            CliFrontendParser.printHelpForRun(customCommandLines);
            return;
        }

        // as before, validate the command-line arguments and pick the active CustomCommandLine
        final CustomCommandLine activeCommandLine =
                validateAndGetActiveCommandLine(checkNotNull(commandLine));

        // wrap the client-side arguments into a ProgramOptions object
        final ProgramOptions programOptions = new ProgramOptions(commandLine);

        // create the application deployer
        final ApplicationDeployer deployer =
                new ApplicationClusterDeployer(clusterClientServiceLoader);

        // verify that the jar exists
        programOptions.validate();
        final URI uri = PackagedProgramUtils.resolveURI(programOptions.getJarFilePath());

        // compute the effective configuration
        final Configuration effectiveConfiguration = getEffectiveConfiguration(
                activeCommandLine, commandLine, programOptions, Collections.singletonList(uri.toString()));

        // build an ApplicationConfiguration from the user's program arguments and entry-point class
        final ApplicationConfiguration applicationConfiguration =
                new ApplicationConfiguration(programOptions.getProgramArgs(), programOptions.getEntryPointClassName());

        // deploy the application with the deployer instance
        deployer.run(effectiveConfiguration, applicationConfiguration);
    }

Next, look at org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer#run:

    public <ClusterID> void run(
            final Configuration configuration,
            final ApplicationConfiguration applicationConfiguration) throws Exception {
        checkNotNull(configuration);
        checkNotNull(applicationConfiguration);

        LOG.info("Submitting application in 'Application Mode'.");

        // obtain the cluster client factory
        final ClusterClientFactory<ClusterID> clientFactory = clientServiceLoader.getClusterClientFactory(configuration);
        try (final ClusterDescriptor<ClusterID> clusterDescriptor = clientFactory.createClusterDescriptor(configuration)) {
            // derive the cluster specification
            final ClusterSpecification clusterSpecification = clientFactory.getClusterSpecification(configuration);
            // deploy the application
            clusterDescriptor.deployApplicationCluster(clusterSpecification, applicationConfiguration);
        }
    }

The deployment calls the deployApplicationCluster method of the ClusterDescriptor interface, which has two implementations: YarnClusterDescriptor and StandaloneClusterDescriptor.

We focus on the YarnClusterDescriptor implementation:

    @Override
    public ClusterClientProvider<ApplicationId> deployApplicationCluster(
            final ClusterSpecification clusterSpecification,
            final ApplicationConfiguration applicationConfiguration) throws ClusterDeploymentException {
        checkNotNull(clusterSpecification);
        checkNotNull(applicationConfiguration);

        // read the deployment target from the configuration
        final YarnDeploymentTarget deploymentTarget = YarnDeploymentTarget.fromConfig(flinkConfiguration);
        // the deployment target must be the run-application (Application) mode
        if (YarnDeploymentTarget.APPLICATION != deploymentTarget) {
            throw new ClusterDeploymentException(
                    "Couldn't deploy Yarn Application Cluster." +
                            " Expected deployment.target=" + YarnDeploymentTarget.APPLICATION.getName() +
                            " but actual one was \"" + deploymentTarget.getName() + "\"");
        }

        // merge the application configuration into the Flink configuration
        applicationConfiguration.applyToConfiguration(flinkConfiguration);

        // get the pipeline jars and make sure there is exactly one
        final List<String> pipelineJars = flinkConfiguration.getOptional(PipelineOptions.JARS).orElse(Collections.emptyList());
        Preconditions.checkArgument(pipelineJars.size() == 1, "Should only have one jar");

        try {
            // deploy the application to the yarn cluster
            return deployInternal(
                    clusterSpecification,
                    "Flink Application Cluster",
                    YarnApplicationClusterEntryPoint.class.getName(),
                    null,
                    false);
        } catch (Exception e) {
            throw new ClusterDeploymentException("Couldn't deploy Yarn Application Cluster", e);
        }
    }

Finally, the last step:

    /**
     * This method will block until the ApplicationMaster/JobManager have been deployed on YARN.
     *
     * @param clusterSpecification Initial cluster specification for the Flink cluster to be deployed
     * @param applicationName name of the Yarn application to start
     * @param yarnClusterEntrypoint Class name of the Yarn cluster entry point.
     * @param jobGraph A job graph which is deployed with the Flink cluster, {@code null} if none
     * @param detached True if the cluster should be started in detached mode
     */
    private ClusterClientProvider<ApplicationId> deployInternal(
            ClusterSpecification clusterSpecification,
            String applicationName,
            String yarnClusterEntrypoint,
            @Nullable JobGraph jobGraph,
            boolean detached) throws Exception {

        // get the user submitting the application
        final UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
        if (HadoopUtils.isKerberosSecurityEnabled(currentUser)) {
            boolean useTicketCache = flinkConfiguration.getBoolean(SecurityOptions.KERBEROS_LOGIN_USETICKETCACHE);

            if (!HadoopUtils.areKerberosCredentialsValid(currentUser, useTicketCache)) {
                throw new RuntimeException("Hadoop security with Kerberos is enabled but the login user " +
                    "does not have Kerberos credentials or delegation tokens!");
            }
        }

        isReadyForDeployment(clusterSpecification);

        // ------------------ Check if the specified queue exists --------------------
        checkYarnQueues(yarnClient);

        // ------------------ Check if the YARN ClusterClient has the requested resources --------------

        // Create application via yarnClient
        final YarnClientApplication yarnApplication = yarnClient.createApplication();
        final GetNewApplicationResponse appResponse = yarnApplication.getNewApplicationResponse();

        // maximum resources the cluster can allocate
        Resource maxRes = appResponse.getMaximumResourceCapability();

        // currently free cluster resources
        final ClusterResourceDescription freeClusterMem;
        try {
            freeClusterMem = getCurrentFreeClusterResources(yarnClient);
        } catch (YarnException | IOException e) {
            failSessionDuringDeployment(yarnClient, yarnApplication);
            throw new YarnDeploymentException("Could not retrieve information about free cluster resources.", e);
        }

        // minimum container memory the scheduler will allocate
        final int yarnMinAllocationMB = yarnConfiguration.getInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 0);

        final ClusterSpecification validClusterSpecification;
        try {
            validClusterSpecification = validateClusterResources(
                    clusterSpecification,
                    yarnMinAllocationMB,
                    maxRes,
                    freeClusterMem);
        } catch (YarnDeploymentException yde) {
            failSessionDuringDeployment(yarnClient, yarnApplication);
            throw yde;
        }

        LOG.info("Cluster specification: {}", validClusterSpecification);

        final ClusterEntrypoint.ExecutionMode executionMode = detached ?
                ClusterEntrypoint.ExecutionMode.DETACHED
                : ClusterEntrypoint.ExecutionMode.NORMAL;
        flinkConfiguration.setString(ClusterEntrypoint.EXECUTION_MODE, executionMode.toString());

        // start the ApplicationMaster
        ApplicationReport report = startAppMaster(
                flinkConfiguration,
                applicationName,
                yarnClusterEntrypoint,
                jobGraph,
                yarnClient,
                yarnApplication,
                validClusterSpecification);

        // print the application id for the user to cancel themselves:
        // in detached mode, log the ApplicationId
        if (detached) {
            final ApplicationId yarnApplicationId = report.getApplicationId();
            logDetachedClusterInformation(yarnApplicationId, LOG);
        }

        // write the cluster information from the report (e.g. the ApplicationId) back into flinkConfiguration
        setClusterEntrypointInfoToConfig(report);

        return () -> {
            try {
                // return a RestClusterClient, through which the caller can talk to the deployed cluster
                return new RestClusterClient<>(flinkConfiguration, report.getApplicationId());
            } catch (Exception e) {
                throw new RuntimeException("Error while creating RestClusterClient.", e);
            }
        };
    }
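A note on validateClusterResources above: it checks the requested JobManager/TaskManager memory against the scheduler's minimum allocation (yarnMinAllocationMB) and the cluster maximum (maxRes). The key idea, sketched below as a hedged illustration rather than Flink's exact code, is that a YARN scheduler grants container memory in multiples of yarn.scheduler.minimum-allocation-mb, so requests are effectively rounded up:

```java
public class YarnAllocationSketch {

    // Round a memory request up to the next multiple of the scheduler's
    // minimum allocation, which is how YARN effectively sizes containers.
    static int normalize(int requestedMB, int minAllocationMB) {
        if (requestedMB <= 0 || minAllocationMB <= 0) {
            throw new IllegalArgumentException("memory values must be positive");
        }
        return ((requestedMB + minAllocationMB - 1) / minAllocationMB) * minAllocationMB;
    }

    public static void main(String[] args) {
        System.out.println(normalize(1000, 1024)); // fits in one 1024 MB container
        System.out.println(normalize(1536, 1024)); // rounded up to 2048 MB
    }
}
```

This is why a -tm value slightly above a multiple of the minimum allocation can end up consuming a whole extra allocation unit on the cluster.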

This completes the run-application deployment flow.

Looking back over the whole deployment, org.apache.flink.client.deployment.application.cli.ApplicationClusterDeployer#run is the heart of it. So if we prepare the Configuration and ApplicationConfiguration ourselves, we can skip the earlier steps and submit our job to the cluster directly. Following the sample code from the WeChat account [大数据技术与应用实战] at https://github.com/zhangjun0x01/bigdata-examples.git (cluster.SubmitJobApplicationMode), we can assemble the parameters ourselves, deploy the job to the yarn cluster, and use the returned client to obtain information such as the ApplicationId of the running application. This makes it possible to integrate Flink job deployment into our own big-data platform in a lightweight way.

Sample code:

    public static void main(String[] args) {
        // local Flink configuration directory, used to load the Flink configuration
        String configurationDirectory = "/Users/user/work/flink/conf/";
        // directory holding the Flink cluster jars
        String flinkLibs = "hdfs://hadoopcluster/data/flink/libs";
        // the user jar
        String userJarPath = "hdfs://hadoopcluster/data/flink/user-lib/TopSpeedWindowing.jar";
        String flinkDistJar = "hdfs://hadoopcluster/data/flink/libs/flink-yarn_2.11-1.11.0.jar";

        YarnClient yarnClient = YarnClient.createYarnClient();
        YarnConfiguration yarnConfiguration = new YarnConfiguration();
        yarnClient.init(yarnConfiguration);
        yarnClient.start();

        YarnClusterInformationRetriever clusterInformationRetriever = YarnClientYarnClusterInformationRetriever
                .create(yarnClient);

        // load the Flink configuration
        Configuration flinkConfiguration = GlobalConfiguration.loadConfiguration(
                configurationDirectory);
        flinkConfiguration.set(CheckpointingOptions.INCREMENTAL_CHECKPOINTS, true);
        flinkConfiguration.set(
                PipelineOptions.JARS,
                Collections.singletonList(
                        userJarPath));

        Path remoteLib = new Path(flinkLibs);
        flinkConfiguration.set(
                YarnConfigOptions.PROVIDED_LIB_DIRS,
                Collections.singletonList(remoteLib.toString()));

        flinkConfiguration.set(
                YarnConfigOptions.FLINK_DIST_JAR,
                flinkDistJar);

        // use Application Mode
        flinkConfiguration.set(
                DeploymentOptions.TARGET,
                YarnDeploymentTarget.APPLICATION.getName());

        // yarn application name
        flinkConfiguration.set(YarnConfigOptions.APPLICATION_NAME, "jobName");

        // cluster specification
        ClusterSpecification clusterSpecification = new ClusterSpecification.ClusterSpecificationBuilder()
                .createClusterSpecification();

        // program arguments and entry-point class of the user jar
        ApplicationConfiguration appConfig = new ApplicationConfiguration(args, null);

        YarnClusterDescriptor yarnClusterDescriptor = new YarnClusterDescriptor(
                flinkConfiguration,
                yarnConfiguration,
                yarnClient,
                clusterInformationRetriever,
                true);

        ClusterClientProvider<ApplicationId> clusterClientProvider = null;
        try {
            clusterClientProvider = yarnClusterDescriptor.deployApplicationCluster(
                    clusterSpecification,
                    appConfig);
        } catch (ClusterDeploymentException e) {
            e.printStackTrace();
        }

        ClusterClient<ApplicationId> clusterClient = clusterClientProvider.getClusterClient();
        ApplicationId applicationId = clusterClient.getClusterId();
        System.out.println(applicationId);
    }

That's all; corrections are welcome.

References:

https://blog.csdn.net/weixin_43161811/article/details/103152867

https://blog.csdn.net/zhangjun5965/article/details/107511615
