File Distribution in Flink on YARN

1. Related configuration

public static final ConfigOption<List<String>> SHIP_FILES =
        key("yarn.ship-files")
                .stringType()
                .asList()
                .noDefaultValue()
                .withDeprecatedKeys("yarn.ship-directories")
                .withDescription(
                        "A semicolon-separated list of files and/or directories to be shipped to the YARN cluster.");

  This option is used to ship third-party files and directories from the client to the YARN cluster.
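  To make the value format concrete, here is a minimal sketch of how a semicolon-separated value such as the one for yarn.ship-files breaks down into individual entries. It uses only the JDK and is a simplified stand-in, not Flink's own parsing code; the class name, method name, and paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ShipFilesParseSketch {

    // Splits a semicolon-separated option value into trimmed, non-empty entries.
    // Simplified stand-in for illustration; Flink's real list parsing may differ.
    static List<String> parse(String rawValue) {
        List<String> entries = new ArrayList<>();
        if (rawValue == null) {
            return entries;
        }
        for (String part : rawValue.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Hypothetical local paths: one file and one directory.
        for (String entry : parse("/opt/job/conf/app.properties;/opt/job/udf-libs")) {
            System.out.println(entry);
        }
    }
}
```

  Each resulting entry can be either a single file or a directory, which matches the option description above.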

2. File loading

  In YarnClusterDescriptor, the option is read and its entries are added to a collection of files to ship:

decodeFilesToShipToCluster(flinkConfiguration, YarnConfigOptions.SHIP_FILES)
        .ifPresent(this::addShipFiles);

  The files are then uploaded later, in startAppMaster:

// Register all files in provided lib dirs as local resources with public visibility
// and upload the remaining dependencies as local resources with APPLICATION visibility.
final List<String> systemClassPaths = fileUploader.registerProvidedLocalResources();
final List<String> uploadedDependencies =
        fileUploader.registerMultipleLocalResources(
                systemShipFiles.stream()
                        .map(e -> new Path(e.toURI()))
                        .collect(Collectors.toSet()),
                Path.CUR_DIR,
                LocalResourceType.FILE);
systemClassPaths.addAll(uploadedDependencies);

  The actual upload happens in the uploadLocalFileToRemote method of YarnApplicationFileUploader, which copies the data to HDFS:

final Path applicationDir = getApplicationDirPath(homeDir, applicationId);
final String suffix =
        (relativeDstPath.isEmpty() ? "" : relativeDstPath + "/") + localSrcPath.getName();
final Path dst = new Path(applicationDir, suffix);

LOG.debug(
        "Copying from {} to {} with replication factor {}",
        localSrcPath,
        dst,
        replicationFactor);
fileSystem.copyFromLocalFile(false, true, localSrcPath, dst);
fileSystem.setReplication(dst, (short) replicationFactor);
return dst;
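  The destination path logic in the excerpt above can be sketched with plain strings. This is only an illustration of how the suffix is assembled, assuming a simple "/" join; the real code uses Hadoop's Path, which also normalizes the result. The class name, method name, and paths are hypothetical.

```java
class RemotePathSketch {

    // Mirrors the suffix logic in the excerpt: "<relativeDstPath>/<fileName>",
    // or just "<fileName>" when no relative destination is given.
    static String remoteDestination(
            String applicationDir, String relativeDstPath, String fileName) {
        String suffix = (relativeDstPath.isEmpty() ? "" : relativeDstPath + "/") + fileName;
        // Plain string join; Hadoop's Path would additionally normalize the path.
        return applicationDir + "/" + suffix;
    }

    public static void main(String[] args) {
        // Hypothetical application directory and file names.
        String appDir = "hdfs:///user/alice/.flink/application_0001";
        System.out.println(remoteDestination(appDir, "", "app.properties"));
        System.out.println(remoteDestination(appDir, "lib", "udf.jar"));
    }
}
```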

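  One thing to note here: the upload goes through fileSystem (which also determines homeDir), and fileSystem is passed in when the YarnApplicationFileUploader is created. Both homeDir and fileSystem are determined by yarn.staging-directory. If that option is not set, they are determined by fs.defaultFS, which defaults to the local file system.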

Path stagingDirPath = getStagingDir(fs);
FileSystem stagingDirFs = stagingDirPath.getFileSystem(yarnConfiguration);
final YarnApplicationFileUploader fileUploader =
        YarnApplicationFileUploader.from(
                stagingDirFs,
                stagingDirPath,
                providedLibDirs,
                appContext.getApplicationId(),
                getFileReplication());
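  The fallback described above can be sketched as follows. This is a simplified model of the decision, not the body of getStagingDir: it assumes the caller already knows the home directory of the default file system and passes it in. The class name, method name, and URIs are hypothetical.

```java
import java.net.URI;

class StagingDirSketch {

    // Returns the configured staging directory if one is set; otherwise falls back
    // to the home directory of the default file system (supplied by the caller).
    static URI resolveStagingDir(String configuredStagingDir, URI defaultFsHomeDir) {
        if (configuredStagingDir != null && !configuredStagingDir.trim().isEmpty()) {
            return URI.create(configuredStagingDir.trim());
        }
        return defaultFsHomeDir;
    }

    public static void main(String[] args) {
        // Hypothetical fallback: a home directory on the local file system.
        URI localHome = URI.create("file:///home/alice");
        System.out.println(resolveStagingDir(null, localHome));
        System.out.println(resolveStagingDir("hdfs://namenode:8020/flink-staging", localHome));
    }
}
```

  The scheme of the resolved directory is what matters: a file:// staging directory means the "upload" stays on the client machine, so the NodeManagers cannot fetch the files unless that path is shared.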

3. Startup

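  Depending on the deployment mode, the entry point is YarnApplicationClusterEntryPoint, YarnJobClusterEntrypoint, and so on; the REST endpoint that gets started is MiniDispatcherRestEndpoint in each case.
  The most important point here is the configuration directory, which is also the directory where the files from yarn.ship-files end up: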

Map<String, String> env = System.getenv();

final String workingDirectory = env.get(ApplicationConstants.Environment.PWD.key());

  Finally, this directory is passed in as the configuration directory when the configuration is loaded:

final Configuration configuration =
        YarnEntrypointUtils.loadConfiguration(workingDirectory, dynamicParameters, env);
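  Since the working directory comes from the PWD environment variable of the container, user code running there can locate a shipped file relative to it. The sketch below assumes the file was shipped as a single top-level file and therefore sits directly in the working directory; the class name, method names, and file name are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.Map;

class WorkingDirSketch {

    // Reads the container working directory from the environment map.
    // "PWD" is the key that ApplicationConstants.Environment.PWD resolves to.
    static String workingDirectory(Map<String, String> env) {
        String pwd = env.get("PWD");
        if (pwd == null) {
            throw new IllegalStateException("PWD is not set in the environment");
        }
        return pwd;
    }

    // Resolves a shipped file by name relative to the working directory.
    static Path shippedFile(Map<String, String> env, String fileName) {
        return Paths.get(workingDirectory(env)).resolve(fileName);
    }

    public static void main(String[] args) {
        // Hypothetical environment and file name; inside a container one would
        // pass System.getenv() instead.
        Map<String, String> env = Collections.singletonMap("PWD", "/tmp/container_01");
        System.out.println(shippedFile(env, "app.properties"));
    }
}
```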

4. Additional notes

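  Note that there is one more option that can speed up application submission by avoiding shipping the Flink jar files every time. The subsequent file upload flow will ...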

public static final ConfigOption<List<String>> PROVIDED_LIB_DIRS =
        key("yarn.provided.lib.dirs")
                .stringType()
                .asList()
                .noDefaultValue()
                .withDescription(
                        "A semicolon-separated list of provided lib directories. They should be pre-uploaded and "
                                + "world-readable. Flink will use them to exclude the local Flink jars(e.g. flink-dist, lib/, plugins/)"
                                + "uploading to accelerate the job submission process. Also YARN will cache them on the nodes so that "
                                + "they doesn't need to be downloaded every time for each application. An example could be "
                                + "hdfs://$namenode_address/path/of/flink/lib");

  If this option is empty, the Flink jar directory is added to the files to ship:

if (providedLibDirs == null || providedLibDirs.isEmpty()) {
    addLibFoldersToShipFiles(systemShipFiles);
}

  The jar directory comes from an environment variable:

String libDir = System.getenv().get(ENV_FLINK_LIB_DIR);
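  Putting the last two excerpts together, the decision can be modeled roughly as below. This is a sketch of the branching only, not Flink's implementation: it treats the local lib directory as a plain string supplied by the caller, and the class name, method name, and paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ProvidedLibDirsSketch {

    // Returns the list of entries that would be shipped: the user's entries, plus the
    // local lib directory only when no provided lib dirs are configured.
    static List<String> effectiveShipFiles(
            List<String> userShipFiles, List<String> providedLibDirs, String localLibDir) {
        List<String> result = new ArrayList<>(userShipFiles);
        boolean noProvidedLibs = providedLibDirs == null || providedLibDirs.isEmpty();
        if (noProvidedLibs && localLibDir != null) {
            result.add(localLibDir);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> user = Arrays.asList("/opt/job/conf/app.properties"); // hypothetical
        String libDir = "/opt/flink/lib"; // hypothetical value of the lib dir variable

        // No provided lib dirs: the local lib directory is shipped as well.
        System.out.println(effectiveShipFiles(user, Collections.<String>emptyList(), libDir));

        // Provided lib dirs configured: only the user's entries are shipped.
        System.out.println(
                effectiveShipFiles(user, Arrays.asList("hdfs:///flink/lib"), libDir));
    }
}
```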