Flink Source Code Reading: the FileSystem Connector

The code lives in the flink-table-runtime-blink module; for the user guide, see the official documentation.

What is described here is the legacy implementation; it is planned to be reimplemented on top of FLIP-95 (FLINK-19336).

The entry class is FileSystemTableFactory. How Factory discovery works was covered in an earlier post, so it is not repeated here.
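For readers who have not seen the earlier post, here is a rough sketch of the matching idea behind factory discovery. It is not Flink's code: `SketchFactory`, `find`, and `factoryFor` are hypothetical names, and the candidate list is built by hand, whereas Flink loads candidates through Java's `ServiceLoader`. The assumption illustrated is that each factory declares a required context (for example `connector` = `filesystem`) and is selected when the table options contain it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FactoryDiscoverySketch {

    /** Hypothetical stand-in for a table factory; not Flink's interface. */
    interface SketchFactory {
        /** Key/value pairs that must all appear in the table options. */
        Map<String, String> requiredContext();
    }

    /** Returns the single candidate whose required context matches the options. */
    static SketchFactory find(List<SketchFactory> candidates, Map<String, String> options) {
        List<SketchFactory> matches = new ArrayList<>();
        for (SketchFactory factory : candidates) {
            boolean allMatch = true;
            for (Map.Entry<String, String> e : factory.requiredContext().entrySet()) {
                if (!e.getValue().equals(options.get(e.getKey()))) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                matches.add(factory);
            }
        }
        if (matches.size() != 1) {
            throw new IllegalStateException("Expected exactly one factory, found " + matches.size());
        }
        return matches.get(0);
    }

    /** Builds a factory that only requires a given 'connector' value. */
    static SketchFactory factoryFor(String connector) {
        Map<String, String> context = new HashMap<>();
        context.put("connector", connector);
        return () -> context;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("connector", "filesystem");
        options.put("path", "/tmp/out");
        SketchFactory chosen = find(
                Arrays.asList(factoryFor("filesystem"), factoryFor("kafka")), options);
        System.out.println(chosen.requiredContext());
    }
}
```

Failing when zero or several candidates match keeps an ambiguous table definition from silently picking the wrong connector.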

Sink

createTableSink builds a FileSystemTableSink object and passes in the relevant properties:

public TableSink<RowData> createTableSink(TableSinkFactory.Context context) {
		Configuration conf = new Configuration();
		context.getTable().getOptions().forEach(conf::setString);

		return new FileSystemTableSink(
				context.getObjectIdentifier(), // identifier of the table
				context.isBounded(), // whether the input is bounded
				context.getTable().getSchema(), // table schema
				getPath(conf), // file path
				context.getTable().getPartitionKeys(), // partition keys
				conf.get(PARTITION_DEFAULT_NAME), // default partition name
				context.getTable().getOptions()); // table options
	}
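The first two lines of the method copy the table's string options into a Configuration so that typed lookups with defaults can be used afterwards. The sketch below imitates that pattern with a hypothetical `SimpleConf` class instead of Flink's Configuration. The option key `partition.default-name` and the default `__DEFAULT_PARTITION__` are what I believe the connector uses, but treat them as assumptions here.

```java
import java.util.HashMap;
import java.util.Map;

final class OptionsSketch {

    /** Hypothetical, minimal stand-in for a string-keyed configuration. */
    static final class SimpleConf {
        private final Map<String, String> values = new HashMap<>();

        void setString(String key, String value) {
            values.put(key, value);
        }

        /** Returns the stored value, or the default when the key is absent. */
        String get(String key, String defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }
    }

    /** Same shape as options.forEach(conf::setString) in the listing above. */
    static SimpleConf fromOptions(Map<String, String> options) {
        SimpleConf conf = new SimpleConf();
        options.forEach(conf::setString);
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put("connector", "filesystem");
        options.put("path", "/tmp/out");
        SimpleConf conf = fromOptions(options);
        System.out.println(conf.get("path", ""));
        // Key and default value below are assumptions, used only for illustration.
        System.out.println(conf.get("partition.default-name", "__DEFAULT_PARTITION__"));
    }
}
```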

FileSystemTableSink then builds a DataStreamSink from the incoming DataStream.

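consumeDataStream mainly does the following:

  1. Builds a RowDataPartitionComputer, which separates the indexes and types of the partition fields from those of the non-partition fields.
  2. Creates an EmptyMetaStoreFactory, an empty metastore implementation.
  3. Generates the file prefix from a UUID.
  4. Builds the FileSystemFactory implementation.
  5. Takes a different branch depending on whether the stream is bounded.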
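Steps 1 and 3 can be sketched in plain Java as follows. This is a simplified illustration, not RowDataPartitionComputer itself: the class and method names are hypothetical, rows are plain `Object[]` arrays, field types are ignored, and no path escaping is done. It shows how partition and non-partition field indexes are separated, how a Hive-style partition path can be derived from a row, and how a UUID-based part prefix is formed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.UUID;
import java.util.stream.IntStream;

final class PartitionComputerSketch {
    final String defaultPartName;
    final String[] partitionColumns;
    final int[] partitionIndexes;
    final int[] nonPartitionIndexes;

    PartitionComputerSketch(String defaultPartName, String[] fieldNames, String[] partitionColumns) {
        this.defaultPartName = defaultPartName;
        this.partitionColumns = partitionColumns;
        List<String> names = Arrays.asList(fieldNames);
        List<String> parts = Arrays.asList(partitionColumns);
        // Position of each partition column inside the full row.
        this.partitionIndexes = parts.stream().mapToInt(names::indexOf).toArray();
        // Every remaining position belongs to the data actually written to files.
        this.nonPartitionIndexes = IntStream.range(0, fieldNames.length)
                .filter(i -> !parts.contains(fieldNames[i]))
                .toArray();
    }

    /** Builds a relative path such as "dt=2020-09-01/hr=12/" from the partition values. */
    String partitionPath(Object[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < partitionIndexes.length; i++) {
            Object value = row[partitionIndexes[i]];
            // Null or empty values fall back to the default partition name.
            String text = (value == null || value.toString().isEmpty())
                    ? defaultPartName : value.toString();
            sb.append(partitionColumns[i]).append('=').append(text).append('/');
        }
        return sb.toString();
    }

    /** Drops the partition columns, keeping only the fields stored in the file. */
    Object[] projectNonPartition(Object[] row) {
        return Arrays.stream(nonPartitionIndexes).mapToObj(i -> row[i]).toArray();
    }

    /** A per-job unique part file prefix, mirroring "part-" + UUID in the listing. */
    static String partPrefix() {
        return "part-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        PartitionComputerSketch computer = new PartitionComputerSketch(
                "__DEFAULT_PARTITION__",
                new String[] {"id", "name", "dt", "hr"},
                new String[] {"dt", "hr"});
        Object[] row = {1, "a", "2020-09-01", null};
        System.out.println(computer.partitionPath(row));
        System.out.println(Arrays.toString(computer.projectNonPartition(row)));
        System.out.println(partPrefix());
    }
}
```

Because partition values are already encoded in the directory path, only the projected non-partition fields need to be handed to the file writer.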
public final DataStreamSink<RowData> consumeDataStream(DataStream<RowData> dataStream) {
		RowDataPartitionComputer computer = new RowDataPartitionComputer(
				defaultPartName,
				schema.getFieldNames(),
				schema.getFieldDataTypes(),
				partitionKeys.toArray(new String[0]));

		EmptyMetaStoreFactory metaStoreFactory = new EmptyMetaStoreFactory(path);
		OutputFileConfig outputFileConfig = OutputFileConfig.builder()
				.withPartPrefix("part-" + UUID.randomUUID().toString())
				.build();
		FileSystemFactory fsFactory = FileSystem::get;

		if (isBounded) {
			FileSystemOutputFormat.Builder<RowData> builder = new FileSystemOutputFormat.Builder<>();
			builder.setPartitionComputer(computer);
			builder.setDynamicGrouped(dynamicGrouping);
			builder.setPartitionColumns(partitionKeys.toArray(new String[0]));
			builder.setFormatFactory(createOutputFormatFactory());
			builder.setMetaStoreFactory(metaStoreFactory);
			builder.setFileSystemFactory(fsFactory);
			builder.setOverwrite(overwrite);
			builder.setStaticPartitions(staticPartitions);
			builder.setTempPath(toStagingPath());
			builder.setOutputFileConfig(outputFileConfig);
			return dataStream.writeUsingOutputFormat(builder.build())
					.setParallelism(dataStream.getParallelism());
		} else {
			Configuration conf = new Configuration();
			properties.forEach(conf::setString);
			Object writer = createWriter();
			TableBucketAssigner assigner = new TableBucketAssigner(computer);
			TableRollingPolicy rollingPolicy = new TableRollingPolicy(
					!(writer instanceof Encoder),
					conf.get(SINK_ROLLING_POLICY_FILE_SIZE).getBytes(),
					conf.get(SINK_ROLLING_POLICY_ROLLOVER_INTERVAL).toMillis());

			BucketsBuilder<RowData, String, ? extends BucketsBuilder<RowData, ?, ?>> bucketsBuilder;
			if (writer instanceof Encoder) {
				//noinspection unchecked
				bucketsBuilder = StreamingFileSink.forRowFormat(
						path, new ProjectionEncoder((Encoder<RowData>) writer, computer))
						.withBucketAssigner(assigner)
						.withOutputFileConfig(outputFileConfig)
						.withRollingPolicy(rollingPolicy);
			} else {
				//noinspection unchecked
				bucketsBuilder = StreamingFileSink.forBulkFormat(
						path, new ProjectionBulkFactory((BulkWriter.Factory<RowData>) writer, computer))
						.withBucketAssigner(assigner)
						.withOutputFileConfig(outputFileConfig)
						.withRollingPolicy(rollingPolicy);
			}
			return createStreamingSink(
					conf,
					path,
					partitionKeys,
					tableIdentifier,
					overwrite,
					dataStream,
					bucketsBuilder,
					metaStoreFactory,
					fsFactory,
					conf.get(SINK_ROLLING_POLICY_CHECK_INTERVAL).toMillis(
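
In the streaming branch above, TableRollingPolicy receives three arguments: a flag that is true when the writer is not an Encoder (that is, for bulk formats), a maximum file size, and a rollover interval. The class below is a hedged sketch of how such a policy could decide when to close a part file. It is not Flink's TableRollingPolicy: the method names and parameters are simplified assumptions that work on plain numbers instead of part file state objects.

```java
final class RollingPolicySketch {
    private final boolean rollOnCheckpoint;
    private final long maxPartSizeBytes;
    private final long rolloverIntervalMillis;

    RollingPolicySketch(boolean rollOnCheckpoint, long maxPartSizeBytes, long rolloverIntervalMillis) {
        this.rollOnCheckpoint = rollOnCheckpoint;
        this.maxPartSizeBytes = maxPartSizeBytes;
        this.rolloverIntervalMillis = rolloverIntervalMillis;
    }

    /** Bulk formats roll at every checkpoint; row formats only when the file is too large. */
    boolean shouldRollOnCheckpoint(long partSizeBytes) {
        return rollOnCheckpoint || partSizeBytes > maxPartSizeBytes;
    }

    /** Checked after each record: roll once the size limit is exceeded. */
    boolean shouldRollOnEvent(long partSizeBytes) {
        return partSizeBytes > maxPartSizeBytes;
    }

    /** Checked periodically: roll once the file has been open for the whole interval. */
    boolean shouldRollOnProcessingTime(long creationTimeMillis, long nowMillis) {
        return nowMillis - creationTimeMillis >= rolloverIntervalMillis;
    }

    public static void main(String[] args) {
        // Hypothetical limits: 128 MiB per file, 30 minutes per file.
        RollingPolicySketch rowFormat = new RollingPolicySketch(false, 128L << 20, 30 * 60_000L);
        RollingPolicySketch bulkFormat = new RollingPolicySketch(true, 128L << 20, 30 * 60_000L);
        System.out.println(rowFormat.shouldRollOnCheckpoint(1024));
        System.out.println(bulkFormat.shouldRollOnCheckpoint(1024));
    }
}
```

The check interval read from SINK_ROLLING_POLICY_CHECK_INTERVAL at the end of the listing would, under this reading, control how often the processing-time check is evaluated.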