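The code lives in the flink-table-runtime-blink module; for usage, see the user guide on the official website.
This is still the old implementation. It will be reimplemented according to FLIP-95 (FLINK-19336).
The entry class is FileSystemTableFactory. How factory discovery works is covered in an earlier post, so it is not repeated here.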
Sink
The factory constructs a FileSystemTableSink object and passes in the relevant properties:
public TableSink<RowData> createTableSink(TableSinkFactory.Context context) {
    Configuration conf = new Configuration();
    context.getTable().getOptions().forEach(conf::setString);
    return new FileSystemTableSink(
            context.getObjectIdentifier(), // table identifier (catalog.database.table)
            context.isBounded(), // whether the input is bounded
            context.getTable().getSchema(), // table schema
            getPath(conf), // file path
            context.getTable().getPartitionKeys(), // partition keys
            conf.get(PARTITION_DEFAULT_NAME), // default partition name
            context.getTable().getOptions()); // all table options
}
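To make the option handling above concrete, here is a minimal plain-Java sketch of the same pattern: copy the table's string options into a config object, then read individual values with a fallback. TableOptionsSketch and its methods are hypothetical stand-ins, not Flink's Configuration API. The key "partition.default-name" and its default "__DEFAULT_PARTITION__" are taken from the filesystem connector documentation and should be treated as assumptions here, as should the error raised for a missing path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Flink's Configuration; it only models string options.
final class TableOptionsSketch {
    // Keys and default value: assumptions based on the filesystem connector docs.
    static final String PATH = "path";
    static final String PARTITION_DEFAULT_NAME = "partition.default-name";
    static final String PARTITION_DEFAULT_NAME_FALLBACK = "__DEFAULT_PARTITION__";

    private final Map<String, String> values = new HashMap<>();

    void setString(String key, String value) {
        values.put(key, value);
    }

    // Read with a fallback, similar in spirit to conf.get(PARTITION_DEFAULT_NAME).
    String getOrDefault(String key, String fallback) {
        return values.getOrDefault(key, fallback);
    }

    // A sink cannot work without a target path, so a missing value is an error.
    String requirePath() {
        String path = values.get(PATH);
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("Option 'path' must not be empty.");
        }
        return path;
    }

    static TableOptionsSketch fromOptions(Map<String, String> tableOptions) {
        TableOptionsSketch conf = new TableOptionsSketch();
        tableOptions.forEach(conf::setString); // same copy idiom as in createTableSink
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> tableOptions = new HashMap<>();
        tableOptions.put("connector", "filesystem");
        tableOptions.put("path", "file:///tmp/orders");
        tableOptions.put("format", "csv");

        TableOptionsSketch conf = fromOptions(tableOptions);
        System.out.println(conf.requirePath());
        System.out.println(
                conf.getOrDefault(PARTITION_DEFAULT_NAME, PARTITION_DEFAULT_NAME_FALLBACK));
    }
}
```

Because the table above does not set a default partition name, the second lookup falls back to the built-in default.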
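FileSystemTableSink builds a DataStreamSink from the incoming DataStream.
consumeDataStream does the following:
- Builds a RowDataPartitionComputer, which separates the indexes and types of the partition fields from those of the non-partition fields.
- Creates an EmptyMetaStoreFactory, an empty (no-op) metastore implementation.
- Generates the file name prefix from a UUID.
- Builds the FileSystemFactory implementation.
- Takes a different branch depending on whether the input is bounded: bounded input is written with FileSystemOutputFormat, unbounded input with StreamingFileSink.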
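The first and third steps can be sketched in plain Java. This is a simplified illustration of the idea, not Flink's RowDataPartitionComputer: the class PartitionSplitSketch, its method names, and the use of Object[] as a row are all hypothetical, and the escaping of special characters in partition values is left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.IntStream;

// Hypothetical, simplified stand-in for the idea behind RowDataPartitionComputer.
final class PartitionSplitSketch {
    final String defaultPartName;
    final String[] fieldNames;
    final int[] partitionIndexes;    // row positions of the partition columns
    final int[] nonPartitionIndexes; // row positions of all other columns

    PartitionSplitSketch(String defaultPartName, String[] fieldNames, String[] partitionKeys) {
        this.defaultPartName = defaultPartName;
        this.fieldNames = fieldNames;
        List<String> names = Arrays.asList(fieldNames);
        List<String> keys = Arrays.asList(partitionKeys);
        for (String key : keys) {
            if (!names.contains(key)) {
                throw new IllegalArgumentException("Unknown partition key: " + key);
            }
        }
        // Partition indexes follow the order of the partition keys.
        this.partitionIndexes = keys.stream().mapToInt(names::indexOf).toArray();
        this.nonPartitionIndexes = IntStream.range(0, fieldNames.length)
                .filter(i -> !keys.contains(fieldNames[i]))
                .toArray();
    }

    // Partition column name -> value; null or empty values fall back to the default name.
    Map<String, String> partitionValues(Object[] row) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (int index : partitionIndexes) {
            Object value = row[index];
            String text = value == null ? "" : value.toString();
            spec.put(fieldNames[index], text.isEmpty() ? defaultPartName : text);
        }
        return spec;
    }

    // The columns that remain once the partition columns are taken out of the row.
    Object[] projectColumns(Object[] row) {
        return Arrays.stream(nonPartitionIndexes).mapToObj(i -> row[i]).toArray();
    }

    // Hive-style directory layout, e.g. "dt=.../hr=.../"; escaping is omitted in this sketch.
    static String partitionPath(Map<String, String> spec) {
        StringBuilder path = new StringBuilder();
        spec.forEach((k, v) -> path.append(k).append('=').append(v).append('/'));
        return path.toString();
    }

    // A random prefix keeps files from different jobs from colliding in one directory.
    static String filePrefix() {
        return "part-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        PartitionSplitSketch computer = new PartitionSplitSketch(
                "__DEFAULT_PARTITION__",
                new String[] {"id", "name", "dt", "hr"},
                new String[] {"dt", "hr"});
        Object[] row = {1, "a", "2020-09-01", null};

        System.out.println(partitionPath(computer.partitionValues(row)));
        System.out.println(Arrays.toString(computer.projectColumns(row)));
        System.out.println(filePrefix());
    }
}
```

For the sample row, the null hr value is replaced by the default partition name, so the row maps to the directory dt=2020-09-01/hr=__DEFAULT_PARTITION__/ and only id and name remain as data columns. The actual method in Flink reads as follows: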
public final DataStreamSink<RowData> consumeDataStream(DataStream<RowData> dataStream) {
    // Splits each row into partition values and the remaining columns.
    RowDataPartitionComputer computer = new RowDataPartitionComputer(
            defaultPartName,
            schema.getFieldNames(),
            schema.getFieldDataTypes(),
            partitionKeys.toArray(new String[0]));
    EmptyMetaStoreFactory metaStoreFactory = new EmptyMetaStoreFactory(path);
    // Random UUID as the file name prefix.
    OutputFileConfig outputFileConfig = OutputFileConfig.builder()
            .withPartPrefix("part-" + UUID.randomUUID().toString())
            .build();
    FileSystemFactory fsFactory = FileSystem::get;
    if (isBounded) {
        // Bounded input: write through FileSystemOutputFormat.
        FileSystemOutputFormat.Builder<RowData> builder = new FileSystemOutputFormat.Builder<>();
        builder.setPartitionComputer(computer);
        builder.setDynamicGrouped(dynamicGrouping);
        builder.setPartitionColumns(partitionKeys.toArray(new String[0]));
        builder.setFormatFactory(createOutputFormatFactory());
        builder.setMetaStoreFactory(metaStoreFactory);
        builder.setFileSystemFactory(fsFactory);
        builder.setOverwrite(overwrite);
        builder.setStaticPartitions(staticPartitions);
        builder.setTempPath(toStagingPath());
        builder.setOutputFileConfig(outputFileConfig);
        return dataStream.writeUsingOutputFormat(builder.build())
                .setParallelism(dataStream.getParallelism());
    } else {
        // Unbounded input: write through StreamingFileSink.
        Configuration conf = new Configuration();
        properties.forEach(conf::setString);
        Object writer = createWriter();
        TableBucketAssigner assigner = new TableBucketAssigner(computer);
        // Bulk (non-Encoder) writers roll on every checkpoint.
        TableRollingPolicy rollingPolicy = new TableRollingPolicy(
                !(writer instanceof Encoder),
                conf.get(SINK_ROLLING_POLICY_FILE_SIZE).getBytes(),
                conf.get(SINK_ROLLING_POLICY_ROLLOVER_INTERVAL).toMillis());
        BucketsBuilder<RowData, String, ? extends BucketsBuilder<RowData, ?, ?>> bucketsBuilder;
        if (writer instanceof Encoder) {
            // Row-encoded format
            //noinspection unchecked
            bucketsBuilder = StreamingFileSink.forRowFormat(
                    path, new ProjectionEncoder((Encoder<RowData>) writer, computer))
                    .withBucketAssigner(assigner)
                    .withOutputFileConfig(outputFileConfig)
                    .withRollingPolicy(rollingPolicy);
        } else {
            // Bulk format
            //noinspection unchecked
            bucketsBuilder = StreamingFileSink.forBulkFormat(
                    path, new ProjectionBulkFactory((BulkWriter.Factory<RowData>) writer, computer))
                    .withBucketAssigner(assigner)
                    .withOutputFileConfig(outputFileConfig)
                    .withRollingPolicy(rollingPolicy);
        }
        return createStreamingSink(
                conf,
                path,
                partitionKeys,
                tableIdentifier,
                overwrite,
                dataStream,
                bucketsBuilder,
                metaStoreFactory,
                fsFactory,
                conf.get(SINK_ROLLING_POLICY_CHECK_INTERVAL).toMillis(