In Apache Flink I have a stream of tuples. Let's assume a very simple Tuple1<String>. The tuple can hold an arbitrary value in its value field (e.g. "P1", "P2", etc.). The set of possible values is finite, but I don't know the full set in advance (so there could be a "P362"). I want to write each tuple to an output location that depends on the value inside the tuple. For example, I would like to end up with the following file structure:
> /output/P1
> /output/P2
In the documentation I only found ways to write to locations that I know beforehand (e.g. stream.writeCsv("/output/somewhere")), but no way to let the contents of the data decide where the data actually ends up.
I read about output splitting in the documentation, but that does not seem to provide a way to redirect the output to different destinations the way I would like to have it (or I don't understand how that would work).
Can this be done with the Flink API? If not, is there perhaps a third-party library that can do it, or would I have to build such a thing myself?
Update
Following Matthias's suggestion, I came up with a sifting sink function which determines the output path and then writes the tuple to the corresponding file after serializing it. I am putting it here for reference; maybe it is useful for someone else:
public class SiftingSinkFunction<IT> extends RichSinkFunction<IT> {

    private final OutputSelector<IT> outputSelector;
    private final MapFunction<IT, String> serializationFunction;
    private final String basePath;
    // one lazily created output format per destination
    private final Map<String, TextOutputFormat<String>> formats = new HashMap<>();

    /**
     * @param outputSelector        the selector which determines into which output(s) a record is written.
     * @param serializationFunction a function which serializes the record to a string.
     * @param basePath              the base path for writing the records. It will be appended with the output selector.
     */
    public SiftingSinkFunction(OutputSelector<IT> outputSelector, MapFunction<IT, String> serializationFunction, String basePath) {
        this.outputSelector = outputSelector;
        this.serializationFunction = serializationFunction;
        this.basePath = basePath;
    }

    @Override
    public void invoke(IT value) throws Exception {
        // find out where to write.
        Iterable<String> selection = outputSelector.select(value);
        for (String s : selection) {
            // ensure we have a format for this destination.
            TextOutputFormat<String> destination = ensureDestinationExists(s);
            // then serialize and write.
            destination.writeRecord(serializationFunction.map(value));
        }
    }

    private TextOutputFormat<String> ensureDestinationExists(String selection) throws IOException {
        // if we already know the destination, we just return the format.
        if (formats.containsKey(selection)) {
            return formats.get(selection);
        }

        // create a new output format and initialize it from the context.
        TextOutputFormat<String> format = new TextOutputFormat<>(new Path(basePath, selection));
        StreamingRuntimeContext context = (StreamingRuntimeContext) getRuntimeContext();
        format.configure(context.getTaskStubParameters());
        format.open(context.getIndexOfThisSubtask(), context.getNumberOfParallelSubtasks());

        // remember it for subsequent records.
        formats.put(selection, format);
        return format;
    }

    @Override
    public void close() throws IOException {
        Exception lastException = null;
        try {
            for (TextOutputFormat<String> format : formats.values()) {
                try {
                    format.close();
                } catch (Exception e) {
                    lastException = e;
                    format.tryCleanupOnError();
                }
            }
        } finally {
            formats.clear();
        }
        if (lastException != null) {
            throw new IOException("Close failed.", lastException);
        }
    }
}
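The sink above only compiles against Flink's runtime classes, so here is a minimal stand-alone sketch of just the core pattern it relies on: lazily creating one destination per routing key and caching it in a map. All names here (SiftingDemo, ensureDestination, the comma-prefix routing rule) are hypothetical, and in-memory StringBuilders stand in for TextOutputFormat instances:

```java
import java.util.HashMap;
import java.util.Map;

public class SiftingDemo {

    // one lazily created buffer per destination, mirroring the formats map above
    private final Map<String, StringBuilder> buffers = new HashMap<>();

    private StringBuilder ensureDestination(String selection) {
        // create the buffer on first use, analogous to ensureDestinationExists(...)
        return buffers.computeIfAbsent(selection, s -> new StringBuilder());
    }

    public void invoke(String value) {
        // route by the first comma-separated token, e.g. "P1,hello" goes to "P1"
        String key = value.split(",", 2)[0];
        ensureDestination(key).append(value).append('\n');
    }

    public Map<String, StringBuilder> buffers() {
        return buffers;
    }

    public static void main(String[] args) {
        SiftingDemo demo = new SiftingDemo();
        demo.invoke("P1,hello");
        demo.invoke("P2,world");
        demo.invoke("P1,again");
        System.out.println(demo.buffers().keySet());
    }
}
```

The real sink additionally has to open and close the underlying output formats explicitly, which is why it casts the runtime context and tracks the formats for cleanup in close().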