Adding a keyBy operator in Flink increases data latency
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(4);
// env.setBufferTimeout(0);
env.addSource(new SourceFunction<ABean>() {
    // Cleared by cancel() so that the emit loop can exit.
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<ABean> ctx) throws Exception {
        while (running) {
            // Stamp every record with its creation time.
            ctx.collect(ABean.builder().key("aaa").value(System.currentTimeMillis()).build());
            Thread.sleep(100);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}).keyBy(new KeySelector<ABean, String>() {
    @Override
    public String getKey(ABean value) throws Exception {
        return value.getKey();
    }
}).map(new RichMapFunction<ABean, ABean>() {
    @Override
    public ABean map(ABean value) throws Exception {
        // Time elapsed between record creation in the source and arrival here.
        log.warn("Latency (ms): " + (System.currentTimeMillis() - value.getValue()));
        return value;
    }
});
env.execute("keyBy latency");
The reason for this delay is that by adding that keyBy you are forcing a network shuffle along with serialization/deserialization. The reason the delay is so variable is because of the network buffering.
https://stackoverflow.com/questions/56819799/flink-keyby-adding-delay-how-can-i-reduce-this-latency
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(1);
// Flush the output buffers after every record to minimize latency.
env.setBufferTimeout(0);
/**
 * Sets the maximum time frequency (milliseconds) for the flushing of the output buffers. By
 * default the output buffers flush frequently to provide low latency and to aid smooth
 * developer experience. Setting the parameter can result in three logical modes:
 *
 * <ul>
 *   <li>A positive integer triggers flushing periodically by that integer
 *   <li>0 triggers flushing after every record thus minimizing latency
 *   <li>-1 triggers flushing only when the output buffer is full thus maximizing throughput
 * </ul>
 *
 * @param timeoutMillis The maximum time between two output flushes.
 */
public StreamExecutionEnvironment setBufferTimeout(long timeoutMillis) {
    if (timeoutMillis < -1) {
        throw new IllegalArgumentException("Timeout of buffer must be non-negative or -1");
    }
    this.bufferTimeout = timeoutMillis;
    return this;
}
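The three modes described in the Javadoc above can be summarized as a small lookup. This is only a plain-Java sketch with no Flink dependency; `BufferTimeoutModes`, `FlushMode`, and `classify` are hypothetical names made up for illustration and are not part of the Flink API. It reuses the same validity check as `setBufferTimeout`.

```java
// Hypothetical helper, not part of Flink: maps a buffer timeout value
// to the logical flushing mode named in the setBufferTimeout Javadoc.
final class BufferTimeoutModes {

    enum FlushMode {
        AFTER_EVERY_RECORD, // timeout == 0: lowest latency
        PERIODIC,           // timeout > 0: flush at most every `timeout` ms
        ONLY_WHEN_FULL      // timeout == -1: highest throughput
    }

    private BufferTimeoutModes() {
    }

    static FlushMode classify(long timeoutMillis) {
        if (timeoutMillis < -1) {
            // Same check and message as setBufferTimeout.
            throw new IllegalArgumentException("Timeout of buffer must be non-negative or -1");
        }
        if (timeoutMillis == -1) {
            return FlushMode.ONLY_WHEN_FULL;
        }
        if (timeoutMillis == 0) {
            return FlushMode.AFTER_EVERY_RECORD;
        }
        return FlushMode.PERIODIC;
    }
}
```

For example, `BufferTimeoutModes.classify(0)` yields `AFTER_EVERY_RECORD`, which is the mode selected by `env.setBufferTimeout(0)` in the snippet above.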
References:
https://www.modb.pro/db/116828
https://stackoverflow.com/questions/56819799/flink-keyby-adding-delay-how-can-i-reduce-this-latency
https://flink.apache.org/2019/06/05/flink-network-stack.html