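Background
Over the past few days I upgraded a Flink project from 1.12 to 1.14.2, the latest release at the time, and then found that none of the project's CEP events were being emitted any more. Even printing the stream to the console showed nothing at all.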
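Cause
In Flink releases after 1.12, PatternStream uses event time by default. If your job relies on processing time, you must configure that explicitly. In event time, CEP only evaluates events as watermarks advance, so a source that assigns no timestamps or watermarks (which appears to be the case in the sample below) produces no matches.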
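To make the cause concrete, here is a toy model in plain Java of how a pattern matcher might behave in the two time modes. It is a sketch only, not Flink's implementation: the class TimeModeSketch, the Event type, and the methods onEvent and onWatermark are hypothetical names invented for illustration. In the event-time branch, events wait in a buffer until a watermark passes their timestamp, so a stream that never sends a watermark never yields a match.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model (NOT Flink code): contrasts processing-time and event-time evaluation. */
public class TimeModeSketch {

    /** Hypothetical event: a timestamp in milliseconds plus an amount. */
    static final class Event {
        final long timestamp;
        final double amount;

        Event(long timestamp, double amount) {
            this.timestamp = timestamp;
            this.amount = amount;
        }
    }

    private final boolean eventTime;
    // Event-time mode keeps pending events sorted by timestamp.
    private final PriorityQueue<Event> buffer =
            new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.timestamp));
    private final List<Event> matches = new ArrayList<>();

    TimeModeSketch(boolean eventTime) {
        this.eventTime = eventTime;
    }

    /** Called for every incoming event. */
    void onEvent(Event e) {
        if (eventTime) {
            buffer.add(e);   // held back until a watermark covers its timestamp
        } else {
            evaluate(e);     // processing time: evaluated as soon as it arrives
        }
    }

    /** Called when a watermark arrives; releases every buffered event it covers. */
    void onWatermark(long watermark) {
        while (!buffer.isEmpty() && buffer.peek().timestamp <= watermark) {
            evaluate(buffer.poll());
        }
    }

    /** Same condition as the sample in this post: amount above 300. */
    private void evaluate(Event e) {
        if (e.amount > 300) {
            matches.add(e);
        }
    }

    List<Event> matches() {
        return matches;
    }

    public static void main(String[] args) {
        TimeModeSketch processing = new TimeModeSketch(false);
        TimeModeSketch event = new TimeModeSketch(true);
        for (TimeModeSketch m : new TimeModeSketch[] {processing, event}) {
            m.onEvent(new Event(1_000L, 500.0));
            m.onEvent(new Event(2_000L, 100.0));
        }
        System.out.println(processing.matches().size()); // 1
        System.out.println(event.matches().size());      // 0, no watermark seen yet
        event.onWatermark(2_000L);
        System.out.println(event.matches().size());      // 1
    }
}
```

Running main prints 1, 0, 1: the processing-time instance matches on arrival, while the event-time instance stays silent until a watermark shows up.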
Solution
Here is the sample code. The version below produces no output at all.
package spendreport;

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.nfa.aftermatch.AfterMatchSkipStrategy;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.walkthrough.common.entity.Transaction;
import org.apache.flink.walkthrough.common.source.TransactionSource;

/**
 * @author jixiang.ma@mail.nwpu.edu.cn
 * @date 2022/1/7 19:00
 * @copyright © 2021 ruanjian.nwpu all rights reserved.
 */
public class CEPtest {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment().setParallelism(1);
        DataStream<Transaction> dataStream = env.addSource(new TransactionSource());

        // Match one or more transactions with an amount above 300.
        Pattern<Transaction, ?> pattern = Pattern.<Transaction>begin("begin", AfterMatchSkipStrategy.noSkip()).where(
                new SimpleCondition<Transaction>() {
                    @Override
                    public boolean filter(Transaction transaction) throws Exception {
                        return transaction.getAmount() > 300;
                    }
                }).timesOrMore(1);

        // No time mode is chosen here, so the PatternStream falls back to its default.
        PatternStream<Transaction> patternStream = CEP.pattern(dataStream, pattern);
        DataStream<Transaction> d = patternStream.select(
                        (PatternSelectFunction<Transaction, Transaction>) map -> map.get("begin").get(0))
                .name("bbbb");
        d.print();
        env.execute();
    }
}
The following version, by contrast, works as expected:
package spendreport;

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.nfa.aftermatch.AfterMatchSkipStrategy;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.walkthrough.common.entity.Transaction;
import org.apache.flink.walkthrough.common.source.TransactionSource;

/**
 * @author jixiang.ma@mail.nwpu.edu.cn
 * @date 2022/1/7 19:00
 * @copyright © 2021 ruanjian.nwpu all rights reserved.
 */
public class CEPtest {
    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment().setParallelism(1);
        DataStream<Transaction> dataStream = env.addSource(new TransactionSource());

        // Match one or more transactions with an amount above 300.
        Pattern<Transaction, ?> pattern = Pattern.<Transaction>begin("begin", AfterMatchSkipStrategy.noSkip()).where(
                new SimpleCondition<Transaction>() {
                    @Override
                    public boolean filter(Transaction transaction) throws Exception {
                        return transaction.getAmount() > 300;
                    }
                }).timesOrMore(1);

        // Explicitly evaluate the pattern in processing time.
        PatternStream<Transaction> patternStream = CEP.pattern(dataStream, pattern).inProcessingTime();
        DataStream<Transaction> d = patternStream.select(
                        (PatternSelectFunction<Transaction, Transaction>) map -> map.get("begin").get(0))
                .name("bbbb");
        d.print();
        env.execute();
    }
}
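All you need to do is append inProcessingTime() or inEventTime() to the CEP.pattern(...) call. Note that inEventTime() only yields output when the stream has timestamps and watermarks assigned.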
PatternStream<Transaction> patternStream = CEP.pattern(dataStream, pattern).inProcessingTime();
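Summary
If the code runs, don't touch it unless you really have to!
Note: the data source used in the example ships with Flink and requires the following Maven dependency in your pom: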
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-walkthrough-common_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>