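The earlier post, Flink Study Notes (11): Using Flink KeyedState, showed how to use state to compute a running sum. Streaming jobs usually run for a long time, though, so the amount of stored state keeps growing. How do we deal with that?
1. Flink State Time-To-Live (TTL)
Flink provides the StateTtlConfig mechanism for this. Let's first look at the policy types it offers: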
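- TTL refresh policy (default: OnCreateAndWrite)
Policy | Description |
---|---|
StateTtlConfig.UpdateType.Disabled | TTL is disabled; the state never expires |
StateTtlConfig.UpdateType.OnCreateAndWrite | The state's last-access timestamp is refreshed on creation and on every write |
StateTtlConfig.UpdateType.OnReadAndWrite | The state's last-access timestamp is refreshed on every read as well as on creation and every write |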
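To make the difference between the two active refresh policies concrete, here is a minimal sketch in plain Java. It is a toy model, not Flink's implementation: the class `TtlRefreshModel`, its `write`/`read` methods, and the explicit `now` argument are hypothetical names introduced for illustration. Only the `UpdateType` constant names are taken from Flink.

```java
/** Toy model of the TTL refresh policies; NOT Flink's implementation. */
final class TtlRefreshModel {
    enum UpdateType { OnCreateAndWrite, OnReadAndWrite }

    private final long ttlMillis;
    private final UpdateType updateType;
    private Integer value;      // null means "no state stored"
    private long lastAccess;    // timestamp the TTL is measured from

    TtlRefreshModel(long ttlMillis, UpdateType updateType) {
        this.ttlMillis = ttlMillis;
        this.updateType = updateType;
    }

    /** Creating or writing the value refreshes the timestamp under both policies. */
    void write(int newValue, long now) {
        value = newValue;
        lastAccess = now;
    }

    /** Reading refreshes the timestamp only under OnReadAndWrite. */
    Integer read(long now) {
        if (value == null || now - lastAccess >= ttlMillis) {
            value = null;       // expired: behave as if nothing was stored
            return null;
        }
        if (updateType == UpdateType.OnReadAndWrite) {
            lastAccess = now;
        }
        return value;
    }

    public static void main(String[] args) {
        for (UpdateType type : UpdateType.values()) {
            TtlRefreshModel state = new TtlRefreshModel(2000, type);
            state.write(1, 0);      // written at t = 0 ms
            state.read(1500);       // read at t = 1500 ms
            System.out.println(type + " -> " + state.read(3000));
        }
        // prints:
        // OnCreateAndWrite -> null
        // OnReadAndWrite -> 1
    }
}
```

With a 2-second TTL, the value written at t = 0 is gone by t = 3 s under OnCreateAndWrite. Under OnReadAndWrite, the read at t = 1.5 s pushes the expiry out to t = 3.5 s, so the same read still sees the value.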
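- State visibility (default: NeverReturnExpired)
Policy | Description |
---|---|
StateTtlConfig.StateVisibility.NeverReturnExpired | Expired state is never returned |
StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp | Expired state may still be returned if it has not been cleaned up yet |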
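The two visibility settings can be illustrated with another small plain-Java sketch. Again this is a simplified model and not Flink's code: `TtlVisibilityModel` and its methods are hypothetical. In particular, when Flink physically removes an expired entry depends on the state backend and the cleanup strategy; the model reduces that to an explicit `cleanUp` call.

```java
/** Toy model of TTL state visibility; NOT Flink's implementation. */
final class TtlVisibilityModel {
    enum StateVisibility { NeverReturnExpired, ReturnExpiredIfNotCleanedUp }

    private final long ttlMillis;
    private final StateVisibility visibility;
    private Integer value;      // null means "nothing physically stored"
    private long lastAccess;

    TtlVisibilityModel(long ttlMillis, StateVisibility visibility) {
        this.ttlMillis = ttlMillis;
        this.visibility = visibility;
    }

    void write(int newValue, long now) {
        value = newValue;
        lastAccess = now;
    }

    private boolean expired(long now) {
        return now - lastAccess >= ttlMillis;
    }

    /** Stand-in for whatever cleanup eventually removes the expired entry. */
    void cleanUp(long now) {
        if (value != null && expired(now)) {
            value = null;
        }
    }

    Integer read(long now) {
        if (value == null) {
            return null;
        }
        if (expired(now) && visibility == StateVisibility.NeverReturnExpired) {
            return null;        // expired values are hidden from the caller
        }
        return value;           // may be a stale value that is still stored
    }

    public static void main(String[] args) {
        for (StateVisibility v : StateVisibility.values()) {
            TtlVisibilityModel state = new TtlVisibilityModel(2000, v);
            state.write(5, 0);
            Integer beforeCleanup = state.read(3000);
            state.cleanUp(3000);
            Integer afterCleanup = state.read(3000);
            System.out.println(v + " -> " + beforeCleanup + ", " + afterCleanup);
        }
        // prints:
        // NeverReturnExpired -> null, null
        // ReturnExpiredIfNotCleanedUp -> 5, null
    }
}
```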
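The Flink documentation covers this in more detail, including the supported state types, the cleanup strategies, and related examples.
2. Example
We reuse the example from the article mentioned above.
The StateTtlConfig is attached to the state used on the keyed stream, configured as shown below. Once a key's state has gone more than two seconds without being written, it expires and the count for that key starts over.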
StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(2))
    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
    .build();
ValueStateDescriptor<Tuple2<String, Integer>> stateDescriptor = new ValueStateDescriptor<>("key-fruit", Types.TUPLE(Types.STRING, Types.INT));
stateDescriptor.enableTimeToLive(ttlConfig);
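What this configuration means for a single key can be traced by hand. The sketch below is plain Java with no Flink dependency; `TtlCountTimeline` and `countsWithTtl` are hypothetical names, and the event timestamps are made up for illustration. It assumes the OnCreateAndWrite and NeverReturnExpired settings, with every event both reading and rewriting the count, as the map function in this example does.

```java
import java.util.ArrayList;
import java.util.List;

/** Traces the per-key count under a write-refreshed TTL; a toy model, not Flink code. */
final class TtlCountTimeline {

    /** Returns the count emitted for each event of one key, given event times in ms. */
    static List<Integer> countsWithTtl(long ttlMillis, long... eventTimes) {
        List<Integer> emitted = new ArrayList<>();
        Integer count = null;       // null means "no state for this key"
        long lastWrite = 0L;
        for (long now : eventTimes) {
            // NeverReturnExpired: an expired value reads back as null
            if (count != null && now - lastWrite >= ttlMillis) {
                count = null;
            }
            int current = (count == null) ? 0 : count;
            count = current + 1;    // same logic as the map function: old count + 1
            lastWrite = now;        // OnCreateAndWrite: the write refreshes the TTL
            emitted.add(count);
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(countsWithTtl(2000, 0, 1000, 1500, 4000, 4500));
        // prints [1, 2, 3, 1, 2]
    }
}
```

The gap between the events at 1.5 s and 4 s is longer than the 2-second TTL, so the fourth event finds no state and the count restarts at 1.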
Expired state can also be cleaned up incrementally with cleanupIncrementally, for example:
StateTtlConfig ttlConfig = StateTtlConfig
    .newBuilder(Time.seconds(2))
    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
    .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
    .cleanupIncrementally(10, true)
    .build();
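In cleanupIncrementally(10, true), the first argument is the number of state entries checked each time cleanup is triggered, and the second asks for cleanup to be triggered for every processed record in addition to every state access. The idea of checking only a few entries per step can be sketched in plain Java as follows. This is a rough model under stated assumptions and not how Flink implements it: `IncrementalCleanupModel`, `cleanupStep`, and the snapshot-based cursor are hypothetical, and the step is called explicitly instead of being hooked into state access.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of incremental TTL cleanup; NOT Flink's implementation. */
final class IncrementalCleanupModel {
    private final long ttlMillis;
    private final int cleanupSize;  // entries checked per cleanup step
    private final Map<String, Long> lastWrite = new LinkedHashMap<>();
    private Iterator<String> cursor = Collections.emptyIterator();

    IncrementalCleanupModel(long ttlMillis, int cleanupSize) {
        this.ttlMillis = ttlMillis;
        this.cleanupSize = cleanupSize;
    }

    void write(String key, long now) {
        lastWrite.put(key, now);
    }

    int size() {
        return lastWrite.size();
    }

    /** Checks at most cleanupSize entries and removes the expired ones. */
    int cleanupStep(long now) {
        if (!cursor.hasNext()) {
            // start a new pass over a snapshot of the current keys
            cursor = new ArrayList<>(lastWrite.keySet()).iterator();
        }
        int removed = 0;
        for (int checked = 0; checked < cleanupSize && cursor.hasNext(); checked++) {
            String key = cursor.next();
            Long ts = lastWrite.get(key);
            if (ts != null && now - ts >= ttlMillis) {
                lastWrite.remove(key);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        IncrementalCleanupModel state = new IncrementalCleanupModel(2000, 2);
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            state.write(key, 0);
        }
        // all five entries are expired at t = 5000 ms, but each step checks only two
        for (int step = 1; step <= 3; step++) {
            state.cleanupStep(5000);
            System.out.println("after step " + step + ": " + state.size() + " entries");
        }
        // prints:
        // after step 1: 3 entries
        // after step 2: 1 entries
        // after step 3: 0 entries
    }
}
```

The trade-off is that expired entries linger until the cursor reaches them, in exchange for a small, bounded amount of cleanup work per step.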
Here is the complete code:
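import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

import java.util.Random;
import java.util.concurrent.TimeUnit;

public class TestStateTtlConfig {
    private static final String[] FRUIT = {"苹果", "梨", "西瓜", "葡萄", "火龙果", "橘子", "桃子", "香蕉"};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(1);
        // Source: emits one random (fruit, 1) record per second until cancelled
        DataStream<Tuple2<String, Integer>> fruit = env.addSource(new SourceFunction<Tuple2<String, Integer>>() {
            private volatile boolean isRunning = true;
            private final Random random = new Random();

            @Override
            public void run(SourceContext<Tuple2<String, Integer>> ctx) throws Exception {
                while (isRunning) {
                    TimeUnit.SECONDS.sleep(1);
                    ctx.collect(Tuple2.of(FRUIT[random.nextInt(FRUIT.length)], 1));
                }
            }

            @Override
            public void cancel() {
                isRunning = false;
            }
        });
        // Key by fruit name (a key selector replaces the deprecated keyBy(0))
        fruit.keyBy(t -> t.f0).map(new RichMapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>>() {
            private ValueState<Tuple2<String, Integer>> valueState;

            @Override
            public void open(Configuration parameters) throws Exception {
                // State expires 2 seconds after its last write
                StateTtlConfig ttlConfig = StateTtlConfig
                        .newBuilder(Time.seconds(2))
                        .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                        .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                        .cleanupIncrementally(10, true)
                        .build();
                ValueStateDescriptor<Tuple2<String, Integer>> stateDescriptor =
                        new ValueStateDescriptor<>("key-fruit", Types.TUPLE(Types.STRING, Types.INT));
                stateDescriptor.enableTimeToLive(ttlConfig);
                valueState = getRuntimeContext().getState(stateDescriptor);
            }

            @Override
            public Tuple2<String, Integer> map(Tuple2<String, Integer> tuple2) throws Exception {
                Tuple2<String, Integer> currentState = valueState.value();
                // Initialize the state on first access, or after the previous value has expired
                if (null == currentState) {
                    currentState = new Tuple2<>(tuple2.f0, 0);
                }
                Tuple2<String, Integer> newState = new Tuple2<>(currentState.f0, currentState.f1 + tuple2.f1);
                // Update the ValueState value
                valueState.update(newState);
                return Tuple2.of(newState.f0, newState.f1);
            }
        }).print();
        env.execute("fruit");
    }
}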
Execution result