public interface CheckpointedFunction {
    void snapshotState(FunctionSnapshotContext context) throws Exception;
    void initializeState(FunctionInitializationContext context) throws Exception;
}
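CheckpointedFunction is the core interface for stateful transformation functions. Lighter-weight interfaces exist, for example ListCheckpointed, but CheckpointedFunction is more flexible,
in particular because it gives access to both keyed state and operator state.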
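1 initializeState is called when a parallel instance of the transformation function is created. Through this method the function gets access to the OperatorStateStore and the KeyedStateStore.
OperatorStateStore and KeyedStateStore hand out the data structures Flink uses to access state, for example ValueState and ListState.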
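2 snapshotState is called whenever a checkpoint takes a snapshot of the transformation function's state. Its FunctionSnapshotContext parameter exposes the checkpoint's metadata,
for example context.getCheckpointTimestamp() and context.getCheckpointId().
The method can also serve as a hook to flush, commit, or synchronize with external systems.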
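The hook use of snapshotState can be modeled without Flink. What follows is a minimal plain-Java sketch, not Flink API: BufferingWriter, write, onSnapshot, and the in-memory "external" list are hypothetical names, standing in for a function that buffers records and flushes them to an external system each time a checkpoint snapshot is taken.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferingWriter {
    // Stand-in for an external system (hypothetical; a real job would hold a client or connection).
    private final List<String> external = new ArrayList<>();
    // Records accepted since the last snapshot.
    private final List<String> buffer = new ArrayList<>();
    private long lastCheckpointId = -1L;

    // Called once per record: only buffers, nothing is written out yet.
    public void write(String record) {
        buffer.add(record);
    }

    // Plays the role of snapshotState: flush everything buffered so far,
    // then remember which checkpoint triggered the flush.
    public void onSnapshot(long checkpointId) {
        external.addAll(buffer);
        buffer.clear();
        lastCheckpointId = checkpointId;
    }

    public List<String> flushed() { return new ArrayList<>(external); }
    public int pending() { return buffer.size(); }
    public long lastCheckpointId() { return lastCheckpointId; }

    public static void main(String[] args) {
        BufferingWriter w = new BufferingWriter();
        w.write("a");
        w.write("b");
        w.onSnapshot(1L);   // "a" and "b" leave the buffer here
        w.write("c");       // stays buffered until the next snapshot
        System.out.println(w.flushed() + " pending=" + w.pending());
    }
}
```

In a real CheckpointedFunction the checkpoint id would come from context.getCheckpointId(), and the flush would go to the actual external client rather than a list.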
Example from the official documentation:
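import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.ReducingState;
import org.apache.flink.api.common.state.ReducingStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;

public class MyCheckpointFunction<T> implements MapFunction<T, T>, CheckpointedFunction {
    private ReducingState<Long> countperKey;       // keyed state: one count per key
    private ListState<Long> countperPartition;     // operator state: one count per parallel instance
    private long localCount;                       // in-memory copy of this instance's count

    @Override
    public T map(T value) throws Exception {
        countperKey.add(1L);
        localCount++;
        return value;
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // Keyed state is always up to date; only the operator state needs refreshing.
        countperPartition.clear();
        countperPartition.add(localCount);
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // AddFunction is not defined in this note; it is assumed to be a
        // ReduceFunction<Long> that returns the sum of its two arguments.
        countperKey = context.getKeyedStateStore().getReducingState(
                new ReducingStateDescriptor<>("countperKey", new AddFunction<>(), Long.class));
        countperPartition = context.getOperatorStateStore().getListState(
                new ListStateDescriptor<>("countperPartition", Long.class));
        // Rebuild the local counter from whatever entries were restored.
        for (long l : countperPartition.get()) {
            localCount += l;
        }
    }
}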
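The loop at the end of initializeState sums every entry of countperPartition because list-style operator state can be redistributed when a job is restored with a different parallelism, so one parallel instance may receive zero, one, or several entries. A minimal plain-Java sketch of that idea, not Flink API: RescaleSketch, redistribute, restoreLocalCount, and the round-robin assignment are assumptions for illustration, and Flink's actual assignment of entries to instances may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RescaleSketch {
    // Spread the snapshotted entries over the new parallel instances.
    // Round-robin is an assumption made for this sketch.
    public static List<List<Long>> redistribute(List<Long> entries, int newParallelism) {
        List<List<Long>> perInstance = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            perInstance.add(new ArrayList<>());
        }
        for (int i = 0; i < entries.size(); i++) {
            perInstance.get(i % newParallelism).add(entries.get(i));
        }
        return perInstance;
    }

    // Mirrors the restore loop: add up whatever entries this instance received.
    public static long restoreLocalCount(List<Long> received) {
        long localCount = 0L;
        for (long l : received) {
            localCount += l;
        }
        return localCount;
    }

    public static void main(String[] args) {
        // Three old instances each snapshotted one local count.
        List<Long> snapshot = Arrays.asList(5L, 7L, 3L);
        // Restoring with parallelism 2 means one instance gets two entries.
        for (List<Long> received : redistribute(snapshot, 2)) {
            System.out.println(received + " -> " + restoreLocalCount(received));
        }
    }
}
```

However the entries are split, the sum over all instances stays the same (15 in this run), which is why summing on restore keeps the overall count correct.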
P.S. A RichMapFunction on its own can also make an operator function stateful (keyed state only, so it has to run on a keyed stream):
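import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;

public class MyCountPerKeyFunction<T> extends RichMapFunction<T, T> {
    private ValueState<Long> count;

    @Override
    public void open(Configuration config) throws Exception {
        count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public T map(T value) throws Exception {
        Long current = count.value();   // null the first time a key is seen
        count.update(current == null ? 1L : current + 1);
        return value;                   // pass the record through unchanged
    }
}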