1. Data initialization
org.apache.flink.streaming.util.AbstractStreamOperatorTestHarness#open
public void open() throws Exception {
    if (!this.initializeCalled) { // initializeCalled tracks whether state has already been initialized
        this.initializeEmptyState();
    }
    this.operator.open();
}
On the first call, initializeEmptyState runs, which in turn calls initializeState:
public void initializeState(OperatorSubtaskState jmOperatorStateHandles, OperatorSubtaskState tmOperatorStateHandles) throws Exception {
    // Guard against initializing state twice
    Preconditions.checkState(!this.initializeCalled, "TestHarness has already been initialized. Have you opened this harness before initializing it?");
    if (!this.setupCalled) {
        this.setup(); // 1. create streamTaskStateInitializer 2. call operator.setup 3. set setupCalled = true
    }
    if (jmOperatorStateHandles != null) {
        TaskStateSnapshot jmTaskStateSnapshot = new TaskStateSnapshot();
        jmTaskStateSnapshot.putSubtaskStateByOperatorID(this.operator.getOperatorID(), jmOperatorStateHandles);
        this.taskStateManager.setReportedCheckpointId(0L);
        this.taskStateManager.setJobManagerTaskStateSnapshotsByCheckpointId(Collections.singletonMap(0L, jmTaskStateSnapshot));
        if (tmOperatorStateHandles != null) {
            TaskStateSnapshot tmTaskStateSnapshot = new TaskStateSnapshot();
            tmTaskStateSnapshot.putSubtaskStateByOperatorID(this.operator.getOperatorID(), tmOperatorStateHandles);
            this.taskStateManager.setTaskManagerTaskStateSnapshotsByCheckpointId(Collections.singletonMap(0L, tmTaskStateSnapshot));
        }
    }
    this.operator.initializeState();
    this.initializeCalled = true;
}
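The interplay of the two flags can be sketched in plain Java. This is a simplified stand-in, not Flink's actual class: MiniHarness and the recorded step names are made up for illustration, and the real harness does much more in each step.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the harness lifecycle; not Flink's actual class.
class MiniHarness {
    private boolean setupCalled;
    private boolean initializeCalled;
    // Records the order in which the lifecycle steps run.
    final List<String> calls = new ArrayList<>();

    void setup() {
        calls.add("operator.setup");
        setupCalled = true;
    }

    void initializeState() {
        // Same idea as the Preconditions.checkState guard in the harness.
        if (initializeCalled) {
            throw new IllegalStateException("harness has already been initialized");
        }
        if (!setupCalled) {
            setup(); // setup runs lazily, at most once
        }
        calls.add("operator.initializeState");
        initializeCalled = true;
    }

    void open() {
        if (!initializeCalled) {
            initializeState(); // plays the role of initializeEmptyState()
        }
        calls.add("operator.open");
    }

    public static void main(String[] args) {
        MiniHarness harness = new MiniHarness();
        harness.open();
        // prints [operator.setup, operator.initializeState, operator.open]
        System.out.println(harness.calls);
    }
}
```

Calling open() on a fresh harness therefore triggers setup and state initialization implicitly, while calling initializeState() after open() trips the guard.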
1. Initialize streamTaskStateInitializer
This simply creates a StreamTaskStateInitializerImpl object that holds references to the environment, stateBackend and processingTimeService.
2. Call operator.setup
If the test case creates a KeyedOneInputStreamOperatorTestHarness, the method that actually runs is org.apache.flink.streaming.api.operators.AbstractStreamOperator#setup.
2.1 Populate the runtime fields of the KeyedProcessOperator object
// ---------------- runtime fields ------------------
/** The task that contains this operator (and other operators in the same chain). */
private transient StreamTask<?, ?> container;
protected transient StreamConfig config;
protected transient String distributionIdentifier;
protected transient Output<StreamRecord<OUT>> output;
/** The runtime context for UDFs. */
private transient StreamingRuntimeContext runtimeContext;
2.2 Inject the RuntimeContext into the KeyedProcessFunction (the user function held as a field of KeyedProcessOperator)
2.3 Set setupCalled = true
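Steps 2.1 and 2.2 can be pictured with a small sketch. All types here (RichFunction, MyFunction, MiniOperator) are hypothetical stand-ins that only mirror the roles described above, and the runtime context is reduced to a plain String.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical, simplified types; they only mirror the roles named above.
class SetupSketch {
    /** Stand-in for a user function that can receive a runtime context. */
    interface RichFunction {
        void setRuntimeContext(String context);
    }

    static class MyFunction implements RichFunction {
        String runtimeContext;

        @Override
        public void setRuntimeContext(String context) {
            this.runtimeContext = context;
        }
    }

    /** Stand-in for the operator that wraps the user function. */
    static class MiniOperator {
        private final RichFunction userFunction;
        String config;
        Queue<Object> output;
        String runtimeContext;

        MiniOperator(RichFunction userFunction) {
            this.userFunction = userFunction;
        }

        void setup(String config, Queue<Object> output) {
            // 2.1: populate the operator's own runtime fields
            this.config = config;
            this.output = output;
            this.runtimeContext = "runtime-context-for-" + config;
            // 2.2: hand the same context to the wrapped user function
            userFunction.setRuntimeContext(this.runtimeContext);
        }
    }

    public static void main(String[] args) {
        MyFunction function = new MyFunction();
        MiniOperator operator = new MiniOperator(function);
        operator.setup("test-task", new ArrayDeque<>());
        System.out.println(function.runtimeContext); // prints runtime-context-for-test-task
    }
}
```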
3. Call operator.initializeState()
This runs org.apache.flink.streaming.api.operators.AbstractStreamOperator#initializeState().
3.1 Initialize operatorStateBackend, keyedStateBackend and keyedStateStore
3.2 org.apache.flink.streaming.util.functions.StreamingFunctionUtils#tryRestoreFunction decides whether the function needs to restore data from state:
private static boolean tryRestoreFunction(
        StateInitializationContext context,
        Function userFunction) throws Exception {
    if (userFunction instanceof CheckpointedFunction) {
        ((CheckpointedFunction) userFunction).initializeState(context);
        return true;
    }
    if (context.isRestored() && userFunction instanceof ListCheckpointed) {
        @SuppressWarnings("unchecked")
        ListCheckpointed<Serializable> listCheckpointedFun = (ListCheckpointed<Serializable>) userFunction;
        ListState<Serializable> listState = context.getOperatorStateStore().
                getSerializableListState(DefaultOperatorStateBackend.DEFAULT_OPERATOR_STATE_NAME);
        List<Serializable> list = new ArrayList<>();
        for (Serializable serializable : listState.get()) {
            list.add(serializable);
        }
        try {
            listCheckpointedFun.restoreState(list);
        } catch (Exception e) {
            throw new Exception("Failed to restore state to function: " + e.getMessage(), e);
        }
        return true;
    }
    return false;
}
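Because the harness was initialized with empty state, context.isRestored() is false here. Unless the function implements CheckpointedFunction, tryRestoreFunction therefore returns false and the restore-from-state logic is not executed.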
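The branching in tryRestoreFunction can be reproduced in a stripped-down form. The interfaces below are hypothetical miniatures that borrow the Flink names for readability. The restored flag and the saved list replace StateInitializationContext, which is an assumption made to keep the sketch free of dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Miniature interfaces with borrowed names; not the Flink types themselves.
class RestoreSketch {
    interface Function {}

    interface CheckpointedFunction extends Function {
        void initializeState(boolean restored);
    }

    interface ListCheckpointed extends Function {
        void restoreState(List<String> state);
    }

    static boolean tryRestore(boolean restored, List<String> savedState, Function f) {
        if (f instanceof CheckpointedFunction) {
            // Always invoked, whether or not there is state to restore.
            ((CheckpointedFunction) f).initializeState(restored);
            return true;
        }
        if (restored && f instanceof ListCheckpointed) {
            // Only invoked when the context reports restored state.
            ((ListCheckpointed) f).restoreState(new ArrayList<>(savedState));
            return true;
        }
        return false;
    }

    static class PlainFn implements Function {}

    static class ListFn implements ListCheckpointed {
        List<String> restoredState;

        @Override
        public void restoreState(List<String> state) {
            this.restoredState = state;
        }
    }

    static class CheckpointedFn implements CheckpointedFunction {
        boolean initialized;

        @Override
        public void initializeState(boolean restored) {
            this.initialized = true;
        }
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        System.out.println(tryRestore(false, none, new PlainFn()));               // false
        System.out.println(tryRestore(false, none, new ListFn()));                // false
        System.out.println(tryRestore(true, Arrays.asList("a"), new ListFn()));   // true
        System.out.println(tryRestore(false, none, new CheckpointedFn()));        // true
    }
}
```

The second and fourth calls show the asymmetry: with no restored state, a ListCheckpointed function is skipped, while a CheckpointedFunction is still initialized.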
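2. Run the test
org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness#processElement(IN, long)
The test passes in an element together with its timestamp. This overload wraps the two in a StreamRecord and delegates to: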
public void processElement(StreamRecord<IN> element) throws Exception {
    this.operator.setKeyContextElement1(element);
    this.oneInputOperator.processElement(element);
}
2.1 Set the current key on the keyed state backend (setKeyContextElement1)
2.2 Call oneInputOperator.processElement, which for a KeyedProcessOperator is:
public void processElement(StreamRecord<IN> element) throws Exception {
    collector.setTimestamp(element);
    context.element = element;
    userFunction.processElement(element.getValue(), context, collector);
    context.element = null;
}
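userFunction is the function under test that we passed in when constructing the KeyedProcessOperator.
2.3 While the function runs, its output must be captured so that it can be verified
The operator's output was already replaced with a MockOutput during setup (org.apache.flink.streaming.util.AbstractStreamOperatorTestHarness#setup), so whatever the function emits from its processElement method is stored in org.apache.flink.streaming.util.AbstractStreamOperatorTestHarness#outputList.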
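Steps 2.1 to 2.3 can be sketched together in plain Java. Everything below is an illustrative assumption rather than Flink code: Record stands in for StreamRecord, a HashMap stands in for keyed state, and the "user function" is a hard-coded per-key counter.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Function;

// Illustrative model of keyed processing with a captured output; not Flink code.
class KeyedHarnessSketch {
    /** Value plus timestamp, standing in for StreamRecord. */
    static final class Record {
        final String value;
        final long timestamp;

        Record(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return value + "@" + timestamp;
        }
    }

    // Plays the role of outputList, which the mock output writes into.
    final Queue<Object> outputList = new ArrayDeque<>();
    // Plays the role of keyed state: one counter per key.
    private final Map<String, Integer> countsByKey = new HashMap<>();
    private final Function<String, String> keySelector;
    private String currentKey;

    KeyedHarnessSketch(Function<String, String> keySelector) {
        this.keySelector = keySelector;
    }

    void processElement(String value, long timestamp) {
        // The (value, timestamp) overload just wraps both in a record.
        processElement(new Record(value, timestamp));
    }

    void processElement(Record element) {
        // 2.1: select the key of this element before any state access
        currentKey = keySelector.apply(element.value);
        // 2.2: the "user function": count elements per key
        int count = countsByKey.merge(currentKey, 1, Integer::sum);
        // 2.3: emitted records land in outputList, keeping the input timestamp
        outputList.add(new Record(currentKey + "=" + count, element.timestamp));
    }

    public static void main(String[] args) {
        KeyedHarnessSketch harness = new KeyedHarnessSketch(v -> v.substring(0, 1));
        harness.processElement("a1", 1L);
        harness.processElement("b1", 2L);
        harness.processElement("a2", 3L);
        System.out.println(harness.outputList); // prints [a=1@1, b=1@2, a=2@3]
    }
}
```

Because the key is set before the function runs, the two "a" elements share one counter while "b" gets its own.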
3. Verify the result
org.apache.flink.streaming.util.TestHarnessUtil#assertOutputEquals
Under the hood this relies on java.util.Arrays#deepEquals:
public static boolean deepEquals(Object[] a1, Object[] a2) {
    if (a1 == a2)
        return true;
    if (a1 == null || a2==null)
        return false;
    int length = a1.length;
    if (a2.length != length)
        return false;
    for (int i = 0; i < length; i++) {
        Object e1 = a1[i];
        Object e2 = a2[i];
        if (e1 == e2)
            continue;
        if (e1 == null)
            return false;
        // Figure out whether the two elements are equal
        boolean eq = deepEquals0(e1, e2);
        if (!eq)
            return false;
    }
    return true;
}
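What this comparison means for a test can be seen with a small example that uses only the JDK. Record is a hypothetical stand-in for StreamRecord, and sameOutput is an illustrative helper, not the TestHarnessUtil method itself.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative comparison of expected and actual output queues.
class OutputCompareSketch {
    /** Stand-in for StreamRecord; equality covers value and timestamp. */
    static final class Record {
        final String value;
        final long timestamp;

        Record(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Record)) {
                return false;
            }
            Record other = (Record) o;
            return timestamp == other.timestamp && Objects.equals(value, other.value);
        }

        @Override
        public int hashCode() {
            return Objects.hash(value, timestamp);
        }
    }

    static boolean sameOutput(Queue<Object> expected, Queue<Object> actual) {
        // Position-by-position comparison; non-array elements are compared with equals().
        return Arrays.deepEquals(expected.toArray(), actual.toArray());
    }

    public static void main(String[] args) {
        Queue<Object> expected = new ConcurrentLinkedQueue<>();
        expected.add(new Record("a", 1L));
        expected.add(new Record("b", 2L));

        Queue<Object> actual = new ConcurrentLinkedQueue<>();
        actual.add(new Record("a", 1L));
        actual.add(new Record("b", 2L));

        System.out.println(sameOutput(expected, actual)); // prints true
    }
}
```

Two consequences follow: the comparison is order-sensitive, and output elements need a meaningful equals implementation, otherwise equal-looking records compare as different.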