Flink Framework Source Code: Core Abstractions

yanyan 姐夫

Tags: flink, source code

Article link: https://blog.csdn.net/weixin_43097905/article/details/112799209

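Introduction

When reading source code, I usually start from the top-level design and the core abstractions rather than diving straight into specific code. A mature framework can easily run to a million lines or more, so reading all of it is not realistic. Only by starting from the architecture and the top-level abstractions can you grasp the key design ideas; then, when you need to know how a particular method or algorithm is implemented, you can find the right place quickly and read it in detail. That is the essence of reading source code.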

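Environment Objects

Flink's environment objects fall into three groups: the execution environment used at development time (StreamExecutionEnvironment for the DataStream API, with ExecutionEnvironment as its DataSet counterpart), the runtime environment (Environment), and the runtime context (RuntimeContext).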

StreamExecutionEnvironment

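StreamExecutionEnvironment is the entry point for developing a Flink streaming job (it lives in the flink-streaming-java module). It represents the execution environment of a streaming job and provides: the job development entry point, the data source interfaces, the interfaces for creating and transforming DataStreams, the sink interfaces, the job configuration interfaces, and the job launch entry point.

Its subclasses include LocalStreamEnvironment, RemoteStreamEnvironment, StreamContextEnvironment, and StreamPlanEnvironment.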

ExecutionEnvironment

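ExecutionEnvironment is the job-level environment object of the DataSet (batch) API (it lives in the flink-java module). It is the batch counterpart of StreamExecutionEnvironment rather than an object derived from it. Depending on how the job is run, a different implementation is chosen:

LocalEnvironment: local execution environment; simulates a Flink cluster inside a single JVM, for local development and testing
RemoteEnvironment: execution environment for a remotely deployed Flink cluster
CollectionEnvironment: collection-based execution environment; runs a Flink program serially on local collections
OptimizerPlanEnvironment: does not execute the job; it only produces the execution plan
ContextEnvironment: used for remote execution submitted from the client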

Environment

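Environment is Flink's runtime environment. As an interface, it defines the configuration a task needs at runtime.

RuntimeEnvironment: the runtime environment, initialized when a task starts executing
DummyEnvironment: a runtime environment used for testing
MockEnvironment: a runtime environment used for unit tests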

RuntimeContext

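RuntimeContext is the context a function runs in. Every function instance has its own RuntimeContext, which can be obtained through RichFunction.getRuntimeContext().

StreamingRuntimeContext: the context used in streaming jobs
DistributedRuntimeUDFContext: created at runtime by the batch operator that hosts the user-defined function; used in DataSet batch jobs
RuntimeUDFContext: used by user-defined functions in batch applications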

Stream Elements

StreamElement

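StreamElement covers four kinds of elements with different purposes. At the execution layer they are serialized into a binary stream, and inside the operator they are deserialized again for processing.

StreamRecord: business data; it can also be regarded as an event
Watermark: a timestamp that tells the operator that all data earlier than the watermark has arrived, so windows can be evaluated and timers fired
StreamStatus: the status of the stream, which tells a task whether it will keep receiving data from upstream. It is generated in the source operator and travels downstream along the dataflow. The possible values are:
1. IDLE: idle
2. ACTIVE: active
LatencyMarker: used to monitor processing latency. It is generated in the source operator and travels downstream along the dataflow while bypassing the business logic; the overall latency is finally estimated at the sink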

Data Transformation

Transformation

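Transformation is the structure that connects DataStream with the Flink core: DataStream faces the developer, while Transformation faces the Flink core. When a job is built, the DataStream pipeline is converted into a Transformation pipeline.

Transformations fall into two broad categories, physical and virtual.

SourceTransformation: physical. The starting point of a Flink job; it has no input, so no real transformation takes place. A job can have multiple SourceTransformations
SinkTransformation: physical. The end point of a Flink job; it writes data to external storage and has no downstream transformation. A job can have multiple SinkTransformations
OneInputTransformation: physical. Single input, single output
TwoInputTransformation: physical. Two inputs, single output
SplitTransformation: virtual. Splits one DataStream into several streams by condition, using an OutputSelector. It does not actually transform data; it only wires upstream to downstream
SelectTransformation: virtual. Used together with SplitTransformation to select one of the split DataStreams
PartitionTransformation: virtual. Chooses the partitioning of the stream according to the given StreamPartitioner; it only wires upstream to downstream
UnionTransformation: virtual. Merges several upstream DataStreams into one; all input streams must have the same structure

As the intermediary, the Transformation layer gets the StreamTask and the operator factory set up, and the operator acts as the container in which the UDF runs.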

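Operators

StreamOperator

StreamOperator is the operator of streaming computation. An operator is one computation step, while the actual computation is done by the function the operator contains.

DataStream and DataSet have two separate operator hierarchies. The trend is toward unified batch and stream processing; this article only covers the DataStream operators.

Operator lifecycle

setup: instantiates the operator and initializes the environment, time-related settings, metrics registration, and so on
open: usually contains the operator's initialization logic; the function starts processing data only after this method has run
close: called after all elements have entered the operator and been processed; it guarantees that buffered results are sent downstream
dispose: called in the final phase of the operator's lifecycle, mainly to release resources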
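The call order of these four phases can be sketched in plain Java. This is only a toy model: BufferingOperator and LifecycleDemo are hypothetical names, and the real StreamOperator is driven by the StreamTask with far more machinery (state, metrics, timers) than shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the operator lifecycle; not Flink's StreamOperator. */
class LifecycleDemo {

    /** Hypothetical operator that buffers elements and flushes them in close(). */
    static class BufferingOperator {
        final List<String> calls = new ArrayList<>();
        final List<String> emitted = new ArrayList<>();
        private final List<String> buffer = new ArrayList<>();

        void setup() { calls.add("setup"); }

        void open() { calls.add("open"); }

        void processElement(String element) { buffer.add(element); }

        /** Flushes whatever is still buffered, mirroring "send buffered results downstream". */
        void close() {
            calls.add("close");
            emitted.addAll(buffer);
            buffer.clear();
        }

        void dispose() { calls.add("dispose"); }
    }

    /** Drives one operator through the phases in the order listed above. */
    static BufferingOperator run(List<String> input) {
        BufferingOperator op = new BufferingOperator();
        op.setup();
        op.open();
        try {
            for (String element : input) {
                op.processElement(element);
            }
            op.close();
        } finally {
            // Resources are released even if processing failed.
            op.dispose();
        }
        return op;
    }

    public static void main(String[] args) {
        BufferingOperator op = run(Arrays.asList("a", "b"));
        System.out.println(op.calls + " emitted=" + op.emitted);
    }
}
```

The try/finally is a design choice of this sketch: it expresses that dispose is about releasing resources and should run on the failure path too, while close only runs after normal processing.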

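State and fault tolerance

State storage
After a checkpoint is triggered, take a snapshot
Persist the snapshot to external storage
When the job fails, restore the state from the snapshot

Data processing

While processing records, the operator also handles the Watermark and LatencyMarker elements in the stream.

Operators define two behavior interfaces, one for single-input and one for two-input operators: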

OneInputStreamOperator

public interface OneInputStreamOperator<IN, OUT> extends StreamOperator<OUT> {

	/**
	 * Processes one element that arrived at this operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 */
	void processElement(StreamRecord<IN> element) throws Exception;

	/**
	 * Processes a {@link Watermark}.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.api.watermark.Watermark
	 */
	void processWatermark(Watermark mark) throws Exception;

	void processLatencyMarker(LatencyMarker latencyMarker) throws Exception;
}

TwoInputStreamOperator

public interface TwoInputStreamOperator<IN1, IN2, OUT> extends StreamOperator<OUT> {

	/**
	 * Processes one element that arrived on the first input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 */
	void processElement1(StreamRecord<IN1> element) throws Exception;

	/**
	 * Processes one element that arrived on the second input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 */
	void processElement2(StreamRecord<IN2> element) throws Exception;

	/**
	 * Processes a {@link Watermark} that arrived on the first input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.api.watermark.Watermark
	 */
	void processWatermark1(Watermark mark) throws Exception;

	/**
	 * Processes a {@link Watermark} that arrived on the second input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.api.watermark.Watermark
	 */
	void processWatermark2(Watermark mark) throws Exception;

	/**
	 * Processes a {@link LatencyMarker} that arrived on the first input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
	 */
	void processLatencyMarker1(LatencyMarker latencyMarker) throws Exception;

	/**
	 * Processes a {@link LatencyMarker} that arrived on the second input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
	 */
	void processLatencyMarker2(LatencyMarker latencyMarker) throws Exception;

}

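Async operator

The async operator addresses the latency bottleneck of interacting with external systems: it can issue many requests concurrently and handle the responses as they come back, without blocking on each one. Two modes are supported for the order in which results are emitted:

Ordered mode: the output order matches the input order, at the cost of higher latency and lower throughput. Internally a queue ensures that the element received first is emitted first; a later element waits even if its response arrives earlier
Unordered mode: whichever element gets its response first is emitted first. Order is not guaranteed, but latency is lower and throughput is higher

Note that even unordered mode is not entirely without order. Remember watermarks? Flink still guarantees that a watermark never overtakes the records that arrived before it. In effect the watermarks split the stream into groups: records may be reordered within a group, but the groups themselves stay in order.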
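The difference between the two modes can be illustrated with a small queue model in plain Java. It is a sketch under simplifying assumptions: AsyncOutputModes, Entry and queueOf are hypothetical names, the watermark grouping described above is left out, and Flink's real queues are more involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy model of ordered vs. unordered emission; not Flink's actual queue implementation. */
class AsyncOutputModes {

    /** One in-flight request: the input value and whether its response has arrived. */
    static class Entry {
        final String value;
        boolean done;

        Entry(String value, boolean done) {
            this.value = value;
            this.done = done;
        }
    }

    /** Builds a queue in input order from parallel arrays of values and completion flags. */
    static Deque<Entry> queueOf(String[] values, boolean[] done) {
        Deque<Entry> queue = new ArrayDeque<>();
        for (int i = 0; i < values.length; i++) {
            queue.addLast(new Entry(values[i], done[i]));
        }
        return queue;
    }

    /** Ordered mode: only the head may be emitted, so a pending head blocks everything behind it. */
    static List<String> drainOrdered(Deque<Entry> queue) {
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty() && queue.peekFirst().done) {
            out.add(queue.pollFirst().value);
        }
        return out;
    }

    /** Unordered mode: every completed entry is emitted, wherever it sits in the queue. */
    static List<String> drainUnordered(Deque<Entry> queue) {
        List<String> out = new ArrayList<>();
        for (Iterator<Entry> it = queue.iterator(); it.hasNext();) {
            Entry entry = it.next();
            if (entry.done) {
                out.add(entry.value);
                it.remove();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] values = {"a", "b", "c"};
        boolean[] done = {false, true, true}; // "a" is still waiting for its response
        System.out.println("ordered:   " + drainOrdered(queueOf(values, done)));
        System.out.println("unordered: " + drainUnordered(queueOf(values, done)));
    }
}
```

With the first request still pending, the ordered drain emits nothing while the unordered drain emits the two completed results, which is exactly the latency trade-off described above.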

Functions

Function

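User-defined functions are called UDFs for short, and Flink also ships many built-in functions. By role, functions fall roughly into three kinds:

SourceFunction: reads data from external systems; the operator hosting it is a starting point and has no upstream operator
SinkFunction: writes data to external storage; the operator hosting it is an end point and has no downstream operator
Function: processes data, so it has both upstream and downstream operators. To keep things simple and effective, UDFs follow the same design as operators and come in only two shapes: single-input and two-input

Layers

In the DataStream API, functions come in three layers, from the highest level of encapsulation to the lowest:

Function: stateless UDF interface. You do not need to care about lower-level concepts; just implement the business logic
RichFunction: UDF interface + state + lifecycle. You can implement open and close to manage initialization and cleanup, and use getRuntimeContext/setRuntimeContext to access runtime parameters, which can be very useful
ProcessFunction: UDF interface + state + lifecycle + timers

Note that stateless functions can be used without a second thought, whereas stateful functions must take care of saving and restoring intermediate results.

The differences are visible in a simple class diagram (figure not included):

The difference between Keyed and non-Keyed functions is that Keyed functions can only be applied to a KeyedStream
The difference between Co and non-Co functions is that Co functions take two input streams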
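Deferred computation

This concept is a very important part of the unified batch and stream design.

In stream processing, data can arrive out of order or late. To improve efficiency, computation is done in small batches instead of being triggered once per event.

Typical cases are the timers used in joins and the watermarks used in windows.

Operators that support deferred computation implement the Triggerable interface, which lets them react to both event time and processing time.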
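The idea of "register now, compute later" can be sketched with a tiny event-time timer service in plain Java. EventTimeTimers and its Callback interface are hypothetical; Callback is only loosely modeled on the event-time half of Triggerable, and Flink's real timer service also handles keys, namespaces and checkpointing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy event-time timer service; a model of the idea, not Flink's timer service. */
class EventTimeTimers {

    /** Hypothetical callback, loosely modeled on the event-time method of Triggerable. */
    interface Callback {
        void onEventTime(long timestamp);
    }

    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    private final Callback callback;

    EventTimeTimers(Callback callback) {
        this.callback = callback;
    }

    /** Registers a timer; nothing is computed yet. */
    void register(long timestamp) {
        timers.add(timestamp);
    }

    /** Fires, in timestamp order, every timer that the watermark has reached or passed. */
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            callback.onEventTime(timers.poll());
        }
    }

    public static void main(String[] args) {
        List<Long> fired = new ArrayList<>();
        EventTimeTimers service = new EventTimeTimers(ts -> fired.add(ts));
        service.register(30L);
        service.register(10L);
        service.register(20L);
        service.advanceWatermark(15L);
        System.out.println("fired after watermark 15: " + fired);
        service.advanceWatermark(30L);
        System.out.println("fired after watermark 30: " + fired);
    }
}
```

The work attached to a timer runs only when the watermark reaches it, so many events can contribute to one result before a single computation is triggered.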

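Broadcast functions

Broadcast functions build on the RichFunction interface, the AbstractRichFunction abstract class, and the BaseBroadcastProcessFunction abstract class.

There are two abstract classes, BroadcastProcessFunction and KeyedBroadcastProcessFunction; the difference is that the Keyed variant can only be applied to a KeyedStream

processElement: only gets a ReadOnlyContext. Broadcast state must be identical on all parallel instances of the operator; if it could be modified here, the instances could diverge and cause unpredictable errors. In addition, parallel operator instances cannot communicate with each other, so by design an update made here could not be broadcast to the others.
processBroadcastElement: gets a read-write Context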

/**
 * A function to be applied to a
 * {@link org.apache.flink.streaming.api.datastream.BroadcastConnectedStream BroadcastConnectedStream} that
 * connects {@link org.apache.flink.streaming.api.datastream.BroadcastStream BroadcastStream}, i.e. a stream
 * with broadcast state, with a <b>non-keyed</b> {@link org.apache.flink.streaming.api.datastream.DataStream DataStream}.
 *
 * <p>The stream with the broadcast state can be created using the
 * {@link org.apache.flink.streaming.api.datastream.DataStream#broadcast(MapStateDescriptor[])}
 * stream.broadcast(MapStateDescriptor)} method.
 *
 * <p>The user has to implement two methods:
 * <ol>
 *     <li>the {@link #processBroadcastElement(Object, Context, Collector)} which will be applied to
 *     each element in the broadcast side
 *     <li> and the {@link #processElement(Object, ReadOnlyContext, Collector)} which will be applied to the
 *     non-broadcasted/keyed side.
 * </ol>
 *
 * <p>The {@code processElementOnBroadcastSide()} takes as argument (among others) a context that allows it to
 * read/write to the broadcast state, while the {@code processElement()} has read-only access to the broadcast state.
 *
 * @param <IN1> The input type of the non-broadcast side.
 * @param <IN2> The input type of the broadcast side.
 * @param <OUT> The output type of the operator.
 */
@PublicEvolving
public abstract class BroadcastProcessFunction<IN1, IN2, OUT> extends BaseBroadcastProcessFunction {

	private static final long serialVersionUID = 8352559162119034453L;

	/**
	 * This method is called for each element in the (non-broadcast)
	 * {@link org.apache.flink.streaming.api.datastream.DataStream data stream}.
	 *
	 * <p>This function can output zero or more elements using the {@link Collector} parameter,
	 * query the current processing/event time, and also query and update the local keyed state.
	 * Finally, it has <b>read-only</b> access to the broadcast state.
	 * The context is only valid during the invocation of this method, do not store it.
	 *
	 * @param value The stream element.
	 * @param ctx A {@link ReadOnlyContext} that allows querying the timestamp of the element,
	 *            querying the current processing/event time and updating the broadcast state.
	 *            The context is only valid during the invocation of this method, do not store it.
	 * @param out The collector to emit resulting elements to
	 * @throws Exception The function may throw exceptions which cause the streaming program
	 *                   to fail and go into recovery.
	 */
	public abstract void processElement(final IN1 value, final ReadOnlyContext ctx, final Collector<OUT> out) throws Exception;

	/**
	 * This method is called for each element in the
	 * {@link org.apache.flink.streaming.api.datastream.BroadcastStream broadcast stream}.
	 *
	 * <p>This function can output zero or more elements using the {@link Collector} parameter,
	 * query the current processing/event time, and also query and update the internal
	 * {@link org.apache.flink.api.common.state.BroadcastState broadcast state}. These can be done
	 * through the provided {@link Context}.
	 * The context is only valid during the invocation of this method, do not store it.
	 *
	 * @param value The stream element.
	 * @param ctx A {@link Context} that allows querying the timestamp of the element,
	 *            querying the current processing/event time and updating the broadcast state.
	 *            The context is only valid during the invocation of this method, do not store it.
	 * @param out The collector to emit resulting elements to
	 * @throws Exception The function may throw exceptions which cause the streaming program
	 *                   to fail and go into recovery.
	 */
	public abstract void processBroadcastElement(final IN2 value, final Context ctx, final Collector<OUT> out) throws Exception;

	/**
	 * A {@link BaseBroadcastProcessFunction.Context context} available to the broadcast side of
	 * a {@link org.apache.flink.streaming.api.datastream.BroadcastConnectedStream}.
	 */
	public abstract class Context extends BaseBroadcastProcessFunction.Context {}

	/**
	 * A {@link BaseBroadcastProcessFunction.Context context} available to the non-keyed side of
	 * a {@link org.apache.flink.streaming.api.datastream.BroadcastConnectedStream} (if any).
	 */
	public abstract class ReadOnlyContext extends BaseBroadcastProcessFunction.ReadOnlyContext {}
}
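The read-only versus read-write split can be illustrated with a plain-Java model. BroadcastStateModel is a hypothetical class that stands in for the broadcast state of one parallel instance; it uses an ordinary map and an unmodifiable view instead of Flink's BroadcastState and context classes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the broadcast state held by one parallel instance; not Flink's BroadcastState. */
class BroadcastStateModel {

    private final Map<String, String> rules = new HashMap<>();

    /** Broadcast side: every parallel instance receives and applies the same update. */
    void processBroadcastElement(String ruleName, String rule) {
        rules.put(ruleName, rule);
    }

    /** Data side: only a read-only view is handed out; writes throw UnsupportedOperationException. */
    Map<String, String> readOnlyView() {
        return Collections.unmodifiableMap(rules);
    }

    public static void main(String[] args) {
        BroadcastStateModel instanceA = new BroadcastStateModel();
        BroadcastStateModel instanceB = new BroadcastStateModel();

        // The same broadcast element reaches both instances, so their state stays identical.
        instanceA.processBroadcastElement("r1", "x > 1");
        instanceB.processBroadcastElement("r1", "x > 1");
        System.out.println("identical: " + instanceA.readOnlyView().equals(instanceB.readOnlyView()));

        try {
            instanceA.readOnlyView().put("r2", "y < 5");
        } catch (UnsupportedOperationException e) {
            System.out.println("data side cannot modify broadcast state");
        }
    }
}
```

Because updates only enter through the broadcast side, and the broadcast side delivers every element to every instance, no instance can drift away from the others.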

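Async functions

The RichAsyncFunction abstract class implements the AsyncFunction interface and extends AbstractRichFunction, which gives it lifecycle management and access to the RuntimeContext.

AsyncFunction defines two behaviors: the asynchronous call, which delivers its result through a ResultFuture, and timeout handling, which keeps resources from being held forever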

public interface AsyncFunction<IN, OUT> extends Function, Serializable {

	/**
	 * Trigger async operation for each stream input.
	 *
	 * @param input element coming from an upstream task
	 * @param resultFuture to be completed with the result data
	 * @exception Exception in case of a user code error. An exception will make the task fail and
	 * trigger fail-over process.
	 */
	void asyncInvoke(IN input, ResultFuture<OUT> resultFuture) throws Exception;

	/**
	 * {@link AsyncFunction#asyncInvoke} timeout occurred.
	 * By default, the result future is exceptionally completed with a timeout exception.
	 *
	 * @param input element coming from an upstream task
	 * @param resultFuture to be completed with the result data
	 */
	default void timeout(IN input, ResultFuture<OUT> resultFuture) throws Exception {
		resultFuture.completeExceptionally(
			new TimeoutException("Async function call has timed out."));
	}

}
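The interplay of the asynchronous call and the timeout can be sketched with the JDK's CompletableFuture. This is a model under stated assumptions, not the operator's implementation: AsyncCallWithTimeout, invoke and the client function are hypothetical, CompletableFuture plays the role of ResultFuture, and only the timeout message is taken from the default timeout method quoted above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Toy model of an async call guarded by a timeout; CompletableFuture stands in for ResultFuture. */
class AsyncCallWithTimeout {

    // Daemon timer thread so that a pending timeout never keeps the JVM alive.
    private static final ScheduledExecutorService TIMER =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "async-timeout-timer");
                thread.setDaemon(true);
                return thread;
            });

    /**
     * Starts the (hypothetical) client call and returns a future that is completed either
     * with the client's outcome or, if that takes too long, with a TimeoutException.
     */
    static <IN, OUT> CompletableFuture<OUT> invoke(
            IN input, Function<IN, CompletableFuture<OUT>> client, long timeoutMillis) {
        CompletableFuture<OUT> result = new CompletableFuture<>();

        // "asyncInvoke": hand the client's outcome to the result future without blocking.
        client.apply(input).whenComplete((value, error) -> {
            if (error != null) {
                result.completeExceptionally(error);
            } else {
                result.complete(value);
            }
        });

        // "timeout": has no effect if the result future is already completed.
        TIMER.schedule(() -> {
            result.completeExceptionally(new TimeoutException("Async function call has timed out."));
        }, timeoutMillis, TimeUnit.MILLISECONDS);

        return result;
    }

    public static void main(String[] args) {
        CompletableFuture<String> fast =
                invoke("key", key -> CompletableFuture.completedFuture(key + "-value"), 1000);
        System.out.println(fast.join());

        CompletableFuture<String> slow = invoke("key", key -> new CompletableFuture<String>(), 50);
        try {
            slow.join();
        } catch (CompletionException e) {
            System.out.println("failed with: " + e.getCause());
        }
    }
}
```

Completing the result exceptionally on timeout means the waiting side is always released, which is the "do not hold resources forever" property the timeout method exists for.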

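Source functions

The SourceFunction interface defines only the business-related behavior. In practice you usually extend RichSourceFunction or RichParallelSourceFunction; both abstract classes extend AbstractRichFunction and thereby gain lifecycle management and access to the RuntimeContext.

The difference between the two is that they implement SourceFunction and ParallelSourceFunction respectively, which gives RichParallelSourceFunction the ability to run in parallel.

The key behaviors are:

Lifecycle: implementations usually extend AbstractRichFunction, so the lifecycle consists of open and close (from AbstractRichFunction) plus cancel (from SourceFunction)
Reading data: continuously reads from whatever external system is implemented, for example Kafka
Emitting data: nothing special; records are emitted through the SourceContext
Generating watermarks and sending them downstream
Idle marking: if no data is being read, the task is marked idle and an IDLE status is sent downstream, so that downstream tasks can advance their watermarks without waiting for watermarks from this source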

/**
 * Base interface for all stream data sources in Flink. The contract of a stream source
 * is the following: When the source should start emitting elements, the {@link #run} method
 * is called with a {@link SourceContext} that can be used for emitting elements.
 * The run method can run for as long as necessary. The source must, however, react to an
 * invocation of {@link #cancel()} by breaking out of its main loop.
 *
 * <h3>CheckpointedFunction Sources</h3>
 *
 * <p>Sources that also implement the {@link org.apache.flink.streaming.api.checkpoint.CheckpointedFunction}
 * interface must ensure that state checkpointing, updating of internal state and emission of
 * elements are not done concurrently. This is achieved by using the provided checkpointing lock
 * object to protect update of state and emission of elements in a synchronized block.
 *
 * <p>This is the basic pattern one should follow when implementing a checkpointed source:
 *
 * <pre>{@code
 *  public class ExampleCountSource implements SourceFunction<Long>, CheckpointedFunction {
 *      private long count = 0L;
 *      private volatile boolean isRunning = true;
 *
 *      private transient ListState<Long> checkpointedCount;
 *
 *      public void run(SourceContext<T> ctx) {
 *          while (isRunning && count < 1000) {
 *              // this synchronized block ensures that state checkpointing,
 *              // internal state updates and emission of elements are an atomic operation
 *              synchronized (ctx.getCheckpointLock()) {
 *                  ctx.collect(count);
 *                  count++;
 *              }
 *          }
 *      }
 *
 *      public void cancel() {
 *          isRunning = false;
 *      }
 *
 *      public void initializeState(FunctionInitializationContext context) {
 *          this.checkpointedCount = context
 *              .getOperatorStateStore()
 *              .getListState(new ListStateDescriptor<>("count", Long.class));
 *
 *          if (context.isRestored()) {
 *              for (Long count : this.checkpointedCount.get()) {
 *                  this.count = count;
 *              }
 *          }
 *      }
 *
 *      public void snapshotState(FunctionSnapshotContext context) {
 *          this.checkpointedCount.clear();
 *          this.checkpointedCount.add(count);
 *      }
 * }
 * }</pre>
 *
 *
 * <h3>Timestamps and watermarks:</h3>
 * Sources may assign timestamps to elements and may manually emit watermarks.
 * However, these are only interpreted if the streaming program runs on
 * {@link TimeCharacteristic#EventTime}. On other time characteristics
 * ({@link TimeCharacteristic#IngestionTime} and {@link TimeCharacteristic#ProcessingTime}),
 * the watermarks from the source function are ignored.
 *
 * <h3>Gracefully Stopping Functions</h3>
 * Functions may additionally implement the {@link org.apache.flink.api.common.functions.StoppableFunction}
 * interface. "Stopping" a function, in contrast to "canceling" means a graceful exit that leaves the
 * state and the emitted elements in a consistent state.
 *
 * <p>When a source is stopped, the executing thread is not interrupted, but expected to leave the
 * {@link #run(SourceContext)} method in reasonable time on its own, preserving the atomicity
 * of state updates and element emission.
 *
 * @param <T> The type of the elements produced by this source.
 *
 * @see org.apache.flink.api.common.functions.StoppableFunction
 * @see org.apache.flink.streaming.api.TimeCharacteristic
 */
@Public
public interface SourceFunction<T> extends Function, Serializable {

	/**
	 * Starts the source. Implementations can use the {@link SourceContext} emit
	 * elements.
	 *
	 * <p>Sources that implement {@link org.apache.flink.streaming.api.checkpoint.CheckpointedFunction}
	 * must lock on the checkpoint lock (using a synchronized block) before updating internal
	 * state and emitting elements, to make both an atomic operation:
	 *
	 * <pre>{@code
	 *  public class ExampleCountSource implements SourceFunction<Long>, CheckpointedFunction {
	 *      private long count = 0L;
	 *      private volatile boolean isRunning = true;
	 *
	 *      private transient ListState<Long> checkpointedCount;
	 *
	 *      public void run(SourceContext<T> ctx) {
	 *          while (isRunning && count < 1000) {
	 *              // this synchronized block ensures that state checkpointing,
	 *              // internal state updates and emission of elements are an atomic operation
	 *              synchronized (ctx.getCheckpointLock()) {
	 *                  ctx.collect(count);
	 *                  count++;
	 *              }
	 *          }
	 *      }
	 *
	 *      public void cancel() {
	 *          isRunning = false;
	 *      }
	 *
	 *      public void initializeState(FunctionInitializationContext context) {
	 *          this.checkpointedCount = context
	 *              .getOperatorStateStore()
	 *              .getListState(new ListStateDescriptor<>("count", Long.class));
	 *
	 *          if (context.isRestored()) {
	 *              for (Long count : this.checkpointedCount.get()) {
	 *                  this.count = count;
	 *              }
	 *          }
	 *      }
	 *
	 *      public void snapshotState(FunctionSnapshotContext context) {
	 *          this.checkpointedCount.clear();
	 *          this.checkpointedCount.add(count);
	 *      }
	 * }
	 * }</pre>
	 *
	 * @param ctx The context to emit elements to and for accessing locks.
	 */
	void run(SourceContext<T> ctx) throws Exception;

	/**
	 * Cancels the source. Most sources will have a while loop inside the
	 * {@link #run(SourceContext)} method. The implementation needs to ensure that the
	 * source will break out of that loop after this method is called.
	 *
	 * <p>A typical pattern is to have an {@code "volatile boolean isRunning"} flag that is set to
	 * {@code false} in this method. That flag is checked in the loop condition.
	 *
	 * <p>When a source is canceled, the executing thread will also be interrupted
	 * (via {@link Thread#interrupt()}). The interruption happens strictly after this
	 * method has been called, so any interruption handler can rely on the fact that
	 * this method has completed. It is good practice to make any flags altered by
	 * this method "volatile", in order to guarantee the visibility of the effects of
	 * this method to any interruption handler.
	 */
	void cancel();

	// ------------------------------------------------------------------------
	//  source context
	// ------------------------------------------------------------------------

	/**
	 * Interface that source functions use to emit elements, and possibly watermarks.
	 *
	 * @param <T> The type of the elements produced by the source.
	 */
	@Public // Interface might be extended in the future with additional methods.
	interface SourceContext<T> {

		/**
		 * Emits one element from the source, without attaching a timestamp. In most cases,
		 * this is the default way of emitting elements.
		 *
		 * <p>The timestamp that the element will get assigned depends on the time characteristic of
		 * the streaming program:
		 * <ul>
		 *     <li>On {@link TimeCharacteristic#ProcessingTime}, the element has no timestamp.</li>
		 *     <li>On {@link TimeCharacteristic#IngestionTime}, the element gets the system's
		 *         current time as the timestamp.</li>
		 *     <li>On {@link TimeCharacteristic#EventTime}, the element will have no timestamp initially.
		 *         It needs to get a timestamp (via a {@link TimestampAssigner}) before any time-dependent
		 *         operation (like time windows).</li>
		 * </ul>
		 *
		 * @param element The element to emit
		 */
		void collect(T element);

		/**
		 * Emits one element from the source, and attaches the given timestamp. This method
		 * is relevant for programs using {@link TimeCharacteristic#EventTime}, where the
		 * sources assign timestamps themselves, rather than relying on a {@link TimestampAssigner}
		 * on the stream.
		 *
		 * <p>On certain time characteristics, this timestamp may be ignored or overwritten.
		 * This allows programs to switch between the different time characteristics and behaviors
		 * without changing the code of the source functions.
		 * <ul>
		 *     <li>On {@link TimeCharacteristic#ProcessingTime}, the timestamp will be ignored,
		 *         because processing time never works with element timestamps.</li>
		 *     <li>On {@link TimeCharacteristic#IngestionTime}, the timestamp is overwritten with the
		 *         system's current time, to realize proper ingestion time semantics.</li>
		 *     <li>On {@link TimeCharacteristic#EventTime}, the timestamp will be used.</li>
		 * </ul>
		 *
		 * @param element The element to emit
		 * @param timestamp The timestamp in milliseconds since the Epoch
		 */
		@PublicEvolving
		void collectWithTimestamp(T element, long timestamp);

		/**
		 * Emits the given {@link Watermark}. A Watermark of value {@code t} declares that no
		 * elements with a timestamp {@code t' <= t} will occur any more. If further such
		 * elements will be emitted, those elements are considered <i>late</i>.
		 *
		 * <p>This method is only relevant when running on {@link TimeCharacteristic#EventTime}.
		 * On {@link TimeCharacteristic#ProcessingTime},Watermarks will be ignored. On
		 * {@link TimeCharacteristic#IngestionTime}, the Watermarks will be replaced by the
		 * automatic ingestion time watermarks.
		 *
		 * @param mark The Watermark to emit
		 */
		@PublicEvolving
		void emitWatermark(Watermark mark);

		/**
		 * Marks the source to be temporarily idle. This tells the system that this source will
		 * temporarily stop emitting records and watermarks for an indefinite amount of time. This
		 * is only relevant when running on {@link TimeCharacteristic#IngestionTime} and
		 * {@link TimeCharacteristic#EventTime}, allowing downstream tasks to advance their
		 * watermarks without the need to wait for watermarks from this source while it is idle.
		 *
		 * <p>Source functions should make a best effort to call this method as soon as they
		 * acknowledge themselves to be idle. The system will consider the source to resume activity
		 * again once {@link SourceContext#collect(T)}, {@link SourceContext#collectWithTimestamp(T, long)},
		 * or {@link SourceContext#emitWatermark(Watermark)} is called to emit elements or watermarks from the source.
		 */
		@PublicEvolving
		void markAsTemporarilyIdle();

		/**
		 * Returns the checkpoint lock. Please refer to the class-level comment in
		 * {@link SourceFunction} for details about how to write a consistent checkpointed
		 * source.
		 *
		 * @return The object to use as the lock
		 */
		Object getCheckpointLock();

		/**
		 * This method is called by the system to shut down the context.
		 */
		void close();
	}
}

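SourceContext in SourceFunction: the StreamSourceContexts class contains two main kinds of SourceContext

NonTimestampContext: no time semantics; the timestamp of every element is set to -1, and watermarks are never sent downstream
WatermarkContext: with time semantics; defines the watermark-related behavior
1. Manages the current StreamStatus and propagates it downstream
2. Idle detection: when no data or watermark has been received for longer than the configured time interval, the task is marked idle
AutomaticWatermarkContext: used with ingestion time; watermarks are generated automatically. It relies on a timer (WatermarkEmittingTask) that fires roughly once per watermark interval (about job start timestamp + watermark interval * n) and keeps sending watermarks downstream
ManualWatermarkContext: used with event time; it does not generate watermarks itself but forwards downstream the watermarks that the source function emits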

		private AutomaticWatermarkContext(/* constructor parameters omitted in this excerpt */) {
			// ... field initialization omitted in this excerpt ...

			long now = this.timeService.getCurrentProcessingTime();
			this.nextWatermarkTimer = this.timeService.registerTimer(now + watermarkInterval,
				new WatermarkEmittingTask(this.timeService, checkpointLock, output));
		}



		private class WatermarkEmittingTask implements ProcessingTimeCallback {

			private final ProcessingTimeService timeService;
			private final Object lock;
			private final Output<StreamRecord<T>> output;

			private WatermarkEmittingTask(
					ProcessingTimeService timeService,
					Object checkpointLock,
					Output<StreamRecord<T>> output) {
				this.timeService = timeService;
				this.lock = checkpointLock;
				this.output = output;
			}

			@Override
			public void onProcessingTime(long timestamp) {
				final long currentTime = timeService.getCurrentProcessingTime();

				synchronized (lock) {
					// we should continue to automatically emit watermarks if we are active
					if (streamStatusMaintainer.getStreamStatus().isActive()) {
						if (idleTimeout != -1 && currentTime - lastRecordTime > idleTimeout) {
							// if we are configured to detect idleness, piggy-back the idle detection check on the
							// watermark interval, so that we may possibly discover idle sources faster before waiting
							// for the next idle check to fire
							markAsTemporarilyIdle();

							// no need to finish the next check, as we are now idle.
							cancelNextIdleDetectionTask();
						} else if (currentTime > nextWatermarkTime) {
							// align the watermarks across all machines. this will ensure that we
							// don't have watermarks that creep along at different intervals because
							// the machine clocks are out of sync
							final long watermarkTime = currentTime - (currentTime % watermarkInterval);

							output.emitWatermark(new Watermark(watermarkTime));
							nextWatermarkTime = watermarkTime + watermarkInterval;
						}
					}
				}

				long nextWatermark = currentTime + watermarkInterval;
				nextWatermarkTimer = this.timeService.registerTimer(
						nextWatermark, new WatermarkEmittingTask(this.timeService, lock, output));
			}
		}
	}
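The alignment step in onProcessingTime is plain arithmetic and can be tried in isolation. The sketch below copies only the two formulas from the excerpt (watermarkTime and nextWatermarkTime); the class name WatermarkAlignment and the sample clock values are made up for illustration.

```java
/** Isolated version of the watermark alignment arithmetic from the excerpt above. */
class WatermarkAlignment {

    /** Rounds the current time down to a multiple of the watermark interval. */
    static long align(long currentTime, long watermarkInterval) {
        return currentTime - (currentTime % watermarkInterval);
    }

    /** The earliest time after which the next watermark may be emitted. */
    static long nextWatermarkTime(long watermarkTime, long watermarkInterval) {
        return watermarkTime + watermarkInterval;
    }

    public static void main(String[] args) {
        long interval = 200L;
        // Two machines whose clocks differ by 156 ms still produce the same watermark value.
        long machineA = align(1_234L, interval);
        long machineB = align(1_390L, interval);
        System.out.println(machineA + " " + machineB + " next=" + nextWatermarkTime(machineA, interval));
    }
}
```

Rounding down to a multiple of the interval is what keeps watermark values identical across machines as long as their clocks fall within the same interval.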

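Sink functions

SinkFunction is a plain output function with no lifecycle management of its own; the lifecycle comes from AbstractRichFunction.

When implementing a sink, we almost always extend RichSinkFunction or TwoPhaseCommitSinkFunction. TwoPhaseCommitSinkFunction is the key function behind Flink's exactly-once semantics: it provides a framework-level exactly-once implementation that is integrated with the checkpoint mechanism.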
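How a two-phase commit ties output visibility to checkpoints can be sketched in plain Java. The model borrows the phase names invoke, preCommit, commit and abort from TwoPhaseCommitSinkFunction; everything else (the class TwoPhaseCommitModel, the in-memory lists standing in for an external system, handling only one pending transaction) is a hypothetical simplification.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy two-phase-commit sink; only the phase names mirror TwoPhaseCommitSinkFunction. */
class TwoPhaseCommitModel {

    /** What readers of the (imaginary) external system can see. */
    final List<String> committed = new ArrayList<>();

    private List<String> current = new ArrayList<>(); // open transaction
    private List<String> pending;                      // pre-committed, awaiting checkpoint completion

    /** Writes go into the open transaction and are not yet visible. */
    void invoke(String value) {
        current.add(value);
    }

    /** On checkpoint snapshot: freeze the open transaction and start a new one. */
    void preCommit() {
        pending = current;
        current = new ArrayList<>();
    }

    /** On checkpoint completion: make the frozen transaction visible. */
    void commit() {
        if (pending != null) {
            committed.addAll(pending);
            pending = null;
        }
    }

    /** On failure: drop everything that has not been committed. */
    void abort() {
        pending = null;
        current = new ArrayList<>();
    }

    public static void main(String[] args) {
        TwoPhaseCommitModel sink = new TwoPhaseCommitModel();
        sink.invoke("a");
        sink.invoke("b");
        sink.preCommit();  // checkpoint n is being taken
        sink.invoke("c");  // belongs to the next transaction
        sink.commit();     // checkpoint n completed
        sink.abort();      // failure before the next checkpoint: "c" is discarded
        System.out.println(sink.committed);
    }
}
```

Data written after the last completed checkpoint never becomes visible if the job fails, and after recovery it is replayed from that checkpoint, which is where the exactly-once effect comes from.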

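Checkpoint functions

These are responsible for saving and restoring state at the function level. We usually implement the CheckpointedFunction or ListCheckpointed interface, which define how state snapshots are taken and restored.

CheckpointedFunction: snapshotState() is called when a snapshot for a checkpoint is requested, so the function can make sure its current state is captured; initializeState() is called when the parallel function instance is created, to set up its state and, on recovery, to run the logic that restores state from the last checkpoint.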

public interface CheckpointedFunction {

	/**
	 * This method is called when a snapshot for a checkpoint is requested. This acts as a hook to the function to
	 * ensure that all state is exposed by means previously offered through {@link FunctionInitializationContext} when
	 * the Function was initialized, or offered now by {@link FunctionSnapshotContext} itself.
	 *
	 * @param context the context for drawing a snapshot of the operator
	 * @throws Exception
	 */
	void snapshotState(FunctionSnapshotContext context) throws Exception;

	/**
	 * This method is called when the parallel function instance is created during distributed
	 * execution. Functions typically set up their state storing data structures in this method.
	 *
	 * @param context the context for initializing the operator
	 * @throws Exception
	 */
	void initializeState(FunctionInitializationContext context) throws Exception;

}

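ListCheckpointed: a more limited, list-based variant; when the job's parallelism is changed, it supports redistributing the state across the new parallel instances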

public interface ListCheckpointed<T extends Serializable> {

	/**
	 * Gets the current state of the function. The state must reflect the result of all prior
	 * invocations to this function.
	 *
	 * <p>The returned list should contain one entry for redistributable unit of state. See
	 * the {@link ListCheckpointed class docs} for an illustration how list-style state
	 * redistribution works.
	 *
	 * <p>As special case, the returned list may be null or empty (if the operator has no state)
	 * or it may contain a single element (if the operator state is indivisible).
	 *
	 * @param checkpointId The ID of the checkpoint - a unique and monotonously increasing value.
	 * @param timestamp The wall clock timestamp when the checkpoint was triggered by the master.
	 *
	 * @return The operator state in a list of redistributable, atomic sub-states.
	 *         Should not return null, but empty list instead.
	 *
	 * @throws Exception Thrown if the creation of the state object failed. This causes the
	 *                   checkpoint to fail. The system may decide to fail the operation (and trigger
	 *                   recovery), or to discard this checkpoint attempt and to continue running
	 *                   and to try again with the next checkpoint attempt.
	 */
	List<T> snapshotState(long checkpointId, long timestamp) throws Exception;

	/**
	 * Restores the state of the function or operator to that of a previous checkpoint.
	 * This method is invoked when the function is executed after a failure recovery.
	 * The state list may be empty if no state is to be recovered by the particular parallel instance
	 * of the function.
	 *
	 * <p>The given state list will contain all the <i>sub states</i> that this parallel
	 * instance of the function needs to handle. Refer to the  {@link ListCheckpointed class docs}
	 * for an illustration how list-style state redistribution works.
	 *
	 * <p><b>Important:</b> When implementing this interface together with {@link RichFunction},
	 * then the {@code restoreState()} method is called before {@link RichFunction#open(Configuration)}.
	 *
	 * @param state The state to be restored as a list of atomic sub-states.
	 *
	 * @throws Exception Throwing an exception in this method causes the recovery to fail.
	 *                   The exact consequence depends on the configured failure handling strategy,
	 *                   but typically the system will re-attempt the recovery, or try recovering
	 *                   from a different checkpoint.
	 */
	void restoreState(List<T> state) throws Exception;
}
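List-style redistribution can be illustrated in plain Java: each parallel instance returns a list of atomic sub-states, and on rescaling those sub-states are dealt out to the new instances. The round-robin assignment below is an assumption made for the sketch (Flink only promises an even split, not this exact order), and ListStateRedistribution is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of list-style state redistribution on rescaling; the exact assignment is an assumption. */
class ListStateRedistribution {

    /** Flattens the sub-state lists of the old instances and deals them out round-robin. */
    static <T> List<List<T>> redistribute(List<List<T>> oldStates, int newParallelism) {
        if (newParallelism <= 0) {
            throw new IllegalArgumentException("newParallelism must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        int next = 0;
        for (List<T> state : oldStates) {
            for (T unit : state) {
                result.get(next % newParallelism).add(unit);
                next++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two old instances snapshot three and two sub-states; the job restarts with three instances.
        List<List<Integer>> old = Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(4, 5));
        System.out.println(redistribute(old, 3));
    }
}
```

This is why snapshotState should return one list entry per redistributable unit: a single large entry cannot be split, so it would land on one instance no matter how far the job is scaled out.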

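Data Partitioning

Partition

Flink is a stream processing framework, and distributed computation is at its core. Put simply, a job is split into subtasks and different data is handed to different tasks, so each task processes a part of the data.

StreamPartitioner is the abstraction for stream partitioning; its behavior determines how data is distributed.

ChannelSelector is the key to load balancing. Every partitioner implements it, and its behavior determines the load-balancing pattern.

The selectChannels method receives the number of downstream channels. That number is fixed for the lifetime of a job unless we change the parallelism.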

/**
 * The {@link ChannelSelector} determines to which logical channels a record
 * should be written to.
 *
 * @param <T> the type of record which is sent through the attached output gate
 */
public interface ChannelSelector<T extends IOReadableWritable> {

	/**
	 * Returns the logical channel indexes, to which the given record should be
	 * written.
	 *
	 * @param record      the record to the determine the output channels for
	 * @param numChannels the total number of output channels which are attached to respective output gate
	 * @return a (possibly empty) array of integer numbers which indicate the indices of the output channels through
	 * which the record shall be forwarded
	 */
	int[] selectChannels(T record, int numChannels);
}
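Two of the simplest selection strategies can be written against a stripped-down version of this interface. SimpleChannelSelector, RoundRobinSelector and BroadcastSelector are hypothetical names; the interface keeps the selectChannels signature quoted above but drops the IOReadableWritable bound so the sketch runs without Flink on the classpath.

```java
import java.util.Arrays;

/** Simplified stand-in for the ChannelSelector quoted above (no IOReadableWritable bound). */
interface SimpleChannelSelector<T> {
    int[] selectChannels(T record, int numChannels);
}

/** Sends each record to the next channel in turn, spreading the load evenly. */
class RoundRobinSelector<T> implements SimpleChannelSelector<T> {
    private int next = -1;

    @Override
    public int[] selectChannels(T record, int numChannels) {
        next = (next + 1) % numChannels;
        return new int[] {next};
    }
}

/** Sends each record to every channel. */
class BroadcastSelector<T> implements SimpleChannelSelector<T> {
    @Override
    public int[] selectChannels(T record, int numChannels) {
        int[] all = new int[numChannels];
        for (int i = 0; i < numChannels; i++) {
            all[i] = i;
        }
        return all;
    }
}

class ChannelSelectorDemo {
    public static void main(String[] args) {
        SimpleChannelSelector<String> roundRobin = new RoundRobinSelector<>();
        for (String record : new String[] {"a", "b", "c", "d"}) {
            System.out.println(record + " -> " + Arrays.toString(roundRobin.selectChannels(record, 3)));
        }
        System.out.println("x -> " + Arrays.toString(new BroadcastSelector<String>().selectChannels("x", 3)));
    }
}
```

Returning an array rather than a single index is what lets one interface cover both "pick one channel" strategies and broadcast.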

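Common partitioning strategies:

partitionCustom: custom partitioning on a DataStream; selects the target partition for each element and produces a new DataStream
ForwardPartitioner: forwards data from the upstream operator directly to the downstream operator; it produces a new DataStream
ShufflePartitioner: picks a downstream channel at random
RebalancePartitioner: sends data downstream in round-robin fashion to avoid data skew
RescalePartitioner: partitions according to the number of upstream and downstream tasks
BroadcastPartitioner: broadcasts every element to all downstream channels
KeyGroupStreamPartitioner: partitions a KeyedStream by the key group of each key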
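Key-group partitioning works in two steps: a key is mapped to a key group, and the key group is mapped to a downstream channel. The sketch below models both steps in plain Java. It is not taken from the article: the formula keyGroup * parallelism / maxParallelism reflects my understanding of Flink's key-group assignment, and the hashing is deliberately simplified (Flink additionally scrambles hashCode() with a murmur hash), so the concrete channel numbers will differ from a real job. KeyGroupModel is a hypothetical name.

```java
/** Toy model of key-group based partitioning; hashing is simplified compared to Flink. */
class KeyGroupModel {

    /** Step 1: key to key group. Simplification: plain hashCode() instead of a murmur hash. */
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Step 2: key group to the index of the downstream parallel instance (channel). */
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    /** Both steps combined: the channel a record with this key is sent to. */
    static int channelFor(Object key, int maxParallelism, int parallelism) {
        return operatorIndexFor(keyGroupFor(key, maxParallelism), maxParallelism, parallelism);
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        int parallelism = 4;
        for (String key : new String[] {"a", "b", "user-42"}) {
            System.out.println(key + " -> channel " + channelFor(key, maxParallelism, parallelism));
        }
    }
}
```

The indirection through key groups means that when the parallelism changes, whole key groups move between instances while the key-to-group mapping stays fixed, which keeps keyed state redistribution manageable.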

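Connectors

Not much to say here: connectors integrate Flink with external data systems.

Distributed IDs

To pass data across the network, a distributed framework needs to generate identifiers for all kinds of objects.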

/**
 * A statistically unique identification number.
 */
@PublicEvolving
public class AbstractID implements Comparable<AbstractID>, java.io.Serializable {

	private static final long serialVersionUID = 1L;

	private static final Random RND = new Random();

	/** The size of a long in bytes. */
	private static final int SIZE_OF_LONG = 8;

	/** The size of the ID in byte. */
	public static final int SIZE = 2 * SIZE_OF_LONG;

	// ------------------------------------------------------------------------

	/** The upper part of the actual ID. */
	protected final long upperPart;

	/** The lower part of the actual ID. */
	protected final long lowerPart;

	/** The memoized value returned by toString(). */
	private transient String toString;

	// --------------------------------------------------------------------------------------------

	/**
	 * Constructs a new ID with a specific bytes value.
	 */
	public AbstractID(byte[] bytes) {
		if (bytes == null || bytes.length != SIZE) {
			throw new IllegalArgumentException("Argument bytes must by an array of " + SIZE + " bytes");
		}

		this.lowerPart = byteArrayToLong(bytes, 0);
		this.upperPart = byteArrayToLong(bytes, SIZE_OF_LONG);
	}

	/**
	 * Constructs a new abstract ID.
	 *
	 * @param lowerPart the lower bytes of the ID
	 * @param upperPart the higher bytes of the ID
	 */
	public AbstractID(long lowerPart, long upperPart) {
		this.lowerPart = lowerPart;
		this.upperPart = upperPart;
	}

	/**
	 * Copy constructor: Creates a new abstract ID from the given one.
	 *
	 * @param id the abstract ID to copy
	 */
	public AbstractID(AbstractID id) {
		if (id == null) {
			throw new IllegalArgumentException("Id must not be null.");
		}
		this.lowerPart = id.lowerPart;
		this.upperPart = id.upperPart;
	}

	/**
	 * Constructs a new random ID from a uniform distribution.
	 */
	public AbstractID() {
		this.lowerPart = RND.nextLong();
		this.upperPart = RND.nextLong();
	}

	// --------------------------------------------------------------------------------------------

	/**
	 * Gets the lower 64 bits of the ID.
	 *
	 * @return The lower 64 bits of the ID.
	 */
	public long getLowerPart() {
		return lowerPart;
	}

	/**
	 * Gets the upper 64 bits of the ID.
	 *
	 * @return The upper 64 bits of the ID.
	 */
	public long getUpperPart() {
		return upperPart;
	}

	/**
	 * Gets the bytes underlying this ID.
	 *
	 * @return The bytes underlying this ID.
	 */
	public byte[] getBytes() {
		byte[] bytes = new byte[SIZE];
		longToByteArray(lowerPart, bytes, 0);
		longToByteArray(upperPart, bytes, SIZE_OF_LONG);
		return bytes;
	}

	// --------------------------------------------------------------------------------------------
	//  Standard Utilities
	// --------------------------------------------------------------------------------------------

	@Override
	public boolean equals(Object obj) {
		if (obj == this) {
			return true;
		} else if (obj != null && obj.getClass() == getClass()) {
			AbstractID that = (AbstractID) obj;
			return that.lowerPart == this.lowerPart && that.upperPart == this.upperPart;
		} else {
			return false;
		}
	}

	@Override
	public int hashCode() {
		return ((int)  this.lowerPart) ^
				((int) (this.lowerPart >>> 32)) ^
				((int)  this.upperPart) ^
				((int) (this.upperPart >>> 32));
	}

	@Override
	public String toString() {
		if (this.toString == null) {
			final byte[] ba = new byte[SIZE];
			longToByteArray(this.lowerPart, ba, 0);
			longToByteArray(this.upperPart, ba, SIZE_OF_LONG);

			this.toString = StringUtils.byteToHexString(ba);
		}

		return this.toString;
	}

	@Override
	public int compareTo(AbstractID o) {
		int diff1 = Long.compare(this.upperPart, o.upperPart);
		int diff2 = Long.compare(this.lowerPart, o.lowerPart);
		return diff1 == 0 ? diff2 : diff1;
	}

	// --------------------------------------------------------------------------------------------
	//  Conversion Utilities
	// --------------------------------------------------------------------------------------------

	/**
	 * Converts the given byte array to a long.
	 *
	 * @param ba the byte array to be converted
	 * @param offset the offset indicating at which byte inside the array the conversion shall begin
	 * @return the long variable
	 */
	private static long byteArrayToLong(byte[] ba, int offset) {
		long l = 0;

		for (int i = 0; i < SIZE_OF_LONG; ++i) {
			l |= (ba[offset + SIZE_OF_LONG - 1 - i] & 0xffL) << (i << 3);
		}

		return l;
	}

	/**
	 * Converts a long to a byte array.
	 *
	 * @param l the long variable to be converted
	 * @param ba the byte array to store the result the of the conversion
	 * @param offset offset indicating at what position inside the byte array the result of the conversion shall be stored
	 */
	private static void longToByteArray(long l, byte[] ba, int offset) {
		for (int i = 0; i < SIZE_OF_LONG; ++i) {
			final int shift = i << 3; // i * 8
			ba[offset + SIZE_OF_LONG - 1 - i] = (byte) ((l & (0xffL << shift)) >>> shift);
		}
	}
}
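
The byte layout used by getBytes() and the byte-array constructor can be reproduced with java.nio.ByteBuffer, whose default byte order is big-endian. The sketch below is a standalone illustration rather than part of Flink: IdBytesSketch and its methods are hypothetical names. It writes lowerPart first and upperPart second, as the class above does, and round-trips the 16 bytes.

```java
import java.nio.ByteBuffer;

public class IdBytesSketch {

    /** Size of the ID in bytes: two 8-byte longs. */
    static final int SIZE = 2 * Long.BYTES;

    /** Serializes the two halves, lower part first, each in big-endian order. */
    static byte[] toBytes(long lowerPart, long upperPart) {
        return ByteBuffer.allocate(SIZE).putLong(lowerPart).putLong(upperPart).array();
    }

    /** Reads the two halves back; returns {lowerPart, upperPart}. */
    static long[] fromBytes(byte[] bytes) {
        if (bytes == null || bytes.length != SIZE) {
            throw new IllegalArgumentException("Expected an array of " + SIZE + " bytes");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new long[] {buffer.getLong(), buffer.getLong()};
    }

    /** Renders the bytes as a lowercase hex string, two characters per byte. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(1L, 255L);
        System.out.println(toHex(bytes)); // prints 000000000000000100000000000000ff
        long[] parts = fromBytes(bytes);
        System.out.println(parts[0] + ", " + parts[1]); // prints 1, 255
    }
}
```

Because the no-argument constructor draws both halves from a random generator, uniqueness is statistical rather than guaranteed, as the class Javadoc says; 128 random bits make collisions very unlikely in practice.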