Flink Framework Source Code: Core Abstractions

Table of Contents

Introduction

Environment Objects

StreamExecutionEnvironment

ExecutionEnvironment

Environment

RuntimeContext

Stream Elements

StreamElement

Data Transformations

Transformation

Operators

StreamOperator

Operator Lifecycle

State and Fault Tolerance

Data Processing

OneInputStreamOperator

TwoInputStreamOperator

Asynchronous Operators

Functions

Function

Hierarchy

Deferred Computation

Broadcast Functions

Asynchronous Functions

Source Functions

Sink Functions

Checkpoint Functions

Data Partitioning

Partition

Connectors

Distributed IDs

Summary


Introduction

When reading source code, I always start from the top level and the core abstractions; never dive straight into the concrete code. A mature framework runs to a million lines of code or more, and you cannot read them all. Only by starting from the architectural design and the top-level abstractions can you grasp the design essentials; then, when you need to know how a particular method or algorithm is implemented, you can quickly find the right spot and study it in detail. That is the essence of reading source code.

Environment Objects

Flink's environment objects fall into three kinds: the development-time execution environment (StreamExecutionEnvironment), the runtime execution environment (ExecutionEnvironment), and the runtime context object (RuntimeContext).

StreamExecutionEnvironment

This is the entry point for Flink development (located in flink-streaming-java) and represents the execution environment of a streaming job. It provides the job development entry point, source APIs, DataStream creation and transformation APIs, sink APIs, job configuration APIs, and the job launch entry point.

More concretely, it is subclassed into LocalStreamEnvironment, RemoteStreamEnvironment, StreamContextEnvironment, and StreamPlanEnvironment.
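As a quick sketch of those entry points, and with the socket host and port as mere placeholders, a minimal job might look like this:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PipelineSketch {
	public static void main(String[] args) throws Exception {
		// getExecutionEnvironment() picks a local or remote environment automatically
		StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

		DataStream<String> lines = env.socketTextStream("localhost", 9999); // source API
		lines.map(String::toUpperCase) // DataStream transformation API
			.print();                  // sink API

		env.execute("pipeline-sketch"); // job launch entry point
	}
}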


ExecutionEnvironment

This is the runtime, job-level environment object (located in flink-java), derived from StreamExecutionEnvironment. When a job is launched, the required context data is extracted from the StreamExecutionEnvironment, and one of the following runtime execution environments is chosen depending on the job:

  1. LocalEnvironment: local execution environment; simulates a Flink cluster inside a single JVM, for local development and testing
  2. RemoteEnvironment: execution environment for a remotely deployed Flink cluster
  3. CollectionEnvironment: collection-based execution environment; runs Flink programs over local collection data
  4. OptimizerPlanEnvironment: creates no execution environment, only the execution plan
  5. ContextEnvironment: used for remote execution from the client

Environment

Flink's runtime environment; as an interface, it defines the configuration a task needs at runtime.

  1. RuntimeEnvironment: the runtime environment, initialized when a task starts executing
  2. DummyEnvironment: a runtime environment used for testing
  3. MockEnvironment: a runtime environment for unit tests

RuntimeContext

This is the runtime context of a function: every function instance has a RuntimeContext object, which can be obtained via RichFunction.getRuntimeContext().

  1. StreamingRuntimeContext: context for streaming computation
  2. DistributedRuntimeUDFContext: created at runtime by the batch operator hosting the user-defined function; used in DataSet batch processing
  3. RuntimeUDFContext: used in the user-defined functions of batch applications
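As a minimal sketch of that access path, a rich function can query its RuntimeContext like this (the mapper itself is illustrative):

import org.apache.flink.api.common.functions.RichMapFunction;

public class TaskInfoMapper extends RichMapFunction<String, String> {
	@Override
	public String map(String value) {
		// every Rich* function instance carries a RuntimeContext
		int subtask = getRuntimeContext().getIndexOfThisSubtask();
		return "subtask-" + subtask + ": " + value;
	}
}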


Stream Elements

StreamElement

Stream elements come in four kinds with different purposes. At the execution level they are serialized into a binary stream, and operators deserialize them for processing:

  1. StreamRecord: business data; can also be regarded as an event
  2. Watermark: a timestamp telling operators that all data earlier than the watermark has arrived, so window computation or timers can fire
  3. StreamStatus: the stream status, telling a task whether to keep receiving upstream data; generated in the source operator and propagated downstream along the dataflow; the statuses are:
    1. IDLE: idle
    2. ACTIVE: active
  4. LatencyMarker: monitors processing latency; generated in the source operator and propagated downstream along the dataflow, bypassing the business logic, so that the sink can estimate the end-to-end latency

Data Transformations

Transformation

Transformation is the structure that connects DataStream to the Flink core: DataStream faces developers, while Transformation faces the Flink kernel. When the job runs, the DataStream pipeline is translated into a Transformation pipeline.

Transformations fall into two broad categories, physical and virtual:

  1. SourceTransformation: physical; the starting point of a Flink job; it has no input, so no transformation in the real sense happens here; a job may contain multiple SourceTransformations
  2. SinkTransformation: physical; the end point of a Flink job; writes data to external storage and has no further downstream transformation; a job may contain multiple SinkTransformations
  3. OneInputTransformation: physical; single-input, single-output transformation
  4. TwoInputTransformation: physical; dual-input, single-output transformation
  5. SplitTransformation: virtual; splits a single DataStream into several streams by condition, using an OutputSelector; no real data transformation happens, it only wires upstream to downstream
  6. SelectTransformation: virtual; used together with SplitTransformation to select one of the split DataStreams
  7. PartitionTransformation: virtual; selects partitions for the stream according to the given StreamPartitioner; it only wires upstream to downstream
  8. UnionTransformation: virtual; merges several upstream DataStreams into one; all inputs must have the same record type

Acting as the intermediary, Transformation sets up the StreamTask and the operator factory; the operator then serves as the execution container for the UDF.
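A sketch of how DataStream API calls produce the Transformation types above (the stream contents are placeholders):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TransformationSketch {
	public static void main(String[] args) throws Exception {
		StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

		DataStream<String> a = env.fromElements("x", "y"); // SourceTransformation
		DataStream<String> b = env.fromElements("z");      // SourceTransformation

		a.union(b)             // UnionTransformation (virtual; same record type required)
			.rebalance()       // PartitionTransformation (virtual; channel selection only)
			.map(String::trim) // OneInputTransformation (physical)
			.print();          // SinkTransformation

		env.execute("transformation-sketch");
	}
}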


Operators

StreamOperator

StreamOperator is the operator of streaming computation. An operator represents one computation step, while the actual computation is performed by the function the operator contains.

DataStream and DataSet have two separate operator systems. The future direction is unified batch-stream processing; this article discusses only the DataStream operators.

Operator Lifecycle

  1. setup: instantiates the operator; initialization covers the environment, time characteristic, metric registration, and so on
  2. open: usually contains the operator's initialization logic; only after this method has run does the function begin processing data
  3. close: called after all elements have passed through the operator; guarantees that buffered results are flushed downstream
  4. dispose: executed at the very end of the operator's life cycle, mainly to release resources
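A minimal custom-operator sketch of these hooks, against the older DataStream API quoted in this article (AbstractStreamOperator supplies default watermark and latency-marker handling, so only processElement must be written):

import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

public class UpperCaseOperator extends AbstractStreamOperator<String>
		implements OneInputStreamOperator<String, String> {

	@Override
	public void open() throws Exception {
		super.open(); // runs before the first processElement() call
		// operator initialization logic goes here
	}

	@Override
	public void processElement(StreamRecord<String> element) {
		output.collect(element.replace(element.getValue().toUpperCase()));
	}

	@Override
	public void close() throws Exception {
		// called once all elements have been processed; flush buffers here
		super.close();
	}
}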

State and Fault Tolerance

  1. Stores state
  2. Takes a snapshot once a checkpoint is triggered
  3. Persists the snapshot to external storage
  4. Restores state from the snapshot when the job fails

Data Processing

While processing data, the operator also processes the watermarks and LatencyMarkers among the stream elements.

Based on single versus dual input, operators define two behavioral interfaces:

OneInputStreamOperator

public interface OneInputStreamOperator<IN, OUT> extends StreamOperator<OUT> {

	/**
	 * Processes one element that arrived at this operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 */
	void processElement(StreamRecord<IN> element) throws Exception;

	/**
	 * Processes a {@link Watermark}.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.api.watermark.Watermark
	 */
	void processWatermark(Watermark mark) throws Exception;

	void processLatencyMarker(LatencyMarker latencyMarker) throws Exception;
}

TwoInputStreamOperator

public interface TwoInputStreamOperator<IN1, IN2, OUT> extends StreamOperator<OUT> {

	/**
	 * Processes one element that arrived on the first input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 */
	void processElement1(StreamRecord<IN1> element) throws Exception;

	/**
	 * Processes one element that arrived on the second input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 */
	void processElement2(StreamRecord<IN2> element) throws Exception;

	/**
	 * Processes a {@link Watermark} that arrived on the first input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.api.watermark.Watermark
	 */
	void processWatermark1(Watermark mark) throws Exception;

	/**
	 * Processes a {@link Watermark} that arrived on the second input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.api.watermark.Watermark
	 */
	void processWatermark2(Watermark mark) throws Exception;

	/**
	 * Processes a {@link LatencyMarker} that arrived on the first input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
	 */
	void processLatencyMarker1(LatencyMarker latencyMarker) throws Exception;

	/**
	 * Processes a {@link LatencyMarker} that arrived on the second input of this two-input operator.
	 * This method is guaranteed to not be called concurrently with other methods of the operator.
	 *
	 * @see org.apache.flink.streaming.runtime.streamrecord.LatencyMarker
	 */
	void processLatencyMarker2(LatencyMarker latencyMarker) throws Exception;

}

Asynchronous Operators

Asynchronous operators exist to remove the latency bottleneck of interacting with external systems: requests can be in flight while earlier responses are processed, with no blocking waits. For the ordering of the callbacks, two modes are supported:

  1. Ordered mode: guarantees that output order matches input order, at the cost of extra latency and lower operator throughput; internally a queue ensures that data received first is emitted first, even when a later element gets its response earlier
  2. Unordered mode: whatever completes first is emitted first; order is not preserved, but latency is lower and throughput higher

Note that even unordered mode is not entirely unordered. Remember watermarks? Flink still guarantees that a watermark never overtakes the records that arrived before it. In effect, watermarks split the stream into groups: records are unordered within a group, but the groups themselves stay ordered.
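A sketch of wiring both modes through AsyncDataStream (the timeout and capacity values are illustrative):

import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.AsyncFunction;

public class AsyncWiring {
	static DataStream<String> wire(DataStream<String> input, AsyncFunction<String, String> fn) {
		// ordered mode: emission order == arrival order, at the cost of latency
		AsyncDataStream.orderedWait(input, fn, 1000, TimeUnit.MILLISECONDS, 100);

		// unordered mode: emit as results complete, still bounded by watermarks
		return AsyncDataStream.unorderedWait(input, fn, 1000, TimeUnit.MILLISECONDS, 100);
	}
}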


Functions

Function

User-defined functions are called UDFs for short; there are also many built-in functions. By type they fall into three broad groups:

  1. SourceFunction: reads data from external systems; its operator is the starting point of the job and has no upstream operator
  2. SinkFunction: writes data to external storage; its operator is the end point and has no downstream operator
  3. Function: processes data, so its operator has both upstream and downstream operators; for simplicity and consistency with the operator design, UDFs likewise come only in single-stream-input and dual-stream-input variants

Hierarchy

Viewed from the DataStream API, functions form three levels, from the highest to the lowest level of encapsulation:

  1. Function: stateless UDF interface; users need not care about any lower-level concept, only implement the business logic
  2. RichFunction: UDF interface + state + lifecycle; open and close can be implemented for initialization and cleanup, and get/setRuntimeContext gives access to runtime parameters, which can be very useful
  3. ProcessFunction: UDF interface + state + lifecycle + timers

Note that stateless Functions can be used without much thought, whereas stateful functions must take care of saving and restoring their intermediate results; a sketch of the RichFunction layer follows the next list.

A simple class diagram shows the differences:

  1. Keyed vs. non-keyed: a keyed function can only be applied to a KeyedStream
  2. Co vs. non-co: a co function takes two input streams
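A sketch of the RichFunction layer (keyed state plus lifecycle); it assumes the function is applied to a KeyedStream:

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class CountingFlatMap extends RichFlatMapFunction<String, Long> {

	private transient ValueState<Long> count;

	@Override
	public void open(Configuration parameters) {
		// lifecycle hook: set up keyed state before the first flatMap() call
		count = getRuntimeContext().getState(
			new ValueStateDescriptor<>("count", Long.class));
	}

	@Override
	public void flatMap(String value, Collector<Long> out) throws Exception {
		Long current = count.value(); // null on the first element of a key
		long next = (current == null ? 0L : current) + 1;
		count.update(next); // saved and restored by the checkpoint mechanism
		out.collect(next);
	}
}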

Deferred Computation

This concept is a very important part of the unified batch-stream design.

In streaming, data arrives out of order and late; to improve efficiency, computation runs in small batches instead of being triggered once per event.

Typical cases are the timers in joins, or the watermarks in windows.

Operators that support deferred computation implement the Triggerable interface, which enables both event-time and processing-time behavior.
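A KeyedProcessFunction sketch of the pattern: rather than computing per event, register an event-time timer and act once the watermark passes it (the one-second delay is illustrative, and timestamped event-time input is assumed):

import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class DeferredEmit extends KeyedProcessFunction<String, String, String> {

	@Override
	public void processElement(String value, Context ctx, Collector<String> out) {
		// defer the work: fire once the watermark passes element timestamp + 1s
		ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 1000);
	}

	@Override
	public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
		out.collect("timer fired for key " + ctx.getCurrentKey() + " at " + timestamp);
	}
}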

Broadcast Functions

Broadcast functions derive from the RichFunction interface through the AbstractRichFunction and BaseBroadcastProcessFunction abstract classes.

There are two main abstract classes, BroadcastProcessFunction and KeyedBroadcastProcessFunction; the difference is that the keyed one can only be applied to a KeyedStream.

  1. processElement: may only use the read-only ReadOnlyContext. Broadcast state requires all parallel operator instances to hold exactly the same state; if this side could modify it, the copies could diverge and cause unpredictable errors. Besides, parallel instances cannot communicate with each other, so broadcasting updates from this side is impossible by design.
  2. processBroadcastElement: may use the writable Context.
/**
 * A function to be applied to a
 * {@link org.apache.flink.streaming.api.datastream.BroadcastConnectedStream BroadcastConnectedStream} that
 * connects {@link org.apache.flink.streaming.api.datastream.BroadcastStream BroadcastStream}, i.e. a stream
 * with broadcast state, with a <b>non-keyed</b> {@link org.apache.flink.streaming.api.datastream.DataStream DataStream}.
 *
 * <p>The stream with the broadcast state can be created using the
 * {@link org.apache.flink.streaming.api.datastream.DataStream#broadcast(MapStateDescriptor[])}
 * stream.broadcast(MapStateDescriptor)} method.
 *
 * <p>The user has to implement two methods:
 * <ol>
 *     <li>the {@link #processBroadcastElement(Object, Context, Collector)} which will be applied to
 *     each element in the broadcast side
 *     <li> and the {@link #processElement(Object, ReadOnlyContext, Collector)} which will be applied to the
 *     non-broadcasted/keyed side.
 * </ol>
 *
 * <p>The {@code processElementOnBroadcastSide()} takes as argument (among others) a context that allows it to
 * read/write to the broadcast state, while the {@code processElement()} has read-only access to the broadcast state.
 *
 * @param <IN1> The input type of the non-broadcast side.
 * @param <IN2> The input type of the broadcast side.
 * @param <OUT> The output type of the operator.
 */
@PublicEvolving
public abstract class BroadcastProcessFunction<IN1, IN2, OUT> extends BaseBroadcastProcessFunction {

	private static final long serialVersionUID = 8352559162119034453L;

	/**
	 * This method is called for each element in the (non-broadcast)
	 * {@link org.apache.flink.streaming.api.datastream.DataStream data stream}.
	 *
	 * <p>This function can output zero or more elements using the {@link Collector} parameter,
	 * query the current processing/event time, and also query and update the local keyed state.
	 * Finally, it has <b>read-only</b> access to the broadcast state.
	 * The context is only valid during the invocation of this method, do not store it.
	 *
	 * @param value The stream element.
	 * @param ctx A {@link ReadOnlyContext} that allows querying the timestamp of the element,
	 *            querying the current processing/event time and updating the broadcast state.
	 *            The context is only valid during the invocation of this method, do not store it.
	 * @param out The collector to emit resulting elements to
	 * @throws Exception The function may throw exceptions which cause the streaming program
	 *                   to fail and go into recovery.
	 */
	public abstract void processElement(final IN1 value, final ReadOnlyContext ctx, final Collector<OUT> out) throws Exception;

	/**
	 * This method is called for each element in the
	 * {@link org.apache.flink.streaming.api.datastream.BroadcastStream broadcast stream}.
	 *
	 * <p>This function can output zero or more elements using the {@link Collector} parameter,
	 * query the current processing/event time, and also query and update the internal
	 * {@link org.apache.flink.api.common.state.BroadcastState broadcast state}. These can be done
	 * through the provided {@link Context}.
	 * The context is only valid during the invocation of this method, do not store it.
	 *
	 * @param value The stream element.
	 * @param ctx A {@link Context} that allows querying the timestamp of the element,
	 *            querying the current processing/event time and updating the broadcast state.
	 *            The context is only valid during the invocation of this method, do not store it.
	 * @param out The collector to emit resulting elements to
	 * @throws Exception The function may throw exceptions which cause the streaming program
	 *                   to fail and go into recovery.
	 */
	public abstract void processBroadcastElement(final IN2 value, final Context ctx, final Collector<OUT> out) throws Exception;

	/**
	 * A {@link BaseBroadcastProcessFunction.Context context} available to the broadcast side of
	 * a {@link org.apache.flink.streaming.api.datastream.BroadcastConnectedStream}.
	 */
	public abstract class Context extends BaseBroadcastProcessFunction.Context {}

	/**
	 * A {@link BaseBroadcastProcessFunction.Context context} available to the non-keyed side of
	 * a {@link org.apache.flink.streaming.api.datastream.BroadcastConnectedStream} (if any).
	 */
	public abstract class ReadOnlyContext extends BaseBroadcastProcessFunction.ReadOnlyContext {}
}
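A usage sketch wiring the two sides together (the descriptor name and the rule format are illustrative):

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastWiring {
	static void wire(DataStream<String> data, DataStream<String> ruleStream) {
		final MapStateDescriptor<String, String> rulesDesc =
			new MapStateDescriptor<>("rules", String.class, String.class);

		BroadcastStream<String> rules = ruleStream.broadcast(rulesDesc);

		data.connect(rules).process(new BroadcastProcessFunction<String, String, String>() {
			@Override
			public void processElement(String value, ReadOnlyContext ctx, Collector<String> out) throws Exception {
				// non-broadcast side: read-only view of the broadcast state
				String rule = ctx.getBroadcastState(rulesDesc).get("default");
				out.collect(rule == null ? value : rule + ":" + value);
			}

			@Override
			public void processBroadcastElement(String rule, Context ctx, Collector<String> out) throws Exception {
				// broadcast side: writable; every parallel instance applies the same update
				ctx.getBroadcastState(rulesDesc).put("default", rule);
			}
		}).print();
	}
}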

Asynchronous Functions

The RichAsyncFunction abstract class implements the AsyncFunction interface and extends AbstractRichFunction, from which it gains lifecycle management and access to the RuntimeContext.

The AsyncFunction interface defines two behaviors: the asynchronous call, which delivers its result through a ResultFuture, and timeout handling, which prevents resources from being held forever. An implementation sketch follows after the interface.

public interface AsyncFunction<IN, OUT> extends Function, Serializable {

	/**
	 * Trigger async operation for each stream input.
	 *
	 * @param input element coming from an upstream task
	 * @param resultFuture to be completed with the result data
	 * @exception Exception in case of a user code error. An exception will make the task fail and
	 * trigger fail-over process.
	 */
	void asyncInvoke(IN input, ResultFuture<OUT> resultFuture) throws Exception;

	/**
	 * {@link AsyncFunction#asyncInvoke} timeout occurred.
	 * By default, the result future is exceptionally completed with a timeout exception.
	 *
	 * @param input element coming from an upstream task
	 * @param resultFuture to be completed with the result data
	 */
	default void timeout(IN input, ResultFuture<OUT> resultFuture) throws Exception {
		resultFuture.completeExceptionally(
			new TimeoutException("Async function call has timed out."));
	}

}
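A minimal implementation sketch: run the request off-thread and complete the ResultFuture from the callback (the lookup is a stand-in for a real asynchronous client):

import java.util.Collections;
import java.util.concurrent.CompletableFuture;

import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class LookupAsyncFunction extends RichAsyncFunction<String, String> {

	@Override
	public void asyncInvoke(String input, ResultFuture<String> resultFuture) {
		// stand-in for a real asynchronous client call
		CompletableFuture
			.supplyAsync(() -> "looked-up:" + input)
			.thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
	}
}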

Source Functions

The SourceFunction interface defines only the business-side behavior of a source. In practice we usually extend RichSourceFunction or RichParallelSourceFunction; both extend AbstractRichFunction and thus gain the function lifecycle management and access to the RuntimeContext.

The difference between these two abstract classes is that they implement SourceFunction and ParallelSourceFunction respectively, which gives RichParallelSourceFunction the ability to execute in parallel.

The key behaviors are:

  1. Lifecycle: typical implementations extend AbstractRichFunction, so they take part in the lifecycle via the open, close, and cancel methods
  2. Data reading: continuous reads can be implemented against different external systems, e.g. Kafka
  3. Data emission: self-explanatory
  4. Watermark generation and emission downstream
  5. Idleness marking: if no data can be read, the task is marked idle; an IDLE status is sent downstream, which stops watermark propagation
/**
 * Base interface for all stream data sources in Flink. The contract of a stream source
 * is the following: When the source should start emitting elements, the {@link #run} method
 * is called with a {@link SourceContext} that can be used for emitting elements.
 * The run method can run for as long as necessary. The source must, however, react to an
 * invocation of {@link #cancel()} by breaking out of its main loop.
 *
 * <h3>CheckpointedFunction Sources</h3>
 *
 * <p>Sources that also implement the {@link org.apache.flink.streaming.api.checkpoint.CheckpointedFunction}
 * interface must ensure that state checkpointing, updating of internal state and emission of
 * elements are not done concurrently. This is achieved by using the provided checkpointing lock
 * object to protect update of state and emission of elements in a synchronized block.
 *
 * <p>This is the basic pattern one should follow when implementing a checkpointed source:
 *
 * <pre>{@code
 *  public class ExampleCountSource implements SourceFunction<Long>, CheckpointedFunction {
 *      private long count = 0L;
 *      private volatile boolean isRunning = true;
 *
 *      private transient ListState<Long> checkpointedCount;
 *
 *      public void run(SourceContext<T> ctx) {
 *          while (isRunning && count < 1000) {
 *              // this synchronized block ensures that state checkpointing,
 *              // internal state updates and emission of elements are an atomic operation
 *              synchronized (ctx.getCheckpointLock()) {
 *                  ctx.collect(count);
 *                  count++;
 *              }
 *          }
 *      }
 *
 *      public void cancel() {
 *          isRunning = false;
 *      }
 *
 *      public void initializeState(FunctionInitializationContext context) {
 *          this.checkpointedCount = context
 *              .getOperatorStateStore()
 *              .getListState(new ListStateDescriptor<>("count", Long.class));
 *
 *          if (context.isRestored()) {
 *              for (Long count : this.checkpointedCount.get()) {
 *                  this.count = count;
 *              }
 *          }
 *      }
 *
 *      public void snapshotState(FunctionSnapshotContext context) {
 *          this.checkpointedCount.clear();
 *          this.checkpointedCount.add(count);
 *      }
 * }
 * }</pre>
 *
 *
 * <h3>Timestamps and watermarks:</h3>
 * Sources may assign timestamps to elements and may manually emit watermarks.
 * However, these are only interpreted if the streaming program runs on
 * {@link TimeCharacteristic#EventTime}. On other time characteristics
 * ({@link TimeCharacteristic#IngestionTime} and {@link TimeCharacteristic#ProcessingTime}),
 * the watermarks from the source function are ignored.
 *
 * <h3>Gracefully Stopping Functions</h3>
 * Functions may additionally implement the {@link org.apache.flink.api.common.functions.StoppableFunction}
 * interface. "Stopping" a function, in contrast to "canceling" means a graceful exit that leaves the
 * state and the emitted elements in a consistent state.
 *
 * <p>When a source is stopped, the executing thread is not interrupted, but expected to leave the
 * {@link #run(SourceContext)} method in reasonable time on its own, preserving the atomicity
 * of state updates and element emission.
 *
 * @param <T> The type of the elements produced by this source.
 *
 * @see org.apache.flink.api.common.functions.StoppableFunction
 * @see org.apache.flink.streaming.api.TimeCharacteristic
 */
@Public
public interface SourceFunction<T> extends Function, Serializable {

	/**
	 * Starts the source. Implementations can use the {@link SourceContext} emit
	 * elements.
	 *
	 * <p>Sources that implement {@link org.apache.flink.streaming.api.checkpoint.CheckpointedFunction}
	 * must lock on the checkpoint lock (using a synchronized block) before updating internal
	 * state and emitting elements, to make both an atomic operation:
	 *
	 * <pre>{@code
	 *  public class ExampleCountSource implements SourceFunction<Long>, CheckpointedFunction {
	 *      private long count = 0L;
	 *      private volatile boolean isRunning = true;
	 *
	 *      private transient ListState<Long> checkpointedCount;
	 *
	 *      public void run(SourceContext<T> ctx) {
	 *          while (isRunning && count < 1000) {
	 *              // this synchronized block ensures that state checkpointing,
	 *              // internal state updates and emission of elements are an atomic operation
	 *              synchronized (ctx.getCheckpointLock()) {
	 *                  ctx.collect(count);
	 *                  count++;
	 *              }
	 *          }
	 *      }
	 *
	 *      public void cancel() {
	 *          isRunning = false;
	 *      }
	 *
	 *      public void initializeState(FunctionInitializationContext context) {
	 *          this.checkpointedCount = context
	 *              .getOperatorStateStore()
	 *              .getListState(new ListStateDescriptor<>("count", Long.class));
	 *
	 *          if (context.isRestored()) {
	 *              for (Long count : this.checkpointedCount.get()) {
	 *                  this.count = count;
	 *              }
	 *          }
	 *      }
	 *
	 *      public void snapshotState(FunctionSnapshotContext context) {
	 *          this.checkpointedCount.clear();
	 *          this.checkpointedCount.add(count);
	 *      }
	 * }
	 * }</pre>
	 *
	 * @param ctx The context to emit elements to and for accessing locks.
	 */
	void run(SourceContext<T> ctx) throws Exception;

	/**
	 * Cancels the source. Most sources will have a while loop inside the
	 * {@link #run(SourceContext)} method. The implementation needs to ensure that the
	 * source will break out of that loop after this method is called.
	 *
	 * <p>A typical pattern is to have an {@code "volatile boolean isRunning"} flag that is set to
	 * {@code false} in this method. That flag is checked in the loop condition.
	 *
	 * <p>When a source is canceled, the executing thread will also be interrupted
	 * (via {@link Thread#interrupt()}). The interruption happens strictly after this
	 * method has been called, so any interruption handler can rely on the fact that
	 * this method has completed. It is good practice to make any flags altered by
	 * this method "volatile", in order to guarantee the visibility of the effects of
	 * this method to any interruption handler.
	 */
	void cancel();

	// ------------------------------------------------------------------------
	//  source context
	// ------------------------------------------------------------------------

	/**
	 * Interface that source functions use to emit elements, and possibly watermarks.
	 *
	 * @param <T> The type of the elements produced by the source.
	 */
	@Public // Interface might be extended in the future with additional methods.
	interface SourceContext<T> {

		/**
		 * Emits one element from the source, without attaching a timestamp. In most cases,
		 * this is the default way of emitting elements.
		 *
		 * <p>The timestamp that the element will get assigned depends on the time characteristic of
		 * the streaming program:
		 * <ul>
		 *     <li>On {@link TimeCharacteristic#ProcessingTime}, the element has no timestamp.</li>
		 *     <li>On {@link TimeCharacteristic#IngestionTime}, the element gets the system's
		 *         current time as the timestamp.</li>
		 *     <li>On {@link TimeCharacteristic#EventTime}, the element will have no timestamp initially.
		 *         It needs to get a timestamp (via a {@link TimestampAssigner}) before any time-dependent
		 *         operation (like time windows).</li>
		 * </ul>
		 *
		 * @param element The element to emit
		 */
		void collect(T element);

		/**
		 * Emits one element from the source, and attaches the given timestamp. This method
		 * is relevant for programs using {@link TimeCharacteristic#EventTime}, where the
		 * sources assign timestamps themselves, rather than relying on a {@link TimestampAssigner}
		 * on the stream.
		 *
		 * <p>On certain time characteristics, this timestamp may be ignored or overwritten.
		 * This allows programs to switch between the different time characteristics and behaviors
		 * without changing the code of the source functions.
		 * <ul>
		 *     <li>On {@link TimeCharacteristic#ProcessingTime}, the timestamp will be ignored,
		 *         because processing time never works with element timestamps.</li>
		 *     <li>On {@link TimeCharacteristic#IngestionTime}, the timestamp is overwritten with the
		 *         system's current time, to realize proper ingestion time semantics.</li>
		 *     <li>On {@link TimeCharacteristic#EventTime}, the timestamp will be used.</li>
		 * </ul>
		 *
		 * @param element The element to emit
		 * @param timestamp The timestamp in milliseconds since the Epoch
		 */
		@PublicEvolving
		void collectWithTimestamp(T element, long timestamp);

		/**
		 * Emits the given {@link Watermark}. A Watermark of value {@code t} declares that no
		 * elements with a timestamp {@code t' <= t} will occur any more. If further such
		 * elements will be emitted, those elements are considered <i>late</i>.
		 *
		 * <p>This method is only relevant when running on {@link TimeCharacteristic#EventTime}.
		 * On {@link TimeCharacteristic#ProcessingTime},Watermarks will be ignored. On
		 * {@link TimeCharacteristic#IngestionTime}, the Watermarks will be replaced by the
		 * automatic ingestion time watermarks.
		 *
		 * @param mark The Watermark to emit
		 */
		@PublicEvolving
		void emitWatermark(Watermark mark);

		/**
		 * Marks the source to be temporarily idle. This tells the system that this source will
		 * temporarily stop emitting records and watermarks for an indefinite amount of time. This
		 * is only relevant when running on {@link TimeCharacteristic#IngestionTime} and
		 * {@link TimeCharacteristic#EventTime}, allowing downstream tasks to advance their
		 * watermarks without the need to wait for watermarks from this source while it is idle.
		 *
		 * <p>Source functions should make a best effort to call this method as soon as they
		 * acknowledge themselves to be idle. The system will consider the source to resume activity
		 * again once {@link SourceContext#collect(T)}, {@link SourceContext#collectWithTimestamp(T, long)},
		 * or {@link SourceContext#emitWatermark(Watermark)} is called to emit elements or watermarks from the source.
		 */
		@PublicEvolving
		void markAsTemporarilyIdle();

		/**
		 * Returns the checkpoint lock. Please refer to the class-level comment in
		 * {@link SourceFunction} for details about how to write a consistent checkpointed
		 * source.
		 *
		 * @return The object to use as the lock
		 */
		Object getCheckpointLock();

		/**
		 * This method is called by the system to shut down the context.
		 */
		void close();
	}
}

About the SourceContext inside SourceFunction: the StreamSourceContexts class distinguishes two broad families of SourceContext (items 3 and 4 below are the concrete subclasses of the second):

  1. NonTimestampContext: no time; sets every element's timestamp to -1, which means no watermark is ever sent downstream
  2. WatermarkContext: with time; defines the watermark-related behavior:
    1. maintains the current StreamStatus and propagates it downstream
    2. idleness detection: if neither data nor a watermark has arrived within the configured interval, the task is marked idle
  3. AutomaticWatermarkContext: with ingestion time, watermarks are generated automatically; the mechanism is a timer (WatermarkEmittingTask) that fires at job start timestamp + n * watermark interval and keeps emitting watermarks downstream
  4. ManualWatermarkContext: with event time, no watermarks are generated; the watermarks handed to the context are passed through downstream

		private AutomaticWatermarkContext(...) {
			// ... (constructor excerpt: registers the first watermark timer)
			long now = this.timeService.getCurrentProcessingTime();
			this.nextWatermarkTimer = this.timeService.registerTimer(now + watermarkInterval,
				new WatermarkEmittingTask(this.timeService, checkpointLock, output));
		}



		private class WatermarkEmittingTask implements ProcessingTimeCallback {

			private final ProcessingTimeService timeService;
			private final Object lock;
			private final Output<StreamRecord<T>> output;

			private WatermarkEmittingTask(
					ProcessingTimeService timeService,
					Object checkpointLock,
					Output<StreamRecord<T>> output) {
				this.timeService = timeService;
				this.lock = checkpointLock;
				this.output = output;
			}

			@Override
			public void onProcessingTime(long timestamp) {
				final long currentTime = timeService.getCurrentProcessingTime();

				synchronized (lock) {
					// we should continue to automatically emit watermarks if we are active
					if (streamStatusMaintainer.getStreamStatus().isActive()) {
						if (idleTimeout != -1 && currentTime - lastRecordTime > idleTimeout) {
							// if we are configured to detect idleness, piggy-back the idle detection check on the
							// watermark interval, so that we may possibly discover idle sources faster before waiting
							// for the next idle check to fire
							markAsTemporarilyIdle();

							// no need to finish the next check, as we are now idle.
							cancelNextIdleDetectionTask();
						} else if (currentTime > nextWatermarkTime) {
							// align the watermarks across all machines. this will ensure that we
							// don't have watermarks that creep along at different intervals because
							// the machine clocks are out of sync
							final long watermarkTime = currentTime - (currentTime % watermarkInterval);

							output.emitWatermark(new Watermark(watermarkTime));
							nextWatermarkTime = watermarkTime + watermarkInterval;
						}
					}
				}

				long nextWatermark = currentTime + watermarkInterval;
				nextWatermarkTimer = this.timeService.registerTimer(
						nextWatermark, new WatermarkEmittingTask(this.timeService, lock, output));
			}
		}
	}

Sink Functions

SinkFunction is a pure data-output function with no lifecycle management of its own; the lifecycle is provided by AbstractRichFunction.

When implementing a sink, we almost always extend RichSinkFunction or TwoPhaseCommitSinkFunction. TwoPhaseCommitSinkFunction is the key to Flink's exactly-once semantics: it provides a framework-level exactly-once implementation and integrates with the checkpoint mechanism. A minimal sink sketch follows below.
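A minimal RichSinkFunction sketch (printing stands in for a real external-system client that would be managed in the lifecycle hooks):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class LoggingSink extends RichSinkFunction<String> {

	@Override
	public void open(Configuration parameters) {
		// lifecycle from AbstractRichFunction: open client connections here
	}

	@Override
	public void invoke(String value, Context context) {
		// per-record output; for exactly-once, prefer TwoPhaseCommitSinkFunction
		System.out.println(value);
	}

	@Override
	public void close() {
		// release client connections here
	}
}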

Checkpoint Functions

Checkpoint functions are responsible for function-level state saving and restoring; we generally implement the CheckpointedFunction or ListCheckpointed interface to define the snapshot backup and restore behavior.

CheckpointedFunction: snapshotState() is called whenever a checkpoint snapshot is requested and can save state to external storage; on recovery, initializeState() initializes the state and runs the logic that restores it from the previous checkpoint. An implementation sketch follows after the interface.

public interface CheckpointedFunction {

	/**
	 * This method is called when a snapshot for a checkpoint is requested. This acts as a hook to the function to
	 * ensure that all state is exposed by means previously offered through {@link FunctionInitializationContext} when
	 * the Function was initialized, or offered now by {@link FunctionSnapshotContext} itself.
	 *
	 * @param context the context for drawing a snapshot of the operator
	 * @throws Exception
	 */
	void snapshotState(FunctionSnapshotContext context) throws Exception;

	/**
	 * This method is called when the parallel function instance is created during distributed
	 * execution. Functions typically set up their state storing data structures in this method.
	 *
	 * @param context the context for initializing the operator
	 * @throws Exception
	 */
	void initializeState(FunctionInitializationContext context) throws Exception;

}
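A sketch of the pattern on a buffering sink, along the lines of the ExampleCountSource shown earlier: snapshotState() copies the in-memory buffer into operator ListState, and initializeState() restores it:

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class BufferingSink implements SinkFunction<String>, CheckpointedFunction {

	private final List<String> buffer = new ArrayList<>();
	private transient ListState<String> checkpointed;

	@Override
	public void invoke(String value, Context context) {
		buffer.add(value); // flush to the external system once large enough
	}

	@Override
	public void snapshotState(FunctionSnapshotContext context) throws Exception {
		checkpointed.clear();
		for (String v : buffer) {
			checkpointed.add(v);
		}
	}

	@Override
	public void initializeState(FunctionInitializationContext context) throws Exception {
		checkpointed = context.getOperatorStateStore()
			.getListState(new ListStateDescriptor<>("buffer", String.class));
		if (context.isRestored()) {
			for (String v : checkpointed.get()) {
				buffer.add(v);
			}
		}
	}
}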

ListCheckpointed is more powerful still: it supports redistributing state when the job's parallelism changes.

public interface ListCheckpointed<T extends Serializable> {

	/**
	 * Gets the current state of the function. The state must reflect the result of all prior
	 * invocations to this function.
	 *
	 * <p>The returned list should contain one entry for redistributable unit of state. See
	 * the {@link ListCheckpointed class docs} for an illustration how list-style state
	 * redistribution works.
	 *
	 * <p>As special case, the returned list may be null or empty (if the operator has no state)
	 * or it may contain a single element (if the operator state is indivisible).
	 *
	 * @param checkpointId The ID of the checkpoint - a unique and monotonously increasing value.
	 * @param timestamp The wall clock timestamp when the checkpoint was triggered by the master.
	 *
	 * @return The operator state in a list of redistributable, atomic sub-states.
	 *         Should not return null, but empty list instead.
	 *
	 * @throws Exception Thrown if the creation of the state object failed. This causes the
	 *                   checkpoint to fail. The system may decide to fail the operation (and trigger
	 *                   recovery), or to discard this checkpoint attempt and to continue running
	 *                   and to try again with the next checkpoint attempt.
	 */
	List<T> snapshotState(long checkpointId, long timestamp) throws Exception;

	/**
	 * Restores the state of the function or operator to that of a previous checkpoint.
	 * This method is invoked when the function is executed after a failure recovery.
	 * The state list may be empty if no state is to be recovered by the particular parallel instance
	 * of the function.
	 *
	 * <p>The given state list will contain all the <i>sub states</i> that this parallel
	 * instance of the function needs to handle. Refer to the  {@link ListCheckpointed class docs}
	 * for an illustration how list-style state redistribution works.
	 *
	 * <p><b>Important:</b> When implementing this interface together with {@link RichFunction},
	 * then the {@code restoreState()} method is called before {@link RichFunction#open(Configuration)}.
	 *
	 * @param state The state to be restored as a list of atomic sub-states.
	 *
	 * @throws Exception Throwing an exception in this method causes the recovery to fail.
	 *                   The exact consequence depends on the configured failure handling strategy,
	 *                   but typically the system will re-attempt the recovery, or try recovering
	 *                   from a different checkpoint.
	 */
	void restoreState(List<T> state) throws Exception;
}

Data Partitioning

Partition

As a streaming framework, Flink has distributed computation at its very core. Simply put, a job is split into subtasks, and different data is handed to different tasks, so that each task computes over one share of the data.

StreamPartitioner is the abstract interface for stream data partitioning; its behavior determines how data is distributed.

ChannelSelector is the key to load balancing: every partitioner implements it, and its behavior determines the load-balancing pattern.

The selectChannels method knows the number of downstream channels; that number is fixed for the lifetime of a job, unless the parallelism is changed.

/**
 * The {@link ChannelSelector} determines to which logical channels a record
 * should be written to.
 *
 * @param <T> the type of record which is sent through the attached output gate
 */
public interface ChannelSelector<T extends IOReadableWritable> {

	/**
	 * Returns the logical channel indexes, to which the given record should be
	 * written.
	 *
	 * @param record      the record to the determine the output channels for
	 * @param numChannels the total number of output channels which are attached to respective output gate
	 * @return a (possibly empty) array of integer numbers which indicate the indices of the output channels through
	 * which the record shall be forwarded
	 */
	int[] selectChannels(T record, int numChannels);
}

Commonly used partitioning strategies (see the sketch after this list):

  1. partitionCustom: custom partitioning on a DataStream, selecting a target partition for each element; produces a new DataStream
  2. ForwardPartitioner: forwards records from the upstream operator directly to the downstream operator; produces a new DataStream
  3. ShufflePartitioner: random selection
  4. RebalancePartitioner: sends data downstream round-robin, avoiding data skew
  5. RescalePartitioner: partitions according to the number of upstream and downstream tasks
  6. BroadcastPartitioner: broadcast mode
  7. KeyGroupStreamPartitioner: partitions a KeyedStream by the key's group
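A sketch of selecting these strategies through the DataStream API (the custom partitioner and the identity key selector are illustrative):

import org.apache.flink.streaming.api.datastream.DataStream;

public class PartitionChoices {
	static void demo(DataStream<String> s) {
		s.shuffle();     // ShufflePartitioner: random channel
		s.rebalance();   // RebalancePartitioner: round-robin over all channels
		s.rescale();     // RescalePartitioner: round-robin within local groups
		s.broadcast();   // BroadcastPartitioner: replicate to every channel
		s.keyBy(v -> v); // KeyGroupStreamPartitioner: hash(key) -> key group
		// partitionCustom: a user Partitioner picks exactly one channel per element
		s.partitionCustom(
			(key, numPartitions) -> Math.floorMod(key.hashCode(), numPartitions),
			v -> v);
	}
}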


Connectors

Not much to say here: connectors simply integrate with external data systems.

Distributed IDs

To pass data across the network, a distributed framework must generate identifiers for all kinds of objects. AbstractID is the base class:

/**
 * A statistically unique identification number.
 */
@PublicEvolving
public class AbstractID implements Comparable<AbstractID>, java.io.Serializable {

	private static final long serialVersionUID = 1L;

	private static final Random RND = new Random();

	/** The size of a long in bytes. */
	private static final int SIZE_OF_LONG = 8;

	/** The size of the ID in byte. */
	public static final int SIZE = 2 * SIZE_OF_LONG;

	// ------------------------------------------------------------------------

	/** The upper part of the actual ID. */
	protected final long upperPart;

	/** The lower part of the actual ID. */
	protected final long lowerPart;

	/** The memoized value returned by toString(). */
	private transient String toString;

	// --------------------------------------------------------------------------------------------

	/**
	 * Constructs a new ID with a specific bytes value.
	 */
	public AbstractID(byte[] bytes) {
		if (bytes == null || bytes.length != SIZE) {
			throw new IllegalArgumentException("Argument bytes must by an array of " + SIZE + " bytes");
		}

		this.lowerPart = byteArrayToLong(bytes, 0);
		this.upperPart = byteArrayToLong(bytes, SIZE_OF_LONG);
	}

	/**
	 * Constructs a new abstract ID.
	 *
	 * @param lowerPart the lower bytes of the ID
	 * @param upperPart the higher bytes of the ID
	 */
	public AbstractID(long lowerPart, long upperPart) {
		this.lowerPart = lowerPart;
		this.upperPart = upperPart;
	}

	/**
	 * Copy constructor: Creates a new abstract ID from the given one.
	 *
	 * @param id the abstract ID to copy
	 */
	public AbstractID(AbstractID id) {
		if (id == null) {
			throw new IllegalArgumentException("Id must not be null.");
		}
		this.lowerPart = id.lowerPart;
		this.upperPart = id.upperPart;
	}

	/**
	 * Constructs a new random ID from a uniform distribution.
	 */
	public AbstractID() {
		this.lowerPart = RND.nextLong();
		this.upperPart = RND.nextLong();
	}

	// --------------------------------------------------------------------------------------------

	/**
	 * Gets the lower 64 bits of the ID.
	 *
	 * @return The lower 64 bits of the ID.
	 */
	public long getLowerPart() {
		return lowerPart;
	}

	/**
	 * Gets the upper 64 bits of the ID.
	 *
	 * @return The upper 64 bits of the ID.
	 */
	public long getUpperPart() {
		return upperPart;
	}

	/**
	 * Gets the bytes underlying this ID.
	 *
	 * @return The bytes underlying this ID.
	 */
	public byte[] getBytes() {
		byte[] bytes = new byte[SIZE];
		longToByteArray(lowerPart, bytes, 0);
		longToByteArray(upperPart, bytes, SIZE_OF_LONG);
		return bytes;
	}

	// --------------------------------------------------------------------------------------------
	//  Standard Utilities
	// --------------------------------------------------------------------------------------------

	@Override
	public boolean equals(Object obj) {
		if (obj == this) {
			return true;
		} else if (obj != null && obj.getClass() == getClass()) {
			AbstractID that = (AbstractID) obj;
			return that.lowerPart == this.lowerPart && that.upperPart == this.upperPart;
		} else {
			return false;
		}
	}

	@Override
	public int hashCode() {
		return ((int)  this.lowerPart) ^
				((int) (this.lowerPart >>> 32)) ^
				((int)  this.upperPart) ^
				((int) (this.upperPart >>> 32));
	}

	@Override
	public String toString() {
		if (this.toString == null) {
			final byte[] ba = new byte[SIZE];
			longToByteArray(this.lowerPart, ba, 0);
			longToByteArray(this.upperPart, ba, SIZE_OF_LONG);

			this.toString = StringUtils.byteToHexString(ba);
		}

		return this.toString;
	}

	@Override
	public int compareTo(AbstractID o) {
		int diff1 = Long.compare(this.upperPart, o.upperPart);
		int diff2 = Long.compare(this.lowerPart, o.lowerPart);
		return diff1 == 0 ? diff2 : diff1;
	}

	// --------------------------------------------------------------------------------------------
	//  Conversion Utilities
	// --------------------------------------------------------------------------------------------

	/**
	 * Converts the given byte array to a long.
	 *
	 * @param ba the byte array to be converted
	 * @param offset the offset indicating at which byte inside the array the conversion shall begin
	 * @return the long variable
	 */
	private static long byteArrayToLong(byte[] ba, int offset) {
		long l = 0;

		for (int i = 0; i < SIZE_OF_LONG; ++i) {
			l |= (ba[offset + SIZE_OF_LONG - 1 - i] & 0xffL) << (i << 3);
		}

		return l;
	}

	/**
	 * Converts a long to a byte array.
	 *
	 * @param l the long variable to be converted
	 * @param ba the byte array to store the result the of the conversion
	 * @param offset offset indicating at what position inside the byte array the result of the conversion shall be stored
	 */
	private static void longToByteArray(long l, byte[] ba, int offset) {
		for (int i = 0; i < SIZE_OF_LONG; ++i) {
			final int shift = i << 3; // i * 8
			ba[offset + SIZE_OF_LONG - 1 - i] = (byte) ((l & (0xffL << shift)) >>> shift);
		}
	}
}


Summary

  • Abstractions facing developers
  • Abstractions of the core runtime
    • data stream and operation abstractions
    • data transformation abstractions
    • operator, function, and data partitioning abstractions
    • data I/O abstractions