Flink

Kamin_Wu

Article link: https://blog.csdn.net/anantie/article/details/114507262

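Introduction to Flink

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink is designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale.

Official site: https://flink.apache.org/
Source code: https://github.com/apache/flink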

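Flink features

Stream processing features
(1) High-throughput, low-latency, high-performance stream processing
(2) Window operations based on event time
(3) Exactly-once semantics for stateful computations
(4) Highly flexible window operations: time-based, count-based, session-based, and data-driven windows
(5) A continuous streaming model with backpressure
(6) Fault tolerance based on lightweight distributed snapshots
(7) A single runtime that supports both batch-on-streaming and streaming processing
(8) Flink's own memory management inside the JVM
(9) Iterative computation
(10) Automatic program optimization: expensive operations such as shuffles and sorts are avoided in certain cases, and intermediate results are cached where necessary
API support
(1) DataStream API for streaming applications
(2) DataSet API for batch applications (Java/Scala)
Library support
Machine learning (FlinkML), graph analysis (Gelly), relational data processing (Table), and complex event processing (CEP)
Integration support
Flink on YARN, HDFS, input data from Kafka, Apache HBase, Hadoop programs, Tachyon, ElasticSearch, RabbitMQ, Apache Storm, S3, and XtreemFS.
Deploy applications anywhere
Apache Flink is a distributed system and needs compute resources to run applications. It integrates with all common cluster resource managers (such as Hadoop YARN, Apache Mesos, and Kubernetes) and can also be set up as a standalone cluster.
Run applications at any scale
Flink is designed to run stateful streaming applications at any scale. An application can be parallelized into thousands of tasks that are distributed across a cluster and executed concurrently, so it can use a virtually unlimited amount of CPU, main memory, disk, and network IO. Flink also makes it easy to maintain very large application state. Its asynchronous, incremental checkpointing algorithm keeps the impact on processing latency minimal while guaranteeing exactly-once state consistency.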

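Storm, Spark, and Flink compared

Throughput

Spark computes in micro-batches and is well optimized, so its throughput is the highest of the three. Note, however, that stateful computations (for example the updateStateByKey operator) maintain state through additional RDDs, which is costly and reduces throughput considerably.

Storm's fault-tolerance mechanism acknowledges (acks) every record, so its overhead hits throughput hard; throughput can drop by as much as 70%. Storm Trident is implemented on micro-batches and has moderate throughput.

Flink's fault-tolerance mechanism is lightweight and has little effect on throughput. Together with optimizations in its job graph and scheduling, this lets Flink reach very high throughput.

The figure below, from the Flink website, compares Storm and Flink. Storm's throughput drops sharply once ack-based fault tolerance is enabled, whereas Flink's throughput changes little whether checkpointing is on or off, which indicates that Flink's fault tolerance is indeed cheap.
[Figure: Storm vs. Flink throughput, with fault tolerance on and off]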

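Latency

Spark is built on micro-batches, which raises throughput at the cost of latency. Spark's latency is typically on the order of seconds.

Storm is a native streaming implementation and easily reaches latencies of a few tens of milliseconds, the lowest among these frameworks. Storm Trident is implemented on micro-batches and has higher latency.

Flink is also a native streaming implementation and can likewise reach latencies on the order of a hundred milliseconds.

The figure below is the latency benchmark against Storm published on the Flink website. Storm achieves an average latency under 5 ms, and Flink's average latency is under 30 ms. Both process 99% of records within 55 ms, which is excellent.
[Figure: Storm vs. Flink latency benchmark]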

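Monitoring approach

Cluster monitoring

Process existence monitoring

Flink processes are divided into the JobManager (StandaloneSessionClusterEntrypoint) and the TaskManager. A script can check whether each process is running.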
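One way to script the existence check is to look for the Flink main classes in the output of `jps`. The sketch below is a minimal Python version under stated assumptions: `TaskManagerRunner` as the TaskManager main class and the use of `jps -l` are not from the article, and the helper names are hypothetical.

```python
import subprocess

# Assumed main-class names for a standalone session cluster.
EXPECTED = {
    "JobManager": "StandaloneSessionClusterEntrypoint",
    "TaskManager": "TaskManagerRunner",
}

def missing_processes(jps_output, expected=EXPECTED):
    """Return the roles whose main class does not appear in `jps` output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:  # "<pid> <main class>"; skip lines without a class name
            running.add(parts[1].strip())
    return sorted(
        role for role, cls in expected.items()
        if not any(name.endswith(cls) for name in running)
    )

def check_local_host():
    """Run `jps -l` on this host and report the missing Flink processes."""
    out = subprocess.run(
        ["jps", "-l"], capture_output=True, text=True, check=True
    ).stdout
    return missing_processes(out)
```

A cron job could call `check_local_host()` on every node and raise an alert whenever the returned list is not empty.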

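Cluster process performance monitoring

Flink officially provides a Prometheus-based monitoring solution. To enable it, add the following settings to flink/conf/flink-conf.yaml:

# Expose metrics through the PrometheusReporter class
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
# Port on which metrics are exposed; the default is 9249, and a port range may be given
metrics.reporter.prom.port: 9249-9250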

Add the following to the scrape configuration in the Prometheus yml file:

# List each Flink process host with its metrics port; add labels as shown
- targets: ['30.0.0.20:9249','30.0.0.21:9249']
  labels:
    clusterName: 'FlinkCluster001'
    job: 'flink'
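Before wiring up Prometheus, it can help to read one scrape by hand and confirm that the reporter is serving data. The following is a minimal Python sketch, assuming the endpoint returns the Prometheus text format without timestamps; the function names and the metric name in the usage note are illustrative assumptions.

```python
from urllib.request import urlopen

def parse_samples(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples.

    Assumes no trailing timestamps; labels are kept as the raw string.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        head, _, value = line.rpartition(" ")
        name, _, labels = head.partition("{")
        samples.append((name, labels.rstrip("}"), float(value)))
    return samples

def fetch_samples(url, timeout=5):
    """Download one scrape from a reporter endpoint and parse it."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))
```

For example, `fetch_samples("http://30.0.0.20:9249/metrics")` could be filtered for a name such as `flink_jobmanager_Status_JVM_CPU_Load`, assuming the reporter uses that naming pattern.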

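Job monitoring

Job existence monitoring

Flink jobs are either batch jobs or streaming jobs. The bin/flink list command shows the jobs that are currently running.

A streaming job stays in the running state, so a script can call bin/flink list and check whether the job is present.
A batch job exits when it finishes, after which bin/flink list no longer shows it. This can be handled at the scheduling level: launch batch jobs from a scheduler program instead of crontab, so that the CliFrontend process that submits the batch job to Flink stays alive. Monitoring the CliFrontend process then serves to monitor the batch job.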
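The streaming check can be sketched as a small parser over the text printed by bin/flink list. This Python sketch assumes each job line has the form `<date> <time> : <32-hex job id> : <name> (<STATE>)`; that layout and the job names in the usage note are assumptions, so adjust the pattern to the output of your Flink version.

```python
import re

# Assumed line layout: "07.03.2021 10:00:00 : <job id> : <name> (RUNNING)"
JOB_LINE = re.compile(r"^\S+ \S+ : ([0-9a-f]{32}) : (.+) \((\w+)\)\s*$")

def running_job_names(list_output):
    """Return the names of jobs reported as RUNNING."""
    names = set()
    for line in list_output.splitlines():
        match = JOB_LINE.match(line.strip())
        if match and match.group(3) == "RUNNING":
            names.add(match.group(2))
    return names

def missing_jobs(list_output, expected_names):
    """Return the expected job names that are not currently running."""
    return sorted(set(expected_names) - running_job_names(list_output))
```

A monitoring script would capture the command output, call `missing_jobs(output, ["OrderStream"])`, and alert on a non-empty result.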

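Business monitoring

As a data processing engine, Flink runs jobs whose function depends on data input and output. Depending on each job's business logic, monitor its input and output data volumes.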

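Metrics overview

Metric types

Flink supports four metric types: Counters, Gauges, Histograms, and Meters.

Counter

A Counter is used to count something.

Gauge

A Gauge provides a value of any type on demand.

Histogram

A Histogram measures the distribution of long values.

Meter

A Meter measures average throughput.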
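To make the four types concrete, here is a small conceptual model in Python. It is a sketch of the semantics described above, not Flink's metric API; the class and method names are illustrative assumptions.

```python
import time

class Counter:
    """Counts events."""
    def __init__(self):
        self.count = 0
    def inc(self, n=1):
        self.count += n

class Gauge:
    """Returns a value of any type, computed on demand."""
    def __init__(self, supplier):
        self._supplier = supplier
    def value(self):
        return self._supplier()

class Histogram:
    """Tracks the distribution of integer (long) values."""
    def __init__(self):
        self._values = []
    def update(self, value):
        self._values.append(int(value))
    def stats(self):
        if not self._values:
            return None
        return {
            "min": min(self._values),
            "max": max(self._values),
            "mean": sum(self._values) / len(self._values),
        }

class Meter:
    """Measures average throughput (events per second) since creation."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.count = 0
    def mark(self, n=1):
        self.count += n
    def rate(self):
        elapsed = self._clock() - self._start
        return self.count / elapsed if elapsed > 0 else 0.0
```

The injectable `clock` exists only so the Meter can be exercised deterministically in tests.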

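Metric scope

When a metric is reported, it is tagged with an identifier and a set of key-value pairs.

The identifier has three components: the user-defined name given when the metric is registered, an optional user-defined scope, and a system-provided scope. For example, if A.B is the system scope, C.D the user scope, and E the name, the metric's identifier is A.B.C.D.E.

The delimiter used in the identifier can be changed with the metrics.scope.delimiter option in conf/flink-conf.yaml; the default is a period (.).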
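The composition rule can be written out in a few lines. This Python sketch only mirrors the A.B.C.D.E example; `metric_identifier` is a hypothetical helper, not a Flink function.

```python
def metric_identifier(system_scope, user_scope, name, delimiter="."):
    """Join system scope, optional user scope, and metric name into one identifier."""
    parts = list(system_scope) + list(user_scope) + [name]
    return delimiter.join(parts)
```

With the default delimiter, `metric_identifier(["A", "B"], ["C", "D"], "E")` yields `A.B.C.D.E`; passing `delimiter="_"` models a changed metrics.scope.delimiter setting.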

Metric list

CPU

Scope	Infix	Metrics	Description	Type
Job-/TaskManager	Status.JVM.CPU	Load	The current CPU usage of the JVM.	Gauge
Job-/TaskManager	Status.JVM.CPU	Time	The CPU time used by the JVM.	Gauge

Memory

Scope	Infix	Metrics	Description	Type
Job-/TaskManager	Status.JVM.Memory	Heap.Used	The amount of heap memory currently used (in bytes).	Gauge
		Heap.Committed	The amount of heap memory guaranteed to be available to the JVM (in bytes).	Gauge
		Heap.Max	The maximum amount of heap memory that can be used for memory management (in bytes).	Gauge
		NonHeap.Used	The amount of non-heap memory currently used (in bytes).	Gauge
		NonHeap.Committed	The amount of non-heap memory guaranteed to be available to the JVM (in bytes).	Gauge
		NonHeap.Max	The maximum amount of non-heap memory that can be used for memory management (in bytes).	Gauge
		Direct.Count	The number of buffers in the direct buffer pool.	Gauge
		Direct.MemoryUsed	The amount of memory used by the JVM for the direct buffer pool (in bytes).	Gauge
		Direct.TotalCapacity	The total capacity of all buffers in the direct buffer pool (in bytes).	Gauge
		Mapped.Count	The number of buffers in the mapped buffer pool.	Gauge
		Mapped.MemoryUsed	The amount of memory used by the JVM for the mapped buffer pool (in bytes).	Gauge
		Mapped.TotalCapacity	The total capacity of all buffers in the mapped buffer pool (in bytes).	Gauge

Notes:

For the difference between used, committed, and max heap, see:
https://www.baeldung.com/java-heap-used-committed-max
For an introduction to the direct buffer pool and the mapped buffer pool, see: https://stackoverflow.com/questions/15657837/what-is-mapped-buffer-pool-direct-buffer-pool-and-how-to-increase-their-size
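The heap gauges are easier to alert on as a ratio. The Python sketch below divides Heap.Used by Heap.Max and falls back to Heap.Committed when the maximum is undefined, which the JVM reports as -1. The fallback policy and the function name are assumptions, not something the article prescribes.

```python
def heap_usage_ratio(used, committed, maximum):
    """Return used heap as a fraction of the limit, or None if no limit is known."""
    # Fall back to committed when the JVM reports no maximum (-1).
    limit = maximum if maximum > 0 else committed
    if limit <= 0:
        return None
    return used / limit
```

An alert rule might fire when this ratio stays above a chosen threshold, for example 0.9, across several scrapes.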

Threads

Scope	Infix	Metrics	Description	Type
Job-/TaskManager	Status.JVM.Threads	Count	The total number of live threads.	Gauge

Garbage collection

Scope	Infix	Metrics	Description	Type
Job-/TaskManager	Status.JVM.GarbageCollector	<GarbageCollector>.Count	The total number of collections that have occurred.	Gauge
Job-/TaskManager	Status.JVM.GarbageCollector	<GarbageCollector>.Time	The total time spent performing garbage collection.	Gauge

Class loading (ClassLoader)

Scope	Infix	Metrics	Description	Type
Job-/TaskManager	Status.JVM.ClassLoader	ClassesLoaded	The total number of classes loaded since the start of the JVM.	Gauge
Job-/TaskManager	Status.JVM.ClassLoader	ClassesUnloaded	The total number of classes unloaded since the start of the JVM.	Gauge

Network

Scope	Infix	Metrics	Description	Type
TaskManager	Status.Network	AvailableMemorySegments	The number of unused memory segments.	Gauge
TaskManager	Status.Network	TotalMemorySegments	The number of allocated memory segments.	Gauge
Task	buffers	inputQueueLength	The number of queued input buffers. (ignores LocalInputChannels which are using blocking subpartitions)	Gauge
		outputQueueLength	The number of queued output buffers.	Gauge
		inPoolUsage	An estimate of the input buffers usage. (ignores LocalInputChannels)	Gauge
		inputFloatingBuffersUsage	An estimate of the floating input buffers usage, dedicated for credit-based mode. (ignores LocalInputChannels)	Gauge
		inputExclusiveBuffersUsage	An estimate of the exclusive input buffers usage, dedicated for credit-based mode. (ignores LocalInputChannels)	Gauge
		outPoolUsage	An estimate of the output buffers usage.	Gauge
	Network.<Input\|Output>.<gate\|partition> (only available if `taskmanager.net.detailed-metrics` config option is set)	totalQueueLen	Total number of queued buffers in all input/output channels.	Gauge
		minQueueLen	Minimum number of queued buffers in all input/output channels.	Gauge
		maxQueueLen	Maximum number of queued buffers in all input/output channels.	Gauge
		avgQueueLen	Average number of queued buffers in all input/output channels.	Gauge

Note:
For Flink's memory management mechanism, see: https://blog.csdn.net/lvwenyuan_1/article/details/103404591
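AvailableMemorySegments and TotalMemorySegments together give the share of network memory segments in use on a TaskManager. A minimal Python sketch of that calculation follows; the function name is hypothetical and any alert threshold is left to the reader.

```python
def network_segment_usage(available_segments, total_segments):
    """Return the fraction of network memory segments currently in use."""
    if total_segments <= 0:
        return None  # nothing allocated, so usage is undefined
    return (total_segments - available_segments) / total_segments
```

A value close to 1.0 means few unused segments remain, which is worth correlating with the inPoolUsage and outPoolUsage gauges of individual tasks.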

Default shuffle service

Scope	Infix	Metrics	Description	Type
TaskManager	Status.Shuffle.Netty	AvailableMemorySegments	The number of unused memory segments.	Gauge
TaskManager	Status.Shuffle.Netty	TotalMemorySegments	The number of allocated memory segments.	Gauge
Task	Shuffle.Netty.Input.Buffers	inputQueueLength	The number of queued input buffers.	Gauge
	Shuffle.Netty.Input.Buffers	inPoolUsage	An estimate of the input buffers usage.	Gauge
	Shuffle.Netty.Output.Buffers	outputQueueLength	The number of queued output buffers.	Gauge
	Shuffle.Netty.Output.Buffers	outPoolUsage	An estimate of the output buffers usage.	Gauge
	Shuffle.Netty.<Input\|Output>.<gate\|partition> (only available if `taskmanager.net.detailed-metrics` config option is set)	totalQueueLen	Total number of queued buffers in all input/output channels.	Gauge
		minQueueLen	Minimum number of queued buffers in all input/output channels.	Gauge
		maxQueueLen	Maximum number of queued buffers in all input/output channels.	Gauge
		avgQueueLen	Average number of queued buffers in all input/output channels.	Gauge
Task	Shuffle.Netty.Input	numBytesInLocal	The total number of bytes this task has read from a local source.	Counter
		numBytesInLocalPerSecond	The number of bytes this task reads from a local source per second.	Meter
		numBytesInRemote	The total number of bytes this task has read from a remote source.	Counter
		numBytesInRemotePerSecond	The number of bytes this task reads from a remote source per second.	Meter
		numBuffersInLocal	The total number of network buffers this task has read from a local source.	Counter
		numBuffersInLocalPerSecond	The number of network buffers this task reads from a local source per second.	Meter
		numBuffersInRemote	The total number of network buffers this task has read from a remote source.	Counter
		numBuffersInRemotePerSecond	The number of network buffers this task reads from a remote source per second.	Meter

Note:
For the definitions of Job, Task, and Subtask, see: https://stackoverflow.com/questions/53610342/difference-between-job-task-and-subtask-in-flink

Cluster

Scope	Metrics	Description	Type
JobManager	numRegisteredTaskManagers	The number of registered taskmanagers.	Gauge
	numRunningJobs	The number of running jobs.	Gauge
	taskSlotsAvailable	The number of available task slots.	Gauge
	taskSlotsTotal	The total number of task slots.	Gauge

Availability

Scope	Metrics	Description	Type
Job (only available on JobManager)	restartingTime	The time it took to restart the job, or how long the current restart has been in progress (in milliseconds).	Gauge
	uptime	The time that the job has been running without interruption. Returns -1 for completed jobs (in milliseconds).	Gauge
	downtime	For jobs currently in a failing/recovering situation, the time elapsed during this outage. Returns 0 for running jobs and -1 for completed jobs (in milliseconds).	Gauge
	fullRestarts	The total number of full restarts since this job was submitted. Attention: Since 1.9.2, this metric also includes fine-grained restarts.	Gauge
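The sentinel values of the downtime gauge can be turned into a simple status for alerting. This Python sketch follows the table's description (0 for running jobs, -1 for completed jobs, otherwise the elapsed outage time); the status labels and function name are illustrative assumptions.

```python
def job_status_from_downtime(downtime_ms):
    """Map the downtime gauge to a coarse job status."""
    if downtime_ms == -1:
        return "completed"
    if downtime_ms == 0:
        return "running"
    return "recovering"  # positive value: outage in progress for this many ms

def should_alert(downtime_ms, max_outage_ms):
    """Alert when an ongoing outage has lasted longer than the tolerated time."""
    return job_status_from_downtime(downtime_ms) == "recovering" and downtime_ms > max_outage_ms
```

`max_outage_ms` is a threshold chosen by the operator, for example 60000 for one minute.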

CheckPointing

Scope	Metrics	Description	Type
Job (only available on JobManager)	lastCheckpointDuration	The time it took to complete the last checkpoint (in milliseconds).	Gauge
	lastCheckpointSize	The total size of the last checkpoint (in bytes).	Gauge
	lastCheckpointExternalPath	The path where the last external checkpoint was stored.	Gauge
	lastCheckpointRestoreTimestamp	Timestamp when the last checkpoint was restored at the coordinator (in milliseconds).	Gauge
	lastCheckpointAlignmentBuffered	The number of buffered bytes during alignment over all subtasks for the last checkpoint (in bytes).	Gauge
	numberOfInProgressCheckpoints	The number of in progress checkpoints.	Gauge
	numberOfCompletedCheckpoints	The number of successfully completed checkpoints.	Gauge
	numberOfFailedCheckpoints	The number of failed checkpoints.	Gauge
	totalNumberOfCheckpoints	The number of total checkpoints (in progress, completed, failed).	Gauge
Task	checkpointAlignmentTime	The time in nanoseconds that the last barrier alignment took to complete, or how long the current alignment has taken so far (in nanoseconds).	Gauge
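The checkpoint gauges can be combined into a small health summary. The Python sketch below derives a failure ratio from the completed and failed counts and flags a checkpoint that took longer than the checkpoint interval. The interval is an input the operator supplies from the job configuration, and the "slow" rule is an assumption rather than a Flink convention.

```python
def checkpoint_health(completed, failed, in_progress, last_duration_ms, interval_ms):
    """Summarize checkpoint gauges for one job."""
    finished = completed + failed
    return {
        # mirrors totalNumberOfCheckpoints (in progress, completed, failed)
        "total": completed + failed + in_progress,
        "failure_ratio": failed / finished if finished else 0.0,
        # a checkpoint slower than the interval cannot keep up with triggering
        "slow": last_duration_ms > interval_ms,
    }
```

For instance, 8 completed, 2 failed, and 1 in-progress checkpoint give a total of 11 and a failure ratio of 0.2.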

RocksDB

IO

Scope	Metrics	Description	Type
Job (only available on TaskManager)	<source_id>.<source_subtask_index>.<operator_id>.<operator_subtask_index>.latency	The latency distributions from a given source subtask to an operator subtask (in milliseconds).	Histogram
Task	numBytesInLocal	Attention: deprecated, use Default shuffle service metrics.	Counter
	numBytesInLocalPerSecond	Attention: deprecated, use Default shuffle service metrics.	Meter
	numBytesInRemote	Attention: deprecated, use Default shuffle service metrics.	Counter
	numBytesInRemotePerSecond	Attention: deprecated, use Default shuffle service metrics.	Meter
	numBuffersInLocal	Attention: deprecated, use Default shuffle service metrics.	Counter
	numBuffersInLocalPerSecond	Attention: deprecated, use Default shuffle service metrics.	Meter
	numBuffersInRemote	Attention: deprecated, use Default shuffle service metrics.	Counter
	numBuffersInRemotePerSecond	Attention: deprecated, use Default shuffle service metrics.	Meter
	numBytesOut	The total number of bytes this task has emitted.	Counter
	numBytesOutPerSecond	The number of bytes this task emits per second.	Meter
	numBuffersOut	The total number of network buffers this task has emitted.	Counter
	numBuffersOutPerSecond	The number of network buffers this task emits per second.	Meter
Task/Operator	numRecordsIn	The total number of records this operator/task has received.	Counter
	numRecordsInPerSecond	The number of records this operator/task receives per second.	Meter
	numRecordsOut	The total number of records this operator/task has emitted.	Counter
	numRecordsOutPerSecond	The number of records this operator/task sends per second.	Meter
	numLateRecordsDropped	The number of records this operator/task has dropped due to arriving late.	Counter
	currentInputWatermark	The last watermark this operator/tasks has received (in milliseconds). Note: For operators/tasks with 2 inputs this is the minimum of the last received watermarks.	Gauge
Operator	currentInput1Watermark	The last watermark this operator has received in its first input (in milliseconds). Note: Only for operators with 2 inputs.	Gauge
	currentInput2Watermark	The last watermark this operator has received in its second input (in milliseconds). Note: Only for operators with 2 inputs.	Gauge
	currentOutputWatermark	The last watermark this operator has emitted (in milliseconds).	Gauge
	numSplitsProcessed	The total number of InputSplits this data source has processed (if the operator is a data source).	Gauge
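Two relationships in this table are easy to check by hand: a per-second rate can be approximated from two samples of a Counter such as numRecordsIn, and for a two-input operator currentInputWatermark is the minimum of the two input watermarks. A minimal Python sketch, with hypothetical function names:

```python
def per_second(prev_count, prev_ts, curr_count, curr_ts):
    """Approximate a rate from two counter samples taken at prev_ts and curr_ts (seconds)."""
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        return None  # samples out of order or taken at the same instant
    return (curr_count - prev_count) / elapsed

def combined_input_watermark(input1_watermark, input2_watermark):
    """Watermark of a two-input operator: the minimum of the last received watermarks."""
    return min(input1_watermark, input2_watermark)
```

This is useful when a monitoring backend stores only the raw counters and the per-second Meter values are not collected.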

Connectors

Kafka Connectors

Scope	Metrics	User Variables	Description	Type
Operator	commitsSucceeded	n/a	The total number of successful offset commits to Kafka, if offset committing is turned on and checkpointing is enabled.	Counter
Operator	commitsFailed	n/a	The total number of offset commit failures to Kafka, if offset committing is turned on and checkpointing is enabled. Note that committing offsets back to Kafka is only a means to expose consumer progress, so a commit failure does not affect the integrity of Flink's checkpointed partition offsets.	Counter
Operator	committedOffsets	topic, partition	The last successfully committed offsets to Kafka, for each partition. A particular partition's metric can be specified by topic name and partition id.	Gauge
Operator	currentOffsets	topic, partition	The consumer's current read offset, for each partition. A particular partition's metric can be specified by topic name and partition id.	Gauge

Kinesis Connectors

Scope	Metrics	User Variables	Description	Type
Operator	millisBehindLatest	stream, shardId	The number of milliseconds the consumer is behind the head of the stream, indicating how far behind current time the consumer is, for each Kinesis shard. A particular shard's metric can be specified by stream name and shard id. A value of 0 indicates record processing is caught up, and there are no new records to process at this moment. A value of -1 indicates that there is no reported value for the metric, yet.	Gauge
Operator	sleepTimeMillis	stream, shardId	The number of milliseconds the consumer spends sleeping before fetching records from Kinesis. A particular shard's metric can be specified by stream name and shard id.	Gauge
Operator	maxNumberOfRecordsPerFetch	stream, shardId	The maximum number of records requested by the consumer in a single getRecords call to Kinesis. If ConsumerConfigConstants.SHARD_USE_ADAPTIVE_READS is set to true, this value is adaptively calculated to maximize the 2 Mbps read limits from Kinesis.	Gauge
Operator	numberOfAggregatedRecordsPerFetch	stream, shardId	The number of aggregated Kinesis records fetched by the consumer in a single getRecords call to Kinesis.	Gauge
Operator	numberOfDeggregatedRecordsPerFetch	stream, shardId	The number of deaggregated Kinesis records fetched by the consumer in a single getRecords call to Kinesis.	Gauge
Operator	averageRecordSizeBytes	stream, shardId	The average size of a Kinesis record in bytes, fetched by the consumer in a single getRecords call.	Gauge
Operator	runLoopTimeNanos	stream, shardId	The actual time taken, in nanoseconds, by the consumer in the run loop.	Gauge
Operator	loopFrequencyHz	stream, shardId	The number of calls to getRecords in one second.	Gauge
Operator	bytesRequestedPerFetch	stream, shardId	The bytes requested (2 Mbps / loopFrequencyHz) in a single call to getRecords.	Gauge
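The bytesRequestedPerFetch formula is plain division of the per-shard read limit by the loop frequency. The Python sketch below works it through, interpreting the table's "2 Mbps" as 2 MiB per second; that interpretation and the names used are assumptions.

```python
# Assumption: the per-shard read limit ("2 Mbps" in the table) is 2 MiB per second.
SHARD_LIMIT_BYTES_PER_SECOND = 2 * 1024 * 1024

def bytes_requested_per_fetch(loop_frequency_hz, limit=SHARD_LIMIT_BYTES_PER_SECOND):
    """Bytes one getRecords call may request when the loop runs loop_frequency_hz times per second."""
    if loop_frequency_hz <= 0:
        raise ValueError("loop_frequency_hz must be positive")
    return limit / loop_frequency_hz
```

At 4 calls per second each fetch may request 524288 bytes under this assumption.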

System resources

Operating system resource metrics are disabled by default and are not collected.

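Monitoring checkpoints

Flink's web interface provides a tab for monitoring a job's checkpoints. The data remains available after the job has terminated. Checkpoint information is shown in four tabs: Overview, History, Summary, and Configuration, each described in turn below.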

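Overview

The Overview tab lists the following statistics. They are lost if the JobManager process fails.

Checkpoint Counts
- Triggered: total number of checkpoints triggered since the job started.
- In Progress: number of checkpoints currently in progress.
- Completed: total number of checkpoints successfully completed since the job started.
- Failed: total number of checkpoints that have failed since the job started.
- Restored: number of restore operations since the job started. This also reflects how many times the job has restarted since it was submitted. Note that an initial submission with a savepoint also counts as a restore, and the count is reset if the JobManager fails.
Latest Completed Checkpoint: the most recent successfully completed checkpoint. Click it to see details down to the subtask level.
Latest Failed Checkpoint: the most recent failed checkpoint. Click it to see details down to the subtask level.
Latest Savepoint: the most recent externally triggered savepoint. Click it to see details down to the subtask level.
Latest Restore: there are two types of restore operations.
- Restore from Checkpoint: restored from a regular, periodic checkpoint.
- Restore from Savepoint: restored from a savepoint.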
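The same counts can also be read programmatically for alerting. The Python sketch below maps a JSON payload onto the labels used in the Overview tab. It assumes the payload comes from the JobManager's checkpoint statistics REST endpoint and contains a `counts` object with the keys shown; verify both against your Flink version before relying on them.

```python
import json

# Assumed JSON keys, mapped to the labels shown in the Overview tab.
LABELS = {
    "total": "Triggered",
    "in_progress": "In Progress",
    "completed": "Completed",
    "failed": "Failed",
    "restored": "Restored",
}

def overview_counts(payload):
    """Extract checkpoint counts from a JSON payload, keyed by Overview label."""
    counts = json.loads(payload)["counts"]
    return {label: counts[key] for key, label in LABELS.items()}
```

Because the statistics are lost when the JobManager fails, a sudden drop in the Triggered count between two polls is itself a useful signal.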