Flink Monitoring: Custom Consumer Lag Metrics

1. Requirements

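The requirement: with Kafka as the source, use a pushgateway + prometheus setup to track, in real time, each Flink job's consumed offset (current-offset) and each partition's total log length (log-end-offset), and compute the difference between the two as the consumer lag, as shown in the figure:
[figure]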
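
The arithmetic behind the requirement can be sketched as follows. This is only an illustration: LagCalculator and its method are hypothetical names, not part of Flink or Kafka, and partitions are identified by a plain integer for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a Flink or Kafka class): computes per-partition lag
// as log-end-offset minus current-offset.
final class LagCalculator {

    private LagCalculator() {}

    static Map<Integer, Long> computeLag(Map<Integer, Long> currentOffsets,
                                         Map<Integer, Long> logEndOffsets) {
        Map<Integer, Long> lag = new HashMap<>();
        for (Map.Entry<Integer, Long> entry : logEndOffsets.entrySet()) {
            Long current = currentOffsets.get(entry.getKey());
            if (current == null) {
                continue; // nothing consumed from this partition yet, so no lag value
            }
            // Clamp at zero so a stale log-end-offset never yields a negative lag.
            lag.put(entry.getKey(), Math.max(0L, entry.getValue() - current));
        }
        return lag;
    }
}
```

A partition with log-end-offset 150 and current-offset 100 would report a lag of 50.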

2. Terminology

2.1 committed-offsets

Each time a Kafka consumer calls consumer.poll() it receives a batch of records, and afterwards it commits them with a method such as consumer.commitAsync(), for example:

ConsumerRecords<byte[], byte[]> records = consumer.poll(pollTimeoutMs);
for (ConsumerRecord<byte[], byte[]> record : records) {
    ...
}
consumer.commitAsync();

The committed offsets are stored in ZooKeeper (deprecated) or in Kafka's internal topic __consumer_offsets.

2.2 current-offsets

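The largest offset in the batch of records returned by one poll() call. current-offsets is therefore tightly coupled to the application and cannot be queried from the Kafka broker or from the Kafka client. The flink kafka source connector describes current-offsets as follows: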
This refers to the offset of the last element that we retrieved and emitted successfully

2.3 visible-offset (my own term)

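The offset up to which messages in a topic partition are visible to a consumer. When the consumer's isolation level is read_uncommitted, visible-offset equals the high watermark; when it is read_committed, visible-offset equals the last stable offset.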
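
The rule above can be written down in a few lines. This is a minimal sketch: the Isolation enum and the VisibleOffset class are hypothetical stand-ins defined here for illustration (Kafka ships its own IsolationLevel enum), and the two offsets are passed in as plain numbers.

```java
// Hypothetical stand-in for Kafka's isolation level setting.
enum Isolation { READ_UNCOMMITTED, READ_COMMITTED }

// Hypothetical helper: picks the offset up to which a consumer can read.
final class VisibleOffset {

    private VisibleOffset() {}

    static long of(Isolation isolation, long highWatermark, long lastStableOffset) {
        // read_committed consumers stop at the last stable offset;
        // read_uncommitted consumers can read up to the high watermark.
        return isolation == Isolation.READ_COMMITTED ? lastStableOffset : highWatermark;
    }
}
```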

2.4 log-end-offset

The offset at which the next message will be written.

3. Custom Metrics

So which of the offsets above do we actually need to meet the requirement? Let's look at how the flink kafka source connector commits offsets.

3.1 flink kafka source connector source code analysis

The flink kafka source connector has three offset commit modes:

public enum OffsetCommitMode {

   /** Completely disable offset committing. */
   DISABLED,

   /** Commit offsets back to Kafka only when checkpoints are completed. */
   ON_CHECKPOINTS,

   /** Commit offsets periodically back to Kafka, using the auto commit functionality of internal Kafka clients. */
   KAFKA_PERIODIC;
}

3.1.2 Periodic commit

Properties properties = new Properties();
// Let the Kafka client auto-commit offsets once per second.
properties.setProperty("enable.auto.commit", "true");
properties.setProperty("auto.commit.interval.ms", "1000");
new FlinkKafkaConsumer<>("foo", new KafkaEventSchema(), properties);

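In this mode we can clearly only use current-offsets as the monitoring metric: committed-offsets are committed periodically, so by the time the next commit is due, Flink has already processed many more records.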
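
To put a rough number on that gap, here is a back-of-the-envelope sketch. CommitStaleness is a hypothetical helper, and it assumes a steady processing rate; the rate used in the example is an illustrative figure, not a measurement.

```java
// Hypothetical helper: estimates how far the committed offset can trail the
// consumed offset just before the next periodic commit fires.
final class CommitStaleness {

    private CommitStaleness() {}

    /** Records processed during one commit interval at a steady rate. */
    static long uncommittedRecords(long recordsPerSecond, long commitIntervalMs) {
        return recordsPerSecond * commitIntervalMs / 1000L;
    }
}
```

At an assumed 20000 records per second and the 1000 ms interval configured above, the committed offset would trail by about 20000 records right before each commit.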

3.1.3 Commit on checkpoint

When a checkpoint is taken, FlinkKafkaConsumerBase#snapshotState is called, and pendingOffsetsToCommit records the offsets to commit.

    public final void snapshotState(FunctionSnapshotContext context) throws Exception {
        if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS) {
            // the map cannot be asynchronously updated, because only one checkpoint call can happen
            // on this function at a time: either snapshotState() or notifyCheckpointComplete()
            // remember the current-offsets that are waiting to be committed
            pendingOffsetsToCommit.put(context.getCheckpointId(), currentOffsets);
        }

        for (Map.Entry<KafkaTopicPartition, Long> kafkaTopicPartitionLongEntry : currentOffsets.entrySet()) {
            // write each partition's current-offset into state
            unionOffsetStates.add(
                    Tuple2.of(kafkaTopicPartitionLongEntry.getKey(), kafkaTopicPartitionLongEntry.getValue()));
        }
    }

After the checkpoint completes, the task calls notifyCheckpointComplete():

// FlinkKafkaConsumerBase.java
public final void notifyCheckpointComplete(long checkpointId) throws Exception {
...
}
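
The interplay between snapshotState() and notifyCheckpointComplete() can be modeled with a small stand-alone class. This is a simplified sketch, not Flink's implementation: PendingOffsets and its methods are hypothetical, partitions are plain integers, and dropping entries older than the completed checkpoint is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of "remember offsets per checkpoint, commit on completion".
final class PendingOffsets {

    // checkpoint id -> (partition -> offset), ordered by checkpoint id
    private final TreeMap<Long, Map<Integer, Long>> pending = new TreeMap<>();

    /** On snapshot: store a copy of the current offsets under the checkpoint id. */
    void onSnapshot(long checkpointId, Map<Integer, Long> currentOffsets) {
        pending.put(checkpointId, new HashMap<>(currentOffsets));
    }

    /** On completion: return the offsets to commit, dropping this and all older entries. */
    Map<Integer, Long> onCheckpointComplete(long checkpointId) {
        Map<Integer, Long> offsets = pending.get(checkpointId);
        if (offsets == null) {
            return null; // unknown checkpoint id: nothing to commit
        }
        pending.headMap(checkpointId, true).clear();
        return offsets;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

The point of the model is that the offsets committed for a checkpoint are the ones captured when the snapshot was taken, not the ones the consumer has reached by the time the checkpoint completes.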

Finally, KafkaFetcher#doCommitInternalOffsetsToKafka passes the offsets to commit to consumerThread.setOffsetsToCommit(offsetsToCommit, commitCallback);, which stores them in the nextOffsetsToCommit member of KafkaConsumerThread.java, and the consumer thread then commits them:

	// KafkaConsumerThread.java
	void setOffsetsToCommit(
	...
	nextOffsetsToCommit.getAndSet(Tuple2.of(offsetsToCommit, commitCallback))
	...
	}

	public void run() {
		while (running) {
		...
			final Tuple2<Map<TopicPartition, OffsetAndMetadata>, KafkaCommitCallback> commitOffsetsAndCallback = nextOffsetsToCommit.getAndSet(null);
			...
			consumer.commitAsync(commitOffsetsAndCallback.f0, new CommitCallback(commitOffsetsAndCallback.f1));
			...
		}
	}
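
The hand-off through nextOffsetsToCommit is a single-slot mailbox built on an atomic reference. The sketch below shows the pattern in isolation; CommitMailbox is a hypothetical class written for this illustration, not a Flink type.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical single-slot mailbox: the checkpointing side overwrites the slot,
// the consumer loop takes whatever is there and clears it.
final class CommitMailbox<T> {

    private final AtomicReference<T> next = new AtomicReference<>();

    /** Stores a new commit request; returns the request it overwrote, or null. */
    T offer(T offsetsToCommit) {
        return next.getAndSet(offsetsToCommit);
    }

    /** Takes the pending request and clears the slot; returns null if the slot is empty. */
    T poll() {
        return next.getAndSet(null);
    }
}
```

With this pattern only the newest request survives: if a second request arrives before the loop has picked up the first, the first is replaced rather than queued.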

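In this mode, too, we can clearly only use current-offsets as the monitoring metric, because committed-offsets are committed only after a checkpoint completes. The checkpoint interval is often set to several minutes, during which Flink consumes batch after batch of records, so a lag computed from committed-offsets would be far off.
To sum up, the requirement and the definitions above tell us which Metrics we need: current-offsets and visible-offset, with lag computed as the difference between the two. Fortunately we do not have to fetch the two offsets separately: SubscriptionState inside KafkaConsumer already holds both and provides the calculation, so all we need is the SubscriptionState object. How to get hold of it is shown in this figure:
[figure]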

3.2 Defining the HighWatermark Metrics
3.2.1 Custom flink kafka consumer
public class CustomerJsonConsumer extends FlinkKafkaConsumer011<Row> {
    private static final long serialVersionUID = -1234567890L;
    // custom deserializer
    private CustomerJsonDeserialization customerJsonDeserialization;
    
    public CustomerJsonConsumer(String topic, AbsKafkaDeserialization<Row> valueDeserializer, Properties props) {
        // pass the custom deserializer to the parent constructor
        super(Arrays.asList(topic.split(",")), valueDeserializer, props);
        this.customerJsonDeserialization = (CustomerJsonDeserialization) valueDeserializer;
    }

    public CustomerJsonConsumer(Pattern subscriptionPattern,
                                AbsKafkaDeserialization<Row> valueDeserializer, Properties props) {
        super(subscriptionPattern, valueDeserializer, props);
        this.customerJsonDeserialization = (CustomerJsonDeserialization) valueDeserializer;
    }

    // run() is the entry point when the task starts
    @Override
    public void run(SourceContext<Row> sourceContext) throws Exception {
        // give the deserializer the runtime context
        customerJsonDeserialization.setRuntimeContext(getRuntimeContext());
        customerJsonDeserialization.initMetric();
        super.run(sourceContext);
    }

    @Override
    protected AbstractFetcher<Row, ?> createFetcher(SourceContext<Row> sourceContext,
                                                    Map<KafkaTopicPartition, Long> assignedPartitionsWithInitialOffsets,
                                                    SerializedValue<AssignerWithPeriodicWatermarks<Row>> watermarksPeriodic,
                                                    SerializedValue<AssignerWithPunctuatedWatermarks<Row>> watermarksPunctuated,
                                                    StreamingRuntimeContext runtimeContext, OffsetCommitMode offsetCommitMode,
                                                    MetricGroup consumerMetricGroup,
                                                    boolean useMetrics) throws Exception {
        AbstractFetcher<Row, ?> fetcher = super.createFetcher(
                sourceContext,
                assignedPartitionsWithInitialOffsets,
                watermarksPeriodic,
                watermarksPunctuated,
                runtimeContext,
                offsetCommitMode,
                consumerMetricGroup,
                useMetrics);
        // hand the fetcher to the custom deserializer; the fetcher holds the
        // Kafka consumer client
        customerJsonDeserialization.setFetcher(fetcher);
        return fetcher;
    }

}

3.2.2 Custom deserializer

Use reflection to peel back the layers until we reach the SubscriptionState object.

public class CustomerJsonDeserialization extends AbsKafkaDeserialization<Row> {
    private static final Logger LOG = LoggerFactory.getLogger(CustomerJsonDeserialization.class);
    private static final long serialVersionUID = 2385115520960444192L;
    private static final String DT_TOPIC_GROUP = "topic";
    private static final String DT_PARTITION_GROUP = "partition";
    
    private AbstractFetcher<Row, ?> fetcher;

    public CustomerJsonDeserialization(TypeInformation<Row> typeInfo) {
        super(typeInfo);
        this.runtimeConverter = createConverter(this.typeInfo);
    }

    @Override
    public Row deserialize(byte[] message) {

        if(openMetric && firstMsg){
            try {
                // called only when the first record arrives
                registerPtMetric(fetcher);
            } catch (Exception e) {
                LOG.error("register topic partition metric error.", e);
            }

            firstMsg = false;
        }

        try {
            Row row;
            try {
                final JsonNode root = objectMapper.readTree(message);
                row = (Row) super.runtimeConverter.convert(objectMapper, root);
            } catch (Throwable t) {
                throw new IOException("Failed to deserialize JSON object.", t);
            }
            return row;
        } catch (Exception e) {
            // add metric of dirty data
            LOG.error(e.getMessage(), e);
            return null;
        }
    }
    // the fetcher is passed in by the custom flink kafka consumer
    public void setFetcher(AbstractFetcher<Row, ?> fetcher) {
        this.fetcher = fetcher;
    }

    protected void registerPtMetric(AbstractFetcher<Row, ?> fetcher) throws Exception {
        // Use reflection to reach the Kafka consumer held by the fetcher. The field path is:
        // Flink: Fetcher -> KafkaConsumerThread -> KafkaConsumer
        // Kafka Consumer: KafkaConsumer -> SubscriptionState -> partitionLag()
        Field consumerThreadField = fetcher.getClass().getSuperclass().getDeclaredField("consumerThread");
        consumerThreadField.setAccessible(true);
        KafkaConsumerThread consumerThread = (KafkaConsumerThread) consumerThreadField.get(fetcher);

        Field hasAssignedPartitionsField = consumerThread.getClass().getDeclaredField("hasAssignedPartitions");
        hasAssignedPartitionsField.setAccessible(true);

        boolean hasAssignedPartitions = (boolean) hasAssignedPartitionsField.get(consumerThread);

        if(!hasAssignedPartitions){
            throw new RuntimeException("no partitions have been assigned to the consumer thread yet");
        }

        Field consumerField = consumerThread.getClass().getDeclaredField("consumer");
        consumerField.setAccessible(true);

        KafkaConsumer kafkaConsumer = (KafkaConsumer) consumerField.get(consumerThread);
        Field subscriptionStateField = kafkaConsumer.getClass().getDeclaredField("subscriptions");
        subscriptionStateField.setAccessible(true);
        
        SubscriptionState subscriptionState = (SubscriptionState) subscriptionStateField.get(kafkaConsumer);
        Set<TopicPartition> assignedPartitions = subscriptionState.assignedPartitions();
        for(TopicPartition topicPartition : assignedPartitions){
            MetricGroup metricGroup = getRuntimeContext().getMetricGroup().addGroup(DT_TOPIC_GROUP, topicPartition.topic())
                    .addGroup(DT_PARTITION_GROUP, topicPartition.partition() + "");
            metricGroup.gauge(DT_TOPIC_PARTITION_LAG_GAUGE, new KafkaTopicPartitionLagMetric(subscriptionState, topicPartition));

        }
    }
}
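
The reflection technique used in registerPtMetric can be tried out without Flink or Kafka on the classpath. In the sketch below, Outer and Inner are hypothetical stand-ins for the Fetcher -> KafkaConsumerThread -> KafkaConsumer chain, and PrivateFields.read is a hypothetical helper that wraps reflection failures in an unchecked exception.

```java
import java.lang.reflect.Field;

// Hypothetical stand-ins for objects that hide their collaborators in private fields.
final class Inner {
    private final String state = "subscriptions";
}

final class Outer {
    private final Inner inner = new Inner();
}

final class PrivateFields {

    private PrivateFields() {}

    /** Reads a (possibly private) field declared directly on declaringClass. */
    static Object read(Object target, Class<?> declaringClass, String fieldName) {
        try {
            // getDeclaredField only sees fields declared on this exact class,
            // not inherited ones, so the caller must name the declaring class.
            Field field = declaringClass.getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot read field " + fieldName, e);
        }
    }
}
```

Because getDeclaredField does not search superclasses, the original code calls getSuperclass() before looking up consumerThread. Private field names are implementation details, so a lookup like this can break when the library version changes.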

3.2.3 Custom consumer lag Metrics
public class KafkaTopicPartitionLagMetric implements Gauge<Long> {

    private SubscriptionState subscriptionState;

    private TopicPartition tp;

    public KafkaTopicPartitionLagMetric(SubscriptionState subscriptionState, TopicPartition tp) {
        this.subscriptionState = subscriptionState;
        this.tp = tp;
    }

    @Override
    public Long getValue() {
        // compute the consumer lag
        return subscriptionState.partitionLag(tp, IsolationLevel.READ_UNCOMMITTED);
    }
}

// Kafka's SubscriptionState (excerpt)
public class SubscriptionState {
    // compute the consumer lag from visible-offset and position (current-offset)
    public Long partitionLag(TopicPartition tp, IsolationLevel isolationLevel) {
        TopicPartitionState topicPartitionState = assignedState(tp);
        if (isolationLevel == IsolationLevel.READ_COMMITTED)
            return topicPartitionState.lastStableOffset == null ? null : topicPartitionState.lastStableOffset - topicPartitionState.position;
        else
            return topicPartitionState.highWatermark == null ? null : topicPartitionState.highWatermark - topicPartitionState.position;
    }
}
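
As the excerpt shows, partitionLag returns null while the high watermark (or last stable offset) is still unknown, so the gauge can report null. One possible way to guard against that is sketched below. NullSafeLag and the UNKNOWN sentinel are hypothetical, the supplier stands in for the partitionLag call, and whether a given metrics reporter needs this guard is an assumption.

```java
import java.util.function.Supplier;

// Hypothetical wrapper: replaces a null lag with a sentinel value so that
// downstream consumers of the metric always receive a number.
final class NullSafeLag implements Supplier<Long> {

    static final long UNKNOWN = -1L;

    private final Supplier<Long> rawLag;

    NullSafeLag(Supplier<Long> rawLag) {
        this.rawLag = rawLag;
    }

    @Override
    public Long get() {
        Long lag = rawLag.get();
        return lag == null ? UNKNOWN : lag;
    }
}
```

A negative sentinel is easy to filter out in a dashboard query, whereas reporting 0 would be indistinguishable from a consumer that has fully caught up.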

References:
https://ververica.cn/developers/flink-kafka-source-sink-source-analysis/
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-connector-metrics
https://www.cnblogs.com/huxi2b/p/7453543.html
https://blog.csdn.net/u013256816/article/details/88985769
