FlinkKafkaProducer Source Code Analysis

画画的老顽童

Last modified: 2022-01-28 18:51:45

First published: 2021-01-30 15:41:52

Article link: https://blog.csdn.net/m0_46449152/article/details/113433016

FlinkKafkaProducer extends TwoPhaseCommitSinkFunction, which in turn implements CheckpointedFunction and CheckpointListener.

TwoPhaseCommitSinkFunction implements initializeState and snapshotState from CheckpointedFunction,
and notifyCheckpointComplete from CheckpointListener.

Reference: https://cloud.tencent.com/developer/article/1583233
###############################################################################
Summary: FlinkKafkaProducer && TPC (TwoPhaseCommitSinkFunction)
1. Begin a transaction: initializeState -> TPC.beginTransaction() begins a transaction and initializes the KafkaProducer; the KafkaProducer in turn creates its RecordAccumulator and starts the sender thread.
FlinkKafkaProducer.initializeState -> TPC.initializeState
Fetch the state.
On restore, fetch the uncommitted transactions from state, build a KafkaProducer for each, and commit them again: getPendingCommitTransactions -> recoverAndCommitInternal -> recoverAndCommit{producer = initTransactionalProducer, producer.commitTransaction()}
Initialize the user context and generate the transactionalIds: initializeUserContext -> generateNewTransactionalIds -> generateIdsToUse

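   Flink keeps the transactional ids in a queue that serves as a pool: a new transaction takes a transactional id from the head of the queue, and when the transaction finishes the id is put back at the tail.
   Because every new transaction constructs a new Kafka producer, the initial size of availableTransactionalIds equals the configured Kafka producer pool size (default 5).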
   	public Set<String> generateIdsToUse(long nextFreeTransactionalId) {
		Set<String> transactionalIds = new HashSet<>();
		for (int i = 0; i < poolSize; i++) {
			long transactionalId = nextFreeTransactionalId + subtaskIndex * poolSize + i;
			transactionalIds.add(generateTransactionalId(transactionalId));
		}
		return transactionalIds;
	}
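
To make the id arithmetic above concrete, here is a minimal standalone sketch of the same formula. The class name, the extra parameters, and the "prefix-id" string format are assumptions made for illustration; they are not taken from the Flink source quoted here. The point is only that each subtask gets its own contiguous, non-overlapping block of poolSize ids.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical standalone re-implementation of the id arithmetic shown above.
// The class name and the "prefix-id" string format are assumptions for illustration.
class TransactionalIdSketch {

    static Set<String> generateIdsToUse(String prefix, int subtaskIndex, int poolSize, long nextFreeTransactionalId) {
        Set<String> ids = new HashSet<>();
        for (int i = 0; i < poolSize; i++) {
            // Each subtask owns a contiguous block of poolSize ids.
            long id = nextFreeTransactionalId + (long) subtaskIndex * poolSize + i;
            ids.add(prefix + "-" + id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // With pool size 5 and next free id 0: subtask 0 owns ids 0..4, subtask 1 owns ids 5..9.
        System.out.println(generateIdsToUse("sink", 0, 5, 0));
        System.out.println(generateIdsToUse("sink", 1, 5, 0));
    }
}
```

Since the blocks never overlap, two parallel subtasks cannot end up using the same transactional id.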
	Begin a new transaction and initialize the producer bound to it: beginTransactionInternal -> createTransactionalProducer
		private FlinkKafkaInternalProducer<byte[], byte[]> createTransactionalProducer() throws FlinkKafkaException {
		String transactionalId = availableTransactionalIds.poll();
		if (transactionalId == null) {
			throw new FlinkKafkaException(
				FlinkKafkaErrorCode.PRODUCERS_POOL_EMPTY,
				"Too many ongoing snapshots. Increase kafka producers pool size or decrease number of concurrent checkpoints.");
		}
		FlinkKafkaInternalProducer<byte[], byte[]> producer = initTransactionalProducer(transactionalId, true);
		producer.initTransactions();
		return producer;
	}
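
The pool behaviour described above (take an id from the head, return it to the tail, fail when the pool is empty) can be sketched with a plain queue. This is a toy model, not Flink code: the class and method names are hypothetical, and a simple ArrayDeque stands in for whatever queue type the real sink uses.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of the transactional-id pool; class and method names are hypothetical.
class TransactionalIdPoolSketch {
    private final Deque<String> available = new ArrayDeque<>();

    TransactionalIdPoolSketch(List<String> ids) {
        available.addAll(ids);
    }

    // Mirrors the poll() in createTransactionalProducer(): take from the head, fail if empty.
    String acquire() {
        String id = available.poll();
        if (id == null) {
            throw new IllegalStateException("Too many ongoing snapshots: pool is empty");
        }
        return id;
    }

    // Mirrors recycling after a transaction ends: put the id back at the tail.
    void release(String id) {
        available.add(id);
    }

    int size() {
        return available.size();
    }

    public static void main(String[] args) {
        TransactionalIdPoolSketch pool = new TransactionalIdPoolSketch(Arrays.asList("id-0", "id-1"));
        String first = pool.acquire();
        String second = pool.acquire();
        pool.release(first); // the first transaction finished
        System.out.println(second + " still in use, next acquire returns " + pool.acquire());
    }
}
```

An id only returns to the pool once its transaction ends, so the pool can run dry when many checkpoints are in flight at once, which matches the advice in the exception message above.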
	initTransactionalProducer -> initProducer -> createProducer -> FlinkKafkaInternalProducer -> kafkaProducer = new KafkaProducer<>(properties); the KafkaProducer constructor creates the RecordAccumulator and starts the sender thread

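2. invoke: calls kafkaProducer.send to write the processed records to Kafka.
3. Pre-commit the Kafka transaction and begin the next one: preCommit, beginTransactionInternal()
snapshotState -> preCommit: every time a checkpoint is triggered, snapshotState is called, which calls FlinkKafkaProducer.preCommit, which in turn calls kafkaProducer.flush so that the sender writes any records still buffered in the RecordAccumulator out to the Kafka brokers.
Note: steps 2 and 3 already send the messages to Kafka. The sender thread is started when the producer is created in beginTransaction, so any data in the RecordAccumulator gets sent to Kafka. If the Kafka consumer's isolation.level is read_uncommitted (the default), it can read these not-yet-committed records, i.e. dirty reads.
Setting it to read_committed makes the consumer see only committed data, at the cost of extra latency of up to roughly one checkpoint interval.
4. Commit the Kafka transaction: notifyCheckpointComplete -> commit. When the checkpoint completes, notifyCheckpointComplete is called, which calls commit, which calls kafkaProducer.commitTransaction to commit the Kafka transaction.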
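
As a companion to the note on isolation.level above, here is a minimal sketch of a downstream consumer configuration that reads only committed records. The class name, the broker address, and the group id are placeholders chosen for illustration; the property keys are standard Kafka consumer settings.

```java
import java.util.Properties;

// Hypothetical helper that builds consumer properties for reading only committed records.
class ReadCommittedConsumerConfigSketch {

    static Properties consumerProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // The consumer default is read_uncommitted; read_committed hides records of open or aborted transactions.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group id.
        Properties props = consumerProperties("localhost:9092", "downstream-reader");
        System.out.println(props.getProperty("isolation.level"));
    }
}
```

These properties would then be passed to whichever Kafka consumer reads the topic written by the sink.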

################################################################################

TwoPhaseCommitSinkFunction
1. initializeState: initializes the checkpoint state

	@Override
	public void initializeState(FunctionInitializationContext context) throws Exception {
		// when we are restoring state with pendingCommitTransactions, we don't really know whether the
		// transactions were already committed, or whether there was a failure between
		// completing the checkpoint on the master, and notifying the writer here.

		// (the common case is actually that it was already committed, the window
		// between the commit on the master and the notification here is very small)

		// it is possible to not have any transactions at all if there was a failure before
		// the first completed checkpoint, or in case of a scale-out event, where some of the
		// new tasks do not have any transactions assigned to check)

		// we can have more than one transaction to check in case of a scale-in event, or
		// for the reasons discussed in the 'notifyCheckpointComplete()' method.

		state = context.getOperatorStateStore().getListState(stateDescriptor);

		boolean recoveredUserContext = false;
		// when restarting after a failure
		if (context.isRestored()) {
			LOG.info("{} - restoring state", name());
			for (State<TXN, CONTEXT> operatorState : state.get()) {
				userContext = operatorState.getContext();
				List<TransactionHolder<TXN>> recoveredTransactions = operatorState.getPendingCommitTransactions();
				List<TXN> handledTransactions = new ArrayList<>(recoveredTransactions.size() + 1);
				for (TransactionHolder<TXN> recoveredTransaction : recoveredTransactions) {
					// If this fails to succeed eventually, there is actually data loss
					recoverAndCommitInternal(recoveredTransaction);
					handledTransactions.add(recoveredTransaction.handle);
					LOG.info("{} committed recovered transaction {}", name(), recoveredTransaction);
				}

				{
					TXN transaction = operatorState.getPendingTransaction().handle;
					recoverAndAbort(transaction);
					handledTransactions.add(transaction);
					LOG.info("{} aborted recovered transaction {}", name(), operatorState.getPendingTransaction());
				}

				if (userContext.isPresent()) {
					finishRecoveringContext(handledTransactions);
					recoveredUserContext = true;
				}
			}
		}

		// if in restore we didn't get any userContext or we are initializing from scratch
		if (!recoveredUserContext) {
			LOG.info("{} - no state to restore", name());

			userContext = initializeUserContext();
		}
		this.pendingCommitTransactions.clear();

		currentTransactionHolder = beginTransactionInternal();
		LOG.debug("{} - started new transaction '{}'", name(), currentTransactionHolder);
	}
beginTransactionInternal-> 
private TransactionHolder<TXN> beginTransactionInternal() throws Exception {
		return new TransactionHolder<>(beginTransaction(), clock.millis());
	}
// begin the transaction
FlinkKafkaProducer.beginTransaction(): see below
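
The restore branch of initializeState boils down to one rule: transactions that were already pre-committed for a checkpoint are committed again, while the transaction that was still open when the state was snapshotted is aborted. The sketch below models only that rule with strings standing in for transactions; the class and method names are hypothetical and no real producer is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the restore logic; transactions are plain strings and all names are hypothetical.
class RecoverySketch {
    final List<String> committed = new ArrayList<>();
    final List<String> aborted = new ArrayList<>();

    void restore(List<String> pendingCommitTransactions, String pendingTransaction) {
        // Pre-committed transactions belong to a checkpoint, so they are committed again.
        for (String txn : pendingCommitTransactions) {
            committed.add(txn); // stands in for recoverAndCommitInternal(...)
        }
        // The transaction that was still open at snapshot time is rolled back.
        aborted.add(pendingTransaction); // stands in for recoverAndAbort(...)
    }

    public static void main(String[] args) {
        RecoverySketch sink = new RecoverySketch();
        sink.restore(Arrays.asList("txn-7", "txn-8"), "txn-9");
        System.out.println("committed=" + sink.committed + " aborted=" + sink.aborted);
    }
}
```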

2. snapshotState: pre-commits the Kafka transaction on every checkpoint

public void snapshotState(FunctionSnapshotContext context) throws Exception {
		// this is like the pre-commit of a 2-phase-commit transaction
		// we are ready to commit and remember the transaction

		checkState(currentTransactionHolder != null, "bug: no transaction object when performing state snapshot");

		long checkpointId = context.getCheckpointId();
		LOG.debug("{} - checkpoint {} triggered, flushing transaction '{}'", name(), context.getCheckpointId(), currentTransactionHolder);
		// perform the pre-commit
		preCommit(currentTransactionHolder.handle);
		pendingCommitTransactions.put(checkpointId, currentTransactionHolder);
		LOG.debug("{} - stored pending transactions {}", name(), pendingCommitTransactions);

		currentTransactionHolder = beginTransactionInternal();
		LOG.debug("{} - started new transaction '{}'", name(), currentTransactionHolder);

		state.clear();
		state.add(new State<>(
			this.currentTransactionHolder,
			new ArrayList<>(pendingCommitTransactions.values()),
			userContext));
	}
preCommit -> FlinkKafkaProducer.preCommit

3. notifyCheckpointComplete: commits the Kafka transaction when the checkpoint completes

public final void notifyCheckpointComplete(long checkpointId) throws Exception {
		// the following scenarios are possible here
		//
		//  (1) there is exactly one transaction from the latest checkpoint that
		//      was triggered and completed. That should be the common case.
		//      Simply commit that transaction in that case.
		//
		//  (2) there are multiple pending transactions because one previous
		//      checkpoint was skipped. That is a rare case, but can happen
		//      for example when:
		//
		//        - the master cannot persist the metadata of the last
		//          checkpoint (temporary outage in the storage system) but
		//          could persist a successive checkpoint (the one notified here)
		//
		//        - other tasks could not persist their status during
		//          the previous checkpoint, but did not trigger a failure because they
		//          could hold onto their state and could successfully persist it in
		//          a successive checkpoint (the one notified here)
		//
		//      In both cases, the prior checkpoint never reach a committed state, but
		//      this checkpoint is always expected to subsume the prior one and cover all
		//      changes since the last successful one. As a consequence, we need to commit
		//      all pending transactions.
		//
		//  (3) Multiple transactions are pending, but the checkpoint complete notification
		//      relates not to the latest. That is possible, because notification messages
		//      can be delayed (in an extreme case till arrive after a succeeding checkpoint
		//      was triggered) and because there can be concurrent overlapping checkpoints
		//      (a new one is started before the previous fully finished).
		//
		// ==> There should never be a case where we have no pending transaction here
		//

		Iterator<Map.Entry<Long, TransactionHolder<TXN>>> pendingTransactionIterator = pendingCommitTransactions.entrySet().iterator();
		Throwable firstError = null;

		while (pendingTransactionIterator.hasNext()) {
			Map.Entry<Long, TransactionHolder<TXN>> entry = pendingTransactionIterator.next();
			Long pendingTransactionCheckpointId = entry.getKey();
			TransactionHolder<TXN> pendingTransaction = entry.getValue();
			if (pendingTransactionCheckpointId > checkpointId) {
				continue;
			}

			LOG.info("{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
				name(), checkpointId, pendingTransaction, pendingTransactionCheckpointId);

			logWarningIfTimeoutAlmostReached(pendingTransaction);
			try {
				// commit the Kafka transaction
				commit(pendingTransaction.handle);
			} catch (Throwable t) {
				if (firstError == null) {
					firstError = t;
				}
			}

			LOG.debug("{} - committed checkpoint transaction {}", name(), pendingTransaction);

			pendingTransactionIterator.remove();
		}

		if (firstError != null) {
			throw new FlinkRuntimeException("Committing one of transactions failed, logging first encountered failure",
				firstError);
		}
	}
commit ->FlinkKafkaProducer.commit
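
The interplay between snapshotState and notifyCheckpointComplete is easier to follow in a stripped-down model. The sketch below keeps only the bookkeeping: each snapshot parks the current transaction under its checkpoint id and opens a new one, and each completion notification commits every parked transaction whose checkpoint id is not greater than the notified id. Transactions are plain strings and the class name is hypothetical; error handling, timeouts, and state persistence are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the pending-transaction bookkeeping; all names are hypothetical.
class PendingCommitSketch {
    private final Map<Long, String> pendingCommitTransactions = new LinkedHashMap<>();
    private int counter = 0;
    private String currentTransaction = begin();
    final List<String> committed = new ArrayList<>();

    private String begin() {
        return "txn-" + (counter++);
    }

    // Pre-commit: park the current transaction under the checkpoint id, then open a new one.
    void snapshotState(long checkpointId) {
        pendingCommitTransactions.put(checkpointId, currentTransaction);
        currentTransaction = begin();
    }

    // Commit every parked transaction that belongs to this checkpoint or an earlier one.
    void notifyCheckpointComplete(long checkpointId) {
        Iterator<Map.Entry<Long, String>> it = pendingCommitTransactions.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, String> entry = it.next();
            if (entry.getKey() > checkpointId) {
                continue; // belongs to a later checkpoint that has not completed yet
            }
            committed.add(entry.getValue());
            it.remove();
        }
    }

    int pendingCount() {
        return pendingCommitTransactions.size();
    }

    public static void main(String[] args) {
        PendingCommitSketch sink = new PendingCommitSketch();
        sink.snapshotState(1);
        sink.snapshotState(2);
        sink.snapshotState(3);
        sink.notifyCheckpointComplete(2); // the notification for checkpoint 1 was skipped
        System.out.println("committed=" + sink.committed + " pending=" + sink.pendingCount());
    }
}
```

In the usage above, the notification for checkpoint 2 commits the transactions of checkpoints 1 and 2 together and leaves the one for checkpoint 3 pending, which corresponds to scenarios (2) and (3) in the source comment.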

FlinkKafkaProducer

1. beginTransaction

protected FlinkKafkaProducer.KafkaTransactionState beginTransaction() throws FlinkKafkaException {
		switch (semantic) {
			case EXACTLY_ONCE:
			    // create the KafkaProducer
				FlinkKafkaInternalProducer<byte[], byte[]> producer = createTransactionalProducer();
				producer.beginTransaction();
				return new FlinkKafkaProducer.KafkaTransactionState(producer.getTransactionalId(), producer);
			case AT_LEAST_ONCE:
			case NONE:
				// Do not create new producer on each beginTransaction() if it is not necessary
				final FlinkKafkaProducer.KafkaTransactionState currentTransaction = currentTransaction();
				if (currentTransaction != null && currentTransaction.producer != null) {
					return new FlinkKafkaProducer.KafkaTransactionState(currentTransaction.producer);
				}
				return new FlinkKafkaProducer.KafkaTransactionState(initNonTransactionalProducer(true));
			default:
				throw new UnsupportedOperationException("Not implemented semantic");
		}
	}
--> createTransactionalProducer -> initTransactionalProducer -> initProducer -> createProducer -> new FlinkKafkaInternalProducer
    -> new KafkaProducer  -> KafkaProducer  
--> KafkaProducer   // the following is Kafka source code
   // create the RecordAccumulator
   this.accumulator = new RecordAccumulator
   // create the sender object and start it
   this.sender = newSender(logContext, kafkaClient, this.metadata);
            String ioThreadName = NETWORK_THREAD_PREFIX + " | " + clientId;
            this.ioThread = new KafkaThread(ioThreadName, this.sender, true);
            this.ioThread.start();

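2. invoke

public void invoke(FlinkKafkaProducer.KafkaTransactionState transaction, IN next, Context context) throws FlinkKafkaException {
	// Abridged excerpt: in the full method, `record` is the ProducerRecord built by serializing `next`,
	// and `callback` is a field of the sink.
	// This step already sends the message to Kafka, because the sender thread was started when the producer
	// was created in beginTransaction. If the Kafka consumer's isolation.level is read_uncommitted (the default),
	// it can read the written data, i.e. dirty reads. Setting it to read_committed makes the consumer see only
	// committed data, at the cost of extra latency of up to roughly one checkpoint interval.
	transaction.producer.send(record, callback);
}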

3. preCommit

	protected void preCommit(FlinkKafkaProducer.KafkaTransactionState transaction) throws FlinkKafkaException {
		switch (semantic) {
			case EXACTLY_ONCE:
			case AT_LEAST_ONCE:
			    // flush: have the sender write out any records in the RecordAccumulator that have not yet reached the Kafka brokers
				flush(transaction);
				break;
			case NONE:
				break;
			default:
				throw new UnsupportedOperationException("Not implemented semantic");
		}
		checkErroneous();
	}
--> flush -> transaction.producer.flush() -> kafkaProducer.flush()-> 
   /**
     * Invoking this method makes all buffered records immediately available to send (even if <code>linger.ms</code> is
     * greater than 0) and blocks on the completion of the requests associated with these records.
     */
public void flush() {
        log.trace("Flushing accumulated records in producer.");
        this.accumulator.beginFlush();
        this.sender.wakeup();
        try {
            this.accumulator.awaitFlushCompletion();
        } catch (InterruptedException e) {
            throw new InterruptException("Flush interrupted.", e);
        }
    }
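
The contract of flush() (make buffered records sendable, then block until they are all done) can be illustrated with a toy producer. This is not how Kafka implements it: the sketch assumes a single background thread, an in-memory list standing in for the broker, and a simple in-flight counter, and every name in it is hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of "buffer + background sender + blocking flush"; all names are hypothetical.
class FlushSketch {
    private final BlockingQueue<String> accumulator = new LinkedBlockingQueue<>();
    final List<String> broker = new CopyOnWriteArrayList<>(); // stands in for the Kafka broker
    private int inFlight = 0; // guarded by "this"

    FlushSketch() {
        Thread sender = new Thread(() -> {
            try {
                while (true) {
                    String record = accumulator.take(); // wait for buffered records
                    broker.add(record);                 // "send" the record
                    synchronized (this) {
                        inFlight--;
                        notifyAll();                    // wake up a waiting flush()
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "toy-sender");
        sender.setDaemon(true);
        sender.start();
    }

    // Asynchronous: the record is only buffered here; the sender thread delivers it later.
    void send(String record) {
        synchronized (this) {
            inFlight++;
        }
        accumulator.add(record);
    }

    // Blocks until every record passed to send() so far has reached the "broker".
    synchronized void flush() {
        try {
            while (inFlight > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Flush interrupted.", e);
        }
    }

    public static void main(String[] args) {
        FlushSketch producer = new FlushSketch();
        for (int i = 0; i < 100; i++) {
            producer.send("record-" + i);
        }
        producer.flush();
        System.out.println("delivered after flush: " + producer.broker.size());
    }
}
```

This is why preCommit calls flush: once it returns, every record sent in the current transaction has been handed to the broker, so the transaction can safely be committed when the checkpoint completes.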

4. commit

protected void commit(FlinkKafkaProducer.KafkaTransactionState transaction) {
		if (transaction.isTransactional()) {
			try {
				transaction.producer.commitTransaction();
			} finally {
				recycleTransactionalProducer(transaction.producer);
			}
		}
	}
commitTransaction-> kafkaProducer.commitTransaction()
