Kafka 2.2 Source Code Analysis: Design and Implementation of Idempotence (Exactly-Once Semantics)

zhifeng687

Last modified: 2022-03-29 00:46:57

Category: kafka; Tags: kafka

First published: 2015-12-09 17:04:05

Article link: https://blog.csdn.net/qq_26222859/article/details/50238641

Overview

An idempotent producer provides exactly-once semantics for a single producer writing to a single <topic, partition>.

Both conditions are required; neither can be dropped:

a single producer;
a single partition.
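
As a rough illustration of how idempotence is switched on from the client side, the sketch below builds the relevant producer settings as plain key/value pairs. The keys are standard Kafka producer configuration names; the class name, the helper method and the broker address are placeholders that do not come from the article.

```java
import java.util.Properties;

// Builds producer settings as plain key/value pairs; only java.util is needed.
class IdempotentProducerConfigSketch {
    static Properties idempotentProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Turns on the idempotent producer (PID plus per-partition sequence numbers).
        props.setProperty("enable.idempotence", "true");
        // Idempotence requires acks=all, retries > 0 and at most 5 in-flight requests per connection.
        props.setProperty("acks", "all");
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address, not taken from the article.
        idempotentProps("localhost:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would normally be passed to a KafkaProducer constructor together with the key and value serializers.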

Producer side

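The producer's background sending thread, Sender, starts its run() method by calling TransactionManager's shouldResetProducerStateAfterResolvingSequences() to decide whether the current PID (producer ID) has to be reset. A reset is needed when a batch for some topic-partition expired before it was acknowledged, because the sequence numbers may then no longer be contiguous: some sequence numbers have already been assigned on the producer, but the Kafka broker never received them. To guarantee idempotence, the broker only accepts a batch from a given PID when its sequence number equals the broker's cached sequence number + 1, so in this situation the producer has to reset its PID to keep idempotence intact.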
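
To make the gap problem concrete, here is a toy model of the broker-side rule. It is not Kafka code: the class, the method name and the PID values are invented for illustration, and each batch is assumed to carry a single record so that one sequence number stands for one batch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of "accept only cached sequence + 1"; not Kafka code.
class SequenceGapSketch {
    // Last accepted sequence per producer id; absent means nothing has been seen yet.
    private final Map<Long, Integer> lastSeqByPid = new HashMap<>();

    // Accepts a single-record batch only if its sequence directly follows the cached one.
    boolean tryAppend(long pid, int seq) {
        int last = lastSeqByPid.getOrDefault(pid, -1);
        if (seq != last + 1) {
            return false; // the real broker answers with an out-of-order sequence error
        }
        lastSeqByPid.put(pid, seq);
        return true;
    }

    public static void main(String[] args) {
        SequenceGapSketch broker = new SequenceGapSketch();
        long pid = 1000L;
        System.out.println(broker.tryAppend(pid, 0)); // accepted
        // Sequence 1 was assigned to a batch that expired in the producer and never arrived.
        System.out.println(broker.tryAppend(pid, 2)); // rejected, because 2 != 0 + 1
        // After a reset the producer holds a new PID and starts again from sequence 0.
        long newPid = 1001L;
        System.out.println(broker.tryAppend(newPid, 0)); // accepted
    }
}
```

Once a sequence number is lost, every later batch under the old PID is rejected, which is why resetting the PID is the only way for the producer to continue.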

Sender#run() method

void run(long now) {
    if (transactionManager != null) {
        try {
            if (transactionManager.shouldResetProducerStateAfterResolvingSequences())
                // Check if the previous run expired batches which requires a reset of the producer state.
                transactionManager.resetProducerId();
            if (!transactionManager.isTransactional()) {
                // this is an idempotent producer, so make sure we have a producer id
                maybeWaitForProducerId();
            }
            // ... (remaining branches elided)
        } catch (Exception e) {
            // ... (error handling elided)
        }
    }

    long pollTimeout = sendProducerData(now);
    client.poll(pollTimeout, now);
}

Deciding whether the PID needs to be reset

The PID is globally unique. If the client crashes and restarts, it is assigned a new PID, which is why idempotence cannot be guaranteed across sessions.

    // Checks if there are any partitions with unresolved partitions which may now be resolved. Returns true if
    // the producer id needs a reset, false otherwise.
    synchronized boolean shouldResetProducerStateAfterResolvingSequences() {
        if (isTransactional())
            // We should not reset producer state if we are transactional. We will transition to a fatal error instead.
            return false;
        for (Iterator<TopicPartition> iter = partitionsWithUnresolvedSequences.iterator(); iter.hasNext(); ) {
            TopicPartition topicPartition = iter.next();
            if (!hasInflightBatches(topicPartition)) {
                // The partition has been fully drained. At this point, the last ack'd sequence should be once less than
                // next sequence destined for the partition. If so, the partition is fully resolved. If not, we should
                // reset the sequence number if necessary.
                if (isNextSequence(topicPartition, sequenceNumber(topicPartition))) {
                    // This would happen when a batch was expired, but subsequent batches succeeded.
                    iter.remove();
                } else {
                    // We would enter this branch if all in flight batches were ultimately expired in the producer.
                    log.info("No inflight batches remaining for {}, last ack'd sequence for partition is {}, next sequence is {}. " +
                            "Going to reset producer state.", topicPartition, lastAckedSequence(topicPartition), sequenceNumber(topicPartition));
                    return true;
                }
            }
        }
        return false;
    }
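
The decision above can be condensed into a small stand-alone model. The field names follow the article, but partitions are keyed by a plain string, in-flight batches are reduced to a counter, and the "is next sequence" test is written as next - lastAcked == 1, which is my reading of isNextSequence() and should be treated as an assumption.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Simplified model keyed by partition name; everything besides the field names is illustrative.
class UnresolvedSequenceSketch {
    final Map<String, Integer> nextSequence = new HashMap<>();
    final Map<String, Integer> lastAckedSequence = new HashMap<>();
    final Map<String, Integer> inflightCount = new HashMap<>();
    final Set<String> partitionsWithUnresolvedSequences = new HashSet<>();

    boolean shouldReset() {
        Iterator<String> iter = partitionsWithUnresolvedSequences.iterator();
        while (iter.hasNext()) {
            String tp = iter.next();
            if (inflightCount.getOrDefault(tp, 0) > 0) {
                continue; // still waiting for responses, cannot decide yet
            }
            int next = nextSequence.getOrDefault(tp, 0);
            int lastAcked = lastAckedSequence.getOrDefault(tp, -1);
            if (next - lastAcked == 1) {
                iter.remove(); // no gap: the expired batch did reach the broker
            } else {
                return true;   // gap: some assigned sequences were never acknowledged
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UnresolvedSequenceSketch resolved = new UnresolvedSequenceSketch();
        resolved.partitionsWithUnresolvedSequences.add("topic-0");
        resolved.nextSequence.put("topic-0", 3);
        resolved.lastAckedSequence.put("topic-0", 2);
        System.out.println(resolved.shouldReset()); // no gap between 2 and 3

        UnresolvedSequenceSketch gap = new UnresolvedSequenceSketch();
        gap.partitionsWithUnresolvedSequences.add("topic-0");
        gap.nextSequence.put("topic-0", 3);
        gap.lastAckedSequence.put("topic-0", 0);
        System.out.println(gap.shouldReset()); // sequences 1 and 2 were never acknowledged
    }
}
```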

TransactionManager member fields

    // The base sequence of the next batch bound for a given partition.
    private final Map<TopicPartition, Integer> nextSequence;

    // The sequence of the last record of the last ack'd batch from the given partition. When there are no
    // in flight requests for a partition, the lastAckedSequence(topicPartition) == nextSequence(topicPartition) - 1.
    private final Map<TopicPartition, Integer> lastAckedSequence;

    // If a batch bound for a partition expired locally after being sent at least once, the partition is considered
    // to have an unresolved state. We keep track of such partitions here, and cannot assign any more sequence numbers
    // for this partition until the unresolved state gets cleared. This may happen if other inflight batches returned
    // successfully (indicating that the expired batch actually made it to the broker). If we don't get any successful
    // responses for the partition once the inflight request count falls to zero, we reset the producer id and
    // consequently clear this data structure as well.
    private final Set<TopicPartition> partitionsWithUnresolvedSequences;

    // Keep track of the in flight batches bound for a partition, ordered by sequence. This helps us to ensure that
    // we continue to order batches by the sequence numbers even when the responses come back out of order during
    // leader failover. We add a batch to the queue when it is drained, and remove it when the batch completes
    // (either successfully or through a fatal failure).
    private final Map<TopicPartition, PriorityQueue<ProducerBatch>> inflightBatchesBySequence;
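
The ordering role of inflightBatchesBySequence can be shown with a plain java.util.PriorityQueue. The Batch class below is a hypothetical stand-in for ProducerBatch that keeps only a base sequence, and the comparator on that field is an assumption about how the real queue is ordered.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class InflightOrderingSketch {
    // Hypothetical stand-in for ProducerBatch: only the base sequence matters here.
    static final class Batch {
        final int baseSequence;
        Batch(int baseSequence) { this.baseSequence = baseSequence; }
    }

    // Adds the batches in the given order and returns the order in which the queue hands them back.
    static List<Integer> drainOrder(int... baseSequences) {
        PriorityQueue<Batch> inflight =
                new PriorityQueue<>(Comparator.comparingInt((Batch b) -> b.baseSequence));
        for (int seq : baseSequences) {
            inflight.add(new Batch(seq));
        }
        List<Integer> order = new ArrayList<>();
        while (!inflight.isEmpty()) {
            order.add(inflight.poll().baseSequence);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(drainOrder(10, 0, 5)); // prints [0, 5, 10]
    }
}
```

Whatever order the batches are re-enqueued in, the head of the queue is always the batch with the lowest base sequence.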

Blocking until a producer ID is obtained

    private void maybeWaitForProducerId() {
        while (!forceClose && !transactionManager.hasProducerId() && !transactionManager.hasError()) {
            Node node = null;
            try {
                node = awaitLeastLoadedNodeReady(requestTimeoutMs);
                if (node != null) {
                    // Send the InitProducerId request and block until the response arrives
                    ClientResponse response = sendAndAwaitInitProducerIdRequest(node);
                    InitProducerIdResponse initProducerIdResponse = (InitProducerIdResponse) response.responseBody();
                    Errors error = initProducerIdResponse.error();
                    if (error == Errors.NONE) {
                        // Build a ProducerIdAndEpoch from the producerId and epoch in the response and store it in the TransactionManager
                        ProducerIdAndEpoch producerIdAndEpoch = new ProducerIdAndEpoch(
                                initProducerIdResponse.producerId(), initProducerIdResponse.epoch());
                        transactionManager.setProducerIdAndEpoch(producerIdAndEpoch);
                        return;
                    } else if (error.exception() instanceof RetriableException) {
                        log.debug("Retriable error from InitProducerId response", error.message());
                    } else {
                        transactionManager.transitionToFatalError(error.exception());
                        break;
                    }
                } else {
                    log.debug("Could not find an available broker to send InitProducerIdRequest to. Will back off and retry.");
                }
            } catch (UnsupportedVersionException e) {
                transactionManager.transitionToFatalError(e);
                break;
            } catch (IOException e) {
                log.debug("Broker {} disconnected while awaiting InitProducerId response", node, e);
            }
            log.trace("Retry InitProducerIdRequest in {}ms.", retryBackoffMs);
            time.sleep(retryBackoffMs);
            metadata.requestUpdate();
        }
    }
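
The control flow of this loop (return on success, back off on a retriable error, give up on a fatal one) can be sketched without any Kafka classes. The Outcome enum, the scripted list of attempts and the method signature are all invented for this illustration; the real method talks to a broker and keeps looping rather than working through a fixed list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

class ProducerIdRetrySketch {
    // Hypothetical outcome of one InitProducerId attempt.
    enum Outcome { SUCCESS, RETRIABLE, FATAL }

    // Walks through scripted outcomes; returns the PID on success,
    // or empty on a fatal error, an interrupt, or when the attempts run out.
    static OptionalLong waitForProducerId(List<Outcome> attempts, long pidOnSuccess, long backoffMs) {
        for (Outcome outcome : attempts) {
            switch (outcome) {
                case SUCCESS:
                    return OptionalLong.of(pidOnSuccess);
                case FATAL:
                    return OptionalLong.empty(); // the real code transitions to a fatal error state
                default:
                    try {
                        Thread.sleep(backoffMs); // back off before the next attempt
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return OptionalLong.empty();
                    }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        List<Outcome> script = Arrays.asList(Outcome.RETRIABLE, Outcome.RETRIABLE, Outcome.SUCCESS);
        System.out.println(waitForProducerId(script, 42L, 10L).isPresent());
    }
}
```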

Appending a RecordBatch (broker side)

  def append(batch: RecordBatch): Option[CompletedTxn] = {
    if (batch.isControlBatch) {
      val recordIterator = batch.iterator
      if (recordIterator.hasNext) {
        val record = recordIterator.next()
        val endTxnMarker = EndTransactionMarker.deserialize(record)
        val completedTxn = appendEndTxnMarker(endTxnMarker, batch.producerEpoch, batch.baseOffset, record.timestamp)
        Some(completedTxn)
      } else {
        // An empty control batch means the entire transaction has been cleaned from the log, so no need to append
        None
      }
    } else {
      append(batch.producerEpoch, batch.baseSequence, batch.lastSequence, batch.maxTimestamp, batch.baseOffset, batch.lastOffset,
        batch.isTransactional)
      None
    }
  }

Appending messages

ProducerAppendInfo#append() method (defined in ProducerStateManager.scala)

  def append(epoch: Short,
             firstSeq: Int,
             lastSeq: Int,
             lastTimestamp: Long,
             firstOffset: Long,
             lastOffset: Long,
             isTransactional: Boolean): Unit = {
    maybeValidateAppend(epoch, firstSeq)
    updatedEntry.addBatch(epoch, lastSeq, lastOffset, (lastOffset - firstOffset).toInt, lastTimestamp)

    updatedEntry.currentTxnFirstOffset match {
      case Some(_) if !isTransactional =>
        // Received a non-transactional message while a transaction is active
        throw new InvalidTxnStateException(s"Expected transactional write from producer $producerId")

      case None if isTransactional =>
        // Began a new transaction
        updatedEntry.currentTxnFirstOffset = Some(firstOffset)
        transactions += new TxnMetadata(producerId, firstOffset)

      case _ => // nothing to do
    }
  }

Validating the append

ProducerAppendInfo#maybeValidateAppend() method (defined in ProducerStateManager.scala)

  private def maybeValidateAppend(producerEpoch: Short, firstSeq: Int) = {
    validationType match {
      case ValidationType.None =>

      case ValidationType.EpochOnly =>
        checkProducerEpoch(producerEpoch)

      case ValidationType.Full =>
        checkProducerEpoch(producerEpoch)
        checkSequence(producerEpoch, firstSeq)
    }
  }

Validating the sequence

ProducerAppendInfo#checkSequence() method (defined in ProducerStateManager.scala)

  private def checkSequence(producerEpoch: Short, appendFirstSeq: Int): Unit = {
    if (producerEpoch != updatedEntry.producerEpoch) {
      if (appendFirstSeq != 0) {
        if (updatedEntry.producerEpoch != RecordBatch.NO_PRODUCER_EPOCH) {
          throw new OutOfOrderSequenceException(s"Invalid sequence number for new epoch: $producerEpoch " +
            s"(request epoch), $appendFirstSeq (seq. number)")
        } else {
          throw new UnknownProducerIdException(s"Found no record of producerId=$producerId on the broker. It is possible " +
            s"that the last message with the producerId=$producerId has been removed due to hitting the retention limit.")
        }
      }
    } else {
      val currentLastSeq = if (!updatedEntry.isEmpty)
        updatedEntry.lastSeq
      else if (producerEpoch == currentEntry.producerEpoch)
        currentEntry.lastSeq
      else
        RecordBatch.NO_SEQUENCE

      if (currentLastSeq == RecordBatch.NO_SEQUENCE && appendFirstSeq != 0) {
        // We have a matching epoch, but we do not know the next sequence number. This case can happen if
        // only a transaction marker is left in the log for this producer. We treat this as an unknown
        // producer id error, so that the producer can check the log start offset for truncation and reset
        // the sequence number. Note that this check follows the fencing check, so the marker still fences
        // old producers even if it cannot determine our next expected sequence number.
        throw new UnknownProducerIdException(s"Local producer state matches expected epoch $producerEpoch " +
          s"for producerId=$producerId, but next expected sequence number is not known.")
      } else if (!inSequence(currentLastSeq, appendFirstSeq)) {
        throw new OutOfOrderSequenceException(s"Out of order sequence number for producerId $producerId: $appendFirstSeq " +
          s"(incoming seq. number), $currentLastSeq (current end sequence number)")
      }
    }
  }
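
checkSequence() relies on an inSequence() helper that the article does not show. The Java sketch below is how I understand that check: the incoming first sequence must be exactly one greater than the last one, with a wrap from Integer.MAX_VALUE back to 0. Treat the exact condition as an assumption rather than a quotation of the Kafka source.

```java
class InSequenceSketch {
    // Value used when no sequence is known; -1 matches RecordBatch.NO_SEQUENCE.
    static final int NO_SEQUENCE = -1;

    // True if nextSeq directly follows lastSeq, allowing the wrap from Integer.MAX_VALUE to 0.
    static boolean inSequence(int lastSeq, int nextSeq) {
        // Adding 1L promotes the sum to long, so Integer.MAX_VALUE + 1 does not overflow.
        return nextSeq == lastSeq + 1L || (nextSeq == 0 && lastSeq == Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(inSequence(4, 5));                 // true
        System.out.println(inSequence(4, 6));                 // false
        System.out.println(inSequence(Integer.MAX_VALUE, 0)); // true
        System.out.println(inSequence(NO_SEQUENCE, 0));       // true
    }
}
```

The last call shows why the code above only raises UnknownProducerIdException when currentLastSeq is NO_SEQUENCE and appendFirstSeq is not 0: a first sequence of 0 directly follows -1 and is accepted.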

Reference: Kafka幂等性介绍与源码实现 (an introduction to Kafka idempotence and its source implementation)
