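Overview
An idempotent producer provides exactly-once semantics for a single producer writing to a single <topic, partition>.
Both conditions are required; neither can be dropped:
- a single producer;
- a single partition.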
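As a rough sketch of how a client opts in to this behaviour, the snippet below builds the producer settings with plain java.util.Properties. The keys are standard Kafka producer configuration names. The class name, the helper method and the broker address are placeholders chosen for illustration, and the limit of five in-flight requests assumes a client version that supports more than one in-flight request with idempotence enabled.

```java
import java.util.Properties;

final class IdempotentProducerConfigSketch {
    // Builds producer settings that enable idempotence.
    // The bootstrap address passed in is a placeholder, not a value from this article.
    static Properties idempotentProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("enable.idempotence", "true");
        // Idempotence is normally combined with acks=all, retries > 0
        // and a bounded number of in-flight requests per connection.
        props.setProperty("acks", "all");
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        Properties props = idempotentProps("localhost:9092");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

These properties would then be handed to a KafkaProducer together with the usual serializer settings.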
Producer side
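The producer's background send thread, Sender, starts each pass of run() by calling TransactionManager's shouldResetProducerStateAfterResolvingSequences() to decide whether the current PID needs to be reset. A reset is needed when a batch for some topic-partition expires (times out) before it has been fully processed, because that can leave a gap in the sequence numbers: some sequence numbers have already been assigned on the client, but the Kafka broker never received the batches that carry them. To guarantee idempotence, the broker only accepts a batch from a given PID when its sequence number equals the broker's cached sequence number + 1, so in this situation the PID has to be reset to keep idempotence intact.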
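The gap problem can be illustrated with a toy model of the broker's acceptance rule. This is a simplified sketch, not Kafka's implementation: the class, the string key and the tryAppend method are hypothetical, and duplicates are simply rejected here rather than handled the way a real broker handles them.

```java
import java.util.HashMap;
import java.util.Map;

final class SequenceGapSketch {
    // Hypothetical stand-in for broker state: last accepted sequence per (PID, partition).
    private final Map<String, Integer> lastSeqByProducerAndPartition = new HashMap<>();

    // Accepts a batch only if its first sequence is exactly lastSeq + 1
    // (an unknown producer is treated as lastSeq = -1, so it must start at 0).
    boolean tryAppend(long pid, String partition, int firstSeq, int recordCount) {
        String key = pid + "@" + partition;
        int lastSeq = lastSeqByProducerAndPartition.getOrDefault(key, -1);
        if (firstSeq != lastSeq + 1) {
            return false; // gap (or duplicate): this sketch just rejects the batch
        }
        lastSeqByProducerAndPartition.put(key, firstSeq + recordCount - 1);
        return true;
    }

    public static void main(String[] args) {
        SequenceGapSketch broker = new SequenceGapSketch();
        System.out.println(broker.tryAppend(1L, "t-0", 0, 3)); // sequences 0..2: prints true
        // Sequences 3..5 were assigned to a batch that expired and never reached the broker.
        System.out.println(broker.tryAppend(1L, "t-0", 6, 3)); // gap: prints false
        // After a reset the producer holds a new PID and starts again from sequence 0.
        System.out.println(broker.tryAppend(2L, "t-0", 0, 3)); // prints true
    }
}
```

Once the gap exists, every later batch from PID 1 would be rejected, which is why the only way forward for a non-transactional producer is a fresh PID.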
The Sender#run() method
void run(long now) {
    if (transactionManager != null) {
        try {
            if (transactionManager.shouldResetProducerStateAfterResolvingSequences())
                // Check if the previous run expired batches which requires a reset of the producer state
                // (that is, decide whether the producer id has to be reset).
                transactionManager.resetProducerId();
            if (!transactionManager.isTransactional()) {
                // this is an idempotent producer, so make sure we have a producer id
                maybeWaitForProducerId();
            }
            // ... (remainder of the try block omitted)
        } catch (Exception e) {
            // ... (handling omitted)
        }
    }
    long pollTimeout = sendProducerData(now);
    client.poll(pollTimeout, now);
}
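Deciding whether the PID needs to be reset
The PID here is globally unique. If the client crashes and restarts, it is assigned a new PID, which is why idempotence cannot be guaranteed across sessions.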
// Checks if there are any partitions with unresolved partitions which may now be resolved. Returns true if
// the producer id needs a reset, false otherwise.
synchronized boolean shouldResetProducerStateAfterResolvingSequences() {
    if (isTransactional())
        // We should not reset producer state if we are transactional. We will transition to a fatal error instead.
        return false;
    for (Iterator<TopicPartition> iter = partitionsWithUnresolvedSequences.iterator(); iter.hasNext(); ) {
        TopicPartition topicPartition = iter.next();
        if (!hasInflightBatches(topicPartition)) {
            // The partition has been fully drained. At this point, the last ack'd sequence should be one less than
            // next sequence destined for the partition. If so, the partition is fully resolved. If not, we should
            // reset the sequence number if necessary.
            if (isNextSequence(topicPartition, sequenceNumber(topicPartition))) {
                // This would happen when a batch was expired, but subsequent batches succeeded.
                iter.remove();
            } else {
                // We would enter this branch if all in flight batches were ultimately expired in the producer.
                log.info("No inflight batches remaining for {}, last ack'd sequence for partition is {}, next sequence is {}. " +
                        "Going to reset producer state.", topicPartition, lastAckedSequence(topicPartition), sequenceNumber(topicPartition));
                return true;
            }
        }
    }
    return false;
}
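To make the two branches concrete, here is a small stand-alone model of the same decision. It is a sketch under assumptions: partitions are plain strings, the in-flight state is reduced to a counter, and the "next sequence" test is taken to be next - lastAcked == 1. None of these names are Kafka's.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

final class UnresolvedSequenceSketch {
    // Hypothetical, simplified client-side bookkeeping keyed by partition name.
    final Map<String, Integer> nextSequence = new HashMap<>();
    final Map<String, Integer> lastAckedSequence = new HashMap<>();
    final Map<String, Integer> inflightCount = new HashMap<>();
    final Set<String> unresolved = new HashSet<>();

    // Returns true if some drained partition still has a gap between
    // its last acknowledged sequence and its next sequence.
    boolean shouldReset() {
        for (Iterator<String> iter = unresolved.iterator(); iter.hasNext(); ) {
            String partition = iter.next();
            if (inflightCount.getOrDefault(partition, 0) > 0) {
                continue; // responses are still pending, so nothing can be decided yet
            }
            int next = nextSequence.getOrDefault(partition, 0);
            int lastAcked = lastAckedSequence.getOrDefault(partition, -1);
            if (next - lastAcked == 1) {
                iter.remove(); // the expired batch did reach the broker: resolved
            } else {
                return true; // gap remains: the producer state must be reset
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UnresolvedSequenceSketch state = new UnresolvedSequenceSketch();
        // A batch expired, but a later batch was acknowledged up to sequence 8.
        state.nextSequence.put("t-0", 9);
        state.lastAckedSequence.put("t-0", 8);
        state.unresolved.add("t-0");
        System.out.println(state.shouldReset()); // prints false

        // Every in-flight batch after sequence 2 expired.
        state.nextSequence.put("t-1", 9);
        state.lastAckedSequence.put("t-1", 2);
        state.unresolved.add("t-1");
        System.out.println(state.shouldReset()); // prints true
    }
}
```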
TransactionManager member variables
// The base sequence of the next batch bound for a given partition.
private final Map<TopicPartition, Integer> nextSequence;
// The sequence of the last record of the last ack'd batch from the given partition. When there are no
// in flight requests for a partition, the lastAckedSequence(topicPartition) == nextSequence(topicPartition) - 1.
private final Map<TopicPartition, Integer> lastAckedSequence;
// If a batch bound for a partition expired locally after being sent at least once, the partition is considered
// to have an unresolved state. We keep track of such partitions here, and cannot assign any more sequence numbers
// for this partition until the unresolved state gets cleared. This may happen if other inflight batches returned
// successfully (indicating that the expired batch actually made it to the broker). If we don't get any successful
// responses for the partition once the inflight request count falls to zero, we reset the producer id and
// consequently clear this data structure as well.
private final Set<TopicPartition> partitionsWithUnresolvedSequences;
// Keep track of the in flight batches bound for a partition, ordered by sequence. This helps us to ensure that
// we continue to order batches by the sequence numbers even when the responses come back out of order during
// leader failover. We add a batch to the queue when it is drained, and remove it when the batch completes
// (either successfully or through a fatal failure).
private final Map<TopicPartition, PriorityQueue<ProducerBatch>> inflightBatchesBySequence;
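The effect of keeping in-flight batches in a PriorityQueue can be shown with a stripped-down batch type. This is a sketch only: the Batch class and the comparator on its base sequence are assumptions made for illustration, not the ProducerBatch class used by Kafka.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

final class InflightOrderingSketch {
    // Hypothetical, minimal batch: only the base sequence matters for ordering.
    static final class Batch {
        final int baseSequence;

        Batch(int baseSequence) {
            this.baseSequence = baseSequence;
        }
    }

    // Creates a queue whose head is always the batch with the lowest base sequence.
    static PriorityQueue<Batch> newInflightQueue() {
        return new PriorityQueue<>(5, Comparator.comparingInt((Batch b) -> b.baseSequence));
    }

    public static void main(String[] args) {
        PriorityQueue<Batch> inflight = newInflightQueue();
        // Batches are registered in an arbitrary order, for example after retries.
        inflight.add(new Batch(6));
        inflight.add(new Batch(0));
        inflight.add(new Batch(3));
        while (!inflight.isEmpty()) {
            System.out.println(inflight.poll().baseSequence); // prints 0, then 3, then 6
        }
    }
}
```

Whatever order the batches are added in, polling always yields them in sequence order, which is the property the comment above relies on when responses arrive out of order.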
Blocking until a producer id is available
private void maybeWaitForProducerId() {
    while (!forceClose && !transactionManager.hasProducerId() && !transactionManager.hasError()) {
        Node node = null;
        try {
            node = awaitLeastLoadedNodeReady(requestTimeoutMs);
            if (node != null) {
                // Send the InitProducerId request and block until the response comes back
                ClientResponse response = sendAndAwaitInitProducerIdRequest(node);
                InitProducerIdResponse initProducerIdResponse = (InitProducerIdResponse) response.responseBody();
                Errors error = initProducerIdResponse.error();
                if (error == Errors.NONE) {
                    // Take the producerId and epoch from the response, wrap them in a ProducerIdAndEpoch,
                    // and store it in the TransactionManager
                    ProducerIdAndEpoch producerIdAndEpoch = new ProducerIdAndEpoch(
                            initProducerIdResponse.producerId(), initProducerIdResponse.epoch());
                    transactionManager.setProducerIdAndEpoch(producerIdAndEpoch);
                    return;
                } else if (error.exception() instanceof RetriableException) {
                    log.debug("Retriable error from InitProducerId response", error.message());
                } else {
                    transactionManager.transitionToFatalError(error.exception());
                    break;
                }
            } else {
                log.debug("Could not find an available broker to send InitProducerIdRequest to. Will back off and retry.");
            }
        } catch (UnsupportedVersionException e) {
            transactionManager.transitionToFatalError(e);
            break;
        } catch (IOException e) {
            log.debug("Broker {} disconnected while awaiting InitProducerId response", node, e);
        }
        log.trace("Retry InitProducerIdRequest in {}ms.", retryBackoffMs);
        time.sleep(retryBackoffMs);
        metadata.requestUpdate();
    }
}
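The control flow of this loop (return on success, stop on a fatal error, back off and retry on a retriable one) can be reduced to the following sketch. The Outcome enum and the scripted iterator of responses are hypothetical stand-ins for real InitProducerId round trips.

```java
import java.util.Arrays;
import java.util.Iterator;

final class InitProducerIdRetrySketch {
    // Hypothetical outcomes of one InitProducerId round trip.
    enum Outcome { SUCCESS, RETRIABLE_ERROR, FATAL_ERROR }

    // Consumes scripted outcomes until one is final. Retriable errors trigger a backoff sleep;
    // SUCCESS and FATAL_ERROR end the loop immediately, like `return` and `break` in the method above.
    static Outcome awaitProducerId(Iterator<Outcome> responses, long retryBackoffMs) {
        while (responses.hasNext()) {
            Outcome outcome = responses.next();
            if (outcome != Outcome.RETRIABLE_ERROR) {
                return outcome;
            }
            try {
                Thread.sleep(retryBackoffMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Outcome.FATAL_ERROR; // this sketch treats interruption as terminal
            }
        }
        return Outcome.RETRIABLE_ERROR; // script exhausted without a final outcome
    }

    public static void main(String[] args) {
        Iterator<Outcome> scripted = Arrays.asList(
                Outcome.RETRIABLE_ERROR, Outcome.RETRIABLE_ERROR, Outcome.SUCCESS).iterator();
        System.out.println(awaitProducerId(scripted, 10L)); // prints SUCCESS
    }
}
```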
Appending a RecordBatch (broker side)
def append(batch: RecordBatch): Option[CompletedTxn] = {
  if (batch.isControlBatch) {
    val recordIterator = batch.iterator
    if (recordIterator.hasNext) {
      val record = recordIterator.next()
      val endTxnMarker = EndTransactionMarker.deserialize(record)
      val completedTxn = appendEndTxnMarker(endTxnMarker, batch.producerEpoch, batch.baseOffset, record.timestamp)
      Some(completedTxn)
    } else {
      // An empty control batch means the entire transaction has been cleaned from the log, so no need to append
      None
    }
  } else {
    append(batch.producerEpoch, batch.baseSequence, batch.lastSequence, batch.maxTimestamp, batch.baseOffset, batch.lastOffset,
      batch.isTransactional)
    None
  }
}
Appending messages
The append() method (class ProducerAppendInfo in ProducerStateManager.scala)
def append(epoch: Short,
           firstSeq: Int,
           lastSeq: Int,
           lastTimestamp: Long,
           firstOffset: Long,
           lastOffset: Long,
           isTransactional: Boolean): Unit = {
  maybeValidateAppend(epoch, firstSeq)
  updatedEntry.addBatch(epoch, lastSeq, lastOffset, (lastOffset - firstOffset).toInt, lastTimestamp)
  updatedEntry.currentTxnFirstOffset match {
    case Some(_) if !isTransactional =>
      // Received a non-transactional message while a transaction is active
      throw new InvalidTxnStateException(s"Expected transactional write from producer $producerId")
    case None if isTransactional =>
      // Began a new transaction
      updatedEntry.currentTxnFirstOffset = Some(firstOffset)
      transactions += new TxnMetadata(producerId, firstOffset)
    case _ => // nothing to do
  }
}
Validating the append
The maybeValidateAppend() method (class ProducerAppendInfo in ProducerStateManager.scala)
private def maybeValidateAppend(producerEpoch: Short, firstSeq: Int) = {
  validationType match {
    case ValidationType.None =>
    case ValidationType.EpochOnly =>
      checkProducerEpoch(producerEpoch)
    case ValidationType.Full =>
      checkProducerEpoch(producerEpoch)
      checkSequence(producerEpoch, firstSeq)
  }
}
Validating the sequence
The checkSequence() method (class ProducerAppendInfo in ProducerStateManager.scala)
private def checkSequence(producerEpoch: Short, appendFirstSeq: Int): Unit = {
  if (producerEpoch != updatedEntry.producerEpoch) {
    if (appendFirstSeq != 0) {
      if (updatedEntry.producerEpoch != RecordBatch.NO_PRODUCER_EPOCH) {
        throw new OutOfOrderSequenceException(s"Invalid sequence number for new epoch: $producerEpoch " +
          s"(request epoch), $appendFirstSeq (seq. number)")
      } else {
        throw new UnknownProducerIdException(s"Found no record of producerId=$producerId on the broker. It is possible " +
          s"that the last message with the producerId=$producerId has been removed due to hitting the retention limit.")
      }
    }
  } else {
    val currentLastSeq = if (!updatedEntry.isEmpty)
      updatedEntry.lastSeq
    else if (producerEpoch == currentEntry.producerEpoch)
      currentEntry.lastSeq
    else
      RecordBatch.NO_SEQUENCE
    if (currentLastSeq == RecordBatch.NO_SEQUENCE && appendFirstSeq != 0) {
      // We have a matching epoch, but we do not know the next sequence number. This case can happen if
      // only a transaction marker is left in the log for this producer. We treat this as an unknown
      // producer id error, so that the producer can check the log start offset for truncation and reset
      // the sequence number. Note that this check follows the fencing check, so the marker still fences
      // old producers even if it cannot determine our next expected sequence number.
      throw new UnknownProducerIdException(s"Local producer state matches expected epoch $producerEpoch " +
        s"for producerId=$producerId, but next expected sequence number is not known.")
    } else if (!inSequence(currentLastSeq, appendFirstSeq)) {
      throw new OutOfOrderSequenceException(s"Out of order sequence number for producerId $producerId: $appendFirstSeq " +
        s"(incoming seq. number), $currentLastSeq (current end sequence number)")
    }
  }
}
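The branches of checkSequence() can be summarised as a pure function that returns a verdict instead of throwing. This is a sketch under assumptions: epochs are plain ints, the two sentinel constants are assumed to be -1, the distinction between updatedEntry and currentEntry is collapsed into a single known state, and inSequence (which is not shown above) is assumed to accept lastSeq + 1 as well as a wrap-around from Integer.MAX_VALUE to 0.

```java
final class BrokerSequenceCheckSketch {
    // Assumed sentinel values standing in for RecordBatch.NO_SEQUENCE and RecordBatch.NO_PRODUCER_EPOCH.
    static final int NO_SEQUENCE = -1;
    static final int NO_PRODUCER_EPOCH = -1;

    enum Verdict { OK, OUT_OF_ORDER_SEQUENCE, UNKNOWN_PRODUCER_ID }

    // Assumed rule: the next sequence follows the last one, wrapping from Integer.MAX_VALUE back to 0.
    static boolean inSequence(int lastSeq, int nextSeq) {
        return nextSeq == lastSeq + 1L || (nextSeq == 0 && lastSeq == Integer.MAX_VALUE);
    }

    // knownEpoch and knownLastSeq model the broker's state for the producer;
    // requestEpoch and firstSeq come from the incoming batch.
    static Verdict check(int knownEpoch, int knownLastSeq, int requestEpoch, int firstSeq) {
        if (requestEpoch != knownEpoch) {
            if (firstSeq == 0) {
                return Verdict.OK; // a new epoch must start at sequence 0
            }
            return knownEpoch != NO_PRODUCER_EPOCH
                    ? Verdict.OUT_OF_ORDER_SEQUENCE
                    : Verdict.UNKNOWN_PRODUCER_ID;
        }
        if (knownLastSeq == NO_SEQUENCE && firstSeq != 0) {
            return Verdict.UNKNOWN_PRODUCER_ID; // epoch matches, but the next sequence is unknown
        }
        return inSequence(knownLastSeq, firstSeq) ? Verdict.OK : Verdict.OUT_OF_ORDER_SEQUENCE;
    }

    public static void main(String[] args) {
        System.out.println(check(0, 4, 0, 5));   // prints OK
        System.out.println(check(0, 4, 0, 7));   // prints OUT_OF_ORDER_SEQUENCE
        System.out.println(check(0, 4, 1, 0));   // prints OK
        System.out.println(check(-1, -1, 0, 3)); // prints UNKNOWN_PRODUCER_ID
    }
}
```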
Reference: Kafka幂等性介绍与源码实现 (an introduction to Kafka idempotence and its source implementation)