for (ProducerBatch batch : batches) {
List inflightBatchList = inFlightBatches.get(batch.topicPartition);
if (inflightBatchList == null) {
inflightBatchList = new ArrayList<>();
inFlightBatches.put(batch.topicPartition, inflightBatchList);
}
inflightBatchList.add(batch);
}
}
Step5:将抽取的 ProducerBatch 加入到 inFlightBatches 数据结构,该属性的声明如下:Map<TopicPartition, List< ProducerBatch >> inFlightBatches,即按照 topic-分区 为键,存放已抽取的 ProducerBatch,这个属性的含义就是存储待发送的消息批次。可以根据该数据结构得知在消息发送时以分区为维度反馈 Sender 线程的“积压情况”,max.in.flight.requests.per.connection 就是来控制积压的最大数量,如果积压达到这个数值,针对该队列的消息发送会限流。
Sender#sendProducerData
accumulator.resetNextBatchExpiryTime();
List expiredInflightBatches = getExpiredInflightBatches(now);
List expiredBatches = this.accumulator.expiredBatches(now);
expiredBatches.addAll(expiredInflightBatches);
Step6:从 inflightBatches 与 batches 中查找已过期的消息批次(ProducerBatch),判断是否过期的标准是系统当前时间与 ProducerBatch 创建时间之差是否超过120s,过期时间可以通过参数 delivery.timeout.ms 设置。
Sender#sendProducerData
if (!expiredBatches.isEmpty())
log.trace(“Expired {} batches in accumulator”, expiredBatches.size());
for (ProducerBatch expiredBatch : expiredBatches) {
String errorMessage = "Expiring " + expiredBatch.recordCount + " record(s) for " + expiredBatch.topicPartition
- “:” + (now - expiredBatch.createdMs) + " ms has passed since batch creation";
failBatch(expiredBatch, -1, NO_TIMESTAMP, new TimeoutException(errorMessage), false);
if (transactionManager != null && expiredBatch.inRetry()) {
// This ensures that no new batches are drained until the current in flight batches are fully resolved.
transactionManager.markSequenceUnresolved(expiredBatch.topicPartition);
}
}
Step7:处理已超时的消息批次,通知该批消息发送失败,即通过设置 KafkaProducer#send 方法返回的凭证中的 FutureRecordMetadata 中的 ProduceRequestResult result,使之调用其 get 方法不会阻塞。
Sender#sendProducerData
sensors.updateProduceRequestMetrics(batches);
Step8:收集统计指标,本文不打算详细分析,但后续会专门对 Kafka 的 Metrics 设计进行一个深入的探讨与学习。
Sender#sendProducerData
long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
pollTimeout = Math.min(pollTimeout, this.accumulator.nextExpiryTimeMs() - now);
pollTimeout = Math.max(pollTimeout, 0);
if (!result.readyNodes.isEmpty()) {
log.trace(“Nodes with data ready to send: {}”, result.readyNodes);
pollTimeout = 0;
}
Step9:设置下一次的发送延时,待补充详细分析。
Sender#sendProducerData
sendProduceRequests(batches, now);
private void sendProduceRequests(Map<Integer, List> collated, long now) {
for (Map.Entry<Integer, List> entry : collated.entrySet())
sendProduceRequest(now, entry.getKey(), acks, requestTimeoutMs, entry.getValue());
}
Step10:该步骤按照 brokerId 分别构建发送请求,即每一个 broker 会将多个 ProducerBatch 一起封装成一个请求进行发送,同一时间,每一个 与 broker 连接只会只能发送一个请求,注意,这里只是构建请求,并最终会通过 NetworkClient#send 方法,将该批数据设置到 NetworkClient 的待发送数据中,此时并没有触发真正的网络调用。
sendProducerData 方法就介绍到这里了,既然这里还没有进行真正的网络请求,那在什么时候触发呢?
我们