5.12 How the producer handles timed-out batches

Article link: https://blog.csdn.net/WANGCHUNHE55/article/details/129547923

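This article analyzes in detail how Kafka checks for and handles timed-out RecordBatches when sending messages. It walks through the `abortExpiredBatches` method, which iterates over the per-partition queues and checks whether the batches in them have exceeded the configured request timeout. A batch that has timed out is marked as expired, its resources are released, and a TimeoutException is passed to the user's callback so that the user can handle it. The process touches on request timeouts, the retry strategy, and memory management.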


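This brings us back to the Sender thread. When we outlined its processing flow there was a step 6 that we never analyzed. At the time we only said that it abandons timed-out batches, so let's now look at exactly how that is done.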

         /**
         *     How are timed-out batches handled?
         *
         */
        List<RecordBatch> expiredBatches = this.accumulator.abortExpiredBatches(this.requestTimeout, now);

Let's step into the method and see how it works.

 public List<RecordBatch> abortExpiredBatches(int requestTimeout, long now) {
        List<RecordBatch> expiredBatches = new ArrayList<>();
        int count = 0;
        // iterate over batches
        for (Map.Entry<TopicPartition, Deque<RecordBatch>> entry : this.batches.entrySet()) {
            // get each partition's queue, i.e. the batches waiting for that partition
            Deque<RecordBatch> dq = entry.getValue();
            TopicPartition tp = entry.getKey();
            // We only check if the batch should be expired if the partition does not have a batch in flight.
            // This is to prevent later batches from being expired while an earlier batch is still in progress.
            // Note that `muted` is only ever populated if `max.in.flight.requests.per.connection=1` so this protection
            // is only active in this case. Otherwise the expiration order is not guaranteed.
            if (!muted.contains(tp)) {
                synchronized (dq) {
                    // iterate over the batches and expire them if they have been in the accumulator for more than requestTimeOut
                    RecordBatch lastBatch = dq.peekLast();
                    // iterate over the batches in this partition's queue
                    Iterator<RecordBatch> batchIterator = dq.iterator();
                    while (batchIterator.hasNext()) {
                        RecordBatch batch = batchIterator.next();
                        boolean isFull = batch != lastBatch || batch.records.isFull();
                        // check if the batch is expired
                        // has this batch timed out?
                        if (batch.maybeExpire(requestTimeout, retryBackoffMs, now, this.lingerMs, isFull)) {
                            expiredBatches.add(batch);
                            count++;
                            batchIterator.remove();
                            deallocate(batch);
                        } else {
                            // Stop at the first batch that has not expired.
                            break;
                        }
                    }
                }
            }
        }
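The loop above only ever expires batches from the head of each queue and stops at the first batch that is still alive. A minimal sketch of that pattern, assuming a simplified queue that stores only creation timestamps (`HeadExpiry` and `drainExpired` are hypothetical names, not part of Kafka), might look like this:

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the "expire from the head, stop at the first
// live entry" loop. Each queue element is just a creation timestamp in ms.
final class HeadExpiry {
    /** Removes expired entries from the head of the queue and returns them. */
    static List<Long> drainExpired(Deque<Long> createdTimes, long timeoutMs, long now) {
        List<Long> expired = new ArrayList<>();
        Iterator<Long> it = createdTimes.iterator();
        while (it.hasNext()) {
            long createdMs = it.next();
            if (timeoutMs < now - createdMs) {
                expired.add(createdMs);
                it.remove(); // same idea as batchIterator.remove()
            } else {
                // Stop at the first entry that has not expired, even if
                // later entries happen to be older.
                break;
            }
        }
        return expired;
    }
}
```

Because of the `break`, an entry that sits behind a live one stays in the queue until a later pass, which keeps the expiry order within one queue the same as the queue order.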

Next we need to look at maybeExpire to see the criteria it uses to decide that a batch has timed out.

public boolean maybeExpire(int requestTimeoutMs, long retryBackoffMs, long now, long lingerMs, boolean isFull) {
        boolean expire = false;
        String errorMessage = null;
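        /**
         * requestTimeoutMs: the request timeout (request.timeout.ms), 30 s by default and configurable
         * now: the current time
         * lastAppendTime: time of the most recent append to this batch (it starts as the batch creation time)
         * If now - this.lastAppendTime exceeds 30 s, the full batch has timed out without being sent
         */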
        if (!this.inRetry() && isFull && requestTimeoutMs < (now - this.lastAppendTime)) {
            expire = true;
            // record the error message
            errorMessage = (now - this.lastAppendTime) + " ms has passed since last append";
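            /**
             * lingerMs: how long a batch that is not yet full may wait before it is sent anyway
             *           (linger.ms; 100 ms here, note that the Kafka default is 0)
             * createdMs: the time the batch was created
             *
             * If now - (this.createdMs + lingerMs) exceeds 30 s, the batch has timed out
             */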
        } else if (!this.inRetry() && requestTimeoutMs < (now - (this.createdMs + lingerMs))) {
            expire = true;
            errorMessage = (now - (this.createdMs + lingerMs)) + " ms has passed since batch creation plus linger time";
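            /**
             * For batches that are being retried:
             * lastAttemptMs: the time of the last send attempt
             * retryBackoffMs: the backoff interval between retries
             * If now - (this.lastAttemptMs + retryBackoffMs) exceeds 30 s, the batch has timed out
             */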
        } else if (this.inRetry() && requestTimeoutMs < (now - (this.lastAttemptMs + retryBackoffMs))) {
            expire = true;
            errorMessage = (now - (this.lastAttemptMs + retryBackoffMs)) + " ms has passed since last attempt plus backoff time";
        }

        if (expire) {
            this.records.close();
            // the batch has expired, so call done
            // and pass in a TimeoutException describing the timeout
            this.done(-1L, Record.NO_TIMESTAMP,
                      new TimeoutException("Expiring " + recordCount + " record(s) for " + topicPartition + " due to " + errorMessage));
        }

        return expire;
    }
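To make the three branches easier to experiment with, here is a minimal sketch that restates them as a pure function. `ExpiryCheck` and `expiryReason` are hypothetical names introduced for illustration; the parameters stand in for the fields of RecordBatch, and the messages are copied from the method above.

```java
// Hypothetical stand-alone restatement of the three expiry branches.
// It returns the reason string, or null if the batch has not expired.
final class ExpiryCheck {
    static String expiryReason(boolean inRetry, boolean isFull,
                               long requestTimeoutMs, long retryBackoffMs, long lingerMs,
                               long createdMs, long lastAppendMs, long lastAttemptMs,
                               long now) {
        if (!inRetry && isFull && requestTimeoutMs < now - lastAppendMs) {
            // full batch, never sent: measured from the last append
            return (now - lastAppendMs) + " ms has passed since last append";
        } else if (!inRetry && requestTimeoutMs < now - (createdMs + lingerMs)) {
            // batch not full: measured from creation time plus linger time
            return (now - (createdMs + lingerMs))
                    + " ms has passed since batch creation plus linger time";
        } else if (inRetry && requestTimeoutMs < now - (lastAttemptMs + retryBackoffMs)) {
            // batch being retried: measured from last attempt plus backoff
            return (now - (lastAttemptMs + retryBackoffMs))
                    + " ms has passed since last attempt plus backoff time";
        }
        return null;
    }
}
```

With a 30 000 ms request timeout and a 100 ms linger time, a batch created at time 0 that is neither full nor in retry is still alive at `now = 30100` and expires at `now = 30101`, because the comparison is strict.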

Finally we arrive at the familiar done method. This time it takes its other branch.

else {// this branch is taken now, because the exception is no longer null
                    // the exception is passed to the callback, so the user's
                    // code receives the TimeoutException there and can handle
                    // it in whatever way suits the application
                    thunk.callback.onCompletion(null, exception);
                }
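The following is a minimal sketch of this dispatch, not Kafka's actual code. The `Callback` interface here is a simplified stand-in (the real one is org.apache.kafka.clients.producer.Callback and takes a RecordMetadata), and `DoneSketch` is a hypothetical name. It shows both branches and assumes that an exception thrown by one user callback should not stop the remaining callbacks from running.

```java
import java.util.List;

// Hypothetical, simplified model of how a finished batch notifies the
// callbacks that were registered for its records.
final class DoneSketch {
    /** Stand-in for the user callback; metadata is a plain String here. */
    interface Callback {
        void onCompletion(String metadata, Exception exception);
    }

    static void done(List<Callback> callbacks, String metadataOnSuccess, Exception exception) {
        for (Callback callback : callbacks) {
            try {
                if (exception == null) {
                    // success branch: hand over the metadata, no exception
                    callback.onCompletion(metadataOnSuccess, null);
                } else {
                    // failure branch: no metadata, hand over the exception
                    callback.onCompletion(null, exception);
                }
            } catch (Exception e) {
                // a failing user callback is reported and then skipped
                System.err.println("Error executing user-provided callback: " + e);
            }
        }
    }
}
```

With the real client, the callback is the second argument of `producer.send(record, callback)`, so that is where user code can check whether the exception is a TimeoutException and decide to log, alert, or resend.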

Looking back at abortExpiredBatches, we can see that when a batch fails with this exception its memory is also cleaned up:

   // remove the batch from the queue
    batchIterator.remove();
   // release its resources
    deallocate(batch);

In the end, the design leaves it to the user to decide how to handle the timeout.