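This naturally brings us back to the Sender thread. The earlier overview of its processing flow included a step 6 that we never analyzed. It was described as abandoning timed-out batches, so let's look at how that step actually works.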
/**
* How are timed-out batches handled?
*
*/
List<RecordBatch> expiredBatches = this.accumulator.abortExpiredBatches(this.requestTimeout, now);
Let's step into the method to see how it works:
public List<RecordBatch> abortExpiredBatches(int requestTimeout, long now) {
List<RecordBatch> expiredBatches = new ArrayList<>();
int count = 0;
// Iterate over batches (one entry per partition)
for (Map.Entry<TopicPartition, Deque<RecordBatch>> entry : this.batches.entrySet()) {
// Get the deque for this partition; it holds the partition's batches
Deque<RecordBatch> dq = entry.getValue();
TopicPartition tp = entry.getKey();
// We only check if the batch should be expired if the partition does not have a batch in flight.
// This is to prevent later batches from being expired while an earlier batch is still in progress.
// Note that `muted` is only ever populated if `max.in.flight.requests.per.connection=1` so this protection
// is only active in this case. Otherwise the expiration order is not guaranteed.
if (!muted.contains(tp)) {
synchronized (dq) {
// iterate over the batches and expire them if they have been in the accumulator for more than requestTimeOut
RecordBatch lastBatch = dq.peekLast();
// Iterate over the batches in this partition's deque
Iterator<RecordBatch> batchIterator = dq.iterator();
while (batchIterator.hasNext()) {
RecordBatch batch = batchIterator.next();
boolean isFull = batch != lastBatch || batch.records.isFull();
// check if the batch is expired
// Decide whether the batch has timed out
if (batch.maybeExpire(requestTimeout, retryBackoffMs, now, this.lingerMs, isFull)) {
expiredBatches.add(batch);
count++;
batchIterator.remove();
deallocate(batch);
} else {
// Stop at the first batch that has not expired.
break;
}
}
}
}
}
return expiredBatches;
}
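The loop above walks each partition's deque from the head and stops at the first batch that has not expired, so a later batch is never expired ahead of an earlier one. Below is a minimal standalone sketch of that pattern. `HeadFirstExpirySketch`, `Batch`, and the single age check are hypothetical simplifications for illustration, not Kafka's classes or its real expiry rule.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class HeadFirstExpirySketch {

    // Hypothetical stand-in for a batch: only a name and a creation time.
    static final class Batch {
        final String name;
        final long createdMs;

        Batch(String name, long createdMs) {
            this.name = name;
            this.createdMs = createdMs;
        }
    }

    /** Removes and returns the names of expired batches, stopping at the first batch that is still fresh. */
    static List<String> abortExpired(Deque<Batch> dq, long timeoutMs, long now) {
        List<String> expired = new ArrayList<>();
        synchronized (dq) {
            Iterator<Batch> it = dq.iterator();
            while (it.hasNext()) {
                Batch batch = it.next();
                if (timeoutMs < now - batch.createdMs) {
                    expired.add(batch.name);
                    it.remove(); // drop it from the deque through the iterator
                } else {
                    break; // everything behind this batch is left untouched
                }
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        Deque<Batch> dq = new ArrayDeque<>();
        dq.add(new Batch("a", 0));
        dq.add(new Batch("b", 10_000));
        dq.add(new Batch("c", 45_000));
        dq.add(new Batch("d", 46_000));

        // With a 30 s timeout at now = 50 s, "a" and "b" are too old; the scan stops at "c".
        System.out.println(abortExpired(dq, 30_000, 50_000));
        System.out.println(dq.size() + " batches remain");
    }
}
```

Removing through the iterator matters here: calling `dq.remove(batch)` inside the loop would risk a `ConcurrentModificationException` on many collection types.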
Next, look at maybeExpire to see the criteria it uses to decide that a batch has timed out:
public boolean maybeExpire(int requestTimeoutMs, long retryBackoffMs, long now, long lingerMs, boolean isFull) {
boolean expire = false;
String errorMessage = null;
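/**
* requestTimeoutMs: request timeout, 30 s by default and configurable
* now: the current time
* lastAppendTime: the last time a record was appended to this batch
*                 (initially the creation time; also reset on retry)
* If now - this.lastAppendTime exceeds 30 s, the batch has timed out without being sent
*/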
if (!this.inRetry() && isFull && requestTimeoutMs < (now - this.lastAppendTime)) {
expire = true;
// Record the error message
errorMessage = (now - this.lastAppendTime) + " ms has passed since last append";
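/**
* lingerMs: the linger.ms setting (100 ms here; the Kafka default is 0), the longest a batch
*           waits before it is sent even if it is not full
* createdMs: the time the batch was created
*
* If now - (this.createdMs + lingerMs) exceeds 30 s, the batch has timed out
*/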
} else if (!this.inRetry() && requestTimeoutMs < (now - (this.createdMs + lingerMs))) {
expire = true;
errorMessage = (now - (this.createdMs + lingerMs)) + " ms has passed since batch creation plus linger time";
/**
* Retry case
* lastAttemptMs: the time of the last send attempt
* retryBackoffMs: the interval between retries
* If now - (this.lastAttemptMs + retryBackoffMs) exceeds 30 s, the batch has timed out
*/
} else if (this.inRetry() && requestTimeoutMs < (now - (this.lastAttemptMs + retryBackoffMs))) {
expire = true;
errorMessage = (now - (this.lastAttemptMs + retryBackoffMs)) + " ms has passed since last attempt plus backoff time";
}
if (expire) {
this.records.close();
// On timeout, call done(),
// passing in a TimeoutException to signal the timeout
this.done(-1L, Record.NO_TIMESTAMP,
new TimeoutException("Expiring " + recordCount + " record(s) for " + topicPartition + " due to " + errorMessage));
}
return expire;
}
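To make the three branches easier to compare, here is a minimal standalone sketch that reproduces only the timing logic of maybeExpire as a pure function. `ExpirySketch` and `expiryReason` are hypothetical names introduced for illustration; the sample values assume a 30 s request timeout, a 100 ms retry backoff, and a 100 ms linger time.

```java
import java.util.Optional;

final class ExpirySketch {

    /**
     * Returns which rule would expire the batch, or empty if the batch is kept.
     * The three conditions mirror the branches of maybeExpire shown above.
     */
    static Optional<String> expiryReason(boolean inRetry, boolean isFull,
                                         long requestTimeoutMs, long retryBackoffMs, long lingerMs,
                                         long createdMs, long lastAppendMs, long lastAttemptMs,
                                         long now) {
        if (!inRetry && isFull && requestTimeoutMs < now - lastAppendMs) {
            return Optional.of("last append");
        } else if (!inRetry && requestTimeoutMs < now - (createdMs + lingerMs)) {
            return Optional.of("creation plus linger");
        } else if (inRetry && requestTimeoutMs < now - (lastAttemptMs + retryBackoffMs)) {
            return Optional.of("last attempt plus backoff");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        long timeout = 30_000, backoff = 100, linger = 100;

        // Full batch whose last append was 30,001 ms ago: first branch.
        System.out.println(expiryReason(false, true, timeout, backoff, linger, 0, 0, 0, 30_001));

        // Non-full batch exactly at timeout + linger: kept, because the comparison is strict.
        System.out.println(expiryReason(false, false, timeout, backoff, linger, 0, 0, 0, 30_100));

        // One millisecond later the second branch fires.
        System.out.println(expiryReason(false, false, timeout, backoff, linger, 0, 0, 0, 30_101));

        // Batch in retry, last attempt at t = 1,000 ms, checked at t = 31,101 ms: third branch.
        System.out.println(expiryReason(true, false, timeout, backoff, linger, 0, 0, 1_000, 31_101));
    }
}
```

Note that a non-full batch gets `lingerMs` of extra allowance before the timeout clock starts, and a batch in retry gets `retryBackoffMs`.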
Finally we reach the familiar done method, but this time it takes its other branch:
else { // This branch is taken this time because exception is not null.
// The exception is passed to the user's callback, so user code
// receives the TimeoutException there and decides how to handle it
// according to its own needs.
thunk.callback.onCompletion(null, exception);
}
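On the user side, the callback then receives null metadata and a non-null exception. The sketch below shows one way a callback might separate timeouts from other failures. To stay runnable without the Kafka client on the classpath it uses stand-ins: `SendCallback` is a hypothetical interface shaped like Kafka's `Callback.onCompletion(RecordMetadata, Exception)`, and `java.util.concurrent.TimeoutException` stands in for Kafka's own `TimeoutException`. Collecting keys in a list is likewise only an illustrative policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeoutException;

final class CallbackSketch {

    // Hypothetical stand-in shaped like Kafka's Callback.onCompletion(RecordMetadata, Exception).
    interface SendCallback {
        void onCompletion(String metadata, Exception exception);
    }

    // Keys of records whose batch expired; an application might re-send or persist these.
    static final List<String> timedOut = new ArrayList<>();

    static SendCallback callbackFor(String recordKey) {
        return (metadata, exception) -> {
            if (exception == null) {
                return; // success: nothing to recover
            }
            if (exception instanceof TimeoutException) {
                timedOut.add(recordKey); // the batch expired before it was sent
            } else {
                System.err.println("send failed for " + recordKey + ": " + exception);
            }
        };
    }

    public static void main(String[] args) {
        // Simulates what done(...) does for an expired batch: null metadata, exception set.
        callbackFor("order-42").onCompletion(null, new TimeoutException("Expiring 1 record(s)"));
        System.out.println("timed out: " + timedOut);
    }
}
```

With the real client, the same branching would live in the lambda passed to `producer.send(record, callback)`.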
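Going back to abortExpiredBatches, note that a batch that fails with this exception also has its memory cleaned up:
// Remove the batch from the partition's deque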
batchIterator.remove();
// Release its resources (the buffer goes back to the pool)
deallocate(batch);
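In the end, the design is the same as before: the producer cleans up its own state and leaves the handling of the timed-out records to the user.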