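Chapter 1 Introduction
The previous article walked through the source code to get a rough picture of how the Producer sends messages. This article looks in detail at how the Producer obtains its metadata.
Chapter 2 Detailed Steps
2.1 The sender thread fetches metadata
Once the sender thread has started, it calls run() => runOnce() => client.poll(). client.poll() drives the Kafka client's network requests and executes the code below.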
@Override
public List<ClientResponse> poll(long timeout, long now) {
    ensureActive();
    if (!abortedSends.isEmpty()) {
        // If there are aborted sends because of unsupported version exceptions or disconnects,
        // handle them immediately without waiting for Selector#poll.
        List<ClientResponse> responses = new ArrayList<>();
        handleAbortedSends(responses);
        completeResponses(responses);
        return responses;
    }
    // NOTE: build the request for fetching metadata (MetadataRequest); returns the metadata timeout that bounds the poll below
    long metadataTimeout = metadataUpdater.maybeUpdate(now);
    try {
        // NOTE: perform the network I/O, which sends the request
        this.selector.poll(Utils.min(timeout, metadataTimeout, defaultRequestTimeoutMs));
    } catch (IOException e) {
        log.error("Unexpected error during I/O", e);
    }
    // process completed actions
    long updatedNow = this.time.milliseconds();
    List<ClientResponse> responses = new ArrayList<>();
    handleCompletedSends(responses, updatedNow);
    // NOTE: handle the received responses, including the metadata response
    handleCompletedReceives(responses, updatedNow);
    handleDisconnections(responses, updatedNow);
    handleConnections();
    handleInitiateApiVersionRequests(updatedNow);
    handleTimedOutConnections(responses, updatedNow);
    handleTimedOutRequests(responses, updatedNow);
    completeResponses(responses);
    return responses;
}
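As a small illustration of the timeout handed to selector.poll(...) above: the call blocks for at most the smallest of the caller's timeout, the metadata timeout, and the default request timeout. The sketch below is not Kafka code; the class name PollTimeoutSketch and the sample values are hypothetical and only show the three-way minimum.

```java
final class PollTimeoutSketch {
    // Smallest of the given values, in the spirit of the three-way min used in poll().
    static long min(long first, long... rest) {
        long min = first;
        for (long r : rest)
            min = Math.min(min, r);
        return min;
    }

    public static void main(String[] args) {
        long callerTimeoutMs = 5_000L;          // hypothetical timeout passed in by the caller
        long metadataTimeoutMs = 200L;          // hypothetical value returned by maybeUpdate(now)
        long defaultRequestTimeoutMs = 30_000L; // hypothetical request timeout
        // The poll would block for at most the smallest of the three values.
        System.out.println(min(callerTimeoutMs, metadataTimeoutMs, defaultRequestTimeoutMs)); // prints 200
    }
}
```

With these sample values the metadata timeout is the tightest bound, so it decides how long the poll may block.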
Handling the responses: handleCompletedReceives
private void handleCompletedReceives(List<ClientResponse> responses, long now) {
    // NOTE: iterate over all completed receives
    for (NetworkReceive receive : this.selector.completedReceives()) {
        String source = receive.source();
        InFlightRequest req = inFlightRequests.completeNext(source);
        Struct responseStruct = parseStructMaybeUpdateThrottleTimeMetrics(receive.payload(), req.header,
            throttleTimeSensor, now);
        AbstractResponse response = AbstractResponse.
            parseResponse(req.header.apiKey(), responseStruct, req.header.apiVersion());
        if (log.isDebugEnabled()) {
            log.debug("Received {} response from node {} for request with header {}: {}",
                req.header.apiKey(), req.destination, req.header, response);
        }
        // If the received response includes a throttle delay, throttle the connection.
        maybeThrottle(response, req.header.apiVersion(), req.destination, now);
        if (req.isInternalRequest && response instanceof MetadataResponse)
            // NOTE: handle the metadata response
            metadataUpdater.handleSuccessfulResponse(req.header, now, (MetadataResponse) response);
        else if (req.isInternalRequest && response instanceof ApiVersionsResponse)
            handleApiVersionsResponse(responses, req, now, (ApiVersionsResponse) response);
        else
            responses.add(req.completed(response, now));
    }
}
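The metadata is processed in metadataUpdater.handleSuccessfulResponse.
MetadataUpdater is an interface. The implementation used here is org.apache.kafka.clients.NetworkClient.DefaultMetadataUpdater, so the method that ends up being executed is the following: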
@Override
public void handleSuccessfulResponse(RequestHeader requestHeader, long now, MetadataResponse response) {
    // If any partition has leader with missing listeners, log up to ten of these partitions
    // for diagnosing broker configuration issues.
    // This could be a transient issue if listeners were added dynamically to brokers.
    List<TopicPartition> missingListenerPartitions = response.topicMetadata().stream().flatMap(topicMetadata ->
        topicMetadata.partitionMetadata().stream()
            .filter(partitionMetadata -> partitionMetadata.error == Errors.LISTENER_NOT_FOUND)
            .map(partitionMetadata -> new TopicPartition(topicMetadata.topic(), partitionMetadata.partition())))
        .collect(Collectors.toList());
    if (!missingListenerPartitions.isEmpty()) {
        int count = missingListenerPartitions.size();
        log.warn("{} partitions have leader brokers without a matching listener, including {}",
            count, missingListenerPartitions.subList(0, Math.min(10, count)));
    }
    // Check if any topic's metadata failed to get updated
    // i.e. the response contains error information
    Map<String, Errors> errors = response.errors();
    if (!errors.isEmpty())
        log.warn("Error while fetching metadata with correlation id {} : {}", requestHeader.correlationId(), errors);
    // Don't update the cluster if there are no valid nodes...the topic we want may still be in the process of being
    // created which means we will get errors and no nodes until it exists
    // the broker list is empty
    if (response.brokers().isEmpty()) {
        log.trace("Ignoring empty metadata response with correlation id {}.", requestHeader.correlationId());
        // mark the metadata update as failed
        this.metadata.failedUpdate(now);
    } else {
        // NOTE: update the metadata
        this.metadata.update(inProgress.requestVersion, response, inProgress.isPartialUpdate, now);
    }
    inProgress = null;
}
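Updating the metadata invokes org.apache.kafka.clients.producer.internals.ProducerMetadata#update. It first calls the update method of the parent class Metadata to process the response, and then wakes up the waiting threads. These are the threads that are blocked while waiting to send data, i.e. the ones parked in the wait() described in section 2.2.
Parent class (Metadata)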
public synchronized void update(int requestVersion, MetadataResponse response, boolean isPartialUpdate, long nowMs) {
    Objects.requireNonNull(response, "Metadata response cannot be null");
    if (isClosed())
        throw new IllegalStateException("Update requested after metadata close");
    this.needPartialUpdate = requestVersion < this.requestVersion;
    this.lastRefreshMs = nowMs;
    // NOTE: increment the version by 1
    this.updateVersion += 1;
    if (!isPartialUpdate) {
        this.needFullUpdate = false;
        this.lastSuccessfulRefreshMs = nowMs;
    }
    String previousClusterId = cache.clusterResource().clusterId();
    // NOTE: process the metadata response
    this.cache = handleMetadataResponse(response, isPartialUpdate, nowMs);
    Cluster cluster = cache.cluster();
    maybeSetMetadataError(cluster);
    this.lastSeenLeaderEpochs.keySet().removeIf(tp -> !retainTopic(tp.topic(), false, nowMs));
    String newClusterId = cache.clusterResource().clusterId();
    if (!Objects.equals(previousClusterId, newClusterId)) {
        log.info("Cluster ID: {}", newClusterId);
    }
    clusterResourceListeners.onUpdate(cache.clusterResource());
    log.debug("Updated cluster metadata updateVersion {} to {}", this.updateVersion, this.cache);
}
Subclass (ProducerMetadata)
@Override
public synchronized void update(int requestVersion, MetadataResponse response, boolean isPartialUpdate, long nowMs) {
    super.update(requestVersion, response, isPartialUpdate, nowMs);
    // Remove all topics in the response that are in the new topic set. Note that if an error was encountered for a
    // new topic's metadata, then any work to resolve the error will include the topic in a full metadata update.
    if (!newTopics.isEmpty()) {
        for (MetadataResponse.TopicMetadata metadata : response.topicMetadata()) {
            newTopics.remove(metadata.topic());
        }
    }
    // NOTE: wake up the waiting threads
    notifyAll();
}
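The interplay between the version counter and notifyAll() can be sketched in isolation. The following is a minimal, simplified model and not the Kafka implementation: the class VersionedMetadataSketch, its fields, and the string standing in for the cluster are all hypothetical, and the parent and subclass update steps are folded into one method. One thread records the current version and waits; a second thread, playing the role of the sender, applies an update, bumps the version, and wakes the waiter.

```java
final class VersionedMetadataSketch {
    private int updateVersion = 0;
    private String cluster = ""; // stand-in for the cached cluster metadata

    // The waiting thread remembers the version it saw before asking for an update.
    synchronized int currentVersion() {
        return updateVersion;
    }

    // The background thread calls this after a response has been processed.
    synchronized void update(String newCluster) {
        this.cluster = newCluster;
        this.updateVersion += 1;
        notifyAll(); // wake every thread blocked in awaitUpdate()
    }

    // Blocks until the version moves past lastVersion; returns false on timeout or interrupt.
    synchronized boolean awaitUpdate(int lastVersion, long timeoutMs) {
        long deadlineMs = System.currentTimeMillis() + timeoutMs;
        try {
            while (updateVersion <= lastVersion) {
                long remainingMs = deadlineMs - System.currentTimeMillis();
                if (remainingMs <= 0)
                    return false;
                wait(remainingMs); // releases the monitor so update() can run
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    synchronized String cluster() {
        return cluster;
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedMetadataSketch metadata = new VersionedMetadataSketch();
        int version = metadata.currentVersion();
        // Hypothetical "sender" thread that delivers an update after a short delay.
        Thread sender = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            metadata.update("broker-1:9092");
        }, "sketch-sender");
        sender.start();
        boolean updated = metadata.awaitUpdate(version, 2_000);
        System.out.println(updated + " " + metadata.cluster());
        sender.join();
    }
}
```

Comparing versions rather than relying on the notification alone means a waiter that arrives after the update has already happened returns immediately, and a spurious wakeup simply re-checks the condition.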
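At this point the producer's metadata has been fetched by the sender thread, which is started when the KafkaProducer is constructed. The metadata is used later, when doSend actually sends the data.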
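2.2 doSend waits for metadata before sending a message
org.apache.kafka.clients.producer.KafkaProducer#doSend calls waitOnMetadata, which requests a metadata update, wakes up the sender thread, and blocks until the metadata is available. We start reading from this method.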
private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long nowMs, long maxWaitMs) throws InterruptedException {
    // add topic to metadata topic list if it is not there already and reset expiry
    // Get the cluster metadata from the cache; on the first call there is no metadata yet
    Cluster cluster = metadata.fetch();
    if (cluster.invalidTopics().contains(topic))
        throw new InvalidTopicException(topic);
    // Add the current topic to the metadata
    metadata.add(topic, nowMs);
    // Look up the topic's partition count in the metadata; on the first call nothing is found
    Integer partitionsCount = cluster.partitionCountForTopic(topic);
    // Return cached metadata if we have it, and if the record's partition is either undefined
    // or within the known partition range
    // Not the first call: the metadata is already available, so return it
    if (partitionsCount != null && (partition == null || partition < partitionsCount))
        return new ClusterAndWaitTime(cluster, 0);
    // Remaining time to wait for metadata; initially equal to the maximum wait time
    long remainingWaitMs = maxWaitMs;
    // Time spent so far
    long elapsed = 0;
    // Issue metadata requests until we have metadata for the topic and the requested partition,
    // or until maxWaitTimeMs is exceeded. This is necessary in case the metadata
    // is stale and the number of partitions for this topic has increased in the meantime.
    do {
        if (partition != null) {
            log.trace("Requesting metadata update for partition {} of topic {}.", partition, topic);
        } else {
            log.trace("Requesting metadata update for topic {}.", topic);
        }
        metadata.add(topic, nowMs + elapsed);
        // Get the current metadata version; the version is incremented by 1 each time the Producer updates its metadata
        int version = metadata.requestUpdateForTopic(topic);
        // NOTE: wake up the sender thread; the metadata is actually fetched by the sender thread
        sender.wakeup();
        try {
            // NOTE: wait for the metadata update, i.e. wait until the sender thread has fetched the metadata
            metadata.awaitUpdate(version, remainingWaitMs);
        } catch (TimeoutException ex) {
            // Rethrow with original maxWaitMs to prevent logging exception with remainingWaitMs
            throw new TimeoutException(
                String.format("Topic %s not present in metadata after %d ms.",
                    topic, maxWaitMs));
        }
        // Fetch the metadata
        cluster = metadata.fetch();
        // Compute the time spent so far
        elapsed = time.milliseconds() - nowMs;
        // Time out if the maximum wait time has been reached
        if (elapsed >= maxWaitMs) {
            throw new TimeoutException(partitionsCount == null ?
                String.format("Topic %s not present in metadata after %d ms.",
                    topic, maxWaitMs) :
                String.format("Partition %d of topic %s with partition count %d is not present in metadata after %d ms.",
                    partition, topic, partitionsCount, maxWaitMs));
        }
        // Validate the topic (authorization, etc.)
        metadata.maybeThrowExceptionForTopic(topic);
        // Remaining wait time = maximum wait time - time already spent
        remainingWaitMs = maxWaitMs - elapsed;
        // Get the partition count
        partitionsCount = cluster.partitionCountForTopic(topic);
        // Keep looping while the metadata is still missing; exit the loop once it is available
    } while (partitionsCount == null || (partition != null && partition >= partitionsCount));
    // NOTE: return the result (cluster metadata, time spent)
    return new ClusterAndWaitTime(cluster, elapsed);
}
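The time budgeting in the loop above can be traced with concrete numbers. The sketch below is a simplified stand-in, not Kafka code: WaitBudgetSketch, waitForPartition, and demo are hypothetical names, a fake clock advances 100 ms per reading, a partition count of 0 plays the role of "topic not yet in metadata", and a return value of -1 replaces the TimeoutException. As in waitOnMetadata, the elapsed time is checked before the partition count, so an answer that arrives after the budget is spent still counts as a timeout.

```java
import java.util.function.IntSupplier;
import java.util.function.LongSupplier;

final class WaitBudgetSketch {
    // Returns the elapsed time once the reported partition count covers `partition`,
    // or -1 if maxWaitMs is used up first.
    static long waitForPartition(int partition, long maxWaitMs,
                                 LongSupplier clock, IntSupplier partitionCount) {
        long startMs = clock.getAsLong();
        long elapsed = 0;
        int count;
        do {
            // Each supplier call stands in for "request update, wait, fetch cluster".
            count = partitionCount.getAsInt();
            elapsed = clock.getAsLong() - startMs;
            if (elapsed >= maxWaitMs)
                return -1;
        } while (count <= partition);
        return elapsed;
    }

    // Runs the loop against a fake clock (+100 ms per reading) and
    // partition counts 0, 0, 3 on successive refreshes, asking for partition 2.
    static long demo(long maxWaitMs) {
        long[] now = {0};
        LongSupplier clock = () -> {
            long t = now[0];
            now[0] += 100;
            return t;
        };
        int[] counts = {0, 0, 3};
        int[] next = {0};
        IntSupplier partitionCount = () -> counts[Math.min(next[0]++, counts.length - 1)];
        return waitForPartition(2, maxWaitMs, clock, partitionCount);
    }

    public static void main(String[] args) {
        System.out.println(demo(1_000)); // prints 300: third refresh succeeds within budget
        System.out.println(demo(250));   // prints -1: budget is spent before the third refresh completes
    }
}
```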
Next, look at the implementation of awaitUpdate:
public synchronized void awaitUpdate(final int lastVersion, final long timeoutMs) throws InterruptedException {
    long currentTimeMs = time.milliseconds();
    long deadlineMs = currentTimeMs + timeoutMs < 0 ? Long.MAX_VALUE : currentTimeMs + timeoutMs;
    // NOTE: wait
    time.waitObject(this, () -> {
        // Throw fatal exceptions, if there are any. Recoverable topic errors will be handled by the caller.
        maybeThrowFatalException();
        // Use the version to decide whether to keep waiting
        return updateVersion() > lastVersion || isClosed();
    }, deadlineMs);
    if (isClosed())
        throw new KafkaException("Requested metadata update after close");
}
The actual wait() call is implemented in org.apache.kafka.common.utils.SystemTime#waitObject:
@Override
public void waitObject(Object obj, Supplier<Boolean> condition, long deadlineMs) throws InterruptedException {
    synchronized (obj) {
        while (true) {
            if (condition.get())
                return;
            long currentTimeMs = milliseconds();
            if (currentTimeMs >= deadlineMs)
                throw new TimeoutException("Condition not satisfied before deadline");
            obj.wait(deadlineMs - currentTimeMs);
        }
    }
}
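This is the code that waits for metadata before data is sent. Once the waiting thread has been woken up by the sender thread, it continues with the rest of the send logic.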