5.1 Loading metadata in the producer

一只王多鱼的分享

Category: Kafka producer source code walkthrough. Tags: java, database, big data

Article link: https://blog.csdn.net/WANGCHUNHE55/article/details/129429404

Next we walk through the steps we outlined earlier, one at a time.

Step one is fetching the metadata.

This is the corresponding code:

  // first make sure the metadata for the topic is available
            /**
             * Step 1: block synchronously until the metadata is available.
             * maxBlockTimeMs is the maximum time we are allowed to wait.
             */
            ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
            // clusterAndWaitTime.waitedOnMetadataMs is how long the metadata fetch took.
            // maxBlockTimeMs minus the time already spent = the time still available.
            long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
            // Take the cluster metadata from the result.
            Cluster cluster = clusterAndWaitTime.cluster;
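
The time-budget arithmetic in this snippet can be looked at in isolation. The following is a minimal sketch rather than Kafka code: the class name `WaitBudget` and the method `remaining` are hypothetical and exist only to show that the leftover budget is clamped at zero once the metadata wait has used up the whole allowance.

```java
// Hypothetical helper (not part of Kafka): shows how a single blocking budget
// is reduced by the time already spent waiting for metadata.
final class WaitBudget {
    // Returns the time still available after the metadata wait, never below zero.
    static long remaining(long maxBlockTimeMs, long waitedOnMetadataMs) {
        return Math.max(0, maxBlockTimeMs - waitedOnMetadataMs);
    }

    public static void main(String[] args) {
        // Most of the budget is still available.
        System.out.println(remaining(60_000, 1_500));
        // The wait overshot the budget, so nothing is left.
        System.out.println(remaining(1_000, 1_200));
    }
}
```

Clamping with `Math.max(0, ...)` means later steps never see a negative timeout, even if the metadata wait ran slightly over.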

5.1.1 Waiting for the metadata fetch

Let's look at the waitOnMetadata method:

private ClusterAndWaitTime waitOnMetadata(String topic, Integer partition, long maxWaitMs) throws InterruptedException {
    // add topic to metadata topic list if it is not there already and reset expiry
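    // Add the current topic to the metadata's topic list.
    metadata.add(topic);
    // We are following a scenario-driven walkthrough: at this point the producer has just finished initializing,
    // so this cluster holds no real metadata yet, only the bootstrap addresses we configured in code.
    Cluster cluster = metadata.fetch();
    // Look up the partition count for this topic in the cached cluster metadata.
    // Since this is the first time this code runs, there is no partition information yet.
    Integer partitionsCount = cluster.partitionCountForTopic(topic);
    // Return cached metadata if we have it, and if the record's partition is either undefined
    // or within the known partition range
    // If partition information was found in the cached metadata...
    // (on the first call this branch is not taken)
    if (partitionsCount != null && (partition == null || partition < partitionsCount))
        // ...return the cached cluster with a wait time of 0.
        return new ClusterAndWaitTime(cluster, 0);
    // Reaching this point means we really do need to fetch metadata from the broker.
    // Record the current time.
    long begin = time.milliseconds();
    // Remaining time; it starts out as the maximum wait time.
    long remainingWaitMs = maxWaitMs;
    // Time spent so far.
    long elapsed;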
    // Issue metadata requests until we have metadata for the topic or maxWaitTimeMs is exceeded.
    // In case we already have cached metadata for the topic, but the requested partition is greater
    // than expected, issue an update request only once. This is necessary in case the metadata
    // is stale and the number of partitions for this topic has increased in the meantime.
    do {
        log.trace("Requesting metadata update for topic {}.", topic);
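        // requestUpdate() does two things:
        // 1. It returns the current metadata version. The producer versions its metadata,
        //    and the version is incremented on every update.
        // 2. It sets the needUpdate flag to true.
        int version = metadata.requestUpdate();
        /**
         * Important: wake up the sender thread here,
         * because the metadata fetch itself is performed by the sender thread.
         */
        sender.wakeup();
        try {
            // Wait for the metadata to be updated.
            metadata.awaitUpdate(version, remainingWaitMs);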
        } catch (TimeoutException ex) {
            // Rethrow with original maxWaitMs to prevent logging exception with remainingWaitMs
            throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
        }
        // Try to read the cluster metadata again.
        cluster = metadata.fetch();
        // Compute how long the metadata fetch has taken so far.
        elapsed = time.milliseconds() - begin;
        // If the elapsed time has reached the maximum wait time, report a timeout.
        if (elapsed >= maxWaitMs)
            throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
        // Metadata was fetched, but we are not authorized for the topic.
        if (cluster.unauthorizedTopics().contains(topic))
            throw new TopicAuthorizationException(topic);
        // Compute the time still available.
        remainingWaitMs = maxWaitMs - elapsed;
        // Try to read the partition count for the topic we want to send to.
        // A non-null value means the sender thread has already fetched the metadata.
        partitionsCount = cluster.partitionCountForTopic(topic);
        // Once the metadata is available, the loop exits.
    } while (partitionsCount == null);

    if (partition != null && partition >= partitionsCount) {
        throw new KafkaException(
                String.format("Invalid partition given with record: %d is not in the range [0...%d).", partition, partitionsCount));
    }
    // Finally, return an object with two fields:
    // cluster: the cluster metadata
    // elapsed: how long the metadata fetch took
    return new ClusterAndWaitTime(cluster, elapsed);
}

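That completes waitOnMetadata, but we still haven't seen how the metadata is actually loaded.

Earlier we saw this piece of code:

 // Wait for the metadata to be updated.
                // This blocks synchronously
                // until the sender thread has fetched the metadata.
                metadata.awaitUpdate(version, remainingWaitMs);

Let's step into it: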

 public synchronized void awaitUpdate(final int lastVersion, final long maxWaitMs) throws InterruptedException {
        if (maxWaitMs < 0) {
            throw new IllegalArgumentException("Max time to wait for metadata updates should not be < 0 milli seconds");
        }
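        // Record the current time.
        long begin = System.currentTimeMillis();
        // Time still available; initially the maximum wait time.
        long remainingWaitMs = maxWaitMs;
        // If the current version is less than or equal to the version we saw last,
        // the metadata has not been updated yet:
        // a successful update on the sender thread always increments the version.
        while (this.version <= lastVersion) {
            // If there is time left...
            if (remainingWaitMs != 0)
                // ...block the current thread.
                // Our guess at this point: once the sender thread updates the metadata, it wakes this thread up.
                wait(remainingWaitMs);
            // Reaching here means we were either woken up or the wait timed out.
            // Compute the elapsed time.
            long elapsed = System.currentTimeMillis() - begin;
            // Already timed out.
            if (elapsed >= maxWaitMs)
                // Throw a timeout exception.
                throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
            // Recompute the time still available.
            remainingWaitMs = maxWaitMs - elapsed;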
        }
    }

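Here we can see everything the wait method does. It also follows the pattern of letting low-level code throw exceptions upward so that the core flow catches them, which is worth learning from. But so far nothing has actually fetched the metadata.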
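
The handshake between the caller and the updating thread can be reproduced in a few lines. This is a simplified sketch under stated assumptions, not the real Metadata class: `VersionedState` and its methods are hypothetical names, and it throws the checked `java.util.concurrent.TimeoutException` instead of Kafka's own TimeoutException so that the example needs no Kafka dependency.

```java
import java.util.concurrent.TimeoutException;

// Simplified stand-in for the version / wait / notify handshake (not Kafka's Metadata class).
final class VersionedState {
    private int version = 0;
    private boolean needUpdate = false;

    // Caller side: flag that a refresh is wanted and return the version seen right now.
    synchronized int requestUpdate() {
        needUpdate = true;
        return version;
    }

    // Caller side: block until the version moves past lastVersion or the budget runs out.
    synchronized void awaitUpdate(int lastVersion, long maxWaitMs)
            throws InterruptedException, TimeoutException {
        if (maxWaitMs < 0)
            throw new IllegalArgumentException("maxWaitMs must not be negative");
        long begin = System.currentTimeMillis();
        long remainingMs = maxWaitMs;
        while (version <= lastVersion) {
            if (remainingMs != 0)
                wait(remainingMs); // releases the monitor while waiting
            long elapsed = System.currentTimeMillis() - begin;
            if (elapsed >= maxWaitMs)
                throw new TimeoutException("No update after " + maxWaitMs + " ms");
            remainingMs = maxWaitMs - elapsed;
        }
    }

    // Updater side: publish a new version and wake every waiting thread.
    synchronized void update() {
        needUpdate = false;
        version += 1;
        notifyAll();
    }

    synchronized int version() { return version; }

    synchronized boolean updateRequested() { return needUpdate; }

    public static void main(String[] args) throws Exception {
        VersionedState state = new VersionedState();
        int seen = state.requestUpdate();
        // Plays the role of the sender thread: updates the state a little later.
        Thread updater = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            state.update();
        });
        updater.start();
        state.awaitUpdate(seen, 2_000);
        System.out.println("version is now " + state.version());
        updater.join();
    }
}
```

Looping on the version rather than waiting once matters here: `wait` can return before any update has happened, so the condition is re-checked and the remaining budget recomputed after every wake-up.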

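5.1.2 Fetching the metadata

The actual metadata fetch is performed by the sender thread. Recall where that thread is started: it is started during producer initialization. If you have forgotten, go back and review the producer initialization flow.

 // Sender implements Runnable; the producer runs it on its background I/O thread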
            this.sender = new Sender(client,
                    this.metadata,
                    this.accumulator,
                    config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION) == 1,
                    config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG),
                    (short) parseAcks(config.getString(ProducerConfig.ACKS_CONFIG)),
                    config.getInt(ProducerConfig.RETRIES_CONFIG),
                    this.metrics,
                    new SystemTime(),
                    clientId,
                    this.requestTimeoutMs);

This is where the sender is created. Step into it.

Because it runs as a thread, go straight to its run method:

 public void run() {
        log.debug("Starting Kafka producer I/O thread.");

        // main loop, runs until close is called
        // The code is effectively an endless loop,
        // so once started the sender thread keeps running until close is called.
        while (running) {
            try {
                // Run one iteration.
                run(time.milliseconds());
            } catch (Exception e) {
                log.error("Uncaught error in kafka producer I/O thread: ", e);
            }
        }
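
The shape of this loop, a Runnable that keeps calling a single-iteration method until a flag is cleared, can be sketched independently of Kafka. Everything below is illustrative: `LoopingWorker`, `runOnce`, and `initiateClose` are hypothetical names, and the one-millisecond sleep merely stands in for real work.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy background loop (not the real Sender): runs one pass at a time until asked to stop.
final class LoopingWorker implements Runnable {
    // volatile so that a stop request from another thread is seen by the loop.
    private volatile boolean running = true;
    private final AtomicInteger iterations = new AtomicInteger();

    @Override
    public void run() {
        while (running) {
            try {
                runOnce(System.currentTimeMillis());
            } catch (Exception e) {
                // Report and keep looping, so one failed pass does not end the thread.
                System.err.println("Uncaught error in worker loop: " + e);
            }
        }
    }

    // One pass of the loop; a real I/O thread would send data and poll the network here.
    void runOnce(long now) throws InterruptedException {
        iterations.incrementAndGet();
        Thread.sleep(1);
    }

    void initiateClose() { running = false; }

    int iterations() { return iterations.get(); }

    public static void main(String[] args) throws InterruptedException {
        LoopingWorker worker = new LoopingWorker();
        Thread ioThread = new Thread(worker, "toy-io-thread");
        ioThread.setDaemon(true);
        ioThread.start();
        Thread.sleep(50);
        worker.initiateClose();
        ioThread.join();
        System.out.println("loop ran " + worker.iterations() + " times");
    }
}
```

Catching exceptions inside the loop body, as the excerpt above does, is what lets the thread survive a failure in a single pass.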

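Let's step into the core run method:

 void run(long now) {
        // Fetch the cached metadata.
        // In our scenario this is the first pass, so no metadata has been fetched yet
        // and this cluster is empty.
        // Without metadata, the rest of this method has nothing to act on,
        // because everything below depends on the metadata.
        // So jump straight to the last line of this method:
        // that is the call that fetches the metadata.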
        Cluster cluster = metadata.fetch();
        // get the list of partitions with data ready to send
        RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

        // if there are any partitions whose leaders are not known yet, force metadata update
        if (!result.unknownLeaderTopics.isEmpty()) {
            // The set of topics with unknown leader contains topics with leader election pending as well as
            // topics which may have expired. Add the topic again to metadata to ensure it is included
            // and request metadata update, since there are messages to send to the topic.
            for (String topic : result.unknownLeaderTopics)
                this.metadata.add(topic);
            this.metadata.requestUpdate();
        }

        // remove any nodes we aren't ready to send to
        Iterator<Node> iter = result.readyNodes.iterator();
        long notReadyTimeout = Long.MAX_VALUE;
        while (iter.hasNext()) {
            Node node = iter.next();
            if (!this.client.ready(node, now)) {
                iter.remove();
                notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now));
            }
        }

        // create produce requests
        Map<Integer, List<RecordBatch>> batches = this.accumulator.drain(cluster,
                                                                         result.readyNodes,
                                                                         this.maxRequestSize,
                                                                         now);
        if (guaranteeMessageOrder) {
            // Mute all the partitions drained
            for (List<RecordBatch> batchList : batches.values()) {
                for (RecordBatch batch : batchList)
                    this.accumulator.mutePartition(batch.topicPartition);
            }
        }

        List<RecordBatch> expiredBatches = this.accumulator.abortExpiredBatches(this.requestTimeout, now);
        // update sensors
        for (RecordBatch expiredBatch : expiredBatches)
            this.sensors.recordErrors(expiredBatch.topicPartition.topic(), expiredBatch.recordCount);

        sensors.updateProduceRequestMetrics(batches);
        List<ClientRequest> requests = createProduceRequests(batches, now);
        // If we have any nodes that are ready to send + have sendable data, poll with 0 timeout so this can immediately
        // loop and try sending more data. Otherwise, the timeout is determined by nodes that have partitions with data
        // that isn't yet sendable (e.g. lingering, backing off). Note that this specifically does not include nodes
        // with sendable data that aren't ready to send since they would cause busy looping.
        long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
        if (result.readyNodes.size() > 0) {
            log.trace("Nodes with data ready to send: {}", result.readyNodes);
            log.trace("Created {} produce requests: {}", requests.size(), requests);
            pollTimeout = 0;
        }
        for (ClientRequest request : requests)
            client.send(request, now);

        // if some partitions are already ready to be sent, the select time would be 0;
        // otherwise if some partition already has some data accumulated but not ready yet,
        // the select time will be the time difference between now and its linger expiry time;
        // otherwise the select time will be the time difference between now and the metadata expiry time;
        // The key call is this one:
        // this is the method that fetches the metadata.
        this.client.poll(pollTimeout, now);
    }

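Let's look at the poll method:

public List<ClientResponse> poll(long timeout, long now);

Stepping in, we see that it is only an interface method.

Let's look at the client field:

 private final KafkaClient client;

It is a KafkaClient:

public interface KafkaClient extends Closeable

This is still an interface, so we look at its implementations.

Let's take the NetworkClient implementation and go straight to its poll method: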

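    @Override
    public List<ClientResponse> poll(long timeout, long now) {
        // Step 1: build the request that fetches the metadata.
        long metadataTimeout = metadataUpdater.maybeUpdate(now);
        try {
            /**
             * This call goes into Kafka's network layer. We have not covered the network module yet,
             * so there is no need to dig in for now. It is enough to know roughly how the metadata
             * is obtained; we will come back to the network handling when we reach that module.
             */

            // Step 2: send the request and perform the actual network I/O.
            // No need to look closely yet; just note that the network request is sent here.
            this.selector.poll(Utils.min(timeout, metadataTimeout, requestTimeoutMs));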
        } catch (IOException e) {
            log.error("Unexpected error during I/O", e);
        }

        // process completed actions
        
        long updatedNow = this.time.milliseconds();
        List<ClientResponse> responses = new ArrayList<>();
        handleCompletedSends(responses, updatedNow);
        // Step 3: handle the responses, which contain the metadata we need.
        handleCompletedReceives(responses, updatedNow);
        handleDisconnections(responses, updatedNow);
        handleConnections();
        handleTimedOutRequests(responses, updatedNow);

        // invoke callbacks
        for (ClientResponse response : responses) {
            if (response.request().hasCallback()) {
                try {
                    response.request().callback().onComplete(response);
                } catch (Exception e) {
                    log.error("Uncaught error in request completion:", e);
                }
            }
        }

        return responses;
    }

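There are three steps in total.

Start with step one:

interface MetadataUpdater
long maybeUpdate(long now);

Stepping in, we see that it is an interface.

Let's look at its implementations.

We want the first implementation (in the Kafka source this appears to be DefaultMetadataUpdater, the inner class of NetworkClient):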

 @Override
        public long maybeUpdate(long now) {
            // should we update our metadata?
            long timeToNextMetadataUpdate = metadata.timeToNextUpdate(now);
            long timeToNextReconnectAttempt = Math.max(this.lastNoNodeAvailableMs + metadata.refreshBackoff() - now, 0);
            long waitForMetadataFetch = this.metadataFetchInProgress ? Integer.MAX_VALUE : 0;
            // if there is no node available to connect, back off refreshing metadata
            long metadataTimeout = Math.max(Math.max(timeToNextMetadataUpdate, timeToNextReconnectAttempt),
                    waitForMetadataFetch);

            if (metadataTimeout == 0) {
                // Beware that the behavior of this method and the computation of timeouts for poll() are
                // highly dependent on the behavior of leastLoadedNode.
                Node node = leastLoadedNode(now);
                // This call builds the metadata request.
                maybeUpdate(now, node);
            }

            return metadataTimeout;
        }
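
The timeout arithmetic in this method is easier to follow with concrete numbers. The sketch below restates it as a pure function; `MetadataTimeouts` and the parameter names are hypothetical, and the inputs that the real code reads from fields are passed in explicitly. A result of 0 is the only case in which a metadata request would be attempted.

```java
// Hypothetical pure-function restatement of the timeout arithmetic in maybeUpdate(now).
final class MetadataTimeouts {
    static long metadataTimeout(long timeToNextUpdateMs,
                                long lastNoNodeAvailableMs,
                                long refreshBackoffMs,
                                boolean fetchInProgress,
                                long nowMs) {
        // Back off for refreshBackoffMs after the last time no node was available.
        long timeToNextReconnectAttempt =
                Math.max(lastNoNodeAvailableMs + refreshBackoffMs - nowMs, 0);
        // While a fetch is already in flight, effectively wait "forever".
        long waitForMetadataFetch = fetchInProgress ? Integer.MAX_VALUE : 0;
        // The largest of the three delays decides when a request may be sent.
        return Math.max(Math.max(timeToNextUpdateMs, timeToNextReconnectAttempt),
                        waitForMetadataFetch);
    }

    public static void main(String[] args) {
        // Update due, no recent connection failure, nothing in flight: send now.
        System.out.println(metadataTimeout(0, 0, 100, false, 1_000));
        // Update due, but a node was unavailable 50 ms ago with a 100 ms backoff.
        System.out.println(metadataTimeout(0, 950, 100, false, 1_000));
        // A fetch is already in progress.
        System.out.println(metadataTimeout(0, 0, 100, true, 1_000));
    }
}
```

Because the three delays are combined with `max`, any single reason to wait is enough to hold back a new metadata request.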

The part we care about here is the call maybeUpdate(now, node), so let's go straight to that code:

private void maybeUpdate(long now, Node node) {
            if (node == null) {
                log.debug("Give up sending metadata request since no node is available");
                // mark the timestamp for no node available to connect
                this.lastNoNodeAvailableMs = now;
                return;
            }
            String nodeConnectionId = node.idString();
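            // Check whether the network connection is ready.
            // We have not analyzed the network module yet,
            // so assume for now that the connection is already established.
            if (canSendRequest(nodeConnectionId)) {
                this.metadataFetchInProgress = true;
                MetadataRequest metadataRequest;
                if (metadata.needMetadataForAllTopics())
                    // Build a request for the metadata of all topics.
                    // Normally, though, we only fetch metadata for the topics
                    // we are actually sending to.
                    metadataRequest = MetadataRequest.allTopics();
                else
                    // So this is the branch we take:
                    // request metadata only for the topics we are sending to.
                    metadataRequest = new MetadataRequest(new ArrayList<>(metadata.topics()));
                // This creates the client request that carries the metadata request.
                ClientRequest clientRequest = request(now, nodeConnectionId, metadataRequest);
                log.debug("Sending metadata request {} to node {}", metadataRequest, node.id());
                // Send the request.
                // How it is actually sent is left for the network module analysis;
                // for now, note that this stores the request to be sent.
                doSend(clientRequest, now);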
            } else if (connectionStates.canConnect(nodeConnectionId, now)) {
                // we don't have a connection to this node right now, make one
                log.debug("Initialize connection to node {} for sending metadata request", node.id());
                initiateConnect(node, now);
                // If initiateConnect failed immediately, this node will be put into blackout and we
                // should allow immediately retrying in case there is another candidate node. If it
                // is still connecting, the worst case is that we end up setting a longer timeout
                // on the next round and then wait for the response.
            } else { // connected, but can't send more OR connecting
                // In either case, we just need to wait for a network event to let us know the selected
                // connection might be usable again.
                this.lastNoNodeAvailableMs = now;
            }
        }

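That is everything step one does.

Step two mainly concerns the network layer and the actual sending of the request, so we leave it for later.

Now for step three.

Look at the handleCompletedReceives(responses, updatedNow); method: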

    private void handleCompletedReceives(List<ClientResponse> responses, long now) {
        for (NetworkReceive receive : this.selector.completedReceives()) {
            String source = receive.source();
            ClientRequest req = inFlightRequests.completeNext(source);
            Struct body = parseResponse(receive.payload(), req.request().header());
            // Metadata responses are handled by the metadata updater;
            // every other response is added to the list returned to the caller.
            if (!metadataUpdater.maybeHandleCompletedReceive(req, now, body))
                responses.add(new ClientResponse(req, now, false, body));
        }
    }

Step into it:

interface MetadataUpdater
 boolean maybeHandleCompletedReceive(ClientRequest request, long now, Struct body);

It is the same interface again,

so once more we look at the first implementation:

public boolean maybeHandleCompletedReceive(ClientRequest req, long now, Struct body) {
            short apiKey = req.request().header().apiKey();
            if (apiKey == ApiKeys.METADATA.id && req.isInitiatedByNetworkClient()) {
                // Handle the metadata response.
                handleResponse(req.request().header(), body, now);
                return true;
            }
            return false;
        }
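
The routing rule seen here, where metadata responses requested by the client itself are consumed internally and everything else is handed back to the caller, can be modelled with plain collections. This is a toy sketch: `ResponseRouting`, `Completed`, and `handledInternally` are hypothetical names, and the constant 3 is used only as a label for the Metadata API key.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "internal responses are consumed, the rest are returned to the caller".
final class ResponseRouting {
    // Numeric key of the Metadata API in the Kafka protocol; here it is just a label.
    static final short METADATA_API_KEY = 3;

    static final class Completed {
        final short apiKey;
        final boolean initiatedByClient; // true when the network client itself sent the request
        final String body;

        Completed(short apiKey, boolean initiatedByClient, String body) {
            this.apiKey = apiKey;
            this.initiatedByClient = initiatedByClient;
            this.body = body;
        }
    }

    // Bodies of the metadata responses that were handled internally.
    final List<String> handledInternally = new ArrayList<>();

    // Returns true if the response was consumed as an internal metadata response.
    boolean maybeHandle(Completed c) {
        if (c.apiKey == METADATA_API_KEY && c.initiatedByClient) {
            handledInternally.add(c.body);
            return true;
        }
        return false;
    }

    // Only responses that were not consumed internally are handed back.
    List<Completed> route(List<Completed> completed) {
        List<Completed> responses = new ArrayList<>();
        for (Completed c : completed) {
            if (!maybeHandle(c))
                responses.add(c);
        }
        return responses;
    }

    public static void main(String[] args) {
        ResponseRouting routing = new ResponseRouting();
        List<Completed> completed = new ArrayList<>();
        completed.add(new Completed(METADATA_API_KEY, true, "metadata"));
        completed.add(new Completed((short) 0, false, "produce"));
        List<Completed> out = routing.route(completed);
        System.out.println(out.size() + " passed on, "
                + routing.handledInternally.size() + " handled internally");
    }
}
```

Both conditions are needed: a metadata response to a request that some other component issued is still passed on to that caller.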

The main thing to look at is how the response is handled:


        private void handleResponse(RequestHeader header, Struct body, long now) {
            this.metadataFetchInProgress = false;
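            // The broker sends back a binary data structure,
            // so the producer has to parse it;
            // the parsed result is wrapped in a MetadataResponse object.
            MetadataResponse response = new MetadataResponse(body);
            // The response carries the metadata:
            // this is the cluster metadata fetched from the broker.
            Cluster cluster = response.cluster();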
            // check if any topics metadata failed to get updated
            Map<String, Errors> errors = response.errors();
            if (!errors.isEmpty())
                log.warn("Error while fetching metadata with correlation id {} : {}", header.correlationId(), errors);

            // don't update the cluster if there are no valid nodes...the topic we want may still be in the process of being
            // created which means we will get errors and no nodes until it exists
            // If the metadata was fetched successfully...
            if (cluster.nodes().size() > 0) {
                // ...update the cached metadata.
                this.metadata.update(cluster, now);
            } else {
                log.trace("Ignoring empty metadata response with correlation id {}.", header.correlationId());
                this.metadata.failedUpdate(now);
            }
        }

Let's look at how the metadata is updated:

public synchronized void update(Cluster cluster, long now) {
        Objects.requireNonNull(cluster, "cluster should not be null");

        this.needUpdate = false;
        this.lastRefreshMs = now;
        this.lastSuccessfulRefreshMs = now;
        this.version += 1;
        // topicExpiryEnabled defaults to true, so this block runs by default.
        if (topicExpiryEnabled) {
            // Handle expiry of topics from the metadata refresh set.
            // But topics is currently empty,
            // so the loop below does not run.
            for (Iterator<Map.Entry<String, Long>> it = topics.entrySet().iterator(); it.hasNext(); ) {
                Map.Entry<String, Long> entry = it.next();
                long expireMs = entry.getValue();
                if (expireMs == TOPIC_EXPIRY_NEEDS_UPDATE)
                    entry.setValue(now + TOPIC_EXPIRY_MS);
                else if (expireMs <= now) {
                    it.remove();
                    log.debug("Removing unused topic {} from the metadata list, expiryMs {} now {}", entry.getKey(), expireMs, now);
                }
            }
        }

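Stepping in, we find that this update method is the same update method we saw at the very beginning.

When execution enters this method for the second time, it does the following: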

   public synchronized void update(Cluster cluster, long now) {
        Objects.requireNonNull(cluster, "cluster should not be null");

        this.needUpdate = false;
        this.lastRefreshMs = now;
        this.lastSuccessfulRefreshMs = now;
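        /**
         * Note that the metadata version is incremented here.
         */
        this.version += 1;
        // topicExpiryEnabled defaults to true, so this block runs by default.
        if (topicExpiryEnabled) {
            // Handle expiry of topics from the metadata refresh set.
            // The first time through, topics was empty,
            // so the loop below did not run.

            // This time is the second pass:
            // producer.send needs metadata -> sender thread -> execution reaches here.
            // By now topics is no longer empty; it has been populated,
            // so the loop below does run.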
            for (Iterator<Map.Entry<String, Long>> it = topics.entrySet().iterator(); it.hasNext(); ) {
                Map.Entry<String, Long> entry = it.next();
                long expireMs = entry.getValue();
                if (expireMs == TOPIC_EXPIRY_NEEDS_UPDATE)
                    entry.setValue(now + TOPIC_EXPIRY_MS);
                else if (expireMs <= now) {
                    it.remove();
                    log.debug("Removing unused topic {} from the metadata list, expiryMs {} now {}", entry.getKey(), expireMs, now);
                }
            }
        }

        for (Listener listener: listeners)
            listener.onMetadataUpdate(cluster);

        String previousClusterId = cluster.clusterResource().clusterId();
        // needMetadataForAllTopics defaults to false, so this branch does not run.
        if (this.needMetadataForAllTopics) {
            // the listener may change the interested topics, which could cause another metadata refresh.
            // If we have already fetched all topics, however, another fetch should be unnecessary.
            this.needUpdate = false;
            this.cluster = getClusterForCurrentTopics(cluster);
        } else {
            // So this is the branch that runs:
            // the cluster passed in is assigned directly to this.cluster,
            // which represents the Kafka cluster metadata.
            this.cluster = cluster;
        }

        // The bootstrap cluster is guaranteed not to have any useful information
        if (!cluster.isBootstrapConfigured()) {
            String clusterId = cluster.clusterResource().clusterId();
            if (clusterId == null ? previousClusterId != null : !clusterId.equals(previousClusterId))
                log.info("Cluster ID: {}", cluster.clusterResource().clusterId());
            clusterResourceListeners.onUpdate(cluster.clusterResource());
        }
        // Most importantly, this wakes up the thread we saw waiting in the previous section,
        // that is, the wait() inside awaitUpdate.
        notifyAll();
        log.debug("Updated cluster metadata version {} to {}", this.version, this.cluster);
    }
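
The topic-expiry loop inside update can be tried out on its own. The sketch below copies the loop's logic onto a plain map; `TopicExpiry` and `refresh` are hypothetical names, and the two constant values (a sentinel of -1 and a five-minute expiry) are assumptions made for illustration rather than values confirmed by the excerpt above.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Standalone copy of the expiry loop's logic, with assumed constant values.
final class TopicExpiry {
    static final long TOPIC_EXPIRY_NEEDS_UPDATE = -1L;   // assumed sentinel value
    static final long TOPIC_EXPIRY_MS = 5 * 60 * 1000;   // assumed expiry window

    // Gives flagged topics a fresh deadline and drops topics whose deadline has passed.
    static void refresh(Map<String, Long> topics, long now) {
        for (Iterator<Map.Entry<String, Long>> it = topics.entrySet().iterator(); it.hasNext(); ) {
            Map.Entry<String, Long> entry = it.next();
            long expireMs = entry.getValue();
            if (expireMs == TOPIC_EXPIRY_NEEDS_UPDATE)
                entry.setValue(now + TOPIC_EXPIRY_MS);
            else if (expireMs <= now)
                it.remove(); // removing through the iterator is safe during iteration
        }
    }

    public static void main(String[] args) {
        Map<String, Long> topics = new HashMap<>();
        topics.put("orders", TOPIC_EXPIRY_NEEDS_UPDATE); // flagged: gets a new deadline
        topics.put("stale", 500L);                       // deadline already passed
        topics.put("active", 5_000L);                    // deadline still in the future
        refresh(topics, 1_000L);
        System.out.println(topics);
    }
}
```

Removal goes through `it.remove()` rather than `topics.remove(...)`, which avoids a ConcurrentModificationException while the entry set is being iterated.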