(4.4) Kafka Producer Source Code: the Sender Thread Class

猿来如此dj

Last modified 2022-07-15 15:27:22

First published 2022-07-12 17:12:17

Article link: https://blog.csdn.net/weixin_43930865/article/details/125746312


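1: Sender

Sender runs on a thread separate from the thread that calls KafkaProducer, and is woken up to send data when the sending conditions are met.
It is the background thread that handles sending produce requests to the Kafka cluster: it issues metadata requests to refresh its view of the cluster, then sends produce requests to the appropriate nodes.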

1.1: Fields

    /* the network client used to talk to the brokers; it tracks the state of each node's connection */
    private final KafkaClient client;

    /* the record accumulator that batches records */
    private final RecordAccumulator accumulator;

    /* the metadata for the client */
    private final Metadata metadata;

    /* the flag indicating whether the producer should guarantee the message order on the broker or not. */
    private final boolean guaranteeMessageOrder;

    /* the maximum request size to attempt to send to the server */
    private final int maxRequestSize;

    /* the number of acknowledgements to request from the server */
    private final short acks;

    /* the number of times to retry a failed send */
    private final int retries;

    /* the clock instance used for getting the time */
    private final Time time;

    /* true while the sender thread is still running */
    private volatile boolean running;

    /* true when the caller wants to ignore all unsent/inflight messages and force close.  */
    private volatile boolean forceClose;

    /* metrics */
    private final SenderMetrics sensors;

    /* the maximum time to wait for the server to respond to a request */
    private final int requestTimeout;

    /* the time to wait before retrying a failed request (a backoff in milliseconds, not a retry count) */
    private final long retryBackoffMs;

    /* current request API versions supported by the known brokers */
    private final ApiVersions apiVersions;

    /* all the state related to transactions, in particular the producer id, producer epoch, and sequence numbers */
    private final TransactionManager transactionManager;
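
Several of these fields are filled from producer configuration. The sketch below is an illustration only: SenderSettings is a hypothetical class, not part of Kafka, and its fallback values are assumed to match the defaults of the 1.x-era producer that this listing appears to come from (acks=1, retries=0, max.in.flight.requests.per.connection=5, request.timeout.ms=30000, retry.backoff.ms=100). It shows how the configuration keys could map onto acks, retries, guaranteeMessageOrder, requestTimeout, and retryBackoffMs.

```java
import java.util.Properties;

/** Hypothetical holder used only for illustration; it is not a Kafka class. */
final class SenderSettings {
    final short acks;
    final int retries;
    final boolean guaranteeMessageOrder;
    final int requestTimeoutMs;
    final long retryBackoffMs;

    private SenderSettings(short acks, int retries, boolean guaranteeMessageOrder,
                           int requestTimeoutMs, long retryBackoffMs) {
        this.acks = acks;
        this.retries = retries;
        this.guaranteeMessageOrder = guaranteeMessageOrder;
        this.requestTimeoutMs = requestTimeoutMs;
        this.retryBackoffMs = retryBackoffMs;
    }

    /** Reads producer-style keys; the fallback values are assumed defaults. */
    static SenderSettings from(Properties props) {
        String acksText = props.getProperty("acks", "1");
        // "all" is an alias for -1: wait for the full set of in-sync replicas.
        short acks = "all".equals(acksText) ? (short) -1 : Short.parseShort(acksText);
        int retries = Integer.parseInt(props.getProperty("retries", "0"));
        int maxInFlight = Integer.parseInt(
                props.getProperty("max.in.flight.requests.per.connection", "5"));
        int requestTimeoutMs = Integer.parseInt(props.getProperty("request.timeout.ms", "30000"));
        long retryBackoffMs = Long.parseLong(props.getProperty("retry.backoff.ms", "100"));
        // Ordering is only guaranteed when a single request may be in flight per connection.
        boolean guaranteeMessageOrder = maxInFlight == 1;
        return new SenderSettings(acks, retries, guaranteeMessageOrder, requestTimeoutMs, retryBackoffMs);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("acks", "all");
        props.setProperty("max.in.flight.requests.per.connection", "1");
        SenderSettings s = SenderSettings.from(props);
        System.out.println("acks=" + s.acks + ", ordered=" + s.guaranteeMessageOrder);
    }
}
```

Note that retries and retryBackoffMs are different things: the first is a count, the second is a delay between attempts.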

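1.2: Methods

run(long now) is one iteration of the sender loop, called repeatedly by the thread's run() method. It builds the requests, queues them for the Selector, and then processes them.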

 void run(long now) {
        if (transactionManager != null) {/* transaction manager checks; it is not enabled by default, in which case it is null */
            try {
                if (transactionManager.shouldResetProducerStateAfterResolvingSequences())
                    // Check if the previous run expired batches which requires a reset of the producer state.
                    transactionManager.resetProducerId();

                if (!transactionManager.isTransactional()) {
                    // this is an idempotent producer, so make sure we have a producer id
                    maybeWaitForProducerId();
                } else if (transactionManager.hasUnresolvedSequences() && !transactionManager.hasFatalError()) {
                    transactionManager.transitionToFatalError(new KafkaException("The client hasn't received acknowledgment for " +
                            "some previously sent messages and can no longer retry them. It isn't safe to continue."));
                } else if (transactionManager.hasInFlightTransactionalRequest() || maybeSendTransactionalRequest(now)) {
                    // as long as there are outstanding transactional requests, we simply wait for them to return
                    client.poll(retryBackoffMs, now);
                    return;
                }

                // do not continue sending if the transaction manager is in a failed state or if there
                // is no producer id (for the idempotent case).
                if (transactionManager.hasFatalError() || !transactionManager.hasProducerId()) {
                    RuntimeException lastError = transactionManager.lastError();
                    if (lastError != null)
                        maybeAbortBatches(lastError);
                    client.poll(retryBackoffMs, now);
                    return;
                } else if (transactionManager.hasAbortableError()) {
                    accumulator.abortUndrainedBatches(transactionManager.lastError());
                }
            } catch (AuthenticationException e) {
                // This is already logged as error, but propagated here to perform any clean ups.
                log.trace("Authentication exception while processing transactional request: {}", e);
                transactionManager.authenticationFailed(e);
            }
        }
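        /*1: the key step is sendProducerData: it collects the ready batches, finds the partition leaders to send to, and builds ClientRequests that are queued for the Selector*/
        long pollTimeout = sendProducerData(now);
        /*2: the requests are now built; client.poll() drives Kafka's own Selector.poll(), which uses java.nio.channels.Selector underneath, to write the queued ClientRequests and read the responses*/
        client.poll(pollTimeout, now);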
    }
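
The relationship between the thread's run() loop and the single iteration above can be sketched with a toy model. Everything here is hypothetical (MiniSender, its queue, and the 50 ms wait are made up for illustration); it only mirrors the shape that the running and forceClose fields suggest: loop while running, then keep going until pending work is finished unless a forced close was requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for the sender loop; the names are illustrative, not Kafka's. */
final class MiniSender implements Runnable {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    // Written only by the sender thread; callers read it after join().
    private final List<String> sent = new ArrayList<>();
    private volatile boolean running = true;
    private volatile boolean forceClose = false;

    void enqueue(String record) { pending.add(record); }

    /** Graceful close: stop the main loop but still send what is pending. */
    void initiateClose() { running = false; }

    /** Forced close: stop the main loop and drop what is pending. */
    void forceClose() { forceClose = true; running = false; }

    List<String> sent() { return sent; }

    @Override
    public void run() {
        // Main loop: one iteration per pass until a close is requested.
        while (running) {
            runOnce(50);
        }
        // Shutdown phase: finish pending work unless the close was forced.
        while (!forceClose && !pending.isEmpty()) {
            runOnce(0);
        }
    }

    /** One iteration: wait up to timeoutMs for a record, then "send" it. */
    private void runOnce(long timeoutMs) {
        try {
            String record = pending.poll(timeoutMs, TimeUnit.MILLISECONDS);
            if (record != null) {
                sent.add(record);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            forceClose = true;
            running = false;
        }
    }

    /** Starts a sender thread, enqueues three records, closes gracefully, returns what was sent. */
    static List<String> demo() {
        MiniSender sender = new MiniSender();
        Thread thread = new Thread(sender, "mini-sender");
        thread.start();
        sender.enqueue("a");
        sender.enqueue("b");
        sender.enqueue("c");
        sender.initiateClose();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sender.sent();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both flags are volatile so that a close requested from another thread becomes visible to the loop without locking.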

1.3: The two main methods, sendProducerData and sendProduceRequest


    /* builds the produce requests and hands them to the network client */
    private long sendProducerData(long now) {
        /* get the current cached cluster metadata (topic -> partitions -> partition leader -> ISR); this call does not itself trigger an update */
        Cluster cluster = metadata.fetch();

        // get the list of partitions with data ready to send
        /* walk through the batches to find which ones can be sent, and the partition leader node for each of them */
        RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

        // if there are any partitions whose leaders are not known yet, force metadata update
        if (!result.unknownLeaderTopics.isEmpty()) {
            // The set of topics with unknown leader contains topics with leader election pending as well as
            // topics which may have expired. Add the topic again to metadata to ensure it is included
            // and request metadata update, since there are messages to send to the topic.
            for (String topic : result.unknownLeaderTopics)
                this.metadata.add(topic);
            this.metadata.requestUpdate();
        }

        // remove any nodes we are not ready to send to
        Iterator<Node> iter = result.readyNodes.iterator();
        long notReadyTimeout = Long.MAX_VALUE;
        while (iter.hasNext()) {
            Node node = iter.next();

            /* check whether the client is ready to send to this node; if there is no connection yet, this initiates one */
            if (!this.client.ready(node, now)) {
                iter.remove();
                notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now));
            }
        }

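        // drain decides which batches are actually sent
        // create produce requests: all batches bound for the same broker are grouped together (keyed by node id) to reduce network I/O and the number of requests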
        Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes,
                this.maxRequestSize, now);
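        /* message ordering: true only when max.in.flight.requests.per.connection is 1; the default is 5, so this is false by default */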
        if (guaranteeMessageOrder) {
            // Mute all the partitions drained: mark each partition as having a batch in flight, so no further batch is drained from it until that one completes.
            for (List<ProducerBatch> batchList : batches.values()) {
                for (ProducerBatch batch : batchList)
                    this.accumulator.mutePartition(batch.topicPartition);
            }
        }
        /* batches that have expired */
        List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(this.requestTimeout, now);
        // Reset the producer id if an expired batch has previously been sent to the broker. Also update the metrics
        // for expired batches. see the documentation of @TransactionState.resetProducerId to understand why
        // we need to reset the producer id here.
        if (!expiredBatches.isEmpty())
            log.trace("Expired {} batches in accumulator", expiredBatches.size());
        for (ProducerBatch expiredBatch : expiredBatches) {
            failBatch(expiredBatch, -1, NO_TIMESTAMP, expiredBatch.timeoutException(), false);
            if (transactionManager != null && expiredBatch.inRetry()) {
                // This ensures that no new batches are drained until the current in flight batches are fully resolved.
                transactionManager.markSequenceUnresolved(expiredBatch.topicPartition);
            }
        }

        sensors.updateProduceRequestMetrics(batches);

        // If we have any nodes that are ready to send + have sendable data, poll with 0 timeout so this can immediately
        // loop and try sending more data. Otherwise, the timeout is determined by nodes that have partitions with data
        // that isn't yet sendable (e.g. lingering, backing off). Note that this specifically does not include nodes
        // with sendable data that aren't ready to send since they would cause busy looping.
        long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
        if (!result.readyNodes.isEmpty()) {
            log.trace("Nodes with data ready to send: {}", result.readyNodes);
            // if some partitions are already ready to be sent, the select time would be 0;
            // otherwise if some partition already has some data accumulated but not ready yet,
            // the select time will be the time difference between now and its linger expiry time;
            // otherwise the select time will be the time difference between now and the metadata expiry time;
            pollTimeout = 0;
        }
        /*2: build the produce requests and queue them for sending*/
        sendProduceRequests(batches, now);

        return pollTimeout;
    }
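
The poll timeout returned by sendProducerData follows a small rule that is easy to lose in the listing. The sketch below restates it in isolation, assuming only what the code above shows; PollTimeoutSketch and its parameter names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical restatement of the timeout rule in sendProducerData. */
final class PollTimeoutSketch {

    /** Smallest connection delay among nodes that were not ready, or Long.MAX_VALUE if there were none. */
    static long notReadyTimeout(List<Long> connectionDelaysMs) {
        long timeout = Long.MAX_VALUE;
        for (long delay : connectionDelaysMs) {
            timeout = Math.min(timeout, delay);
        }
        return timeout;
    }

    /**
     * If any node is ready and has sendable data, poll with 0 so the loop runs again at once.
     * Otherwise wait until the next batch becomes ready or a connection can be retried,
     * whichever comes first.
     */
    static long pollTimeout(boolean anyNodeReady, long nextReadyCheckDelayMs, long notReadyTimeoutMs) {
        if (anyNodeReady) {
            return 0L;
        }
        return Math.min(nextReadyCheckDelayMs, notReadyTimeoutMs);
    }

    public static void main(String[] args) {
        long notReady = notReadyTimeout(Arrays.asList(40L, 15L));
        System.out.println(pollTimeout(false, 25L, notReady));
        System.out.println(pollTimeout(true, 25L, notReady));
    }
}
```

Nodes that have data but are not connected contribute only through notReadyTimeout, which is what keeps the loop from spinning while a connection is still being established.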

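1.4: Creating the ClientRequest that NetworkClient hands to the Selector

sendProduceRequests(batches, now) in the listing above calls sendProduceRequest once per destination node, passing that node's list of batches.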

    private void sendProduceRequest(long now, int destination, short acks, int timeout, List<ProducerBatch> batches) {
        if (batches.isEmpty())
            return;

        Map<TopicPartition, MemoryRecords> produceRecordsByPartition = new HashMap<>(batches.size());
        final Map<TopicPartition, ProducerBatch> recordsByPartition = new HashMap<>(batches.size());

        // find the minimum magic version used when creating the record sets
        byte minUsedMagic = apiVersions.maxUsableProduceMagic();
        for (ProducerBatch batch : batches) {
            if (batch.magic() < minUsedMagic)
                minUsedMagic = batch.magic();
        }

        for (ProducerBatch batch : batches) {
            TopicPartition tp = batch.topicPartition;
            MemoryRecords records = batch.records();

            // down convert if necessary to the minimum magic used. In general, there can be a delay between the time
            // that the producer starts building the batch and the time that we send the request, and we may have
            // chosen the message format based on out-dated metadata. In the worst case, we optimistically chose to use
            // the new message format, but found that the broker didn't support it, so we need to down-convert on the
            // client before sending. This is intended to handle edge cases around cluster upgrades where brokers may
            // not all support the same message format version. For example, if a partition migrates from a broker
            // which is supporting the new magic version to one which doesn't, then we will need to convert.
            if (!records.hasMatchingMagic(minUsedMagic))
                records = batch.records().downConvert(minUsedMagic, 0, time).records();
            produceRecordsByPartition.put(tp, records);
            recordsByPartition.put(tp, batch);
        }

        String transactionalId = null;
        if (transactionManager != null && transactionManager.isTransactional()) {
            transactionalId = transactionManager.transactionalId();
        }
        // build the ProduceRequest
        ProduceRequest.Builder requestBuilder = ProduceRequest.Builder.forMagic(minUsedMagic, acks, timeout,
                produceRecordsByPartition, transactionalId);
        RequestCompletionHandler callback = new RequestCompletionHandler() {
            public void onComplete(ClientResponse response) {
                handleProduceResponse(response, recordsByPartition, time.milliseconds());
            }
        };

        String nodeId = Integer.toString(destination);
        /* create the ClientRequest */
        ClientRequest clientRequest = client.newClientRequest(nodeId, requestBuilder, now, acks != 0, callback);
        //2: queue the request for sending
        client.send(clientRequest, now);
        log.trace("Sent produce request to {}: {}", nodeId, requestBuilder);
    }
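
Two small decisions in sendProduceRequest can be shown on their own: choosing the lowest magic (record format version) across the batches, and passing acks != 0 to newClientRequest as the flag that says whether a response is expected. The sketch below is a simplified restatement with hypothetical names (ProduceRequestSketch, plain byte arrays instead of ProducerBatch objects).

```java
/** Hypothetical restatement of two decisions made in sendProduceRequest. */
final class ProduceRequestSketch {

    /** Starts from the highest magic the brokers can accept and lowers it to the smallest magic of any batch. */
    static byte minUsedMagic(byte maxUsableMagic, byte[] batchMagics) {
        byte min = maxUsableMagic;
        for (byte magic : batchMagics) {
            if (magic < min) {
                min = magic;
            }
        }
        return min;
    }

    /** With acks = 0 the producer does not wait for a broker response. */
    static boolean expectResponse(short acks) {
        return acks != 0;
    }

    public static void main(String[] args) {
        byte magic = minUsedMagic((byte) 2, new byte[] {2, 1, 2});
        System.out.println("magic=" + magic + ", expectResponse=" + expectResponse((short) 0));
    }
}
```

Any batch whose records do not already match the chosen magic is down-converted on the client before the request is built, as the long comment in the listing explains.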

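client.send(clientRequest, now) calls NetworkClient.send(), which queues the request; the bytes are written to the network on the next client.poll().
For the client itself, see: Kafka network client source code, the NetworkClient class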
