Kafka Core Source Code Analysis (Part 1): How the Producer Sends Messages

Data and Intelligence (数据与智能)

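Tags: distributed systems, big data, hadoop, kafka, java

Article link: https://blog.csdn.net/qq_43045873/article/details/113903850

Author | Wu Xie (吴邪): four years of experience in big data; currently works at an internet company in Guangzhou, responsible for in-house development of the big data platform and for research on offline and real-time computing.

Editor | auroral-L

About 6,000 characters in total; estimated reading time 30 minutes.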

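Apache Kafka is a high-performance, open-source distributed messaging middleware. The previous article, "A Brief Look at Kafka" (「浅谈Kafka」), gave a short introduction to Kafka's architecture, working principles, and strengths. Starting with this article, we dig into the source code behind Kafka's core features to build a deeper understanding of how it works underneath.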

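From the previous article we know that a Kafka deployment has three main parts: producers, consumers, and brokers, so we begin our source-code tour with the producer. Our company currently runs Kafka 2.1.1 built for Scala 2.11 (the kafka_2.11-2.1.1 distribution). You can pick any release from 0.10 onward for your own reading: the 2.x architecture is close to that of 0.10, whereas releases before 0.10 differ substantially in both architecture and features. This article targets the 2.x line of Kafka. Note that the listings quoted below appear to come from the 0.10.x code line (they still handle deprecated settings such as block.on.buffer.full), so individual fields and constructor arguments may differ in 2.x even though the overall flow is the same.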

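The Producer Message-Sending Flow

The core of the Kafka source tree consists of the clients module and the core module. Before diving into the code, here is a diagram that outlines how a producer sends a message.

Figure 1: The producer message-sending flow

The message is wrapped in a ProducerRecord object.
The Serializer serializes the message key and value.
The Partitioner assigns the message to a partition, which requires the cluster metadata to be fetched first.
The RecordAccumulator holds one message queue per partition; each queue contains many batches, and each batch contains many messages.
The Sender drains ready batches from the RecordAccumulator, packs them into requests, and sends them.
The requests travel over the network to the Kafka cluster.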
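
To make the steps above concrete, here is a minimal, self-contained sketch of the same flow using only the Java standard library. It is a toy model, not Kafka's implementation: the class ToyProducerPipeline, its methods, and the record-count batch limit are all hypothetical, and Arrays.hashCode stands in for the murmur2 hash that Kafka's default partitioner applies to the key bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the flow above; every name here is hypothetical, not Kafka's API.
final class ToyProducerPipeline {
    private final int numPartitions;
    private final int maxRecordsPerBatch;
    // One queue of batches per partition, mirroring the accumulator's layout.
    private final Map<Integer, Deque<List<byte[]>>> queues = new HashMap<>();

    ToyProducerPipeline(int numPartitions, int maxRecordsPerBatch) {
        this.numPartitions = numPartitions;
        this.maxRecordsPerBatch = maxRecordsPerBatch;
    }

    // Serialize, pick a partition, then append to that partition's newest batch.
    int send(String key, String value) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        byte[] valueBytes = value.getBytes(StandardCharsets.UTF_8);
        // Stand-in hash (assumption); Kafka's default partitioner uses murmur2 on the key bytes.
        int partition = Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
        Deque<List<byte[]>> batches = queues.computeIfAbsent(partition, p -> new ArrayDeque<>());
        List<byte[]> last = batches.peekLast();
        if (last == null || last.size() >= maxRecordsPerBatch) {
            last = new ArrayList<>();   // newest batch is missing or full: start a new one
            batches.addLast(last);
        }
        last.add(valueBytes);
        return partition;
    }

    // Sender side: remove and return the oldest batch of a partition, or null if there is none.
    List<byte[]> drainOldestBatch(int partition) {
        Deque<List<byte[]>> batches = queues.get(partition);
        return batches == null ? null : batches.pollFirst();
    }
}
```

Records with the same key always land in the same partition queue, and the sender side takes batches from the head of the queue, so per-partition order is preserved.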

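If you are not sure where to start, open the Producer class in the examples directory of the Kafka source tree and work inward from there. You will notice that Producer extends Thread, which means its run() method is one we definitely need to read.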

public class Producer extends Thread {
    // The KafkaProducer instance
    private final KafkaProducer<Integer, String> producer;
    // The topic to send to
    private final String topic;
    // Send mode: synchronous or asynchronous
    private final Boolean isAsync;
    ......

Looking at the structure of Producer, it mainly consists of a constructor and the helper class DemoCallBack.

/**
 * Initializes the producer.
 * @param topic   the topic to send to
 * @param isAsync whether to send asynchronously
 */
public Producer(String topic, Boolean isAsync) {
    Properties props = new Properties();
    // Host and port of the Kafka brokers used to fetch the cluster metadata
    props.put("bootstrap.servers", "localhost:9092");
    // Client ID
    props.put("client.id", "DemoProducer");
    // IntegerSerializer turns the message key into a byte array
    props.put("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
    // StringSerializer turns a String value into a byte array
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    // Create the KafkaProducer, the core producer class
    producer = new KafkaProducer<>(props);
    this.topic = topic;
    this.isAsync = isAsync;
}
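
As a small illustration of the configuration step, the sketch below builds the same Properties as the demo constructor and adds a pre-flight check for the settings that, as far as I know, have no default when a producer is built from Properties alone. The class ProducerPropsSketch and the missingKeys helper are hypothetical; Kafka performs its own validation inside ProducerConfig.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper; not part of Kafka.
final class ProducerPropsSketch {
    // Settings the demo always supplies and that are assumed to have no default.
    static final String[] REQUIRED = {"bootstrap.servers", "key.serializer", "value.serializer"};

    // Same keys and values as the demo constructor, with the two variable parts as parameters.
    static Properties demoProps(String bootstrapServers, String clientId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("client.id", clientId);
        props.put("key.serializer", "org.apache.kafka.common.serialization.IntegerSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    // Returns the required keys that are absent or blank, in declaration order.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

Calling missingKeys before new KafkaProducer<>(props) would surface a forgotten serializer as a readable list rather than a configuration exception at construction time.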

Initializing KafkaProducer

/**
 * A producer is instantiated by providing a set of key-value pairs as configuration. Valid configuration strings
 * are documented <a href="http://kafka.apache.org/documentation.html#producerconfigs">here</a>.
 * @param properties   The producer configs
 */
// Initialize KafkaProducer from the producer configuration
public KafkaProducer(Properties properties) {
    this(new ProducerConfig(properties), null, null);
}

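KafkaProducer Core Features in Detail

Producer creates an instance of the core class KafkaProducer, and messages are sent by calling its send(...) method. The class-level Javadoc of KafkaProducer is worth reading; it is long, so here is a brief summary: KafkaProducer is the client that publishes messages, it is thread safe and can be shared across threads, it holds a buffer for records that have not yet been transmitted, sending is asynchronous, and the acks setting controls when a request counts as complete. The Javadoc also covers producer configuration. The key parts are excerpted below for reference.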

/**
 * A Kafka client that publishes records to the Kafka cluster.
 * <P>
 * The producer is <i>thread safe</i> and sharing a single producer instance across threads will generally be faster than
 * having multiple instances.
 * <p>
 * The producer consists of a pool of buffer space that holds records that haven't yet been transmitted to the server
 * as well as a background I/O thread that is responsible for turning these records into requests and transmitting them
 * to the cluster. Failure to close the producer after use will leak these resources.
 * <p>
 * The {@link #send(ProducerRecord) send()} method is asynchronous. When called it adds the record to a buffer of pending record sends
 * and immediately returns. This allows the producer to batch together individual records for efficiency.
 * <p>
 * The <code>acks</code> config controls the criteria under which requests are considered complete. The "all" setting
 * we have specified will result in blocking on the full commit of the record, the slowest but most durable setting.
 */
private static final Logger log = LoggerFactory.getLogger(KafkaProducer.class);
// Sequence used to generate client IDs
private static final AtomicInteger PRODUCER_CLIENT_ID_SEQUENCE = new AtomicInteger(1);
private static final String JMX_PREFIX = "kafka.producer";
private String clientId;
// The partitioner
private final Partitioner partitioner;
// Maximum size of a request
private final int maxRequestSize;
// Total size of the send buffer
private final long totalMemorySize;
// Holds and manages the cluster metadata
private final Metadata metadata;
// Collects and buffers records until the sender thread drains them
private final RecordAccumulator accumulator;
// The send task; implements Runnable and is run by ioThread
private final Sender sender;
private final Metrics metrics;
// The thread that runs the sender task and sends messages
private final Thread ioThread;
// Compression type (none, gzip, snappy, or lz4), applied to records in the RecordAccumulator
private final CompressionType compressionType;
private final Sensor errors;
private final Time time;
// Key serializer
private final Serializer<K> keySerializer;
// Value serializer
private final Serializer<V> valueSerializer;
// The configuration object
private final ProducerConfig producerConfig;
// Maximum time send() may block, for example while waiting for a cluster metadata update
private final long maxBlockTimeMs;
// Request timeout
private final int requestTimeoutMs;
// Interceptors, which can inspect or modify a record before it is sent
private final ProducerInterceptors<K, V> interceptors;
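
The Javadoc above describes three cooperating pieces: a buffer of pending records, a background I/O thread, and an asynchronous send() that returns immediately. The following sketch models just that shape with the Java standard library. ToyAsyncProducer is hypothetical, the "offset" it returns is a local counter rather than anything a broker assigned, and sends issued after close() are simply not handled.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Toy model: buffer + background I/O thread + asynchronous send(). Not Kafka's code.
final class ToyAsyncProducer implements AutoCloseable {
    private static final class Pending {
        final String value;
        final CompletableFuture<Long> result = new CompletableFuture<>();
        Pending(String value) { this.value = value; }
    }

    private final BlockingQueue<Pending> buffer = new LinkedBlockingQueue<>();
    private final AtomicLong nextOffset = new AtomicLong();
    private volatile boolean running = true;
    private final Thread ioThread;

    ToyAsyncProducer() {
        ioThread = new Thread(this::drainLoop, "toy-producer-io");
        ioThread.setDaemon(true);
        ioThread.start();
    }

    // Adds the record to the buffer and returns at once; the future completes later.
    CompletableFuture<Long> send(String value) {
        Pending pending = new Pending(value);
        buffer.add(pending);
        return pending.result;
    }

    // Runs on the I/O thread: take buffered records and "acknowledge" them with a fake offset.
    private void drainLoop() {
        try {
            while (running || !buffer.isEmpty()) {
                Pending pending = buffer.poll(50, TimeUnit.MILLISECONDS);
                if (pending != null) {
                    pending.result.complete(nextOffset.getAndIncrement());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops the I/O thread after the buffer is drained, releasing the resources it holds.
    @Override
    public void close() {
        running = false;
        try {
            ioThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the caller never touches the network itself, many threads can share one instance, which is the same reason the Javadoc recommends sharing a single KafkaProducer.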

With this overview of KafkaProducer in mind, let us return to its initialization and look at the private KafkaProducer(...) constructor.

@SuppressWarnings({"unchecked", "deprecation"})
private KafkaProducer(ProducerConfig config, Serializer<K> keySerializer, Serializer<V> valueSerializer) {
    try {
        log.trace("Starting the Kafka producer");
        // Configuration values supplied by the user
        Map<String, Object> userProvidedConfigs = config.originals();
        this.producerConfig = config;
        this.time = new SystemTime();
        // Resolve the clientId, generating one if none was configured
        clientId = config.getString(ProducerConfig.CLIENT_ID_CONFIG);
        if (clientId.length() <= 0)
            clientId = "producer-" + PRODUCER_CLIENT_ID_SEQUENCE.getAndIncrement();
        Map<String, String> metricTags = new LinkedHashMap<String, String>();
        metricTags.put("client-id", clientId);
        MetricConfig metricConfig = new MetricConfig().samples(config.getInt(ProducerConfig.METRICS_NUM_SAMPLES_CONFIG))
                .timeWindow(config.getLong(ProducerConfig.METRICS_SAMPLE_WINDOW_MS_CONFIG), TimeUnit.MILLISECONDS)
                .tags(metricTags);
        List<MetricsReporter> reporters = config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG,
                MetricsReporter.class);
        reporters.add(new JmxReporter(JMX_PREFIX));
        this.metrics = new Metrics(metricConfig, reporters, time);


        // Set the partitioner; a custom partitioner class can be configured
        this.partitioner = config.getConfiguredInstance(ProducerConfig.PARTITIONER_CLASS_CONFIG, Partitioner.class);

        // Back-off time between retries
        // RETRY_BACKOFF_MS_CONFIG = retry.backoff.ms, default 100 ms
        long retryBackoffMs = config.getLong(ProducerConfig.RETRY_BACKOFF_MS_CONFIG);
        // Set the key serializer
        if (keySerializer == null) {
            this.keySerializer = config.getConfiguredInstance(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                    Serializer.class);
            this.keySerializer.configure(config.originals(), true);
        } else {
            config.ignore(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG);
            this.keySerializer = keySerializer;
        }
        // Set the value serializer
        if (valueSerializer == null) {
            this.valueSerializer = config.getConfiguredInstance(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                    Serializer.class);
            this.valueSerializer.configure(config.originals(), false);
        } else {
            config.ignore(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG);
            this.valueSerializer = valueSerializer;
        }


        // load interceptors and make sure they get clientId      
        userProvidedConfigs.put(ProducerConfig.CLIENT_ID_CONFIG, clientId);
        // Set up interceptors; most applications do not need them
        List<ProducerInterceptor<K, V>> interceptorList = (List) (new ProducerConfig(userProvidedConfigs)).getConfiguredInstances(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
                ProducerInterceptor.class);
        this.interceptors = interceptorList.isEmpty() ? null : new ProducerInterceptors<>(interceptorList);
        ClusterResourceListeners clusterResourceListeners = configureClusterResourceListeners(keySerializer, valueSerializer, interceptorList, reporters);
        // METADATA_MAX_AGE_CONFIG = metadata.max.age.ms, default 5 minutes
        // The producer refreshes the cluster metadata at least this often.
        this.metadata = new Metadata(retryBackoffMs, config.getLong(ProducerConfig.METADATA_MAX_AGE_CONFIG), true, clusterResourceListeners);
        // MAX_REQUEST_SIZE_CONFIG = max.request.size, default 1 MB
        // Upper bound on the size of a single request; tune it to your workload
        this.maxRequestSize = config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG);
        // Size of the buffer that holds records waiting to be sent, default 32 MB
        this.totalMemorySize = config.getLong(ProducerConfig.BUFFER_MEMORY_CONFIG);
        // Compression can raise throughput at the cost of more CPU
        this.compressionType = CompressionType.forName(config.getString(ProducerConfig.COMPRESSION_TYPE_CONFIG));
        /* check for user defined settings.
         * If the BLOCK_ON_BUFFER_FULL is set to true,we do not honor METADATA_FETCH_TIMEOUT_CONFIG.
         * This should be removed with release 0.9 when the deprecated configs are removed.
         */
        if (userProvidedConfigs.containsKey(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG)) {
            log.warn(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG + " config is deprecated and will be removed soon. " +
                    "Please use " + ProducerConfig.MAX_BLOCK_MS_CONFIG);
            boolean blockOnBufferFull = config.getBoolean(ProducerConfig.BLOCK_ON_BUFFER_FULL_CONFIG);
            if (blockOnBufferFull) {
                this.maxBlockTimeMs = Long.MAX_VALUE;
            } else if (userProvidedConfigs.containsKey(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG)) {
                log.warn(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG + " config is deprecated and will be removed soon. " +
                        "Please use " + ProducerConfig.MAX_BLOCK_MS_CONFIG);
                this.maxBlockTimeMs = config.getLong(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG);
            } else {
                this.maxBlockTimeMs = config.getLong(ProducerConfig.MAX_BLOCK_MS_CONFIG);
            }
        } else if (userProvidedConfigs.containsKey(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG)) {
            log.warn(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG + " config is deprecated and will be removed soon. " +
                    "Please use " + ProducerConfig.MAX_BLOCK_MS_CONFIG);
            this.maxBlockTimeMs = config.getLong(ProducerConfig.METADATA_FETCH_TIMEOUT_CONFIG);
        } else {
            this.maxBlockTimeMs = config.getLong(ProducerConfig.MAX_BLOCK_MS_CONFIG);
        }


        /* check for user defined settings.
         * If the TIME_OUT config is set use that for request timeout.
         * This should be removed with release 0.9
         */
        if (userProvidedConfigs.containsKey(ProducerConfig.TIMEOUT_CONFIG)) {
            log.warn(ProducerConfig.TIMEOUT_CONFIG + " config is deprecated and will be removed soon. Please use " +
                    ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG);
            this.requestTimeoutMs = config.getInt(ProducerConfig.TIMEOUT_CONFIG);
        } else {
            this.requestTimeoutMs = config.getInt(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG);
        }
        // Create the RecordAccumulator, one of the key objects
        this.accumulator = new RecordAccumulator(config.getInt(ProducerConfig.BATCH_SIZE_CONFIG),
                this.totalMemorySize,
                this.compressionType,
                config.getLong(ProducerConfig.LINGER_MS_CONFIG),
                retryBackoffMs,
                metrics,
                time);


        List<InetSocketAddress> addresses = ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG));
        // Seed the metadata with the bootstrap addresses
        this.metadata.update(Cluster.bootstrap(addresses), time.milliseconds());
        ChannelBuilder channelBuilder = ClientUtils.createChannelBuilder(config.values());
        // Initialize the network component, which accepts ClientRequest objects and sends them to the Kafka cluster through the Selector
        // CONNECTIONS_MAX_IDLE_MS_CONFIG = connections.max.idle.ms, default 9 minutes: a connection idle for longer than this is closed
        // MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION = max.in.flight.requests.per.connection, default 5: limits the number of unacknowledged requests the client may have outstanding on one connection
        // RECONNECT_BACKOFF_MS_CONFIG = reconnect.backoff.ms: how long to wait before trying to reconnect to a given host
        // SEND_BUFFER_CONFIG = send.buffer.bytes: size of the socket send buffer, default 128 KB
        // RECEIVE_BUFFER_CONFIG = receive.buffer.bytes: size of the socket receive buffer, default 32 KB
        NetworkClient client = new NetworkClient(
                new Selector(config.getLong(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), this.metrics, time, "producer", channelBuilder),
                this.metadata,
                clientId,
                config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION),
                config.getLong(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG),
                config.getInt(ProducerConfig.SEND_BUFFER_CONFIG),
                config.getInt(ProducerConfig.RECEIVE_BUFFER_CONFIG),
                this.requestTimeoutMs, time);
 
        // Sender implements Runnable; it is the task executed by the I/O thread
        this.sender = new Sender(client,
                this.metadata,
                this.accumulator,
                config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION) == 1,
                config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG),
                (short) parseAcks(config.getString(ProducerConfig.ACKS_CONFIG)),
                config.getInt(ProducerConfig.RETRIES_CONFIG),
                this.metrics,
                new SystemTime(),
                clientId,
                this.requestTimeoutMs);
        String ioThreadName = "kafka-producer-network-thread" + (clientId.length() > 0 ? " | " + clientId : "");


        // Create the KafkaThread and hand it the sender. KafkaThread essentially just names the thread and marks it as a daemon, which keeps thread management separate from the sending logic.
        this.ioThread = new KafkaThread(ioThreadName, this.sender, true);
        // Start the thread.
        this.ioThread.start();


        this.errors = this.metrics.sensor("errors");
        config.logUnused();
        AppInfoParser.registerAppInfo(JMX_PREFIX, clientId);
        log.debug("Kafka producer started");
    } catch (Throwable t) {
        // call close methods if internal objects are already constructed
        // this is to prevent resource leak. see KAFKA-2121
        close(0, TimeUnit.MILLISECONDS, true);
        // now propagate the exception
        throw new KafkaException("Failed to construct kafka producer", t);
    }
}
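
Two small pieces of the constructor are easy to model in isolation: the fallback that generates a client ID when none is configured, and the way the Sender task is wrapped in a named daemon thread. The sketch below reproduces both with plain java.lang.Thread. IoThreadSketch and its method names are hypothetical; the real code uses KafkaThread, which I am only approximating here.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the client-id fallback and the KafkaThread wrapper.
final class IoThreadSketch {
    private static final AtomicInteger CLIENT_ID_SEQUENCE = new AtomicInteger(1);

    // Mirrors the constructor's fallback: an empty client.id becomes "producer-N".
    static String resolveClientId(String configured) {
        return configured.isEmpty()
                ? "producer-" + CLIENT_ID_SEQUENCE.getAndIncrement()
                : configured;
    }

    // Wraps the sender task in a named daemon thread; the caller decides when to start it.
    static Thread newIoThread(String clientId, Runnable senderTask) {
        String name = "kafka-producer-network-thread"
                + (clientId.isEmpty() ? "" : " | " + clientId);
        Thread thread = new Thread(senderTask, name);
        thread.setDaemon(true);   // a daemon thread does not keep the JVM alive on its own
        return thread;
    }
}
```

Keeping the task (a Runnable) separate from the thread that runs it is what lets the sending logic be tested or reused without any thread at all.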

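That completes the walk-through of KafkaProducer initialization. Next comes the message-sending path; the place to start is the run() method of the example Producer.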

public void run() {
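    // The message key
    int messageNo = 1;

    while (true) {
        // The message value
        String messageStr = "Message_" + messageNo;
        long startTime = System.currentTimeMillis();
        // isAsync selects the send mode: true sends asynchronously, false sends synchronously
        if (isAsync) { // Send asynchronously
            // ProducerRecord carries the topic, key, and value. The Callback passed alongside it has an onCompletion() method that the producer invokes when the acknowledgement (ACK) for the record arrives from Kafka.
            // Asynchronous mode keeps sending and leaves each result to the callback; it performs well and is what most producer use cases rely on.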
            producer.send(new ProducerRecord<>(topic,
                messageNo,
                messageStr), new DemoCallBack(startTime, messageNo, messageStr));
        } else { // Send synchronously
            try {
                // Synchronous mode: wait until this record is acknowledged before sending the next one. Throughput is much lower, so it is rarely used.
                producer.send(new ProducerRecord<>(topic,
                    messageNo,
                    messageStr)).get();
                System.out.println("Sent message: (" + messageNo + ", " + messageStr + ")");
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
            }
        }
        // Increment the message key
        ++messageNo;
    }
}
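
The difference between the two branches of run() comes down to whether the caller blocks on the returned future. The sketch below shows both styles side by side using CompletableFuture. It is an illustration only: fakeSend is a hypothetical stand-in for producer.send(), and the value it completes with is the message number, not a real offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrates synchronous versus asynchronous use of a future-returning send.
final class SendModesSketch {
    // Hypothetical stand-in for producer.send(): completes on another thread.
    static CompletableFuture<Long> fakeSend(int messageNo) {
        return CompletableFuture.supplyAsync(() -> (long) messageNo);
    }

    // Synchronous style: block on each result before sending the next record.
    static List<Long> sendSync(int count) {
        List<Long> results = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            results.add(fakeSend(i).join());
        }
        return results;
    }

    // Asynchronous style: issue every send first and let callbacks count the acknowledgements.
    static int sendAsync(int count) {
        AtomicInteger acked = new AtomicInteger();
        List<CompletableFuture<Void>> done = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            done.add(fakeSend(i).thenAccept(result -> acked.incrementAndGet()));
        }
        // Waiting here only lets the sketch report a count; a real caller would not block.
        CompletableFuture.allOf(done.toArray(new CompletableFuture[0])).join();
        return acked.get();
    }
}
```

In the synchronous loop at most one record is ever in flight, which is why batching cannot help and throughput drops; in the asynchronous loop many records can be pending at once.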

Let us look at the Callback object first, and then come back to dig into the producer's send(...) method.

class DemoCallBack implements Callback {


    private final long startTime;
    private final int key;
    private final String message;


    public DemoCallBack(long startTime, int key, String message) {
        this.startTime = startTime;
        this.key = key;
        this.message = message;
    }


    /**
     * A callback method the user can implement to provide asynchronous handling of request completion. This method will
     * be called when the record sent to the server has been acknowledged. Exactly one of the arguments will be
     * non-null.
     *
     * @param metadata  The metadata for the record that was sent (i.e. the partition and offset). Null if an error
     *                  occurred.
     * @param exception The exception thrown during processing of this record. Null if no error occurred.
     */
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        long elapsedTime = System.currentTimeMillis() - startTime;
        // RecordMetadata carries the partition, the offset, and related information
        if (metadata != null) {
            System.out.println(
                "message(" + key + ", " + message + ") sent to partition(" + metadata.partition() +
                    "), " +
                    "offset(" + metadata.offset() + ") in " + elapsedTime + " ms");
        } else {
            exception.printStackTrace();
        }
    }
}
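
The Javadoc on onCompletion states that exactly one of the two arguments is non-null. The sketch below makes that contract explicit and mirrors the branching in DemoCallBack. Everything in it is hypothetical: ToyCallback replaces Kafka's Callback, a plain Long offset replaces RecordMetadata, and render returns the text that DemoCallBack would print.

```java
// Hypothetical model of the callback contract; not Kafka's Callback interface.
final class CallbackContractSketch {
    interface ToyCallback {
        void onCompletion(Long offset, Exception exception);
    }

    // Enforces "exactly one argument is non-null" before invoking the callback.
    static void complete(ToyCallback callback, Long offset, Exception exception) {
        if ((offset == null) == (exception == null)) {
            throw new IllegalArgumentException("exactly one of offset and exception must be non-null");
        }
        callback.onCompletion(offset, exception);
    }

    // Same success/failure branching as DemoCallBack, returning the text instead of printing it.
    static String render(int key, String message, Long offset, Exception exception) {
        return offset != null
                ? "message(" + key + ", " + message + ") acked at offset(" + offset + ")"
                : "message(" + key + ", " + message + ") failed: " + exception.getMessage();
    }
}
```

A callback written against this contract only needs a single null check, which is exactly what DemoCallBack does with metadata != null.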

Let us turn our attention to the send(...) method first; the sequence diagram of send() that accompanies this section is meant as a reading aid.

@Override
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
    // intercept the record, which can be potentially modified; this method does not throw exceptions
    ProducerRecord<K, V> interceptedRecord = this.interceptors == null ? record : this.interceptors.onSend(record);
    // The real work happens in doSend
    return doSend(interceptedRecord, callback);
}
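
The comment inside send() notes that an interceptor may modify the record and that interception does not throw. A minimal way to picture this is a chain of functions applied in order, where a failing link is skipped. The sketch below is a hypothetical model using UnaryOperator<String> in place of ProducerInterceptor and a String in place of ProducerRecord; it is not how Kafka's ProducerInterceptors class is written.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical model of an interceptor chain; not Kafka's ProducerInterceptors.
final class InterceptorChainSketch {
    // Applies each interceptor in order; a failing interceptor is skipped, not propagated.
    static String onSend(List<UnaryOperator<String>> interceptors, String record) {
        String current = record;
        for (UnaryOperator<String> interceptor : interceptors) {
            try {
                current = interceptor.apply(current);
            } catch (RuntimeException e) {
                // Swallow the failure so send() itself never throws because of an interceptor;
                // the record as it stood before this interceptor is passed on.
            }
        }
        return current;
    }
}
```

With an empty list the record passes through unchanged, which corresponds to the interceptors == null shortcut in send().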

/**
 * Implementation of asynchronously send a record to a topic.
 */
private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
    TopicPartition tp = null;
    try {
        
        ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
        // clusterAndWaitTime.waitedOnMetadataMs is the time spent waiting for metadata
        long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
        // The cluster metadata snapshot returned by the wait
        Cluster cluster = clusterAndWaitTime.cluster;
        // Serialize the message key
        byte[] serializedKey;
        try {
            serializedKey = keySerializer.serialize(record.topic(), record.key());
        } catch (ClassCastException cce) {
            throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
                    " to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
                    " specified in key.serializer");
        }
        // Serialize the message value
        byte[] serializedValue;
        try {
            serializedValue = valueSerializer.serialize(record.topic(), record.value());
        } catch (ClassCastException cce) {
            throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
                    " to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
                    " specified in value.serializer");
        }
        
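        // Choose the target partition using the partitioner and the cluster metadata
        int partition = partition(record, serializedKey, serializedValue, cluster);

        int serializedSize = Records.LOG_OVERHEAD + Record.recordSize(serializedKey, serializedValue);
        // Reject records larger than max.request.size (default 1 MB) or than the total buffer memory
        ensureValidRecordSize(serializedSize);
        // Build the TopicPartition from the topic and the chosen partition
        tp = new TopicPartition(record.topic(), partition);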
        long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
        log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
        // producer callback will make sure to call both 'callback' and interceptor callback
        // Wrap the user callback so that interceptors are notified as well
        Callback interceptCallback = this.interceptors == null ? callback : new InterceptorCallback<>(callback, this.interceptors, tp);
        // Append the record to the accumulator, whose buffer defaults to 32 MB
        RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey, serializedValue, interceptCallback, remainingWaitMs);
        // If the batch is full or a new batch was just created
        if (result.batchIsFull || result.newBatchCreated) {
            log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
            // Wake up the sender thread, which is the one that actually transmits data
            this.sender.wakeup();
        }
        return result.future;
        // handling exceptions and record the errors;
        // for API exceptions return them in the future,
        // for other exceptions throw directly
    } catch (ApiException e) {
        log.debug("Exception occurred during message send:", e);
        if (callback != null)
            callback.onCompletion(null, e);
        this.errors.record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        return new FutureFailure(e);
    } catch (InterruptedException e) {
        this.errors.record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw new InterruptException(e);
    } catch (BufferExhaustedException e) {
        this.errors.record();
        this.metrics.sensor("buffer-exhausted-records").record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw e;
    } catch (KafkaException e) {
        this.errors.record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw e;
    } catch (Exception e) {
        // we notify interceptor about all exceptions, since onSend is called before anything else in this method
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw e;
    }
}
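
Three of the checks in doSend are pure arithmetic and can be shown without any Kafka classes: the time budget left after waiting for metadata, the choice between an explicitly requested partition and a key-based one, and the size validation. The sketch below is a simplified approximation. DoSendChecksSketch and its method signatures are hypothetical, Arrays.hashCode stands in for the partitioner's real hash, and IllegalArgumentException stands in for the exception Kafka throws for oversized records.

```java
import java.util.Arrays;

// Hypothetical, simplified versions of three checks performed in doSend.
final class DoSendChecksSketch {
    // Time left for the accumulator append after metadata has been waited on; never negative.
    static long remainingWaitMs(long maxBlockMs, long waitedOnMetadataMs) {
        return Math.max(0, maxBlockMs - waitedOnMetadataMs);
    }

    // An explicit partition on the record wins; otherwise derive one from the key bytes.
    static int choosePartition(Integer explicitPartition, byte[] keyBytes, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition;
        }
        // Stand-in hash (assumption). A null key hashes to 0 here, so it maps to partition 0;
        // Kafka's default partitioner instead spreads keyless records across partitions.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    // A record must fit both in a single request and in the total send buffer.
    static void ensureValidRecordSize(int size, int maxRequestSize, long totalMemorySize) {
        if (size > maxRequestSize) {
            throw new IllegalArgumentException("record of " + size + " bytes exceeds max.request.size");
        }
        if (size > totalMemorySize) {
            throw new IllegalArgumentException("record of " + size + " bytes exceeds buffer.memory");
        }
    }
}
```

Note that the size check runs before the record reaches the accumulator, so an oversized record fails fast instead of occupying buffer space.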

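Once woken, the sender thread starts sending messages to the Kafka cluster. That is the overall flow by which a producer sends a message. This article stays at a coarse level: it covers wrapping the message, serialization, partitioning, batching, and the sender thread issuing requests to the Kafka cluster through NetworkClient, and it introduces some common configuration parameters along the way, without expanding on the details.

Summary

Many readers will probably wonder at this point whether something as important as message sending can really be covered this briefly. It cannot: the source code and implementation behind it are fairly complex. To keep things readable and make the whole process easier to follow, I have split the topic into two articles, one on the overall flow and one on the details. The detailed article covers a lot of ground: cluster metadata management, the structure of the metadata and how it is fetched and updated, the partitioner, the core RecordAccumulator class, how batches are built and sent, the sending mechanism, how thread-safe reads and writes and high performance are achieved, the design and use of the memory pool, and the NetworkClient networking component, and more may come up along the way. If you are interested, keep following the Kafka source code analysis series, since the articles build closely on one another. The series is published on the public account 数据与智能 (Data and Intelligence) under its big data source code series. Comments, discussion, and corrections are welcome.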
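
The condition that wakes the sender, result.batchIsFull || result.newBatchCreated, can be modelled with a very small accumulator for a single partition. The sketch below is an approximation under stated assumptions: ToyAccumulator tracks only the number of bytes used per batch, and it treats a batch as "full" when more than one batch is queued or the newest batch has reached the size limit, which is my reading of the accumulator's behaviour rather than something shown in this article.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy single-partition accumulator; hypothetical, tracks only bytes used per batch.
final class ToyAccumulator {
    static final class AppendResult {
        final boolean batchIsFull;
        final boolean newBatchCreated;

        AppendResult(boolean batchIsFull, boolean newBatchCreated) {
            this.batchIsFull = batchIsFull;
            this.newBatchCreated = newBatchCreated;
        }
    }

    private final int batchSizeBytes;
    // Bytes used by each queued batch, oldest first.
    private final Deque<Integer> batchFill = new ArrayDeque<>();

    ToyAccumulator(int batchSizeBytes) {
        this.batchSizeBytes = batchSizeBytes;
    }

    // Appends to the newest batch if the record fits, otherwise opens a new batch.
    synchronized AppendResult append(int recordBytes) {
        Integer last = batchFill.peekLast();
        boolean created = false;
        if (last == null || last + recordBytes > batchSizeBytes) {
            batchFill.addLast(recordBytes);
            created = true;
        } else {
            batchFill.pollLast();
            batchFill.addLast(last + recordBytes);
        }
        // Assumption: "full" means an older batch is waiting or the newest one hit the limit.
        boolean full = batchFill.size() > 1 || batchFill.peekLast() >= batchSizeBytes;
        return new AppendResult(full, created);
    }

    // The same condition doSend uses to decide whether to wake the sender thread.
    static boolean shouldWakeSender(AppendResult result) {
        return result.batchIsFull || result.newBatchCreated;
    }
}
```

With a 10-byte batch size, three 4-byte appends wake the sender on the first call (new batch), not on the second (room left), and again on the third (a second batch was opened).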
