An Illustrated Guide to Kafka's RecordBatch Structure

Latest recommended article published on 2023-11-16 17:21:00

JavaShark

Categories: Programmers, Java, Computing. Tags: kafka, java, programming languages

Article link: https://blog.csdn.net/JavaShark/article/details/125872591

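This article takes a detailed look at Kafka's RecordBatch structure, covering initialization, message writing, the Record structure, the RecordBatchHeader, and the overall layout. RecordBatch is the object that ProducerBatch uses internally to store messages. Working through the source code, the article examines each part of a RecordBatch, such as the offset delta, the timestamp delta, and the message headers, and also covers compression and the process of closing a ProducerBatch.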

Summary generated by CSDN using AI.

RecordBatch

We covered the producer's ProducerBatch in an earlier article. So how does RecordBatch differ from ProducerBatch?

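RecordBatch is the object inside ProducerBatch that is dedicated to storing messages. Beyond that, ProducerBatch carries other state of its own, such as retry and callback-related attributes.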
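To make the containment relationship concrete, here is a minimal sketch. `ProducerBatchSketch` and `RecordStoreSketch` are hypothetical, heavily simplified stand-ins invented for illustration; they are not Kafka's actual classes, and the method names are assumptions. The point is only that the message storage is one member of the batch, sitting next to the retry counter and the callbacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in for the message storage held inside a producer batch.
class RecordStoreSketch {
    private final List<byte[]> records = new ArrayList<>();

    void append(byte[] value) { records.add(value); }

    int count() { return records.size(); }
}

// Hypothetical stand-in for a producer batch: message storage plus
// retry and callback state kept alongside it.
class ProducerBatchSketch {
    private final RecordStoreSketch records = new RecordStoreSketch();          // the messages
    private final AtomicInteger attempts = new AtomicInteger(0);                // retry state
    private final List<Consumer<Exception>> callbacks = new ArrayList<>();      // callback state

    // Store the message and remember the callback to fire on completion.
    void tryAppend(byte[] value, Consumer<Exception> callback) {
        records.append(value);
        callbacks.add(callback);
    }

    // Called when the batch is put back in the queue for another send attempt.
    void reenqueued() { attempts.incrementAndGet(); }

    // Complete the batch: every callback receives the error, or null on success.
    void done(Exception error) {
        for (Consumer<Exception> callback : callbacks) {
            callback.accept(error);
        }
    }

    int recordCount() { return records.count(); }

    int attempts() { return attempts.get(); }

    public static void main(String[] args) {
        ProducerBatchSketch batch = new ProducerBatchSketch();
        batch.tryAppend(new byte[] {1}, e -> System.out.println("callback fired, error=" + e));
        batch.done(null);
        System.out.println("records=" + batch.recordCount() + ", attempts=" + batch.attempts());
    }
}
```

The retry counter and callbacks never touch the stored bytes, which is why the storage can be factored out into its own object.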

RecordBatch Initialization

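Whenever a new ProducerBatch has to be created, a MemoryRecordsBuilder is built along with it. You can think of this object as the message builder: everything related to the messages is stored in it.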

public MemoryRecordsBuilder(ByteBufferOutputStream bufferStream,
                                byte magic,
                                CompressionType compressionType,
                                TimestampType timestampType,
                                long baseOffset,
                                long logAppendTime,
                                long producerId,
                                short producerEpoch,
                                int baseSequence,
                                boolean isTransactional,
                                boolean isControlBatch,
                                int partitionLeaderEpoch,
                                int writeLimit) {
        // ... parts omitted ...
        this.magic = magic;
        this.timestampType = timestampType;
        this.compressionType = compressionType;
        this.baseOffset = baseOffset;
        this.logAppendTime = logAppendTime;
        this.numRecords = 0;
        this.uncompressedRecordsSizeInBytes = 0;
        this.actualCompressionRatio = 1;
        this.maxTimestamp = RecordBatch.NO_TIMESTAMP;
        this.producerId = producerId;
        this.producerEpoch = producerEpoch;
        this.baseSequence = baseSequence;
        this.isTransactional = isTransactional;
        this.isControlBatch = isControlBatch;
        this.partitionLeaderEpoch = partitionLeaderEpoch;
        this.writeLimit = writeLimit;
        this.initialPosition = bufferStream.position();
        this.batchHeaderSizeInBytes = AbstractRecords.recordBatchHeaderSizeInBytes(magic, compressionType);
        // Reserve room at the start of the buffer for the RecordBatch header (61 bytes for magic v2)
        bufferStream.position(initialPosition + batchHeaderSizeInBytes);
        this.bufferStream = bufferStream;
        // Wrap the buffer in the output stream implementation that matches the compression type
        this.appendStream = new DataOutputStream(compressionType.wrapForOutput(this.bufferStream, magic));
    }

Key points from the source above:

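bufferStream reserves space for the header up front, before any message is written. This is the batch-level header (the RecordBatchHeader), not a per-record header. For the v2 message format (magic = 2), batchHeaderSizeInBytes = 61, so 61 bytes are skipped.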
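The compression output stream is chosen according to the configured compression type, compression.type. For example, with the lz4 compression type, the returned output stream object is a KafkaLZ4BlockOutputStream, which provides both the message-writing and the compression methods.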
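The two points above can be sketched with plain JDK classes. This is a minimal illustration under stated assumptions, not Kafka's implementation: `BufferStream` is a hypothetical fixed-capacity stand-in for Kafka's ByteBufferOutputStream, and the JDK's `GZIPOutputStream` stands in for the compression stream because Kafka's LZ4 stream is not part of the JDK. The field widths summed in `HEADER_SIZE` follow the v2 batch header layout and add up to 61. Only three header fields are back-filled here; the rest (CRC, timestamps, and so on) are left as zeros.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class BatchHeaderSketch {
    // Field widths of the v2 batch header, in bytes.
    static final int HEADER_SIZE =
              8   // baseOffset
            + 4   // batchLength
            + 4   // partitionLeaderEpoch
            + 1   // magic
            + 4   // crc
            + 2   // attributes
            + 4   // lastOffsetDelta
            + 8   // baseTimestamp
            + 8   // maxTimestamp
            + 8   // producerId
            + 2   // producerEpoch
            + 4   // baseSequence
            + 4;  // record count

    // Hypothetical stand-in for Kafka's ByteBufferOutputStream (fixed capacity, no growth).
    static final class BufferStream extends OutputStream {
        final ByteBuffer buffer;

        BufferStream(int capacity) { buffer = ByteBuffer.allocate(capacity); }

        @Override public void write(int b) { buffer.put((byte) b); }

        @Override public void write(byte[] b, int off, int len) { buffer.put(b, off, len); }

        int position() { return buffer.position(); }

        void position(int newPosition) { buffer.position(newPosition); }
    }

    static ByteBuffer build(byte[] payload) {
        BufferStream bufferStream = new BufferStream(1024);
        int initialPosition = bufferStream.position();

        // Step 1: skip over the header region so payload bytes land after it.
        bufferStream.position(initialPosition + HEADER_SIZE);

        // Step 2: wrap the buffer in a compression stream and write through it.
        try (DataOutputStream appendStream =
                 new DataOutputStream(new GZIPOutputStream(bufferStream))) {
            appendStream.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int end = bufferStream.position();

        // Step 3: go back and fill in header fields now that the total size is known.
        ByteBuffer buf = bufferStream.buffer;
        buf.putLong(initialPosition, 0L);                            // baseOffset
        buf.putInt(initialPosition + 8, end - initialPosition - 12); // batchLength: bytes after this field
        buf.put(initialPosition + 16, (byte) 2);                     // magic
        buf.position(end);
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer batch = build("hello kafka".getBytes(StandardCharsets.UTF_8));
        System.out.println("header bytes: " + HEADER_SIZE + ", total bytes: " + batch.limit());
    }
}
```

Reserving the header first matters because fields such as the batch length are only known once all messages have been written and the compression stream has been closed.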