Storage Mechanism
Kafka data storage:
Each partition of a topic corresponds to a directory on disk that holds that partition's data.
The directory is split into multiple segments, dividing the partition's data into chunks. Each segment has an index file and a log file.
[hadoop@hadoop000 topic_test-0]$ ls
00000000000000000202.index  00000000000000000202.log
Contents of the index file:
[hadoop@hadoop000 kafka_2.11-2.2.1]$ bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files /home/hadoop/app/tmp/kafka-log/topic_test-0/00000000000000000202.index --print-data-log
Dumping /home/hadoop/app/tmp/kafka-log/topic_test-0/00000000000000000202.index
offset: 387 position: 6094
offset: 518 position: 10734
offset: 622 position: 16367
offset: 1138 position: 27121
offset: 1246 position: 39566
offset: 1359 position: 44315
offset: 1731 position: 49284
offset: 2103 position: 65649
offset: 2475 position: 82014
offset: 2847 position: 98379 // this entry corresponds to the batch shown below: baseOffset: 2476, lastOffset: 2847
offset: 2909 position: 114744
The index file has two columns, offset and position.
position is the byte offset relative to the start of the segment's log file; if position is 10, the physical address on disk is the file's starting physical address plus 10.
The index is sparse (it does not hold an entry for every message); each recorded offset appears to be the last offset of a batch.
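The lookup over these (offset, position) pairs can be sketched as a binary search for the entry with the largest offset not exceeding the target. The entries below are copied from the index dump above; the class and method names are illustrative, not Kafka's actual code.

```java
// Sketch of a lookup over the sparse index: binary-search for the entry
// with the largest offset not exceeding the target, and return its
// position (the byte offset in the log file to start scanning from).
// Entries copied from the index dump above; names are illustrative.
public class IndexLookup {
    static final long[][] INDEX = {
        {387, 6094}, {518, 10734}, {622, 16367}, {1138, 27121},
        {1246, 39566}, {1359, 44315}, {1731, 49284}, {2103, 65649},
        {2475, 82014}, {2847, 98379}, {2909, 114744}
    };

    // Returns the position of the last entry whose offset <= target,
    // or 0 (start of the file) if no entry qualifies.
    static long positionFor(long target) {
        int lo = 0, hi = INDEX.length - 1;
        long best = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (INDEX[mid][0] <= target) {
                best = INDEX[mid][1];
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(positionFor(2476)); // entry (2475, 82014) -> 82014
    }
}
```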
Contents of the log file:
[hadoop@hadoop000 kafka_2.11-2.2.1]$ bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files /home/hadoop/app/tmp/kafka-log/topic_test-0/00000000000000000202.log --print-data-log
baseOffset: 2476 lastOffset: 2847 count: 372 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 1 isTransactional: false isControl: false position: 98379 CreateTime: 1647327649751 size: 16365 magic: 2 compresscodec: NONE crc: 1572428979 isvalid: true
| offset: 2476 CreateTime: 1647327649746 keysize: 15 valuesize: 21 sequence: -1 headerKeys: [] key: key-key-key4088 payload: value-value-value4088
| offset: 2477 CreateTime: 1647327649746 keysize: 15 valuesize: 21 sequence: -1 headerKeys: [] key: key-key-key4089 payload: value-value-value4089
| offset: 2478 CreateTime: 1647327649746 keysize: 15 valuesize: 21 sequence: -1 headerKeys: [] key: key-key-key4090 payload: value-value-value4090
| offset: 2479 CreateTime: 1647327649746 keysize: 15 valuesize: 21 sequence: -1 headerKeys: [] key: key-key-key4092 payload: value-value-value4092
| offset: 2480 CreateTime: 1647327649746 keysize: 15 valuesize: 21 sequence: -1 headerKeys: [] key: key-key-key4094 payload: value-value-value4094
properties.put(ProducerConfig.BATCH_SIZE_CONFIG, "16384");
So in the dump above, size: 16365: adding one more message would push the batch past 16384 bytes, which is why these 372 messages share a single header and are sent as one batch.
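A quick sanity check on that rollover, using only numbers from the dump (the 61-byte header figure comes from the size breakdown at the end of this note): the average record footprint works out to about 44 bytes, so one more record would indeed push 16365 past 16384. A hedged sketch:

```java
// Sanity-check the batch rollover: estimate the average bytes per record
// from the dump (size 16365, count 372, minus an assumed 61-byte
// RecordBatch header), then confirm one more record of that size would
// exceed batch.size = 16384.
public class BatchRollover {
    static double avgRecordBytes(int batchBytes, int headerBytes, int count) {
        return (batchBytes - headerBytes) / (double) count;
    }

    public static void main(String[] args) {
        int batchSizeConfig = 16384; // ProducerConfig.BATCH_SIZE_CONFIG
        int observed = 16365;        // size: 16365 from the log dump
        double perRecord = avgRecordBytes(observed, 61, 372);
        System.out.printf("~%.1f bytes per record%n", perRecord);
        System.out.println(observed + perRecord > batchSizeConfig); // true
    }
}
```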
Steps to find the message with offset 2476:
- In the index file, binary-search for the entry with the largest offset not exceeding 2476; this yields the row offset: 2475, position: 82014. Since 2476 is greater than 2475, the message lies in the batch after this one.
- Seek to position 82014 in the log file. The batch there ends at offset 2475, so move forward one batch; that next batch (position 98379, baseOffset 2476) contains the message with offset 2476.
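The second step, scanning batches from the indexed position, might be sketched like this. The batch boundaries are reconstructed from the index and log dumps above; only the batch at 98379 appears explicitly in the log dump, so the neighboring rows are assumptions.

```java
// Sketch of step 2: from the position returned by the index, walk batch
// headers forward until one covers the target offset.
// Rows are {position, baseOffset, lastOffset}; only the 98379 row is
// shown in the log dump above, the others are reconstructed from the
// index dump.
public class LogScan {
    static final long[][] BATCHES = {
        {82014, 2104, 2475},
        {98379, 2476, 2847},
        {114744, 2848, 2909},
    };

    // Returns the position of the batch containing target, or -1 if the
    // target is not in this segment.
    static long batchFor(long startPos, long target) {
        for (long[] b : BATCHES) {
            if (b[0] < startPos) continue; // skip batches before the index hit
            if (target >= b[1] && target <= b[2]) return b[0];
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(batchFor(82014, 2476)); // 98379: the next batch
    }
}
```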
A DataFrame read from Kafka in Spark Structured Streaming:
Message format
baseOffset: 99 lastOffset: 102 count: 4 baseSequence: -1 lastSequence: -1 producerId: -1 producerEpoch: -1 partitionLeaderEpoch: 0
isTransactional: false isControl: false position: 0 CreateTime: 1611670759851 size: 137 magic: 2 compresscodec: NONE crc: 820456027 isvalid: true
| offset: 99 CreateTime: 1611670759849 keysize: 5 valuesize: 7 sequence: -1 headerKeys: [] key: key-1 payload: value-1
| offset: 100 CreateTime: 1611670759849 keysize: 5 valuesize: 7 sequence: -1 headerKeys: [] key: key-2 payload: value-2
| offset: 101 CreateTime: 1611670759851 keysize: 5 valuesize: 7 sequence: -1 headerKeys: [] key: key-5 payload: value-5
| offset: 102 CreateTime: 1611670759851 keysize: 5 valuesize: 7 sequence: -1 headerKeys: [] key: key-6 payload: value-6
One question remains: how is position computed? We need to know how much space each message occupies before we can calculate it.
The log dump above shows size: 137, i.e., the batch occupies 137 bytes, so the next batch's position starts at 137.
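In other words, positions are running sums of batch sizes: each batch starts where the previous one ended. A minimal sketch; only the first size (137) comes from the dump, the rest are hypothetical follow-on batches.

```java
// Positions as cumulative batch sizes: batch N+1 starts where batch N ends.
// Only the first size (137) is from the dump; the rest are hypothetical.
public class PositionCalc {
    static long[] positions(long[] batchSizes) {
        long[] pos = new long[batchSizes.length];
        long running = 0;
        for (int i = 0; i < batchSizes.length; i++) {
            pos[i] = running;         // this batch begins at the running total
            running += batchSizes[i];
        }
        return pos;
    }

    public static void main(String[] args) {
        long[] pos = positions(new long[]{137, 150, 142});
        System.out.println(java.util.Arrays.toString(pos)); // [0, 137, 287]
    }
}
```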
How a message is laid out: a Record Batch packs multiple messages (Records) into one RecordBatch, so many Records share a single header rather than each carrying its own.
Breakdown of the 137 B:
- The Record Batch header takes 61 B.
- Each Record's fields total 1 B + 1 B + 1 B + 1 B + 5 B + 1 B + 7 B + 1 B = 18 B.
- The record length, encoded as a varint, takes another 1 B.
- 61 + (18 + 1) * 4 = 137 B.
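The arithmetic above can be written out directly. The field widths are as listed; the per-record numbers are specific to this key/value size, and the varints are assumed to fit in 1 B each.

```java
// Recompute the 137-byte batch from the breakdown above:
// a 61 B RecordBatch header plus 4 records, each 18 B of fields
// preceded by a 1 B varint record length.
public class BatchSizeMath {
    static int batchBytes(int records, int keyLen, int valueLen) {
        int header = 61;                          // RecordBatch header
        // attributes + timestampDelta + offsetDelta + keyLength + key
        // + valueLength + value + headers count (varints assumed 1 B each)
        int body = 1 + 1 + 1 + 1 + keyLen + 1 + valueLen + 1;
        int lengthPrefix = 1;                     // varint record length
        return header + (body + lengthPrefix) * records;
    }

    public static void main(String[] args) {
        System.out.println(batchBytes(4, 5, 7)); // 137, matching the dump
    }
}
```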
If this is still unclear, here is a recommended blog post that covers it well:
一文看懂Kafka消息格式的演变 (Understanding the evolution of the Kafka message format in one article).