Source Code Analysis of RocketMQ: Storage Structure and Storage Mechanism of the Consume Queue and Index Files, Part 1 (1)

2401_84102400

Published on 2024-05-14 21:04:33


Article link: https://blog.csdn.net/2401_84102400/article/details/138870178


            this.mappedFileQueue.setCommittedWhere(expectLogicOffset);
            this.fillPreBlank(mappedFile, expectLogicOffset);
            log.info("fill pre blank space " + mappedFile.getFileName() + " " + expectLogicOffset + " "
                + mappedFile.getWrotePosition());
        }

        if (cqOffset != 0) {
            long currentLogicOffset = mappedFile.getWrotePosition() + mappedFile.getFileFromOffset();
            if (expectLogicOffset != currentLogicOffset) {
                LOG_ERROR.warn(
                    "[BUG]logic queue order maybe wrong, expectLogicOffset: {} currentLogicOffset: {} Topic: {} QID: {} Diff: {}",
                    expectLogicOffset,
                    currentLogicOffset,
                    this.topic,
                    this.queueId,
                    expectLogicOffset - currentLogicOffset
                );
            }
        }
        this.maxPhysicOffset = offset;
        return mappedFile.appendMessage(this.byteBufferIndex.array()); // @4
    }
    return false;
}

First, the parameters:

long offset

The commitlog offset, 8 bytes.

int size

The message size, 4 bytes.

long tagsCode

The hashcode of the message tags.

long cqOffset

The offset at which the entry is written into the consumequeue.

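Code @1: first, one ConsumeQueue entry, 20 bytes in total, is written into a ByteBuffer.

Code @2: compute the position in the consumequeue file at which the entry is expected to be inserted.

Code @3: if the file is newly created, it must first be padded with blanks.

Code @4: write the entry into the ConsumeQueue file; the whole process operates on MappedFile.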
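Steps @1 and @2 can be illustrated with a minimal sketch. It is not the RocketMQ implementation: the class name ConsumeQueueEntrySketch and its methods are hypothetical, and the only things taken from the text are the 20-byte entry layout and the idea that the expected logical offset is the entry number multiplied by the entry size.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not part of RocketMQ.
class ConsumeQueueEntrySketch {
    // One entry: 8 bytes (commitlog offset) + 4 bytes (size) + 8 bytes (tags hashcode).
    static final int CQ_STORE_UNIT_SIZE = 20;

    // Step @1: encode one entry into a 20-byte buffer.
    static byte[] encode(long commitLogOffset, int size, long tagsCode) {
        ByteBuffer buf = ByteBuffer.allocate(CQ_STORE_UNIT_SIZE);
        buf.putLong(commitLogOffset);
        buf.putInt(size);
        buf.putLong(tagsCode);
        return buf.array();
    }

    // Step @2: logical byte offset at which entry number cqOffset is expected.
    static long expectLogicOffset(long cqOffset) {
        return cqOffset * CQ_STORE_UNIT_SIZE;
    }

    public static void main(String[] args) {
        byte[] entry = encode(1024L, 256, 99L);
        System.out.println("entry bytes = " + entry.length);
        System.out.println("expected logical offset of entry 3 = " + expectLogicOffset(3));
    }
}
```

Because every entry has the same fixed size, an entry can be located by multiplication alone, without scanning the file.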

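We now know that every ConsumeQueue entry is 20 bytes (8-byte commitlog offset + 4-byte message length + 8-byte tag hashcode).

So what are the path and the default size of a consumequeue file?

The default path is rocket_home/store/consumequeue/{topic}/{queueId}, and the default size is 300,000 entries, i.e. 300,000 * 20 bytes = 6,000,000 bytes.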
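The arithmetic behind the default file size, and how an entry number maps to a file and a position inside it, can be sketched as follows. The class and method names are hypothetical, and the assumption that files are laid out back to back in logical-offset order is a simplification for illustration.

```java
// Hypothetical helper, for illustration only; not part of RocketMQ.
class ConsumeQueueFileSketch {
    static final int CQ_STORE_UNIT_SIZE = 20;   // bytes per entry
    static final int ENTRIES_PER_FILE = 300_000; // default entries per file
    static final long FILE_SIZE = (long) ENTRIES_PER_FILE * CQ_STORE_UNIT_SIZE;

    // Zero-based index of the file that holds entry number cqOffset.
    static long fileIndex(long cqOffset) {
        return cqOffset * CQ_STORE_UNIT_SIZE / FILE_SIZE;
    }

    // Byte position of entry number cqOffset inside its file.
    static long positionInFile(long cqOffset) {
        return cqOffset * CQ_STORE_UNIT_SIZE % FILE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("file size = " + FILE_SIZE + " bytes");
        System.out.println("entry 300001 -> file " + fileIndex(300_001)
            + ", position " + positionInFile(300_001));
    }
}
```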

2.2 CommitLogDispatcherBuildIndex

Its core implementation is IndexService#buildIndex, and the class that wraps an Index file is IndexFile.

2.2.1 IndexFile in Detail

2.2.1.1 Core Attributes

    private static final Logger log = LoggerFactory.getLogger(LoggerName.STORE_LOGGER_NAME);
    // Number of bytes occupied by each hash slot
    private static int hashSlotSize = 4;
    // Number of bytes occupied by each IndexFile entry
    private static int indexSize = 20;
    // Used to check whether an index is valid
    private static int invalidIndex = 0;
    // Total number of hash slots in the index file
    private final int hashSlotNum;
    // Number of entries the IndexFile can contain
    private final int indexNum;
    // The corresponding mapped file
    private final MappedFile mappedFile;
    // The corresponding file channel
    private final FileChannel fileChannel;
    // The corresponding PageCache
    private final MappedByteBuffer mappedByteBuffer;
    // IndexHeader, the header information of each index file
    private final IndexHeader indexHeader;

IndexHeader in detail:
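As a rough aid, the 40-byte header can be pictured with the sketch below. The field order and byte positions (two timestamps, two physical offsets, then two counters) are stated here as an assumption consistent with the getters used later in this article (getBeginTimestamp, getEndPhyOffset, getIndexCount and so on); the class IndexHeaderSketch and its constants are hypothetical names, not RocketMQ code.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of a 40-byte index header; positions are assumptions.
class IndexHeaderSketch {
    static final int INDEX_HEADER_SIZE = 40;
    static final int BEGIN_TIMESTAMP_POS = 0;   // 8 bytes: store time of the first entry
    static final int END_TIMESTAMP_POS = 8;     // 8 bytes: store time of the latest entry
    static final int BEGIN_PHY_OFFSET_POS = 16; // 8 bytes: commitlog offset of the first entry
    static final int END_PHY_OFFSET_POS = 24;   // 8 bytes: commitlog offset of the latest entry
    static final int HASH_SLOT_COUNT_POS = 32;  // 4 bytes: hash slot counter
    static final int INDEX_COUNT_POS = 36;      // 4 bytes: index entry counter

    // Writes all six header fields at their fixed positions.
    static ByteBuffer write(long beginTs, long endTs, long beginPhy, long endPhy,
                            int hashSlotCount, int indexCount) {
        ByteBuffer header = ByteBuffer.allocate(INDEX_HEADER_SIZE);
        header.putLong(BEGIN_TIMESTAMP_POS, beginTs);
        header.putLong(END_TIMESTAMP_POS, endTs);
        header.putLong(BEGIN_PHY_OFFSET_POS, beginPhy);
        header.putLong(END_PHY_OFFSET_POS, endPhy);
        header.putInt(HASH_SLOT_COUNT_POS, hashSlotCount);
        header.putInt(INDEX_COUNT_POS, indexCount);
        return header;
    }

    public static void main(String[] args) {
        ByteBuffer h = write(1_000L, 9_000L, 0L, 4_096L, 3, 4);
        System.out.println("header bytes = " + h.capacity()
            + ", indexCount = " + h.getInt(INDEX_COUNT_POS));
    }
}
```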

Index storage path: /rocket_home/store/index/{timestamp: year, month, day, hour, minute, second}.

With this much understood, let's turn our attention to IndexService.

2.2.2 IndexService

2.2.2.1 Core Attributes and Constructor

    private final DefaultMessageStore defaultMessageStore;
    private final int hashSlotNum;
    private final int indexNum;
    private final String storePath;
    private final ArrayList<IndexFile> indexFileList = new ArrayList<IndexFile>();
    private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

    public IndexService(final DefaultMessageStore store) {
        this.defaultMessageStore = store;
        this.hashSlotNum = store.getMessageStoreConfig().getMaxHashSlotNum();
        this.indexNum = store.getMessageStoreConfig().getMaxIndexNum();
        this.storePath =
            StorePathConfigHelper.getStorePathIndex(store.getMessageStoreConfig().getStorePathRootDir());
    }

hashSlotNum

The number of hash slots, 5,000,000 by default.

indexNum

The number of index entries, 20,000,000 by default.

storePath

The index storage path, /rocket_home/store/index by default.
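With these defaults, the size of one index file follows from the layout described in this article: a 40-byte header, then 4 bytes per hash slot, then 20 bytes per index entry. A small sketch of that arithmetic (the class name is hypothetical):

```java
// Hypothetical helper, for illustration only; not part of RocketMQ.
class IndexFileSizeSketch {
    static final int INDEX_HEADER_SIZE = 40; // bytes
    static final int HASH_SLOT_SIZE = 4;     // bytes per hash slot
    static final int INDEX_SIZE = 20;        // bytes per index entry

    // Total bytes = header + slot area + entry area.
    static long fileTotalSize(int hashSlotNum, int indexNum) {
        return INDEX_HEADER_SIZE
            + (long) hashSlotNum * HASH_SLOT_SIZE
            + (long) indexNum * INDEX_SIZE;
    }

    public static void main(String[] args) {
        // Defaults from the text: 5,000,000 slots and 20,000,000 entries.
        System.out.println(fileTotalSize(5_000_000, 20_000_000) + " bytes");
    }
}
```

That is 40 + 20,000,000 + 400,000,000 = 420,000,040 bytes, roughly 400 MB per index file.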

2.2.2.2 buildIndex

    public void buildIndex(DispatchRequest req) {
        IndexFile indexFile = retryGetAndCreateIndexFile(); // @1
        if (indexFile != null) {
            long endPhyOffset = indexFile.getEndPhyOffset();
            DispatchRequest msg = req;
            String topic = msg.getTopic();
            String keys = msg.getKeys();
            if (msg.getCommitLogOffset() < endPhyOffset) { // @2
                return;
            }

            final int tranType = MessageSysFlag.getTransactionValue(msg.getSysFlag());
            switch (tranType) {
                case MessageSysFlag.TRANSACTION_NOT_TYPE:
                case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
                case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
                    break;
                case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
                    return;
            }

            if (req.getUniqKey() != null) { // @3
                indexFile = putKey(indexFile, msg, buildKey(topic, req.getUniqKey()));
                if (indexFile == null) {
                    log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                    return;
                }
            }

            if (keys != null && keys.length() > 0) { // @4
                String[] keyset = keys.split(MessageConst.KEY_SEPARATOR);
                for (int i = 0; i < keyset.length; i++) {
                    String key = keyset[i];
                    if (key.length() > 0) {
                        indexFile = putKey(indexFile, msg, buildKey(topic, key));
                        if (indexFile == null) {
                            log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                            return;
                        }
                    }
                }
            }
        } else {
            log.error("build index error, stop building index");
        }
    }

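Code @1: create or get the IndexFile currently being written.

Code @2: if the maximum offset in the index file is greater than the message's commitlog offset, skip this build.

Code @3, @4: write the message's keys and uniq_keys into the index file. The putKey method deserves a closer look.

First, let's see what a message's keys and uniq_keys actually are.

Both keys and uniqKey are stored in the message's properties map.

keys: can be specified by the user when sending a message; multiple keys are joined with MessageConst.KEY_SEPARATOR, which is a single space in the RocketMQ source (this is also the separator that buildIndex splits on).

uniqKey: the unique key of the message, which is not the same as the message ID. Why? Because the message ID is not unique in the commitlog file: when consumption is retried, the resent message carries the same message ID as the original.

The uniqKey algorithm: see the code in MessageClientIDSetter.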
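The way buildIndex turns one message into several index keys can be sketched as below. This is a simplified stand-in, not the RocketMQ code: the class IndexKeySketch is hypothetical, the separator is assumed to be a single space (mirroring MessageConst.KEY_SEPARATOR), and buildKey is assumed to concatenate topic and key with "#".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of RocketMQ.
class IndexKeySketch {
    // Assumption: same value as MessageConst.KEY_SEPARATOR.
    static final String KEY_SEPARATOR = " ";

    // Assumption: the index key is "topic#key".
    static String buildKey(String topic, String key) {
        return topic + "#" + key;
    }

    // Collects the index keys for one message: the uniqKey first, then each user key.
    static List<String> indexKeys(String topic, String uniqKey, String keys) {
        List<String> result = new ArrayList<>();
        if (uniqKey != null) {
            result.add(buildKey(topic, uniqKey));
        }
        if (keys != null && keys.length() > 0) {
            for (String key : keys.split(KEY_SEPARATOR)) {
                if (key.length() > 0) { // skip empty pieces, as buildIndex does
                    result.add(buildKey(topic, key));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(indexKeys("TopicA", "U1", "k1 k2"));
    }
}
```

Prefixing the topic keeps identical keys from different topics apart in the same index file.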

Next, let's look closely at the IndexService#putKey method:

    private IndexFile putKey(IndexFile indexFile, DispatchRequest msg, String idxKey) {
        for (boolean ok = indexFile.putKey(idxKey, msg.getCommitLogOffset(), msg.getStoreTimestamp()); !ok; ) {
            log.warn("Index file [" + indexFile.getFileName() + "] is full, trying to create another one");

            indexFile = retryGetAndCreateIndexFile();
            if (null == indexFile) {
                return null;
            }

            ok = indexFile.putKey(idxKey, msg.getCommitLogOffset(), msg.getStoreTimestamp());
        }

        return indexFile;
    }

IndexFile#putKey

    public boolean putKey(final String key, final long phyOffset, final long storeTimestamp) { // @1
        if (this.indexHeader.getIndexCount() < this.indexNum) { // @2
            int keyHash = indexKeyHashMethod(key);
            int slotPos = keyHash % this.hashSlotNum; // @3
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize; // @4

            FileLock fileLock = null;

            try {
                // fileLock = this.fileChannel.lock(absSlotPos, hashSlotSize,
                // false);
                int slotValue = this.mappedByteBuffer.getInt(absSlotPos); // @5
                if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()) {
                    slotValue = invalidIndex;
                }

                long timeDiff = storeTimestamp - this.indexHeader.getBeginTimestamp();
                timeDiff = timeDiff / 1000;

                if (this.indexHeader.getBeginTimestamp() <= 0) {
                    timeDiff = 0;
                } else if (timeDiff > Integer.MAX_VALUE) {
                    timeDiff = Integer.MAX_VALUE;
                } else if (timeDiff < 0) {
                    timeDiff = 0;
                } // @6

                int absIndexPos =
                    IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                        + this.indexHeader.getIndexCount() * indexSize; // @7

                this.mappedByteBuffer.putInt(absIndexPos, keyHash); // @8
                this.mappedByteBuffer.putLong(absIndexPos + 4, phyOffset);
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8, (int) timeDiff);
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8 + 4, slotValue);

                this.mappedByteBuffer.putInt(absSlotPos, this.indexHeader.getIndexCount()); // @9

                if (this.indexHeader.getIndexCount() <= 1) { // @10
                    this.indexHeader.setBeginPhyOffset(phyOffset);
                    this.indexHeader.setBeginTimestamp(storeTimestamp);
                }

                this.indexHeader.incHashSlotCount();
                this.indexHeader.incIndexCount();
                this.indexHeader.setEndPhyOffset(phyOffset);
                this.indexHeader.setEndTimestamp(storeTimestamp);

                return true;
            } catch (Exception e) {
                log.error("putKey exception, Key: " + key + " KeyHashCode: " + key.hashCode(), e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        log.error("Failed to release the lock", e);
                    }
                }
            }
        } else {
            log.warn("Over index file capacity: index count = " + this.indexHeader.getIndexCount()
                + "; index max num = " + this.indexNum);
        }

        return false;
    }

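This method also reveals the storage protocol of IndexFile.

Code @1: the parameters:

phyOffset

The offset at which the message is stored in the commitlog.

storeTimestamp

The timestamp at which the message was stored in the commitlog.

Code @2: if the number of entries currently stored in the index file is less than the allowed number, the entry is stored in the current file; otherwise the method returns false, meaning the store failed. IndexService has a retry mechanism that retries 3 times by default.

From code @3 onward, the entry is stored according to the IndexFile file format.

Code @3: first get the hashcode of the key, then take the hashcode modulo hashSlotNum to get the index of the hash slot for this key; hashSlotNum is 5,000,000 by default.

Code @4: compute the absolute position from the hash slot index of the key. This gives a first hint of the IndexFile layout: file header (IndexFileHeader, 40 bytes) + (hashSlotNum * 4).

Code @5: read the value (4 bytes) stored in the key's hash slot; if it is less than or equal to 0 or greater than the current indexCount, set it to 0.

Code @6: compute the difference, in seconds, between the message's store time and the minimum time stored in the current IndexFile.

Code @7: compute the start position of the entry for this key, which equals file header (IndexFileHeader, 40 bytes) + (hashSlotNum * 4) + indexSize (20 bytes per entry) * the number of entries currently stored.

Code @8: fill in the IndexFile entry: 4 bytes (hashcode) + 8 bytes (commitlog offset) + 4 bytes (difference in seconds between the commitlog store time and the time of the first entry in the index file) + 4 bytes (position of the previous entry with the same hashcode; 0 means there is no previous entry).

Code @9: store the position of the newly added entry in the hash slot that corresponds to the key's hashcode. In other words, the slot always holds the newest entry for that hashcode (a hash conflict occurs when different keys have the same hashcode).

Code @10: update the relevant fields of the IndexFile header, such as the minimum time and the current maximum time.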
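The position arithmetic of steps @3 to @7 can be tried in isolation with the sketch below. The class IndexPositionSketch and its methods are hypothetical; the constants (40-byte header, 4-byte slots, 20-byte entries) come from the text, and keyHash is assumed to be non-negative.

```java
// Hypothetical helper, for illustration only; not part of RocketMQ.
class IndexPositionSketch {
    static final int INDEX_HEADER_SIZE = 40;
    static final int HASH_SLOT_SIZE = 4;
    static final int INDEX_SIZE = 20;

    // @3 and @4: absolute byte position of the hash slot for a non-negative key hash.
    static int absSlotPos(int keyHash, int hashSlotNum) {
        int slotPos = keyHash % hashSlotNum;
        return INDEX_HEADER_SIZE + slotPos * HASH_SLOT_SIZE;
    }

    // @7: absolute byte position of the entry with sequence number indexCount.
    static int absIndexPos(int hashSlotNum, int indexCount) {
        return INDEX_HEADER_SIZE + hashSlotNum * HASH_SLOT_SIZE + indexCount * INDEX_SIZE;
    }

    // @6: store-time difference in seconds, clamped to the range [0, Integer.MAX_VALUE].
    static int timeDiffSeconds(long storeTimestamp, long beginTimestamp) {
        long diff = (storeTimestamp - beginTimestamp) / 1000;
        if (beginTimestamp <= 0) {
            diff = 0;
        } else if (diff > Integer.MAX_VALUE) {
            diff = Integer.MAX_VALUE;
        } else if (diff < 0) {
            diff = 0;
        }
        return (int) diff;
    }

    public static void main(String[] args) {
        System.out.println("slot position  = " + absSlotPos(7, 5));
        System.out.println("entry position = " + absIndexPos(5_000_000, 1));
        System.out.println("time diff (s)  = " + timeDiffSeconds(5_500L, 1_000L));
    }
}
```

Storing the time as a 4-byte difference in seconds relative to the header's begin timestamp is what keeps each entry at 20 bytes.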

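From this method we can derive the storage format of IndexFile:

Hash slot: each slot is 4 bytes and stores the position of the newest index entry for the corresponding hashcode.

Index entry: each entry is 20 bytes: 4 bytes (hashcode) + 8 bytes (commitlog offset) + 4 bytes (difference in seconds between the commitlog store time and the time of the first entry in the index file) + 4 bytes (position of the previous entry with the same hashcode; 0 means there is no previous entry).

This design supports hashcode conflicts, where several different keys share the same hashcode. The index entries form a logical linked list, because the last 4 bytes of each entry store the position of the previous one. Once the storage structure is known, searching the index file becomes simple: compute the hashcode from the key, start from the newest entry, check whether the timestamp matches, and obtain the physical address of the message (stored in the commitlog file). The commitlog offset then leads to the actual message, and thus to the final key-value.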
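The slot-plus-linked-list idea can be condensed into a small in-memory model. This is a sketch under simplifying assumptions, not the RocketMQ implementation: MiniIndexFile is a hypothetical class, it uses a heap ByteBuffer instead of a memory-mapped file, keeps the header fields as plain Java fields, and starts the entry counter at 1 so that 0 can mean "no previous entry".

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the index layout; not part of RocketMQ.
class MiniIndexFile {
    static final int HEADER_SIZE = 40, SLOT_SIZE = 4, ENTRY_SIZE = 20;
    final int slotNum, maxEntries;
    final ByteBuffer buf;
    int indexCount = 1;      // entry 0 is reserved: 0 means "no previous entry"
    long beginTimestamp = 0; // store time of the first entry

    MiniIndexFile(int slotNum, int maxEntries) {
        this.slotNum = slotNum;
        this.maxEntries = maxEntries;
        this.buf = ByteBuffer.allocate(HEADER_SIZE + slotNum * SLOT_SIZE + maxEntries * ENTRY_SIZE);
    }

    static int hash(String key) {
        int abs = Math.abs(key.hashCode());
        return abs < 0 ? 0 : abs; // Math.abs(Integer.MIN_VALUE) stays negative
    }

    boolean putKey(String key, long phyOffset, long storeTimestamp) {
        if (indexCount >= maxEntries) return false; // file is full
        int keyHash = hash(key);
        int absSlotPos = HEADER_SIZE + (keyHash % slotNum) * SLOT_SIZE;
        int slotValue = buf.getInt(absSlotPos); // newest entry in this slot so far
        if (slotValue <= 0 || slotValue > indexCount) slotValue = 0;
        if (indexCount <= 1) beginTimestamp = storeTimestamp;
        long diff = (storeTimestamp - beginTimestamp) / 1000;
        int timeDiff = (int) Math.max(0L, Math.min(Integer.MAX_VALUE, diff));
        int absIndexPos = HEADER_SIZE + slotNum * SLOT_SIZE + indexCount * ENTRY_SIZE;
        buf.putInt(absIndexPos, keyHash);
        buf.putLong(absIndexPos + 4, phyOffset);
        buf.putInt(absIndexPos + 12, timeDiff);
        buf.putInt(absIndexPos + 16, slotValue); // link to the previous entry
        buf.putInt(absSlotPos, indexCount);      // slot now points at the newest entry
        indexCount++;
        return true;
    }

    // Walks the chain from the newest entry backwards, collecting matching offsets.
    List<Long> select(String key, int maxNum, long begin, long end) {
        List<Long> result = new ArrayList<>();
        int keyHash = hash(key);
        int next = buf.getInt(HEADER_SIZE + (keyHash % slotNum) * SLOT_SIZE);
        while (next > 0 && next < indexCount && result.size() < maxNum) {
            int pos = HEADER_SIZE + slotNum * SLOT_SIZE + next * ENTRY_SIZE;
            int keyHashRead = buf.getInt(pos);
            long phyOffsetRead = buf.getLong(pos + 4);
            long timeRead = beginTimestamp + buf.getInt(pos + 12) * 1000L;
            int prev = buf.getInt(pos + 16);
            if (keyHashRead == keyHash && timeRead >= begin && timeRead <= end) {
                result.add(phyOffsetRead);
            }
            if (prev == next || timeRead < begin) break;
            next = prev;
        }
        return result;
    }

    public static void main(String[] args) {
        MiniIndexFile idx = new MiniIndexFile(4, 16);
        idx.putKey("TopicA#k1", 100L, 1_000L);
        idx.putKey("TopicA#k1", 200L, 3_000L);
        idx.putKey("TopicA#k2", 300L, 5_000L);
        System.out.println(idx.select("TopicA#k1", 10, 0L, 10_000L)); // [200, 100]
    }
}
```

Results come back newest first because each slot points at the most recent entry and every entry links to its predecessor. Only the hashcode is compared here, so two different keys with the same hashcode would both match.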

While we are here, let's also look at IndexFile#selectPhyOffset.

    public void selectPhyOffset(final List<Long> phyOffsets, final String key, final int maxNum,
        final long begin, final long end, boolean lock) { // @1
        if (this.mappedFile.hold()) {
            int keyHash = indexKeyHashMethod(key);
            int slotPos = keyHash % this.hashSlotNum;
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize; // @2

            FileLock fileLock = null;
            try {
                if (lock) {
                    // fileLock = this.fileChannel.lock(absSlotPos,
                    // hashSlotSize, true);
                }

                int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
                // if (fileLock != null) {
                // fileLock.release();
                // fileLock = null;
                // }

                if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()
                    || this.indexHeader.getIndexCount() <= 1) { // @3
                } else {
                    // @4 Start looping: index entries with the same hashcode form a
                    // non-contiguous singly linked list, the newest pointing to the previous one.
                    for (int nextIndexToRead = slotValue; ; ) {
                        if (phyOffsets.size() >= maxNum) {
                            break;
                        }

                        int absIndexPos =
                            IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                                + nextIndexToRead * indexSize;

                        // Read the fields of the entry at this position
                        int keyHashRead = this.mappedByteBuffer.getInt(absIndexPos);
                        long phyOffsetRead = this.mappedByteBuffer.getLong(absIndexPos + 4);
                        long timeDiff = (long) this.mappedByteBuffer.getInt(absIndexPos + 4 + 8);
                        int prevIndexRead = this.mappedByteBuffer.getInt(absIndexPos + 4 + 8 + 4);

                        if (timeDiff < 0) { // An illegal time means the entry is invalid
                            break;
                        }

                        timeDiff *= 1000L;

                        long timeRead = this.indexHeader.getBeginTimestamp() + timeDiff;
                        boolean timeMatched = (timeRead >= begin) && (timeRead <= end);

                        if (keyHash == keyHashRead && timeMatched) { // @5
                            phyOffsets.add(phyOffsetRead);
                        }

                        if (prevIndexRead <= invalidIndex
                            || prevIndexRead > this.indexHeader.getIndexCount()
                            || prevIndexRead == nextIndexToRead || timeRead < begin) {
                            break;
                        }

                        nextIndexToRead = prevIndexRead;
                    }
                }
            } catch (Exception e) {
                log.error("selectPhyOffset exception ", e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        log.error("Failed to release the lock", e);
                    }
                }

                this.mappedFile.release();
            }
        }
    }
