RocketMQ Source Code Analysis, Storage Part (5): the `IndexService` Class and the IndexFile Message Index Files

Article link: https://blog.csdn.net/szhlcy/article/details/115907118

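IndexFile explained

Earlier articles covered RocketMQ's physical log file, CommitLog, and its logical log file, ConsumeQueue. This article covers the corresponding message index file, IndexFile.

Overview

An IndexFile (index file) provides a way to query messages by key or by time range. Index files are stored under $HOME/store/index/{fileName}, where fileName is the timestamp at which the file was created. Each IndexFile has a fixed size of about 400 MB (40 + 5,000,000 * 4 + 20,000,000 * 20 = 420,000,040 bytes with the default settings) and can hold up to 20 million index entries. The underlying storage is designed as a HashMap-like structure laid out in a file, so RocketMQ's index file is in effect a hash index.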

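File structure

The structure of the file is illustrated by a diagram found online.
[Figure: overall layout of an IndexFile]
The diagram shows the overall structure of an IndexFile, which has three parts: first, the file header (40 bytes); second, the hash slots (4 bytes per slot, 5 million slots in total); third, the index entries, which form linked lists (20 bytes per entry, 20 million entries in total). Each part is described below.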

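Index Header: records information about the file as a whole
- beginTimestamp: the time at which the first indexed message was stored on the Broker;
- endTimestamp: the time at which the last indexed message was stored on the Broker;
- beginPhyOffset: the CommitLog offset of the first indexed message;
- endPhyOffset: the CommitLog offset of the last indexed message;
- hashSlotCount: the number of hash slots in use;
- indexCount: the number of index entries built.
Slot Table (hash slots): each slot stores the position, in the Index part, of the most recent entry whose key maps to that slot. To find the slot, the message topic and key are joined with # (topic#key), the resulting string is hashed, and the hash is taken modulo the total number of slots.
Index linked list: each Index entry stores the details for one message key, plus the link used to handle hash collisions
- keyHash: the hash of the topic#key string (key is the message key)
- phyOffset: the message's physical offset in the CommitLog
- timeOffset: the time offset, that is, the difference in seconds between the message's store time and beginTimestamp in the Index Header
- slotValue (the value that resolves slot collisions): when the slot selected by the hash of topic#key modulo 5 million already holds a value (that is, there is a collision in the Slot Table), the slot is overwritten with the number of the newest Index entry, and the previous value of the slot is written into the newest entry's slotValue field. The entries that share a slot therefore form a linked list.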
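
The layout above boils down to a few lines of offset arithmetic. The following is a minimal sketch of it, using the sizes given in this section; the class and method names (IndexLayoutSketch, fileSize, slotOffset, entryOffset) are made up for illustration and are not part of RocketMQ.

```java
// Sketch of the IndexFile layout arithmetic; all names here are hypothetical.
final class IndexLayoutSketch {
    static final int HEADER_SIZE = 40; // bytes in the file header
    static final int SLOT_SIZE = 4;    // bytes per hash slot
    static final int ENTRY_SIZE = 20;  // bytes per index entry

    /** Total file size in bytes for the given number of slots and entries. */
    static long fileSize(int slotNum, int indexNum) {
        return HEADER_SIZE + (long) slotNum * SLOT_SIZE + (long) indexNum * ENTRY_SIZE;
    }

    /** Byte offset of the hash slot at position slotPos. */
    static long slotOffset(int slotPos) {
        return HEADER_SIZE + (long) slotPos * SLOT_SIZE;
    }

    /** Byte offset of the index entry with number indexNo (entries follow the slot table). */
    static long entryOffset(int slotNum, int indexNo) {
        return HEADER_SIZE + (long) slotNum * SLOT_SIZE + (long) indexNo * ENTRY_SIZE;
    }

    public static void main(String[] args) {
        int slotNum = 5_000_000;   // default number of slots
        int indexNum = 20_000_000; // default number of entries
        System.out.println(fileSize(slotNum, indexNum)); // 420000040
        System.out.println(slotOffset(3));               // 52
        System.out.println(entryOffset(slotNum, 1));     // 20000060
    }
}
```

With the default settings the total is 420,000,040 bytes, which still fits in a Java int. Because 0 is used to mark an invalid index, entry number 0 is never written and the first real entry is number 1.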

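Classes related to IndexFile

The `IndexHeader` class for the IndexFile header

The IndexHeader class simply models the header of an IndexFile. It has no complex methods, only getters and setters for its fields.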

    // size of the IndexFile header in bytes
    public static final int INDEX_HEADER_SIZE = 40;
    // offset of beginTimestamp: store time of the first indexed message on the Broker
    private static int beginTimestampIndex = 0;
    // offset of endTimestamp: store time of the last indexed message on the Broker
    private static int endTimestampIndex = 8;
    // offset of beginPhyOffset: CommitLog offset of the first indexed message
    private static int beginPhyoffsetIndex = 16;
    // offset of endPhyOffset: CommitLog offset of the last indexed message
    private static int endPhyoffsetIndex = 24;
    // offset of hashSlotCount: number of hash slots in use
    private static int hashSlotcountIndex = 32;
    // offset of indexCount: number of index entries built
    private static int indexCountIndex = 36;

    // atomic holders for the values of the six header fields
    private AtomicLong beginTimestamp = new AtomicLong(0);
    private AtomicLong endTimestamp = new AtomicLong(0);
    private AtomicLong beginPhyOffset = new AtomicLong(0);
    private AtomicLong endPhyOffset = new AtomicLong(0);
    private AtomicInteger hashSlotCount = new AtomicInteger(0);
    private AtomicInteger indexCount = new AtomicInteger(1);

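As shown, six static fields hold the byte offsets of the six header fields within the file, and the values of those header fields are kept in atomic classes. Note that indexCount starts at 1, because 0 is reserved to mark an invalid index.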
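
To make the role of these offsets concrete, here is a minimal sketch of writing and reading a 40-byte header through the absolute get and put methods of java.nio.ByteBuffer. It assumes the offsets listed above; IndexHeaderSketch, write and read are hypothetical names for illustration and not the actual RocketMQ methods.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Sketch only: IndexHeaderSketch and its methods are hypothetical names, not RocketMQ's.
final class IndexHeaderSketch {
    static final int HEADER_SIZE = 40;
    static final int BEGIN_TIMESTAMP = 0;   // long, 8 bytes
    static final int END_TIMESTAMP = 8;     // long, 8 bytes
    static final int BEGIN_PHY_OFFSET = 16; // long, 8 bytes
    static final int END_PHY_OFFSET = 24;   // long, 8 bytes
    static final int HASH_SLOT_COUNT = 32;  // int, 4 bytes
    static final int INDEX_COUNT = 36;      // int, 4 bytes

    /** Writes the six header fields at their fixed offsets using absolute puts. */
    static void write(ByteBuffer buf, long beginTs, long endTs, long beginOff, long endOff,
                      int slotCount, int indexCount) {
        buf.putLong(BEGIN_TIMESTAMP, beginTs);
        buf.putLong(END_TIMESTAMP, endTs);
        buf.putLong(BEGIN_PHY_OFFSET, beginOff);
        buf.putLong(END_PHY_OFFSET, endOff);
        buf.putInt(HASH_SLOT_COUNT, slotCount);
        buf.putInt(INDEX_COUNT, indexCount);
    }

    /** Reads the six header fields back, in field order, widened to long. */
    static long[] read(ByteBuffer buf) {
        return new long[] {
            buf.getLong(BEGIN_TIMESTAMP), buf.getLong(END_TIMESTAMP),
            buf.getLong(BEGIN_PHY_OFFSET), buf.getLong(END_PHY_OFFSET),
            buf.getInt(HASH_SLOT_COUNT), buf.getInt(INDEX_COUNT)
        };
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE);
        write(buf, 1000L, 5000L, 0L, 4096L, 2, 3);
        System.out.println(Arrays.toString(read(buf))); // [1000, 5000, 0, 4096, 2, 3]
    }
}
```

Four 8-byte fields followed by two 4-byte fields add up to exactly the 40 bytes of INDEX_HEADER_SIZE.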

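The `IndexFile` class for reading and writing an IndexFile

Operations on an IndexFile are implemented by the IndexFile class, which supports inserting index entries into the file and querying them.

Fields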

    // size of one hash slot in bytes
    private static int hashSlotSize = 4;
    // size of one index entry in bytes
    private static int indexSize = 20;
    // value that marks an invalid index
    private static int invalidIndex = 0;
    // total number of hash slots
    private final int hashSlotNum;
    // number of index entries
    private final int indexNum;
    // mapped file object for the IndexFile
    private final MappedFile mappedFile;
    private final FileChannel fileChannel;
    private final MappedByteBuffer mappedByteBuffer;
    // header of the IndexFile
    private final IndexHeader indexHeader;

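These fields mainly record the size of a single slot and a single index entry, the number of hash slots and index entries, and the objects that map the file.

Method analysis

Constructor

The constructor mainly sets the key parameters. It computes the total file size from its arguments (header size + slot size * number of slots + index entry size * number of index entries), creates the file, and sets up the header object, IndexHeader.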

    public IndexFile(final String fileName, final int hashSlotNum, final int indexNum,
        final long endPhyOffset, final long endTimestamp) throws IOException {
        // file size = header size + slot size * number of slots + index entry size * number of index entries
        int fileTotalSize =
            IndexHeader.INDEX_HEADER_SIZE + (hashSlotNum * hashSlotSize) + (indexNum * indexSize);
        // create the mapped file object
        this.mappedFile = new MappedFile(fileName, fileTotalSize);
        // get its FileChannel
        this.fileChannel = this.mappedFile.getFileChannel();
        // get the mapped buffer of the file
        this.mappedByteBuffer = this.mappedFile.getMappedByteBuffer();
        // number of hash slots
        this.hashSlotNum = hashSlotNum;
        // number of index entries
        this.indexNum = indexNum;

        ByteBuffer byteBuffer = this.mappedByteBuffer.slice();
        // create the IndexHeader object for this file
        this.indexHeader = new IndexHeader(byteBuffer);
        // initialise beginPhyOffset and endPhyOffset in the header
        if (endPhyOffset > 0) {
            this.indexHeader.setBeginPhyOffset(endPhyOffset);
            this.indexHeader.setEndPhyOffset(endPhyOffset);
        }
        // initialise beginTimestamp and endTimestamp in the header
        if (endTimestamp > 0) {
            this.indexHeader.setBeginTimestamp(endTimestamp);
            this.indexHeader.setEndTimestamp(endTimestamp);
        }
    }

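The `putKey` method, which stores the index entry for a key

This method is called when a message is dispatched after it has been written to the CommitLog. The call chain is roughly: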

ReputMessageService#run
    ReputMessageService#doReput
        DefaultMessageStore#doDispatch
            CommitLogDispatcherBuildIndex#dispatch
                IndexService#buildIndex
                    IndexService#putKey
                        IndexFile#putKey

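The method builds an index entry from three arguments: the message key, the message's physical offset in the CommitLog, and the message's store time. The main steps are:

1. Compare the indexCount recorded in IndexHeader with the indexNum held by IndexFile to check whether the file is full; if it is, return immediately.
2. Compute the hash slot for the key and read the value currently in that slot. If the value is a valid index number, the slot is already in use (a collision), so the value is kept; it is saved when the index entry is written.
3. Compute the position of the next free entry in the index area and write the entry there. If the slot was already in use, the previous value of the slot is stored in the entry's slotValue field.
4. Write the number of the new entry into the hash slot.

The source code is as follows: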

    public boolean putKey(final String key, final long phyOffset, final long storeTimestamp) {
        // insert only while the number of entries built so far is below the maximum; otherwise return false
        if (this.indexHeader.getIndexCount() < this.indexNum) {
            // hash of the key, derived from String's built-in hashCode
            int keyHash = indexKeyHashMethod(key);
            // position of the hash slot for this key
            int slotPos = keyHash % this.hashSlotNum;
            // absolute offset of that slot: header size + slot position * slot size, i.e. 40 + slotPos * 4
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize;
            FileLock fileLock = null;
            try {
                // fileLock = this.fileChannel.lock(absSlotPos, hashSlotSize,
                // false);
                // read the 4 bytes at the slot: the number of the latest index entry for this slot
                int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
                // an out-of-range value means the slot is empty; otherwise this key collides with an earlier key
                // and the old value is kept so that the new entry can link to it
                if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()) {
                    slotValue = invalidIndex;
                }
                // store time minus the begin time recorded in the header gives the time difference
                long timeDiff = storeTimestamp - this.indexHeader.getBeginTimestamp();
                // convert milliseconds to seconds
                timeDiff = timeDiff / 1000;
                // if the begin time in the header is not set (<= 0), use 0; clamp values above Integer.MAX_VALUE;
                // treat negative differences as 0
                if (this.indexHeader.getBeginTimestamp() <= 0) {
                    timeDiff = 0;
                } else if (timeDiff > Integer.MAX_VALUE) {
                    timeDiff = Integer.MAX_VALUE;
                } else if (timeDiff < 0) {
                    timeDiff = 0;
                }
                /**
                 * Absolute offset of the entry to write: header size + number of slots * slot size + indexCount * entry size,
                 * i.e. 40 + 5,000,000 * 4 + 20 * indexCount with the default settings
                 */
                int absIndexPos =
                    IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                        + this.indexHeader.getIndexCount() * indexSize;
                // write keyHash into the entry
                this.mappedByteBuffer.putInt(absIndexPos, keyHash);
                // write phyOffset into the entry
                this.mappedByteBuffer.putLong(absIndexPos + 4, phyOffset);
                // write timeDiff into the entry
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8, (int) timeDiff);
                // write slotValue (the previous entry of this slot) into the entry
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8 + 4, slotValue);
                // point the hash slot at the new entry
                this.mappedByteBuffer.putInt(absSlotPos, this.indexHeader.getIndexCount());
                // indexCount <= 1 means this is the first entry in the file, so record the initial values
                if (this.indexHeader.getIndexCount() <= 1) {
                    this.indexHeader.setBeginPhyOffset(phyOffset);
                    this.indexHeader.setBeginTimestamp(storeTimestamp);
                }
                // the slot was empty before, so one more slot is now in use
                if (invalidIndex == slotValue) {
                    this.indexHeader.incHashSlotCount();
                }
                // increment indexCount
                this.indexHeader.incIndexCount();
                // record the offset of the last indexed message
                this.indexHeader.setEndPhyOffset(phyOffset);
                // record the store time of the last indexed message
                this.indexHeader.setEndTimestamp(storeTimestamp);

                return true;
            } catch (Exception e) {
                log.error("putKey exception, Key: " + key + " KeyHashCode: " + key.hashCode(), e);
            } finally {
                if (fileLock != null) {
                    try {
                        // release the file lock
                        fileLock.release();
                    } catch (IOException e) {
                        log.error("Failed to release the lock", e);
                    }
                }
            }
        } else {
            log.warn("Over index file capacity: index count = " + this.indexHeader.getIndexCount()
                + "; index max num = " + this.indexNum);
        }

        return false;
    }
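
Two details of putKey are easy to miss: the key hash must be non-negative before it is used as a slot position, and the time difference is stored in whole seconds, clamped to the range of an int. The sketch below illustrates both. It assumes that indexKeyHashMethod takes the absolute value of String.hashCode() and maps the one value that stays negative to 0; all names in the sketch (IndexKeyMathSketch and its methods) are hypothetical.

```java
// Sketch of the hash and time-difference arithmetic; all names here are hypothetical.
final class IndexKeyMathSketch {
    /** Maps any int to a non-negative value. Math.abs(Integer.MIN_VALUE) is still negative, so it becomes 0. */
    static int toNonNegative(int hash) {
        int abs = Math.abs(hash);
        return abs < 0 ? 0 : abs;
    }

    /** Slot position of a key: non-negative hash modulo the number of slots. */
    static int slotPos(String key, int slotNum) {
        return toNonNegative(key.hashCode()) % slotNum;
    }

    /** Whole seconds between storeTimestamp and beginTimestamp, clamped to [0, Integer.MAX_VALUE]. */
    static int encodeTimeDiff(long storeTimestamp, long beginTimestamp) {
        if (beginTimestamp <= 0) {
            return 0; // begin time not set yet
        }
        long diff = (storeTimestamp - beginTimestamp) / 1000;
        if (diff > Integer.MAX_VALUE) {
            return Integer.MAX_VALUE;
        }
        return diff < 0 ? 0 : (int) diff;
    }

    /** Reconstructs the store time in milliseconds from the stored difference. */
    static long decodeTime(long beginTimestamp, int timeDiff) {
        return beginTimestamp + timeDiff * 1000L;
    }

    public static void main(String[] args) {
        System.out.println(toNonNegative(Integer.MIN_VALUE)); // 0
        System.out.println(encodeTimeDiff(13_999L, 10_000L)); // 3
        System.out.println(decodeTime(10_000L, 3));           // 13000
    }
}
```

Because the difference is truncated to seconds, a store time of 13,999 ms is read back as 13,000 ms, so time-range queries against the index are only accurate to the second.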

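The `selectPhyOffset` method, which looks up messages by key and time range

Given a message key and a store-time range, this method finds the offsets of the matching messages in the CommitLog. The main steps are:

1. Compute the hash slot for the given key.
2. Read the index entry number stored in that slot.
3. Check whether that number is valid (greater than 0 and no greater than indexCount). If it is not, nothing has been indexed under this slot and the method returns; otherwise go to step 4.
4. Read the index entry with that number. If its keyHash equals the hash of the key, and beginTimestamp from IndexHeader plus the entry's timeOffset falls within the requested time range, add its phyOffset to the result. Then follow the entry's slotValue to the previous entry in the same slot and repeat step 4, until the link is invalid, an entry older than the start of the range is reached, or maxNum offsets have been collected.

In short, the method finds message offsets by key. The source code is as follows: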

    /**
     * Collects the physical offsets of messages that match a key and a store-time range
     * @param phyOffsets list that receives the physical offsets found
     * @param key the key to look up, in the form topic#key
     * @param maxNum maximum number of offsets to collect
     * @param begin start of the store-time range
     * @param end end of the store-time range
     * @param lock whether to take a file lock; currently no lock is taken
     */
    public void selectPhyOffset(final List<Long> phyOffsets, final String key, final int maxNum,
        final long begin, final long end, boolean lock) {
        if (this.mappedFile.hold()) {
            // compute the hash slot position of the key
            int keyHash = indexKeyHashMethod(key);
            int slotPos = keyHash % this.hashSlotNum;
            // absolute offset of the slot
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize;

            FileLock fileLock = null;
            try {
                if (lock) {
                    // fileLock = this.fileChannel.lock(absSlotPos,
                    // hashSlotSize, true);
                }
                // read the index entry number stored in the slot
                int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
                // if (fileLock != null) {
                // fileLock.release();
                // fileLock = null;
                // }
                // an invalid slot value means nothing is indexed under this slot, so there is nothing to do;
                // otherwise walk the chain of entries for this slot
                if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()
                    || this.indexHeader.getIndexCount() <= 1) {
                } else {
                    // follow the chain until it ends
                    for (int nextIndexToRead = slotValue; ; ) {
                        // stop once enough offsets have been collected
                        if (phyOffsets.size() >= maxNum) {
                            break;
                        }
                        // absolute offset of the index entry
                        int absIndexPos =
                            IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                                + nextIndexToRead * indexSize;
                        // hash of the key stored in the entry
                        int keyHashRead = this.mappedByteBuffer.getInt(absIndexPos);
                        // physical offset of the message
                        long phyOffsetRead = this.mappedByteBuffer.getLong(absIndexPos + 4);
                        // time offset in seconds
                        long timeDiff = (long) this.mappedByteBuffer.getInt(absIndexPos + 4 + 8);
                        // number of the previous entry in the same slot
                        int prevIndexRead = this.mappedByteBuffer.getInt(absIndexPos + 4 + 8 + 4);
                        // a negative time offset is invalid: stop
                        if (timeDiff < 0) {
                            break;
                        }

                        timeDiff *= 1000L;
                        // reconstruct the store time of the message
                        long timeRead = this.indexHeader.getBeginTimestamp() + timeDiff;
                        // check whether the store time is within the requested range
                        boolean timeMatched = (timeRead >= begin) && (timeRead <= end);
                        // add the physical offset of a matching message to the result
                        if (keyHash == keyHashRead && timeMatched) {
                            phyOffsets.add(phyOffsetRead);
                        }
                        // stop if the link to the previous entry is invalid or this entry is already older than
                        // the start of the range; otherwise continue with the previous entry
                        if (prevIndexRead <= invalidIndex
                            || prevIndexRead > this.indexHeader.getIndexCount()
                            || prevIndexRead == nextIndexToRead || timeRead < begin) {
                            break;
                        }
                        nextIndexToRead = prevIndexRead;
                    }
                }
                    }
                }
            } catch (Exception e) {
                log.error("selectPhyOffset exception ", e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        log.error("Failed to release the lock", e);
                    }
                }

                this.mappedFile.release();
            }
        }
    }
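
The interplay between putKey and selectPhyOffset is easier to see on a tiny in-memory model. The sketch below keeps the same layout (40-byte header area, 4-byte slots, 20-byte entries) in a heap ByteBuffer and leaves out the time fields, file mapping and locking. MiniHashIndex and its methods are hypothetical and only illustrate the chaining idea; they are not the RocketMQ implementation.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// In-memory miniature of the slot table plus chained entries; not the RocketMQ class.
final class MiniHashIndex {
    static final int HEADER = 40, SLOT = 4, ENTRY = 20;
    final int slotNum;
    final int indexNum;
    final ByteBuffer buf;
    int indexCount = 1; // entry number 0 is reserved as "invalid"

    MiniHashIndex(int slotNum, int indexNum) {
        this.slotNum = slotNum;
        this.indexNum = indexNum;
        this.buf = ByteBuffer.allocate(HEADER + slotNum * SLOT + indexNum * ENTRY);
    }

    static int hash(String key) {
        int h = Math.abs(key.hashCode());
        return h < 0 ? 0 : h;
    }

    /** Appends an entry and makes the slot point at it; returns false when the index is full. */
    boolean put(String key, long phyOffset) {
        if (indexCount >= indexNum) {
            return false;
        }
        int keyHash = hash(key);
        int slotPos = HEADER + (keyHash % slotNum) * SLOT;
        int prev = buf.getInt(slotPos);      // 0 when the slot is still empty
        int entryPos = HEADER + slotNum * SLOT + indexCount * ENTRY;
        buf.putInt(entryPos, keyHash);
        buf.putLong(entryPos + 4, phyOffset);
        buf.putInt(entryPos + 12, 0);        // time difference omitted in this sketch
        buf.putInt(entryPos + 16, prev);     // link to the previous entry of this slot
        buf.putInt(slotPos, indexCount++);   // the slot now points at the newest entry
        return true;
    }

    /** Walks the chain of the key's slot, newest entry first, collecting offsets with a matching hash. */
    List<Long> select(String key, int maxNum) {
        List<Long> result = new ArrayList<>();
        int keyHash = hash(key);
        int next = buf.getInt(HEADER + (keyHash % slotNum) * SLOT);
        while (next > 0 && next < indexCount && result.size() < maxNum) {
            int entryPos = HEADER + slotNum * SLOT + next * ENTRY;
            if (buf.getInt(entryPos) == keyHash) {
                result.add(buf.getLong(entryPos + 4));
            }
            int prev = buf.getInt(entryPos + 16);
            if (prev == next) {
                break; // guard against a self-referencing link
            }
            next = prev;
        }
        return result;
    }

    public static void main(String[] args) {
        MiniHashIndex index = new MiniHashIndex(1, 8); // a single slot forces every key to collide
        index.put("TopicA#k1", 100L);
        index.put("TopicA#k2", 200L);
        index.put("TopicA#k1", 300L);
        System.out.println(index.select("TopicA#k1", 10)); // [300, 100]
    }
}
```

Results come back newest first because each slot always points at the most recently written entry. Since only the hash is compared, two different keys with the same hash would both be returned, so a caller that needs exact matches has to check the key of each message it reads from the CommitLog.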

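The `IndexService` class, which manages the set of IndexFile files

IndexService wraps multiple IndexFile objects and is the outermost class for working with index files. Many of its methods resemble those of the CommitLog class (for CommitLog files) and the ConsumeQueue class (for ConsumeQueue files); see the two earlier articles in this series, which analyze those classes.

The logic for loading, creating and deleting files is not analyzed here. The focus is on the method that builds the index for a message and the method that queries messages by key and time range.

Fields

The fields of IndexService mainly configure the number of hash slots and the number of index entries in a single IndexFile.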

    // maximum number of attempts to create an IndexFile
    private static final int MAX_TRY_IDX_CREATE = 3;
    // the message store
    private final DefaultMessageStore defaultMessageStore;
    // number of hash slots
    private final int hashSlotNum;
    // number of index entries
    private final int indexNum;
    // storage path
    private final String storePath;
    // list of IndexFile objects
    private final ArrayList<IndexFile> indexFileList = new ArrayList<IndexFile>();
    // read-write lock
    private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

Methods

Constructor

    public IndexService(final DefaultMessageStore store) {
        this.defaultMessageStore = store;
        // number of hash slots, 5,000,000 by default
        this.hashSlotNum = store.getMessageStoreConfig().getMaxHashSlotNum();
        // number of index entries, by default 5000000 * 4, i.e. 20,000,000
        this.indexNum = store.getMessageStoreConfig().getMaxIndexNum();
        // storage path
        this.storePath =
            StorePathConfigHelper.getStorePathIndex(store.getMessageStoreConfig().getStorePathRootDir());
    }

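Both parameters can be set through configuration:

Parameter	Meaning
maxHashSlotNum	Number of hash slots in an IndexFile, 5 million by default
maxIndexNum	Number of index entries in an IndexFile, 20 million by default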

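The `buildIndex` method, which creates and stores the index for a message

The logic of buildIndex is fairly simple: it builds the stored key from the topic and the key(s) of the message in the request, then calls into the IndexFile class. Rollback messages of transactional messages are not indexed.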

    public void buildIndex(DispatchRequest req) {
        // get the current IndexFile, creating one if needed; at most 3 attempts
        IndexFile indexFile = retryGetAndCreateIndexFile();
        if (indexFile != null) {
            long endPhyOffset = indexFile.getEndPhyOffset();
            DispatchRequest msg = req;
            // topic and keys of the message in the dispatch request
            String topic = msg.getTopic();
            String keys = msg.getKeys();
            // a CommitLog offset below the last offset recorded in the IndexFile means the message is already indexed
            if (msg.getCommitLogOffset() < endPhyOffset) {
                return;
            }
            // transaction type of the message; rollback messages are not indexed
            final int tranType = MessageSysFlag.getTransactionValue(msg.getSysFlag());
            switch (tranType) {
                case MessageSysFlag.TRANSACTION_NOT_TYPE:
                case MessageSysFlag.TRANSACTION_PREPARED_TYPE:
                case MessageSysFlag.TRANSACTION_COMMIT_TYPE:
                    break;
                case MessageSysFlag.TRANSACTION_ROLLBACK_TYPE:
                    return;
            }

            if (req.getUniqKey() != null) {
                // store the entry under a key of the form topic + "#" + key
                indexFile = putKey(indexFile, msg, buildKey(topic, req.getUniqKey()));
                if (indexFile == null) {
                    log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                    return;
                }
            }

            if (keys != null && keys.length() > 0) {
                String[] keyset = keys.split(MessageConst.KEY_SEPARATOR);
                for (int i = 0; i < keyset.length; i++) {
                    String key = keyset[i];
                    if (key.length() > 0) {
                        indexFile = putKey(indexFile, msg, buildKey(topic, key));
                        if (indexFile == null) {
                            log.error("putKey error commitlog {} uniqkey {}", req.getCommitLogOffset(), req.getUniqKey());
                            return;
                        }
                    }
                }
            }
        } else {
            log.error("build index error, stop building index");
        }
    }
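
The key handling in buildIndex can be sketched in isolation: one index key for the unique key, plus one per non-empty user key. The sketch assumes that the key separator is a single space (the value believed to be behind MessageConst.KEY_SEPARATOR) and that buildKey simply joins topic and key with "#", as described above; IndexKeySketch and indexKeys are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of how index keys are derived from one message; names are hypothetical.
final class IndexKeySketch {
    // Assumption: matches the separator used between user keys in a message.
    static final String KEY_SEPARATOR = " ";

    /** Index key format described in the article: topic#key. */
    static String buildKey(String topic, String key) {
        return topic + "#" + key;
    }

    /** Returns every index key that would be stored for one message. */
    static List<String> indexKeys(String topic, String uniqKey, String keys) {
        List<String> result = new ArrayList<>();
        if (uniqKey != null) {
            result.add(buildKey(topic, uniqKey));
        }
        if (keys != null && keys.length() > 0) {
            for (String key : keys.split(KEY_SEPARATOR)) {
                if (key.length() > 0) { // consecutive separators produce empty strings; skip them
                    result.add(buildKey(topic, key));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [TopicA#UNIQ1, TopicA#order1, TopicA#order2]
        System.out.println(indexKeys("TopicA", "UNIQ1", "order1  order2"));
    }
}
```

A message with a unique key and two user keys therefore consumes three index entries, which is worth keeping in mind when estimating how quickly an IndexFile fills up.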

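The `queryOffset` method, which queries messages by key and time range

queryOffset is also fairly simple. It first finds the IndexFile files whose time span matches the requested store-time range, then asks each IndexFile for the physical offsets of the messages that match the topic and key.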

    public QueryOffsetResult queryOffset(String topic, String key, int maxNum, long begin, long end) {
        List<Long> phyOffsets = new ArrayList<Long>(maxNum);

        long indexLastUpdateTimestamp = 0;
        long indexLastUpdatePhyoffset = 0;
        // cap the requested count at the configured maxMsgsNumBatch, i.e. take the smaller of the two
        maxNum = Math.min(maxNum, this.defaultMessageStore.getMessageStoreConfig().getMaxMsgsNumBatch());
        try {
            this.readWriteLock.readLock().lock();
            // if there are index files, iterate over them from newest to oldest
            if (!this.indexFileList.isEmpty()) {
                for (int i = this.indexFileList.size(); i > 0; i--) {
                    // get the IndexFile
                    IndexFile f = this.indexFileList.get(i - 1);
                    boolean lastFile = i == this.indexFileList.size();
                    // for the last (newest) IndexFile, record its last update time and largest offset
                    if (lastFile) {
                        indexLastUpdateTimestamp = f.getEndTimestamp();
                        indexLastUpdatePhyoffset = f.getEndPhyOffset();
                    }
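                    /**
                     * Check whether the time range matches this file:
                     * 1. the query range encloses the file's whole beginTimestamp..endTimestamp span, or
                     * 2. the start time lies between beginTimestamp and endTimestamp, or
                     * 3. the end time lies between beginTimestamp and endTimestamp
                     */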
                    if (f.isTimeMatched(begin, end)) {
                        // collect the physical offsets of entries that match the key
                        f.selectPhyOffset(phyOffsets, buildKey(topic, key), maxNum, begin, end, lastFile);
                    }

                    if (f.getBeginTimestamp() < begin) {
                        break;
                    }

                    if (phyOffsets.size() >= maxNum) {
                        break;
                    }
                }
            }
        } catch (Exception e) {
            log.error("queryMsg exception", e);
        } finally {
            this.readWriteLock.readLock().unlock();
        }

        return new QueryOffsetResult(phyOffsets, indexLastUpdateTimestamp, indexLastUpdatePhyoffset);
    }
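
The file-selection test used in the loop can be sketched as a plain interval check. This is a sketch under the assumption that isTimeMatched accepts a file when the query range encloses the file's time span or when either end of the query range falls inside it; TimeMatchSketch, its parameters and the overlaps helper are hypothetical names.

```java
// Sketch of the time-range test between a query and one index file; names are hypothetical.
final class TimeMatchSketch {
    /** Three-case check: query encloses the file, or query start or query end lies inside the file's span. */
    static boolean isTimeMatched(long fileBegin, long fileEnd, long begin, long end) {
        boolean encloses = begin < fileBegin && end > fileEnd;
        boolean beginInside = begin >= fileBegin && begin <= fileEnd;
        boolean endInside = end >= fileBegin && end <= fileEnd;
        return encloses || beginInside || endInside;
    }

    /** The usual closed-interval overlap test, which gives the same answer whenever begin <= end. */
    static boolean overlaps(long fileBegin, long fileEnd, long begin, long end) {
        return begin <= fileEnd && end >= fileBegin;
    }

    public static void main(String[] args) {
        System.out.println(isTimeMatched(100, 200, 50, 300));  // true: query encloses the file
        System.out.println(isTimeMatched(100, 200, 150, 500)); // true: query start is inside
        System.out.println(isTimeMatched(100, 200, 201, 300)); // false: query starts after the file ends
    }
}
```

Because the files are visited from newest to oldest, the loop in queryOffset can stop as soon as it reaches a file whose beginTimestamp is earlier than the start of the query range: every older file ends before that point.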