How RocketMQ's Key Index Handles Inserts and Queries

weixin_34294649

Original link: https://my.oschina.net/liangxiao/blog/3054702

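Summary

Insertion logic

1. Compute the hash of the key.
2. Use the hash to locate the slot. The slot holds the logical position (ordinal number) of the previous index entry for this hash, "previous" being relative to the key now being inserted.
3. Compute the physical position where the current key's index entry belongs: the fixed header length, plus the fixed slot count times the fixed size of one slot, plus the fixed size of one index entry times the current index count. Write the entry for the key at that position.
4. Update the slot so that it holds the current index count.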
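
The four steps above can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the RocketMQ implementation: the class name `MiniKeyIndex`, its helper methods, the tiny sizes, and the use of plain arrays instead of a memory-mapped file are all hypothetical. Logical positions start at 1 here so that 0 can mean "no previous entry".

```java
// Hypothetical in-memory model of the slot table and the chained index entries.
// Arrays stand in for the memory-mapped file; sizes are tiny for illustration.
class MiniKeyIndex {
    private final int[] slots;       // per slot: logical position of the newest entry, 0 = empty
    private final int[] keyHashes;   // per entry: hash of the message key
    private final long[] phyOffsets; // per entry: offset of the message in the CommitLog
    private final int[] prevIndexes; // per entry: previous entry in the same slot, 0 = none
    private int indexCount = 1;      // next logical position to write; 0 is reserved

    MiniKeyIndex(int slotNum, int indexNum) {
        slots = new int[slotNum];
        keyHashes = new int[indexNum + 1];
        phyOffsets = new long[indexNum + 1];
        prevIndexes = new int[indexNum + 1];
    }

    /** Returns the logical position written, or -1 if the index is full. */
    int put(String key, long phyOffset) {
        if (indexCount >= keyHashes.length) {
            return -1;
        }
        int keyHash = key.hashCode() & Integer.MAX_VALUE; // step 1: non-negative hash
        int slotPos = keyHash % slots.length;             // step 2: locate the slot
        int pos = indexCount;
        keyHashes[pos] = keyHash;                         // step 3: write the entry and
        phyOffsets[pos] = phyOffset;                      //         link it to the old
        prevIndexes[pos] = slots[slotPos];                //         head of the chain
        slots[slotPos] = pos;                             // step 4: slot now points here
        indexCount++;
        return pos;
    }

    int slotValueFor(String key) {
        return slots[(key.hashCode() & Integer.MAX_VALUE) % slots.length];
    }

    int prevIndexOf(int pos) {
        return prevIndexes[pos];
    }

    public static void main(String[] args) {
        MiniKeyIndex index = new MiniKeyIndex(4, 8);
        int first = index.put("order-1", 100L);
        int second = index.put("order-1", 200L); // same key, so same slot
        // The second entry links back to the first; the slot points at the second.
        System.out.println(first + " " + second + " " + index.prevIndexOf(second));
    }
}
```

Each insert touches only one slot and one new entry, which is why the slot has to remember the newest entry rather than the oldest.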

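Read logic

1. Compute the hash, locate the slot, and read the slot's value. That value is the position of the most recent index entry for this hash (the one belonging to the key).
2. Iterate backwards entry by entry. Whenever the key hash matches and the time falls within the requested range, collect the entry.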
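
The backward walk can be sketched as follows. This is a simplified illustration, not the RocketMQ code: the class `ChainLookup`, the `select` signature, and the array-based entry model (indexed by logical position, with position 0 unused) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the backward walk over chained index entries.
class ChainLookup {
    static List<Long> select(int slotValue, int[] keyHashes, long[] phyOffsets, long[] storeTimes,
                             int[] prevIndexes, int keyHash, long begin, long end, int maxNum) {
        List<Long> result = new ArrayList<>();
        int pos = slotValue; // step 1: start at the newest entry of the slot
        while (pos > 0 && result.size() < maxNum) { // position 0 means "no entry"
            long time = storeTimes[pos];
            if (keyHashes[pos] == keyHash && time >= begin && time <= end) {
                result.add(phyOffsets[pos]); // step 2: hash and time both match
            }
            int prev = prevIndexes[pos];
            if (prev == pos || time < begin) {
                break; // self-loop guard, or every older entry is before the range
            }
            pos = prev;
        }
        return result;
    }

    public static void main(String[] args) {
        // Entries 1..3 share one slot; entry 2 belongs to a key with a different hash.
        int[] keyHashes = {0, 7, 11, 7};
        long[] phyOffsets = {0L, 100L, 200L, 300L};
        long[] storeTimes = {0L, 1000L, 2000L, 3000L};
        int[] prevIndexes = {0, 0, 1, 2};
        System.out.println(select(3, keyHashes, phyOffsets, storeTimes, prevIndexes, 7, 0L, 5000L, 10));
        // prints [300, 100]
    }
}
```

Results come back newest first, because the slot points at the most recent entry and each entry links to an older one.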

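File structure

Index data structure

The index file is made up of three parts: header, slot, and index. The format of each part is shown in the figure below. The slot part can hold 5 million slots, and the index part can hold 20 million index entries.

The header part is easy to understand: it holds some basic information about the file.
The slot part is addressed by the key's hash modulo 5 million (call this value keyHashM). Each slot stores the logical position (ordinal number) in the index part of the last key with that keyHashM. Because every index entry also records the logical position of the previous key with the same keyHashM, a lookup only has to walk the chain in reverse order.
The index part stores the physical position of each message, plus one crucial piece of information: the logical position in the index part of the previous entry with the same keyHashM.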
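
To make the layout concrete, the sketch below computes byte offsets and the total file size. The unit sizes are taken from the field lists and source comments elsewhere in this article (40-byte header, 4-byte slot, 20-byte index entry); the class `IndexFileLayout` and its method names are hypothetical.

```java
// Hypothetical helper for the header + slot + index layout arithmetic.
class IndexFileLayout {
    static final int HEADER_SIZE = 40; // 8 + 8 + 8 + 8 + 4 + 4 bytes
    static final int SLOT_SIZE = 4;    // one int: logical position of the newest entry
    static final int ENTRY_SIZE = 20;  // 4 + 8 + 4 + 4 bytes

    /** Byte offset of a slot within the file. */
    static long slotOffset(int slotPos) {
        return HEADER_SIZE + (long) slotPos * SLOT_SIZE;
    }

    /** Byte offset of the index entry at the given logical position. */
    static long entryOffset(int slotNum, int logicalPos) {
        return HEADER_SIZE + (long) slotNum * SLOT_SIZE + (long) logicalPos * ENTRY_SIZE;
    }

    /** Total size of a file with the given capacities. */
    static long fileSize(int slotNum, int indexNum) {
        return entryOffset(slotNum, indexNum);
    }

    public static void main(String[] args) {
        // 40 + 5,000,000 * 4 + 20,000,000 * 20
        System.out.println(fileSize(5_000_000, 20_000_000)); // prints 420000040
    }
}
```

With the capacities quoted above, a full index file is roughly 400 MB, which still fits in an `int` offset.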

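Header data structure

1. beginTimestamp: store time (the time it was written to disk) of the first message in this index file. Position 0-7, 8 bytes.
2. endTimestamp: store time (the time it was written to disk) of the last message in this index file. Position 8-15, 8 bytes.
3. beginPhyoffset: physical offset in the CommitLog (the message store file) of the first message in this index file; the message can be fetched directly through this offset. Position 16-23, 8 bytes.
4. endPhyoffset: physical offset in the CommitLog (the message store file) of the last message in this index file. Position 24-31, 8 bytes.
5. hashSlotCount: number of hash slots currently in use in this index file. Position 32-35, 4 bytes.
6. indexCount: number of index entries currently in this index file. Position 36-39, 4 bytes.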
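
The field positions above map directly onto absolute reads and writes on a `java.nio.ByteBuffer`. The following is a minimal sketch; the class `IndexHeaderLayout`, its constants, and the `write` method are hypothetical names chosen for the example, not RocketMQ's own.

```java
import java.nio.ByteBuffer;

// Hypothetical helper that mirrors the six header fields and their byte positions.
class IndexHeaderLayout {
    static final int HEADER_SIZE = 40;
    static final int BEGIN_TIMESTAMP_POS = 0;   // bytes 0-7
    static final int END_TIMESTAMP_POS = 8;     // bytes 8-15
    static final int BEGIN_PHY_OFFSET_POS = 16; // bytes 16-23
    static final int END_PHY_OFFSET_POS = 24;   // bytes 24-31
    static final int HASH_SLOT_COUNT_POS = 32;  // bytes 32-35
    static final int INDEX_COUNT_POS = 36;      // bytes 36-39

    static ByteBuffer write(long beginTimestamp, long endTimestamp, long beginPhyOffset,
                            long endPhyOffset, int hashSlotCount, int indexCount) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE);
        // Absolute puts: the buffer position is not advanced.
        buf.putLong(BEGIN_TIMESTAMP_POS, beginTimestamp);
        buf.putLong(END_TIMESTAMP_POS, endTimestamp);
        buf.putLong(BEGIN_PHY_OFFSET_POS, beginPhyOffset);
        buf.putLong(END_PHY_OFFSET_POS, endPhyOffset);
        buf.putInt(HASH_SLOT_COUNT_POS, hashSlotCount);
        buf.putInt(INDEX_COUNT_POS, indexCount);
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer header = write(1000L, 2000L, 0L, 4096L, 3, 5);
        System.out.println(header.getLong(END_PHY_OFFSET_POS)); // field at bytes 24-31
        System.out.println(header.getInt(INDEX_COUNT_POS));     // field at bytes 36-39
    }
}
```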

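Slot unit structure

Each slot corresponds to one value of (key hash % 5 million) and stores the logical position (ordinal number) in the index part of the most recent, that is the last, key with that value. On insert this tells us directly where the previous key for the slot lives; on query it tells us where the last key with that value lives.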

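Index unit structure

1. key hash value: hash of the message key (4 bytes).
2. phyOffset: physical address of the message in the CommitLog file, through which the message can be fetched directly; this is the core mechanism of the index (8 bytes).
3. timeDiff: difference between the message's store time and the beginTimestamp in the header (4 bytes). Storing the difference saves space, since storing the store time itself would take 8 bytes.
4. prevIndex: the key to handling hash collisions (4 bytes). It is the logical position of the previous index entry with the same hash. If the current entry is the first one for that hash, prevIndex = 0, which is also the stop condition for a lookup. The first entry in every slot therefore has a prevIndex of 0.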
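
A 20-byte entry can be encoded and decoded as sketched below. The field order and the seconds-based timeDiff follow the source shown later in this article; the class `IndexEntryCodec` and its method names are hypothetical, and the clamping of out-of-range time differences done by the real code is omitted for brevity.

```java
import java.nio.ByteBuffer;

// Hypothetical codec for one index entry: keyHash, phyOffset, timeDiff, prevIndex.
class IndexEntryCodec {
    static final int ENTRY_SIZE = 20;

    static void write(ByteBuffer buf, int pos, int keyHash, long phyOffset,
                      long storeTimestamp, long beginTimestamp, int prevIndex) {
        long timeDiff = (storeTimestamp - beginTimestamp) / 1000; // milliseconds to seconds
        buf.putInt(pos, keyHash);
        buf.putLong(pos + 4, phyOffset);
        buf.putInt(pos + 4 + 8, (int) timeDiff);
        buf.putInt(pos + 4 + 8 + 4, prevIndex);
    }

    /** Reconstructs the store time; sub-second precision was dropped on write. */
    static long readStoreTime(ByteBuffer buf, int pos, long beginTimestamp) {
        return beginTimestamp + buf.getInt(pos + 4 + 8) * 1000L;
    }

    static int readPrevIndex(ByteBuffer buf, int pos) {
        return buf.getInt(pos + 4 + 8 + 4);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(ENTRY_SIZE);
        long begin = 1_000_000L;
        write(buf, 0, 42, 4096L, begin + 2_500L, begin, 0);
        System.out.println(readStoreTime(buf, 0, begin)); // prints 1002000
    }
}
```

Because the difference is stored in whole seconds, a message stored 2.5 seconds after beginTimestamp reads back as 2 seconds after it. Time-range queries against the index are therefore accurate only to the second.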

Source code

org.apache.rocketmq.store.index.IndexFile

Insertion source

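/**
     * Stores a key in the index.
     * Steps: 1. Compute the hash of the key.
     *        2. Use the hash to locate the slot; the slot holds the logical position (ordinal number)
     *           of the previous index entry for this hash, relative to the key being inserted.
     *        3. Compute the physical position of the new entry: header length + slot count * slot size
     *           + index entry size * current index count, then write the entry for the key there.
     *        4. Update the slot to hold the current index count.
     * @param key the message key
     * @param phyOffset physical offset of the message in the CommitLog
     * @param storeTimestamp store time of the message
     * @return true if the key was written; false if the file is full or an exception occurred
     */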
    public boolean putKey(final String key, final long phyOffset, final long storeTimestamp) {
        // 1. Proceed only while the index count is below the maximum; once it reaches the maximum, IndexService will try to create a new index file
        if (this.indexHeader.getIndexCount() < this.indexNum) {
            // Compute the hash of the current message key
            int keyHash = indexKeyHashMethod(key);
            // Compute the slot position
            int slotPos = keyHash % this.hashSlotNum;
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize;

            FileLock fileLock = null;

            try {

                // fileLock = this.fileChannel.lock(absSlotPos, hashSlotSize,
                // false);
                // The slot stores a logical position: the ordinal number of an index entry
                int slotValue = this.mappedByteBuffer.getInt(absSlotPos); // each slot is a fixed 4 bytes
                if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()) {
                    slotValue = invalidIndex;
                }

                long timeDiff = storeTimestamp - this.indexHeader.getBeginTimestamp();

                timeDiff = timeDiff / 1000;

                if (this.indexHeader.getBeginTimestamp() <= 0) {
                    timeDiff = 0;
                } else if (timeDiff > Integer.MAX_VALUE) {
                    timeDiff = Integer.MAX_VALUE;
                } else if (timeDiff < 0) {
                    timeDiff = 0;
                }
                // Compute where the current index entry should be written
                int absIndexPos =
                    IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                        + this.indexHeader.getIndexCount() * indexSize;

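                /**
                 * 4 bytes: hash of the key
                 * 8 bytes: physical address of the message in the CommitLog
                 * 4 bytes: time difference, in seconds, relative to this file's beginTimestamp
                 * 4 bytes: logical position of the previous index entry with the same hash
                 */
                this.mappedByteBuffer.putInt(absIndexPos, keyHash);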
                this.mappedByteBuffer.putLong(absIndexPos + 4, phyOffset);
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8, (int) timeDiff);
                this.mappedByteBuffer.putInt(absIndexPos + 4 + 8 + 4, slotValue);
                
                // Update the slot so that it holds the ordinal number of the entry just written
                this.mappedByteBuffer.putInt(absSlotPos, this.indexHeader.getIndexCount());

                if (this.indexHeader.getIndexCount() <= 1) {
                    this.indexHeader.setBeginPhyOffset(phyOffset);
                    this.indexHeader.setBeginTimestamp(storeTimestamp);
                }

                this.indexHeader.incHashSlotCount();
                this.indexHeader.incIndexCount();
                this.indexHeader.setEndPhyOffset(phyOffset);
                this.indexHeader.setEndTimestamp(storeTimestamp);

                return true;
            } catch (Exception e) {
                log.error("putKey exception, Key: " + key + " KeyHashCode: " + key.hashCode(), e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        log.error("Failed to release the lock", e);
                    }
                }
            }
        } else {
            log.warn("Over index file capacity: index count = " + this.indexHeader.getIndexCount()
                + "; index max num = " + this.indexNum);
        }

        return false;
    }
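
The excerpt calls `indexKeyHashMethod` without showing it. Since the result is used with `%` to pick a slot, it has to be non-negative. The sketch below shows one way to guarantee that; it is an assumption about what such a method needs to do, not a copy of the RocketMQ source, and the names `KeyHash`, `nonNegative`, and `slotFor` are hypothetical.

```java
// Hypothetical stand-in for a hash method whose result is safe to use with %.
class KeyHash {
    static int nonNegative(int hash) {
        int positive = Math.abs(hash);
        // Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE, so clamp that one case to 0.
        return positive < 0 ? 0 : positive;
    }

    static int slotFor(String key, int slotNum) {
        return nonNegative(key.hashCode()) % slotNum;
    }

    public static void main(String[] args) {
        // Without the guard, a negative hash would yield a negative slot position.
        System.out.println(Integer.MIN_VALUE % 5_000_000 < 0);   // prints true
        System.out.println(nonNegative(Integer.MIN_VALUE));      // prints 0
        System.out.println(slotFor("order-1", 5_000_000) >= 0);  // prints true
    }
}
```

In Java the sign of `a % b` follows the dividend, so a negative hash would produce a negative slot position and a slot offset that falls before the slot area.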

Query source

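/**
     * Looks up messages by key.
     * 1. Compute the hash, locate the slot, and read its value: the position of the most recent
     *    index entry for this hash (the one belonging to the key).
     * 2. Iterate backwards entry by entry, collecting entries whose key hash matches and whose time is in range.
     * @param phyOffsets output list that receives the matching CommitLog offsets
     * @param key the message key to look up
     * @param maxNum maximum number of offsets to collect
     * @param begin start of the store-time range (inclusive)
     * @param end end of the store-time range (inclusive)
     * @param lock whether to lock the slot while reading (the locking code is commented out)
     */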
    public void selectPhyOffset(final List<Long> phyOffsets, final String key, final int maxNum,
        final long begin, final long end, boolean lock) {
        if (this.mappedFile.hold()) {
            int keyHash = indexKeyHashMethod(key);
            int slotPos = keyHash % this.hashSlotNum;
            int absSlotPos = IndexHeader.INDEX_HEADER_SIZE + slotPos * hashSlotSize;

            FileLock fileLock = null;
            try {
                if (lock) {
                    // fileLock = this.fileChannel.lock(absSlotPos,
                    // hashSlotSize, true);
                }

                int slotValue = this.mappedByteBuffer.getInt(absSlotPos);
                // if (fileLock != null) {
                // fileLock.release();
                // fileLock = null;
                // }

                if (slotValue <= invalidIndex || slotValue > this.indexHeader.getIndexCount()
                    || this.indexHeader.getIndexCount() <= 1) {
                } else {
                    for (int nextIndexToRead = slotValue; ; ) {
                        if (phyOffsets.size() >= maxNum) {
                            break;
                        }
                        // Compute the physical position of the entry at nextIndexToRead
                        int absIndexPos =
                            IndexHeader.INDEX_HEADER_SIZE + this.hashSlotNum * hashSlotSize
                                + nextIndexToRead * indexSize;
                        // Read the four fields of this index entry
                        int keyHashRead = this.mappedByteBuffer.getInt(absIndexPos);
                        long phyOffsetRead = this.mappedByteBuffer.getLong(absIndexPos + 4);

                        long timeDiff = (long) this.mappedByteBuffer.getInt(absIndexPos + 4 + 8);
                        int prevIndexRead = this.mappedByteBuffer.getInt(absIndexPos + 4 + 8 + 4);

                        if (timeDiff < 0) {
                            break;
                        }

                        timeDiff *= 1000L;

                        long timeRead = this.indexHeader.getBeginTimestamp() + timeDiff;
                        boolean timeMatched = (timeRead >= begin) && (timeRead <= end);

                        // If the keyHashRead stored in the entry equals the requested keyHash and the time is in range, treat it as a match
                        if (keyHash == keyHashRead && timeMatched) {
                            phyOffsets.add(phyOffsetRead);
                        }

                        if (prevIndexRead <= invalidIndex
                            || prevIndexRead > this.indexHeader.getIndexCount()
                            || prevIndexRead == nextIndexToRead || timeRead < begin) {
                            break;
                        }

                        nextIndexToRead = prevIndexRead;
                    }
                }
            } catch (Exception e) {
                log.error("selectPhyOffset exception ", e);
            } finally {
                if (fileLock != null) {
                    try {
                        fileLock.release();
                    } catch (IOException e) {
                        log.error("Failed to release the lock", e);
                    }
                }

                this.mappedFile.release();
            }
        }
    }