Saving and Reading an MDS ResultItem

binling

Category: System Analysis and Design

Article link: https://blog.csdn.net/binling/article/details/39958731

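Saving:

Each resultItem has its original primary key. When the index is built, it is also assigned an auto-incrementing labelIndex, which identifies the record inside the index system. In addition, a mapping table from each record's label to its offset is stored at the end of the master file.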

public bool WriteResultItem(InternalResultItem item)
{
    if (item == null)
        throw new ArgumentNullException("item");

    // Don't add the object if it already exists
    if (pointersTable.ContainsKey(item.LabelIndex))
        return false;

    // Remember the offset at which this item is about to be written
    pointersTable.Add(item.LabelIndex, stream.Position);

    BinaryWriter writer = new BinaryWriter(stream);
    WriteResultItem(writer, item);

    // If we've exceeded the allowed number of bytes per file, then close the current search master
    // and open a new one.
    if (!maintainBackwardCompatibility && stream.Length >= bytesPerFile)
    {
        OpenNewFile();
    }

    return true;
}
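The method above only records `stream.Position` in `pointersTable`; the article does not show how that table ends up at the end of the master file. The following is a minimal sketch of one way it could be done, not the actual MDS format. The class name `PointerTableTrailer`, the entry layout, and the 8-byte footer that points back at the start of the table are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: this trailer layout is an assumption, not the MDS on-disk format.
// Layout: [count:int32] ([labelIndex:int32][offset:int64])* [tableStart:int64]
public static class PointerTableTrailer
{
    // Appends the label-to-offset table after the last record in the stream.
    public static void Write(Stream stream, IDictionary<int, long> pointersTable)
    {
        stream.Seek(0, SeekOrigin.End);
        long tableStart = stream.Position;

        // leaveOpen: true, so disposing the writer does not close the stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(pointersTable.Count);
            foreach (KeyValuePair<int, long> entry in pointersTable)
            {
                writer.Write(entry.Key);    // labelIndex
                writer.Write(entry.Value);  // offset of the record
            }
            // Footer: where the table begins, so a reader can find it from the end.
            writer.Write(tableStart);
        }
    }

    // Reads the table back by first reading the 8-byte footer.
    public static Dictionary<int, long> Read(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            stream.Seek(-sizeof(long), SeekOrigin.End);
            long tableStart = reader.ReadInt64();

            stream.Position = tableStart;
            int count = reader.ReadInt32();
            var table = new Dictionary<int, long>(count);
            for (int i = 0; i < count; i++)
            {
                int labelIndex = reader.ReadInt32();
                long offset = reader.ReadInt64();
                table.Add(labelIndex, offset);
            }
            return table;
        }
    }
}
```

With a layout like this, opening a master file costs one seek to the footer plus one sequential read of the table, after which every lookup by labelIndex is a dictionary probe.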

Writing a single record:

private void WriteResultItem(BinaryWriter writer, InternalResultItem item)
{
    // Write index
    writer.Write(item.LabelIndex);

    // Write type
    writer.Write(item.Type);

    // Write number of attributes
    writer.Write(item.Attributes.Count);

    // Write attributes
    foreach (AttributeItem attrib in item.Attributes)
    {
        writer.Write(attrib.Key);
        writer.Write(attrib.Normalize);
        writer.Write(attrib.AllowEmptyValue);
        writer.Write(attrib.Culture);
        writer.Write(attrib.Value);
    }
}

Reading:

Each master file has its own label-to-offset mapping table. To read a record, its labelIndex is looked up in each master file in turn.

public InternalResultItem GetResultItem(int index)
{
    // Go through all our master files and find the requested index.
    long offset;
    foreach (MasterData masterData in masterDataList)
    {
        // A miss can happen; with deltas, it's not guaranteed that an asset will be there.
        if (masterData.PointersTable.TryGetValue(index, out offset))
        {
            if (masterData.Stream.Length < offset)
                throw new EndOfStreamException(string.Format("The offset ({0}) exceeds the stream length.", offset));

            masterData.Stream.Position = offset;

            BinaryReader reader = new BinaryReader(masterData.Stream);
            return ReadResultItem(reader);
        }
    }

    return null;
}

Reading a single record:

private InternalResultItem ReadResultItem(BinaryReader reader)
{
    InternalResultItem item = new InternalResultItem();

    // Read index
    item.LabelIndex = reader.ReadInt32();

    // Read type
    item.Type = reader.ReadString();

    // Read number of attributes
    int nrAttributes = reader.ReadInt32();

    if (nrAttributes > 0)
    {
        for (int i = 0; i < nrAttributes; i++)
        {
            // Fields are read in exactly the order they were written
            string key = reader.ReadString();
            bool normalize = reader.ReadBoolean();
            bool allowEmptyValue = reader.ReadBoolean();
            string culture = reader.ReadString();
            string value = reader.ReadString();
            item.Attributes.Add(new AttributeItem(key, normalize, allowEmptyValue, value, culture));
        }
    }

    return item;
}
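To see that the write and read routines agree on the record layout, the sketch below runs one item through a `MemoryStream`. The definitions of `InternalResultItem` and `AttributeItem` are not shown in the article, so the two classes here are simplified stand-ins written for this example, and `RecordLayoutDemo` is a hypothetical name. Only the field order is taken from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Stand-in types: assumptions for illustration, not the real MDS classes.
public sealed class AttributeItem
{
    public AttributeItem(string key, bool normalize, bool allowEmptyValue, string value, string culture)
    {
        Key = key; Normalize = normalize; AllowEmptyValue = allowEmptyValue;
        Value = value; Culture = culture;
    }
    public string Key { get; private set; }
    public bool Normalize { get; private set; }
    public bool AllowEmptyValue { get; private set; }
    public string Value { get; private set; }
    public string Culture { get; private set; }
}

public sealed class InternalResultItem
{
    public int LabelIndex { get; set; }
    public string Type { get; set; }
    public readonly List<AttributeItem> Attributes = new List<AttributeItem>();
}

public static class RecordLayoutDemo
{
    // Same field order as the article's WriteResultItem.
    public static void Write(BinaryWriter writer, InternalResultItem item)
    {
        writer.Write(item.LabelIndex);
        writer.Write(item.Type);
        writer.Write(item.Attributes.Count);
        foreach (AttributeItem attrib in item.Attributes)
        {
            writer.Write(attrib.Key);
            writer.Write(attrib.Normalize);
            writer.Write(attrib.AllowEmptyValue);
            writer.Write(attrib.Culture);
            writer.Write(attrib.Value);
        }
    }

    // Same field order as the article's ReadResultItem.
    public static InternalResultItem Read(BinaryReader reader)
    {
        var item = new InternalResultItem();
        item.LabelIndex = reader.ReadInt32();
        item.Type = reader.ReadString();
        int count = reader.ReadInt32();
        for (int i = 0; i < count; i++)
        {
            string key = reader.ReadString();
            bool normalize = reader.ReadBoolean();
            bool allowEmptyValue = reader.ReadBoolean();
            string culture = reader.ReadString();
            string value = reader.ReadString();
            item.Attributes.Add(new AttributeItem(key, normalize, allowEmptyValue, value, culture));
        }
        return item;
    }

    // Writes the item to memory, rewinds, and reads it back.
    public static InternalResultItem RoundTrip(InternalResultItem item)
    {
        using (var stream = new MemoryStream())
        {
            var writer = new BinaryWriter(stream, Encoding.UTF8, true);
            Write(writer, item);
            writer.Flush();

            stream.Position = 0;
            var reader = new BinaryReader(stream, Encoding.UTF8, true);
            return Read(reader);
        }
    }
}
```

One consequence of this layout is that `BinaryWriter.Write(string)` throws `ArgumentNullException` for a null argument, so every `Type`, `Key`, `Culture`, and `Value` has to be non-null when written. Empty strings are fine.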

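Summary and evaluation: this part covers how the original records (docs) are identified and stored, and how an identifier is resolved to a storage location. The labelIndex is the equivalent of a row locator in a database, or a DocId in a search engine. What an index query ultimately returns is this docId. In other words, the index stores only the docId, and fetching the doc for a given docId is the storage system's job.

The resultItems are not physically sorted; they are simply written to the master file in the order in which they arrive. MDS data is therefore non-clustered: it supports only bookmark lookups, not efficient range queries.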
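The difference between a bookmark lookup and a range query can be shown with a small sketch. It is a simplified model and not MDS code: each record is reduced to a labelIndex plus its original key, and `NonClusteredDemo` and its methods are hypothetical names. Records are appended in arrival order, so reading them back in original-key order visits offsets that jump around the file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Simplified model (an assumption for illustration): a record is [labelIndex:int32][originalKey:string].
public static class NonClusteredDemo
{
    // Appends records in arrival order, assigning an auto-incrementing labelIndex,
    // and returns the labelIndex -> offset table.
    public static Dictionary<int, long> Append(Stream stream, IList<string> originalKeys)
    {
        var pointers = new Dictionary<int, long>();
        var writer = new BinaryWriter(stream, Encoding.UTF8, true);
        for (int labelIndex = 0; labelIndex < originalKeys.Count; labelIndex++)
        {
            pointers.Add(labelIndex, stream.Position);
            writer.Write(labelIndex);
            writer.Write(originalKeys[labelIndex]);
        }
        writer.Flush();
        return pointers;
    }

    // Bookmark lookup: one dictionary probe plus one seek.
    public static string ReadKey(Stream stream, Dictionary<int, long> pointers, int labelIndex)
    {
        long offset;
        if (!pointers.TryGetValue(labelIndex, out offset))
            return null;

        stream.Position = offset;
        var reader = new BinaryReader(stream, Encoding.UTF8, true);
        reader.ReadInt32();          // skip the stored labelIndex
        return reader.ReadString();  // the original key
    }

    // Offsets that would be visited if the records were read in original-key order.
    public static List<long> OffsetsInKeyOrder(IList<string> originalKeys, Dictionary<int, long> pointers)
    {
        var labels = new List<int>(pointers.Keys);
        labels.Sort((a, b) => string.CompareOrdinal(originalKeys[a], originalKeys[b]));

        var offsets = new List<long>();
        foreach (int label in labels)
            offsets.Add(pointers[label]);
        return offsets;
    }
}
```

If the keys arrive as "pear", "apple", "mango", the offsets returned by `OffsetsInKeyOrder` are not ascending. A scan over a key range therefore turns into one seek per matching record, whereas a clustered layout would serve the same range with a single sequential read.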