Lucene Source Code Analysis (10)

二侠

Category: lucene-6.1.0 source code analysis

Article link: https://blog.csdn.net/conansonic/article/details/52091301

Lucene source code analysis: the read path of the inverted index

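The previous chapter analyzed how Lucene writes the inverted index. This chapter analyzes how it is read, focusing on the seekExact function of SegmentTermsEnum.
We start with a few constructors. The first is the constructor of BlockTreeTermsReader, which is created in the fieldsProducer function of Lucene50PostingsFormat.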

BlockTreeTermsReader::BlockTreeTermsReader

  public BlockTreeTermsReader(PostingsReaderBase postingsReader, SegmentReadState state) throws IOException {
    boolean success = false;
    IndexInput indexIn = null;

    this.postingsReader = postingsReader;
    this.segment = state.segmentInfo.name;

    String termsName = IndexFileNames.segmentFileName(segment, state.segmentSuffix, TERMS_EXTENSION);
    try {
      termsIn = state.directory.openInput(termsName, state.context);
      version = CodecUtil.checkIndexHeader(termsIn, TERMS_CODEC_NAME, VERSION_START, VERSION_CURRENT, state.segmentInfo.getId(), state.segmentSuffix);
      ...
      String indexName = IndexFileNames.segmentFileName(segment, state.segmentSuffix, TERMS_INDEX_EXTENSION);
      indexIn = state.directory.openInput(indexName, state.context);
      CodecUtil.checkIndexHeader(indexIn, TERMS_INDEX_CODEC_NAME, version, version, state.segmentInfo.getId(), state.segmentSuffix);
      CodecUtil.checksumEntireFile(indexIn);

      postingsReader.init(termsIn, state);
      CodecUtil.retrieveChecksum(termsIn);

      seekDir(termsIn, dirOffset);
      seekDir(indexIn, indexDirOffset);

      final int numFields = termsIn.readVInt();

      for (int i = 0; i < numFields; ++i) {
        final int field = termsIn.readVInt();
        final long numTerms = termsIn.readVLong();
        final int numBytes = termsIn.readVInt();
        final BytesRef rootCode = new BytesRef(new byte[numBytes]);
        termsIn.readBytes(rootCode.bytes, 0, numBytes);
        rootCode.length = numBytes;
        final FieldInfo fieldInfo = state.fieldInfos.fieldInfo(field);
        final long sumTotalTermFreq = fieldInfo.getIndexOptions() == IndexOptions.DOCS ? -1 : termsIn.readVLong();
        final long sumDocFreq = termsIn.readVLong();
        final int docCount = termsIn.readVInt();
        final int longsSize = termsIn.readVInt();
        BytesRef minTerm = readBytesRef(termsIn);
        BytesRef maxTerm = readBytesRef(termsIn);
        final long indexStartFP = indexIn.readVLong();
        FieldReader previous = fields.put(fieldInfo.name, new FieldReader(this, fieldInfo, numTerms, rootCode, sumTotalTermFreq, sumDocFreq, docCount, indexStartFP, longsSize, indexIn, minTerm, maxTerm));
      }

      indexIn.close();
      success = true;
    } finally {

    }
  }

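The core job of BlockTreeTermsReader is to open the .tim and .tip files as input streams and then create a FieldReader for each field to read the data.
In this function, segment is the segment name, for example "_0". Assuming state.segmentSuffix returns Lucene50_0, and since TERMS_EXTENSION defaults to tim, segmentFileName builds the file name _0_Lucene50_0.tim.
For cfs files, directory is a Lucene50CompoundReader.
openInput returns a SingleBufferImpl or a MultiBufferImpl; below we assume SingleBufferImpl. termsIn wraps the input stream of the _0_Lucene50_0.tim file.
checkIndexHeader verifies the header and mirrors the writeIndexHeader function on the write side.
The .tip file is opened the same way: the constructor next opens _0_Lucene50_0.tip and checks its header, and the indexIn returned by openInput wraps the input stream of that file.
seekDir eventually calls the seek function of ByteBufferIndexInput, the parent class of SingleBufferImpl, which moves the position pointer of the underlying DirectByteBufferR past the header and earlier data to the place where the per-field information is stored. The per-field information is then read from the .tim file, and finally a FieldReader is created for each field and stored in fields.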
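The body of seekDir is not shown in this article. As a rough sketch of the idea of jumping to a per-field directory, the example below assumes a toy layout in which the directory offset is stored as an 8-byte value just before a 16-byte footer. The class SeekDirSketch, the layout, and the footer size are all assumptions made for illustration, not Lucene's actual code.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: locate a "field directory" through a trailing pointer.
// Layout assumed here: [body][directory][8-byte directory offset][footer].
final class SeekDirSketch {
    // Assumed footer size in bytes (illustrative only).
    static final int FOOTER_LENGTH = 16;

    // Builds a toy file: bodyLength zero bytes, then the directory bytes,
    // then the offset of the directory, then a zeroed footer.
    static ByteBuffer buildToyFile(int bodyLength, byte[] directory) {
        ByteBuffer file = ByteBuffer.allocate(
                bodyLength + directory.length + Long.BYTES + FOOTER_LENGTH);
        file.position(bodyLength);
        file.put(directory);
        file.putLong(bodyLength); // the directory starts right after the body
        file.position(0);
        return file;
    }

    // Reads the trailing pointer and positions the buffer at the directory.
    static long seekDir(ByteBuffer file) {
        long dirOffset = file.getLong(file.limit() - FOOTER_LENGTH - Long.BYTES);
        file.position((int) dirOffset);
        return dirOffset;
    }
}
```

After seekDir returns, the next reads deliver the directory entries, which corresponds to the numFields loop in the constructor above.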

BlockTreeTermsReader::BlockTreeTermsReader->FieldReader::FieldReader

  FieldReader(BlockTreeTermsReader parent, FieldInfo fieldInfo, long numTerms, BytesRef rootCode, long sumTotalTermFreq, long sumDocFreq, int docCount, long indexStartFP, int longsSize, IndexInput indexIn, BytesRef minTerm, BytesRef maxTerm) throws IOException {
    this.fieldInfo = fieldInfo;
    this.parent = parent;
    this.numTerms = numTerms;
    this.sumTotalTermFreq = sumTotalTermFreq;
    this.sumDocFreq = sumDocFreq;
    this.docCount = docCount;
    this.indexStartFP = indexStartFP;
    this.rootCode = rootCode;
    this.longsSize = longsSize;
    this.minTerm = minTerm;
    this.maxTerm = maxTerm;

    rootBlockFP = (new ByteArrayDataInput(rootCode.bytes, rootCode.offset, rootCode.length)).readVLong() >>> BlockTreeTermsReader.OUTPUT_FLAGS_NUM_BITS;

    if (indexIn != null) {
      final IndexInput clone = indexIn.clone();
      clone.seek(indexStartFP);
      index = new FST<>(clone, ByteSequenceOutputs.getSingleton());

    } else {
      index = null;
    }
  }

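The core of the FieldReader constructor is creating an FST. An FST (Finite State Transducer) uses a finite state machine to share common prefixes and suffixes among the terms of the dictionary, which compresses storage. The previous chapter described how the FST is written to the .tip file; the rest of this chapter goes the other way and reads the data back out of the .tip file.
rootBlockFP is obtained by wrapping rootCode in a ByteArrayDataInput, reading a VLong from it, and shifting away the low flag bits. In this variable-length encoding, the highest bit of each byte tells whether another byte follows, and the lower 7 bits carry data, lowest group first. For example, 10000001 (the leading 1 means the next byte belongs to the same value) followed by 00000001 decodes to 10000001, that is, 129.
seek moves the position of the current ByteBuffer inside ByteBufferIndexInput to indexStartFP.
Finally, the FST is created and assigned to the member variable index.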
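The variable-length decoding described above can be sketched as follows. VLongSketch is a hypothetical stand-alone class written for this note; it only mirrors the idea behind readVLong (7 data bits per byte, high bit as continuation flag) and is not the Lucene implementation.

```java
// Hypothetical sketch of variable-length long decoding.
final class VLongSketch {
    // Decodes one value starting at offset: each byte contributes its low
    // 7 bits (lowest group first); a set high bit means another byte follows.
    static long readVLong(byte[] bytes, int offset) {
        long value = 0;
        int shift = 0;
        int pos = offset;
        while (true) {
            byte b = bytes[pos++];
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
    }
}
```

With the two bytes 10000001 and 00000001 from the example, the first byte contributes 1 and the second contributes 1 << 7, giving 129.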

BlockTreeTermsReader::BlockTreeTermsReader->FieldReader::FieldReader->FST::FST

  public FST(DataInput in, Outputs<T> outputs) throws IOException {
    this(in, outputs, DEFAULT_MAX_BLOCK_BITS);
  }

  public FST(DataInput in, Outputs<T> outputs, int maxBlockBits) throws IOException {
    this.outputs = outputs;

    version = CodecUtil.checkHeader(in, FILE_FORMAT_NAME, VERSION_PACKED, VERSION_NO_NODE_ARC_COUNTS);
    packed = in.readByte() == 1;
    if (in.readByte() == 1) {
      BytesStore emptyBytes = new BytesStore(10);
      int numBytes = in.readVInt();
      emptyBytes.copyBytes(in, numBytes);

      BytesReader reader;
      if (packed) {
        reader = emptyBytes.getForwardReader();
      } else {
        reader = emptyBytes.getReverseReader();
        if (numBytes > 0) {
          reader.setPosition(numBytes-1);
        }
      }
      emptyOutput = outputs.readFinalOutput(reader);
    } else {
      emptyOutput = null;
    }
    final byte t = in.readByte();
    switch(t) {
      case 0:
        inputType = INPUT_TYPE.BYTE1;
        break;
      case 1:
        inputType = INPUT_TYPE.BYTE2;
        break;
      case 2:
        inputType = INPUT_TYPE.BYTE4;
        break;
    default:
      throw new IllegalStateException("invalid input type " + t);
    }
    if (packed) {
      nodeRefToAddress = PackedInts.getReader(in);
    } else {
      nodeRefToAddress = null;
    }
    startNode = in.readVLong();
    if (version < VERSION_NO_NODE_ARC_COUNTS) {
      in.readVLong();
      in.readVLong();
      in.readVLong();
    }

    long numBytes = in.readVLong();
    if (numBytes > 1 << maxBlockBits) {
      bytes = new BytesStore(in, numBytes, 1<<maxBlockBits);
      bytesArray = null;
    } else {
      bytes = null;
      bytesArray = new byte[(int) numBytes];
      in.readBytes(bytesArray, 0, bytesArray.length);
    }

    cacheRootArcs();
  }

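In short, the FST constructor reads the index data previously written to the .tip file and initializes itself from it.

The maxBlockBits parameter, DEFAULT_MAX_BLOCK_BITS here, determines the block size used when reading the FST bytes: a block holds 1 << maxBlockBits bytes, and the default is 30 bits.
checkHeader validates the header in the .tip file. getForwardReader and getReverseReader both return an FST.BytesReader. The reader from getForwardReader reads the buffer front to back; the one from getReverseReader reads it back to front.
The input type is then read into inputType; it says how many bytes each element of a term occupies.
Finally, the core content of the .tip file, namely the per-node information of the tree written during indexing, is read and stored in bytesArray (or in a BytesStore when it does not fit into a single block).
cacheRootArcs parses the data in bytesArray and caches the arcs leaving the root node.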
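The difference between a forward and a reverse reader can be illustrated with a minimal sketch over a plain byte array. The Reader interface and the class ByteReadersSketch are hypothetical names for this note; they only imitate the reading direction and are not FST.BytesReader.

```java
// Hypothetical sketch of forward and reverse byte readers.
final class ByteReadersSketch {
    interface Reader {
        byte readByte();
        void setPosition(int pos);
    }

    // Reads from low positions towards high positions.
    static Reader forward(byte[] data) {
        return new Reader() {
            private int pos = 0;
            public byte readByte() { return data[pos++]; }
            public void setPosition(int p) { pos = p; }
        };
    }

    // Reads from high positions towards low positions.
    static Reader reverse(byte[] data) {
        return new Reader() {
            private int pos = 0;
            public byte readByte() { return data[pos--]; }
            public void setPosition(int p) { pos = p; }
        };
    }
}
```

This is also why the constructor above calls setPosition(numBytes-1) on the reverse reader before reading the empty output: a reverse reader has to start at the last byte.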

BlockTreeTermsReader::BlockTreeTermsReader->FieldReader::FieldReader->FST::FST->cacheRootArcs

  private void cacheRootArcs() throws IOException {
    final Arc<T> arc = new Arc<>();
    getFirstArc(arc);
    if (targetHasArcs(arc)) {
      final BytesReader in = getBytesReader();
      Arc<T>[] arcs = (Arc<T>[]) new Arc[0x80];
      readFirstRealTargetArc(arc.target, arc, in);
      int count = 0;
      while(true) {
        if (arc.label < arcs.length) {
          arcs[arc.label] = new Arc<T>().copyFrom(arc);
        } else {
          break;
        }
        if (arc.isLast()) {
          break;
        }
        readNextRealArc(arc, in);
        count++;
      }

      int cacheRAM = (int) ramBytesUsed(arcs);

      if (count >= FIXED_ARRAY_NUM_ARCS_SHALLOW && cacheRAM < ramBytesUsed()/5) {
        cachedRootArcs = arcs;
        cachedArcsBytesUsed = cacheRAM;
      }
    }
  }

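cacheRootArcs first creates an Arc and calls getFirstArc to initialize it. targetHasArcs checks whether there is anything to read, that is, whether the node has arcs leading to further nodes in the .tip data. readFirstRealTargetArc then reads the first arc of the root node; we do not step into it here. The most important things it reads are the arc's label and the position of the next node in the bytesArray buffer.
Back in cacheRootArcs, a while loop then reads the remaining arcs of the root node. Each arc whose label is below 128 is stored in arcs; the loop exits at the first label that is 128 or greater, or after the last arc has been stored. At the end, if the conditions are met, the result is cached in the member variables cachedRootArcs and cachedArcsBytesUsed.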
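The caching idea is a direct-indexed table keyed by label. The sketch below is a simplified, hypothetical version: it stores only a target address per label instead of a full Arc, and it assumes the labels of the root arcs arrive in ascending order, which is what lets the loop stop at the first label of 128 or more.

```java
import java.util.Arrays;

// Hypothetical sketch of a root-arc cache indexed by label.
final class RootArcCacheSketch {
    static final int CACHE_SIZE = 0x80; // same table size as in cacheRootArcs

    // sortedLabels[i] is the label of the i-th root arc, targets[i] its target.
    static long[] buildCache(int[] sortedLabels, long[] targets) {
        long[] cache = new long[CACHE_SIZE];
        Arrays.fill(cache, -1L); // -1 marks "no arc with this label"
        for (int i = 0; i < sortedLabels.length; i++) {
            int label = sortedLabels[i];
            if (label >= CACHE_SIZE) {
                break; // labels are sorted, so no later label fits either
            }
            cache[label] = targets[i];
        }
        return cache;
    }

    // Returns the cached target, or -1 if the label is not cached.
    static long lookup(long[] cache, int label) {
        return (label >= 0 && label < cache.length) ? cache[label] : -1L;
    }
}
```

A lookup for a cached label is then a single array access instead of a scan over the root node's arcs.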

BlockTreeTermsReader::BlockTreeTermsReader->FieldReader::FieldReader->FST::FST->cacheRootArcs->getFirstArc

  public Arc<T> getFirstArc(Arc<T> arc) {
    T NO_OUTPUT = outputs.getNoOutput();

    if (emptyOutput != null) {
      arc.flags = BIT_FINAL_ARC | BIT_LAST_ARC;
      arc.nextFinalOutput = emptyOutput;
      if (emptyOutput != NO_OUTPUT) {
        arc.flags |= BIT_ARC_HAS_FINAL_OUTPUT;
      }
    } else {
      arc.flags = BIT_LAST_ARC;
      arc.nextFinalOutput = NO_OUTPUT;
    }
    arc.output = NO_OUTPUT;

    arc.target = startNode;
    return arc;
  }

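getFirstArc initializes the first arc. The most important step is the final assignment of arc.target, which marks the position in bytesArray, the cached core content of the .tip file, where reading will start.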

Next we analyze the seekExact function of SegmentTermsEnum, starting with the SegmentTermsEnum constructor.

SegmentTermsEnum::SegmentTermsEnum

  public SegmentTermsEnum(FieldReader fr) throws IOException {
    this.fr = fr;

    stack = new SegmentTermsEnumFrame[0];
    staticFrame = new SegmentTermsEnumFrame(this, -1);

    if (fr.index == null) {
      fstReader = null;
    } else {
      fstReader = fr.index.getBytesReader();
    }

    for(int arcIdx=0;arcIdx<arcs.length;arcIdx++) {
      arcs[arcIdx] = new FST.Arc<>();
    }

    currentFrame = staticFrame;
    validIndexPrefix = 0;
  }

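As analyzed above, the index member of FieldReader is the FST constructed earlier; its constructor read the .tip file, cached the core content in bytesArray, and recorded the start position as startNode. If index is not null, the following getBytesReader call returns a reader over bytesArray.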

SegmentTermsEnum::seekExact
Part 1

  public boolean seekExact(BytesRef target) throws IOException {

    term.grow(1 + target.length);

    FST.Arc<BytesRef> arc;
    int targetUpto;
    BytesRef output;

    targetBeforeCurrentLength = currentFrame.ord;

    if (currentFrame != staticFrame) {

      ...

    } else {

      targetBeforeCurrentLength = -1;
      arc = fr.index.getFirstArc(arcs[0]);
      output = arc.output;
      currentFrame = staticFrame;

      targetUpto = 0;
      currentFrame = pushFrame(arc, BlockTreeTermsReader.FST_OUTPUTS.add(output, arc.nextFinalOutput), 0);
    }

    ...

  }

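Here fr is the FieldReader created earlier, and index is the FST, which wraps the information read from the .tip file. FST_OUTPUTS is a ByteSequenceOutputs, whose add function concatenates the two BytesRef values arc.output and arc.nextFinalOutput.
If currentFrame differs from staticFrame, this is not the first call to seekExact; the code omitted in the if branch reuses the result of the previous seek, and this chapter does not cover that case.
If currentFrame equals staticFrame, getFirstArc initializes the first Arc, and pushFrame then obtains the SegmentTermsEnumFrame at the corresponding position (the first one here) and sets it up. A SegmentTermsEnumFrame represents a level of nodes rather than a single node; a level consists of all the nodes from a node that has more than one leaf below it down to the next node of that kind.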

SegmentTermsEnum::seekExact->SegmentTermsEnum::pushFrame

  SegmentTermsEnumFrame pushFrame(FST.Arc<BytesRef> arc, BytesRef frameData, int length) throws IOException {
    scratchReader.reset(frameData.bytes, frameData.offset, frameData.length);
    final long code = scratchReader.readVLong();
    final long fpSeek = code >>> BlockTreeTermsReader.OUTPUT_FLAGS_NUM_BITS;
    final SegmentTermsEnumFrame f = getFrame(1+currentFrame.ord);
    f.hasTerms = (code & BlockTreeTermsReader.OUTPUT_FLAG_HAS_TERMS) != 0;
    f.hasTermsOrig = f.hasTerms;
    f.isFloor = (code & BlockTreeTermsReader.OUTPUT_FLAG_IS_FLOOR) != 0;
    if (f.isFloor) {
      f.setFloorData(scratchReader, frameData);
    }
    pushFrame(arc, fpSeek, length);

    return f;
  }

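frameData holds everything read from the .tip file about the level below this node, that is, the accumulated output combined with the arc's nextFinalOutput. getFrame fetches the SegmentTermsEnumFrame at the corresponding position of the stack array, and pushFrame(arc, fpSeek, length) is then called to record this information in it.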
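The way pushFrame splits code into a file pointer and two flags can be sketched as below. The constant values (2 flag bits, 0x1 for floor, 0x2 for has-terms) are assumptions that are not shown in this article, and FrameCodeSketch is a hypothetical class; only the shift-and-mask pattern is taken from the code above.

```java
// Hypothetical sketch of packing a block file pointer with two flag bits.
final class FrameCodeSketch {
    // Assumed values; the article only shows the constant names.
    static final int OUTPUT_FLAGS_NUM_BITS = 2;
    static final int OUTPUT_FLAG_IS_FLOOR = 0x1;
    static final int OUTPUT_FLAG_HAS_TERMS = 0x2;

    // Packs the pointer into the high bits and the flags into the low bits.
    static long encode(long fp, boolean hasTerms, boolean isFloor) {
        return (fp << OUTPUT_FLAGS_NUM_BITS)
                | (hasTerms ? OUTPUT_FLAG_HAS_TERMS : 0)
                | (isFloor ? OUTPUT_FLAG_IS_FLOOR : 0);
    }

    static long filePointer(long code) {
        return code >>> OUTPUT_FLAGS_NUM_BITS;
    }

    static boolean hasTerms(long code) {
        return (code & OUTPUT_FLAG_HAS_TERMS) != 0;
    }

    static boolean isFloor(long code) {
        return (code & OUTPUT_FLAG_IS_FLOOR) != 0;
    }
}
```

Packing the flags into the low bits keeps the whole frame description in a single variable-length number, so small file pointers still encode in few bytes.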

Now the second half of seekExact.

SegmentTermsEnum::seekExact
Part 2

  public boolean seekExact(BytesRef target) throws IOException {

    ...

    while (targetUpto < target.length) {

      final int targetLabel = target.bytes[target.offset + targetUpto] & 0xFF;

      final FST.Arc<BytesRef> nextArc = fr.index.findTargetArc(targetLabel, arc, getArc(1+targetUpto), fstReader);

      if (nextArc == null) {

        validIndexPrefix = currentFrame.prefix;

        currentFrame.scanToFloorFrame(target);

        if (!currentFrame.hasTerms) {
          termExists = false;
          term.setByteAt(targetUpto, (byte) targetLabel);
          term.setLength(1+targetUpto);
          return false;
        }

        currentFrame.loadBlock();

        final SeekStatus result = currentFrame.scanToTerm(target, true);            
        if (result == SeekStatus.FOUND) {
          return true;
        } else {
          return false;
        }
      } else {
        arc = nextArc;
        term.setByteAt(targetUpto, (byte) targetLabel);
        if (arc.output != BlockTreeTermsReader.NO_OUTPUT) {
          output = BlockTreeTermsReader.FST_OUTPUTS.add(output, arc.output);
        }

        targetUpto++;

        if (arc.isFinal()) {
          currentFrame = pushFrame(arc, BlockTreeTermsReader.FST_OUTPUTS.add(output, arc.nextFinalOutput), targetUpto);
        }
      }
    }

    validIndexPrefix = currentFrame.prefix;
    currentFrame.scanToFloorFrame(target);
    if (!currentFrame.hasTerms) {
      termExists = false;
      term.setLength(targetUpto);
      return false;
    }

    currentFrame.loadBlock();

    final SeekStatus result = currentFrame.scanToTerm(target, true);            
    if (result == SeekStatus.FOUND) {
      return true;
    } else {
      return false;
    }
  }

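getArc hands out one Arc from the arcs array of SegmentTermsEnum on each call; it holds the information of the next node. For example, when looking up "abc" and currently at "a", the next node may be "b".
findTargetArc looks up the arc that matches the given byte.
The if branch covers the case where the last level of the index has been reached or no matching arc exists. scanToFloorFrame first obtains the pointer into the .tim file from what was read out of the .tip file. If currentFrame.hasTerms is false, the term does not exist and the function returns immediately. Otherwise loadBlock reads the remaining information from the .tim file, and scanToTerm compares it with the target and returns the final result.
For example, suppose the index stores the two terms "aab" and "aac". Before loadBlock is called, the entry for "aa" has already been found in the .tip file; loadBlock follows the pointer that the .tip file provides for "aa" and reads b and c from the .tim file.
The else branch covers the case where the arc was found: the matched label is stored in term, and if the arc is final (arc.isFinal()), pushFrame is called to set up a SegmentTermsEnumFrame that records the next level of nodes.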
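The descent through the index can be sketched with a small in-memory trie standing in for the FST. IndexWalkSketch and its Node class are hypothetical; a node with blockFP >= 0 plays the role of a final arc, and remembering the deepest such pointer plays the role of pushFrame.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: walk a trie byte by byte and remember the deepest
// prefix that points to a block in the terms file.
final class IndexWalkSketch {
    static final class Node {
        final Map<Byte, Node> arcs = new TreeMap<>();
        long blockFP = -1; // >= 0 means a block exists for this prefix
    }

    final Node root = new Node();

    // Registers a prefix together with the file pointer of its block.
    void addPrefix(byte[] prefix, long blockFP) {
        Node node = root;
        for (byte b : prefix) {
            node = node.arcs.computeIfAbsent(b, k -> new Node());
        }
        node.blockFP = blockFP;
    }

    // Returns the block pointer of the deepest matching prefix, or -1.
    long deepestBlock(byte[] target) {
        Node node = root;
        long fp = root.blockFP;
        for (byte b : target) {
            Node next = node.arcs.get(b);
            if (next == null) {
                break; // corresponds to nextArc == null
            }
            node = next;
            if (node.blockFP >= 0) {
                fp = node.blockFP; // corresponds to arc.isFinal() and pushFrame
            }
        }
        return fp;
    }
}
```

For the "aab" and "aac" example, registering the prefix "aa" and looking up "aab" stops at the block for "aa"; the remaining byte is resolved by scanning that block.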

SegmentTermsEnum::seekExact->FST::findTargetArc

  public Arc<T> findTargetArc(int labelToMatch, Arc<T> follow, Arc<T> arc, BytesReader in) throws IOException {
    return findTargetArc(labelToMatch, follow, arc, in, true);
  }

  private Arc<T> findTargetArc(int labelToMatch, Arc<T> follow, Arc<T> arc, BytesReader in, boolean useRootArcCache) throws IOException {

    ...

    in.setPosition(getNodeAddress(follow.target));
    arc.node = follow.target;

    ...

    readFirstRealTargetArc(follow.target, arc, in);

    while(true) {
      if (arc.label == labelToMatch) {
        return arc;
      } else if (arc.label > labelToMatch) {
        return null;
      } else if (arc.isLast()) {
        return null;
      } else {
        readNextRealArc(arc, in);
      }
    }
  }

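The first omitted part handles two cases: the byte being looked up is the end label -1, or the arc can be served directly from the cachedRootArcs cache.
The second omitted part uses binary search when the arcs of the node are stored as a fixed-size array.
The remaining code is a linear scan. The in parameter is the cached core content of the .tip file, namely the bytesArray read earlier. follow.target holds the position of the node to search within that cache, and setPosition moves the reader there.
In the linear scan, readFirstRealTargetArc first reads the first arc of that node into arc; the most important fields are the arc's label and the node it leads to. If the label equals labelToMatch, the arc is returned; otherwise the next arc is read, until a match is found or the search gives up and returns null.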
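Both search strategies can be sketched over a plain sorted array of labels. FindArcSketch is a hypothetical class that returns an index instead of an Arc; it assumes the labels are sorted in ascending order, which is what allows the linear scan to stop early once a larger label is seen.

```java
// Hypothetical sketch of the two ways to find an arc by label.
final class FindArcSketch {
    // Linear scan with early exit; returns the index or -1.
    static int linearFind(int[] sortedLabels, int labelToMatch) {
        for (int i = 0; i < sortedLabels.length; i++) {
            if (sortedLabels[i] == labelToMatch) {
                return i;
            }
            if (sortedLabels[i] > labelToMatch) {
                return -1; // sorted input: the label cannot appear later
            }
        }
        return -1; // ran past the last arc
    }

    // Binary search, usable when every arc can be addressed by index.
    static int binaryFind(int[] sortedLabels, int labelToMatch) {
        int low = 0;
        int high = sortedLabels.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = sortedLabels[mid] - labelToMatch;
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }
}
```

Binary search needs random access to the i-th arc, which is why it only applies when the arcs are laid out as equally sized entries; otherwise arcs have to be decoded one after another.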

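Back in the if branch of seekExact: scanToFloorFrame uses the information from the .tip file to obtain the pointer of the final leaf block in the .tim file, and loadBlock then reads that block from the .tim file.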

SegmentTermsEnum::seekExact->SegmentTermsEnumFrame::loadBlock

  void loadBlock() throws IOException {

    ste.initIndexInput();

    ste.in.seek(fp);
    int code = ste.in.readVInt();
    entCount = code >>> 1;
    isLastInFloor = (code & 1) != 0;

    code = ste.in.readVInt();
    isLeafBlock = (code & 1) != 0;
    int numBytes = code >>> 1;
    if (suffixBytes.length < numBytes) {
      suffixBytes = new byte[ArrayUtil.oversize(numBytes, 1)];
    }
    ste.in.readBytes(suffixBytes, 0, numBytes);
    suffixesReader.reset(suffixBytes, 0, numBytes);

    numBytes = ste.in.readVInt();
    if (statBytes.length < numBytes) {
      statBytes = new byte[ArrayUtil.oversize(numBytes, 1)];
    }
    ste.in.readBytes(statBytes, 0, numBytes);
    statsReader.reset(statBytes, 0, numBytes);
    metaDataUpto = 0;

    state.termBlockOrd = 0;
    nextEnt = 0;
    lastSubFP = -1;

    numBytes = ste.in.readVInt();
    if (bytes.length < numBytes) {
      bytes = new byte[ArrayUtil.oversize(numBytes, 1)];
    }
    ste.in.readBytes(bytes, 0, numBytes);
    bytesReader.reset(bytes, 0, numBytes);

    fpEnd = ste.in.getFilePointer();

  }

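ste is the SegmentTermsEnum. initIndexInput sets its member variable in to the termsIn of BlockTreeTermsReader, the input stream of the .tim file. fp is the file pointer that was read from the .tip file, and seek moves termsIn to it. Data is then read from the .tim file into suffixBytes and wrapped by suffixesReader; more data is read into statBytes and wrapped by statsReader, and into bytes and wrapped by bytesReader. suffixBytes holds the data that will be compared with the target.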
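The first two header values of a block can be decoded as in the sketch below, which follows the bit layout visible in loadBlock (entry count and suffix length in the high bits, one flag in the lowest bit). BlockHeaderSketch is a hypothetical class that parses a byte array instead of an IndexInput and ignores the stats and metadata sections.

```java
import java.util.Arrays;

// Hypothetical sketch of decoding the start of a terms block.
final class BlockHeaderSketch {
    final int entCount;
    final boolean isLastInFloor;
    final boolean isLeafBlock;
    final byte[] suffixBytes;

    private BlockHeaderSketch(int entCount, boolean isLastInFloor,
                              boolean isLeafBlock, byte[] suffixBytes) {
        this.entCount = entCount;
        this.isLastInFloor = isLastInFloor;
        this.isLeafBlock = isLeafBlock;
        this.suffixBytes = suffixBytes;
    }

    static BlockHeaderSketch parse(byte[] block) {
        int[] cursor = {0};
        int code = readVInt(block, cursor);
        int entCount = code >>> 1;               // number of entries
        boolean isLastInFloor = (code & 1) != 0;
        code = readVInt(block, cursor);
        boolean isLeafBlock = (code & 1) != 0;
        int numBytes = code >>> 1;               // length of the suffix section
        byte[] suffixes = Arrays.copyOfRange(block, cursor[0], cursor[0] + numBytes);
        return new BlockHeaderSketch(entCount, isLastInFloor, isLeafBlock, suffixes);
    }

    // Variable-length int: 7 data bits per byte, high bit = continuation.
    private static int readVInt(byte[] bytes, int[] cursor) {
        int value = 0;
        int shift = 0;
        while (true) {
            byte b = bytes[cursor[0]++];
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
            shift += 7;
        }
    }
}
```

For a leaf block holding the suffixes "c" and "d", the bytes {5, 9, 1, 'c', 1, 'd'} decode to two entries, last-in-floor set, leaf block set, and a 4-byte suffix section.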

SegmentTermsEnum::seekExact->SegmentTermsEnumFrame::scanToTerm

  public SeekStatus scanToTerm(BytesRef target, boolean exactOnly) throws IOException {
    return isLeafBlock ? scanToTermLeaf(target, exactOnly) : scanToTermNonLeaf(target, exactOnly);
  }

  public SeekStatus scanToTermLeaf(BytesRef target, boolean exactOnly) throws IOException {

    ste.termExists = true;
    subCode = 0;

    if (nextEnt == entCount) {
      if (exactOnly) {
        fillTerm();
      }
      return SeekStatus.END;
    }


    nextTerm: while (true) {
      nextEnt++;

      suffix = suffixesReader.readVInt();

      final int termLen = prefix + suffix;
      startBytePos = suffixesReader.getPosition();
      suffixesReader.skipBytes(suffix);

      final int targetLimit = target.offset + (target.length < termLen ? target.length : termLen);
      int targetPos = target.offset + prefix;

      int bytePos = startBytePos;
      while(true) {
        final int cmp;
        final boolean stop;
        if (targetPos < targetLimit) {
          cmp = (suffixBytes[bytePos++]&0xFF) - (target.bytes[targetPos++]&0xFF);
          stop = false;
        } else {
          cmp = termLen - target.length;
          stop = true;
        }

        if (cmp < 0) {
          if (nextEnt == entCount) {
            break nextTerm;
          } else {
            continue nextTerm;
          }
        } else if (cmp > 0) {
          fillTerm();
          return SeekStatus.NOT_FOUND;
        } else if (stop) {
          fillTerm();
          return SeekStatus.FOUND;
        }
      }
    }

    if (exactOnly) {
      fillTerm();
    }

    return SeekStatus.END;
  }

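suffixesReader holds several candidate terms, and the outer while loop takes them one at a time. prefix is the length already matched; it needs no further comparison because that length already corresponds to a block, and a block contains multiple suffixes. suffix holds the length of the term's suffix, and startBytePos holds its offset in suffixesReader, that is, in suffixBytes. The inner while loop compares byte by byte; when all compared bytes are equal, targetPos reaches targetLimit and stop is set to true, so the final comparison is on length. In the other cases, for example when all suffixes have been scanned without a match, or when cmp is greater than 0 (the suffixes are stored in sorted order), the block contains no matching term and the function returns as well.
If the term is found, SeekStatus.FOUND is returned.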
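The scan over a leaf block can be sketched as follows. ScanLeafSketch is a hypothetical simplification: each entry is one length byte (so suffixes must be shorter than 128 bytes) followed by the suffix bytes, the entries are assumed to be sorted, and the target is assumed to be at least prefix bytes long.

```java
// Hypothetical sketch of scanning the sorted suffixes of a leaf block.
final class ScanLeafSketch {
    enum Status { FOUND, NOT_FOUND, END }

    // suffixes: entCount entries of [length][bytes]; prefix: bytes of the
    // target that were already matched while walking the index.
    static Status scan(byte[] suffixes, int entCount, byte[] target, int prefix) {
        int pos = 0;
        for (int ent = 0; ent < entCount; ent++) {
            int suffixLen = suffixes[pos++];
            int start = pos;
            pos += suffixLen; // position of the next entry
            int cmp = compare(suffixes, start, suffixLen,
                              target, prefix, target.length - prefix);
            if (cmp == 0) {
                return Status.FOUND;
            }
            if (cmp > 0) {
                return Status.NOT_FOUND; // sorted: later entries are larger too
            }
        }
        return Status.END; // every entry was smaller than the target
    }

    // Unsigned byte-wise comparison; on a common prefix the shorter one is smaller.
    static int compare(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int diff = (a[aOff + i] & 0xFF) - (b[bOff + i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return aLen - bLen;
    }
}
```

The three outcomes correspond to SeekStatus.FOUND, SeekStatus.NOT_FOUND and SeekStatus.END in scanToTermLeaf.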

Summary

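The following example summarizes how Lucene reads the inverted index.
Suppose the index stores the two strings "abc" and "abd" and the string to look up is "abc". The lookup first walks the .tip file level by level and node by node, finding the "a" node and then the "b" node (findTargetArc). Having reached "b", it reads from the .tip file the pointer to the remainder in the .tim file (scanToFloorFrame), then reads "c" and "d" from the .tim file (loadBlock), and finally compares them to produce the result (scanToTerm).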
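The whole read path can be condensed into a toy model in which a map from prefix to sorted suffixes stands in for both files: the key set plays the role of the .tip index and the values play the role of the .tim blocks. SeekExactSketch is a hypothetical illustration of the control flow only and shares no code with Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical end-to-end sketch: find the deepest indexed prefix,
// then scan the sorted suffixes of its block.
final class SeekExactSketch {
    // prefix -> sorted suffixes stored in the block for that prefix
    private final Map<String, String[]> blocks = new HashMap<>();

    void addBlock(String prefix, String... sortedSuffixes) {
        blocks.put(prefix, sortedSuffixes);
    }

    boolean seekExact(String target) {
        // Step 1: walk the "index" and keep the longest prefix with a block.
        String framePrefix = null;
        for (int len = 0; len <= target.length(); len++) {
            String candidate = target.substring(0, len);
            if (blocks.containsKey(candidate)) {
                framePrefix = candidate;
            }
        }
        if (framePrefix == null) {
            return false; // no block can contain the target
        }
        // Step 2: load the block and scan its suffixes in order.
        String rest = target.substring(framePrefix.length());
        for (String suffix : blocks.get(framePrefix)) {
            int cmp = suffix.compareTo(rest);
            if (cmp == 0) {
                return true;
            }
            if (cmp > 0) {
                return false; // sorted block: the target cannot follow
            }
        }
        return false;
    }
}
```

With a block registered for the prefix "ab" containing "c" and "d", looking up "abc" finds the prefix "ab" and then matches the suffix "c", as in the example above.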
