The best way to keep from forgetting something is to write it down.
Here is about the simplest search code you can write:
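    public void Search()
    {
        var dir = FSDirectory.Open(new DirectoryInfo("xxx"));
        var searcher = new IndexSearcher(dir, true);   // true = open read-only
        var query = new TermQuery(new Term("Title", "jinzhao"));
        var tops = searcher.Search(query, 100);
        // TopDocs is not enumerable, so iterate its ScoreDoc array instead.
        // Member names follow Lucene.Net 3.0.x; 2.9.x exposes the fields scoreDocs and doc.
        foreach (var top in tops.ScoreDocs)
        {
            var doc = searcher.Doc(top.Doc);
            Output(doc);
        }
    }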
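The line highlighted in red in the original post, the call to searcher.Doc, returns a complete document. That document actually comes from the IndexReader (Lucene.Net.Index.IndexReader) held inside the searcher, through this method: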
    public abstract Document Document(int n, FieldSelector fieldSelector);
This abstract class has several implementations, which relate to one another as follows:
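MultiReader and ParallelReader each maintain a collection of IndexReaders and encapsulate access to multiple readers. The sub-readers can be any of the implementations described below, although in practice they are usually directory-level readers rather than bare SegmentReaders. MultiReader addresses them with the document-number offset scheme that is so common in Lucene; ParallelReader instead combines readers that share document numbers but store different fields.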
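DirectoryReader, along with the other implementations apart from SegmentReader, models a directory, much like the index folder on disk. It maintains a set of SegmentReaders and addresses them with the same offset scheme.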
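The offset scheme mentioned above can be sketched roughly as follows. This is a toy illustration rather than Lucene.Net's actual code: the names OffsetDemo, BuildStarts, and ReaderIndex are made up for this example, and it assumes each sub-reader only reports how many documents it holds. A composite reader keeps the starting document number of each sub-reader, finds the sub-reader that owns a global document number, and subtracts that start to get the local number.

```csharp
using System;

// Toy stand-in for a composite reader; hypothetical names, not Lucene.Net's MultiReader.
public static class OffsetDemo
{
    // starts[i] is the global number of the first document in sub-reader i;
    // the final entry is the total document count.
    public static int[] BuildStarts(int[] maxDocs)
    {
        var starts = new int[maxDocs.Length + 1];
        for (int i = 0; i < maxDocs.Length; i++)
        {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
        return starts;
    }

    // Returns the index of the sub-reader that owns global document n:
    // the last i for which starts[i] <= n.
    public static int ReaderIndex(int[] starts, int n)
    {
        if (n < 0 || n >= starts[starts.Length - 1])
        {
            throw new ArgumentOutOfRangeException("n");
        }
        int lo = 0, hi = starts.Length - 2;
        while (lo < hi)
        {
            int mid = (lo + hi + 1) / 2;   // round up so the search always makes progress
            if (starts[mid] <= n) lo = mid; else hi = mid - 1;
        }
        return lo;
    }

    public static void Main()
    {
        int[] starts = BuildStarts(new[] { 5, 3, 4 });   // three sub-readers
        int n = 6;
        int i = ReaderIndex(starts, n);
        // Prints: doc 6 -> reader 1, local doc 1
        Console.WriteLine("doc {0} -> reader {1}, local doc {2}", n, i, n - starts[i]);
    }
}
```

Once the owning sub-reader is known, the composite simply forwards the request with the local document number, which is why the same idea works at every level of the reader hierarchy.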
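SegmentReader is the smallest unit for reading documents and maintains no child IndexReaders. Given a document ID, it reads that document's fields through public sealed class FieldsReader (the document is the core concept in Lucene, and a document is made up of a number of fields). Fields can be loaded in several ways, including loading everything immediately, loading only specified fields immediately, and lazy loading. The method looks like this: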
    public /*internal*/ Document Doc(int n, FieldSelector fieldSelector)
    {
        SeekIndex(n);
        long position = indexStream.ReadLong();
        fieldsStream.Seek(position);

        Document doc = new Document();
        int numFields = fieldsStream.ReadVInt();
        for (int i = 0; i < numFields; i++)
        {
            int fieldNumber = fieldsStream.ReadVInt();
            FieldInfo fi = fieldInfos.FieldInfo(fieldNumber);
            FieldSelectorResult acceptField = fieldSelector == null ? FieldSelectorResult.LOAD : fieldSelector.Accept(fi.name);

            byte bits = fieldsStream.ReadByte();
            System.Diagnostics.Debug.Assert(bits <= FieldsWriter.FIELD_IS_COMPRESSED + FieldsWriter.FIELD_IS_TOKENIZED + FieldsWriter.FIELD_IS_BINARY);

            bool compressed = (bits & FieldsWriter.FIELD_IS_COMPRESSED) != 0;
            bool tokenize = (bits & FieldsWriter.FIELD_IS_TOKENIZED) != 0;
            bool binary = (bits & FieldsWriter.FIELD_IS_BINARY) != 0;
            //TODO: Find an alternative approach here if this list continues to grow beyond the
            //list of 5 or 6 currently here. See Lucene 762 for discussion
            if (acceptField.Equals(FieldSelectorResult.LOAD))
            {
                AddField(doc, fi, binary, compressed, tokenize);
            }
            else if (acceptField.Equals(FieldSelectorResult.LOAD_FOR_MERGE))
            {
                AddFieldForMerge(doc, fi, binary, compressed, tokenize);
            }
            else if (acceptField.Equals(FieldSelectorResult.LOAD_AND_BREAK))
            {
                AddField(doc, fi, binary, compressed, tokenize);
                break; //Get out of this loop
            }
            else if (acceptField.Equals(FieldSelectorResult.LAZY_LOAD))
            {
                AddFieldLazy(doc, fi, binary, compressed, tokenize);
            }
            else if (acceptField.Equals(FieldSelectorResult.SIZE))
            {
                SkipField(binary, compressed, AddFieldSize(doc, fi, binary, compressed));
            }
            else if (acceptField.Equals(FieldSelectorResult.SIZE_AND_BREAK))
            {
                AddFieldSize(doc, fi, binary, compressed);
                break;
            }
            else
            {
                SkipField(binary, compressed);
            }
        }
        return doc;
    }
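To make the role of the selector concrete, here is a stripped-down sketch of the same loop. It is only an illustration under simplifying assumptions: the Decision enum, SelectorDemo class, and Read method are hypothetical stand-ins (not Lucene.Net types), stored fields are modelled as plain name/value pairs, and only three of the outcomes are shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature of the selector loop; these types are not Lucene.Net's.
public enum Decision { Load, LoadAndBreak, Skip }

public static class SelectorDemo
{
    // Walks the stored fields in order and keeps those the selector accepts.
    public static Dictionary<string, string> Read(
        IList<KeyValuePair<string, string>> stored, Func<string, Decision> selector)
    {
        var doc = new Dictionary<string, string>();
        foreach (var field in stored)
        {
            // A null selector means "load everything", as in the method above.
            Decision d = selector == null ? Decision.Load : selector(field.Key);
            if (d == Decision.Skip) continue;
            doc[field.Key] = field.Value;
            if (d == Decision.LoadAndBreak) break;   // stop scanning the remaining fields
        }
        return doc;
    }

    public static void Main()
    {
        var stored = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("Id", "42"),
            new KeyValuePair<string, string>("Title", "jinzhao"),
            new KeyValuePair<string, string>("Body", "a long stored body"),
        };
        var onlyTitle = Read(stored, name => name == "Title" ? Decision.LoadAndBreak : Decision.Skip);
        Console.WriteLine(string.Join(",", onlyTitle.Keys));   // prints "Title"
        Console.WriteLine(Read(stored, null).Count);            // prints "3"
    }
}
```

The point of the break variants is cost: once the one field you need has been read, the remaining stored fields of that document never have to be decoded.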
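In the Doc method above, the object highlighted in red in the original post is an IndexInput implementation, which is what performs the actual reads. Such implementations are usually written as public nested classes inside the storage (Directory) class. The one used in this example looks like this: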
    public /*protected internal*/ class SimpleFSIndexInput : BufferedIndexInput, System.ICloneable
    {
        protected internal class Descriptor : System.IO.BinaryReader
        {
            // remember if the file is open, so that we don't try to close it
            // more than once
            protected internal volatile bool isOpen;
            internal long position;
            internal long length;

            public Descriptor(/*FSIndexInput enclosingInstance,*/ System.IO.FileInfo file, System.IO.FileAccess mode)
                : base(new System.IO.FileStream(file.FullName, System.IO.FileMode.Open, mode, System.IO.FileShare.ReadWrite))
            {
                isOpen = true;
                length = file.Length;
            }

            public override void Close()
            {
                if (isOpen)
                {
                    isOpen = false;
                    base.Close();
                }
            }

            ~Descriptor()
            {
                try
                {
                    Close();
                }
                finally
                {
                }
            }
        }
        // (the excerpt ends here; the rest of SimpleFSIndexInput is not shown)
As you can see, the fields are ultimately read from the file by a System.IO.BinaryReader.
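The whole lookup path (seek into an index stream, read a position, seek into the fields stream, read the data) can be imitated with nothing but BinaryReader. The sketch below uses a deliberately simplified layout that is an assumption of this example and not Lucene's real file format: the index stream holds one 8-byte offset per document, and the fields stream holds one length-prefixed string per document. StoredFieldsDemo and its methods are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

// Toy layout, not Lucene's real file format: an "index" stream with one
// 8-byte offset per document and a "fields" stream with the stored values.
public static class StoredFieldsDemo
{
    public static void Write(string[] values, Stream index, Stream fields)
    {
        var iw = new BinaryWriter(index);
        var fw = new BinaryWriter(fields, Encoding.UTF8);
        foreach (string v in values)
        {
            fw.Flush();
            iw.Write(fields.Position);   // where this document starts (a long)
            fw.Write(v);                 // length-prefixed string
        }
        iw.Flush();
        fw.Flush();
    }

    public static string Read(int n, Stream index, Stream fields)
    {
        var ir = new BinaryReader(index);
        var fr = new BinaryReader(fields, Encoding.UTF8);
        index.Seek(n * 8L, SeekOrigin.Begin);      // fixed-width entries allow a direct seek
        long position = ir.ReadInt64();
        fields.Seek(position, SeekOrigin.Begin);   // jump to the document's data
        return fr.ReadString();
    }

    public static void Main()
    {
        using (var index = new MemoryStream())
        using (var fields = new MemoryStream())
        {
            Write(new[] { "first", "second", "third" }, index, fields);
            Console.WriteLine(Read(1, index, fields));   // prints "second"
        }
    }
}
```

Because every index entry has the same width, finding document n costs one seek and one 8-byte read no matter how large the fields stream grows.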
The end.
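This article is reposted from today4king's blog on cnblogs. Original link: http://www.cnblogs.com/jinzhao/archive/2012/06/05/2537068.html. If you wish to repost it, please contact the original author.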