http://www.cnblogs.com/huangfox/archive/2010/10/18/1854403.html
1. Sort-related methods in IndexSearcher, and the Sort and SortField classes (API level)
To sort directly with IndexSearcher, the method generally used is:
search(Weight weight, Filter filter, int n, Sort sort)
Expert: Low-level search implementation with arbitrary sorting.
This method only needs a Sort instance passed in.
Constructor Summary

| Constructor | Description |
|---|---|
| Sort() | Sorts by computed relevance. |
| Sort(SortField... fields) | Sorts in succession by the criteria in each SortField. |
| Sort(SortField field) | Sorts by the criteria in the given SortField. |
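Within a Sort instance, SortField decides which field to sort on, which data type to sort by, and whether the order is ascending or descending.
Its two most basic constructors are listed below. (These signatures come from the Lucene 3.x javadoc, where type is an int constant such as SortField.INT; in Lucene 4.x the parameter is the SortField.Type enum instead.)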
| Constructor | Description |
|---|---|
| SortField(String field, int type) | Creates a sort by terms in the given field with the type of term values explicitly given. |
| SortField(String field, int type, boolean reverse) | Creates a sort, possibly in reverse, by terms in the given field with the type of term values explicitly given. |
With these classes it is easy to sort search results.
A simple example:
package ceshi0305;

import java.io.File;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class SortDemo {

    public static void main(String[] args) {
        new SortDemo().search();
    }

    public void search() {
        // Lucene 4.x API: FSDirectory.open takes a java.io.File.
        // try-with-resources closes the reader and the directory when done.
        try (Directory dir = FSDirectory.open(new File("d:\\20140303index"));
             IndexReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Sort ascending by the string value of field "f1".
            // Note: the enum is SortField.Type, not SortField.TYPE.
            SortField sortF = new SortField("f1", SortField.Type.STRING);
            Sort sort = new Sort(sortF);
            // Match every document, no filter, return the first 10 in sorted order.
            TopDocs res = searcher.search(new MatchAllDocsQuery(), null, 10, sort);
            for (ScoreDoc doc : res.scoreDocs) {
                System.out.println(searcher.doc(doc.doc));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
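Testing shows that passing true as the third argument of the SortField constructor sorts in reverse order; omitting it sorts in forward order. Also, if the field "f1" is multi-valued, sorting appears to use only the first value of the field (this is my current understanding).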
2. Sorting by document score
IndexSearcher's default search already sorts by document score.
To request it explicitly, set the type in SortField to SCORE.
SCORE: Sort by document score (relevancy).
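3. Sorting by internal document id
Every document is assigned an id number when it enters the index, and sometimes you may need to sort by that id.
In that case, set the type in SortField to DOC.
DOC: Sort by document number (index order).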
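To make the difference between the SCORE and DOC orderings concrete, here is a small plain-Java analogy. It is not Lucene code: the `Hit` class, the sample scores, and the method names are hypothetical, and the tie-break by lower doc id reflects my understanding of how Lucene orders equal scores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HitOrderSketch {
    /** Hypothetical stand-in for a search hit: internal doc id plus score. */
    static final class Hit {
        final int docId;
        final float score;
        Hit(int docId, float score) { this.docId = docId; this.score = score; }
    }

    /** Models SCORE ordering: highest score first, ties broken by lower doc id. */
    static List<Integer> byScore(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> {
            int cmp = Float.compare(b.score, a.score); // descending score
            return cmp != 0 ? cmp : Integer.compare(a.docId, b.docId);
        });
        return ids(sorted);
    }

    /** Models DOC ordering: ascending internal doc id (index order). */
    static List<Integer> byDocId(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> Integer.compare(a.docId, b.docId));
        return ids(sorted);
    }

    /** Extracts the doc ids in list order. */
    private static List<Integer> ids(List<Hit> hits) {
        List<Integer> out = new ArrayList<>();
        for (Hit h : hits) out.add(h.docId);
        return out;
    }

    /** Made-up sample hits, deliberately not in any sorted order. */
    static List<Hit> sample() {
        return Arrays.asList(new Hit(2, 0.5f), new Hit(1, 0.9f),
                             new Hit(3, 0.7f), new Hit(0, 0.5f));
    }

    public static void main(String[] args) {
        System.out.println(byScore(sample())); // [1, 3, 0, 2]
        System.out.println(byDocId(sample())); // [0, 1, 2, 3]
    }
}
```

Docs 0 and 2 share the same score, so under the score ordering the one indexed earlier comes first.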
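Note:
When sorting data such as dates and prices, choose an appropriate sort type. This is not only a matter of meeting business requirements: sorting with numeric types such as INT and FLOAT is also more efficient than sorting with STRING.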
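Apart from efficiency, the chosen type also changes the resulting order. The following plain-Java sketch (not Lucene code; the class and method names are made up for illustration) sorts the same values once as text and once as integers:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SortTypeSketch {
    /** Orders the raw text lexicographically, as a string-typed sort would. */
    static List<String> asStrings(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        return sorted;
    }

    /** Parses each value as an int and orders numerically. */
    static List<String> asInts(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparingInt(Integer::parseInt));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> f = Arrays.asList("10", "5", "-2", "0");
        System.out.println(asStrings(f)); // [-2, 0, 10, 5]
        System.out.println(asInts(f));    // [-2, 0, 5, 10]
    }
}
```

As text, "10" sorts before "5" because the comparison goes character by character, which is rarely what you want for prices or counts.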
5. Sorting on multiple fields
...Sample code:
// Lucene 3.x style constants; in Lucene 4.x use SortField.Type.INT instead.
SortField sortF = new SortField("f", SortField.INT);
SortField sortF2 = new SortField("f1", SortField.INT);
// The array order sets the priority: "f" first, then "f1".
Sort sort = new Sort(new SortField[]{sortF, sortF2});
TopFieldDocs docs = searcher.search(query, null, 10, sort);
Result:
Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>
Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>
Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>
Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>
Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>
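Note:
Results are sorted by field f first; where the f values are equal, they are then sorted by field f1.
This priority follows the order of the SortField instances in the SortField array.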
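The same "primary key, then tie-breaker" rule can be sketched in plain Java with a chained Comparator. This is only an analogy, not Lucene code; the `Row` class and method names are made up, and the values are the f and f1 values from the result listing above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MultiFieldSortSketch {
    /** Hypothetical stand-in for a document carrying the two sort fields. */
    static final class Row {
        final int f;
        final int f1;
        Row(int f, int f1) { this.f = f; this.f1 = f1; }
        @Override public String toString() { return "f:" + f + " f1:" + f1; }
    }

    /** Sorts by f ascending; rows with equal f are ordered by f1 ascending. */
    static List<Row> sortByFThenF1(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingInt((Row r) -> r.f)
                              .thenComparingInt(r -> r.f1));
        return sorted;
    }

    /** The five (f, f1) pairs from the listing, in unsorted order. */
    static List<Row> sample() {
        return Arrays.asList(
            new Row(10, 20100215), new Row(10, 20090512), new Row(5, 20101019),
            new Row(-2, 20000128), new Row(0, 20050719));
    }

    public static void main(String[] args) {
        for (Row r : sortByFThenF1(sample())) {
            System.out.println(r);
        }
    }
}
```

Swapping the two comparator keys would make f1 the primary key, just as swapping the two SortField instances in the array would.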
6. Changing a document's score by changing its boost value
With the default sort (by relevance), the original ordering is:
Document<stored,indexed<f:10> stored,indexed<f1:20100215> stored,indexed<a:fox>>
Document<stored,indexed<f:10> stored,indexed<f1:20090512> stored,indexed<a:fox>>
Document<stored,indexed<f:5> stored,indexed<f1:20101019> stored,indexed<a:fox>>
Document<stored,indexed<f:-2> stored,indexed<f1:20000128> stored,indexed<a:fox>>
Document<stored,indexed<f:0> stored,indexed<f1:20050719> stored,indexed<a:fox>>
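In Lucene 4.6, Document no longer has a setBoost method. If you really need to give a document an overall score, you can add a field named Boost whose value is the desired score, and then sort on the Boost field.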
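// f1 and f2 (reusable Field instances) and writer (an IndexWriter) are created elsewhere.
for (int i = 0; i < 500000; i++) {
    Document doc = new Document();
    f1.setStringValue("f1 hello doc" + i);
    doc.add(f1);
    f2.setStringValue("f2 world doc" + i);
    doc.add(f2);
    // Store the per-document score as a numeric "Boost" field so it can be sorted on.
    doc.add(new IntField("Boost", i, Store.YES));
    writer.addDocument(doc);
}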
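Once every document carries a Boost value, the search side would presumably sort on that field in reverse so the highest values come first, for example with something like `new SortField("Boost", SortField.Type.INT, true)` in Lucene 4.x. The plain-Java sketch below only models the resulting order; it is not Lucene code, and the `Doc` class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class BoostFieldSortSketch {
    /** Hypothetical stand-in for an indexed document with a Boost value. */
    static final class Doc {
        final String text;
        final int boost;
        Doc(String text, int boost) { this.text = text; this.boost = boost; }
        @Override public String toString() { return text + " (Boost=" + boost + ")"; }
    }

    /** Builds documents the same way the indexing loop does, with Boost = i. */
    static List<Doc> build(int count) {
        List<Doc> docs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            docs.add(new Doc("f1 hello doc" + i, i));
        }
        return docs;
    }

    /** Returns the n documents with the highest Boost, highest first. */
    static List<Doc> topByBoost(List<Doc> docs, int n) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort((a, b) -> Integer.compare(b.boost, a.boost)); // descending
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        System.out.println(topByBoost(build(5), 2));
        // [f1 hello doc4 (Boost=4), f1 hello doc3 (Boost=3)]
    }
}
```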