Lucene (2): Queries

Latest recommended article published on 2024-05-30 11:59:32

hy飞无

Category: java

Article link: https://blog.csdn.net/hyhanyu/article/details/79400845

Field types

* <li>{@link StringField}: {@link String} indexed verbatim as a single token

The value is indexed but not tokenized; storing it is optional.

doc.add(new StringField("StringField", "StringField的类型Field", Store.YES));

* <li>{@link IntPoint}: {@code int} indexed for exact/range queries

doc.add(new IntPoint("IntPoint1", 3 ));

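doc.add(new IntPoint("IntPoint1", 1, 2));

Passing several ints creates one multi-dimensional point (2D here), not several separate values. A field must keep the same number of dimensions throughout the index, so this form cannot be mixed with the one-dimensional line above on the same field name. To index several values, add one IntPoint per value.

Note that an IntPoint is not stored. To read the value back from a hit, also add doc.add(new StoredField("IntPoint2", IntPoint2));

To sort on the field, also add doc.add(new NumericDocValuesField("IntPoint2", IntPoint2));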
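The three notes above can be combined into one helper. This is a minimal sketch, assuming a Lucene version that provides IntPoint (6.0 or later); the class name NumericFieldSketch and the method build are hypothetical and only serve the illustration.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StoredField;

// Hypothetical helper class, not part of Lucene.
public class NumericFieldSketch {

    // Builds a document whose "IntPoint2" value is searchable, retrievable, and sortable.
    static Document build(int value) {
        Document doc = new Document();
        doc.add(new IntPoint("IntPoint2", value));              // indexed for exact/range queries
        doc.add(new StoredField("IntPoint2", value));           // stored, so hits can return the value
        doc.add(new NumericDocValuesField("IntPoint2", value)); // doc values, required for sorting
        return doc;
    }

    public static void main(String[] args) {
        Document doc = build(5);
        // Three separate fields share the same name.
        System.out.println(doc.getFields("IntPoint2").length); // prints 3
    }
}
```

Using the same field name for all three keeps the query, the stored value, and the sort key aligned, at the cost of writing the value three times.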

* <li>{@link LongPoint}: {@code long} indexed for exact/range queries.

Same as IntPoint.

* <li>{@link FloatPoint}: {@code float} indexed for exact/range queries.

Same as IntPoint.

* <li>{@link DoublePoint}: {@code double} indexed for exact/range queries

Same as IntPoint.

* <li>{@link StoredField}: Stored-only value for retrieving in summary results

* <li>{@link SortedDocValuesField}: {@code byte[]} indexed column-wise for sorting/faceting

This value can be at most 32766 bytes long.

Used for sorting.

* <li>{@link SortedSetDocValuesField}: {@code SortedSet<byte[]>} indexed column-wise for sorting/faceting

Each value can be at most 32766 bytes long

Used for sorting.

* <li>{@link NumericDocValuesField}: {@code long} indexed column-wise for sorting/faceting

Used for sorting.

* <li>{@link SortedNumericDocValuesField}: {@code SortedSet<long>} indexed column-wise for sorting/faceting

Used for sorting.

Setting a field boost

TextField filename = new TextField("filename", file.getName(), Store.YES);

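filename.setBoost(i); // Field.setBoost exists in Lucene 6.x and earlier; index-time boosts were removed in Lucene 7.0

Found online:
A Document or a Field can be given a boost so that it ranks higher in search results. By default, results are ordered by the document score, and a higher score ranks earlier. The default boost is 1.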

Score = Score * Boost

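With this formula, different boosts can be assigned to influence the ranking.

The example below sets a different boost depending on the VIP level (the snippet appears to use the Lucene.NET-style API, hence SetBoost).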

Document document = new Document();
switch (vip)
{
    case VIP.Gold: document.SetBoost(2F); break;
    case VIP.Argentine: document.SetBoost(1.5F); break;
}
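To make the effect of Score = Score * Boost concrete, here is a small arithmetic sketch in plain Java. It does not call Lucene at all; the Hit class, the raw scores, and the boost values are made-up assumptions chosen only to show how a boost can reorder results.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration of the boost formula; no Lucene API is involved.
public class BoostSketch {

    // Hypothetical hit: a name, a raw relevance score, and a boost factor.
    static final class Hit {
        final String name;
        final float score;
        final float boost;

        Hit(String name, float score, float boost) {
            this.name = name;
            this.score = score;
            this.boost = boost;
        }

        // Score = Score * Boost
        float boosted() {
            return score * boost;
        }
    }

    // Returns the hit names ordered by boosted score, highest first.
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.boosted()).reversed());
        List<String> names = new ArrayList<>();
        for (Hit h : sorted) {
            names.add(h.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("regular", 1.2f, 1.0f));   // boosted score 1.2
        hits.add(new Hit("gold", 0.8f, 2.0f));      // boosted score 1.6
        hits.add(new Hit("argentine", 0.9f, 1.5f)); // boosted score 1.35
        System.out.println(rank(hits)); // prints [gold, argentine, regular]
    }
}
```

The "gold" hit has the lowest raw score but ends up first once its boost of 2 is applied.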

Query methods

1. Querying with QueryParser, typically for user input

QueryParser parser = new QueryParser("filename", analyzer);

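Query query2 = parser.parse("contents:学生班级");
// Note: nothing is found if the index contains no matching term. QueryParser also cannot match a field indexed as IntPoint, because point values are not indexed as terms.
// A TermQuery such as new TermQuery(new Term("IntPoint2", "2")) matches only when the value was also indexed as a term (for example with a StringField); for a point field, use IntPoint.newExactQuery("IntPoint2", 2).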

2. Numeric queries

Query query = IntPoint.newRangeQuery("IntPoint2", 4, 8);

3. Combined (boolean) queries

QueryParser parser = new QueryParser("filename", analyzer);
Query query2 = parser.parse("contents:学生班级");
BooleanClause bc1 = new BooleanClause(query2, Occur.MUST);

BooleanClause bc2 = new BooleanClause(query, Occur.MUST);
BooleanQuery.Builder builder=new BooleanQuery.Builder();

builder.add(bc1).add(bc2);

Occur.MUST // roughly SQL AND

Occur.MUST_NOT // roughly SQL AND NOT (NOT IN)

Occur.SHOULD // roughly SQL OR
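The SQL analogy can be modelled with plain sets of document IDs. The sketch below is not Lucene code; the class OccurSketch and its methods are hypothetical. It mirrors two details of the default behaviour that the analogy hides: once a MUST clause is present, SHOULD clauses no longer restrict the result (they only affect scoring), and a query with only MUST_NOT clauses matches nothing.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Set-based model of BooleanQuery matching; it ignores scoring entirely.
public class OccurSketch {

    // Each inner set holds the document IDs matched by one clause.
    static Set<Integer> match(List<Set<Integer>> must,
                              List<Set<Integer>> should,
                              List<Set<Integer>> mustNot) {
        Set<Integer> result = new TreeSet<>();
        if (!must.isEmpty()) {
            result.addAll(must.get(0));
            for (Set<Integer> m : must) {
                result.retainAll(m);   // MUST: intersection (AND)
            }
        } else {
            for (Set<Integer> s : should) {
                result.addAll(s);      // SHOULD alone: union (OR)
            }
        }
        for (Set<Integer> n : mustNot) {
            result.removeAll(n);       // MUST_NOT: exclusion (AND NOT)
        }
        return result;
    }

    static Set<Integer> ids(Integer... values) {
        return new TreeSet<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        Set<Integer> a = ids(1, 2, 3);
        Set<Integer> b = ids(2, 3, 4);
        List<Set<Integer>> none = Collections.emptyList();
        System.out.println(match(Arrays.asList(a, b), none, none));          // prints [2, 3]
        System.out.println(match(none, Arrays.asList(a, b), none));          // prints [1, 2, 3, 4]
        System.out.println(match(Arrays.asList(a), none, Arrays.asList(b))); // prints [1]
    }
}
```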

4. Sorted queries

SortField sortField = new SortField("IntPoint2",SortField.Type.INT,true);

Sort sort = new Sort(sortField); // Note: the field being sorted on must also be indexed with a matching doc values field, otherwise an error is thrown
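The range query from section 2 and the sort above can be tried together on a small throwaway index. This is a sketch under assumptions: it targets a Lucene version where IntPoint and FSDirectory.open(Path) are available, and it has not been checked against every release. The class name SortedRangeSketch, the sample values, and the temporary directory prefix are hypothetical. The sorted values are read from FieldDoc.fields rather than from stored fields.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.FieldDoc;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical demo class, not part of Lucene.
public class SortedRangeSketch {

    // Indexes the given values, then returns those in [4, 8], sorted in descending order.
    static List<Integer> search(int... values) {
        List<Integer> found = new ArrayList<>();
        // The temporary index directory is not cleaned up in this sketch.
        try (Directory dir = FSDirectory.open(Files.createTempDirectory("lucene-sketch"))) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                for (int v : values) {
                    Document doc = new Document();
                    doc.add(new IntPoint("IntPoint2", v));              // needed for the range query
                    doc.add(new NumericDocValuesField("IntPoint2", v)); // needed for the sort
                    writer.addDocument(doc);
                }
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Query query = IntPoint.newRangeQuery("IntPoint2", 4, 8); // both bounds inclusive
                Sort sort = new Sort(new SortField("IntPoint2", SortField.Type.INT, true)); // true = descending
                TopDocs docs = searcher.search(query, 10, sort);
                for (ScoreDoc sd : docs.scoreDocs) {
                    // For a sorted search each hit is a FieldDoc carrying its sort values.
                    found.add((Integer) ((FieldDoc) sd).fields[0]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(search(1, 5, 9, 8, 4)); // prints [8, 5, 4]
    }
}
```

Dropping the NumericDocValuesField line is a quick way to reproduce the error mentioned in the note above.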

5. Updates

A Lucene update deletes the existing document and then adds it again.
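IndexWriter.updateDocument performs this delete-then-add in one call: it deletes every document containing the given term and then adds the new document. The sketch below assumes each document carries a unique key in a StringField named "id"; that field, the class name UpdateSketch, and the sample values are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical demo class, not part of Lucene.
public class UpdateSketch {

    // Adds a document, replaces it through updateDocument, and returns the live document count.
    static int updateAndCount() {
        // The temporary index directory is not cleaned up in this sketch.
        try (Directory dir = FSDirectory.open(Files.createTempDirectory("lucene-update"))) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                Document v1 = new Document();
                v1.add(new StringField("id", "42", Store.YES)); // assumed unique key, indexed as one term
                v1.add(new TextField("filename", "old.txt", Store.YES));
                writer.addDocument(v1);

                Document v2 = new Document();
                v2.add(new StringField("id", "42", Store.YES));
                v2.add(new TextField("filename", "new.txt", Store.YES));
                // Deletes every document whose "id" term equals "42", then adds v2.
                writer.updateDocument(new Term("id", "42"), v2);
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return reader.numDocs(); // counts live (non-deleted) documents only
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(updateAndCount()); // prints 1
    }
}
```

The key must be indexed as a single untokenized term (hence StringField) so that the Term passed to updateDocument matches it exactly.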

6. Highlighted queries

Formatter formatter = new SimpleHTMLFormatter("<font color='red'>", "</font>");
Scorer scorer = new QueryScorer(query2);
// Highlighter built from the formatter and the scorer
Highlighter highlighter = new Highlighter(formatter, scorer);

// "is" is presumably the IndexSearcher
TopDocs docs = is.search(builder.build(), 5);
for (ScoreDoc s : docs.scoreDocs) {
    Document doc = is.doc(s.doc);
    String string = doc.get("filename");
    System.out.println(string + "1"); // if several fields share this name, doc.get returns the first one added

    String contents = doc.get("contents");
    if (contents != null) {
        // Token stream over the full stored content; the highlighter cuts it into fragments
        TokenStream tokenStream = analyzer.tokenStream("contents", new StringReader(contents));
        // Print the best-scoring fragment, with the matched keywords wrapped in the formatter's tags
        System.out.println(highlighter.getBestFragment(tokenStream, contents));
    }
}