Lucene Tutorial (3) - Understanding the Core Classes of the Search Process

橘猫吃不胖胖

Category: Lucene. Tags: Lucene, search core classes

Article link: https://blog.csdn.net/yuguiyang1990/article/details/13614321

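In the previous post we studied the core classes of the indexing process and refactored IndexFiles.

In this post we look at the core classes of the search process and refactor the SearchFiles class. A basic search looks like this: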

    // Now search the index:
    DirectoryReader ireader = DirectoryReader.open(directory);
    IndexSearcher isearcher = new IndexSearcher(ireader);
    // Parse a simple query that searches for "text":
    QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "fieldname", analyzer);
    Query query = parser.parse("text");
    ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
    assertEquals(1, hits.length);
    // Iterate through the results:
    for (int i = 0; i < hits.length; i++) {
      Document hitDoc = isearcher.doc(hits[i].doc);
      assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
    }
    ireader.close();
    directory.close();

1. IndexSearcher

IndexSearcher searches the index created by IndexWriter. You can think of it as a class that opens the index read-only (through an IndexReader) and provides several search() methods.

2. Term

A Term is the basic unit of search. Like a Field, it consists of a pair of strings: the name of the field and the text to look for in that field.
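To make the "pair of strings" idea concrete, here is a minimal stand-alone sketch of a toy inverted index keyed by such pairs. It is only an illustration and does not use Lucene: `ToyTerm`, `ToyIndex`, and their methods are hypothetical names invented for this example, and the whitespace tokenization is an assumption rather than what an Analyzer actually does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Toy model only: NOT Lucene's Term class and NOT its index format.
final class ToyTermDemo {

    // A term is a (field name, term text) pair; both parts decide equality.
    static final class ToyTerm {
        final String field;
        final String text;

        ToyTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ToyTerm)) {
                return false;
            }
            ToyTerm other = (ToyTerm) o;
            return field.equals(other.field) && text.equals(other.text);
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, text);
        }
    }

    // Maps each term to the ids of the documents that contain it.
    static final class ToyIndex {
        private final Map<ToyTerm, List<Integer>> postings = new HashMap<ToyTerm, List<Integer>>();

        // Assumption: tokens are lower-cased and split on whitespace.
        void add(int docId, String field, String value) {
            for (String token : value.toLowerCase().trim().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                ToyTerm term = new ToyTerm(field, token);
                List<Integer> docs = postings.get(term);
                if (docs == null) {
                    docs = new ArrayList<Integer>();
                    postings.put(term, docs);
                }
                if (!docs.contains(docId)) {
                    docs.add(docId);
                }
            }
        }

        // Returns the matching document ids, or an empty list if the term is unknown.
        List<Integer> lookup(ToyTerm term) {
            List<Integer> docs = postings.get(term);
            return docs == null ? Collections.<Integer>emptyList() : docs;
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(0, "contents", "aa bb");
        index.add(1, "contents", "bb cc");
        index.add(2, "path", "aa");

        // Same text, different field: these are two different terms.
        System.out.println(index.lookup(new ToyTerm("contents", "aa"))); // prints [0]
        System.out.println(index.lookup(new ToyTerm("path", "aa")));     // prints [2]
    }
}
```

The point of the sketch is that the text "aa" alone does not identify a term; the field name is part of the key, which is why a search always targets a specific field.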

3. Query

Lucene has many concrete Query subclasses, such as TermQuery, BooleanQuery, and PhraseQuery; later posts cover them in detail.
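As a rough picture of why one common Query type with many subclasses is convenient, the following stand-alone sketch models queries as small objects that can be combined. It does not use Lucene: `ToyQuery`, `ToyTermQuery`, and `ToyAndQuery` are hypothetical names, a document is reduced to a set of tokens, and Lucene's real Query is an abstract class with a different contract.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: these types imitate the idea of a query hierarchy, not Lucene's API.
final class ToyQueryDemo {

    // Common supertype, standing in for the role that Query plays.
    interface ToyQuery {
        boolean matches(Set<String> docTokens);
    }

    // Matches when the document contains one exact token.
    static final class ToyTermQuery implements ToyQuery {
        private final String text;

        ToyTermQuery(String text) {
            this.text = text;
        }

        @Override
        public boolean matches(Set<String> docTokens) {
            return docTokens.contains(text);
        }
    }

    // Matches only when every clause matches (a logical AND).
    static final class ToyAndQuery implements ToyQuery {
        private final ToyQuery[] clauses;

        ToyAndQuery(ToyQuery... clauses) {
            this.clauses = clauses;
        }

        @Override
        public boolean matches(Set<String> docTokens) {
            for (ToyQuery clause : clauses) {
                if (!clause.matches(docTokens)) {
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<String>(Arrays.asList("aa", "bb"));

        ToyQuery both = new ToyAndQuery(new ToyTermQuery("aa"), new ToyTermQuery("bb"));
        ToyQuery missing = new ToyAndQuery(new ToyTermQuery("aa"), new ToyTermQuery("cc"));

        System.out.println(both.matches(doc));    // prints true
        System.out.println(missing.matches(doc)); // prints false
    }
}
```

Because the searching code depends only on the common type, it does not need to know which kind of query it was handed, whether a single term or a combination.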

Next, we refactor the SearchFiles class, dropping the paging logic and some of the command-line parameters.

package org.ygy.lucene;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SearchFiles {
	
	public static void main(String[] args) throws Exception {
		
		String index = "F:\\Lucene_index";
		String field = "contents";
		String queryString = "aa";
		
		int hitsPerPage = 10;
		
		// Open the index for reading
		IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
		
		// Create the searcher on top of the reader
		IndexSearcher searcher = new IndexSearcher(reader);
		
		// Analyzer used to tokenize the query text
		Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);

		// Parser that turns the query string into a Query
		QueryParser parser = new QueryParser(Version.LUCENE_45, field, analyzer);
		
		Query query = parser.parse(queryString);
		
		System.out.println("Searching for: " + query.toString(field));

		doSearch(searcher, query, hitsPerPage);
		
		reader.close();
	}

	public static void doSearch(IndexSearcher searcher, Query query, int hitsPerPage) throws IOException {
		TopDocs results = searcher.search(query, 5 * hitsPerPage);
		ScoreDoc[] hits = results.scoreDocs;

		int numTotalHits = results.totalHits;
		System.out.println(numTotalHits + " total matching documents");

		int start = 0;
		int end = Math.min(numTotalHits, hitsPerPage);
		
		// Iterate over the hits and print each document's path
		for (int i = start; i < end; i++) {
			Document doc = searcher.doc(hits[i].doc);
			
			String path = doc.get("path");
			if (path != null) {
				System.out.println(i + 1 + ". " + path);
			} else {
				System.out.println(i + 1 + ". " + "No path for this document");
			}
		}
	}
}
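The refactored doSearch always prints the first page. If paging is needed again later, the window of hits to print could be computed along these lines. This is a minimal sketch, not part of the original SearchFiles: `PageWindow` and its `of` method are hypothetical names, and `available` is assumed to be the number of hits actually fetched (`hits.length`), not the total hit count.

```java
// Hypothetical helper: computes which slice of the fetched hits belongs to a page.
final class PageWindow {
    final int start; // index of the first hit on the page (inclusive)
    final int end;   // index just past the last hit on the page (exclusive)

    private PageWindow(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // page is zero-based; available is the number of hits that were fetched.
    static PageWindow of(int page, int hitsPerPage, int available) {
        if (page < 0 || hitsPerPage <= 0 || available < 0) {
            throw new IllegalArgumentException("page and available must be >= 0, hitsPerPage must be > 0");
        }
        // Use long arithmetic so a large page number cannot overflow.
        long rawStart = (long) page * hitsPerPage;
        int start = (int) Math.min(rawStart, available);
        int end = (int) Math.min(rawStart + hitsPerPage, available);
        return new PageWindow(start, end);
    }

    public static void main(String[] args) {
        // 23 fetched hits, 10 per page: the third page (index 2) holds hits 20, 21, and 22.
        PageWindow window = PageWindow.of(2, 10, 23);
        System.out.println(window.start + " to " + window.end); // prints 20 to 23
    }
}
```

In doSearch, the loop would then run from `window.start` to `window.end` instead of from 0 to `end`. A page past the last hit yields an empty window, so the loop simply prints nothing.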