Lucene 3.5 Study Notes 02: Creating an Index and Searching It

Article link: https://blog.csdn.net/honor571/article/details/7318280

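First, a rough overview of how Lucene is organized.

Structure of Lucene: for an external application, the index module (index) and the search module (search) are the main entry points.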

org.apache.lucene.search	search entry point
org.apache.lucene.index	indexing entry point
org.apache.lucene.analysis	language analyzers
org.apache.lucene.queryParser	query parser
org.apache.lucene.document	document (storage) structure
org.apache.lucene.store	low-level IO / storage layer
org.apache.lucene.util	shared utility data structures

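Next, let's build the simplest possible file-search example.

First, create two empty folders on the local machine:
E:\lucene\data holds the data, i.e. the files to be searched
E:\lucene\index holds the index files that Lucene creates for that data
Then add some dummy data:
E:\lucene\data\1.txt        content: a1a2a3
E:\lucene\data\2.txt        content: b1b2b3
E:\lucene\data\3.txt        content: c1c2c3 honor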
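
The folders and dummy files above can also be created from code. The following is a minimal sketch, assuming Java 7 or later for java.nio.file; the class name SampleData and its create method are hypothetical and not part of the original post.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not from the original post) that prepares the sample data.
class SampleData {

    // Creates dataDir if it does not exist and writes the three sample files into it.
    static void create(Path dataDir) throws IOException {
        Files.createDirectories(dataDir);
        write(dataDir.resolve("1.txt"), "a1a2a3");
        write(dataDir.resolve("2.txt"), "b1b2b3");
        write(dataDir.resolve("3.txt"), "c1c2c3 honor");
    }

    // Writes content to file as UTF-8, replacing any existing file.
    private static void write(Path file, String content) throws IOException {
        Files.write(file, content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        create(Paths.get("E:/lucene/data"));                   // files to be searched
        Files.createDirectories(Paths.get("E:/lucene/index")); // empty folder for the index
    }
}
```

Passing the directory as a parameter keeps the sketch usable on machines that have no E: drive.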

Building the index

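import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// FileList and FileText are the author's own helper classes; the post does not show them.
public static void createIndex(String filePath, String indexPath) throws IOException {
		Version version = Version.LUCENE_35;
		
		// Open the directory that will hold the index
		File indexFile = new File(indexPath);
		FSDirectory directory = FSDirectory.open(indexFile);

		IndexWriterConfig conf = new IndexWriterConfig(version, new SimpleAnalyzer(version));
		IndexWriter writer = new IndexWriter(directory, conf);

		try{
			List<File> files = FileList.getFiles(filePath);// all files under this path
			for(File file:files){
				System.out.println("Indexing file " + file);

				// Build one Document per file
				Document doc = new Document();
				
				// File name: stored and analyzed, so it is searchable
				doc.add(new Field("filename", file.getName(), Field.Store.YES, Field.Index.ANALYZED));
				
				// File path: stored for display only, not indexed
				doc.add(new Field("uri", file.getPath(), Field.Store.YES, Field.Index.NO));
				
				String text = FileText.getText(file);// content of the file
				doc.add(new Field("text", text, Field.Store.YES, Field.Index.ANALYZED));// index the content in the "text" field
				// Write the document to the index
				writer.addDocument(doc);
			}
		}finally{
			// Close the index writer, even if indexing fails part-way
			writer.close();
		}
	}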

public static void main(String[] args) {
		String filePath = "E:/lucene/data";
		String indexPath = "E:/lucene/index";

		// Index every file under filePath into indexPath
		try{
			createIndex(filePath, indexPath);
		}catch(IOException e){
			e.printStackTrace();
		}
}
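
createIndex relies on two helpers, FileList.getFiles and FileText.getText, whose source is not included in the post. The following is a minimal sketch of what they might look like, not the author's actual code. It assumes Java 7 or later, that only files directly under the directory are wanted (no recursion), and that the files are UTF-8 text.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the post's FileList helper.
class FileList {

    // Returns the regular files directly under dirPath, sorted by path.
    // Subdirectories are skipped; an unreadable or missing directory yields an empty list.
    static List<File> getFiles(String dirPath) {
        List<File> result = new ArrayList<File>();
        File[] children = new File(dirPath).listFiles();
        if (children == null) {
            return result;
        }
        Arrays.sort(children);
        for (File child : children) {
            if (child.isFile()) {
                result.add(child);
            }
        }
        return result;
    }
}

// Hypothetical stand-in for the post's FileText helper.
class FileText {

    // Reads the whole file into a String, assuming UTF-8 encoding.
    static String getText(File file) throws IOException {
        return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
    }
}
```

Reading each file fully into memory is fine for small sample files like these; large files would call for a streaming approach instead.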

At this point Lucene has generated its index files in the E:\lucene\index\ directory.

Searching the index

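import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public static void search(String keyword, String indexPath) throws CorruptIndexException, IOException, ParseException {
		Version version = Version.LUCENE_35;
		
		// Searcher that points at the index directory
		File indexFile = new File(indexPath);
		FSDirectory directory = FSDirectory.open(indexFile);
		IndexReader reader = IndexReader.open(directory);
		IndexSearcher searcher = new IndexSearcher(reader);

		// Query parser: uses the same analyzer as indexing and queries the "text" field by default
		QueryParser parser = new QueryParser(version, "text", new SimpleAnalyzer(version));
		Query query = parser.parse(keyword);

		// The top 10 matches are returned as a TopDocs (no filter is applied)
		TopDocs hits = searcher.search(query, null, 10);

		// hits exposes the total hit count plus each match's document id and score
		System.out.println(hits.totalHits + " total results");
		
		
		System.out.println("-----匹配结果如下------");// header: "matching results follow"
		ScoreDoc[] scoredocs = hits.scoreDocs;
		for(int i = 0; i < scoredocs.length; i++){
			ScoreDoc scoreDoc = scoredocs[i];
			
			// Load the stored fields of the matching document
			Document d = searcher.doc(scoreDoc.doc);
			String path = d.get("uri");
			// 得分 = score, 文件路径 = file path
			System.out.println(i + "--得分:" +scoreDoc.score +" 文件路径:"+path);
		}

		searcher.close();
		// A searcher built from an IndexReader does not close that reader, so close it explicitly
		reader.close();
	}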

public static void main(String[] args) {
		String indexPath = "E:/lucene/index";

		try{
			// Search for the keyword "honor"
			search("honor",indexPath);
		}catch(CorruptIndexException e){
			e.printStackTrace();
		}catch(IOException e){
			e.printStackTrace();
		}catch(ParseException e){
			e.printStackTrace();
		}
	}
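
Both indexing and searching here go through SimpleAnalyzer, which breaks text at every non-letter character and lowercases the result. The following is a rough emulation of that behaviour in plain Java, with no Lucene dependency, to show which terms end up in the index for the sample files. The class SimpleAnalyzerSketch is hypothetical and only approximates the real analyzer (it works on chars and ignores supplementary code points).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical emulation of SimpleAnalyzer's tokenization; not Lucene code.
class SimpleAnalyzerSketch {

    // Splits text into maximal runs of letters and lowercases each run.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (Character.isLetter(ch)) {
                current.append(Character.toLowerCase(ch));
            } else if (current.length() > 0) {
                // A non-letter (digit, space, punctuation) ends the current token
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a1a2a3"));       // prints [a, a, a]
        System.out.println(tokenize("c1c2c3 honor")); // prints [c, c, c, honor]
    }
}
```

Digits never become part of a term under this analyzer, so the index holds a, b, c and honor rather than strings such as a1a2a3. Of the three sample files, only 3.txt contains the term honor.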

Running the search prints the following to the console (得分 is the score, 文件路径 the file path):

1 total results
-----匹配结果如下------
0--得分:0.70273256 文件路径:E:\lucene\data\3.txt
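
The score can be reproduced by hand from the classic TF-IDF formula that Lucene 3.x uses by default (DefaultSimilarity). The sketch below is a simplified reconstruction for a single-term query with no boosts, not Lucene's own code. It assumes the index holds 3 documents, that honor occurs once in one document, and that SimpleAnalyzer split "c1c2c3 honor" into four terms (c, c, c, honor). The class and method names are hypothetical.

```java
// Hypothetical reconstruction of the single-term score; not Lucene code.
class ScoreSketch {

    // Inverse document frequency: 1 + ln(numDocs / (docFreq + 1))
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Score of a one-term query against one field, with all boosts equal to 1.
    static float singleTermScore(int termFreq, int docFreq, int numDocs, int fieldTermCount) {
        float idf = idf(docFreq, numDocs);
        float queryNorm = (float) (1.0 / Math.sqrt(idf * idf));       // normalizes the query weight
        float queryWeight = idf * queryNorm;                          // about 1 for a single term
        float tf = (float) Math.sqrt(termFreq);                       // term frequency factor
        float fieldNorm = (float) (1.0 / Math.sqrt(fieldTermCount));  // length normalization
        return queryWeight * tf * idf * fieldNorm;
    }

    public static void main(String[] args) {
        // honor: termFreq 1, docFreq 1, 3 documents, 4 terms in the text field of 3.txt
        System.out.println(singleTermScore(1, 1, 3, 4));
    }
}
```

With these inputs idf is 1 + ln(3/2), roughly 1.4055, and the field norm is 1/sqrt(4) = 0.5, which gives a score of about 0.7027, in line with the console output. Lucene stores field norms in a single byte, so for other field lengths the real score can deviate slightly from this calculation.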

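As you can see, implementing search with Lucene is quite simple.
Since no Chinese text is involved here, the analyzers that ship with Lucene are sufficient.
Chinese text requires a Chinese word segmenter, which I will study next.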