Lucene in Practice (1): Introduction to Lucene and Running HelloWorld (Eclipse Project Included)


明天花会开


Category: Lucene  Tags: Lucene 3.5 hands-on tutorial

Article link: https://blog.csdn.net/mthhk008/article/details/24771587


Preface

Here is a CD from the past; have a listen to the thoughts we once had~

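Introduction to Lucene

Lucene is a mature, open-source search library for Java. It maintains an inverted index over a collection of documents (Document) and exposes a simple, easy-to-use API. For more background, see the encyclopedia entry on Lucene.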
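To make the term "inverted index" concrete, here is a minimal sketch in plain Java, with no Lucene dependency. It only illustrates the idea of mapping each term to the documents that contain it; the class name ToyInvertedIndex, its methods, and the crude tokenizer are assumptions made for this illustration and say nothing about how Lucene stores its index internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy inverted index: maps each lowercase term to the sorted set of ids of
// the documents that contain it. Illustrative only; not Lucene's implementation.
class ToyInvertedIndex {
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();
    private final List<String> docs = new ArrayList<>();

    // Stores the text and returns the id assigned to it (0, 1, 2, ...).
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        // Split on anything that is not a letter or digit (a crude "analyzer").
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Returns the ids of the documents containing the term, in ascending order.
    List<Integer> lookup(String term) {
        Set<Integer> ids = postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("Lucene is a search library");
        index.add("An inverted index maps terms to documents");
        index.add("Lucene maintains an inverted index");
        System.out.println(index.lookup("lucene"));   // [0, 2]
        System.out.println(index.lookup("inverted")); // [1, 2]
        System.out.println(index.lookup("missing"));  // []
    }
}
```

A lookup is a single map access rather than a scan over every document, which is the reason a search library builds this structure at indexing time.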

The figure below shows the flow of Lucene's indexing and search processes:

The table below describes the role of each Lucene package.

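Package	Function
org.apache.lucene.analysis	Language analyzers, mainly for tokenization; Chinese word segmentation can be supported by extending the classes here
org.apache.lucene.document	Document structure used when storing the index, similar to a table schema in a relational database
org.apache.lucene.index	Index management, including creating and deleting index entries
org.apache.lucene.queryParser	Query parser; implements operations on query keywords such as AND, OR, and NOT
org.apache.lucene.search	Search management; retrieves results according to the query conditions
org.apache.lucene.store	Storage management, mainly low-level I/O operations
org.apache.lucene.util	Shared utility classes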

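Getting Started with Lucene

In the figure above, the parts in red are where we need to step in through Lucene's API, and all of them are straightforward. Below are the general steps for implementing full-text search with Lucene (without integrating any framework):

Creating the index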

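package org.xiaom.lucene;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;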

public class MyIndexCreater {
	private static IndexWriter indexWriter;
	private static Version version = Version.LUCENE_35;
	/**
	 * Creates an index for all text files (.java, .xml, .txt) in the given
	 * directory and its subdirectories.
	 * @param docPath path where the documents are stored
	 * @param indexPath path where the index is stored
	 */
	public static void createContainChild(String docPath, String indexPath)
			throws IOException {
		File docDir = new File(docPath);
		File indexDir = new File(indexPath);
		// 1. Open the directory that holds the index
		Directory directory = FSDirectory.open(indexDir);
		// 2. Create the IndexWriterConfig
		IndexWriterConfig conf = new IndexWriterConfig(version, new StandardAnalyzer(version));
		// Overwrite any existing index files on every run
		conf.setOpenMode(OpenMode.CREATE);
		// Create the IndexWriter from the IndexWriterConfig instance
		indexWriter = new IndexWriter(directory, conf);
		try {
			indexDir(docDir);
			// 7. Commit the changes
			indexWriter.commit();
		} finally {
			// Always close the IndexWriter (required), even if indexing fails
			indexWriter.close();
		}
	}
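	// Index the directory and its subdirectories; returns the number of files indexed
	private static int indexDir(File dir) {
		int c = 0;
		File[] files = dir.listFiles();
		if (files == null) {
			// Not a directory, or it could not be read
			return 0;
		}
		for (File f : files) {
			if (f.isDirectory()) {
				// Add the subdirectory's count to the total
				c += indexDir(f);
			} else if (f.getName().endsWith(".java")
					|| f.getName().endsWith(".txt")
					|| f.getName().endsWith(".xml")) {
				c += indexFile(f);
			}
		}
		return c;
	}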
	// Index a single file; returns 1 on success, 0 on failure
	private static int indexFile(File f) {
		boolean rs = true;
		BufferedReader br = null;
		StringBuilder contentStr = new StringBuilder();
		try {
			br = new BufferedReader(new FileReader(f));
			// The first line is used as the title; an empty file gets an empty title
			String titleStr = br.readLine();
			if (titleStr == null) {
				titleStr = "";
			}
			String s;
			while ((s = br.readLine()) != null) {
				contentStr.append(s);
				contentStr.append("\n");
			}
			// 3. Create the Document object
			Document doc = new Document();
			// 4. Create the Field objects
			Field name = new Field("name", f.getName(), Store.YES, Index.ANALYZED);
			Field title = new Field("title", titleStr, Store.YES, Index.ANALYZED);
			Field content = new Field("content", contentStr.toString(), Store.YES, Index.ANALYZED);
			// 5. Add the Field objects to the Document
			doc.add(name);
			doc.add(title);
			doc.add(content);
			// 6. Add the Document to the indexWriter
			indexWriter.addDocument(doc);
		} catch (Exception e) {
			e.printStackTrace();
			rs = false;
		} finally {
			// Close the reader so the file handle is not leaked
			if (br != null) {
				try {
					br.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}
		return rs ? 1 : 0;
	}
}
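The file-selection part of indexDir (recurse into subdirectories, keep only .java, .txt and .xml files) can be tried on its own without Lucene. The sketch below is one possible variant, assuming Java 8 or later and using java.nio.file.Files.walk instead of File.listFiles. The class name IndexableFiles, its methods, and the temporary file names are hypothetical and not part of the original project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper that mirrors the extension filter used by indexDir.
class IndexableFiles {
    private static final String[] EXTENSIONS = {".java", ".txt", ".xml"};

    // True when the file name ends with one of the indexed extensions.
    static boolean isIndexable(String fileName) {
        for (String ext : EXTENSIONS) {
            if (fileName.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    // Walks root recursively and returns every regular file that passes the filter.
    static List<Path> collect(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> isIndexable(p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
    }

    // Builds a small temporary tree and returns the sorted names of the files selected.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("docs");
            Files.createDirectories(root.resolve("sub"));
            Files.write(root.resolve("a.txt"), new byte[0]);
            Files.write(root.resolve("sub").resolve("b.java"), new byte[0]);
            Files.write(root.resolve("sub").resolve("c.xml"), new byte[0]);
            Files.write(root.resolve("skip.png"), new byte[0]);
            return collect(root).stream()
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a.txt, b.java, c.xml]
    }
}
```

Running the filter separately like this is a quick way to check which files would be handed to indexFile before any index is written.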

Searching

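package org.xiaom.lucene;

import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;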

public class MyIndexSearcher {
	private static Version version = Version.LUCENE_35;
	/**
	 * @param indexPath path where the index is stored
	 * @param key name of the field to search (the query parser's default field)
	 * @param value query string to search for in that field
	 */
	public static void search(String indexPath, String key, String value) {
		IndexReader ireader = null;
		try {
			// 1. Create the IndexReader
			ireader = IndexReader.open(FSDirectory.open(new File(indexPath)));
			// 2. Create the IndexSearcher from the IndexReader instance
			IndexSearcher indexSearcher = new IndexSearcher(ireader);
			// 3. Create the QueryParser
			QueryParser queryParser = new QueryParser(version, key, new StandardAnalyzer(version));
			// 4. Parse the query string into a Query
			Query query = queryParser.parse(value);
			// 5. Receive the result of indexSearcher.search in a TopDocs (top 100 hits at most)
			TopDocs topDocs = indexSearcher.search(query, 100);
			ScoreDoc[] scoreDocs = topDocs.scoreDocs;
			// 6. Fetch each Document and print it
			System.out.println("total hit:" + topDocs.totalHits);
			System.out.println("total document:" + scoreDocs.length);
			System.out.println("==================================================");
			for (int i = 0; i < scoreDocs.length; i++) {
				Document doc = indexSearcher.doc(scoreDocs[i].doc);
				System.out.println("name:" + doc.get("name"));
				System.out.println("title:" + doc.get("title"));
				System.out.println("score:" + scoreDocs[i].score);
				// Print at most the first 80 characters; the content may be shorter
				String content = doc.get("content");
				if (content != null) {
					System.out.println("content:" + content.substring(0, Math.min(80, content.length())));
				}
			}
		} catch (CorruptIndexException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		} catch (ParseException e) {
			e.printStackTrace();
		} finally {
			// Release the index files held by the reader
			if (ireader != null) {
				try {
					ireader.close();
				} catch (IOException e) {
					e.printStackTrace();
				}
			}
		}
	}
}

Testing the search

package org.xiaom.lucene;

import java.io.IOException;

public class LuceneTest {
	public static void main(String[] args) throws IOException {
		String docPath = "D:/test1/docs";
		String indexPath = "D:/test1/index";
		MyIndexCreater.createContainChild(docPath, indexPath);
		MyIndexSearcher.search(indexPath, "content", "adfddd");
	}
}

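A Lucene 3.5 getting-started example project is available for download here.

Maintaining the index

Index maintenance generally involves the following operations:

Adding to the index (see above)
Deleting from the index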

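	// Delete the documents (and their index entries) that match the given term.
	// The deletion becomes visible to newly opened readers after indexWriter.commit().
	public boolean delete(Term term){
		boolean rs=true;
		try {
			indexWriter.deleteDocuments(term);
		} catch (CorruptIndexException e) {
			e.printStackTrace();
			rs=false;
		} catch (IOException e) {
			rs=false;
			e.printStackTrace();
		}
		return rs;
	}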

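Updating the index (delete, then add)

	// updateDocument first deletes the documents matching term, then adds doc.
	// Calling addDocument alone would leave the old version in the index.
	public boolean update(Term term, Document doc){
		boolean rs=true;
		try {
			indexWriter.updateDocument(term, doc);
		} catch (CorruptIndexException e) {
			rs=false;
			e.printStackTrace();
		} catch (IOException e) {
			rs=false;
			e.printStackTrace();
		}
		return rs;
	}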
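The "delete, then add" semantics of an update can be modelled without Lucene to see why adding alone is not enough. The sketch below is a toy in-memory store written only for illustration: the class ToyDocStore and all of its methods are hypothetical, and documents are plain field-to-value maps rather than Lucene Document objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory "index" whose documents are field-name -> value maps.
// It only models delete-then-add semantics; it is not Lucene code.
class ToyDocStore {
    private final List<Map<String, String>> docs = new ArrayList<>();

    void add(Map<String, String> doc) {
        docs.add(new HashMap<>(doc));
    }

    // Removes every document whose field equals value; returns how many were removed.
    int delete(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // Update = delete all documents matching (field, value), then add the new one.
    void update(String field, String value, Map<String, String> doc) {
        delete(field, value);
        add(doc);
    }

    int size() {
        return docs.size();
    }

    // Returns the titles of all documents with the given name, in insertion order.
    List<String> titlesFor(String name) {
        List<String> titles = new ArrayList<>();
        for (Map<String, String> d : docs) {
            if (name.equals(d.get("name"))) {
                titles.add(d.get("title"));
            }
        }
        return titles;
    }

    // Convenience factory for a two-field document.
    static Map<String, String> doc(String name, String title) {
        Map<String, String> d = new HashMap<>();
        d.put("name", name);
        d.put("title", title);
        return d;
    }

    public static void main(String[] args) {
        ToyDocStore store = new ToyDocStore();
        store.add(doc("a.txt", "old title"));
        store.add(doc("b.txt", "other"));
        store.update("name", "a.txt", doc("a.txt", "new title"));
        System.out.println(store.size());             // 2
        System.out.println(store.titlesFor("a.txt")); // [new title]
    }
}
```

If update only called add, the store would hold three documents and a.txt would return both the old and the new title, which is the duplicate-result symptom an add-only "update" produces in a real index.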