1. Downloading Lucene
The official Lucene site is http://lucene.apache.org/. The page lists the latest release news; follow the Lucene 5.5.5 release link there and download the archive lucene-5.5.5.zip.
2. Creating a simple index
Create a new Java project and add a Lucene test class named LuceneTest:
package com.xxpsw.demo.lucene;
public class LuceneTest {
}
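Unzip lucene-5.5.5.zip and add two JARs to the project: lucene-core-5.5.5.jar from \lucene-5.5.5\core\ and lucene-analyzers-common-5.5.5.jar from \lucene-5.5.5\analysis\common\. Also add JUnit 4 to make testing easier. The test methods in this article go inside LuceneTest and use classes from the packages org.apache.lucene.analysis, org.apache.lucene.analysis.standard, org.apache.lucene.document, org.apache.lucene.index, org.apache.lucene.search and org.apache.lucene.store (Store is the nested enum org.apache.lucene.document.Field.Store), plus java.io.File and org.junit.Test. The method testCreateIndex creates the index: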
@Test
public void testCreateIndex() throws Exception {
    // Location where the index is stored
    Directory directory = FSDirectory.open(new File("D://indexDir/").toPath());
    // Analyzer (tokenizer) applied to analyzed fields
    Analyzer analyzer = new StandardAnalyzer();
    // Configuration for the index writer
    IndexWriterConfig writerConfig = new IndexWriterConfig(analyzer);
    // Writer used to create and modify the index
    IndexWriter indexWriter = new IndexWriter(directory, writerConfig);
    Document doc = new Document();
    // Create the fields: (field name, field value, whether to store the value)
    IndexableField field = new IntField("id", 1, Store.YES);
    IndexableField title = new StringField("title", "xxpsw的博客", Store.YES);
    IndexableField content = new TextField("content", "http://blog.csdn.net/xxpsw"
            + "/article/details/78751630", Store.YES);
    doc.add(field);
    doc.add(title);
    doc.add(content);
    // Add the document to the index, then close the writer to commit the changes
    indexWriter.addDocument(doc);
    indexWriter.close();
}
Run the unit test, then check the index directory D://indexDir/, which should now contain the index files Lucene generated.
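The field types used in testCreateIndex behave differently at index time: a StringField is indexed as a single term containing the whole value, while a TextField is passed through the analyzer and split into several terms. The following is a minimal JDK-only sketch of that distinction, not Lucene code. The class ToyFieldTerms and its two methods are hypothetical names, and the lowercase-and-split logic is only a crude stand-in for StandardAnalyzer, whose real rules differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class ToyFieldTerms {

    // StringField-like behavior: the whole value becomes exactly one term.
    static List<String> keywordTerms(String value) {
        return Collections.singletonList(value);
    }

    // TextField-like behavior, heavily simplified (assumption: lowercase the
    // text and split on anything that is not a letter or a digit).
    static List<String> analyzedTerms(String value) {
        List<String> terms = new ArrayList<>();
        String lower = value.toLowerCase(Locale.ROOT);
        for (String piece : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                terms.add(piece);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(keywordTerms("xxpsw blog"));          // [xxpsw blog]
        System.out.println(analyzedTerms("Hello Lucene World")); // [hello, lucene, world]
    }
}
```

In this toy model an exact-match lookup on a keyword field only succeeds with the complete original value, whereas an analyzed field has to be looked up by one of its lowercased tokens.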
3. Searching with Lucene
The method testSearcher tests a search against the index:
@Test
public void testSearcher() throws Exception {
    // Location where the index is stored
    Directory directory = FSDirectory.open(new File("D://indexDir/").toPath());
    // Search by an exact field name and field value
    Query query = new TermQuery(new Term("title", "xxpsw的博客"));
    IndexReader indexReader = DirectoryReader.open(directory);
    IndexSearcher indexSearcher = new IndexSearcher(indexReader);
    // Find the top N documents that match the query
    TopDocs topDocs = indexSearcher.search(query, 100);
    System.out.println("Total hits ==> " + topDocs.totalHits);
    ScoreDoc[] scoreDocs = topDocs.scoreDocs;
    for (ScoreDoc scoreDoc : scoreDocs) {
        // Load the stored fields of each hit by its internal document ID
        int docID = scoreDoc.doc;
        Document document = indexSearcher.doc(docID);
        System.out.println("id ==> " + document.get("id"));
        System.out.println("title ==> " + document.get("title"));
        System.out.println("content ==> " + document.get("content"));
    }
    // Release the reader once the search is finished
    indexReader.close();
}
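Run the unit test; the console prints the total number of hits, followed by the stored id, title and content of each matching document.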
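The TermQuery in testSearcher matches because it looks up one exact term in one field, and the title was indexed as a StringField, so its whole value is that term. A rough JDK-only sketch of this lookup is shown below. ToyInvertedIndex, addTerm and termQuery are hypothetical names, and the sketch leaves out scoring entirely: it returns matches in insertion order, whereas Lucene ranks hits by relevance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyInvertedIndex {

    // Maps "field:term" to the IDs of the documents that contain that term.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Record that the given document contains the term in the given field.
    void addTerm(String field, String term, int docId) {
        postings.computeIfAbsent(field + ":" + term, k -> new ArrayList<>()).add(docId);
    }

    // Exact-term lookup that returns at most n document IDs (no scoring).
    List<Integer> termQuery(String field, String term, int n) {
        List<Integer> hits = postings.getOrDefault(field + ":" + term, new ArrayList<>());
        return new ArrayList<>(hits.subList(0, Math.min(n, hits.size())));
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addTerm("title", "xxpsw blog", 0); // untokenized title: one term
        index.addTerm("content", "lucene", 0);   // one token of an analyzed field

        System.out.println(index.termQuery("title", "xxpsw blog", 100)); // [0]
        System.out.println(index.termQuery("title", "xxpsw", 100));      // []
    }
}
```

The second lookup finds nothing because a partial value is a different term from the full title, which is why the query in testSearcher passes the complete title string.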