How to create and search an index with Lucene

Latest recommended article published on 2022-05-28 16:27:58

chunhun2282


Tags: database, java

Original link: https://my.oschina.net/u/4117393/blog/3040573


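1: Getting to know Lucene

Lucene is an open-source full-text search engine toolkit from Apache.

Full-text search

Full-text search is the process of first tokenizing text and building an index, then running searches against that index.

Tokenization means splitting a passage of text into individual words. Full-text search then uses those words to look up the data.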
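
To make the idea of tokenization concrete, here is a minimal sketch in plain Java. The class name SimpleTokenizer and its splitting rule (lowercase, then split on anything that is not a letter or digit) are assumptions made for illustration; Lucene's own analyzers are considerably more sophisticated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy tokenizer for illustration only; this is NOT Lucene's StandardAnalyzer.
class SimpleTokenizer {

    // Lowercases the text and splits it on every run of characters
    // that are neither letters nor digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) { // a leading delimiter yields an empty first piece
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [lucene, in, action, second, edition]
        System.out.println(tokenize("Lucene in Action, Second Edition"));
    }
}
```

Each of the resulting tokens is what an index would store and what a query term would be matched against.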

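2: How Lucene implements full-text search

Full-text search is not a single process. It has two major parts: the indexing flow and the search flow.

Indexing flow: collect the data ---> build document objects ---> create the index (write the documents into the index store).

Search flow: create the query ---> run the search ---> render the search results.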
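
The two flows can be sketched with a toy in-memory inverted index. This is only an illustration of the idea under simplifying assumptions (whitespace tokenization, single-term queries, no scoring); the class ToyIndex and its methods are hypothetical and do not reflect how Lucene stores its index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index; hypothetical, not Lucene's data structure.
class ToyIndex {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> original text
    private final List<String> documents = new ArrayList<>();

    // Indexing flow: store the document, tokenize it, record each term.
    int add(String text) {
        int docId = documents.size();
        documents.add(text);
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Search flow: look the term up and return the matching documents.
    List<String> search(String term) {
        List<String> hits = new ArrayList<>();
        Set<Integer> ids = postings.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.<Integer>emptySet());
        for (int docId : ids) {
            hits.add(documents.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("Java tutorial");
        index.add("Lucene in Action");
        index.add("Java in practice");
        // prints [Java tutorial, Java in practice]
        System.out.println(index.search("java"));
    }
}
```

The point of the design is that a search never scans the documents themselves: it only consults the term-to-document map built during indexing.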

2.1 Create the project and import the JARs

MySQL 5.1 driver: mysql-connector-java-5.1.7-bin.jar

Core library: lucene-core-4.10.3.jar

Common analyzers: lucene-analyzers-common-4.10.3.jar

Query parser: lucene-queryparser-4.10.3.jar

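2.2 Create the index

Steps:

(1) Collect the data

First create the relevant classes, then write a method that queries the database and returns all rows,

for example all the data in the book table.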
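
The article does not show the Book class or the DAO, so the following is only a sketch of what they might look like. The getter names match those used later in getDocuments; everything else is an assumption: the field types, the JDBC URL, the credentials, and the table and column names are all hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Plain data holder; field types are assumed, not taken from the article.
class Book {
    private final Integer bookId;
    private final String name;
    private final Double price;
    private final String pic;
    private final String description;

    Book(Integer bookId, String name, Double price, String pic, String description) {
        this.bookId = bookId;
        this.name = name;
        this.price = price;
        this.pic = pic;
        this.description = description;
    }

    public Integer getBookId() { return bookId; }
    public String getName() { return name; }
    public Double getPrice() { return price; }
    public String getPic() { return pic; }
    public String getDescription() { return description; }
}

class BookDao {
    // Hypothetical connection settings; replace with your own.
    private static final String URL = "jdbc:mysql://localhost:3306/lucene_demo";
    private static final String USER = "root";
    private static final String PASSWORD = "secret";

    // Reads every row of the (assumed) book table into Book objects.
    // Older drivers may need Class.forName("com.mysql.jdbc.Driver") first.
    public List<Book> getAll() throws SQLException {
        String sql = "SELECT id, name, price, pic, description FROM book";
        List<Book> books = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement ps = conn.prepareStatement(sql);
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                books.add(new Book(
                        rs.getInt("id"),
                        rs.getString("name"),
                        rs.getDouble("price"),
                        rs.getString("pic"),
                        rs.getString("description")));
            }
        }
        return books;
    }
}
```

Calling new BookDao().getAll() then yields the list that the next step converts into Lucene documents.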

(2) Convert the data into Lucene documents

Lucene wraps data in its Document type, so the collected data must first be converted into Document objects.

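import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;

// Method of BookDao: converts the collected books into Lucene documents
public List<Document> getDocuments(List<Book> books) {
    // Collection of Document objects
    List<Document> docList = new ArrayList<Document>();
    for (Book book : books) {
        // Create one Document per book, together with its Field objects
        Document doc = new Document();
        // Create the fields the application needs;
        // Store.YES keeps the original value so it can be read back from search results
        Field id = new TextField("id", book.getBookId().toString(), Store.YES);
        Field name = new TextField("name", book.getName(), Store.YES);
        Field price = new TextField("price", book.getPrice().toString(), Store.YES);
        Field pic = new TextField("pic", book.getPic(), Store.YES);
        Field desc = new TextField("description", book.getDescription(), Store.YES);
        // Add the fields (Field) to the document (Document)
        doc.add(id);
        doc.add(name);
        doc.add(price);
        doc.add(pic);
        doc.add(desc);
        docList.add(doc);
    }
    return docList;
}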

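(3) Write the documents into the index store to create the index

Lucene tokenizes the text and builds the index automatically while documents are being written to the index store. In practice, therefore, creating the index amounts to writing the documents into the index store.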

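import java.io.File;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public void createIndex() {
    try {
        BookDao dao = new BookDao();
        // Analyzer that tokenizes the field values of each document
        Analyzer analyzer = new StandardAnalyzer();
        // Create the index
        // 1) Open the index directory
        Directory directory = FSDirectory.open(new File("F:\\lucene\\0719"));
        // 2) Create the IndexWriterConfig object
        IndexWriterConfig cfg = new IndexWriterConfig(Version.LATEST, analyzer);
        // 3) Create the IndexWriter object
        IndexWriter writer = new IndexWriter(directory, cfg);
        // 4) Add the document objects through the IndexWriter
        writer.addDocuments(dao.getDocuments(dao.getAll()));
        // 5) Close the IndexWriter (this also commits the changes)
        writer.close();
        System.out.println("Index created successfully");
    } catch (Exception e) {
        e.printStackTrace();
    }
}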

3 Search the index

When searching, you must specify which field to search, and the search keywords must also be tokenized.
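
The role of the search field can be illustrated with a deliberately simplified query parser. This sketch only mimics one behavior: a term written as field:term targets that field, and a bare term falls back to a default field. The class ToyQueryParser and its Clause type are hypothetical; Lucene's real QueryParser supports a much richer syntax and also runs each term through the analyzer, which this sketch does not do.

```java
import java.util.ArrayList;
import java.util.List;

// Toy query parser; hypothetical, far simpler than Lucene's QueryParser.
class ToyQueryParser {

    // One (field, term) pair of a parsed query.
    static final class Clause {
        final String field;
        final String term;

        Clause(String field, String term) {
            this.field = field;
            this.term = term;
        }

        @Override
        public String toString() {
            return field + ":" + term;
        }
    }

    private final String defaultField;

    ToyQueryParser(String defaultField) {
        this.defaultField = defaultField;
    }

    // Splits the query on whitespace; "field:term" targets an explicit field,
    // a bare term is assigned to the default field.
    List<Clause> parse(String query) {
        List<Clause> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for an empty query
            }
            int colon = token.indexOf(':');
            if (colon > 0 && colon < token.length() - 1) {
                clauses.add(new Clause(token.substring(0, colon), token.substring(colon + 1)));
            } else {
                clauses.add(new Clause(defaultField, token));
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        ToyQueryParser parser = new ToyQueryParser("name");
        // prints [name:java, name:tutorial]
        System.out.println(parser.parse("name:java tutorial"));
        // prints [description:lucene, name:java]
        System.out.println(parser.parse("description:lucene java"));
    }
}
```

This is why a query such as "name:java tutorial" with the default field name ends up searching the name field for both words.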

3.1 Run the search

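import java.io.File;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public void searchDocumentByIndex() {
    try {
        // 1. Create the query (Query object)
        // Create the analyzer; it should match the one used at indexing time
        Analyzer analyzer = new StandardAnalyzer();
        // "name" is the default field for terms without an explicit field prefix
        QueryParser queryParser = new QueryParser("name", analyzer);
        // 教程 means "tutorial"
        Query query = queryParser.parse("name:java 教程");
        // 2. Run the search
        // a) Open the index directory
        Directory directory = FSDirectory.open(new File("F:\\lucene\\0719"));
        // b) Create the IndexReader object
        IndexReader reader = DirectoryReader.open(directory);
        // c) Create the IndexSearcher object
        IndexSearcher searcher = new IndexSearcher(reader);
        // d) Query the index through the IndexSearcher; it returns a TopDocs object
        // First argument: the query object
        // Second argument: the maximum number n of records to return
        TopDocs topDocs = searcher.search(query, 10);
        // e) Extract the top n records from the TopDocs object
        ScoreDoc[] scoreDocs = topDocs.scoreDocs;
        System.out.println("Number of matching documents: " + topDocs.totalHits);
        for (ScoreDoc scoreDoc : scoreDocs) {
            // Lucene's internal document ID
            int docId = scoreDoc.doc;
            Document doc = searcher.doc(docId);
            // f) Print the document contents
            System.out.println("===============================");
            System.out.println("Document id: " + docId);
            System.out.println("Book id: " + doc.get("id"));
            System.out.println("Book name: " + doc.get("name"));
            System.out.println("Book price: " + doc.get("price"));
            System.out.println("Book pic: " + doc.get("pic"));
            System.out.println("Book description: " + doc.get("description"));
        }
        // g) Close the IndexReader
        reader.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}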

Reposted from: https://my.oschina.net/u/4117393/blog/3040573
