Lucene Notes_汤阳光_01

Site Search

 

Search is a very common feature.

Searching means finding the data you want, quickly and accurately, in a large pool of resources.

 

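Definition of full-text retrieval:

1. Everything that is searched is text.

2. Semantics are not processed; the approach is to find the texts that contain the specified terms.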

 

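Characteristics

a. Matching of English text is case-insensitive.

b. The search has to deliver on three points:

         1. Result quality: the results must be accurate.

         2. Relevance ranking (results are sorted by relevance score; relevance is how well a result matches the search conditions).

         3. Speed.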
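To make the characteristics above concrete, here is a minimal sketch in plain Java of case-insensitive term matching with a naive relevance ranking. It is not how Lucene scores documents: the class name `NaiveFullTextSearch` and the "count the occurrences" score are assumptions made only for illustration, and the linear scan over every text is exactly what an index is meant to avoid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class NaiveFullTextSearch {

    /** Counts how many times term occurs in text, ignoring case. */
    public static int countOccurrences(String text, String term) {
        String haystack = text.toLowerCase(Locale.ROOT);
        String needle = term.toLowerCase(Locale.ROOT);
        if (needle.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length();
        }
        return count;
    }

    /** Returns the texts that contain term, most occurrences first. */
    public static List<String> search(List<String> texts, String term) {
        List<String> hits = new ArrayList<String>();
        for (String text : texts) {
            if (countOccurrences(text, term) > 0) {
                hits.add(text);
            }
        }
        // Naive "relevance": more occurrences of the term ranks higher.
        hits.sort(Comparator.comparingInt((String t) -> countOccurrences(t, term)).reversed());
        return hits;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList(
                "Lucene is a full-text search framework",
                "LUCENE builds an index; lucene then searches it",
                "Compass is a different project");
        for (String hit : search(texts, "lucene")) {
            System.out.println(hit);
        }
    }
}
```

Lower-casing both the text and the term before comparing is what makes the match case-insensitive; sorting by a score is the simplest possible form of relevance ranking.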

 

 

Setting up a Lucene development environment only requires adding the Lucene JARs. At a minimum, add:

     lucene-core-3.0.1.jar (core)

     contrib\analyzers\common\lucene-analyzers-3.0.1.jar (analyzers)

     contrib\highlighter\lucene-highlighter-3.0.1.jar (highlighting)

     contrib\memory\lucene-memory-3.0.1.jar (highlighting; needed by the highlighter)

 

 

Lucene HelloWorld example program

 

Article.java

 

package cn.itcast._domain;

/**
 * Article entity.
 *
 * @author tyg
 */
public class Article {

    private Integer id;
    private String title;
    private String content;

    public Integer getId() {
        return id;
    }

    public void setId(Integer id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }
}

 

 

HelloWorld.java (one method that builds the index, one method that searches)

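import java.io.File;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Index;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriter.MaxFieldLength;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Test;

import cn.itcast._domain.Article;

public class HelloWorld {

    // Build the index (simulates posting an article to a forum: it is saved
    // to the database and should also be indexed so that it can be found)
    @Test
    public void createIndex() throws Exception {
        // Simulate a record that has just been saved to the database
        Article article = new Article();
        article.setId(1);
        article.setTitle("Lucene是全文检索的框架");
        article.setContent("如果信息检索系统在用户发出了检索请求后再去互联网上找答案,根本无法在有限的时间内返回结果。");

        // Build the index
        // >> 1. Convert the Article into a Document
        Document doc = new Document();
        doc.add(new Field("id", article.getId().toString(), Store.YES, Index.ANALYZED));
        doc.add(new Field("title", article.getTitle(), Store.YES, Index.ANALYZED));
        doc.add(new Field("content", article.getContent(), Store.YES, Index.ANALYZED));

        // >> 2. Write the Document to the index
        Directory directory = FSDirectory.open(new File("./indexDir/")); // directory holding the index files
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

        IndexWriter indexWriter = new IndexWriter(directory, analyzer, MaxFieldLength.LIMITED);
        indexWriter.addDocument(doc);
        indexWriter.close();
    }

    // Search
    @Test
    public void search() throws Exception {
        // Search condition
        // String queryString = "lucene";
        String queryString = "compass";

        // Run the search and collect the results
        // ====================================================================
        Directory directory = FSDirectory.open(new File("./indexDir/")); // directory holding the index files
        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);

        // 1. Convert the query string into a Query object
        QueryParser queryParser = new QueryParser(Version.LUCENE_30, "title", analyzer); // search the title field only
        Query query = queryParser.parse(queryString);

        // 2. Run the query to get the intermediate result
        IndexSearcher indexSearcher = new IndexSearcher(directory);
        TopDocs topDocs = indexSearcher.search(query, 100); // run the query, returning only the top n results
        int count = topDocs.totalHits; // total number of hits
        ScoreDoc[] scoreDocs = topDocs.scoreDocs; // information about the top n results

        // 3. Process the results
        List<Article> list = new ArrayList<Article>();
        for (int i = 0; i < scoreDocs.length; i++) {
            ScoreDoc scoreDoc = scoreDocs[i];
            float score = scoreDoc.score; // relevance score
            int docId = scoreDoc.doc; // internal Document number (unique, generated by Lucene)

            // Fetch the actual Document by its number
            Document doc = indexSearcher.doc(docId);

            // Convert the Document back into an Article
            Article article = new Article();
            article.setId(Integer.parseInt(doc.get("id"))); // stored as text, so convert to Integer
            article.setTitle(doc.get("title")); // same as doc.getField("title").stringValue()
            article.setContent(doc.get("content"));
            list.add(article);
        }

        indexSearcher.close();
        // ====================================================================

        // Show the results
        System.out.println("Total number of results: " + count);
        for (Article article : list) {
            System.out.println("-------->id = " + article.getId());
            System.out.println("title  = " + article.getTitle());
            System.out.println("content=" + article.getContent());
        }
    }
}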

 

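Lucene_1_Structure of an Internet-facing search application


Lucene_2_Operations on the index


Lucene data structures


Lucene_3_Structure of the index store

 

Lucene_4_Structure of the index store - the indexing process


Lucene_5_Structure of the index store - the search process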
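The index structure, the indexing process and the search process named in the headings above can be sketched with a toy inverted index in plain Java. This is only a simplified illustration, not Lucene's on-disk format: the class name `TinyInvertedIndex` is hypothetical, and splitting on non-word characters stands in for a real analyzer (it would drop Chinese text, for example).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class TinyInvertedIndex {

    // term -> IDs of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings =
            new HashMap<String, SortedSet<Integer>>();

    /** Indexing: split the text into lower-cased terms and record docId under each term. */
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split() yields an empty first token when text starts with a separator
            }
            postings.computeIfAbsent(term, k -> new TreeSet<Integer>()).add(docId);
        }
    }

    /** Searching: look the term up directly instead of scanning every document. */
    public SortedSet<Integer> search(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null
                ? new TreeSet<Integer>()
                : Collections.unmodifiableSortedSet(ids);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add(1, "Lucene is a full-text search framework");
        index.add(2, "Compass is built on Lucene");
        System.out.println(index.search("lucene"));
        System.out.println(index.search("compass"));
    }
}
```

The work of splitting text into terms is paid once, at indexing time. A search then costs one map lookup, and the returned IDs play the role of the internal document numbers that the HelloWorld program resolves into full documents.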

 

 

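Detecting JVM exit and running a piece of code

// Close the IndexWriter before the program exits.
// indexWriter must be visible here, e.g. a field or a final local variable.
Runtime.getRuntime().addShutdownHook(new Thread() {
    @Override
    public void run() { // the JVM calls this run() method before it exits
        try {
            indexWriter.close();
            System.out.println("-- IndexWriter closed --");
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
});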
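The snippet above depends on an `indexWriter` variable defined elsewhere. A self-contained version of the same idea might look like the sketch below, where any `java.io.Closeable` stands in for the writer. The class name `ShutdownHookDemo` and the helper `closeOnExit` are hypothetical names chosen for this example.

```java
import java.io.Closeable;
import java.io.IOException;

public class ShutdownHookDemo {

    /** Registers a hook that closes the resource when the JVM exits; returns the hook thread. */
    public static Thread closeOnExit(final Closeable resource) {
        Thread hook = new Thread() {
            @Override
            public void run() {
                try {
                    resource.close();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        };
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Stand-in for a long-lived resource such as an index writer.
        closeOnExit(new Closeable() {
            @Override
            public void close() {
                System.out.println("-- resource closed --");
            }
        });
        System.out.println("main() is finishing");
    }
}
```

Returning the hook thread lets the caller undo the registration with `Runtime.getRuntime().removeShutdownHook(hook)` if the resource is closed earlier by normal program flow.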

 

 
