Using the Search Methods in Lucene 3.0


单眼皮的心情


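QueryParser is a general-purpose helper class that converts text typed by a user into Lucene's built-in Query objects (most web search engines offer a single input box for entering query conditions). QueryParser supports a rich syntax, so a single input string can express many kinds of advanced conditions. To get correct results, parse queries with the same analyzer that was used to build the index. When you create a QueryParser you give it a default field, and the query string itself can name one or more fields explicitly.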

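Calling parse on a QueryParser instance (it is an instance method, not a static one) returns a Query, which may be a single atomic query or a combination of several. The concrete class depends on the input and on the analyzer. For example, with an analyzer that emits one token per Chinese character, as StandardAnalyzer does, “title:电视台 source:亲亲宝宝” returns a BooleanQuery, “title:电视台” or “电视台” returns a PhraseQuery, and “台” returns a TermQuery.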

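“title:电视台 site:亲亲宝宝” matches documents whose title contains 电视台 or whose source is 亲亲宝宝.

“+title:电视台 site:亲亲宝宝” requires the title to contain 电视台; the source clause is optional and only raises the score of documents that also match it.

“-title:电视台 site:亲亲宝宝” matches documents whose source is 亲亲宝宝 and whose title does not contain 电视台.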

With this syntax you can offer Google- or Baidu-style multi-field search from a single input box. Basic usage looks like this:

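// Default field, used for any clause that does not name a field itself.
String field = "contents";

// analyzer should be the Analyzer used to build the index; str is the user's query string.
QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, field, analyzer);

// parse() throws ParseException if the string is not valid query syntax.
Query query = parser.parse(str);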
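The way the "+" and "-" prefixes combine can be sketched without Lucene at all. The toy model below is plain Java, not Lucene's parser or scorer: the class ClauseDemo, its methods, and the "field value contains the text" matching rule are all assumptions made for illustration. It only shows which documents are accepted, not how they are scored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy model of how "+", "-" and unprefixed clauses combine; not Lucene code. */
class ClauseDemo {
    enum Occur { MUST, SHOULD, MUST_NOT }

    static final class Clause {
        final String field;
        final String text;
        final Occur occur;
        Clause(String field, String text, Occur occur) {
            this.field = field;
            this.text = text;
            this.occur = occur;
        }
    }

    /** Parses whitespace-separated clauses of the form [+|-]field:text. */
    static List<Clause> parse(String query) {
        List<Clause> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            Occur occur = Occur.SHOULD;            // no prefix: optional clause
            if (token.startsWith("+")) {
                occur = Occur.MUST;                // "+": clause is required
                token = token.substring(1);
            } else if (token.startsWith("-")) {
                occur = Occur.MUST_NOT;            // "-": clause is prohibited
                token = token.substring(1);
            }
            int colon = token.indexOf(':');
            if (colon < 0) {
                // This sketch has no default field, so every clause must name one.
                throw new IllegalArgumentException("expected field:text but got " + token);
            }
            clauses.add(new Clause(token.substring(0, colon), token.substring(colon + 1), occur));
        }
        return clauses;
    }

    /** A clause "hits" when the document's field value contains the clause text. */
    static boolean matches(Map<String, String> doc, List<Clause> clauses) {
        boolean hasMust = false;
        boolean anyShouldHit = false;
        for (Clause c : clauses) {
            String value = doc.get(c.field);
            boolean hit = value != null && value.contains(c.text);
            switch (c.occur) {
                case MUST:
                    hasMust = true;
                    if (!hit) return false;
                    break;
                case MUST_NOT:
                    if (hit) return false;
                    break;
                default:
                    anyShouldHit |= hit;
            }
        }
        // With no required clause, at least one optional clause has to hit.
        return hasMust || anyShouldHit;
    }
}
```

For a document with title 电视台 and site 其他, ClauseDemo.matches accepts “title:电视台 site:亲亲宝宝” and “+title:电视台 site:亲亲宝宝” but rejects “-title:电视台 site:亲亲宝宝”. A query made only of prohibited clauses matches nothing in this model.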

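Term is the basic unit of search. Like a Field, it consists of a pair of strings: the field name and the field value. Terms are also involved in indexing, but there Lucene creates them internally, so you rarely need to think about them at index time. At search time you can construct a Term yourself and use it with a TermQuery: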

Query q = new TermQuery(new Term("contents", "lucene"));

Lucene's built-in Query objects

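TermQuery: term query. Matches every document in the index that contains the specified term.

BooleanQuery: Boolean query. Any complex query that involves AND, OR, or NOT logic is ultimately represented as a BooleanQuery, which is a set of clauses (sub-queries), each flagged as required, optional, or prohibited.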

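RangeQuery: range query, over dates, times, numbers, sizes, and so on. In Lucene 3.0 the old RangeQuery class is no longer available; use TermRangeQuery (as the reference code does) for ranges over string terms, or NumericRangeQuery for fields indexed as numbers.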
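One practical consequence of a term-based range is that bounds are compared as strings, not as numbers. The stand-alone sketch below (plain Java, not Lucene code; TermRangeDemo and its methods are hypothetical names) mimics an inclusive or exclusive range check with String.compareTo and shows zero-padding as one common way to make string order agree with numeric order.

```java
/** Illustrates lexicographic range checks on string terms; not Lucene code. */
class TermRangeDemo {
    /** Returns true if term lies between lower and upper in String.compareTo order. */
    static boolean inRange(String term, String lower, String upper,
                           boolean includeLower, boolean includeUpper) {
        int lo = term.compareTo(lower);
        int hi = term.compareTo(upper);
        boolean aboveLower = includeLower ? lo >= 0 : lo > 0;
        boolean belowUpper = includeUpper ? hi <= 0 : hi < 0;
        return aboveLower && belowUpper;
    }

    /** Left-pads a non-negative number with zeros so string order equals numeric order. */
    static String pad(long value, int width) {
        return String.format("%0" + width + "d", value);
    }
}
```

With raw strings, inRange("1000", "100", "900", true, true) is true even though 1000 is numerically outside the range. After padding every value to the same width, for example pad(1000, 4) against pad(100, 4) and pad(900, 4), the check agrees with numeric order.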

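PrefixQuery: prefix query. Matches terms that begin with the given prefix.

PhraseQuery: phrase query. By default the match is exact, but you can set a slop (default 0) to loosen it. For example, with slop=1 and the phrase “电台”, text with one extra character between the two characters, such as “电视台”, also matches.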
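To make the slop example concrete, here is a small stand-alone sketch in plain Java. It is not Lucene's implementation: it assumes one token per character and handles only a two-term phrase whose terms appear in order, and the names SlopDemo, charTokens, and matchesInOrder are made up for illustration. Lucene's real slop is an edit distance over token positions and, at higher values, also allows terms to appear out of order.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified two-term, in-order model of phrase slop; not Lucene code. */
class SlopDemo {
    /** Splits text into one token per character, as a per-character tokenizer would. */
    static List<String> charTokens(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints().forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    /** True if second follows first with at most slop tokens between them. */
    static boolean matchesInOrder(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            for (int j = i + 1; j < tokens.size(); j++) {
                int gap = j - i - 1; // number of tokens sitting between the two terms
                if (tokens.get(j).equals(second) && gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

In this model, SlopDemo.matchesInOrder(SlopDemo.charTokens("电视台"), "电", "台", 1) is true, while the same call with slop 0 is false because 视 sits between the two terms.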

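MultiPhraseQuery: multi-phrase query. A PhraseQuery variant in which a position can hold several alternative terms.

FuzzyQuery: fuzzy query. Matching is based on the Levenshtein (edit) distance, which compares two strings by counting three kinds of single-character operation: adding a letter (Insert), removing a letter (Delete), and changing a letter (Substitute).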
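The edit distance itself is easy to compute by hand. The following is a textbook dynamic-programming sketch in plain Java, included only to illustrate the three operations; it is not the code Lucene runs, and the class name Levenshtein is chosen here for illustration.

```java
/** Textbook Levenshtein distance using two rolling rows; not Lucene code. */
class Levenshtein {
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty string into the first j characters of b takes j inserts.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int substituteCost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + substituteCost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

For example, Levenshtein.distance("kitten", "sitting") is 3: two substitutions and one insertion. The smaller the distance relative to the term length, the more similar the two terms are.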

WildcardQuery: wildcard query. “*” stands for zero or more characters and “?” stands for exactly one character.
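The two wildcard rules can be illustrated by translating a pattern into a regular expression and testing it against a whole term. This is a plain-Java sketch under that assumption, not how Lucene evaluates wildcards internally; WildcardDemo and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

/** Translates "*" and "?" wildcards into a regex for whole-term matching; not Lucene code. */
class WildcardDemo {
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                // Flush any pending literal text, quoted so regex metacharacters stay literal.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** True if the whole term matches the wildcard pattern. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }
}
```

Here WildcardDemo.matches("te?t", "test") is true and WildcardDemo.matches("te?t", "tet") is false, because “?” must consume exactly one character, while WildcardDemo.matches("luc*", "luc") is true because “*” may match nothing.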

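SpanQuery: span query. This is an abstract class.

SpanTermQuery: returns the same results as TermQuery, but also records position information for the other SpanQuery APIs to use. It is the building block for the other SpanQuery subclasses.

SpanFirstQuery: looks for the specified term within a fixed number of positions from the start of the field's content.

SpanNearQuery: similar to PhraseQuery, but what it matches need not be a phrase of single terms. Each clause can be the result of another SpanQuery treated as a unit, so queries can be nested.

SpanOrQuery: combines the results of all its SpanQuery clauses into one result.

SpanNotQuery: takes the results of the first SpanQuery and removes those that overlap the results of the second SpanQuery.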

Reference code:

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.PhraseQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.search.WildcardQuery;
import org.apache.lucene.search.spans.SpanFirstQuery;
import org.apache.lucene.search.spans.SpanNearQuery;
import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.RAMDirectory;

public class QueryTest {

    static String sIndex_Path="E:/index";
    static String sText_path="E:/textbook";
    static protected String[] keywords = {"001","002","003","004","005"};
    static protected String[] textdetail = {"记录一","记录二","记录三","一 2345 记录","记录新一"};
    static File fIndex_Path=new File(sIndex_Path);

/**===========================================================
* Name: IndexBuilder
* Purpose: build an on-disk index from the files in sText_path so that
*          the search methods below have something to query
=============================================================**/
public static void IndexBuilder(){
   try{
    Date start = new Date();
    File f=new File(sText_path);
    File[] list=f.listFiles();
    if(list==null){
     //listFiles() returns null when the path is missing or is not a directory
     System.out.println("Not a readable directory: "+sText_path);
     return;
    }
    File file2 = new File(sIndex_Path);
    //Open the on-disk index directory
    Directory dir = FSDirectory.open(file2);
    Directory ramdir = new RAMDirectory();
    Analyzer TextAnalyzer = new SimpleAnalyzer();
    //Create the on-disk index writer (true = create a new index, replacing any existing one)
    IndexWriter TextIndex = new IndexWriter(dir, TextAnalyzer, true, IndexWriter.MaxFieldLength.LIMITED);
    //Create an in-memory index writer (shown for comparison; no documents are added to it)
    IndexWriter RAMTextIndex = new IndexWriter(ramdir,TextAnalyzer,true, IndexWriter.MaxFieldLength.LIMITED);
    for(int i=0;i<list.length;i++){
     Document document = new Document();
     //Store the current file's name un-analyzed so it can be printed with each hit
     Field field_name = new Field("name", list[i].getName(),
       Field.Store.YES, Field.Index.NOT_ANALYZED);
     document.add(field_name);
     FileInputStream inputfile = new FileInputStream(list[i]);
     int len = inputfile.available();
     byte[] buffer = new byte[len];
     inputfile.read(buffer);
     inputfile.close();

     //Decoded with the platform default charset
     String contenttext = new String(buffer);
     Field field_content = new Field("content", contenttext,
       Field.Store.YES, Field.Index.ANALYZED);
     document.add(field_content);

     Field field_size = new Field("size",String.valueOf(len),Field.Store.YES,Field.Index.NOT_ANALYZED);
     document.add(field_size);
     TextIndex.addDocument(document);
    }
    //Optimize once, after all documents have been added, then close both writers
    TextIndex.optimize();
    TextIndex.close();
    RAMTextIndex.close();
    Date end = new Date();
    long tm_index = end.getTime()-start.getTime();
    System.out.print("Total Time:(ms)");
    System.out.println(tm_index);
    System.out.println("Index Success");
   }catch(IOException e){
    e.printStackTrace();
   }
}

/**===================================================================
*Name: LuceneTermQuery
*Purpose: build a TermQuery, run it against the index, and print each matching document
===================================================================**/
public static void LuceneTermQuery(String word){
   try{
    Directory Index_Dir=FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term t = new Term("content", word); // search the analyzed "content" field for the caller's word
    TermQuery query = new TermQuery(t);
    System.out.print(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   }catch(IOException e){
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**===================================================================
*Name: LuceneRangeQuery
*Purpose: build a TermRangeQuery, run it against the index, and print each matching document
===================================================================**/
public static void LuceneRangeQuery(String lowerTerm, String upperTerm){
   try{
    Directory Index_Dir=FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
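    // "size" is the un-analyzed field written by IndexBuilder; bounds are compared as strings, both inclusive
    TermRangeQuery query = new TermRangeQuery("size",lowerTerm,upperTerm,true,true);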
    System.out.print(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   }catch(IOException e){
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
*Name: LuceneBooleanQuery
*Purpose: build a BooleanQuery whose two clauses are both required, run it against the index, and print each matching document
=========================================================================**/
public static void LuceneBooleanQuery(){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term1 = new Term("content","记录");
    Term term2 = new Term("content","二");
    TermQuery query1 = new TermQuery(term1);
    TermQuery query2 = new TermQuery(term2);
    BooleanQuery query = new BooleanQuery();
    query.add(query1,BooleanClause.Occur.MUST);
    query.add(query2,BooleanClause.Occur.MUST);
    System.out.println(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LucenePrefixQuery
* Purpose: build a PrefixQuery, run it against the index, and print each matching document
==========================================================================*/
public static void LucenePrefixQuery(String word){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term = new Term("content",word);
    PrefixQuery query = new PrefixQuery(term);
    System.out.println(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LucenePhraseQuery
* Purpose: build a two-term PhraseQuery (default slop 0), run it against the index, and print each matching document
==========================================================================*/
public static void LucenePhraseQuery(String word1, String word2){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term1 = new Term("content",word1);
    Term term2 = new Term("content",word2);
    PhraseQuery query = new PhraseQuery();
    query.add(term1);
    query.add(term2);
    System.out.println(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LuceneFuzzyQuery
* Purpose: build a FuzzyQuery, run it against the index, and print each matching document
==========================================================================*/
public static void LuceneFuzzyQuery(String word){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term = new Term("content",word);
    FuzzyQuery query = new FuzzyQuery(term);
    System.out.println(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LuceneWildcardQuery
* Purpose: build a WildcardQuery, run it against the index, and print each matching document
==========================================================================*/
public static void LuceneWildcardQuery(String word){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term = new Term("content",word);
    WildcardQuery query = new WildcardQuery(term);
    System.out.println(query.toString());
    ScoreDoc[] hits = searcher.search(query, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LuceneSpanFirstQuery
* Purpose: build a SpanFirstQuery (term within the first 2 positions), run it against the index, and print each matching document
==========================================================================*/
public static void LuceneSpanFirstQuery(String word){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term = new Term("content",word);
    SpanTermQuery query = new SpanTermQuery(term);
    SpanFirstQuery firstquery = new SpanFirstQuery(query,2);
    System.out.println(firstquery.toString());
    // Search with the SpanFirstQuery itself, not the inner SpanTermQuery
    ScoreDoc[] hits = searcher.search(firstquery, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LuceneSpanNearQuery
* Purpose: build an ordered SpanNearQuery over three terms (slop 1), run it against the index, and print each matching document
==========================================================================*/
public static void LuceneSpanNearQuery(String word1,String word2,String word3){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term1 = new Term("content",word1);
    Term term2 = new Term("content",word2);
    Term term3 = new Term("content",word3);
    SpanTermQuery query1 = new SpanTermQuery(term1);
    SpanTermQuery query2 = new SpanTermQuery(term2);
    SpanTermQuery query3 = new SpanTermQuery(term3);
    SpanQuery[] queryarray = new SpanQuery[]{query1,query2,query3};
    SpanNearQuery nearquery = new SpanNearQuery(queryarray,1,true);
    System.out.println(nearquery.toString());
    ScoreDoc[] hits = searcher.search(nearquery, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
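* Name: LuceneSpanNotQuery
* Purpose: build a SpanNotQuery (word1 near word2, excluding spans that overlap word3), run it against the index, and print each matching document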
==========================================================================*/
public static void LuceneSpanNotQuery(String word1,String word2,String word3){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term1 = new Term("content",word1);
    Term term2 = new Term("content",word2);
    Term term3 = new Term("content",word3);
    SpanTermQuery query1 = new SpanTermQuery(term1);
    SpanTermQuery query2 = new SpanTermQuery(term2);
    SpanTermQuery query3 = new SpanTermQuery(term3);
    SpanQuery[] queryarray = new SpanQuery[]{query1,query2};
    SpanNearQuery nearquery = new SpanNearQuery(queryarray,1,true);
    SpanNotQuery notquery = new SpanNotQuery(nearquery,query3);
    System.out.println(notquery.toString());
    ScoreDoc[] hits = searcher.search(notquery, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

/**=========================================================================
* Name: LuceneSpanOrQuery
* Purpose: build a SpanOrQuery over two SpanNearQuery clauses, run it against the index, and print each matching document
==========================================================================*/
public static void LuceneSpanOrQuery(String word1,String word2,String word3){
   try {
    Directory Index_Dir = FSDirectory.open(fIndex_Path);
    IndexSearcher searcher = new IndexSearcher(Index_Dir);
    Term term1 = new Term("content",word1);
    Term term2 = new Term("content",word2);
    Term term3 = new Term("content",word3);
    SpanTermQuery query1 = new SpanTermQuery(term1);
    SpanTermQuery query2 = new SpanTermQuery(term2);
    SpanTermQuery query3 = new SpanTermQuery(term3);
    SpanQuery[] queryarray1 = new SpanQuery[]{query1,query2};
    SpanQuery[] queryarray2 = new SpanQuery[]{query2,query3};
    SpanNearQuery nearquery1 = new SpanNearQuery(queryarray1,1,true);
    SpanNearQuery nearquery2 = new SpanNearQuery(queryarray2,1,true);
    SpanOrQuery orquery = new SpanOrQuery(new SpanNearQuery[]{nearquery1,nearquery2});
    System.out.println(orquery.toString());
    ScoreDoc[] hits = searcher.search(orquery, null, 1000).scoreDocs;
    System.out.println("Search result:");
    for (int i = 0; i < hits.length; i++) {
     Document hitDoc = searcher.doc(hits[i].doc);
        System.out.println(hitDoc.get("name")); // "name" is a stored field written by IndexBuilder
    }
   } catch (IOException e) {
    e.printStackTrace();
   }
   System.out.println("Search Success");
}

}
