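Lucene is a toolkit for building search engines. It deals only with indexing and searching text, and it can add indexing and search capabilities to your application.
Using Lucene:
1. Prepare the environment and add the JAR files (the core library, the analyzers, and the highlighter)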
2. Build the index
1) Create an analyzer: Analyzer analyzer = new StandardAnalyzer();
2) Create the IndexWriter: IndexWriter indexWriter = new IndexWriter("C:/luceneDemo", analyzer, MaxFieldLength.LIMITED);
3) Create a Document: Document doc = new Document();
4) Add Field objects to the Document: doc.add(new Field("title", title, Store.YES, Index.ANALYZED)); and doc.add(new Field("content", content, Store.YES, Index.ANALYZED));
5) Add the Document to the IndexWriter: indexWriter.addDocument(doc);
6) Close the IndexWriter: indexWriter.close();
3. Search
1) Create an analyzer (the same kind used for indexing): Analyzer analyzer = new StandardAnalyzer();
2) Create the IndexSearcher: IndexSearcher indexSearcher = new IndexSearcher("C:/luceneDemo");
3) Create the query parser: QueryParser queryParser = new QueryParser("content", analyzer);
4) Convert the query string into a Query object: Query query = queryParser.parse(queryString);
5) Run the query with the IndexSearcher to get the matching TopDocs: TopDocs topDocs = indexSearcher.search(query, null, 1000);
6) Loop over the hits and print the title and content of each one:
for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
    // scoreDoc.doc is the internal document id; load the stored fields with it
    Document doc = indexSearcher.doc(scoreDoc.doc);
    System.out.println("title:" + doc.getField("title").stringValue());
    System.out.println("content:" + doc.getField("content").stringValue());
}
7) Close the IndexSearcher: indexSearcher.close();
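The index-then-search flow above can be pictured with a toy inverted index in plain Java. This is only a sketch of the idea and not how Lucene is implemented; the class ToyInvertedIndex, its methods, and the simple lowercase-and-split analyzer are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy stand-in for the index/search flow above; NOT Lucene's implementation.
class ToyInvertedIndex {
    // term -> ids of the documents that contain it (sorted, no duplicates)
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    // stored text, addressed by document id (roughly what Store.YES keeps)
    private final List<String> storedContent = new ArrayList<>();

    // Hypothetical analyzer: lowercase, then split on non-letter/digit characters.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Counterpart of addDocument: store the text and index each of its tokens.
    int addDocument(String content) {
        int docId = storedContent.size();
        storedContent.add(content);
        for (String token : analyze(content)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Counterpart of a single-term search: ids of the documents containing the exact term.
    List<Integer> search(String term) {
        TreeSet<Integer> hits = postings.get(term);
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    // Counterpart of indexSearcher.doc(id): fetch the stored text.
    String content(int docId) {
        return storedContent.get(docId);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Lucene adds search to your application");
        index.addDocument("The meeting is in May");
        for (int docId : index.search("may")) {
            System.out.println("content:" + index.content(docId));
        }
    }
}
```

The lookup in search is an exact match on the indexed token, which is why a hand-built term has to be in the same form the analyzer produced.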
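4. MultiFieldQueryParser, a subclass of QueryParser, accepts an array of field names, so one query can search several parts of an article (for example, both title and content).
5. Searches behave as case-insensitive because StandardAnalyzer lowercases the text when it is indexed. A Term that you create by hand is not analyzed, so you must pass the lowercase form of the word for the field you are searching, e.g.:
Term term = new Term("content", "may"); // the content field contains the capitalized "May"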
6. Queries:
1) Term (keyword) query:
Term term = new Term("content", "文");
Query termQuery = new TermQuery(term);
2) Range query:
Term lowerTerm = new Term("size", "0100");
Term upperTerm = new Term("size", "0300");
RangeQuery rangeQuery = new RangeQuery(lowerTerm, upperTerm, true); // true = bounds are inclusive
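The bounds "0100" and "0300" are zero-padded, presumably because term ranges compare values as strings rather than as numbers. The following plain-Java sketch shows the effect of that comparison; PaddedRange, inRange, and pad are hypothetical names introduced here and are not part of Lucene.

```java
import java.util.Locale;

// Sketch of why range bounds such as "0100" and "0300" are zero-padded:
// the terms are compared as strings, not as numbers.
class PaddedRange {
    // Inclusive lexicographic range check on term text.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    // Hypothetical helper: left-pad a non-negative number with zeros to a fixed width.
    static String pad(int value, int width) {
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    public static void main(String[] args) {
        System.out.println(inRange(pad(250, 4), "0100", "0300")); // true: "0250" lies between the bounds
        System.out.println(inRange(pad(25, 4), "0100", "0300"));  // false: "0025" sorts before "0100"
        System.out.println(inRange("25", "100", "300"));          // true: unpadded strings give the wrong answer
    }
}
```

Padding every value to the same width at index time and at query time keeps string order and numeric order in agreement.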
3) Prefix query:
Term term = new Term("title", "luce");
Query query = new PrefixQuery(term);
4) Wildcard query:
Term term = new Term("content", "luce?e");
Query query = new WildcardQuery(term);
Note: ? stands for exactly one character, and * stands for zero or more characters.
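To make the ? and * rules concrete, here is a small plain-Java sketch that translates a wildcard pattern into a regular expression and tests terms against it. It only illustrates the matching rules in the note; it is not how WildcardQuery works internally, and WildcardSketch with its methods is a hypothetical name.

```java
import java.util.regex.Pattern;

// Illustrates the wildcard rules only; NOT WildcardQuery's implementation.
class WildcardSketch {
    // Translate a wildcard pattern: ? -> any one character, * -> zero or more characters.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                // every other character must match literally
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // True when the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("luce?e", "lucene")); // true: ? consumes the 'n'
        System.out.println(matches("luce?e", "lucee"));  // false: ? needs exactly one character
        System.out.println(matches("luce*", "luce"));    // true: * may match nothing
    }
}
```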
5) Boolean query (a document must match both clauses):
BooleanQuery booleanQuery = new BooleanQuery();
booleanQuery.add(termQuery, Occur.MUST);
booleanQuery.add(rangeQuery, Occur.MUST);
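Combining clauses that are all required amounts to intersecting their result sets. The plain-Java sketch below shows that idea on lists of document ids; it says nothing about how BooleanQuery evaluates or scores clauses internally, and MustClauses and allMust are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Set-intersection view of required clauses; NOT BooleanQuery's implementation.
class MustClauses {
    // Returns the ids that appear in every clause's hit list, in ascending order.
    static List<Integer> allMust(List<List<Integer>> clauseHits) {
        if (clauseHits.isEmpty()) {
            return new ArrayList<>();
        }
        TreeSet<Integer> result = new TreeSet<>(clauseHits.get(0));
        for (List<Integer> hits : clauseHits.subList(1, clauseHits.size())) {
            result.retainAll(hits); // keep only ids also matched by this clause
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<Integer> termHits = Arrays.asList(1, 2, 3);  // ids matched by the term clause
        List<Integer> rangeHits = Arrays.asList(2, 3, 4); // ids matched by the range clause
        System.out.println(allMust(Arrays.asList(termHits, rangeHits))); // [2, 3]
    }
}
```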