FuzzyQuery:
Creating the index:
IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), false);
writer.setUseCompoundFile(false);
Document doc1 = new Document();
Document doc2 = new Document();
Document doc3 = new Document();
Document doc4 = new Document();
Document doc5 = new Document();
Document doc6 = new Document();
Field f1 = new Field("content", "word", Field.Store.YES, Field.Index.TOKENIZED);
Field f2 = new Field("content", "work", Field.Store.YES, Field.Index.TOKENIZED);
Field f3 = new Field("content", "seed", Field.Store.YES, Field.Index.TOKENIZED);
Field f4 = new Field("content", "sword", Field.Store.YES, Field.Index.TOKENIZED);
Field f5 = new Field("content", "world", Field.Store.YES, Field.Index.TOKENIZED);
Field f6 = new Field("content", "ford", Field.Store.YES, Field.Index.TOKENIZED);
doc1.add(f1);
doc2.add(f2);
doc3.add(f3);
doc4.add(f4);
doc5.add(f5);
doc6.add(f6);
writer.addDocument(doc1);
writer.addDocument(doc2);
writer.addDocument(doc3);
writer.addDocument(doc4);
writer.addDocument(doc5);
writer.addDocument(doc6);
writer.close();
Note: the create parameter of the IndexWriter constructor is normally set to true.
Searching:
IndexSearcher searcher = new IndexSearcher(path);
// Build a Term, then run a fuzzy search against it
Term t = new Term("content", "work");
FuzzyQuery query = new FuzzyQuery(t);
// FuzzyQuery has two more constructors that limit how fuzzy the match may be.
// The default minimum similarity is 0.5; the smaller this value, the looser
// the required match and the more documents are returned, and vice versa.
FuzzyQuery query1 = new FuzzyQuery(t, 0.1f);
FuzzyQuery query2 = new FuzzyQuery(t, 0.1f, 1);
Hits hits = searcher.search(query2);
for (int i = 0; i < hits.length(); i++) {
    System.out.println(hits.doc(i));
}
searcher.close();
FuzzyQuery has three constructors; taking the third one as an example, the parameters work as follows: the first is of course the Term to match, the second is the minimum Levenshtein similarity, and the third is the number of leading characters that must match the term exactly.
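To make the minimum-similarity parameter concrete, here is a small stand-alone sketch (the class and method names are my own, not Lucene API). Lucene's fuzzy matching is based on the Levenshtein edit distance between the query term and each indexed term; the similarity score is, roughly, 1 minus the distance divided by the length of the shorter term.

```java
// Illustrative sketch of the distance/similarity test behind FuzzyQuery.
// Helper names are hypothetical, not Lucene API.
public class FuzzySketch {

    // Classic dynamic-programming Levenshtein edit distance.
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + cost);    // substitution
            }
        }
        return d[a.length()][b.length()];
    }

    // Similarity in the spirit of Lucene's fuzzy matching:
    // 1 - distance / length of the shorter term.
    static float similarity(String query, String term) {
        return 1.0f - (float) levenshtein(query, term)
                / Math.min(query.length(), term.length());
    }

    public static void main(String[] args) {
        String[] terms = {"word", "work", "seed", "sword", "world", "ford"};
        for (String term : terms) {
            // With prefixLength = 1, only terms whose first character is
            // 'w' are even considered: "seed", "sword", "ford" are skipped.
            boolean prefixOk = term.charAt(0) == 'w';
            System.out.println(term
                    + " distance=" + levenshtein("work", term)
                    + " similarity=" + similarity("work", term)
                    + " prefixOk=" + prefixOk);
        }
    }
}
```

Run against the six indexed terms, this shows why query2 (minimum similarity 0.1, prefix length 1) returns only "word", "work" and "world": the prefix check eliminates the other three before the similarity is even computed.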
Wildcard queries are even simpler: "*" matches zero or more characters, and "?" matches exactly one character:
IndexSearcher searcher = new IndexSearcher(path);
Term t1 = new Term("content", "?o*");
WildcardQuery query = new WildcardQuery(t1);
Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++) {
    System.out.println(hits.doc(i));
}
searcher.close();
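To see which of the indexed terms "?o*" actually matches, the wildcard semantics can be sketched with a plain regex translation (this is illustrative only; Lucene walks the term dictionary directly rather than using regexes, and the naive replacement below does not escape other regex metacharacters):

```java
import java.util.regex.Pattern;

public class WildcardSketch {
    // Translate Lucene-style wildcards to a regex: '?' -> '.', '*' -> '.*'.
    // Hypothetical helper for illustration; not how Lucene matches terms.
    static boolean wildcardMatch(String pattern, String term) {
        String regex = pattern.replace("?", ".").replace("*", ".*");
        return Pattern.matches(regex, term);
    }

    public static void main(String[] args) {
        String[] terms = {"word", "work", "seed", "sword", "world", "ford"};
        for (String term : terms) {
            System.out.println(term + " -> " + wildcardMatch("?o*", term));
        }
        // "?o*" = one arbitrary character, then 'o', then anything:
        // it matches word, work, world and ford, but not seed or sword
        // (whose second character is not 'o').
    }
}
```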
That's all!