Recently, to optimize queries and get rid of the LIKE statements in our database, I did some reading and found that full-text search is the answer. In that space, Lucene is the best-known choice, so I set out to study it until I understood it well enough to use.
There are several Lucene ports; Lucene.Net is the version for C#/.NET, and the concepts are exactly the same as in the Java original.
Download the DLL (it is easy to find online) and add a reference to Lucene.Net.dll; note that in code you have to use the full namespaces:
Code
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Cn; // requires a separate download of Lucene.Net.Analysis.Cn.dll, referenced in the project
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;
Approach:
1. First create an IndexWriter, then put every piece of content to be indexed into a Lucene.Net.Documents.Document, and add that Document to the writer:
Building the index
private void Index()
{
    DBMaster dbm = new DBMaster();
    DataTable dt = dbm.GetDt("select * from Sheet");
    IndexWriter writer = new IndexWriter(directory, analyzer, true);
    foreach (DataRow dr in dt.Rows)
    {
        Document doc = new Document();
        // TOKENIZED: analyze the text, then index it.
        // UN_TOKENIZED: index without analyzing (fails for Chinese text).
        // Legacy factory-method equivalents:
        //   Text(String, String) <=> Store.YES, Index.TOKENIZED
        //   Keyword              <=> Store.YES, Index.UN_TOKENIZED
        //   UnIndexed            <=> Store.YES, Index.NO
        //   UnStored             <=> Store.NO,  Index.TOKENIZED
        //   Text(String, Reader) <=> Store.NO,  Index.TOKENIZED
        doc.Add(new Field(FieldName, dr[0].ToString(), Field.Store.YES, Field.Index.TOKENIZED));
        writer.AddDocument(doc);
    }
    writer.Optimize();
    writer.Close();
}
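In a real LIKE-replacement scenario you usually want more than one field per Document: the row's primary key stored as an exact (untokenized) term so a hit can be mapped back to the database row, plus the searchable text. A minimal sketch against the same Lucene.Net 2.x API (the column names Id and AH are assumptions for illustration):

```csharp
// One Document per database row: an exact-match key field plus the
// analyzed, searchable text field.
Document doc = new Document();
// Primary key: stored and indexed as a single token, so a hit can be
// traced back to the original row with a cheap keyed lookup.
doc.Add(new Field("Id", dr["Id"].ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED));
// Searchable content: split into terms by the analyzer.
doc.Add(new Field("AH", dr["AH"].ToString(), Field.Store.YES, Field.Index.TOKENIZED));
writer.AddDocument(doc);
```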
2. Create an IndexSearcher and query it with Search(Query). First build a Query: instantiate a QueryParser with new QueryParser(FieldName, analyzer) and call its Parse method. Then pass the result to Search(Query), which returns a Hits object holding the search results.
Searching the index
public string Search()
{
    string restr = "";
    IndexSearcher searcher = new IndexSearcher(directory); // create the IndexSearcher
    QueryParser queryparser = new QueryParser(FieldName, analyzer); // FieldName: name of the indexed field; define it yourself, but it must match the one used by the IndexWriter
    Query query = queryparser.Parse("李娟"); // build a query with QueryParser.Parse
    Hits hits = searcher.Search(query); // search results; hits ---- {weight(AH:"李 娟")}  FieldName: the indexed content
    restr += String.Format("Matching records: {0}; total documents in index: {1}", hits.Length(), searcher.Reader.NumDocs()) + "<br/>"; // format the output however you like
    for (int i = 0; i < hits.Length(); i++)
    {
        int docId = hits.Id(i);
        string name = hits.Doc(i).Get(FieldName); // "李 娟"  Get("AH")
        float score = hits.Score(i);
        restr += String.Format("{0}: DocId:{1}; AH:{2}; Score:{3}", i + 1, docId, name, score);
    }
    searcher.Close();
    return restr;
}
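QueryParser runs the query string through the analyzer, which is why "李娟" shows up as the phrase "李 娟" in the hits dump above. When you want an exact, analyzer-free lookup (for example against an untokenized key field), you can build the query programmatically instead; a sketch against the same Lucene.Net 2.x API:

```csharp
// Bypass QueryParser: match a term exactly as it was indexed,
// with no analysis applied to the query string.
Query query = new TermQuery(new Term(FieldName, "李娟"));
Hits hits = searcher.Search(query);

// Several terms can be combined with a BooleanQuery;
// SHOULD means "match any of these".
BooleanQuery combined = new BooleanQuery();
combined.Add(new TermQuery(new Term(FieldName, "李娟")), BooleanClause.Occur.SHOULD);
combined.Add(new TermQuery(new Term(FieldName, "王刚")), BooleanClause.Occur.SHOULD);
```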
3. Putting it together: create the IndexWriter first, then the IndexSearcher; the searcher's output is the query result:
Putting it together
private const string FieldName = "AH";
private Directory directory = new RAMDirectory();
private Analyzer analyzer = new StandardAnalyzer();
//private ChineseAnalyzer analyzer = new ChineseAnalyzer(); // for Chinese search you are supposed to reference Lucene.Net.Analysis.Cn.dll separately; yet without it, setting the Field parameter to Field.Index.TOKENIZED also works, and oddly, even with it referenced Chinese search still fails!
public Searcher()
{
    Index();
    string outstr = Search();
}
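One caveat about the integration above: the RAMDirectory is rebuilt from the database on every construction, which rather defeats the purpose of replacing LIKE queries. To persist the index to disk between runs, an FSDirectory can be swapped in; a sketch assuming the Lucene.Net 2.x FSDirectory.GetDirectory call, with a made-up path:

```csharp
// Persist the index on disk instead of rebuilding it in memory each time.
// Second argument: true = create/overwrite the index, false = open existing.
Directory directory = FSDirectory.GetDirectory(@"C:\LuceneIndex", true);

// Build the index once (e.g. in a scheduled job); later, searches just
// open the existing index without touching the database:
Directory existing = FSDirectory.GetDirectory(@"C:\LuceneIndex", false);
IndexSearcher searcher = new IndexSearcher(existing);
```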
Summary: the idea is simple, but puzzling it out on my own took me from clueless to comfortable. I will keep digging into Lucene; it really is a pleasure to use! To be continued...
http://www.cnblogs.com/winvay/articles/1587349.html