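Lucene is a powerful full-text search library, and many search systems are built on top of it. Its versions differ noticeably from one another, though: in my experience an index is safest when read with the same API version that built it. Compatibility across versions is not great, possibly for performance reasons, since older APIs tend to be removed once they have been optimized or replaced.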
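I ran into this recently while implementing near-real-time (NRT) search. Before Lucene 4.4 the usual approach was NRTManager: a background thread periodically refreshes the index reader, so the caller does not have to care about it. You add, update, and delete documents through NRTManager, obtain an IndexSearcher through it, and newly added documents become searchable almost immediately. The code looks like this:
Initialize NRTManager and start a background thread: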
// Open the on-disk index and create the writer.
Directory d = FSDirectory.open(new File("c:/index"));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_35, analyzer);
IndexWriter indexWriter = new IndexWriter(d, iwc);
// The warmer is a no-op here; it could run warm-up queries against each new searcher.
NRTManager nrtMgr = new NRTManager(indexWriter, new SearcherWarmer() {
    @Override
    public void warm(IndexSearcher s) throws IOException {
    }
});
// targetMaxStaleSec = 5.0, targetMinStaleSec = 0.025
NRTManagerReopenThread nrtManagerReopenThread = new NRTManagerReopenThread(nrtMgr, 5.0, 0.025);
nrtManagerReopenThread.setName("nrt reopen thread");
nrtManagerReopenThread.setDaemon(true);
nrtManagerReopenThread.start();
// true: searchers from this manager have all deletes applied.
SearcherManager mgr = nrtMgr.getSearcherManager(true);
Add a document:
nrtMgr.addDocument(...);
Acquire a searcher:
IndexSearcher indexSearcher = mgr.acquire();
......
mgr.release(indexSearcher);
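From Lucene 4.4 on, NRTManager and NRTManagerReopenThread no longer exist. Near-real-time search is now done with TrackingIndexWriter, SearcherManager, and ControlledRealTimeReopenThread, like this:
Initialization: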
Directory directory = new RAMDirectory();
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
IndexWriter indexWriter = new IndexWriter(directory, iwc);
// All changes go through the tracking writer so the reopen thread can follow them.
TrackingIndexWriter trackWriter = new TrackingIndexWriter(indexWriter);
SearcherManager searcherManager = new SearcherManager(indexWriter, true, new SearcherFactory());
// targetMaxStaleSec = 5.0, targetMinStaleSec = 0.025
ControlledRealTimeReopenThread<IndexSearcher> CRTReopenThread =
    new ControlledRealTimeReopenThread<IndexSearcher>(trackWriter, searcherManager, 5.0, 0.025);
CRTReopenThread.setDaemon(true);
CRTReopenThread.setName("background reopen thread");
CRTReopenThread.start();
Add a document:
trackWriter.addDocument(doc);
Search:
IndexSearcher searcher = searcherManager.acquire();
......
searcherManager.release(searcher);
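To tie the 4.x steps together, here is a minimal end-to-end sketch. It assumes Lucene 4.8 (lucene-core and lucene-analyzers-common) on the classpath. The class name NrtSketch, the method countAfterAdd, and the field name "body" are made up for illustration. It goes slightly beyond the snippets above in two ways: it passes the generation returned by trackWriter.addDocument to waitForGeneration, so the search does not run until the reopen thread has made that change visible, and it wraps acquire/release in try/finally so the searcher is always released.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TrackingIndexWriter;
import org.apache.lucene.search.ControlledRealTimeReopenThread;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical helper class, for illustration only (assumes Lucene 4.8).
final class NrtSketch {

    // Indexes one document with the given text, waits until it is visible,
    // and returns how many documents match the given term in the "body" field.
    static int countAfterAdd(String text, String term)
            throws IOException, InterruptedException {
        Directory directory = new RAMDirectory();
        IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));
        IndexWriter indexWriter = new IndexWriter(directory, iwc);
        TrackingIndexWriter trackWriter = new TrackingIndexWriter(indexWriter);
        SearcherManager searcherManager =
                new SearcherManager(indexWriter, true, new SearcherFactory());
        ControlledRealTimeReopenThread<IndexSearcher> reopenThread =
                new ControlledRealTimeReopenThread<IndexSearcher>(
                        trackWriter, searcherManager, 5.0, 0.025);
        reopenThread.setDaemon(true);
        reopenThread.start();
        try {
            Document doc = new Document();
            doc.add(new TextField("body", text, Field.Store.NO));
            // The tracking writer returns a generation for every change.
            long generation = trackWriter.addDocument(doc);
            // Block until a searcher that includes this generation is available.
            reopenThread.waitForGeneration(generation);

            IndexSearcher searcher = searcherManager.acquire();
            try {
                TermQuery query = new TermQuery(new Term("body", term));
                return searcher.search(query, 10).totalHits;
            } finally {
                // Always hand the searcher back, even if the search throws.
                searcherManager.release(searcher);
            }
        } finally {
            // Stop the reopen thread first, then close the remaining resources.
            reopenThread.close();
            searcherManager.close();
            indexWriter.close();
            directory.close();
        }
    }
}
```

For example, NrtSketch.countAfterAdd("hello near real time search", "hello") should return 1. If a caller must not block indefinitely, waitForGeneration also has an overload that takes a maximum wait in milliseconds and returns whether the generation became visible in time.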
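I spent a long time digging through the source code before I found the ControlledRealTimeReopenThread class, so when developing with Lucene, reading the source really does matter.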