Having just wrapped up a project, I looked back over the development process and decided to write a summary of the caching issues and the use of Lucene (full-text search).
First, Lucene.
The service tier was built with Spring + Hibernate, and Lucene handles full-text search. The Lucene version is 2.2; word segmentation is done with JE-Analysis 1.5.1's MMAnalyzer, and index building goes through a queue.
We first initialize the index path in blogService, which is wired up in the Spring configuration file:
<bean id="blogService" class="cn.shell.service.BlogService"
      parent="baseService">
    <property name="indexPathRoot"
              value="${webapp.root}WEB-INF/index/blog/">
    </property>
</bean>
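Spring injects the `indexPathRoot` property through a matching setter on the bean class. A minimal sketch of the receiving side (the setter/getter pair here is assumed for illustration, not taken from the project):

```java
// Hypothetical skeleton of the bean the Spring config above wires up.
// The setter name must match the <property name="indexPathRoot"> element.
public class BlogService {
    private String indexPathRoot;

    // Spring calls this setter during bean initialization
    public void setIndexPathRoot(String indexPathRoot) {
        this.indexPathRoot = indexPathRoot;
    }

    public String getIndexPathRoot() {
        return indexPathRoot;
    }
}
```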
The main requirement on this project was performance, so indexing is done asynchronously through a queue. The fields involved:
List waitToIndexList = new LinkedList();   // blogs waiting to be indexed
IndexWriter indexWriter;                   // writes documents into the index
IndexSearcher indexSearcher;               // serves queries
Thread indexThread = new Thread(this);     // background indexing thread
MMAnalyzer analyzer = new MMAnalyzer();    // JE-Analysis word segmenter
Initialization:
public void init() {
    File rp = new File(indexPathRoot);
    if (!rp.exists()) {
        rp.mkdirs();
    }
    // only create a fresh index if no segments file exists yet
    File segments = new File(indexPathRoot + File.separator + "segments.gen");
    boolean bCreate = true;
    if (segments.exists()) {
        bCreate = false;
    }
    try {
        indexWriter = new IndexWriter(indexPathRoot, analyzer, bCreate);
        indexSearcher = new IndexSearcher(indexPathRoot);
    } catch (Exception e) {
        logger.error("init indexWriter fail", e);
    }
    indexThread.start(); // start the indexing thread
}
public void run() {
    while (!indexThread.isInterrupted()) {
        if (!waitToIndexList.isEmpty()) {
            Blog blog = (Blog) waitToIndexList.remove(0);
            Document doc = new Document();
            // blogID is stored but not tokenized, so it can be read back verbatim
            doc.add(new Field("blogID", blog.getBlogID(), Field.Store.YES,
                    Field.Index.UN_TOKENIZED));
            doc.add(new Field("title", blog.getTitle(), Field.Store.YES,
                    Field.Index.TOKENIZED));
            doc.add(new Field("content", blog.getContent(),
                    Field.Store.YES, Field.Index.TOKENIZED));
            doc.add(new Field("author", blog.getClientUser().getNickName(),
                    Field.Store.YES, Field.Index.TOKENIZED));
            try {
                indexWriter.addDocument(doc);
                indexWriter.flush();
                indexWriter.optimize(); // note: optimizing after every document is expensive
            } catch (Exception e) {
                logger.error("create index error", e);
            }
        }
        try {
            Thread.sleep(50); // poll the queue every 50 ms
        } catch (Exception e) {
            logger.error(e);
        }
    }
}
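One caveat about the loop above: a plain LinkedList is mutated from both the request thread (addBlog) and the index thread without synchronization, which is not thread-safe, and the sleep call is busy polling. A minimal sketch of a safer hand-off using java.util.concurrent (the class names here are illustrative, not from the project):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative stand-in for the project's Blog entity
class BlogItem {
    final String id;
    BlogItem(String id) { this.id = id; }
}

public class IndexQueue {
    // LinkedBlockingQueue is thread-safe; no manual locking needed
    private final BlockingQueue<BlogItem> queue = new LinkedBlockingQueue<BlogItem>();

    // called from the request thread; the queue is unbounded, so offer always succeeds
    public void enqueue(BlogItem b) {
        queue.offer(b);
    }

    // called from the index thread; blocks until an item arrives, replacing sleep polling
    public BlogItem next() throws InterruptedException {
        return queue.take();
    }

    // number of blogs still waiting to be indexed
    public int pending() {
        return queue.size();
    }
}
```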
Each time a new Blog object is created, we push it onto the waitToIndexList queue:
public void addBlog(Blog blog) {
    AddBlogTask saveBlogTask = new AddBlogTask(blog, blogDAO);
    asyncService.doTask(saveBlogTask); // persist asynchronously
    // update the cache
    String k = "KEY_BLOG" + blog.getBlogID();
    cacheService.put(k, blog);
    this.waitToIndexList.add(blog);    // hand off to the index thread
}
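For symmetry, a read path would check the same "KEY_BLOG" key before falling back to the DAO. A minimal read-through sketch, with a plain HashMap standing in for cacheService (both the class and method names here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class BlogCache {
    // stand-in for the project's cacheService
    private final Map<String, Object> cache = new HashMap<String, Object>();

    // mirrors the put in addBlog: the key is "KEY_BLOG" + blogID
    public void putBlog(String blogID, Object blog) {
        cache.put("KEY_BLOG" + blogID, blog);
    }

    // read-through counterpart: returns null on a miss,
    // in which case the caller would load from the DAO and re-cache
    public Object getBlog(String blogID) {
        return cache.get("KEY_BLOG" + blogID);
    }
}
```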
The search part:
public Hits searchBlogByLucene(String keyword) {
    // try the cache first
    Hits hits = (Hits) cacheService.get("BLOG_SEARCH_" + keyword);
    if (hits == null) {
        try {
            MultiFieldQueryParser queryParser = new MultiFieldQueryParser(
                    new String[] { "title", "content", "author" }, analyzer);
            Query query = queryParser.parse(keyword);
            hits = indexSearcher.search(query);
            // on a cache miss, store the result for later lookups
            // (inside the try, so a failed search does not cache null)
            cacheService.put("BLOG_SEARCH_" + keyword, hits);
        } catch (Exception e) {
            logger.error("search " + keyword, e);
        }
    }
    return hits;
} // flyweight pattern: repeated keywords share one cached result
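One thing to watch here: in Lucene 2.x, a Hits object fetches documents lazily through the IndexSearcher that produced it, so a cached Hits can break once that searcher is closed or reopened. A more robust variant caches only the resolved blog-ID array, which stays valid across searcher reopens. A sketch, again with a HashMap standing in for cacheService (names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class SearchCache {
    // keyed the same way as the original: "BLOG_SEARCH_" + keyword
    private final Map<String, String[]> cache = new HashMap<String, String[]>();

    public String[] get(String keyword) {
        return cache.get("BLOG_SEARCH_" + keyword);
    }

    // cache only the plain ID array extracted from the hits,
    // not the Hits object tied to a live searcher
    public void put(String keyword, String[] blogIDs) {
        cache.put("BLOG_SEARCH_" + keyword, blogIDs);
    }
}
```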
Fetching the result page:
public List searchBlogs(String keyword, int off, int max) {
    Hits hits = searchBlogByLucene(keyword);
    if (hits == null) {
        return Collections.EMPTY_LIST; // search failed, nothing to page through
    }
    String[] ids = new String[hits.length()];
    for (int i = 0; i < hits.length(); i++) {
        try {
            Document docTemp = hits.doc(i);
            ids[i] = docTemp.get("blogID");
        } catch (Exception e) {
            logger.error("read hit " + i + " failed", e);
        }
    }
    // resolve the IDs against the database, with paging
    List hitsList = blogDAO.getBlogsByBlogIDS(ids, off, max);
    return hitsList;
}
Size of the result set:
public int getSearchBlogsCount(String keyword) {
    Hits hits = searchBlogByLucene(keyword);
    if (hits != null) {
        return hits.length();
    }
    return 0;
}