Lucene is a Java-based full-text indexing toolkit.
Reference: http://lucene.apache.org/java/docs/index.html
This article discusses the features of Lucene 2.0.
The Lucene API is organized into the following packages:
Package name  Brief description; commonly used classes
org.apache.lucene.analysis Converts text into tokens; abstract class Analyzer, final class Token
org.apache.lucene.analysis.br
org.apache.lucene.analysis.cjk
org.apache.lucene.analysis.cn Chinese lexical analysis; ChineseAnalyzer, ChineseFilter, ChineseTokenizer
org.apache.lucene.analysis.cz
org.apache.lucene.analysis.de
org.apache.lucene.analysis.el
org.apache.lucene.analysis.fr
org.apache.lucene.analysis.nl
org.apache.lucene.analysis.ru
org.apache.lucene.analysis.snowball
org.apache.lucene.analysis.standard
org.apache.lucene.ant
org.apache.lucene.document Represents extracted text as documents and fields; final class Document, final class Field
org.apache.lucene.index Maintains and accesses the index; IndexWriter, abstract class IndexReader, final class Term
org.apache.lucene.index.memory
org.apache.lucene.misc
org.apache.lucene.queryParser
org.apache.lucene.queryParser.analyzing
org.apache.lucene.queryParser.precedence
org.apache.lucene.search Searches the index; abstract class Query, abstract class Searcher, final class Hits, IndexSearcher
org.apache.lucene.search.highlight
org.apache.lucene.search.regex
org.apache.lucene.search.similar
org.apache.lucene.search.spans
org.apache.lucene.search.spell
org.apache.lucene.store Binary I/O API used for all index data; abstract class Directory, FSDirectory
org.apache.lucene.swing.models
org.apache.lucene.util
org.apache.lucene.wordnet
org.apache.regexp
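
To make the division of labor between the analysis, index, and search packages more concrete, here is a minimal sketch of the same three steps using only the JDK. It is a toy inverted index, not Lucene code: the class ToyIndex and its methods analyze, add, and search are hypothetical names invented for this illustration, and real analyzers, index formats, and queries are far richer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy illustration only: none of these names are Lucene classes.
class ToyIndex {
    // term -> sorted ids of the documents that contain it (an inverted index)
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    // "Analysis" step: lowercase the text and split it on anything
    // that is not a letter or a digit.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "Index" step: store the document and record each of its terms.
    int add(String text) {
        int id = docs.size();
        docs.add(text);
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // "Search" step: look up a single term, analyzed the same way
    // as the indexed text, and return matching document ids.
    List<Integer> search(String term) {
        List<String> analyzed = analyze(term);
        if (analyzed.isEmpty()) {
            return new ArrayList<>();
        }
        TreeSet<Integer> hits = postings.get(analyzed.get(0));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("Lucene is a full-text indexing toolkit");
        index.add("An analyzer converts text into tokens");
        index.add("Searching the index for a query");
        System.out.println(index.search("text"));
        System.out.println(index.search("Lucene"));
    }
}
```

The point of the sketch is that the query term must pass through the same analysis as the indexed text, which is why the analyzer is shared between the indexing and searching sides.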
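An indexing example, Indexer.java, builds an index over the *.txt documents under a given directory, which may contain nested subdirectories. Each indexed document has three fields: file name, file contents, and creation date.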
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
/**
public static void main(String[] args) throws Exception {
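
The Indexer.java listing is cut off at this point. As a rough sketch of the part of the job that does not depend on Lucene, the following JDK-only code walks a directory tree, keeps the *.txt files, and gathers the three values described for each document: file name, file contents, and creation date. TxtCollector, Entry, and collect are hypothetical names made up for this sketch, and it relies on java.nio.file, which did not exist when Lucene 2.0 was released. In a real indexer, each Entry would presumably become a Document with three Field values handed to an IndexWriter.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene.
class TxtCollector {
    // Plain holder for the three values the article wants to index.
    static final class Entry {
        final String fileName;
        final String contents;
        final long createdMillis;

        Entry(String fileName, String contents, long createdMillis) {
            this.fileName = fileName;
            this.contents = contents;
            this.createdMillis = createdMillis;
        }
    }

    // Recursively collects every regular *.txt file under root.
    static List<Entry> collect(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".txt"))
                        .sorted()
                        .collect(Collectors.toList());
        }
        List<Entry> entries = new ArrayList<>();
        for (Path p : files) {
            BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
            // Assumption: the files are UTF-8 text; malformed bytes are replaced.
            String contents = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
            entries.add(new Entry(p.getFileName().toString(), contents,
                                  attrs.creationTime().toMillis()));
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Entry e : collect(root)) {
            System.out.println(e.fileName + " (" + e.contents.length()
                    + " chars, created at " + e.createdMillis + " ms since epoch)");
        }
    }
}
```

Note that some file systems do not record a true creation time, in which case creationTime() may return the last-modified time or another implementation-specific value.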