Lucene (1): Package Organization

Lucene is a Java-based full-text indexing toolkit.
Reference: http://lucene.apache.org/java/docs/index.html
This article covers the features of Lucene 2.0.
The Lucene API is organized into the following packages:
Package                                   Summary; commonly used classes
org.apache.lucene.analysis                Converts text into tokens; abstract class Analyzer, final class Token
org.apache.lucene.analysis.br
org.apache.lucene.analysis.cjk
org.apache.lucene.analysis.cn             Chinese lexical analysis; ChineseAnalyzer, ChineseFilter, ChineseTokenizer
org.apache.lucene.analysis.cz
org.apache.lucene.analysis.de
org.apache.lucene.analysis.el
org.apache.lucene.analysis.fr
org.apache.lucene.analysis.nl
org.apache.lucene.analysis.ru
org.apache.lucene.analysis.snowball
org.apache.lucene.analysis.standard
org.apache.lucene.ant
org.apache.lucene.document                Holds extracted text as documents and fields; final class Document, final class Field
org.apache.lucene.index                   Maintains and accesses the index; IndexWriter, abstract class IndexReader, final class Term
org.apache.lucene.index.memory
org.apache.lucene.misc
org.apache.lucene.queryParser
org.apache.lucene.queryParser.analyzing
org.apache.lucene.queryParser.precedence
org.apache.lucene.search                  Queries the index; abstract class Query, abstract class Searcher, final class Hits, IndexSearcher
org.apache.lucene.search.highlight
org.apache.lucene.search.regex
org.apache.lucene.search.similar
org.apache.lucene.search.spans
org.apache.lucene.search.spell
org.apache.lucene.store                   Binary I/O API, used for all index data; abstract class Directory, FSDirectory
org.apache.lucene.swing.models
org.apache.lucene.util
org.apache.lucene.wordnet
org.apache.regexp
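
To make the division of labor among the analysis, index, and search packages more concrete, here is a minimal sketch of the underlying idea: an inverted index that maps each token to the documents containing it. The class ToyInvertedIndex and its methods are hypothetical names invented for this illustration. It uses only the Java standard library and is not how Lucene itself is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index; a hypothetical illustration, not Lucene code. */
final class ToyInvertedIndex {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** "Analysis" role: lower-case the text and split it on non-alphanumeric characters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** "Index" role: record docId under every token of the text. */
    void add(int docId, String text) {
        for (String token : tokenize(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** "Search" role: ids of the documents containing a single term, in ascending order. */
    List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(0, "Lucene is a full-text indexing toolkit");
        index.add(1, "A toolkit written in Java");
        System.out.println(index.search("toolkit")); // prints [0, 1]
        System.out.println(index.search("lucene"));  // prints [0]
    }
}
```

In Lucene the same three roles are played, roughly, by an Analyzer, an IndexWriter, and an IndexSearcher, with the index data persisted through the store package rather than held in a map.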

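An indexing example, Indexer.java. It builds an index for the *.txt documents under a given directory, which may contain nested subdirectories. Each indexed document has three fields: the file name, the file contents, and the creation date (the time the file was indexed).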

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

/**
 * Builds a Lucene index for the *.txt files under a data directory.
 *
 * @author jacky
 */
public class Indexer {

    public static void main(String[] args)throws Exception {
        // Adjust both paths to your environment.
        String indexDir = "d:/luceneTest/index";
        String dataDir = "d:/luceneTest/datas";
        File indexDirF = new File(indexDir);
        File dataDirF = new File(dataDir);
        long startTime = System.currentTimeMillis();
        index(indexDirF,dataDirF);
        long endTime = System.currentTimeMillis();
        System.out.println("Index creation took " + (endTime - startTime) + " ms");
   
    }
    public static int index(File indexDirF,File dataDirF) throws IOException
    {
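        if(!dataDirF.exists() || !dataDirF.isDirectory())
        {
            throw new IOException(dataDirF + " does not exist or is not a directory");
        }
        // true: create a new index, overwriting any index already in indexDirF
        IndexWriter writer = new IndexWriter(indexDirF,new StandardAnalyzer(),true);
        writer.setInfoStream(System.out);   // print indexing diagnostics to stdout
        writer.setUseCompoundFile(false);   // keep separate index files instead of one compound file
        indexDirectory(writer,dataDirF);
        int indexNum = writer.docCount();   // number of documents in the index
        writer.optimize();                  // merge segments for faster searching
        writer.close();
        return indexNum;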
    }
   
    public static void indexDirectory(IndexWriter writer,File file) throws IOException
    {
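        File[] files = file.listFiles();
        if(files == null)
        {
            return; // not a directory, or an I/O error occurred
        }
        for (int i = 0; i < files.length; i++) {
            if(files[i].isDirectory())
            {
                // recurse into the subdirectory itself, not into its parent
                indexDirectory(writer,files[i]);
            }
            else if(files[i].getName().endsWith(".txt"))
            {
                System.out.println("file name: "+files[i].getName());
                indexFile(writer,files[i]);
            }
        }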
    }
   
    public static void indexFile(IndexWriter writer,File file) throws IOException
    {
        if(file.isHidden()||!file.exists()||!file.canRead())
        {
            return;
        }
        Document doc = new Document();
        // indexDate: time of indexing; stored but not searchable
        doc.add(new Field("indexDate",new Date().toString(),Field.Store.YES,Field.Index.NO));
        // context: file contents read through a Reader; tokenized and indexed but not stored
        doc.add(new Field("context",new FileReader(file)));
        // fileName: canonical path; stored but not searchable
        doc.add(new Field("fileName",file.getCanonicalPath(),Field.Store.YES,Field.Index.NO));
        writer.addDocument(doc);
    }
}
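
The recursive directory walk in indexDirectory is the part of this example that is easiest to get wrong, so it can help to exercise it on its own, without Lucene on the classpath. The following is a minimal sketch under that assumption. TxtFileCollector and collect are hypothetical names introduced here and are not part of the original Indexer. The sketch applies the same filters (a .txt suffix, not hidden, readable) and returns the matching files instead of indexing them.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original Indexer. */
final class TxtFileCollector {

    /** Recursively collects every readable, non-hidden *.txt file under dir. */
    static List<File> collect(File dir) {
        List<File> result = new ArrayList<>();
        File[] entries = dir.listFiles();
        if (entries == null) {
            // dir is not a directory, or it could not be read
            return result;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                // descend into the child directory, not into dir itself
                result.addAll(collect(entry));
            } else if (entry.getName().endsWith(".txt")
                    && !entry.isHidden() && entry.canRead()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Walk the directory given on the command line, or the current directory.
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File f : collect(root)) {
            System.out.println(f.getPath());
        }
    }
}
```

Running it against the data directory should list the same files that Indexer reports as it indexes them, which gives a quick check that nested subdirectories are actually visited.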

