My First "Search Engine": Indexing

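This post walks through an example of building an index of text files with Java and Lucene. Using the MMAnalyzer word segmenter, the program iterates over every .txt file in a given directory and indexes each file's name and content. The author also shares a problem encountered during development and how it was solved.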

Before writing the indexer, let's first get to know replace.

replace

public String replace(char oldChar,char newChar)
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar. If the character oldChar does not occur in the character sequence represented by this String object, then a reference to this String object is returned. Otherwise, a new String object is created that represents a character sequence identical to the character sequence represented by this String object, except that every occurrence of oldChar is replaced by an occurrence of newChar.

Examples:

 "mesquite in your cellar".replace('e', 'o')
         returns "mosquito in your collar"
 "the war of baronets".replace('r', 'y')
         returns "the way of bayonets"
 "sparring with a purple porpoise".replace('p', 't')
         returns "starring with a turtle tortoise"
 "JonL".replace('q', 'x') returns "JonL" (no change)

Parameters:
oldChar - the old character. 
newChar - the new character.
Returns:
a string derived from this string by replacing every occurrence of oldChar with newChar.
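
As a quick check of the behaviour described above, here is a minimal sketch (the class name ReplaceDemo is made up for illustration). It reruns one of the Javadoc examples, shows that the original string is left untouched, and compares references in the case where oldChar does not occur.

```java
public class ReplaceDemo {
    public static void main(String[] args) {
        String original = "mesquite in your cellar";
        String replaced = original.replace('e', 'o');

        System.out.println(replaced);  // mosquito in your collar
        System.out.println(original);  // mesquite in your cellar (strings are immutable)

        String name = "JonL";
        // "JonL" contains no 'q'; the Javadoc says a reference to the
        // same String object is returned in that case.
        System.out.println(name.replace('q', 'x') == name);
    }
}
```

Because replace never modifies the receiver, the result has to be assigned to a variable or used directly; calling it and discarding the return value has no effect.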

Prerequisite code

 

package ch2.lucenedemo.process;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

import jeasy.analysis.MMAnalyzer;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

 

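This program cannot be compiled without two libraries on the classpath: the word-segmentation package (the one that provides jeasy.analysis.MMAnalyzer) and the Lucene API package.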

 

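public class IndexProcesser
{
    // Member variable: where the generated index files are stored
    private String INDEX_STORE_PATH = "d:/index";

    public void createIndex(String inputDir)
    {
        try
        {
            // Create an IndexWriter that uses MMAnalyzer as the word segmenter;
            // true means a new index is created, overwriting any existing one
            IndexWriter writer = new IndexWriter(INDEX_STORE_PATH,
                    new MMAnalyzer(), true);
            File filesDir = new File(inputDir);

            // Get all the files that need to be indexed.
            // listFiles() returns null when inputDir is not an existing directory.
            File[] files = filesDir.listFiles();
            if (files == null)
            {
                System.err.println("Cannot list directory: " + inputDir);
                writer.close();
                return;
            }

            int length = files.length;
            for (int i = 0; i < length; i++)
            {
                String fileName = files[i].getName();

                // Only index .txt files (endsWith also copes with names that have no dot)
                if (fileName.endsWith(".txt"))
                {
                    String content = loadFileToString(files[i]);
                    if (content == null)
                    {
                        continue; // the file could not be read, skip it
                    }

                    Document doc = new Document();
                    // Field will be studied later; Field.Index.TOKENIZED means
                    // the value is segmented into words first and then indexed
                    Field field = new Field("filename", fileName, Field.Store.YES, Field.Index.TOKENIZED);
                    doc.add(field);
                    field = new Field("content", content, Field.Store.NO, Field.Index.TOKENIZED);
                    doc.add(field);

                    writer.addDocument(doc);
                }
            }

            writer.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }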
    
    
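    public String loadFileToString(File file)
    {
        try
        {
            // FileReader decodes the file with the platform default charset
            BufferedReader br = new BufferedReader(new FileReader(file));
            StringBuilder sb = new StringBuilder();
            String line = br.readLine();
            while (line != null)
            {
                // readLine() strips the line terminator, so put one back;
                // otherwise the end of one line fuses with the start of the next
                sb.append(line).append('\n');
                line = br.readLine();
            }
            br.close();
            return sb.toString();
        }
        catch (IOException e)
        {
            e.printStackTrace();
            return null;
        }
    }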
    
    
    public static void main(String[] args)
    {
        IndexProcesser processor = new IndexProcesser();
        processor.createIndex("d:/textfolder");
    }
}

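
The loadFileToString method above reads a file line by line into one string. The following is a minimal standalone sketch of the same idea, not the code from the original program: it assumes Java 7 or later (for try-with-resources), and the class name FileLoader and the explicit charset parameter are illustrative additions. It keeps the line breaks and closes the reader even when reading fails.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

public class FileLoader {
    // Reads the whole file into a String, decoding it with the given charset.
    public static String loadFileToString(File file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader on both success and failure
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                // readLine() drops the line terminator, so add one back
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Passing the charset explicitly matters for Chinese text files: a file saved as GBK but decoded as UTF-8 (or the other way round) would hand garbled text to the word segmenter.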
 

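Writing this program was really frustrating. I kept getting a NullPointerException because files.length never had a value (File.listFiles() returns null when the path is not an existing directory). I searched through a lot of material without finding an answer, and in the end it turned out I had simply typed the path wrong: it should be d://textfolder, not d://testfolder. It nearly drove me crazy, but luckily I found it. ^_^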
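
One way to make this kind of mistake obvious is to check the result of listFiles() before using it. The sketch below is only an illustration under that assumption; the class TxtFileLister and its method are hypothetical names, not part of the program above. It uses only java.io, so it can be tried without Lucene.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class TxtFileLister {
    // Returns the .txt files directly inside inputDir,
    // or an empty list if inputDir cannot be listed.
    public static List<File> listTxtFiles(String inputDir) {
        List<File> result = new ArrayList<File>();
        File[] files = new File(inputDir).listFiles();
        if (files == null) {
            // listFiles() returns null when the path is not a directory
            // (for example, a mistyped folder name) or an I/O error occurs
            System.err.println("Cannot list directory: " + inputDir);
            return result;
        }
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(".txt")) {
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // With a mistyped path this prints a warning and 0 instead of crashing
        System.out.println(listTxtFiles("d:/testfolder").size());
    }
}
```

With a guard like this, a wrong path produces a message that names the offending directory rather than a bare NullPointerException.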
