Lucene Analysis (1)

hhhhhigh

Article link: https://blog.csdn.net/hhhhhigh/article/details/120691391

2021SC@SDUSC

How Lucene is used

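Indexing process

1) Start with a set of files to be indexed.
2) The files go through lexical analysis and language processing, which turns them into a series of terms (Term).
3) Index creation builds a dictionary and an inverted index (postings lists) from those terms.
4) Index storage writes the index to disk.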
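
Steps 2 and 3 above can be sketched in plain Java without Lucene. This is a minimal illustration under stated assumptions, not how Lucene implements indexing: the class `InvertedIndexSketch` and its methods `analyze` and `build` are hypothetical names, the "analysis" is just lowercasing and splitting on non-alphanumeric characters, and the index is kept in memory rather than written to disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Lucene.
class InvertedIndexSketch {

    // Step 2 (simplified): turn raw text into lowercase terms.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Step 3 (simplified): the sorted map keys act as the dictionary,
    // the values are postings lists of document ids in increasing order.
    static Map<String, List<Integer>> build(List<String> documents) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < documents.size(); docId++) {
            for (String term : analyze(documents.get(docId))) {
                List<Integer> postings = index.computeIfAbsent(term, k -> new ArrayList<>());
                // Doc ids arrive in increasing order, so checking the last
                // entry is enough to avoid recording a document twice.
                if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> documents = List.of(
                "Lucene is a search library",
                "Search engines use an inverted index",
                "Lucene builds an inverted index");
        // Print each dictionary term followed by its postings list.
        build(documents).forEach((term, postings) ->
                System.out.println(term + " -> " + postings));
    }
}
```

Using a sorted map keeps the dictionary in term order, which is convenient for lookups; a real index also stores term frequencies and positions, which this sketch leaves out.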

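Search process

1) The user enters a query.
2) The query goes through lexical analysis and language processing, which yields a series of terms (Term).
3) Syntax analysis turns the query into a query tree.
4) Index storage reads the index into memory.
5) The query tree is used to search the index, giving the document list (postings list) for each term (Term); these lists are combined by intersection, difference, and union to obtain the result documents.
6) The result documents are ranked by their relevance to the query.
7) The query results are returned to the user.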
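
The intersection, difference, and union in step 5 can be sketched as merges over sorted lists of document ids. This is a simplified illustration, assuming each postings list is sorted in increasing order with no duplicates; the class `PostingsOpsSketch` and its method names are hypothetical and are not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class PostingsOpsSketch {

    // AND: documents that appear in both lists.
    static List<Integer> intersect(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int x = a.get(i), y = b.get(j);
            if (x == y) {
                out.add(x);
                i++;
                j++;
            } else if (x < y) {
                i++;
            } else {
                j++;
            }
        }
        return out;
    }

    // OR: documents that appear in either list, without duplicates.
    static List<Integer> union(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() || j < b.size()) {
            if (j == b.size() || (i < a.size() && a.get(i) < b.get(j))) {
                out.add(a.get(i++));
            } else if (i == a.size() || b.get(j) < a.get(i)) {
                out.add(b.get(j++));
            } else {
                // Same document id in both lists: emit it once.
                out.add(a.get(i));
                i++;
                j++;
            }
        }
        return out;
    }

    // NOT: documents in a that do not appear in b.
    static List<Integer> difference(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int j = 0;
        for (int x : a) {
            // Skip entries of b that are smaller than the current id.
            while (j < b.size() && b.get(j) < x) {
                j++;
            }
            if (j == b.size() || b.get(j) != x) {
                out.add(x);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> lucene = List.of(1, 3, 5, 7);
        List<Integer> search = List.of(3, 4, 5, 8);
        System.out.println("AND: " + intersect(lucene, search));
        System.out.println("OR:  " + union(lucene, search));
        System.out.println("NOT: " + difference(lucene, search));
    }
}
```

Because both inputs are sorted, each operation walks the lists once, so the cost grows with the combined length of the two lists rather than their product.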

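Search depends on the index at every step, so understanding how Lucene indexes documents matters a great deal for the search module I am studying.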

Usage example:

Creating the index

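import java.io.File;
import java.io.FileReader;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class IndexTest {
    public static void main(String[] args) {
        // Directory containing the files to index.
        File fileDir = new File("F:\\document");
        IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, new StandardAnalyzer(Version.LUCENE_43));
        // CREATE overwrites any existing index in the target directory.
        config.setOpenMode(OpenMode.CREATE);
        // try-with-resources closes the writer even if indexing fails.
        try (IndexWriter writer = new IndexWriter(FSDirectory.open(new File("F:\\index")), config)) {
            File[] files = fileDir.listFiles();
            if (files == null) {
                throw new IllegalStateException("Not a readable directory: " + fileDir);
            }
            for (File file : files) {
                Document document = new Document();
                // "content" is tokenized and indexed from the file's text, but not stored.
                document.add(new TextField("content", new FileReader(file)));
                // "title" is indexed as a single term and stored so it can be shown in results.
                document.add(new StringField("title", file.getName(), Store.YES));
                writer.addDocument(document);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}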

Searching the index

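import java.io.File;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SearchTest {
    public static void main(String[] args) {
        // try-with-resources closes the reader when the search is done.
        try (IndexReader reader = DirectoryReader.open(FSDirectory.open(new File("F:\\index")))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Use the same analyzer as at index time so query terms match indexed terms.
            Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);

            // Query strings are parsed against the "content" field by default.
            QueryParser queryParser = new QueryParser(Version.LUCENE_43, "content", analyzer);
            Query query = queryParser.parse("lucene");

            // Retrieve the 10 highest-scoring documents.
            TopDocs topDocs = searcher.search(query, 10);
            ScoreDoc[] hits = topDocs.scoreDocs;

            for (int i = 0; i < hits.length; i++) {
                System.out.println("score:" + hits[i].score);
                System.out.println("title:" + searcher.doc(hits[i].doc).get("title"));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}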