Lucene: Two Ways to Build an Index, and Boosting

青岛欢迎您

Category: Search Engines. Tags: lucene

Article link: https://blog.csdn.net/liberty12345678/article/details/82464344

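1. Building an index from files, for example files with the .txt extension

Steps:

Step 1: FSDirectory.open(Paths.get(url)); opens the directory in which the index is stored, based on the given path.

FSDirectory: a Directory backed by the file system. RAMDirectory: a Directory held in memory (deprecated in later Lucene releases in favor of ByteBuffersDirectory).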

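Paths is a class from NIO (new I/O; more precisely the java.nio.file package introduced in Java 7). Path is the modern replacement for java.io.File: where the old API writes File file = new File("index.html"), the new one writes Path path = Paths.get("index.html"). Because a Path is built from a string, the resource it refers to may not actually exist.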
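The point about a Path possibly referring to nothing can be illustrated with a minimal JDK-only sketch. The class name PathDemo and the file name are made up for illustration and do not come from the original article.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class PathDemo {

    // Builds a Path from a string; no file system access happens here.
    static Path describe(String name) {
        return Paths.get(name);
    }

    // Existence has to be checked explicitly, e.g. with Files.exists.
    static boolean exists(String name) {
        return Files.exists(Paths.get(name));
    }

    public static void main(String[] args) {
        String name = "no-such-file-for-demo.html"; // hypothetical file name
        File file = new File(name);   // legacy java.io API
        Path path = Paths.get(name);  // java.nio.file API

        // Both objects exist even if the file itself does not.
        System.out.println(file.exists());
        System.out.println(Files.exists(path));

        // The two APIs convert into each other.
        System.out.println(file.toPath().equals(path));
        System.out.println(path.toFile().equals(file));
    }
}
```

This is also why FSDirectory.open can be handed a Path for an index directory before that directory contains anything.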

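About NIO: traditional I/O is stream-oriented. Data is read or written sequentially as a stream of bytes, and an unbuffered stream that handles one byte per call is usually inefficient. NIO (new I/O) is organized around channels and buffers and moves data in blocks. It also supports memory-mapped files: a file, or a region of a file, is mapped into memory so that it can be accessed like ordinary memory (this mirrors the operating system's virtual memory concept). For large files this is often considerably faster than plain stream I/O.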
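As a rough illustration of the memory-mapped approach described above, the following JDK-only sketch maps a small temporary file into memory and reads it back. The class and method names (MappedReadDemo, readMapped, roundTrip) are hypothetical, and the example says nothing about how Lucene itself reads index files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedReadDemo {

    // Reads a whole file through a read-only memory mapping.
    static String readMapped(Path file) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            byte[] bytes = new byte[buffer.remaining()];
            buffer.get(bytes); // copy from the mapped region into the array
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes text to a temporary file, then reads it back via the mapping.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("nio-demo", ".txt");
            tmp.toFile().deleteOnExit(); // clean up when the JVM exits
            Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
            return readMapped(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello lucene"));
    }
}
```

For a file this small the mapping brings no benefit; the technique pays off mainly for large files that are read repeatedly or at random offsets.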

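Step 2: new IndexWriter(Directory, IndexWriterConfig) creates the writer used to build the index.

Step 3: index the files in the specified directory.

Step 4: write each file into a Lucene Document.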

package com.wp.util;

import java.io.File;
import java.io.FileReader;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class Indexer {

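    private IndexWriter writer; // index writer instance

    /**
     * Constructor: instantiates the IndexWriter.
     *
     * @param indexDir directory in which the index is stored
     * @throws Exception if the index directory cannot be opened
     */
    public Indexer(String indexDir) throws Exception {
        Directory dir = FSDirectory.open(Paths.get(indexDir)); // open the directory that stores the index
        Analyzer analyzer = new StandardAnalyzer(); // polymorphism: StandardAnalyzer is the standard Analyzer implementation
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        writer = new IndexWriter(dir, iwc);
    }

    /**
     * Closes the index writer.
     *
     * @throws Exception if closing the writer fails
     */
    public void close() throws Exception {
        writer.close();
    }

    /**
     * Indexes the specified directory's ... (the listing breaks off here in the source)
     */
}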
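
Steps 3 and 4 are not visible in the listing above. The following is a minimal sketch of how they could look, not the author's code: the class name FileIndexSketch and the field names "contents", "fileName" and "fullPath" are assumptions, and it presumes a Lucene version in which FSDirectory.open accepts a Path and StandardAnalyzer has a no-argument constructor, as the listing above does.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

class FileIndexSketch {

    // Indexes every regular file directly under dataDir into indexDir
    // and returns how many files were added.
    static int indexDirectory(String indexDir, String dataDir) throws IOException {
        File[] files = new File(dataDir).listFiles();
        if (files == null) {
            throw new IOException("Not a readable directory: " + dataDir);
        }
        int count = 0;
        try (Directory dir = FSDirectory.open(Paths.get(indexDir));
             IndexWriter writer = new IndexWriter(
                     dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            for (File f : files) {
                if (!f.isFile()) {
                    continue; // skip subdirectories
                }
                // FileReader uses the platform default charset.
                try (Reader reader = new FileReader(f)) {
                    Document doc = new Document();
                    // Tokenized and indexed, but not stored (assumed field name).
                    doc.add(new TextField("contents", reader));
                    // Indexed as single tokens and stored (assumed field names).
                    doc.add(new StringField("fileName", f.getName(), Field.Store.YES));
                    doc.add(new StringField("fullPath", f.getCanonicalPath(), Field.Store.YES));
                    writer.addDocument(doc);
                    count++;
                }
            }
        } // closing the writer commits the pending documents
        return count;
    }
}
```

A call such as FileIndexSketch.indexDirectory("/tmp/index", "/tmp/data") (both paths hypothetical) would index the files in the data directory and leave the segment files in the index directory.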
