Lucene: A Full-Text Search Example

奎葵

Published 2020-05-22 16:38:51

Tags: java, lucene, index

Article link: https://blog.csdn.net/qq_44083614/article/details/106283032

Retrieving unstructured data

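Sequential scanning: read through every file for each query. Simple, but slow on large collections.
Full-text search: build an index first, then run searches against the index.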
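
To make the difference concrete, here is a minimal sketch in plain Java, with no Lucene involved. The class `InvertedIndexSketch`, its methods, and the sample file names are all made up for illustration, and the tokenizer is a deliberately crude stand-in for a real analyzer. It shows that a sequential scan re-reads every document on each query, while an index is built once and then answers a query with a single lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy comparison of sequential scanning and an inverted index (hypothetical names, not Lucene code). */
final class InvertedIndexSketch {

    /** Splits text into lower-case terms on runs of characters that are neither letters nor digits. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Sequential scan: every query re-reads the full text of every document. */
    static List<String> sequentialScan(Map<String, String> docs, String term) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (tokenize(e.getValue()).contains(term)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    /** Index once: map each term to the names of the documents that contain it. */
    static Map<String, List<String>> buildIndex(Map<String, String> docs) {
        Map<String, List<String>> index = new HashMap<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            for (String term : tokenize(e.getValue())) {
                List<String> postings = index.computeIfAbsent(term, k -> new ArrayList<>());
                if (!postings.contains(e.getKey())) {
                    postings.add(e.getKey());
                }
            }
        }
        return index;
    }

    /** Search: one map lookup; the documents themselves are not read again. */
    static List<String> search(Map<String, List<String>> index, String term) {
        return index.getOrDefault(term, Collections.emptyList());
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("a.txt", "Lucene builds an index before searching");
        docs.put("b.txt", "A sequential scan reads every file");
        docs.put("c.txt", "The index maps terms to files");

        Map<String, List<String>> index = buildIndex(docs);
        System.out.println(sequentialScan(docs, "index")); // [a.txt, c.txt]
        System.out.println(search(index, "index"));        // [a.txt, c.txt]
    }
}
```

Both calls return the same hits. The point of the sketch is only where the work happens: the scan pays the full cost on every query, the index pays it once up front.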

Example: searching files for a keyword

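Steps to build the index:
(1) Create a Directory object that specifies where the index is stored.
(2) Create an IndexWriter object on top of the Directory.
(3) Read the files from disk and create one Document object per file.
(4) Add fields (Field objects) to each Document.
(5) Write each Document to the index.
(6) Close the IndexWriter.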

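import java.io.File;

import org.apache.commons.io.FileUtils;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexWriter_Demo {
    public static void main(String[] args) throws Exception {
        // Directory holding the files to index; it must already exist.
        File sourceDir = new File("c:\\io\\searchsource");
        // listFiles() returns null if the path is missing or is not a directory.
        File[] files = sourceDir.listFiles();
        if (files == null) {
            throw new IllegalStateException("Not a readable directory: " + sourceDir);
        }

        // Create the Directory (the index is kept on disk) and an IndexWriter on top of it.
        // The no-argument IndexWriterConfig uses StandardAnalyzer.
        try (Directory directory = FSDirectory.open(new File("c:\\io\\index").toPath());
             IndexWriter indexWriter = new IndexWriter(directory, new IndexWriterConfig())) {
            // Create one Document per file.
            for (File f : files) {
                if (!f.isFile()) {
                    continue; // skip subdirectories
                }
                String name = f.getName();
                String path = f.getPath();
                String fileContent = FileUtils.readFileToString(f, "utf-8");
                long fileSize = FileUtils.sizeOf(f);

                // Create the fields.
                // Arguments: field name, field value, whether to store the value.
                // A value that is not stored can still be searched,
                // but it cannot be read back from a search hit.
                Field fieldName = new TextField("name", name, Field.Store.YES);
                Field fieldPath = new TextField("path", path, Field.Store.YES);
                Field fieldContent = new TextField("content", fileContent, Field.Store.YES);
                Field fieldSize = new TextField("size", String.valueOf(fileSize), Field.Store.YES);

                // Add the fields to the document.
                Document document = new Document();
                document.add(fieldName);
                document.add(fieldPath);
                document.add(fieldContent);
                document.add(fieldSize);

                // Write the document to the index.
                indexWriter.addDocument(document);
            }
        } // try-with-resources closes the IndexWriter, then the Directory
    }
}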
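
The third argument of the field constructors in the code above, `Field.Store.YES`, is easy to misread: storing a value and indexing it are separate things. The toy model below is plain Java, not Lucene's API; `StoredFieldSketch`, `addField`, `matches`, and `get` are hypothetical names chosen for illustration. It sketches the idea that an unstored field can still be matched by a search, but its original text cannot be read back afterwards.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of one document's fields; all names here are hypothetical, not Lucene's. */
final class StoredFieldSketch {
    private final Map<String, Set<String>> indexedTerms = new HashMap<>(); // used for matching
    private final Map<String, String> storedValues = new HashMap<>();      // used for retrieval

    /** Always indexes the value's terms; keeps the original value only when store is true. */
    void addField(String name, String value, boolean store) {
        Set<String> terms = indexedTerms.computeIfAbsent(name, k -> new HashSet<>());
        for (String t : value.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        if (store) {
            storedValues.put(name, value);
        }
    }

    /** True if the given term was indexed for the given field. */
    boolean matches(String field, String term) {
        return indexedTerms.getOrDefault(field, Collections.emptySet()).contains(term);
    }

    /** Returns the original value, or null when the field was not stored. */
    String get(String field) {
        return storedValues.get(field);
    }

    public static void main(String[] args) {
        StoredFieldSketch doc = new StoredFieldSketch();
        doc.addField("name", "a.txt", true);
        doc.addField("content", "full text search", false);

        System.out.println(doc.matches("content", "search")); // true: still searchable
        System.out.println(doc.get("content"));               // null: value was not stored
        System.out.println(doc.get("name"));                  // a.txt
    }
}
```

In the article's example every field uses `Field.Store.YES`, which is why the search code later can print the name, path, content, and size of each hit.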

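Steps to query the index:
Step 1: Create a Directory object pointing to where the index is stored.
Step 2: Create an IndexReader object from the Directory.
Step 3: Create an IndexSearcher object from the IndexReader.
Step 4: Create a TermQuery object that specifies the field to search and the keyword.
Step 5: Execute the query.
Step 6: Iterate over the query results and print them.
Step 7: Close the IndexReader.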

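import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexReader_Demo {
    public static void main(String[] args) throws Exception {
        // Open the directory where the index is stored, then an index reader on it.
        try (Directory directory = FSDirectory.open(new File("c:\\io\\index").toPath());
             IndexReader indexReader = DirectoryReader.open(directory)) {
            // Create the index searcher.
            IndexSearcher indexSearcher = new IndexSearcher(indexReader);
            // Create the query. Arguments of Term: field name, term to match.
            // TermQuery matches the exact indexed term; the query text is not analyzed.
            Query query = new TermQuery(new Term("size", "88"));
            // Execute the query. Arguments: the query, the maximum number of hits to return.
            TopDocs topDocs = indexSearcher.search(query, 5);
            System.out.println("Total hits: " + topDocs.totalHits);
            // topDocs.scoreDocs holds the ids of the matching documents.
            for (ScoreDoc scoreDoc : topDocs.scoreDocs) {
                // scoreDoc.doc is the id of the matching document.
                System.out.println("id: " + scoreDoc.doc);
                // Load the document by its id.
                Document document = indexSearcher.doc(scoreDoc.doc);
                System.out.println(document);
                /*
                 * Sample output of the line above:
                 * <stored,indexed,tokenized<name:a.txt>
                 *  stored,indexed,tokenized<path:c:\io\searchsource\a.txt>
                 *  stored,indexed,tokenized<content:山东黄金克拉克五千万人体育课教学自行车zxcnsdfghganxzxcvbbzxcvbxcvcxcb>
                 *  stored,indexed,tokenized<size:88>>
                 */
                System.out.println("File name: " + document.get("name"));
                System.out.println("File content: " + document.get("content"));
                System.out.println("File path: " + document.get("path"));
                System.out.println("File size: " + document.get("size"));
                System.out.println("-------------------------");
            }
        } // try-with-resources closes the IndexReader, then the Directory
    }
}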
