Getting Started with Lucene

Mr_张三阿

Tags: lucene solr java elasticsearch

Article link: https://blog.csdn.net/hnh_blog/article/details/114161992

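I. Indexing Workflow

1. Collect the data
2. Create the Document objects
3. Create the analyzer
4. Create the Directory object that declares where the index is stored
5. Create the IndexWriterConfig configuration object
6. Create the IndexWriter
7. Write the documents to the index
8. Release resources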

Getting-started example

pom.xml

        <dependency>
            <groupId>commons-io</groupId>
            <artifactId>commons-io</artifactId>
            <version>2.6</version>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-core</artifactId>
            <version>7.7.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-analyzers-common</artifactId>
            <version>7.7.2</version>
        </dependency>
        <dependency>
            <groupId>org.apache.lucene</groupId>
            <artifactId>lucene-queryparser</artifactId>
            <version>7.7.2</version>
        </dependency>
        <!-- Testing -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
        <!-- MySQL database driver -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.48</version>
        </dependency>

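// Indexing workflow
// Required imports (Sku and SkuDaoImpl are this project's own classes):
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FloatPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test;

	@Test
	public void createIndexTest() throws IOException {
		// 1. Collect the data
		SkuDaoImpl skuDao = new SkuDaoImpl();
		List<Sku> skuList = skuDao.querySkuList();

		// 2. Create the Document objects
		List<Document> documents = new ArrayList<Document>();
		for (Sku sku : skuList) {
			Document document = new Document();
			// Add fields to the document.
			// Field.Store.YES stores the original value so it can be read back from search results.
			// Product ID: not tokenized, indexed, stored
			document.add(new StringField("id", sku.getId(), Field.Store.YES));
			// Product name: tokenized, indexed, stored
			document.add(new TextField("name", sku.getName(), Field.Store.YES));
			// Product price: indexed for exact-match and range queries, not stored, not sortable
			//document.add(new TextField("price", sku.getPrice().toString(),Field.Store.YES));
			document.add(new FloatPoint("price", sku.getPrice()));
			// FloatPoint does not store the value; add a StoredField with the same name
			// so that doc.get("price") returns it at search time
			document.add(new StoredField("price", sku.getPrice()));

			// Brand name: not tokenized, indexed, stored
			document.add(new StringField("brandName", sku.getBrandName(),
			Field.Store.YES));

			// Category name: not tokenized, indexed, stored
			document.add(new StringField("categoryName", sku.getCategoryName(),
			Field.Store.YES));
			// Image URL: not tokenized, not indexed, stored
			// (StoredField matches this; a TextField would tokenize and index the URL)
			document.add(new StoredField("image", sku.getImage()));

			// Add the document to the list
			documents.add(document);
		}
		// 3. Create the analyzer, which analyzes and tokenizes the documents
		Analyzer analyzer = new StandardAnalyzer();
		// 4. Create the Directory object that declares where the index is stored
		Directory directory = FSDirectory.open(Paths.get("E:\\dir"));
		// 5. Create the IndexWriterConfig that holds the settings for writing the index
		IndexWriterConfig config = new IndexWriterConfig(analyzer);

		// 6. Create the IndexWriter
		IndexWriter indexWriter = new IndexWriter(directory, config);
		// 7. Write each Document to the index through the IndexWriter
		for (Document doc : documents) {
			indexWriter.addDocument(doc);
		}
		// 8. Release resources
		indexWriter.close();
		directory.close();
	}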
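
Step 3 uses StandardAnalyzer. For Chinese text it emits every Han character as its own token, while runs of Latin letters and digits are kept together and lowercased. The sketch below is a toy approximation of that behavior written with the JDK only; `ToyTokenizer` is a hypothetical name, and the class is not Lucene code and ignores details such as stop words and punctuation rules.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy approximation of how StandardAnalyzer splits text; not Lucene code. */
final class ToyTokenizer {

    /** Han characters become single-character tokens; runs of other letters or digits become lowercased words. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            if (!han && Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
                continue;
            }
            // A Han character or a separator ends the word collected so far.
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (han) {
                tokens.add(new String(Character.toChars(cp)));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("华为 Mate40 手机")); // [华, 为, mate40, 手, 机]
    }
}
```

Because a tokenized field holds single-character terms for Chinese text, a search for 手机 on such a field is matched character by character. For word-level Chinese tokenization a dedicated Chinese analyzer would be needed, which is outside the scope of this example.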

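II. Common Field Types

Field class	Data type	Tokenized	Indexed	Stored	Description
StringField(FieldName, FieldValue, Store.YES)	String	N	Y	Y or N	Builds a string field that is indexed without tokenization: the whole value is indexed as a single term (for example an order number or an ID card number). Store.YES or Store.NO determines whether the value is also stored in the document.
FloatPoint(FieldName, FieldValue)	Float	N	Y	N	Builds a float field that is indexed for exact-match and range queries (for example a price). The value is not analyzed into terms and not stored; add a StoredField with the same name if it must be returned with search results.
DoublePoint(FieldName, FieldValue)	Double	N	Y	N	Builds a double field that is indexed for exact-match and range queries; not stored.
LongPoint(FieldName, FieldValue)	Long	N	Y	N	Builds a long field that is indexed for exact-match and range queries; not stored.
IntPoint(FieldName, FieldValue)	Integer	N	Y	N	Builds an integer field that is indexed for exact-match and range queries; not stored.
StoredField(FieldName, FieldValue)	Overloaded; supports several types	N	N	Y	Builds a field of any supported type that is neither analyzed nor indexed, only stored in the document.
TextField(FieldName, FieldValue, Store.NO) or TextField(FieldName, reader)	String or stream (Reader)	Y	Y	Y or N	Tokenized and indexed. When a Reader is passed, Lucene assumes the content is large and does not store it.
NumericDocValuesField(FieldName, FieldValue)	Numeric	-	-	-	Used together with other fields, for example to enable sorting on a numeric value.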
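
To make the "tokenized" column concrete, here is a small sketch of a toy inverted index built with the JDK only. `ToyIndex` and its method names are hypothetical and this is not how Lucene is implemented; it only assumes that an untokenized field (StringField-like) indexes the whole value as one term, while a tokenized field (TextField-like) indexes each token separately.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index illustrating tokenized vs. untokenized fields; not Lucene's implementation. */
final class ToyIndex {
    // "field:term" -> ids of the documents that contain the term in that field
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** StringField-like: the whole value is indexed as a single term. */
    void addUntokenized(int docId, String field, String value) {
        addTerm(docId, field, value);
    }

    /** TextField-like: the value is split into terms (here: lowercased and split on whitespace). */
    void addTokenized(int docId, String field, String value) {
        for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty()) {
                addTerm(docId, field, token);
            }
        }
    }

    /** Exact term lookup in one field. */
    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, Collections.emptySet());
    }

    private void addTerm(int docId, String field, String term) {
        postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addUntokenized(1, "brandName", "Huawei Honor");
        index.addTokenized(1, "name", "Huawei Honor Phone");

        System.out.println(index.search("brandName", "Huawei Honor")); // [1]
        System.out.println(index.search("brandName", "huawei"));       // []
        System.out.println(index.search("name", "phone"));             // [1]
    }
}
```

The untokenized field only matches the complete original value, which suits identifiers such as order numbers; the tokenized field matches individual words, which suits free text such as product names.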

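III. Search Workflow

1. Create the Query object
2. Create the Directory object that declares where the index is stored
3. Create the index reader (IndexReader)
4. Create the index searcher (IndexSearcher)
5. Run the search with the IndexSearcher, which returns the result set (TopDocs)
6. Process the result set
7. Release resources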

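// Required imports:
import java.nio.file.Paths;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test;

	@Test
	public void searchTest() throws Exception {
		// 1. Create the Query object
		// Create the analyzer (use the same analyzer as at index time)
		Analyzer analyzer = new StandardAnalyzer();
		// Create the query parser. First argument: the default field; second argument: the analyzer
		QueryParser queryParser = new QueryParser("brandName", analyzer);

		// Build the query. Terms without a field prefix (here 华为) are searched in the default field.
		// Note: brandName was indexed as an untokenized StringField, while StandardAnalyzer splits
		// 华为 into single characters at query time, so this clause may not match;
		// a TermQuery on brandName is an alternative for exact matching.
		Query query = queryParser.parse("name:手机 AND 华为");

		// 2. Create the Directory object that declares where the index is stored
		FSDirectory directory = FSDirectory.open(Paths.get("E:\\dir"));
		// 3. Create the IndexReader
		IndexReader reader = DirectoryReader.open(directory);

		// 4. Create the IndexSearcher
		org.apache.lucene.search.IndexSearcher searcher = new org.apache.lucene.search.IndexSearcher(reader);

		// 5. Run the search with the IndexSearcher, which returns the result set (TopDocs)
		// First argument: the query; second argument: n, the number of top-ranked hits to return
		TopDocs topDocs = searcher.search(query, 10);
		System.out.println("Total number of hits: " + topDocs.totalHits);
		// Get the hits
		ScoreDoc[] docs = topDocs.scoreDocs;
		// 6. Process the result set
		for (ScoreDoc scoreDoc : docs) {
			// Load the stored document by its internal ID
			int docID = scoreDoc.doc;
			Document doc = searcher.doc(docID);
			System.out.println("=============================");
			System.out.println("docID:" + docID);
			System.out.println("id:" + doc.get("id"));
			System.out.println("name:" + doc.get("name"));
			// doc.get returns null for fields that were indexed but not stored
			System.out.println("price:" + doc.get("price"));
			System.out.println("brandName:" + doc.get("brandName"));
			System.out.println("image:" + doc.get("image"));
		}
		// 7. Release resources
		reader.close();
		directory.close();
	}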
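
Step 5 asks for the top 10 hits, yet the total hit count it prints covers every matching document. The sketch below illustrates that distinction with the JDK only. `ToyTopDocs`, its fields and the tie-breaking rule are hypothetical choices for illustration, not Lucene's collector implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy top-n selection sketching the idea behind search(query, n); not Lucene code. */
final class ToyTopDocs {
    final long totalHits;          // number of all matching documents
    final List<Integer> topDocIds; // ids of at most n best-scoring documents, best first

    private ToyTopDocs(long totalHits, List<Integer> topDocIds) {
        this.totalHits = totalHits;
        this.topDocIds = topDocIds;
    }

    /** Ranks all matches by score and keeps the first n (n must be 0 or greater). */
    static ToyTopDocs collect(Map<Integer, Float> scoresByDocId, int n) {
        List<Map.Entry<Integer, Float>> hits = new ArrayList<>(scoresByDocId.entrySet());
        // Highest score first; equal scores are ordered by smaller doc id (a choice made for this sketch).
        hits.sort((a, b) -> {
            int byScore = Float.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : Integer.compare(a.getKey(), b.getKey());
        });
        List<Integer> top = new ArrayList<>();
        for (Map.Entry<Integer, Float> hit : hits.subList(0, Math.min(n, hits.size()))) {
            top.add(hit.getKey());
        }
        return new ToyTopDocs(hits.size(), top);
    }

    public static void main(String[] args) {
        Map<Integer, Float> scores = new HashMap<>();
        scores.put(7, 1.5f);
        scores.put(3, 2.5f);
        scores.put(9, 0.5f);

        ToyTopDocs top = collect(scores, 2);
        System.out.println(top.totalHits); // 3
        System.out.println(top.topDocIds); // [3, 7]
    }
}
```

In the search code the same relationship holds between the printed total and the length of the scoreDocs array: the array never contains more than the requested n entries.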

The program's output is shown below:
[Image: screenshot of the program output]
