ES-1: Concepts & Basic Operations


雫#1999

Category: Elasticsearch. Tags: elasticsearch, search engine, java

Article link: https://blog.csdn.net/weixin_43541094/article/details/122361029

Contents

1 ES Basic Concepts
2 Operating ES from Java

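1 ES Basic Concepts

1.1 Introduction to ES

Consider searching for a product on a shopping site.
An e-commerce site carries a huge range of products and has to return a large number of matching items for the user to choose from.
Suppose this data is stored in MySQL. Every time a user searches for a product, for example 鼠标 ("mouse"), one SQL statement has to be executed:

SELECT * FROM item WHERE name LIKE '%鼠标%'

With this kind of query the wildcard has to come first, and a LIKE pattern with a leading wildcard prevents MySQL from using the index on the column.
Query speed is bound to be disappointing, and MySQL cannot cope with mistyped input either,
for example 鼠标 ("mouse") typed as 数标. This is why an efficient search engine such as ES is needed.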

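About ES:

ES (Elasticsearch) is a search engine framework written in Java on top of Lucene. It provides distributed full-text search
and exposes a uniform RESTful web interface, and it is widely used across many domains.

Lucene: the underlying search engine library, an Apache top-level project

Distributed: ES is distributed mainly so that it can scale horizontally by forming a cluster

Full-text search: text is split into terms (tokenized), and the resulting terms are collected in a term dictionary.
At search time the keywords are looked up in the term dictionary to find matching content, which is far more efficient than a LIKE-style fuzzy query

RESTful web interface: operating ES only requires sending HTTP requests; different request methods and parameters trigger the corresponding functions

Widely used: all kinds of e-commerce sites, GitHub...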

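Some reasons to use ES:

1. Searching over massive amounts of data is too slow in MySQL

2. The desired data can still be found even when the keyword is typed inaccurately

3. The matched keywords can be highlighted in the results
...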
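
Reason 2 depends on fuzzy matching. ES fuzzy queries are based on the edit distance between terms; the class below is only a minimal sketch of plain Levenshtein distance, not the ES or Lucene implementation. The names `EditDistanceSketch` and `distance` are illustrative and do not come from any library.

```java
final class EditDistanceSketch {

    /**
     * Number of single-character insertions, deletions or substitutions
     * needed to turn a into b (classic two-row dynamic programming).
     */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

With this sketch, the mistyped 数标 is one substitution away from 鼠标, so a search that tolerates an edit distance of 1 could still match it.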

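1.2 Inverted Index

The following example shows how ES works: the client enters 谁中国 ("who" + "China") and wants the related data.
[Figure: inverted index example]

The query proceeds as follows:

1. query: the input 谁中国 is tokenized; suppose this yields 谁 ("who") + 中国 ("China").
The terms are looked up in the term dictionary: 谁 maps to documents 1 and 4, 中国 maps to documents 2 and 3,
so the query phase returns the ids 1, 2, 3, 4

2. fetch: using the ids from the query phase, ES goes straight to the data store and pulls the corresponding documents.
The ids here are 1, 2, 3, 4, so 我是谁 ("who am I"), 中国 ("China"), 你是中国人 ("you are Chinese") and 你是谁 ("who are you") are all fetched

Inverted index: the stored data is tokenized in some way, and the resulting terms are kept in a separate term dictionary.
For each query, the user's keywords are tokenized first and then matched against the term dictionary to obtain document ids;
those ids are then used to pull the data from the data store and return it to the user

How text is tokenized is determined by the analyzer in use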
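
The query and fetch phases described above can be sketched in plain Java. This is a toy model under stated assumptions, not how Lucene stores its index: the class `InvertedIndexSketch`, its methods, and the two-word `VOCABULARY` that stands in for a real analyzer are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

final class InvertedIndexSketch {
    // Hypothetical toy vocabulary; a real analyzer decides the terms itself.
    private static final List<String> VOCABULARY = Arrays.asList("谁", "中国");

    private final Map<Integer, String> documents = new HashMap<>();                 // id -> stored document
    private final Map<String, SortedSet<Integer>> termDictionary = new HashMap<>(); // term -> document ids

    /** Toy tokenizer: emits every vocabulary term contained in the text. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : VOCABULARY) {
            if (text.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    /** Stores the document and records its id under each of its terms. */
    void add(int id, String text) {
        documents.put(id, text);
        for (String term : tokenize(text)) {
            termDictionary.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    /** Query phase: tokenize the keywords and collect the ids of all matching terms. */
    SortedSet<Integer> query(String keywords) {
        SortedSet<Integer> ids = new TreeSet<>();
        for (String term : tokenize(keywords)) {
            SortedSet<Integer> hits = termDictionary.get(term);
            if (hits != null) {
                ids.addAll(hits);
            }
        }
        return ids;
    }

    /** Fetch phase: pull the stored documents for the given ids. */
    List<String> fetch(Collection<Integer> ids) {
        List<String> result = new ArrayList<>();
        for (Integer id : ids) {
            result.add(documents.get(id));
        }
        return result;
    }

    /** Builds the four-document example from the text. */
    static InvertedIndexSketch example() {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "我是谁");
        index.add(2, "中国");
        index.add(3, "你是中国人");
        index.add(4, "你是谁");
        return index;
    }
}
```

Calling `example().query("谁中国")` yields the ids 1 to 4, and passing them to `fetch` returns the four stored sentences, which mirrors the two steps above.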

1.3 Structure of ES

Analogy with a relational database:

MySQL                          ES
database                  ->   index
table                     ->   type
row                       ->   document
column                    ->   field
schema (table definition) ->   mapping (type structure definition)
SQL                       ->   DSL

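index:
An ES service can hold multiple indices. In ES 6.x each index is split into 5 primary shards by default (from 7.0 on the default is 1).
By default every primary shard has one replica shard, and a replica is never placed on the same node as its primary.
Replica shards are not only backups: they also serve search requests, which spreads the query load across the cluster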
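
To make the sharding concrete: ES decides which primary shard a document belongs to by hashing its routing value (the document id by default) modulo the number of primary shards. The sketch below only illustrates that idea. `ShardRoutingSketch` and `shardFor` are hypothetical names, and `String.hashCode()` is a stand-in for the Murmur3 hash that ES actually uses, so the shard numbers will not match a real cluster.

```java
final class ShardRoutingSketch {

    /**
     * Picks a primary shard in the range [0, numberOfPrimaryShards).
     * Assumption: String.hashCode() replaces the Murmur3 hash used by ES.
     */
    static int shardFor(String routing, int numberOfPrimaryShards) {
        // floorMod keeps the result non-negative even for negative hash codes
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }
}
```

Because the result depends on the shard count, the same id would land on a different shard if the number of primary shards changed, which is why that number is fixed when the index is created.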

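document:
A type can contain multiple documents.
A document is similar to a row of a MySQL table

field:
A document can contain multiple fields,
just as each row in MySQL has multiple columns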

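1.4 Types a Field Can Have

ES can store many documents, and a document can be thought of as a row.
Each document has a number of fields; a field can be thought of as a column, and it can likewise be given a type

String
	text: the field is analyzed (tokenized); generally used for full-text search

	keyword: the field is not tokenized

Numeric
	long integer short byte double float
	half_float scaled_float

Date
	date: a specific format can be specified

Boolean
	boolean

Binary
	binary: accepts a Base64-encoded string

Range
	long_range: stores a range rather than a single value
	similar types: integer_range double_range...

Geographic coordinates
	geo_point: stores a latitude/longitude pair

IP
	ip: stores IPv4/IPv6 addresses
...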
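
The difference between text and keyword is easiest to see in the terms each one produces. The following is a rough sketch, assuming a simplified analyzer that lowercases and splits on non-alphanumeric characters; real analyzers (the standard analyzer, IK, and others) behave differently in detail. `FieldTypeSketch`, `textTerms` and `keywordTerms` are illustrative names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class FieldTypeSketch {

    /** Roughly what a text field does: break the value into lowercase terms. */
    static List<String> textTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String part : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                terms.add(part);
            }
        }
        return terms;
    }

    /** A keyword field keeps the whole value as one term. */
    static List<String> keywordTerms(String value) {
        return Collections.singletonList(value);
    }
}
```

A search for the single word `java` can match a text field holding "Effective Java, 3rd Edition" because `java` is one of its terms, while a keyword field with the same value only matches the exact full string.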

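1.5 RESTful Syntax for Operating ES

GET requests
	Get index information: http://ip:port/index
	Get a specific document: http://ip:port/index/type/doc_id

POST requests
	Search documents: http://ip:port/index/type/_search
	The query conditions are given as a JSON string in the request body
	
	Update a document: http://ip:port/index/type/doc_id/_update
	The changes are given as a JSON string in the request body

PUT requests
	Create an index: http://ip:port/index
	The index settings (shards, replicas...) are given in the request body
	
	Define the fields of a type in an index: http://ip:port/index/type/_mappings
	
DELETE requests
	Delete an index: http://ip:port/index
	Delete a specific document: http://ip:port/index/type/doc_id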
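
The URL patterns above can be captured in a small helper that returns the HTTP method and path for each operation. This is only a sketch of the typed (index/type/id) layout listed here; `EsRestPaths` and its method names are hypothetical and are not part of any ES client.

```java
final class EsRestPaths {

    static String getIndex(String index) {
        return "GET /" + index;
    }

    static String getDocument(String index, String type, String id) {
        return "GET /" + index + "/" + type + "/" + id;
    }

    static String search(String index, String type) {
        return "POST /" + index + "/" + type + "/_search";
    }

    static String updateDocument(String index, String type, String id) {
        return "POST /" + index + "/" + type + "/" + id + "/_update";
    }

    static String createIndex(String index) {
        return "PUT /" + index;
    }

    static String deleteIndex(String index) {
        return "DELETE /" + index;
    }

    static String deleteDocument(String index, String type, String id) {
        return "DELETE /" + index + "/" + type + "/" + id;
    }
}
```

For example, `EsRestPaths.updateDocument("book", "novel", "1")` gives `POST /book/novel/1/_update`.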

Basic index operations

1. Create a simple index named person with 5 shards and 1 replica
PUT /person
{
	"settings" : {
		"number_of_shards" : 5,
		"number_of_replicas" : 1
	}
}


2. Get index information
GET /person

3. Delete the index
DELETE /person
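4. Create an index and specify the field types
PUT /book
{
	"settings" : { // shard and replica settings of the index
		"number_of_shards" : 5,
		"number_of_replicas" : 1
	},
	
	"mappings" : { // field configuration of the type in this index
		"novel" : {  // a type named novel; ES currently allows one type per index
			"properties" : {
				"name" : { // a field named name
					"type" : "text", // the type of name is text
					"analyzer" : "ik_max_word", // analyzer used for this text field
					"index" : true, // the field can be used as a query condition
					"store" : false // whether the field is stored separately
				},
				"author" : {
					"type" : "keyword"
				},
				"count" : {
					"type" : "long"
				},
				"onSale" : {
					"type" : "date", // a date field
					"format" : "yyyy-MM-dd HH:mm:ss" // accepted date format
				}
			}
		}
	}
	
}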
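
A date field with an explicit format only accepts values written in that format. As a rough illustration on the Java side, the sketch below checks and produces strings for the pattern yyyy-MM-dd HH:mm:ss with `java.time`. It assumes the pattern means the same thing in `java.time` as in the ES mapping, which holds for these letters; `OnSaleFormatSketch` and its methods are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class OnSaleFormatSketch {
    // Same pattern as a date mapping with format yyyy-MM-dd HH:mm:ss.
    static final DateTimeFormatter ON_SALE = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** True if the value could be parsed with the configured pattern. */
    static boolean matchesFormat(String value) {
        try {
            LocalDateTime.parse(value, ON_SALE);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** Renders a timestamp in the form the field expects. */
    static String format(LocalDateTime time) {
        return time.format(ON_SALE);
    }
}
```

A value such as `2022-01-07 10:30:00` passes the check, while a bare `2022-01-07` does not, so formatting dates on the client before indexing avoids mapping errors.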

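1.6 Document Operations in ES

A document is uniquely identified within an ES service by _index + _type + _id;
these three values together determine exactly one document

Create a document
	1. Auto-generated id (not recommended)
	POST /book/novel
	{
		"name" : "xxx",
		"author" : "yyy"
		...
	}
	
	2. Create a document with a specified id
	PUT /book/novel/1 // the id is set to 1
	{
		"name" : "xxx",
		"author" : "yyy"
		...
	}

Update a document
	1. Overwrite (not recommended)
	PUT /book/novel/1 // replaces the whole document with id 1
	{
		"name" : "xxx",
		"author" : "yyy"
		...
	}
	
	2. Partial update with doc
	POST /book/novel/1/_update
	{
		"doc" : {
			"author" : "kkk"
		}
	}


Delete a document
	Delete a document by id
	DELETE /book/novel/1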
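
The two ways of modifying a document differ in what happens to fields that are not mentioned in the request. The in-memory sketch below models that difference with plain maps; it is not ES code. `UpdateSemanticsSketch` and its methods are hypothetical, and the merge is shallow, whereas ES merges nested objects recursively.

```java
import java.util.HashMap;
import java.util.Map;

final class UpdateSemanticsSketch {
    private final Map<String, Map<String, Object>> docsById = new HashMap<>();

    /** Like PUT /index/type/id: the stored document is replaced as a whole. */
    void put(String id, Map<String, Object> source) {
        docsById.put(id, new HashMap<>(source));
    }

    /** Like POST /index/type/id/_update with "doc": only the given fields change. */
    void update(String id, Map<String, Object> partialDoc) {
        Map<String, Object> existing = docsById.get(id);
        if (existing == null) {
            throw new IllegalStateException("document missing: " + id);
        }
        existing.putAll(partialDoc); // shallow merge, an assumption of this sketch
    }

    Map<String, Object> get(String id) {
        return docsById.get(id);
    }
}
```

After a `put` with name and author followed by an `update` that sets only author, the name is still present; a second `put` that carries only author drops the name. This is the reason the overwrite style is marked as not recommended for changing single fields.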

2 Operating ES from Java

First add the required dependencies:

        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>6.5.4</version>
        </dependency>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>6.5.4</version>
        </dependency>

Prepare a client that connects to the ES server:

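import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;

public class ESClient {

    public static RestHighLevelClient getClient() {
        // the ES server to connect to
        HttpHost host = new HttpHost("127.0.0.1", 9200);

        // create the RestClientBuilder
        RestClientBuilder builder = RestClient.builder(host);

        // create the client; in 6.x the constructor takes the RestClientBuilder
        return new RestHighLevelClient(builder);
    }
}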

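2.1 Creating & Deleting an Index

Create an index and specify its settings and mappings:

import java.io.IOException;

import org.elasticsearch.action.admin.indices.create.CreateIndexRequest;
import org.elasticsearch.action.admin.indices.create.CreateIndexResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.json.JsonXContent;
import org.junit.Test;

public class Demo1 {

    String index = "person";
    String type = "man";

    RestHighLevelClient client = ESClient.getClient();

    @Test
    public void createIndex() throws IOException {
        // prepare the index settings
        Settings.Builder settings = Settings.builder()
                .put("number_of_shards", 3)    // number of primary shards
                .put("number_of_replicas", 1); // number of replicas

        // prepare the index structure (mappings)
        XContentBuilder mappings = JsonXContent.contentBuilder()
                .startObject()
                    .startObject("properties")
                        .startObject("name")
                            .field("type", "text")
                        .endObject()
                        .startObject("age")
                            .field("type", "integer")
                        .endObject()
                        .startObject("birthday")
                            .field("type", "date")
                            .field("format", "yyyy-MM-dd")
                        .endObject()
                    .endObject()
                .endObject();

        // wrap settings and mappings in the request object
        CreateIndexRequest request = new CreateIndexRequest(index)
                .settings(settings)
                .mapping(type, mappings);

        // send the create-index request to ES through the client
        CreateIndexResponse response = client.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response);
    }

}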


This creates an index named person with the type man, equivalent to:
{
	"settings" : {
		"number_of_shards" : 3,
		"number_of_replicas" : 1
	},
	
	"mappings" : {
		"man" : {
			"properties" : {
				"name" : {
					"type" : "text"
				},
				"age" : {
					"type" : "integer"
				},
				"birthday" : {
					"type" : "date",
					"format" : "yyyy-MM-dd"
				}
			}
		}
	}
	
}

Check whether an index exists & delete an index:

Check whether the index exists:
    @Test
    public void exists() throws IOException {
        // prepare the request object
        GetIndexRequest request = new GetIndexRequest();
        request.indices(index);

        // ask ES through the client whether the index exists
        boolean exists = client.indices().exists(request, RequestOptions.DEFAULT);
        System.out.println(exists);
    }


Delete an index:
    @Test
    public void delete() throws IOException {
        // prepare the request object
        DeleteIndexRequest request = new DeleteIndexRequest();
        request.indices(index);

        // delete the given index through the client
        AcknowledgedResponse delete = client.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(delete.isAcknowledged());
    }

2.2 Creating, Deleting and Updating Documents

    // com.fasterxml.jackson.databind.ObjectMapper
    ObjectMapper mapper = new ObjectMapper();
    String index = "person";
    String type = "man";

    RestHighLevelClient client = ESClient.getClient();

Add a document:
    @Test
    public void createDoc() throws IOException {
        // prepare the JSON data
        Person person = new Person(1, "tom", 23, new Date());
        String json = mapper.writeValueAsString(person);

        // prepare the request object with a manually chosen id
        IndexRequest request = new IndexRequest(index, type, String.valueOf(person.getId()));
        request.source(json, XContentType.JSON);

        // add the document through the client
        IndexResponse response = client.index(request, RequestOptions.DEFAULT);
        System.out.println(response.getResult().toString());
    }

Delete a document:
    @Test
    public void delete() throws IOException {
        // create the request object for the document to delete
        DeleteRequest request = new DeleteRequest(index, type, "1");

        // delete the document through the client
        DeleteResponse response = client.delete(request, RequestOptions.DEFAULT);
        System.out.println(response.getResult().toString());
    }


Update a document:
    @Test
    public void update() throws IOException {
        // create a map holding the fields to change
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", "jack");

        // create the request object and attach the partial document
        UpdateRequest request = new UpdateRequest(index, type, "1");
        request.doc(doc);

        // update the document through the client
        UpdateResponse response = client.update(request, RequestOptions.DEFAULT);
        System.out.println(response.getResult().toString());
    }

2.3 Bulk Operations

    // com.fasterxml.jackson.databind.ObjectMapper
    ObjectMapper mapper = new ObjectMapper();
    String index = "person";
    String type = "man";

    RestHighLevelClient client = ESClient.getClient();

Bulk add:
    @Test
    public void bulkCreateDoc() throws IOException {
        // prepare several objects and serialize each one to JSON
        Person p1 = new Person(1, "A", 21, new Date());
        Person p2 = new Person(2, "B", 22, new Date());
        Person p3 = new Person(3, "C", 23, new Date());

        String json1 = mapper.writeValueAsString(p1);
        String json2 = mapper.writeValueAsString(p2);
        String json3 = mapper.writeValueAsString(p3);

        // create the request object and add the prepared data
        BulkRequest request = new BulkRequest();
        request.add(new IndexRequest(index, type, String.valueOf(p1.getId())).source(json1, XContentType.JSON));
        request.add(new IndexRequest(index, type, String.valueOf(p2.getId())).source(json2, XContentType.JSON));
        request.add(new IndexRequest(index, type, String.valueOf(p3.getId())).source(json3, XContentType.JSON));

        // add the documents in one bulk call through the client
        BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
        System.out.println(response.toString());
    }


Bulk delete:
    @Test
    public void bulkDelDoc() throws IOException {
        // build the request object
        BulkRequest request = new BulkRequest();
        request.add(new DeleteRequest(index, type, "1"));
        request.add(new DeleteRequest(index, type, "2"));
        request.add(new DeleteRequest(index, type, "3"));

        // delete the documents in one bulk call through the client
        BulkResponse response = client.bulk(request, RequestOptions.DEFAULT);
        System.out.println(response.toString());
    }
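
On the wire, a bulk call is a single request to the _bulk endpoint whose body is newline-delimited JSON: one action line per operation, followed by a source line for index operations. The sketch below builds such a body by hand to show the shape. It is an illustration only, not what the client library runs internally; `BulkBodySketch` is a hypothetical class and it assumes the ids and names contain no characters that need JSON escaping.

```java
final class BulkBodySketch {
    private final StringBuilder body = new StringBuilder();

    /** Adds an index action: an action line followed by the document source line. */
    BulkBodySketch index(String index, String type, String id, String sourceJson) {
        body.append(actionLine("index", index, type, id)).append('\n')
            .append(sourceJson).append('\n');
        return this;
    }

    /** Adds a delete action: an action line only, with no source line. */
    BulkBodySketch delete(String index, String type, String id) {
        body.append(actionLine("delete", index, type, id)).append('\n');
        return this;
    }

    // Assumption: values contain no quotes or backslashes, so no escaping is done.
    private static String actionLine(String action, String index, String type, String id) {
        return "{\"" + action + "\":{\"_index\":\"" + index
                + "\",\"_type\":\"" + type
                + "\",\"_id\":\"" + id + "\"}}";
    }

    /** Returns the newline-delimited body; every line, including the last, ends with a newline. */
    String build() {
        return body.toString();
    }
}
```

Chaining `index("person", "man", "1", ...)` and `delete("person", "man", "2")` produces three lines: the index action, its source, and the delete action.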