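1. Concepts
Elasticsearch (ES) is a search engine written in Java and built on Lucene. It provides distributed full-text search behind a unified RESTful web interface, and the official clients expose the API in many programming languages.
Official site: https://www.elastic.co/cn/products/elasticsearch
Installing ES and the IK analyzer: see "Linux安装ElasticSearch7.X & IK分词器" by 1024。 on 博客园 (cnblogs.com).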
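2. Core concepts of Elasticsearch
1 NRT (Near Realtime): near real-time
This has two aspects:
Writes: with the default settings a newly written document becomes searchable about 1 second later, because it first has to be analyzed and refreshed into the index.
Searches: search and analytics queries return results within seconds.
2 Cluster
A cluster is a group of one or more machines running ES instances, usually one instance per machine. ES instances that share the same cluster name and can discover each other on the network form a cluster automatically and rebalance shards among themselves. The default cluster name is "elasticsearch".
3 Node
Each ES instance is called a node. A node name is assigned automatically and can also be configured manually.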
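4 Index
An index holds a collection of documents with a similar structure.
Index naming rules:
Lowercase only (no uppercase letters)
Must not contain \, /, *, ?, ", <, >, |, #, commas, or spaces
Must not contain a colon (:) from version 7.0 onward
Must not start with -, _, or +
Must not exceed 255 bytes (bytes, not characters, so multi-byte characters use up the limit faster)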
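The naming rules above can be checked mechanically before an index is created. The following is a minimal sketch, not code from any Elasticsearch client: the class name `IndexNameRules` is hypothetical, UTF-8 is assumed for the byte count, and the comma is included in the forbidden set on the assumption that it follows the official naming restrictions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Illustrative validator for the naming rules above; not part of any Elasticsearch client. */
class IndexNameRules {
    // Forbidden characters: \ / * ? " < > | # space, plus ':' (7.0+) and ',' (assumed from the ES docs).
    private static final String FORBIDDEN = "\\/*?\"<>|# :,";

    static boolean isValid(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        // Lowercase only.
        if (!name.equals(name.toLowerCase(Locale.ROOT))) {
            return false;
        }
        // Must not start with '-', '_' or '+'.
        char first = name.charAt(0);
        if (first == '-' || first == '_' || first == '+') {
            return false;
        }
        // Must not contain any forbidden character.
        for (int i = 0; i < name.length(); i++) {
            if (FORBIDDEN.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        // At most 255 bytes; UTF-8 is assumed as the encoding.
        return name.getBytes(StandardCharsets.UTF_8).length <= 255;
    }

    public static void main(String[] args) {
        System.out.println(isValid("student"));   // true
        System.out.println(isValid("Student"));   // false: uppercase letter
        System.out.println(isValid("_student"));  // false: starts with '_'
        System.out.println(isValid("stu:dent"));  // false: contains ':'
    }
}
```

Counting bytes rather than characters matters for non-ASCII names: a name made of 100 three-byte characters already exceeds the limit.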
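5 Document
The smallest unit of data in ES. A document is like a row in a database and is usually represented as JSON. Many documents are stored in one index.
Example book document:
{
  "book_id": "1",
  "book_name": "Thinking in Java",
  "book_desc": "From basic Java syntax to its most advanced features (in-depth object-oriented concepts, multithreading, automated project builds, unit testing, debugging, and more), this book guides you step by step to mastery.",
  "category_id": "2",
  "category_name": "java"
}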
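6 Field
Like a column in a database; fields define what each document should contain.
7 Type
A type is a logical category of data inside an index; documents under the same type share the same fields. Before 6.0 an index could contain more than one type.
Note: a type used to correspond roughly to a table in a relational database. Types were deprecated in 7.0 and removed in ES 8.0. In this tutorial the type is always _doc.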
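8 Shard
When an index grows too large, its data is split into multiple shards that are distributed across servers. This lets ES handle massive data volumes and high concurrency, improves performance and throughput, and makes full use of the CPUs of several machines.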
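How a document ends up on one particular shard can be pictured as a hash of its routing value (by default the document id) taken modulo the number of primary shards. The sketch below is a simplified illustration only: Elasticsearch uses a Murmur3 hash and, in recent versions, an extra routing factor, whereas `String.hashCode()` is a stand-in here, and the class name `ShardRoutingSketch` is hypothetical.

```java
/** Simplified picture of document-to-shard routing; not the exact Elasticsearch algorithm. */
class ShardRoutingSketch {

    /** Picks a primary shard number in the range [0, numberOfPrimaryShards). */
    static int shardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards <= 0) {
            throw new IllegalArgumentException("need at least one primary shard");
        }
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        String[] ids = {"1", "2", "3"};
        for (String id : ids) {
            System.out.println("document " + id + " -> shard " + shardFor(id, 5));
        }
    }
}
```

Because the result depends on the number of primary shards, that number cannot simply be changed after an index has been created without moving documents.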
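9 Replica
In a distributed environment any machine can go down at any time. If a machine fails and takes the only copy of a shard with it, the index can no longer be searched completely. To keep data safe, each shard of an index is therefore copied to another machine, so that the cluster stays searchable when a few machines fail.
The shard that receives writes first is called the primary shard; its copies are called replica shards, and both can serve searches.
When ES 6 creates an index with default settings, it uses 5 primary shards with 1 replica each (one primary plus one backup per shard), 10 shards in total; since 7.0 the default is 1 primary shard with 1 replica. Because a replica is never placed on the same node as its primary, a cluster needs at least two nodes to allocate all of them.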
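The shard arithmetic above can be written down in two small functions. This is only a worked example; the class and method names are hypothetical, and the node count rests on the fact that two copies of the same shard are never allocated to the same node.

```java
/** Worked arithmetic for primary and replica shard counts. */
class ShardMath {

    /** Total shard copies in an index: every primary plus its replicas. */
    static int totalShards(int primaryShards, int replicasPerPrimary) {
        return primaryShards * (1 + replicasPerPrimary);
    }

    /** Minimum number of nodes needed so that every copy of a shard can be allocated. */
    static int minNodesForAllCopies(int replicasPerPrimary) {
        return 1 + replicasPerPrimary;
    }

    public static void main(String[] args) {
        System.out.println(totalShards(5, 1));        // 10
        System.out.println(minNodesForAllCopies(1));  // 2
    }
}
```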
3. Elasticsearch core concepts vs. database core concepts
Relational database (e.g. MySQL) | Non-relational database (Elasticsearch) |
Database | Index |
Table | Index (formerly Type) |
Row | Document |
Column | Field |
Schema (constraints) | Mapping |
4. Creating an index
The student index is used as the example.
Syntax: PUT /index_name
PUT /index
{
  "settings": { ... any settings ... },
  "mappings": {
    "properties": {
      "field1": { "type": "text" },
      "field2": { "type": "text" }
    }
  },
  "aliases": {
    "default_index": {}
  }
}
// The "properties" section under "mappings" defines the fields
PUT /student
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "stu_id": {
        "type": "long"
      },
      "stu_name": {
        "type": "keyword"
      },
      "address": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "createTime": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  }
}
// Result
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "student"
}
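The createTime field in the student mapping only accepts strings in the pattern yyyy-MM-dd HH:mm:ss, so application code has to format timestamps accordingly before indexing them. A minimal sketch using only the JDK; the class name `CreateTimeFormat` is hypothetical:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Formats and parses timestamps in the pattern declared for the createTime field. */
class CreateTimeFormat {
    // Same pattern as the "format" of createTime in the student mapping.
    static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String format(LocalDateTime time) {
        return time.format(FORMAT);
    }

    static LocalDateTime parse(String text) {
        return LocalDateTime.parse(text, FORMAT);
    }

    public static void main(String[] args) {
        LocalDateTime time = LocalDateTime.of(2020, 10, 10, 12, 24, 53);
        System.out.println(format(time));  // 2020-10-10 12:24:53
    }
}
```

A value in any other shape, for example an ISO string with a T separator, would be rejected for this field because the mapping declares a single explicit format.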
5. Basic document operations
5.1 Creating a document
Syntax: PUT /index/_doc/id
// Add a student
PUT /student/_doc/1
{
  "stu_id": 2020611111,
  "stu_name": "小明",
  "address": "广东省广州市",
  "createTime": "2020-10-10 12:24:53"
}
// Result
{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
5.2 Retrieving a document
Searching with conditions in a request body is covered later.
Syntax: GET /index/_doc/id
// Get a student's information
GET /student/_doc/1
// Result
{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "_seq_no": 2,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "stu_id": 2020611111,
    "stu_name": "小明",
    "address": "广东省广州市",
    "createTime": "2020-10-10 12:24:53"
  }
}
5.3 Updating a document
// Option 1: replace the whole document by indexing it again
PUT /student/_doc/1
{
  "stu_id": 2020611100,
  "stu_name": "小明",
  "address": "广东省广州市",
  "createTime": "2020-10-10 12:24:53"
}
// Result:
{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 5,
  "_primary_term": 1
}
// Option 2: update individual fields with _update
Syntax:
POST /index/_update/id
{
  "doc": {
    "your_field": "new value"
  }
}
POST /student/_update/1
{
  "doc": {
    "stu_name": "坤坤"
  }
}
// Result:
{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "_version": 3,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 6,
  "_primary_term": 1
}
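The two update styles differ in what happens to fields that are not mentioned in the request: indexing again with PUT replaces the whole document, while _update merges the "doc" fields into the existing source. The sketch below imitates that difference with plain maps. It is an illustration of the semantics only, not Elasticsearch code; the class and method names are hypothetical and the sample values are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Imitates full replacement versus partial update on an in-memory document. */
class UpdateSemanticsSketch {

    /** Like PUT /student/_doc/1: the request body becomes the entire document. */
    static Map<String, Object> replace(Map<String, Object> existing, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    /** Like POST /student/_update/1 with "doc": fields are merged, nested objects recursively. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> existing, Map<String, Object> doc) {
        Map<String, Object> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            Object old = result.get(entry.getKey());
            Object incoming = entry.getValue();
            if (old instanceof Map && incoming instanceof Map) {
                result.put(entry.getKey(),
                        merge((Map<String, Object>) old, (Map<String, Object>) incoming));
            } else {
                result.put(entry.getKey(), incoming);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> existing = new LinkedHashMap<>();
        existing.put("stu_id", 2020611100L);
        existing.put("stu_name", "xiaoming");
        existing.put("address", "Guangzhou");

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("stu_name", "kunkun");

        System.out.println(merge(existing, change));    // {stu_id=2020611100, stu_name=kunkun, address=Guangzhou}
        System.out.println(replace(existing, change));  // {stu_name=kunkun}
    }
}
```

This is why the PUT example above resends every field, while the _update example only sends stu_name.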
5.4 Deleting a document
Syntax: DELETE /index/_doc/id
// Delete a student
DELETE /student/_doc/1
// Result
{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1,
  "_primary_term": 1
}
// Querying the document again after the delete shows that it is gone
{
  "_index": "student",
  "_type": "_doc",
  "_id": "1",
  "found": false
}
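6. Document operations with the Java API (RestHighLevelClient)
6.1 Adding the ES dependencies
<!-- First create a Spring Boot project -->
<!-- Then add the following to pom.xml -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.3.0</version>
</dependency>
<!-- RestHighLevelClient itself is provided by this artifact -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.3.0</version>
</dependency>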
# Configuration in application.yml
spring:
  application:
    name: elk-service
stukk:
  elasticsearch:
    host: your-server-ip:9200
    username: your-username
    password: your-password
    connectTimeout: 5000
    socketTimeout: 5000
    connectionRequestTimeout: 5000
    maxConnectNum: 100
    maxConnectPerRoute: 100
// ES configuration class
package stukk.config;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElkConfig {

    @Value("${stukk.elasticsearch.host}")
    private String host; // host:port entries, separated by commas

    @Value("${stukk.elasticsearch.username}")
    private String userName; // username

    @Value("${stukk.elasticsearch.password}")
    private String password; // password

    /**
     * Connect timeout (ms)
     */
    @Value("${stukk.elasticsearch.connectTimeout}")
    private int connectTimeout;

    /**
     * Socket timeout (ms)
     */
    @Value("${stukk.elasticsearch.socketTimeout}")
    private int socketTimeout;

    /**
     * Timeout for obtaining a connection from the pool (ms)
     */
    @Value("${stukk.elasticsearch.connectionRequestTimeout}")
    private int connectionRequestTimeout;

    /**
     * Maximum total number of connections
     */
    @Value("${stukk.elasticsearch.maxConnectNum}")
    private int maxConnectNum;

    /**
     * Maximum number of connections per route
     */
    @Value("${stukk.elasticsearch.maxConnectPerRoute}")
    private int maxConnectPerRoute;

    @Bean(destroyMethod = "close")
    public RestHighLevelClient restHighLevelClient() {
        String[] split = host.split(","); // several hosts are separated by commas
        HttpHost[] httpHosts = new HttpHost[split.length];
        for (int i = 0; i < split.length; i++) {
            // equivalent to new HttpHost("x.x.x.x", 9200, "http")
            String[] parts = split[i].split(":");
            httpHosts[i] = new HttpHost(parts[0], Integer.parseInt(parts[1]), "http");
        }
        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        // send the username and password with every request
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(userName, password));
        RestClientBuilder builder = RestClient.builder(httpHosts);
        // timeout settings
        builder.setRequestConfigCallback(requestConfigBuilder -> {
            requestConfigBuilder.setConnectTimeout(connectTimeout);
            requestConfigBuilder.setSocketTimeout(socketTimeout);
            requestConfigBuilder.setConnectionRequestTimeout(connectionRequestTimeout);
            return requestConfigBuilder;
        });
        // connection pool settings
        builder.setHttpClientConfigCallback(httpClientBuilder -> {
            httpClientBuilder.setMaxConnTotal(maxConnectNum);
            httpClientBuilder.setMaxConnPerRoute(maxConnectPerRoute);
            return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
        });
        return new RestHighLevelClient(builder);
    }
}
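The host-splitting step in the configuration class fails with an exception if an entry has surrounding spaces or no port. A slightly more forgiving parser could look like the sketch below. It uses only the JDK; the class names `HostListParser` and `Endpoint` are hypothetical, and falling back to port 9200 (the default Elasticsearch HTTP port) is an assumption of this sketch, not behavior of the original class.

```java
import java.util.ArrayList;
import java.util.List;

/** Parses a comma-separated "host:port" list such as "10.0.0.1:9200,10.0.0.2:9201". */
class HostListParser {

    /** One parsed entry of the list. */
    static final class Endpoint {
        final String host;
        final int port;

        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    static List<Endpoint> parse(String hosts) {
        List<Endpoint> result = new ArrayList<>();
        for (String entry : hosts.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate trailing commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                // Assumption of this sketch: a missing port means the default HTTP port 9200.
                result.add(new Endpoint(trimmed, 9200));
            } else {
                int port = Integer.parseInt(trimmed.substring(colon + 1));
                result.add(new Endpoint(trimmed.substring(0, colon), port));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Endpoint endpoint : parse("10.0.0.1:9200, 10.0.0.2:9201")) {
            System.out.println(endpoint.host + " on port " + endpoint.port);
        }
    }
}
```

Each parsed entry could then be turned into an HttpHost in the same way the loop in the configuration class does.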
6.2 Retrieving a document with the Java API
// Assumes a test class with an injected RestHighLevelClient field named restHighLevelClient
@Test
public void test() throws IOException, InterruptedException {
    GetRequest getRequest = new GetRequest("student", "1");

    // Optional: use FetchSourceContext to control which fields are returned
    // String[] includes = new String[]{"stu_name"};
    // String[] excludes = Strings.EMPTY_ARRAY;
    // FetchSourceContext fetchSourceContext = new FetchSourceContext(true, includes, excludes);
    // getRequest.fetchSourceContext(fetchSourceContext);

    // Synchronous execution
    GetResponse getResponse = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
    if (getResponse.isExists()) {
        System.out.println(getResponse.getSourceAsString());
        System.out.println(getResponse.getSourceAsBytes());
        System.out.println(getResponse.getSourceAsMap());
    }

    // Asynchronous execution (alternative)
    // ActionListener<GetResponse> listener = new ActionListener<GetResponse>() {
    //     @Override
    //     public void onResponse(GetResponse getResponse) {
    //         // on success
    //         System.out.println(getResponse.getSourceAsString());
    //     }
    //
    //     @Override
    //     public void onFailure(Exception e) {
    //         // on failure
    //         e.printStackTrace();
    //     }
    // };
    // restHighLevelClient.getAsync(getRequest, RequestOptions.DEFAULT, listener);
    // Thread.sleep(1000); // give the callback time to run before the test ends
}
// Output
{"stu_id":2020611100,"stu_name":"坤坤","address":"广东省广州市","createTime":"2020-10-10 12:24:53"}
[B@33a3c44a
{stu_id=2020611100, address=广东省广州市, createTime=2020-10-10 12:24:53, stu_name=坤坤}
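The commented-out asynchronous variant waits with Thread.sleep(1000), which either wastes time or is too short. A common alternative is to block on a CountDownLatch that the callback releases. The sketch below shows the pattern with JDK classes only: `Listener` is a hypothetical stand-in with the same two callbacks as ActionListener, and `fetchAsync` is a made-up replacement for getAsync so that the example runs without a cluster.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Waits for an asynchronous callback with a latch instead of a fixed sleep. */
class AsyncWaitSketch {

    /** Hypothetical stand-in with the same two callbacks as ActionListener. */
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    /** Hypothetical async call: answers on another thread, as getAsync would. */
    static void fetchAsync(String id, Listener<String> listener) {
        new Thread(() -> listener.onResponse("source of document " + id)).start();
    }

    /** Blocks until a callback fires or the timeout elapses; returns null on failure or timeout. */
    static String fetchAndWait(String id, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        fetchAsync(id, new Listener<String>() {
            @Override
            public void onResponse(String response) {
                result.set(response);
                done.countDown();
            }

            @Override
            public void onFailure(Exception e) {
                done.countDown();
            }
        });
        try {
            done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(fetchAndWait("1", 1000));
    }
}
```

With the real client, the same latch would be counted down inside the onResponse and onFailure methods of the ActionListener.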
6.2 Indexing a document with the Java API
@Test
public void TestAdd() throws IOException {
    // Build the request
    IndexRequest indexRequest = new IndexRequest("student");
    indexRequest.id("2");

    // Build the document data
    // Option 1: a JSON string
    String json = "{\n" +
            " \"stu_id\" : 2020611119,\n" +
            " \"stu_name\": \"ikun\",\n" +
            " \"address\": \"广东省深圳市\",\n" +
            " \"createTime\": \"2021-10-10 12:24:53\"\n" +
            "}";
    indexRequest.source(json, XContentType.JSON);

    // Option 2: a Map
    // Map<String,Object> jsonMap = new HashMap<>();
    // jsonMap.put("name","tomas");
    // jsonMap.put("age",2);
    // indexRequest.source(jsonMap);

    // Option 3: an XContentBuilder
    // XContentBuilder xContentBuilder = XContentFactory.jsonBuilder();
    // xContentBuilder.startObject();
    // {
    //     xContentBuilder.field("name","jerry");
    //     xContentBuilder.field("age",3);
    // }
    // xContentBuilder.endObject();
    // indexRequest.source(xContentBuilder);

    // Option 4: key-value pairs
    // indexRequest.source("name","kkkkkk",
    //         "age",12
    // );

    // Set the timeout
    indexRequest.timeout("1s");
    // Maintain the version number manually (external versioning)
    indexRequest.version(2);
    indexRequest.versionType(VersionType.EXTERNAL);

    // Synchronous execution
    IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
    // Read the result
    System.out.println(indexResponse.getIndex());
    System.out.println(indexResponse.getId());
    System.out.println(indexResponse.getResult());
}
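The test above sets the version by hand with VersionType.EXTERNAL. With external versioning a write is accepted only when the supplied version is strictly greater than the stored one, so running the same test twice with version 2 leads to a version conflict on the second run. The rule can be sketched as a small check; this is an illustration with hypothetical names, not client code:

```java
/** Sketch of the acceptance rule for external versioning. */
class ExternalVersionSketch {

    /**
     * Returns true if a write with incomingVersion would be accepted.
     * storedVersion is null when the document does not exist yet.
     */
    static boolean accepts(Long storedVersion, long incomingVersion) {
        return storedVersion == null || incomingVersion > storedVersion;
    }

    public static void main(String[] args) {
        System.out.println(accepts(null, 2));  // true: first write creates the document
        System.out.println(accepts(2L, 2));    // false: same version again is a conflict
        System.out.println(accepts(2L, 3));    // true: higher version wins
    }
}
```

In practice the external version usually comes from the system of record, for example a version column in the source database.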
6.3 Updating a document with the Java API
@Test
public void updateTest() throws IOException {
    UpdateRequest updateRequest = new UpdateRequest("student", "2");
    Map<String, Object> json = new HashMap<>();
    json.put("stu_name", "蔡徐坤");
    updateRequest.doc(json);
    UpdateResponse updateResponse = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
    if (updateResponse.getResult() == DocWriteResponse.Result.UPDATED) {
        System.out.println("Updated. New information:");
        System.out.println(updateResponse.getId());
        System.out.println(updateResponse.getIndex());
    }
}
// Output:
// Updated. New information:
// 2
// student
// Querying the document again confirms the change:
"_source": {
  "stu_id": 2020611119,
  "stu_name": "蔡徐坤",
  "address": "广东省深圳市",
  "createTime": "2021-10-10 12:24:53"
}
6.4 Deleting a document with the Java API
@Test
public void DeleteTest() throws Exception {
    DeleteRequest deleteRequest = new DeleteRequest("student", "2");
    DeleteResponse deleteResponse = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
    System.out.println(deleteResponse.getResult());
}
6.5 Bulk operations on documents with the Java API
@Test
public void BulkTest() throws IOException {
    BulkRequest bulkRequest = new BulkRequest();
    bulkRequest.add(new IndexRequest("student").id("3")
            .source(XContentType.JSON,
                    "stu_id", 2020611107,
                    "stu_name", "昆昆",
                    "address", "北京市",
                    "createTime", "2022-10-10 12:24:53"));
    bulkRequest.add(new UpdateRequest("student", "3").doc(XContentType.JSON, "stu_name", "小强"));
    BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
    for (BulkItemResponse item : bulkResponse) {
        // A failed item carries no response object, so report it and move on
        if (item.isFailed()) {
            System.out.println(item.getFailureMessage());
            continue;
        }
        DocWriteResponse response = item.getResponse();
        switch (item.getOpType()) {
            case INDEX:
            case CREATE:
                IndexResponse indexResponse = (IndexResponse) response;
                System.out.println(indexResponse.getResult());
                break;
            case UPDATE:
                UpdateResponse updateResponse = (UpdateResponse) response;
                System.out.println(updateResponse.getResult());
                break;
            case DELETE:
                DeleteResponse deleteResponse = (DeleteResponse) response;
                System.out.println(deleteResponse.getResult());
                break;
        }
    }
}
// Output
// CREATED
// UPDATED
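At the REST level, a bulk request is sent to the _bulk endpoint as newline-delimited JSON: one action line, followed by a source line where the action needs one, and every line, including the last, ends with a newline. The sketch below builds such a body for an index action and an update action like the ones in the test above. It only concatenates strings; the class and method names are hypothetical, and index names and ids are assumed to need no JSON escaping.

```java
/** Builds a newline-delimited JSON body for the _bulk endpoint. */
class BulkBodySketch {

    /** Action line plus source line for indexing one document. */
    static String indexAction(String index, String id, String sourceJson) {
        return "{\"index\":{\"_index\":\"" + index + "\",\"_id\":\"" + id + "\"}}\n"
                + sourceJson + "\n";
    }

    /** Action line plus partial-document line for updating one document. */
    static String updateAction(String index, String id, String partialDocJson) {
        return "{\"update\":{\"_index\":\"" + index + "\",\"_id\":\"" + id + "\"}}\n"
                + "{\"doc\":" + partialDocJson + "}\n";
    }

    public static void main(String[] args) {
        String body = indexAction("student", "3", "{\"stu_id\":2020611107,\"stu_name\":\"kunkun\"}")
                + updateAction("student", "3", "{\"stu_name\":\"xiaoqiang\"}");
        System.out.print(body);
    }
}
```

The resulting text is what a request such as POST /_bulk would carry as its body.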
6.6 Summary
Operation | Build the request | Execute |
Get | new GetRequest(index, id); | restHighLevelClient.get(getRequest, RequestOptions.DEFAULT); |
Index | new IndexRequest(index); | restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT); |
Update | new UpdateRequest(index, id); | restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT); |
Delete | new DeleteRequest(index, id); | restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT); |
Bulk | BulkRequest bulkRequest = new BulkRequest(); bulkRequest.add(...); | restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT); |
7. Creating an ES index with the Java API
@Test
public void CreateIndexTest() throws IOException { // index creation test
    // Create an index named index_01
    CreateIndexRequest createIndexRequest = new CreateIndexRequest("index_01");
    // Set the index settings
    createIndexRequest.settings(Settings.builder()
            .put("number_of_shards", "1")
            .put("number_of_replicas", "1")
            .build());

    // Option 1: mapping as a JSON string
    // createIndexRequest.mapping("{\n" +
    //         " \"properties\" : {\n" +
    //         " \"field1\" : { \"type\" : \"text\" },\n" +
    //         " \"field2\" : { \"type\" : \"text\" }\n" +
    //         " }\n" +
    //         " }", XContentType.JSON);

    // Option 2: mapping as nested Maps
    Map<String, Object> field1 = new HashMap<>();
    field1.put("type", "text");
    Map<String, Object> field2 = new HashMap<>();
    field2.put("type", "text");
    Map<String, Object> properties = new HashMap<>();
    properties.put("field1", field1);
    properties.put("field2", field2);
    Map<String, Object> mapping = new HashMap<>();
    mapping.put("properties", properties);
    createIndexRequest.mapping(mapping);

    CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
    System.out.println(createIndexResponse.index());
}
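Building the mapping from nested maps gets repetitive once there are more than a couple of fields. A small helper can assemble the same structure from name and type pairs. This is a sketch using only the JDK; the class and method names are hypothetical, and it covers only fields whose sole attribute is "type".

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds a mapping of the form {"properties": {field: {"type": type}, ...}}. */
class MappingBuilderSketch {

    /** Takes alternating field names and types, e.g. mapping("field1", "text", "field2", "text"). */
    static Map<String, Object> mapping(String... nameTypePairs) {
        if (nameTypePairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected alternating field names and types");
        }
        Map<String, Object> properties = new LinkedHashMap<>();
        for (int i = 0; i < nameTypePairs.length; i += 2) {
            Map<String, Object> field = new LinkedHashMap<>();
            field.put("type", nameTypePairs[i + 1]);
            properties.put(nameTypePairs[i], field);
        }
        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("properties", properties);
        return mapping;
    }

    public static void main(String[] args) {
        System.out.println(mapping("field1", "text", "field2", "text"));
        // {properties={field1={type=text}, field2={type=text}}}
    }
}
```

The returned map has the same shape as the one built by hand in the test above, so it could be passed to createIndexRequest.mapping(...) in its place.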