ElasticSearch

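1. Introduction

  • These notes follow the Kuangshen (狂神说) video course and target version 7.6.x.
    • https://www.bilibili.com/video/BV17a4y1x7zq?p=2
  • Lucene is an information-retrieval toolkit (a set of JAR files); on its own it is not a search engine system.
  • Elasticsearch wraps and extends Lucene.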

2. Getting Started

  • Requirements: JDK 1.8 or later, a client, and a UI tool
  • Keep the versions of all components matched.

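2.1 Download

Download from the official website.

On Windows, unzip the archive and it is ready to use.

Directory layout:

bin: startup scripts
config: configuration files
	log4j2: logging configuration
	jvm.options: JVM settings
	elasticsearch.yml: main configuration file (for example, the default port 9200)
lib: dependency JARs

modules: functional modules
plugins: plugins, such as the IK plugin

Start Elasticsearch, then open localhost:9200 to check that it is running.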

2.2 Installing the head UI

  • Download the es head plugin from GitHub:

    • https://github.com/mobz/elasticsearch-head
  • npm install
    npm run start   # starts the plugin at localhost:9100

  • Fix the cross-origin (CORS) problem by editing elasticsearch.yml:

    • # allow cross-origin requests
      http.cors.enabled: true
      http.cors.allow-origin: "*"

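2.3 Installing Kibana

  • ELK: a stack for log analysis
  • Note: download the Kibana version that matches your es version; the UI can be localized to Chinese in the configuration file
  • Default address: localhost:5601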

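3. Core ES Concepts

  • es is document-oriented; everything is JSON

  • Comparison

    • Relational database      Elasticsearch
      database                 index (indices)
      table                    type (deprecated, to be removed)
      row                      document
      column                   field
  • Physical design

    • Behind the scenes, each index is divided into multiple shards, and each shard can be moved between servers in the cluster.
  • Logical design

    • Document: the smallest unit of data that can be indexed and searched
      • Self-contained: it holds key:value pairs
      • Hierarchical: a document can contain nested documents (JSON objects)
    • Type: a logical container for documents
    • Index: comparable to a database
  • Inverted index

    • es uses an inverted index structure, with Lucene's inverted index as the underlying layer, to make full-text search fast.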
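
To make the inverted-index idea concrete, here is a minimal Java sketch. It is a toy model, not how Lucene is implemented: the class name TinyInvertedIndex and the whitespace-and-lowercase tokenizer are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy inverted index: maps each term to the ids of the documents that contain it. */
final class TinyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    /** Lowercases the text and splits it on whitespace (a stand-in for a real analyzer). */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Records every term of the text as occurring in the given document. */
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns the ids of the documents containing the term, in ascending order. */
    List<Integer> search(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<Integer>() : new ArrayList<>(ids);
    }
}
```

A lookup is a single map access on the term, which is why searching does not have to scan every document.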

Appendix: Commands

  1. _cat
GET /_cat/nodes      # list all nodes
GET /_cat/health     # show the health of es
GET /_cat/master     # show the master node
GET /_cat/indices    # list all indices

  2. Index a document
PUT customer/external/1

PUT requires an explicit id; use POST to have an id generated automatically.

Use GET to retrieve an indexed document.

Update a document

POST customer/external/1/_update

Delete a document and an index

DELETE customer/external/1
DELETE customer

The bulk API

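
A bulk request body is newline-delimited JSON: each index operation is an action line followed by the document source on the next line, and the body ends with a newline. The sketch below only builds such a body as a string; the class name BulkBodySketch and the sample documents are made up for illustration, and sending the body (for example with POST customer/external/_bulk) is left out.

```java
import java.util.Map;

/** Builds the newline-delimited body expected by the _bulk endpoint. */
final class BulkBodySketch {

    /**
     * Keys are document ids, values are already-serialized JSON documents.
     * Each entry becomes an action line plus a source line, both ending in '\n'.
     */
    static String indexActions(Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : jsonById.entrySet()) {
            body.append("{\"index\":{\"_id\":\"").append(e.getKey()).append("\"}}\n");
            body.append(e.getValue()).append('\n');
        }
        return body.toString();
    }
}
```

Passing a LinkedHashMap keeps the operations in insertion order; the ids are assumed not to need JSON escaping.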

Load the official test data: https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json

Advanced search

The first approach: search through URL parameters.

The second approach: search with a request body:

GET bank/_search
{
  "query": {"match_all": {}},
  "sort": [
    {
      "account_number": "asc"
    }
  ]
}

Query DSL (domain-specific query language)

match query

GET bank/_search
{
  "query": {
    "match": {
      "account_number": "20"
    }
  }
}
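When the value is a string, match performs a full-text (analyzed) match rather than an exact comparison.
Full-text search sorts hits by relevance score, and the search text is tokenized before matching.

match_phrase: phrase matching

multi_match: multi-field matching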

GET bank/_search
{
  "query": {
    "multi_match": {
      "query": "mill",
      "fields": ["address","state"]
    }
  }
}

bool: compound query

GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "gender": "F"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ]
    }
  }
}

must       the clause must match
must_not   the clause must not match
should     the clause should match
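
The way these clauses combine can be sketched with plain Java predicates. This is a simplified model that ignores scoring and leaves out filter; the class name BoolQuerySketch is hypothetical. It follows the default rule that should clauses are optional when a must clause is present, while at least one of them has to match when there is none.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Simplified model of how a bool query decides whether a document matches (no scoring). */
final class BoolQuerySketch {

    static boolean matches(Map<String, Object> doc,
                           List<Predicate<Map<String, Object>>> must,
                           List<Predicate<Map<String, Object>>> mustNot,
                           List<Predicate<Map<String, Object>>> should) {
        for (Predicate<Map<String, Object>> clause : must) {
            if (!clause.test(doc)) {
                return false;   // every must clause has to match
            }
        }
        for (Predicate<Map<String, Object>> clause : mustNot) {
            if (clause.test(doc)) {
                return false;   // no must_not clause may match
            }
        }
        if (must.isEmpty() && !should.isEmpty()) {
            // Without a must clause, at least one should clause has to match.
            for (Predicate<Map<String, Object>> clause : should) {
                if (clause.test(doc)) {
                    return true;
                }
            }
            return false;
        }
        return true;            // otherwise should only influences the score
    }
}
```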

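filter: filters results without affecting the relevance score

term: prefer term for exact-value (non-text) fields

aggregations: running aggregations

Find the age distribution and the average age of everyone whose address contains mill, without returning the matching documents themselves ("size": 0 suppresses the hits)
GET bank/_search
{
  "query": {
    "match": {
      "address": "mill"
    }
  },
  "aggs": {
    "age_agg": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "avg_age": {
      "avg": {
        "field": "age"
      }
    },
    "avg_balance": {
      "avg": {
        "field": "balance"
      }
    }
  },
  "size": 0
}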

Aggregate by age, and compute the average balance within each age bucket
(a nested aggregation)
GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "age_aggs": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "avg_bal": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
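
For readers who think in Java, the nested aggregation above does roughly what the following in-memory sketch does: group by age, then average balance inside each group. The Account class and the method name are assumptions for illustration, and real terms buckets are ordered by document count rather than by key.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** In-memory analogue of a terms aggregation on age with an avg sub-aggregation on balance. */
final class AgeBalanceAggSketch {

    /** Minimal stand-in for one document of the bank index. */
    static final class Account {
        final int age;
        final double balance;

        Account(int age, double balance) {
            this.age = age;
            this.balance = balance;
        }
    }

    /** Groups accounts by age (the buckets) and averages balance inside each bucket. */
    static Map<Integer, Double> avgBalanceByAge(List<Account> accounts) {
        return accounts.stream().collect(Collectors.groupingBy(
                (Account a) -> a.age,
                TreeMap::new,
                Collectors.averagingDouble((Account a) -> a.balance)));
    }
}
```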
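Find the full age distribution and, within each age bucket, the average balance for M, the average balance for F, and the overall average balance



4. IK Analyzer Plugin

What is the IK analyzer:

  • An analyzer splits a sentence into words
  • For Chinese text, the IK analyzer is recommended
  • It offers two analysis modes: ik_smart (fewest, coarsest tokens) and ik_max_word (finest-grained split)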

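4.1 Download and install

Download: https://github.com/medcl/elasticsearch-analysis-ik/releases

Create a folder named "ik" under Elasticsearch's plugins directory and unzip the release into it.

Restart es and watch the startup log: it reports that the ik plugin was loaded.

You can also list the installed plugins with the es plugin command (elasticsearch-plugin list).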

4.2 Testing with Kibana

[ik_smart] test (ik_smart is the coarsest-grained mode):

Input:

GET _analyze
{
  "analyzer": "ik_smart",
  "text":"我是社会主义接班人"
}

Output:

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "社会主义",
      "start_offset" : 2,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "接班人",
      "start_offset" : 6,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}

[ik_max_word] test (ik_max_word is the finest-grained mode):

Input:

GET _analyze
{
  "analyzer": "ik_max_word",
  "text":"我是社会主义接班人"
}

Output:

{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "社会主义",
      "start_offset" : 2,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "社会",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "主义",
      "start_offset" : 4,
      "end_offset" : 6,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "接班人",
      "start_offset" : 6,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "接班",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "人",
      "start_offset" : 8,
      "end_offset" : 9,
      "type" : "CN_CHAR",
      "position" : 7
    }
  ]
}

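4.3 Custom user dictionary

When special words (such as personal names) are not recognized as single tokens, you can define your own dictionary: add a .dic file under the IK plugin's config directory and register it in IKAnalyzer.cfg.xml (the ext_dict entry).

Restart es and Kibana, then test again.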

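5.1 Overview

REST is a set of architectural conventions, constraints, and principles; an architecture that follows them is called RESTful.

Operations

method    URL                                            description
PUT       localhost:9200/{index}/{type}/{id}             create a document (with the given id)
POST      localhost:9200/{index}/{type}                  create a document (random id)
POST      localhost:9200/{index}/{type}/{id}/_update     update a document
DELETE    localhost:9200/{index}/{type}/{id}             delete a document
GET       localhost:9200/{index}/{type}/{id}             get a document by id
POST      localhost:9200/{index}/{type}/_search          query all documents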

5.2 Test

  • 1. Create an index: PUT /index_name/type_name/id
  • The default type is _doc

5.3 Data types

  1. Basic data types
  • String: text, keyword
  • Numeric: long, integer, short, byte, double, float, half_float, scaled_float
  • Date: date
  • Boolean: boolean
  • Binary: binary
  2. Specifying data types

Input (create the mapping rules):

PUT /test2
{
  "mappings": {
    
    "properties": {
      "name": {
        "type": "text"
      },
      "age": {
        "type": "long"
      },
      "birthday": {
        "type": "date"
      }
    }
  }  
}

Output:

{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "test2"
}

If you do not specify a type, es assigns a default type to each field.

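5.4 Basic index operations

  • View index information:

    GET test2

  • View es information:

    GET _cat/

  • Modify a document

    1. The old way: overwrite it with PUT
    2. The current way:

    POST /test3/_doc/1/_update
    {
      "doc": {
        "name": "庞世宗"
      }
    }

  • Delete an index

    DELETE test1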

5.5 Basic document operations (important)

5.5.1 Basic operations

1. Add data

PUT /psz/user/1
{
  "name": "psz",
  "age": 22,
  "desc": "偶像派程序员",
  "tags": ["暖","帅"]
}

2. Get data

GET psz/user/1
=============== Output ===========
{
  "_index" : "psz",
  "_type" : "user",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "psz",
    "age" : 22,
    "desc" : "偶像派程序员",
    "tags" : [
      "暖",
      "帅"
    ]
  }
}

3. Update data with PUT (this overwrites the whole document)

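4. Update data; POST _update is recommended

  • Not recommended: this replaces the whole document, so every other field is lost
POST psz/user/1
{
  "doc":{
    "name": "庞庞胖"
  }
}
  • Recommended: only the listed fields change and the other fields are kept
POST psz/user/1/_update
{
  "doc":{
    "name": "庞庞胖"
  }
}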
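
The difference between the two requests can be modelled with plain maps. This is only a sketch of the semantics, with hypothetical names; a real _update also merges nested objects, which the shallow putAll below does not.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Models the difference between re-indexing a document and a partial _update. */
final class UpdateSemanticsSketch {

    /** Indexing under an existing id: the stored source becomes exactly the new body. */
    static Map<String, Object> index(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    /** _update with a "doc" object: its fields are merged into the stored source. */
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }
}
```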

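5. Simple search with GET

GET psz/user/1

A simple conditional query; the query is built from the default mapping rules:

GET wangyun/user/_search?q=name:王允

5.5.2 Complex queries

1. Query with the parameters in a JSON body

In the request below, match searches the name field (an analyzed match, not an exact one), _source restricts the returned fields to name and age, sort orders the hits by age descending, from is the paging offset (counted from 0), and size is the number of hits to return.

GET wangyun/user/_search
{
  "query": {
    "match": {
      "name": "王允"
    }
  },
  "_source": ["name","age"],
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 1
}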

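  • hits: holds the index and document information for each match
  • Later, when operating es from Java, the objects and methods correspond to the keys seen here
  • Pagination from the front end: /search/{current}/{pagesize}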
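
Mapping the front-end path to the search parameters is a small calculation. The sketch below assumes current is a 1-based page number, which the notes do not state; PageParams is a hypothetical helper name.

```java
/** Converts a 1-based page number and a page size into the from/size search parameters. */
final class PageParams {
    final int from;
    final int size;

    private PageParams(int from, int size) {
        this.from = from;
        this.size = size;
    }

    /** Page 1 starts at offset 0, page 2 at offset pageSize, and so on. */
    static PageParams of(int current, int pageSize) {
        if (current < 1 || pageSize < 1) {
            throw new IllegalArgumentException("current and pageSize must be >= 1");
        }
        return new PageParams((current - 1) * pageSize, pageSize);
    }
}
```

Keep in mind that by default from + size may not exceed 10,000 (the index.max_result_window setting), so very deep pages need a different mechanism.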

2. Boolean queries

  • must (like AND in MySQL): every condition has to match
GET psz/user/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "王允"
          }
          
        },
        {
          "match": {
            "age": 22
          }
        }
          
      ]
    }
  }
}
  • should (like OR in MySQL): at least one condition has to match
GET psz/user/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "王允"
          }
          
        },
        {
          "match": {
            "age": 22
          }
        }
          
      ]
    }
  }
}

  • must_not (like NOT in MySQL)
  • filter: the example below keeps only documents whose age is greater than 20
GET psz/user/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "name": "王允"
          }
          
        }
      ],
      "filter": [
        {
          "range": {
            "age": {
              "gt": 20
            }
          }
        }
      ]
    }
  }
}

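In a match query, separate multiple terms with spaces; a document is returned as long as it matches any one of them.

3. Exact queries

  • A term query looks up the given term directly in the inverted index, so it finds exact matches only.

About analysis:

term: looks up the exact value without analyzing it

match: runs the query text through an analyzer first

About field types:

text: the value is split by the analyzer

keyword: the value is not split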

PUT testdb/_doc/1
{
  "name":"i am a smartman",
  "desc":"execute it"
}
PUT testdb/_doc/2
{
  "name":"i am a smartman",
  "desc":"execute it"
}

GET _analyze
{
  "analyzer": "keyword",
  "text": "i am a smartman"
}
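
Why term behaves differently on text and keyword fields can be sketched in a few lines of Java. This is a rough model under stated assumptions: standardLike only imitates the standard analyzer by lowercasing and splitting on non-alphanumeric characters, and all names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Rough model of keyword versus text analysis and of what a term query compares against. */
final class AnalyzerSketch {

    /** keyword analyzer: the whole input is emitted as one token, unchanged. */
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    /** Stand-in for the standard analyzer: lowercase, then split on non-letter, non-digit runs. */
    static List<String> standardLike(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** A term query matches only if the query value equals one indexed token exactly. */
    static boolean termMatches(List<String> indexedTokens, String queryValue) {
        return indexedTokens.contains(queryValue);
    }
}
```

With "i am a smartman" indexed as text, a term query for "smartman" matches but one for the full sentence does not; indexed as keyword, it is the other way around.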

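4. Highlighting

In the highlight section, pre_tags and post_tags set the custom tags wrapped around each match, and fields lists the fields to highlight.

GET psz/user/_search
{
  "query": {
    "match": {
      "name": "王允"
    }
  },
  "_source": ["name","age"],
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ],
  "highlight": {
    "pre_tags": "<P>",
    "post_tags": "</P>",
    "fields": {
      "name":{}
    }
  }
}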

The matching parts of the results are automatically wrapped in the HTML tags.

6. Spring Boot Integration

6.1 Integrating with Spring Boot

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/index.html

1. Find the native dependency

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.6.2</version>
</dependency>

    <properties>
        <java.version>1.8</java.version>
        <elasticsearch.version>7.6.1</elasticsearch.version>
    </properties>

2. Build the client object

Initialization

A RestHighLevelClient instance needs a REST low-level client builder to be built as follows:

RestHighLevelClient client = new RestHighLevelClient(
        RestClient.builder(
                new HttpHost("localhost", 9200, "http"),
                new HttpHost("localhost", 9201, "http")));

The high-level client will internally create the low-level client used to perform requests based on the provided builder. That low-level client maintains a pool of connections and starts some threads so you should close the high-level client when you are well and truly done with it and it will in turn close the internal low-level client to free those resources. This can be done through the close:

client.close();

In the rest of this documentation about the Java High Level Client, the RestHighLevelClient instance will be referenced as client.

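3. Examine the methods in the class

The versions must match! The default es version here is 6.8.1; change it to the version installed locally.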

	<properties>
		<java.version>1.8</java.version>
		<elasticsearch.version>7.6.1</elasticsearch.version>
	</properties>

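Java configuration class

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration  // plays the role of an XML bean configuration file
public class EsConfig {

    // Registers the client as a bean named "restHighLevelClient"
    @Bean
    public RestHighLevelClient restHighLevelClient(){
        return new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http")));
    }
}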

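6.2 Index API operations

1. Create an index

import java.io.IOException;

import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.test.context.SpringBootTest;

@SpringBootTest
class EsApplicationTests {
	
	@Autowired
	@Qualifier("restHighLevelClient")
	private RestHighLevelClient restHighLevelClient;

	// Create an index
	@Test
	void testCreateIndex() throws IOException {
		// 1. Build the create-index request
		CreateIndexRequest request = new CreateIndexRequest("index_name");
		// 2. Execute it through the indices client and read the response
		CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);

		System.out.println(createIndexResponse);
	}

}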

2. Check whether an index exists

	@Test
	void testExistIndex() throws IOException {
		GetIndexRequest request = new GetIndexRequest("index_name");
		boolean exist = restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
		System.out.println(exist);
	}

3. Delete an index

	@Test
	void deleteIndex() throws IOException{
		DeleteIndexRequest request = new DeleteIndexRequest("index_name");
		AcknowledgedResponse delete = restHighLevelClient.indices().delete(request, RequestOptions.DEFAULT);
		System.out.println(delete.isAcknowledged());
	}

6.3 Document API operations

1. Add a document

	// Add a document
	@Test
	void testAddDocument() throws IOException {
		// Create the object to store
		User user = new User("psz", 22);
		IndexRequest request = new IndexRequest("ppp");
		// Equivalent to: PUT /ppp/_doc/1
		request.id("1");
		request.timeout(TimeValue.timeValueSeconds(1));
		// Put the data into the request as JSON (JSON is presumably fastjson's com.alibaba.fastjson.JSON)
		request.source(JSON.toJSONString(user), XContentType.JSON);

		// Send the request and read the response
		IndexResponse indexResponse = restHighLevelClient.index(request, RequestOptions.DEFAULT);
		System.out.println(indexResponse.toString());
		System.out.println(indexResponse.status());
	}

2. Check whether a document exists

	// Check whether the document exists: GET /ppp/_doc/1
	@Test
	void testIsExists() throws IOException {

		GetRequest getRequest = new GetRequest("ppp", "1");
		// Skip fetching _source and stored fields; only existence matters here
		getRequest.fetchSourceContext(new FetchSourceContext(false));
		getRequest.storedFields("_none_");
		boolean exists = restHighLevelClient.exists(getRequest, RequestOptions.DEFAULT);
		System.out.println(exists);
	}

3. Get a document

	// Get the document's content
	@Test
	void getDocument() throws IOException {
		GetRequest getRequest = new GetRequest("ppp", "1");
		GetResponse getResponse = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
		System.out.println(getResponse.getSourceAsString());
		System.out.println(getResponse);
	}
============== Output ==========================
{"age":22,"name":"psz"}
{"_index":"ppp","_type":"_doc","_id":"1","_version":2,"_seq_no":1,"_primary_term":1,"found":true,"_source":{"age":22,"name":"psz"}}

4. Update a document

	// Update the document
	@Test
	void updateDocument() throws IOException {

		UpdateRequest updateRequest = new UpdateRequest("ppp","1");
		updateRequest.timeout("1s");

		// Pass the new field values as JSON
		User user = new User("new name", 21);
		updateRequest.doc(JSON.toJSONString(user), XContentType.JSON);
		// Send the request and read the response
		UpdateResponse updateResponse = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
		System.out.println(updateResponse);
	}

5. Delete a document

// Delete the document
@Test
void deleteDocument() throws IOException {

   DeleteRequest deleteRequest = new DeleteRequest("ppp","1");
   deleteRequest.timeout("1s");
   DeleteResponse deleteResponse = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
   System.out.println(deleteResponse);
}

6.4 Bulk operations

  • Real projects almost always need bulk operations
	@Test
	void testBulkRequest() throws IOException{
		BulkRequest bulkRequest = new BulkRequest();
		bulkRequest.timeout("10s"); // raise the timeout for large batches

		ArrayList<User> userList = new ArrayList<>();
		userList.add(new User("psz",11));
		userList.add(new User("psz2",12));
		userList.add(new User("psz3",13));
		userList.add(new User("psz4",14));
		userList.add(new User("psz5",15));

		// Add one index operation per user, with ids 1..n
		for (int i = 0; i < userList.size(); i++) {
			bulkRequest.add(
					new IndexRequest("ppp")
					.id(""+(i+1))
					.source(JSON.toJSONString(userList.get(i)),XContentType.JSON));
		}
		// Send the request and read the response
		BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
		System.out.println(bulkResponse.hasFailures()); // false means every operation succeeded
	}

6.5 Search

	/*
		Searching:
		search request: SearchRequest
		query construction: SearchSourceBuilder
	 */
	@Test
	void testSearch() throws IOException {
		SearchRequest searchRequest = new SearchRequest("ppp");
		// Build the search conditions
		SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
		// QueryBuilders is the factory for query conditions,
		// for example an exact (term) query:
		TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("name", "psz");
		searchSourceBuilder.query(termQueryBuilder);
		// Set the search timeout
		searchSourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS));
		// Highlighting could be configured here
		//searchSourceBuilder.highlighter()

		searchRequest.source(searchSourceBuilder);
		SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
		System.out.println(JSON.toJSONString(searchResponse.getHits()));
	}