Elasticsearch
What is Elasticsearch?
Elasticsearch is a distributed, RESTful search and analytics engine built on Lucene.
基本概念
1. index
Noun: an index is comparable to a database in MySQL.
Verb: to index a document means to store data into Elasticsearch, like an INSERT in MySQL.
2. Type
Defined inside an index; an index can define one or more types, comparable to tables in MySQL. (Types are deprecated since ES 7.x and removed in 8.x.)
3. Inverted index
All data is tokenized on write; for each token, the index records which documents contain it, so a search for a token can look up the matching documents directly instead of scanning every document.
Recommended reading: a blog post on the principles and implementation of inverted indexes.
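The core idea can be sketched in a few lines of Java (an illustrative toy only; Lucene's real index also stores positions, frequencies, compression, and more): map each term to the set of document ids that contain it.

```java
import java.util.*;

public class InvertedIndexDemo {
    public static void main(String[] args) {
        List<String> docs = List.of("red phone case", "red shoes", "blue phone");

        // term -> sorted ids of the documents containing it
        Map<String, SortedSet<Integer>> index = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String term : docs.get(id).split(" ")) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }

        // a search for "phone" is just a map lookup, no document scan needed
        System.out.println(index.get("phone")); // [0, 2]
    }
}
```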
Installing with Docker
elasticsearch
Create the container
docker run -d --name es2 -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "ES_JAVA_OPTS=-Xms64m -Xmx128m" \
  -v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
  -v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
  -v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
  b1179d41a7b4
docker stop e53d8de4f9fa   (stop a running container first)
docker rm e53d8de4f9fa     (then remove it)
Kibana
I. Basic retrieval
1. _cat
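The _cat APIs return human-readable cluster information. A few commonly used endpoints:

```
GET /_cat/nodes     view the nodes in the cluster
GET /_cat/health    view cluster health
GET /_cat/master    view the elected master node
GET /_cat/indices   list all indices (like SHOW DATABASES in MySQL)
```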
2. Saving an index, type, and document
PUT request: an id is required. If a document with that id already exists it is overwritten; otherwise it is created.
PUT customer/external/1
{
  "name" : "hello"
}
POST request: the id is optional. Without an id, a new document with an auto-generated id is created on every request.
POST customer/external
{
  "name" : "hello"
}
Here customer is the index (the "database"),
external is the type (the "table"),
and 1 is the unique document id, like a primary key: with an id, an existing document is updated and a missing one is created; without an id, a new document is always created.
{ "name" : "hello" } is the document itself, stored as JSON.
3. GET (retrieving a document)
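For example, retrieving the document saved above:

```
GET customer/external/1
```

The response looks roughly like this (the exact metadata fields vary by ES version):

```
{
  "_index": "customer",
  "_type": "external",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": { "name": "hello" }
}
```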
4. Update
**Update with PUT:** the version (_version) is bumped on every request, even when nothing changed.
PUT customer/external/1
{
  "name" : "hello"
}
Update with PUT while adding another property:
PUT customer/external/1
{
  "name" : "hello",
  "price" : 3000
}
**Update with POST _update:** the stored document is compared first; the version only changes when the submitted values actually differ.
POST customer/external/1/_update
{
  "doc": {
    "name" : "hello"
  }
}
Update with POST _update while adding another property:
POST customer/external/1/_update
{
  "doc": {
    "name" : "hello",
    "age" : 20
  }
}
Note that PUT replaces the whole document, while _update merges the fields under "doc" into the existing document.
5. Delete
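For example, deleting the document saved earlier, or the whole index (there is no API for deleting a single type):

```
DELETE customer/external/1
DELETE customer
```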
6. Bulk API
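A minimal bulk request against the index above: the body is newline-delimited JSON, each action line followed by its optional source line, and the operations are independent of each other (one failing does not stop the rest):

```
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"Jane Doe"}
```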
II. Query DSL
1. Basic search
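A basic search over the bank sample data used below (account_number is a field of that sample dataset):

```
GET bank/_search
{
  "query": { "match_all": {} },
  "sort": [ { "account_number": "asc" } ],
  "from": 0,
  "size": 10
}
```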
2、条件查询
GET bank/_search
{
"query": {
"bool": {
//满足
"must": [
{
"match": {
"gender": "M"
}
},
{
"match": {
"address": "mill"
}
}
],
//不满足
"must_not": [
{
"match": {
"_id": "345"
}
}
]
//有没有都可以
,"should": [
{
"match": {
"age": 28
}
}
]
}
}
}
3. Filtering results
GET /bank/_search
{
  "query": {
    "bool": {
      // filter matches like must, but does not affect the relevance score
      // here: only ages between 10 and 20
      "filter": {
        "range": {
          "age": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}
4. Aggregations
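A simple example on the bank data: bucket by age and compute the overall average balance ("size": 0 suppresses the hits so only the aggregation results are returned):

```
GET bank/_search
{
  "query": { "match_all": {} },
  "aggs": {
    "ageAgg": {
      "terms": { "field": "age", "size": 10 }
    },
    "balanceAvg": {
      "avg": { "field": "balance" }
    }
  },
  "size": 0
}
```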
4-1 Sub-aggregations
GET /bank/_search
{
  "aggs": {
    "aggAvg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        // within each age bucket, group by gender
        "genderAgg": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          // average balance per gender within the age bucket
          "aggs": {
            "balanceAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        // average balance over the whole age bucket
        "ageBalanceAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
**5. Creating an index with explicit mappings**
PUT /my_index
{
"mappings": {
"properties": {
"id":{"type": "integer"},
"email":{"type": "keyword"},
"name":{"type": "text"}
}
}
}
6. Adding a new field with _mapping
PUT /my_index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false  // not indexed: the field is stored but cannot be searched
    }
  }
}
7. Nested queries (querying objects stored in an array field)
GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "catalogId": "2" } },
        { "terms": { "brandId": [ "1", "2" ] } },
        {
          "nested": {
            "path": "attrs",
            "query": {
              "bool": {
                "must": [
                  { "term": { "attrs.attrId": { "value": "2" } } }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
Note that multiple filter clauses must be wrapped in an array, and the nested query is required because attrs is a nested field: its objects are indexed as separate hidden documents.
8. Aggregating on nested fields
GET product/_search
{
  "aggs": {
    "attr_agg": {
      // enter the nested path before aggregating on its fields
      "nested": {
        "path": "attrs"
      },
      "aggs": {
        "agg_id_agg": {
          "terms": {
            "field": "attrs.attrName",
            "size": 10
          }
        }
      }
    }
  }
}
9. Highlighting
GET product/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "skuTitle": { "value": "华为" } } }
      ]
    }
  },
  // highlight settings
  "highlight": {
    "fields": { "skuTitle": {} },
    "pre_tags": "<b style='color:red'>",
    "post_tags": "</b>"
  },
  // pagination
  "from": 0,  // offset of the first hit
  "size": 1   // page size
}
III. Data migration
Note: field mappings on an existing index cannot be changed. Create a new index with the desired mappings, then migrate the data with _reindex.
POST _reindex
{
  // index to migrate from
  "source": {
    "index": "bank",
    "type": "account"
  },
  // index to migrate to
  "dest": {
    "index": "newbank"
  }
}
IV. Installing the IK analyzer
Put the IK analyzer release matching your ES version into the mapped plugins directory.
1. Enter the container to check the installation
docker exec -it <container-name> /bin/bash
2. Look at the plugins directory
If an ik folder is there, the installation succeeded.
3. Restart ES
docker restart <container-name>
V. Using the IK analyzer
Inverted index
An inverted index keeps a table that records which terms appear in the document collection; for each term it stores the ids of the documents containing it, the positions within those documents, and the occurrence counts.
Why the IK analyzer exists:
Tokenization splits a piece of Chinese (or other) text into individual keywords. At search time the query string is tokenized, the data in the index is tokenized the same way, and the two sets of tokens are matched against each other.
The default analyzer treats every Chinese character as a separate token: "中国的花" becomes "中", "国", "的", "花", which is clearly not what we want, so we install the Chinese analyzer IK to fix this.
IK provides two analyzers:
ik_smart and ik_max_word
ik_smart produces the coarsest split (fewest tokens); ik_max_word produces the finest-grained split.
POST _analyze
{
"analyzer": "ik_max_word",
"text": "我是中国人"
}
VI. Custom dictionary for the analyzer
1. Install nginx with Docker
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx
2. After copying out nginx's default config, create fenci.txt in nginx's html directory
Put the words you want recognized in it, one word per line.
3. Enable IK's remote dictionary
Path: es -> plugins -> ik -> config -> IKAnalyzer.cfg.xml
Edit IKAnalyzer.cfg.xml:
vi IKAnalyzer.cfg.xml
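The relevant part of IKAnalyzer.cfg.xml is the remote_ext_dict entry; point it at the fenci.txt served by nginx (the host address below is an example for this setup):

```
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- remote extension dictionary -->
    <entry key="remote_ext_dict">http://192.168.56.10/fenci.txt</entry>
</properties>
```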
4. Then restart ES
VII. Elasticsearch-Rest-Client
Adding the dependencies
1. Differences between the available clients:
Port 9300 is the internal TCP transport port used by the now-deprecated TransportClient; port 9200 is the HTTP port used by the low-level and high-level REST clients. The high-level REST client (elasticsearch-rest-high-level-client) is used below.
2. Add the dependency matching your ES version
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.4.2</version>
</dependency>
If importing that dependency fails, use this one instead:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
3. Spring Boot manages a default version of elasticsearch-rest-high-level-client.
If that default differs from your ES server version, you need to override it.
Declare the version in the pom:
<properties>
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
Using the API
Official docs: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_search_apis.html
import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GulimallElasticConfig {

    // shared options for every request; headers or buffer sizes can be added here
    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        // builder.addHeader("Authorization", "gulimall");
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient() {
        // point the client at the ES HTTP port (9200)
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("www.lzhbk.cn", 9200, "http")));
    }
}
1. Inserting data
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-index.html
@Test
void test() throws IOException {
    IndexRequest indexRequest = new IndexRequest("users");
    indexRequest.id("1");
    User user = new User("李四", "女", 26);
    // serialize with fastjson and send the JSON as the document body
    String json = JSON.toJSONString(user);
    indexRequest.source(json, XContentType.JSON);
    IndexResponse response = client.index(indexRequest, GulimallElasticConfig.COMMON_OPTIONS);
    System.out.println(response + " saved: " + json);
}
2. Searching with conditions
@Autowired
private RestHighLevelClient client;

@Test
void test2() throws IOException {
    // build the search request
    SearchRequest searchRequest = new SearchRequest();
    searchRequest.indices("bank");
    // specify the DSL conditions
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchQuery("address", "Lane"));
    searchRequest.source(sourceBuilder);
    System.out.println(searchRequest.toString());
    // execute the search
    SearchResponse search = client.search(searchRequest, GulimallElasticConfig.COMMON_OPTIONS);
    System.out.println(search);
}
3. Reading the hits
// execute the search
SearchResponse searchResponse = client.search(searchRequest, GulimallElasticConfig.COMMON_OPTIONS);
System.out.println("response ====> " + searchResponse);
// walk the hits
SearchHits hits = searchResponse.getHits();
SearchHit[] searchHits = hits.getHits();
for (SearchHit hit : searchHits) {
    String source = hit.getSourceAsString();
    // deserialize each hit into an object
    Account account = JSON.parseObject(source, Account.class);
}
4. Reading aggregation results
SearchResponse searchResponse = client.search(searchRequest, GulimallElasticConfig.COMMON_OPTIONS);
// get the aggregations computed for this search
Aggregations aggregations = searchResponse.getAggregations();
Terms ageAgg = aggregations.get("ageAgg");
// iterate over the buckets
for (Terms.Bucket bucket : ageAgg.getBuckets()) {
    System.out.println("age: " + bucket.getKey() + "  count: " + bucket.getDocCount());
}