Elasticsearch

What is Elasticsearch?

image-20200527094416483

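Basic concepts

1. Index

As a verb: to index a document is to store it, roughly like an INSERT in MySQL.

As a noun: an index is roughly equivalent to a database in MySQL.

2. Type

A type lives inside an index. An index can define one or more types, and a type is roughly equivalent to a table in MySQL.

3. Inverted index
All data is tokenized when it is stored, and every resulting term is recorded in the index together with the documents that contain it.
Recommended blog post: 倒排索引原理和实现 (Inverted Index: Principles and Implementation)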

Installing with Docker

elasticsearch

image-20200527102345469

Creating an instance

image-20200527103701541

image-20200527103724240

docker run -d --name es2 -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "ES_JAVA_OPTS=-Xms64m -Xmx128m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
b1179d41a7b4

# Stop a container before removing it
docker stop e53d8de4f9fa
docker rm e53d8de4f9fa

Kibana

image-20200527150605600

I. Basic retrieval

1. _cat

image-20200527153656199

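2. Saving an index, a type, and a document

Sending a PUT request: the id is required. If a document with that id exists it is updated; otherwise it is created.

PUT customer/external/1
{
"name" : "hello"
}

Sending a POST request: the id is optional. Without an id, every request creates a new document.

POST customer/external
{
"name" : "hello"
}

customer is the index (the "database"),

external is the type (the "table"),

1 is the unique id, much like a primary key. With an id, the document is updated if it exists and created if it does not; without an id, every request creates a new document.

{"name" : "hello"} is the data, stored as JSON.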

3. GET query

image-20200527165916434

4. Updating

**Updating with PUT:** the version is incremented on every request, even when the content is unchanged

PUT customer/external/1
{
"name" : "hello"
}

Updating with PUT and adding another field:

PUT customer/external/1
{
"name" : "hello","price":3000
}

**Updating with POST:** the version changes only when the submitted values differ from the stored ones

POST customer/external/1/_update
{
  "doc":{
   "name" : "hello"
  }
}

Updating with POST and adding another field:

POST customer/external/1/_update
{
  "doc":{
   "name" : "hello","age":20
  }
}
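
The difference between the two update styles can be sketched with a toy in-memory model. This is only an illustration of the versioning behaviour described above, not how Elasticsearch is implemented; the class VersionedDoc and its methods are hypothetical names invented for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a single stored document and its version counter. */
class VersionedDoc {
    private Map<String, Object> source = new HashMap<>();
    private long version = 0;

    /** Models PUT index/type/id: replaces the whole source and always bumps the version. */
    void put(Map<String, Object> newSource) {
        source = new HashMap<>(newSource);
        version++;
    }

    /**
     * Models POST index/type/id/_update with a "doc" body: merges the given
     * fields into the source and bumps the version only if something changed.
     * Returns false when the request turned out to be a no-op.
     */
    boolean update(Map<String, Object> partialDoc) {
        Map<String, Object> merged = new HashMap<>(source);
        merged.putAll(partialDoc);
        if (merged.equals(source)) {
            return false; // nothing changed, the version stays as it was
        }
        source = merged;
        version++;
        return true;
    }

    long version() { return version; }

    Map<String, Object> source() { return source; }

    public static void main(String[] args) {
        VersionedDoc doc = new VersionedDoc();
        doc.put(Map.of("name", "hello"));
        doc.put(Map.of("name", "hello"));    // same content, version still increases
        doc.update(Map.of("name", "hello")); // unchanged value, treated as a no-op
        doc.update(Map.of("age", 20));       // new field, version increases
        System.out.println("version = " + doc.version() + ", source = " + doc.source());
    }
}
```

Note that in this model put replaces the whole document while update merges fields into it, which mirrors why the PUT example has to resend name when adding price.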

5. Deleting

image-20200527172132986

6. The bulk API

image-20200527172245620

II. Query DSL

image-20200527182631391

1. Querying

image-20200527182655401

image-20200527184204767

image-20200527184443317

2. Conditional queries

GET bank/_search
{
  "query": {
    "bool": {
      // must match
      "must": [
        {
          "match": {
            "gender": "M"
          }
        },
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      // must not match
      "must_not": [
        {
          "match": {
            "_id": "345"
          }
        }
      ],
      // optional: a match is not required here, but it raises the relevance score
      "should": [
        {
          "match": {
            "age": 28
          }
        }
      ]
    }
  }
}
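
How the three clause types combine can be sketched with plain Java predicates. This is a simplified model rather than the Elasticsearch implementation: documents are plain maps, each clause is an exact-match predicate instead of an analyzed match query, filter clauses are left out, and the class BoolQuerySketch is a hypothetical name used only here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of a bool query over documents represented as maps. */
class BoolQuerySketch {
    final List<Predicate<Map<String, Object>>> must = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> mustNot = new ArrayList<>();
    final List<Predicate<Map<String, Object>>> should = new ArrayList<>();

    /** Number of should clauses the document satisfies (a crude stand-in for the score boost). */
    long shouldHits(Map<String, Object> doc) {
        return should.stream().filter(p -> p.test(doc)).count();
    }

    /** A document matches when every must clause holds and no must_not clause holds. */
    boolean matches(Map<String, Object> doc) {
        boolean mustOk = must.stream().allMatch(p -> p.test(doc));
        boolean mustNotOk = mustNot.stream().noneMatch(p -> p.test(doc));
        // With no must clause at all, at least one should clause has to match.
        boolean shouldOk = !must.isEmpty() || should.isEmpty() || shouldHits(doc) > 0;
        return mustOk && mustNotOk && shouldOk;
    }

    public static void main(String[] args) {
        BoolQuerySketch query = new BoolQuerySketch();
        query.must.add(d -> "M".equals(d.get("gender")));
        query.mustNot.add(d -> "345".equals(d.get("_id")));
        query.should.add(d -> Integer.valueOf(28).equals(d.get("age")));

        Map<String, Object> doc = Map.of("_id", "1", "gender", "M", "age", 28);
        System.out.println(query.matches(doc) + ", should hits: " + query.shouldHits(doc));
    }
}
```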

3. Filtering results

GET /bank/_search
{
  "query": {
    "bool": {
      // only return ages between 10 and 20; filter clauses do not affect the relevance score
      "filter": {
        "range": {
          "age": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}

4. Aggregation queries

https://www.baidu.com/link?url=TPTS6O37l8Jq08x39Qg5cpfwjDTg98IW0_G2vadrTUbL_z8w-fidAWzyW7GwpHam9tZ0aXBE2lOhuoEoOuciTea520E1vah5HXs7VWdl19q&wd=&eqid=d238e4810009e506000000065ece538e

4-1 Sub-aggregations

GET /bank/_search
{
  "aggs": {
    "aggAvg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        // group each age bucket by gender
        "genderAgg": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          // average balance per gender within the age bucket
          "aggs": {
            "balanceAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        // average balance of everyone in the age bucket
        "ageBalanceAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
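
What the bucketing above computes can be sketched with Java streams over an in-memory list. This only illustrates the grouping logic (one bucket per age, then an average per bucket or per gender inside a bucket); the Account class and the sample figures used with it are made up for the sketch and are not the bank data set.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AgeBalanceAggregation {
    /** Hypothetical stand-in for a document in the bank index. */
    static class Account {
        final int age;
        final String gender;
        final double balance;

        Account(int age, String gender, double balance) {
            this.age = age;
            this.gender = gender;
            this.balance = balance;
        }
    }

    /** Like the terms aggregation on age with an avg sub-aggregation on balance. */
    static Map<Integer, Double> averageBalanceByAge(List<Account> accounts) {
        return accounts.stream().collect(Collectors.groupingBy(
                (Account a) -> a.age,
                TreeMap::new,
                Collectors.averagingDouble((Account a) -> a.balance)));
    }

    /** Like terms on age, then terms on gender, then avg on balance. */
    static Map<Integer, Map<String, Double>> averageBalanceByAgeAndGender(List<Account> accounts) {
        return accounts.stream().collect(Collectors.groupingBy(
                (Account a) -> a.age,
                TreeMap::new,
                Collectors.groupingBy(
                        (Account a) -> a.gender,
                        TreeMap::new,
                        Collectors.averagingDouble((Account a) -> a.balance))));
    }

    public static void main(String[] args) {
        List<Account> accounts = List.of(
                new Account(30, "M", 100), new Account(30, "F", 300),
                new Account(30, "M", 200), new Account(40, "F", 500));
        System.out.println(averageBalanceByAge(accounts));
        System.out.println(averageBalanceByAgeAndGender(accounts));
    }
}
```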

**5. Creating an index with mappings**

PUT /my_index
{
  "mappings": {
    "properties": {
      "id":{"type": "integer"},
      "email":{"type": "keyword"},
      "name":{"type": "text"}
    }
  }
}

6. Adding a new field with _mapping

PUT /my_index/_mapping
{
  "properties": {
    "employee-id": {
      "type": "keyword",
      "index": false // when false, the field is not indexed and cannot be searched
    }
  }
}

7. Nested queries, i.e. querying a collection-valued property of an object

GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "catalogId": "2"
          }
        },
        {
          "terms": {
            "brandId": [
              "1",
              "2"
            ]
          }
        },
        {
          "nested": {
            "path": "attrs",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "attrs.attrId": {
                        "value": "2"
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

8. Nested aggregations

GET product/_search
{
  "aggs": {
    "attr_agg": {
    // the nested path to aggregate over
      "nested": {
        "path": "attrs"
      },
      "aggs": {
        "agg_id_agg": {
          "terms": {
            "field": "attrs.attrName",
            "size": 10
          }
        }
      }
    }
  }
}

9. Highlighting

GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "skuTitle": {
              "value": "华为"
            }
          }
        }
      ]
    }
  },
  // highlight settings
  "highlight": {
    "fields": {"skuTitle":{}},
    "pre_tags": "<b style='color:red'>",
    "post_tags": "</b>"
  },
  // pagination
  "from": 0, // offset of the first hit, i.e. (page - 1) * size
  "size": 1  // hits per page
}
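
Because from is an offset rather than a page number, a page-based UI has to convert before building the request. A minimal sketch of that conversion, assuming 1-based page numbers; Paging and its method are hypothetical helper names, not part of any Elasticsearch client:

```java
class Paging {
    /** Converts a 1-based page number and a page size into the "from" offset. */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must both be at least 1");
        }
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        // Page 1 starts at offset 0; page 3 with 10 hits per page starts at offset 20.
        System.out.println(Paging.from(1, 1));
        System.out.println(Paging.from(3, 10));
    }
}
```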

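III. Data migration

The mapping of a field in an existing index cannot be changed. The only option is to create a new index with the desired mappings and migrate the data into it.

POST _reindex
{
  // the index to migrate from
  "source": {
    "index": "bank",
    "type": "account"
  },
  // the index to migrate to
  "dest": {
    "index": "newbank"
  }
}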

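IV. Installing the IK analyzer

Put the IK analyzer release that matches your Elasticsearch version into the mapped plugins directory.

1. Enter the container to check whether the installation succeeded

docker exec -it <container name> /bin/bash

2. Look at the plugins directory

If ik is listed, the installation succeeded.

3. Restart ES

docker restart <container name>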

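V. Using the IK analyzer

Inverted index

An inverted index keeps a table of the terms that occur in the document collection. For each term it records, by document id, which documents contain the term, the positions at which it occurs, and how often it occurs.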
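
The idea can be sketched in a few lines of Java. This is a toy structure for illustration only: it splits text on whitespace instead of using a real analyzer, and InvertedIndex with its methods is a hypothetical name, not an Elasticsearch or Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class InvertedIndex {
    // term -> (document id -> positions of the term inside that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    /** Splits the text on whitespace (a stand-in for a real analyzer) and records every term. */
    void add(int docId, String text) {
        String trimmed = text.toLowerCase().trim();
        if (trimmed.isEmpty()) {
            return; // nothing to index
        }
        String[] terms = trimmed.split("\\s+");
        for (int pos = 0; pos < terms.length; pos++) {
            postings.computeIfAbsent(terms[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    /** Returns document id -> positions for the term, or an empty map if the term is unknown. */
    Map<Integer, List<Integer>> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(), Map.of());
    }

    /** How many times the term occurs in the given document. */
    int frequency(String term, int docId) {
        return lookup(term).getOrDefault(docId, List.of()).size();
    }

    public static void main(String[] args) {
        InvertedIndex index = new InvertedIndex();
        index.add(1, "red apple red car");
        index.add(2, "green apple");
        System.out.println(index.lookup("apple"));
        System.out.println(index.frequency("red", 1));
    }
}
```

A search for a term then becomes a single map lookup instead of a scan over every document, which is the point of inverting the document-to-terms relationship.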

img

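Background of the IK analyzer:
Tokenization means splitting a piece of Chinese (or other) text into individual keywords. When we search, the query text is tokenized, the data in the database or index is tokenized as well, and the two sets of terms are then matched against each other.
The default analyzer treats every Chinese character as a separate term; for example "中国的花" is split into "中", "国", "的", "花". This is clearly not what we want, so we install the Chinese analyzer IK to solve the problem.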
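
The per-character behaviour described above can be imitated with a few lines of Java. This is only a rough imitation for Chinese text and not the code of the default analyzer; PerCharacterTokenizer is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

class PerCharacterTokenizer {
    /** Emits every non-whitespace character of the text as its own term. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !Character.isWhitespace(cp))
            .forEach(cp -> terms.add(new String(Character.toChars(cp))));
        return terms;
    }

    public static void main(String[] args) {
        // Each character becomes a separate term, so a word such as 中国 is lost.
        System.out.println(tokenize("中国的花"));
    }
}
```

With terms like these, a search for 中国 can only be answered by combining hits for 中 and 国 separately, which is why a dictionary-based analyzer such as IK gives better results.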

IK provides two tokenization modes:
ik_smart and ik_max_word
ik_smart produces the coarsest split (the fewest terms), while ik_max_word produces the finest-grained split.

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "我是中国人"
}

VI. Customizing the analyzer

1. Installing nginx with Docker

image-20200528104611813

docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf:/etc/nginx \
-d nginx

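2. After copying out nginx's bundled configuration, create fenci.txt in nginx's html directory

Put the words you want recognized as terms in it, one per line.

3. Enable remote dictionary access in the IK configuration

Path: es->plugins->ik->config->IKAnalyzer.cfg.xml

Edit IKAnalyzer.cfg.xml:

vi IKAnalyzer.cfg.xml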

image-20200528110755269

4. Then restart ES

VII. Elasticsearch-Rest-Client

Importing the dependency

1. Differences between the available clients:

image-20200528141404258

image-20200528141903949

2. Import the dependency that matches your ES version

<dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.4.2</version>
  </dependency>

If importing that dependency fails, use this one instead:

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>

3. Spring Boot manages a default version of elasticsearch-rest-high-level-client

If that default differs from your ES version, override it.

image-20200528160845762

image-20200528160817806

Declare the version in the pom:

  <properties>
        <elasticsearch.version>7.4.2</elasticsearch.version>
   </properties>

Using the API

Official documentation: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/_search_apis.html

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GulimallElasticConfig {

    // Request options shared by every call made through the client
    public static final RequestOptions COMMON_OPTIONS;
    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
//        builder.addHeader("Authorization", "gulimall" );
//        // 30 MB response buffer (the value must fit in an int)
//        builder.setHttpAsyncResponseConsumerFactory(
//                new HttpAsyncResponseConsumerFactory
//                        .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient() {
        // Replace the host, port, and scheme with those of your own ES node
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost("www.lzhbk.cn", 9200, "http")));
    }
}

1. Indexing a document

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-index.html

    @Test
    void test() throws IOException {
        // Index (or overwrite) the document with id 1 in the "users" index
        IndexRequest indexRequest = new IndexRequest("users");
        indexRequest.id("1");
        User user = new User("李四", "女", 26);
        // Serialize the object to a JSON string and use it as the document source
        String s = JSON.toJSONString(user);
        indexRequest.source(s, XContentType.JSON);
        IndexResponse index = client.index(indexRequest, GulimallElasticConfig.COMMON_OPTIONS);
        System.out.println(index + "+++" + user.getUserName() + "--=" + s);
    }

2. Conditional search

    @Autowired
    private RestHighLevelClient client;

    @Test
    void test2() throws IOException {
        // Create the search request
        SearchRequest searchRequest = new SearchRequest();
        // Index to search
        searchRequest.indices("bank");
        // Build the DSL search conditions
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.query(QueryBuilders.matchQuery("address", "Lane"));
        // Attach the search source to the request
        searchRequest.source(sourceBuilder);
        System.out.println(searchRequest.toString());
        // Execute the search
        SearchResponse search = client.search(searchRequest, GulimallElasticConfig.COMMON_OPTIONS);
        System.out.println(search);
    }

3. Reading the hits

        // Execute the search
        SearchResponse searchResponse = client.search(searchRequest, GulimallElasticConfig.COMMON_OPTIONS);
        System.out.println("Response ====> " + searchResponse);
        // Read the hits from the response
        SearchHits hits = searchResponse.getHits();
        SearchHit[] searchHits = hits.getHits();
        for (SearchHit hit : searchHits) {
            String source = hit.getSourceAsString();
            // Convert the JSON source into an object
            Accout account = JSON.parseObject(source, Accout.class);
        }

4. Reading the aggregation results

        SearchResponse searchResponse = client.search(searchRequest, GulimallElasticConfig.COMMON_OPTIONS);

        // Read the aggregation results of this search
        Aggregations aggregations = searchResponse.getAggregations();
        Terms ageAgg1 = aggregations.get("ageAgg");
        // Iterate over the buckets
        for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
            System.out.println("Age: " + bucket.getKey() + ", count: " + bucket.getDocCount());
        }