Getting Started with Elasticsearch

docker run -d \
    --name es \
    -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
    -e "discovery.type=single-node" \
    -v es-data:/usr/share/elasticsearch/data \
    -v es-plugins:/usr/share/elasticsearch/plugins \
    --privileged \
    --network es-net \
    -p 9200:9200 \
    -p 9300:9300 \
    elasticsearch:7.12.1

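Command options explained:

-e "cluster.name=es-docker-cluster": sets the cluster name (optional; not used in the command above)

-e "http.host=0.0.0.0": the address to listen on, so the node can be reached from outside (optional; not used in the command above)

-e "ES_JAVA_OPTS=-Xms512m -Xmx512m": JVM heap size (initial and maximum both 512 MB)

-e "discovery.type=single-node": run as a single node instead of a cluster

-v es-data:/usr/share/elasticsearch/data: mounts a named volume on the ES data directory

-v es-logs:/usr/share/elasticsearch/logs: mounts a named volume on the ES log directory (optional; not used in the command above)

-v es-plugins:/usr/share/elasticsearch/plugins: mounts a named volume on the ES plugin directory

--privileged: runs the container with extended privileges, used here so that it can access the mounted volumes

--network es-net: joins a network named es-net. The network must already exist; create it first with docker network create es-net

-p 9200:9200: port mapping for the HTTP API (-p 9300:9300 maps the port used for communication between nodes)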

2. Deploying Kibana

Kibana gives us a visual interface for Elasticsearch, which makes it easier to learn.

1. Deployment
Run the following docker command to deploy Kibana:

docker run -d \
--name kibana \
-e ELASTICSEARCH_HOSTS=http://es:9200 \
--network=es-net \
-p 5601:5601  \
kibana:7.12.1

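--network es-net: joins the network named es-net, the same network as Elasticsearch

-e ELASTICSEARCH_HOSTS=http://es:9200: sets the address of Elasticsearch. Because Kibana is on the same network as Elasticsearch, it can reach Elasticsearch directly by container name

-p 5601:5601: port mapping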

3. DevTools

Kibana provides a DevTools console.

In this console you can write DSL to operate on Elasticsearch, with auto-completion for DSL statements.

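4. Installing the IK analyzer

1. Online installation

# Enter the container (the container was named es in the docker run command above)
docker exec -it es /bin/bash

# Download and install the plugin
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.12.1/elasticsearch-analysis-ik-7.12.1.zip

# Exit the container
exit
# Restart the container
docker restart es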

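2. Offline installation

Locate the volume directory
To install a plugin we need to know where the Elasticsearch plugins directory is. Because we mounted it as a named volume, inspect the volume with the following command:

docker volume inspect es-plugins

Output: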

[
    {
        "CreatedAt": "2022-05-06T10:06:34+08:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/es-plugins/_data",
        "Name": "es-plugins",
        "Options": null,
        "Scope": "local"
    }
]

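This shows that the plugins directory is mounted at `/var/lib/docker/volumes/es-plugins/_data`. Unzip the downloaded IK analyzer package into a folder under this directory (for example, one named ik).

Restart the container

docker restart es

Test
The IK analyzer has two modes:
`ik_smart`: coarsest segmentation (fewest tokens)
`ik_max_word`: finest segmentation (most tokens)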
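
To compare the two modes, the same text can be sent to the `_analyze` API once per analyzer. The following is a minimal sketch using only the Java standard library. It assumes Elasticsearch is reachable at http://localhost:9200 (adjust to your own host); the class and method names are hypothetical, and the JSON escaping is deliberately simplistic.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class IkAnalyzeSketch {

    // Builds the JSON body for POST /_analyze. Only backslashes and double
    // quotes are escaped, which is enough for plain sample text.
    static String analyzeBody(String analyzer, String text) {
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"analyzer\":\"" + analyzer + "\",\"text\":\"" + escaped + "\"}";
    }

    // Sends the body to <baseUrl>/_analyze and returns the raw JSON response.
    static String analyze(String baseUrl, String analyzer, String text) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_analyze").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(analyzeBody(analyzer, text).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        String baseUrl = "http://localhost:9200"; // assumed address
        String text = "中华人民共和国国歌";
        for (String analyzer : new String[] {"ik_smart", "ik_max_word"}) {
            try {
                System.out.println(analyzer + " -> " + analyze(baseUrl, analyzer, text));
            } catch (IOException e) {
                System.out.println(analyzer + " -> request failed: " + e.getMessage());
            }
        }
    }
}
```

The same body can be pasted into DevTools under a POST /_analyze line, which is usually the quicker way to experiment.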

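5. Extending the IK dictionary and adding stop words

To extend the IK analyzer's dictionary, edit the IkAnalyzer.cfg.xml file in the config directory inside the IK analyzer's directory:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
	<comment>IK Analyzer extension configuration</comment>
	<!-- Configure your own extension dictionary here -->
	<entry key="ext_dict">ext.dic</entry>
	<!-- Configure your own extension stop-word dictionary here -->
	<entry key="ext_stopwords">stopword.dic</entry>
	<!-- Configure a remote extension dictionary here -->
	<!-- <entry key="remote_ext_dict">words_location</entry> -->
	<!-- Configure a remote extension stop-word dictionary here -->
	<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

Then add the words you want to recognize to the file named ext.dic.
Add the words you want to exclude to the file named stopword.dic.
Save both files in UTF-8 encoding.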

6. What is the analyzer for?

1. Tokenizing documents when the inverted index is built
2. Tokenizing the user's input at search time
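
Both roles can be illustrated with a toy inverted index. This is only a sketch for intuition, not how Elasticsearch or IK is implemented: the analyzer here is a hypothetical stand-in that lower-cases the text and splits on non-alphanumeric characters, and all names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class InvertedIndexSketch {

    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Stand-in analyzer: lower-case, then split on non-alphanumeric characters.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Role 1: tokenize the document when it is added to the index.
    void add(int docId, String text) {
        for (String term : analyze(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Role 2: tokenize the query with the same analyzer, then collect every
    // document that contains at least one of the query terms.
    Set<Integer> search(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : analyze(query)) {
            hits.addAll(postings.getOrDefault(term, Collections.emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Xiaomi phone");
        index.add(2, "Huawei phone");
        index.add(3, "Xiaomi wristband");
        System.out.println(index.search("xiaomi phone")); // [1, 2, 3]
        System.out.println(index.search("Huawei"));       // [2]
    }
}
```

Because the query goes through the same analyzer as the documents, "Huawei" still matches the stored term "huawei".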

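4. Index operations

1. Mapping properties

A mapping is the set of constraints on the documents in an index. Common mapping properties include:

•type: the field data type. Common simple types are:
        •string: text (text that is analyzed), keyword (exact values such as brand, country, or IP address)
        •numeric: long, integer, short, byte, double, float
        •boolean: boolean
        •date: date
        •object: object
•index: whether to index the field, default true
•analyzer: which analyzer to use
•properties: the sub-fields of this field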

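2. Creating an index

In ES, indices and documents are operated on through RESTful requests whose bodies are written in DSL. The DSL syntax for creating an index together with its mapping is:

PUT /index_name
{
  "mappings": {
    "properties": {
      "field_name":{
        "type": "text",
        "analyzer": "ik_smart"
      },
      "field_name2":{
        "type": "keyword",
        "index": false
      },
      "field_name3":{
        "properties": {
          "sub_field": {
            "type": "keyword"
          }
        }
      }
      // ... remaining fields omitted
    }
  }
}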

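3. Viewing and deleting an index

GET /index_name

DELETE /index_name

4. Modifying an index

Once an index and its mapping have been created, existing fields cannot be changed, but new fields can be added with the following syntax:

PUT /index_name/_mapping
{
  "properties": {
    "new_field_name":{
      "type": "integer"
    }
  }
}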

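5. Document operations

1. The DSL syntax for adding a document is:

POST /index_name/_doc/document_id
{
    "field1": "value1",
    "field2": "value2",
    "field3": {
        "sub_property1": "value3",
        "sub_property2": "value4"
    }
    // ...
}

2. Viewing and deleting a document

GET /index_name/_doc/document_id

DELETE /index_name/_doc/document_id

3. Modifying a document

Method 1: full update. The old document is deleted and the new one is added (if no document with that id exists, it is simply created)

PUT /index_name/_doc/document_id
{
    "field1": "value1",
    "field2": "value2"
    // ... omitted
}

Method 2: partial update. Only the specified fields are changed

POST /index_name/_update/document_id
{
    "doc": {
         "field_name": "new value"
    }
}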
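
The difference between the two methods can be modelled with plain maps. This is a rough sketch of the semantics only, not of what Elasticsearch does internally; the class and method names are hypothetical, and the merge is shallow, whereas Elasticsearch also merges nested objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UpdateSemanticsSketch {

    // Models PUT /index_name/_doc/id: the stored document is discarded and
    // the request body becomes the new document.
    static Map<String, Object> fullUpdate(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    // Models POST /index_name/_update/id with a "doc" section: fields in
    // "doc" overwrite or extend the stored document, other fields are kept.
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("name", "Hotel A");
        stored.put("price", 300);

        Map<String, Object> change = new LinkedHashMap<>();
        change.put("price", 888);

        System.out.println(fullUpdate(stored, change));    // {price=888}
        System.out.println(partialUpdate(stored, change)); // {name=Hotel A, price=888}
    }
}
```

With a full update, any field missing from the request body is lost, which is why the partial form is usually the safer choice for changing one or two fields.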

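Operating on indices and documents with RestClient

1. Using JavaRestClient to create and delete an index and check whether it exists

Using the hotel data provided in the course materials, create an index named hotel whose mapping properties follow the database table structure.

The basic steps are:

1. Import the Demo project from the course materials

2. Analyze the data structure and define the mapping properties

3. Initialize JavaRestClient

4. Create the index with JavaRestClient

5. Delete the index with JavaRestClient

6. Check whether the index exists with JavaRestClient

1. Import the course materials

2. Analyze the data structure

Questions to consider for the mapping:
the field name, the data type, whether the field takes part in search, whether it is analyzed, and if so, which analyzer to use.

ES supports two geographic data types:
•geo_point: a point defined by a latitude and a longitude, for example "32.8752345, 120.2981576"
•geo_shape: a complex geometric shape made up of multiple geo_point values, for example a line, "LINESTRING (-77.03653 38.897676, -77.009051 38.889939)"

Field copying: the copy_to property copies the value of the current field into the specified field, so that several fields can be searched through one combined field.

# Mapping for the hotel index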
PUT /hotel
{
  "mappings": {
    "properties": {
      "id":{
        "type": "keyword"
      },
      "name":{
        "type": "text",
        "analyzer": "ik_max_word",
        "copy_to": "all"
      },
      "address":{
        "type": "keyword",
        "index": false
      },
      "price":{
        "type": "integer"
      },
      "score":{
        "type": "integer"
      },
      "brand":{
        "type": "keyword",
        "copy_to": "all"
      },
      "city":{
        "type": "keyword"
      },
      "starName":{
        "type": "keyword"
      },
      "business":{
        "type": "keyword",
        "copy_to": "all"
      },
      "location":{
        "type": "geo_point"
      },
      "pic":{
        "type": "keyword",
        "index": false
      },
      "all":{
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}

3. Initialize JavaRestClient

1. Add the RestHighLevelClient dependency for ES:

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
        </dependency>

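2. Spring Boot's dependency management sets its own default ES version (7.6.2 for the Spring Boot version used here), so we need to override it to match the server version: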

    <properties>
        <java.version>1.8</java.version>
        <elasticsearch.version>7.12.1</elasticsearch.version>
    </properties>

3. Initialize RestHighLevelClient:

    @BeforeEach
    void setUp() {
        this.client=new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.64.131:9200")
        ));
    }
    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }

4. Create the index

    @Test
    void createHotelIndex() throws IOException {
        //1. Create the Request object
        CreateIndexRequest request = new CreateIndexRequest("hotel");
        //2. Prepare the request body: the DSL statement
        request.source(MAPPING_TEMPLATE, XContentType.JSON);
        //3. Send the request
        client.indices().create(request, RequestOptions.DEFAULT);
    }

package constants;

public class HotelConstants {
    public static final String MAPPING_TEMPLATE="{\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"id\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"name\":{\n" +
            "        \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"address\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"index\": false\n" +
            "      },\n" +
            "      \"price\":{\n" +
            "        \"type\": \"integer\"\n" +
            "      },\n" +
            "      \"score\":{\n" +
            "        \"type\": \"integer\"\n" +
            "      },\n" +
            "      \"brand\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"city\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"starName\":{\n" +
            "        \"type\": \"keyword\"\n" +
            "      },\n" +
            "      \"business\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"copy_to\": \"all\"\n" +
            "      },\n" +
            "      \"location\":{\n" +
            "        \"type\": \"geo_point\"\n" +
            "      },\n" +
            "      \"pic\":{\n" +
            "        \"type\": \"keyword\",\n" +
            "        \"index\": false\n" +
            "      },\n" +
            "      \"all\":{\n" +
            "        \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\"\n" +
            "      }\n" +
            "    }\n" +
            "  }\n" +
            "}";
}

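Run GET /hotel in DevTools to verify that the index was created.

2. Using JavaRestClient for document CRUD

Query hotel data from the database and import it into the hotel index, implementing CRUD for hotel data.

The basic steps are:

1. Initialize JavaRestClient

2. Add hotel data with JavaRestClient

3. Query hotel data by id with JavaRestClient

4. Delete hotel data with JavaRestClient

5. Update hotel data with JavaRestClient

1. Initialize JavaRestClient

Create a new test class for the document operations and initialize JavaRestClient in it: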

public class HotelDocumerTest {

    private RestHighLevelClient client;

    @BeforeEach
    void setUp() {
        this.client=new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.64.131:9200")
        ));
    }
    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}

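2. Add hotel data to the index

First query the hotel record from the database, then index it into Elasticsearch, which builds the inverted index for it: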

package cn.itcast.hotel;

import cn.itcast.hotel.pojo.Hotel;
import cn.itcast.hotel.pojo.HotelDoc;
import cn.itcast.hotel.service.IHotelService;
import com.alibaba.fastjson.JSON;
import org.apache.http.HttpHost;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.xcontent.XContentType;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.io.IOException;

import static constants.HotelConstants.MAPPING_TEMPLATE;
@SpringBootTest
public class HotelDocumerTest {
    @Autowired
    private IHotelService HotelService;

    private RestHighLevelClient client;

    @Test
    void testAddDocument() throws IOException {
        // Query the hotel record by id
        Hotel hotel = HotelService.getById(61083L);
        // Convert it to the document type
        HotelDoc hotelDoc = new HotelDoc(hotel);
        //1. Prepare the Request object
        IndexRequest request = new IndexRequest("hotel").id(hotel.getId().toString());
        //2. Prepare the JSON document
        request.source(JSON.toJSONString(hotelDoc),XContentType.JSON);
        //3. Send the request
        client.index(request,RequestOptions.DEFAULT);
    }

    @BeforeEach
    void setUp() {
        this.client=new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://192.168.64.131:9200")
        ));
    }
    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}

3. Query hotel data by id

The document returned by an id lookup is JSON and has to be deserialized into a Java object:

    @Test
    void testGetDocumentById() throws IOException {
        //1. Prepare the Request
        GetRequest request = new GetRequest("hotel", "61083");
        //2. Send the request and get the response
        GetResponse response = client.get(request, RequestOptions.DEFAULT);
        //3. Parse the response
        String json = response.getSourceAsString();
        HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
        System.out.println(hotelDoc);
    }

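4. Update hotel data by id

There are two ways to modify a document:

Method 1: full update. Writing a document with the same id again deletes the old document and adds the new one
Method 2: partial update. Only some fields are updated. We demonstrate method 2

    @Test
    void testUpdateDocument() throws IOException {
        //1. Prepare the Request
        UpdateRequest request = new UpdateRequest("hotel", "61083");
        //2. Prepare the request parameters as alternating field names and values
        request.doc(
                "price","888",
                "starName","刘钻"
        );
        //3. Send the request
        client.update(request, RequestOptions.DEFAULT);
    }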

5. Delete a document by id

    @Test
    void testDeleteDocument() throws IOException {
        //1. Prepare the Request
        DeleteRequest request = new DeleteRequest("hotel", "61083");
        //2. Send the request
        client.delete(request, RequestOptions.DEFAULT);
    }

6. Bulk-importing hotel data into ES with JavaRestClient

Requirement: query the hotel data in bulk, then import it into the index in bulk

Approach:

1. Query the hotel data with MyBatis-Plus

2. Convert the queried hotel data (Hotel) into the document type (HotelDoc)

3. Use the Bulk API in JavaRestClient to add the documents in one batch. Sample code:

    @Test
    void testBulkRequest() throws IOException {
        // Query all hotel records
        List<Hotel> hotels = HotelService.list();
        //1. Create the Request
        BulkRequest request = new BulkRequest();
        //2. Prepare the parameters: add one index request per hotel
        for (Hotel hotel : hotels) {
            // Convert to the document type HotelDoc
            HotelDoc hotelDoc = new HotelDoc(hotel);
            // Create the Request object that adds this document
            request.add(new IndexRequest("hotel")
                    .id(hotelDoc.getId().toString())
                    .source(JSON.toJSONString(hotelDoc), XContentType.JSON));
        }
        //3. Send the request
        client.bulk(request, RequestOptions.DEFAULT);
    }
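
For intuition about what a bulk request carries over HTTP, the `_bulk` endpoint accepts newline-delimited JSON: one action line followed by one document line per index operation, with a trailing newline at the end. The sketch below builds such a body with the standard library only. The class name, method name, and sample documents are hypothetical, and ids are inserted without JSON escaping, so it is only suitable for simple ids.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BulkBodySketch {

    // Builds a newline-delimited _bulk body from a map of id -> JSON source.
    // Each document contributes two lines: the action line and the source.
    static String bulkBody(String index, Map<String, String> jsonById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : jsonById.entrySet()) {
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"name\":\"Hotel A\",\"price\":300}"); // sample data
        docs.put("2", "{\"name\":\"Hotel B\",\"price\":450}"); // sample data
        System.out.print(bulkBody("hotel", docs));
    }
}
```

Pasted under a POST /_bulk line in DevTools, the printed body has the same effect as adding two IndexRequest objects to a BulkRequest.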
