About ElasticSearch

Latest recommended article published on 2023-05-27 17:15:30

weixin_44314824

Categories: Study Materials, Notes. Tags: elasticsearch, java

Article link: https://blog.csdn.net/weixin_44314824/article/details/110698086

1. ElasticSearch Overview

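ElasticSearch is a document-oriented search server. It is well suited to storing document-like data such as articles and product descriptions (the same data still lives in the MySQL database), and its main purpose is to provide search.
Its usage pattern therefore resembles that of Redis: the relevant data is first read from the database and stored in the ElasticSearch server, and later searches run directly against ElasticSearch instead of the database.
When ElasticSearch stores a document, it tokenizes the text in it. For example, for the text Hello World it stores the original Hello World and also splits it into the terms Hello and World, recording the position of each term within Hello World. This is what makes searching by keyword efficient.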
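The idea of storing terms together with their positions can be sketched in a few lines of plain Java. This is only a simplified illustration, not how ElasticSearch is actually implemented: the class name InvertedIndexSketch and its methods are made up for this example, and splitting on whitespace stands in for a real analyzer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration of a positional inverted index (hypothetical class, not an ES API).
final class InvertedIndexSketch {

    // term -> (document ID -> positions of the term inside that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new LinkedHashMap<>();

    // Splits the text on whitespace and records where each term occurs.
    void add(int docId, String text) {
        if (text.trim().isEmpty()) {
            return;
        }
        String[] terms = text.trim().split("\\s+");
        for (int position = 0; position < terms.length; position++) {
            postings.computeIfAbsent(terms[position], t -> new LinkedHashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position);
        }
    }

    // Returns the IDs of the documents that contain the term, in insertion order.
    List<Integer> search(String term) {
        return new ArrayList<>(postings.getOrDefault(term, Map.of()).keySet());
    }

    // Returns the positions of the term inside one document.
    List<Integer> positions(String term, int docId) {
        return postings.getOrDefault(term, Map.of()).getOrDefault(docId, List.of());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Hello World");
        index.add(2, "Hello ElasticSearch");
        System.out.println(index.search("Hello"));
        System.out.println(index.positions("World", 1));
    }
}
```

Looking up a keyword becomes a map lookup on the term instead of a scan over every stored text, which is the reason for tokenizing at write time.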

2. Starting ElasticSearch

Download links:

Windows: https://mirrors.huaweicloud.com/elasticsearch/7.6.2/elasticsearch-7.6.2-windows-x86_64.zip
macOS: https://mirrors.huaweicloud.com/elasticsearch/7.6.2/elasticsearch-7.6.2-darwin-x86_64.tar.gz
CentOS: https://mirrors.huaweicloud.com/elasticsearch/7.6.2/elasticsearch-7.6.2-x86_64.rpm

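For the zip and tar.gz packages, extract the archive and run the elasticsearch script in its bin folder (elasticsearch.bat on Windows). Once startup has finished, keep the window that launched it open, then open http://localhost:9200 in a browser to see ElasticSearch's status information (a JSON document).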

3. ElasticSearch Terminology

Index: roughly equivalent to a table in MySQL;
Document: roughly equivalent to a row of data in a MySQL table.

4. Using ElasticSearch through HTTP Requests

### Check the ES status
GET http://localhost:9200

### Create an index: index01
PUT http://localhost:9200/index01

### Create an index: index02
PUT http://localhost:9200/index02

### Delete an index: index01
DELETE http://localhost:9200/index01

### Delete an index: index02
DELETE http://localhost:9200/index02

### List the indices; _cat = compact and aligned text, indices = plural of index
GET http://localhost:9200/_cat/indices?v

### Add document 1 to index01; _create = create, and the trailing 1 in the URL is the document's ID in ES
PUT http://localhost:9200/index01/_create/1
Content-Type: application/json

{
  "id": 1001,
  "title": "什么是封装？",
  "content": "一直不理解这个概念。"
}

### Add document 2 to index01
PUT http://localhost:9200/index01/_create/2
Content-Type: application/json

{
  "id": 1002,
  "title": "Mybatis框架有什么作用？",
  "content": "据说可以简化持久层开发，还有没有别的作用呢？"
}

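### Delete the document with id=1 from index01; _doc = document
DELETE http://localhost:9200/index01/_doc/1

### Delete the document with id=2 from index01
DELETE http://localhost:9200/index01/_doc/2

### Delete the document with id=3 from index01
DELETE http://localhost:9200/index01/_doc/3

### Get the document with id=1 from index01
GET http://localhost:9200/index01/_doc/1

### Get the document with id=2 from index01
GET http://localhost:9200/index01/_doc/2

### Get the document with id=3 from index01
GET http://localhost:9200/index01/_doc/3

### Update the document with id=1 in index01 (typeless 7.x form; the fields under "doc" are merged into the existing document)
POST http://localhost:9200/index01/_update/1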
Content-Type: application/json

{
  "doc": {
    "title": "怎么理解封装这个概念？"
  }
}
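
The requests above are written for an HTTP client tool, but the same calls can be issued from any language. As a rough sketch, the "add document" request could be built with the HTTP client that ships with Java 11 or later. The class name CreateDocumentRequestSketch, the method buildCreateRequest and the sample JSON are made up for this example; the code only builds the request, and sending it assumes an ElasticSearch server is running at the given address.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that mirrors "PUT {baseUrl}/{index}/_create/{id}" from the requests above.
final class CreateDocumentRequestSketch {

    // Builds (but does not send) the request that creates a document with the given ID.
    static HttpRequest buildCreateRequest(String baseUrl, String index, String id, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_create/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String json = "{\"id\": 1001, \"title\": \"example title\"}";
        HttpRequest request = buildCreateRequest("http://localhost:9200", "index01", "1", json);
        System.out.println(request.method() + " " + request.uri());
        // To actually send it (requires a running ElasticSearch server):
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```

In a Spring project this low-level approach is usually unnecessary, since Spring Data Elasticsearch (section 6) wraps these calls.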

5. ElasticSearch Analyzers

### Tokenize with the standard analyzer: English
POST http://localhost:9200/_analyze
Content-Type: application/json

{
  "analyzer": "standard",
  "text": "Nice to meet you."
}

### Tokenize with the standard analyzer: Chinese
POST http://localhost:9200/_analyze
Content-Type: application/json

{
  "analyzer": "standard",
  "text": "你一天天的就知道打游戏，能不能好好学习？"
}

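The standard analyzer correctly splits an English sentence into its words, but for a Chinese sentence it can only split out each individual character and does not recognize multi-character words.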
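
The difference can be imitated in plain Java. The sketch below is only a rough approximation of what the standard analyzer does to these two kinds of input, not its real algorithm: runs of non-Chinese letters and digits become one lowercased token, each Han character becomes a token of its own, and everything else is dropped. The class name StandardTokenizerSketch is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Rough approximation of the standard analyzer's behavior (hypothetical class, not an ES API).
final class StandardTokenizerSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            if (!han && Character.isLetterOrDigit(cp)) {
                // Part of an alphabetic or numeric word: keep collecting, lowercased
                word.appendCodePoint(Character.toLowerCase(cp));
                continue;
            }
            // Any other character ends the word collected so far
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            // A Han character is emitted as a single-character token
            if (han) {
                tokens.add(new String(Character.toChars(cp)));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }
}
```

Calling tokenize on the English sentence from the request above yields one token per word, while a four-character Chinese phrase yields four single-character tokens, which is why a dictionary-based analyzer is needed for Chinese.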
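To solve Chinese tokenization, a different analyzer is needed. The commonly recommended one is ik, which can be downloaded from https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip (the plugin version should match the ElasticSearch version, 7.6.2 here).
The plugins folder under the ElasticSearch installation folder holds plugins. Create an ik folder inside plugins (the folder name is up to you), extract the files from the archive into it, and then restart ElasticSearch so that the plugin is loaded.
The IK plugin provides 2 analyzers, ik_smart (smart segmentation, coarse-grained) and ik_max_word (maximum number of words, fine-grained):

### Tokenize with ik_smart: Chinese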
POST http://localhost:9200/_analyze
Content-Type: application/json

{
  "analyzer": "ik_smart",
  "text": "你一天天的就知道打游戏，能不能好好学习？"
}

### Tokenize with ik_max_word: Chinese
POST http://localhost:9200/_analyze
Content-Type: application/json

{
  "analyzer": "ik_max_word",
  "text": "你一天天的就知道打游戏，能不能好好学习？"
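}

### Configure analyzers for index01; this must run before any documents are added, because the analyzer of an existing field cannot be changed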
POST http://localhost:9200/index01/_mapping
Content-Type: application/json

{
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "ik_smart",
      "search_analyzer": "ik_smart"
    },
    "content": {
      "type": "text",
      "analyzer": "ik_smart",
      "search_analyzer": "ik_smart"
    }
  }
}

6. Implementing Search with ElasticSearch

6.1 Configuring the Project

To use ElasticSearch from the project, add the spring-boot-starter-data-elasticsearch dependency. Reference code:

<!-- https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-starter-data-elasticsearch -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    <version>2.3.4.RELEASE</version>
</dependency>

6.2 Reading Data out of MySQL

Create the VO class and configure its analyzers:

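import java.io.Serializable;

import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Data
@Document(indexName = "question")
public class QuestionSearchVO implements Serializable {

    // Used as the ElasticSearch document ID
    @Id
    private Integer id;
    // Full-text fields, tokenized with ik_smart both when indexing and when searching
    @Field(type = FieldType.Text, analyzer = "ik_smart", searchAnalyzer = "ik_smart")
    private String title;
    @Field(type = FieldType.Text, analyzer = "ik_smart", searchAnalyzer = "ik_smart")
    private String content;
    @Field(type = FieldType.Integer)
    private Integer userId;
    // Stored as a single term, never tokenized
    @Field(type = FieldType.Keyword)
    private String userNickName;

}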

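About the annotations used above:

@Document: its indexName attribute specifies which index on the ElasticSearch server this type maps to. In later code, the ElasticSearch API finds that index automatically when handling data, and even creates the index automatically if it does not exist;
@Id: maps the property to the ElasticSearch document ID;
@Field: configures an ordinary property, and the type parameter should be specified. FieldType.Keyword means "keyword": the value is not tokenized. For example, if a user's userNickName is "天下第一", it is not split into the 2 words "天下" and "第一", which also means that searching for "天下" or "第一" will not find this nickname. FieldType.Text means "plain text", which is tokenized; analyzer specifies the analyzer used at index time and searchAnalyzer the one used at search time.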
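
The practical difference between Keyword and Text can be sketched in plain Java. This is a deliberately simplified model rather than ElasticSearch code: the class KeywordVsTextSketch and its methods are made up for this example, an English nickname is used so that whitespace splitting can stand in for a real analyzer such as ik_smart, and matching is modeled as an exact lookup of a single term.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified model of how Keyword and Text fields are indexed (hypothetical class, not an ES API).
final class KeywordVsTextSketch {

    // Keyword: the whole value is indexed as one single term.
    static Set<String> indexAsKeyword(String value) {
        return Set.of(value);
    }

    // Text: the value is split into terms; whitespace splitting stands in for a real analyzer.
    static Set<String> indexAsText(String value) {
        return new HashSet<>(Arrays.asList(value.trim().split("\\s+")));
    }

    // A term matches only if it is exactly one of the indexed terms.
    static boolean matches(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        String nickName = "number one";
        System.out.println(matches(indexAsKeyword(nickName), "number"));
        System.out.println(matches(indexAsText(nickName), "number"));
    }
}
```

With the Keyword model only the complete value "number one" matches, while with the Text model each of its words matches on its own. That is why userNickName, which should only be found by its exact value, is declared as Keyword, and title and content are declared as Text.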

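6.3 Writing "Questions" into ElasticSearch

For operating on ElasticSearch data, the framework's approach is very similar to Mybatis: you only declare abstract methods in an interface and never implement them yourself.
First create a QuestionRepository interface in the repository package, extending the ElasticsearchRepository interface: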

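import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;

// The second type argument is the ID type; it must match the @Id field of QuestionSearchVO (Integer)
@Repository
public interface QuestionRepository extends ElasticsearchRepository<QuestionSearchVO, Integer> {

    // Goal: search the title or the content by keyword
    // Notes on declaring query methods for the ES API:
    // -- the method name must follow the JPA-style naming convention; generate it from IntelliJ IDEA's completion hints where possible instead of typing it by hand;
    // -- there is only 1 search keyword but 2 or more fields are involved, so the method must declare one parameter per field involved;
    // -- results queried from ES cannot be paged with PageHelper; use Spring Data's paging instead;
    // -- with Spring Data paging, the return type must be Page<?>, whose type argument is the element type of the result list;
    // -- with Spring Data paging, a Pageable parameter must be added at the end of the parameter list to carry the paging parameters
    Page<QuestionSearchVO> queryByTitleMatchesOrContentMatches(String titleKeyword, String contentKeyword, Pageable pageable);
}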

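ElasticsearchRepository resembles the Mybatis Plus framework in that it already provides many common data access operations; once a custom interface extends it, the interface has these methods as well.
The Elasticsearch programming framework generates a proxy object for the interface above automatically.

Next, a scheduled task can periodically read the data from MySQL and write it into ES. First add the @EnableScheduling annotation before the declaration of the startup class.
Then create the LoadQuestionSchedule scheduled task class in the schedule package: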

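import com.github.pagehelper.PageInfo;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.IndexOperations;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class LoadQuestionSchedule {
    @Autowired
    IQuestionService questionService;
    @Autowired
    QuestionRepository questionRepository;
    @Autowired
    ElasticsearchOperations elasticsearchOperations;

    // Runs once per minute (fixedRate is in milliseconds)
    @Scheduled(fixedRate = 1 * 60 * 1000)
    public void writeQuestions2Elasticsearch() {
        // First delete the corresponding index on the ES server, which removes every document previously stored in it
        IndexOperations indexOps = elasticsearchOperations.indexOps(QuestionSearchVO.class);
        indexOps.delete();
        // Recreate the index with the mapping declared on QuestionSearchVO, so that the
        // ik_smart analyzers still apply instead of an automatically created default mapping
        indexOps.create();
        indexOps.putMapping(indexOps.createMapping());
        // Read the MySQL data page by page and write it into Elasticsearch
        Integer pageNum = 1;
        PageInfo<QuestionSearchVO> pageInfo = null;
        do {
            pageInfo = questionService.getQuestionsFromDatabase(pageNum);
            pageNum++;
            questionRepository.saveAll(pageInfo.getList());
        } while (pageInfo.isHasNextPage());
    }

}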
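
The page-by-page copy loop in the task above can be tried out without MySQL or ElasticSearch. In the sketch below everything is a stand-in: PageResult imitates the two PageInfo members the loop relies on, fetchPage imitates questionService.getQuestionsFromDatabase(pageNum), and the target list plays the role of the index. None of these names come from the original project.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone model of the "read page, write page, repeat while there is a next page" loop.
final class PagedCopySketch {

    // Hypothetical stand-in for PageInfo: one page of data plus a "has next page" flag.
    static final class PageResult<T> {
        final List<T> list;
        final boolean hasNextPage;

        PageResult(List<T> list, boolean hasNextPage) {
            this.list = list;
            this.hasNextPage = hasNextPage;
        }
    }

    // Hypothetical stand-in for the database query; page numbers start at 1.
    static <T> PageResult<T> fetchPage(List<T> source, int pageNum, int pageSize) {
        int from = Math.min((pageNum - 1) * pageSize, source.size());
        int to = Math.min(from + pageSize, source.size());
        return new PageResult<>(new ArrayList<>(source.subList(from, to)), to < source.size());
    }

    // Copies every element of the source, one page at a time, into a new list.
    static <T> List<T> copyAll(List<T> source, int pageSize) {
        List<T> target = new ArrayList<>();
        int pageNum = 1;
        PageResult<T> page;
        do {
            page = fetchPage(source, pageNum, pageSize);
            pageNum++;
            target.addAll(page.list);
        } while (page.hasNextPage);
        return target;
    }
}
```

Because the loop is a do-while, the first page is always fetched, even when the source is empty. Reading in pages keeps memory usage bounded by the page size rather than by the size of the whole table.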

6.4 Implementing Search with ElasticSearch

Declare the abstract "search" method in IQuestionService:

Page<QuestionSearchVO> search(String keyword, Integer pageNum);

Add a custom setting to application.properties:

# Custom setting: how many records per page when searching "questions" in ES
project.question.query-from-es.page-size=3

Then implement the method in QuestionServiceImpl:

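// Assumed declaration: the original snippet uses searchPageSize without showing it.
// It is read here from the custom setting in application.properties.
@Value("${project.question.query-from-es.page-size}")
private Integer searchPageSize;

@Override
public Page<QuestionSearchVO> search(String keyword, Integer pageNum) {
    // Make sure the page number is at least minimally valid
    if (pageNum == null || pageNum < 1) {
        pageNum = 1;
    }
    // Spring Data paging is zero-based: 0 is page 1, 1 is page 2, and so on
    pageNum--;
    // Prepare the paging parameters
    Pageable pageable = PageRequest.of(pageNum, searchPageSize);
    // Run the search and return the result
    return questionRepository.queryByTitleMatchesOrContentMatches(keyword, keyword, pageable);
}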
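
The page number handling in this method is easy to get wrong by one, so here is the same conversion isolated as a small stand-alone sketch. The class PageNumberSketch and its method names are made up for this example; firstResultOffset only illustrates which record a given page starts at and is not part of the original code.

```java
// Stand-alone version of the page number normalization used in search() (hypothetical class).
final class PageNumberSketch {

    // Converts a 1-based page number from the client into Spring Data's 0-based page index.
    // Missing or invalid values fall back to the first page.
    static int toZeroBasedPage(Integer pageNum) {
        if (pageNum == null || pageNum < 1) {
            return 0;
        }
        return pageNum - 1;
    }

    // Index of the first record on the requested page, counting from 0.
    static long firstResultOffset(Integer pageNum, int pageSize) {
        return (long) toZeroBasedPage(pageNum) * pageSize;
    }
}
```

With the page size of 3 configured above, page 1 covers records 0 to 2 and page 3 starts at record 6.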

Finally, expose the search through a controller: create a QuestionController class in the controller package to handle requests:

@RestController
@RequestMapping("/es/question")
public class QuestionController {

    @Autowired
    IQuestionService questionService;

    // http://localhost:8200/es/question/search?keyword=haha
    @GetMapping("/search")
    public Page<QuestionSearchVO> search(String keyword, Integer page) {
        if (page == null || page < 1) {
            page = 1;
        }
        return questionService.search(keyword, page);
    }

}