Spring Data Elasticsearch - a hands-on walkthrough of the distributed search and analytics engine

Author: 胖蝶的程序猿生活

Original article: https://blog.csdn.net/BOTHOTHJX/article/details/120534618

ElasticSearch

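The cluster is named es-cluster. We start Elasticsearch with Docker and run three nodes.

Preparing the server

Clone docker-base:es
Set the virtual machine's memory to 2 GB or more

Right-click the es virtual machine -> Settings and set the memory to 2 GB (3 GB if the host has 16 GB of RAM)

Change the kernel parameter below; the command can be pasted directly into the VMware console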

echo 'vm.max_map_count=262144' >>/etc/sysctl.conf

Reboot the server:
```
shutdown -r now
```
Set the IP address:
```
./ip-static
ip:192.168.64.181
```

Deploying the Elasticsearch cluster

Upload these files to /root/
- the pditems folder
- elasticsearch-analysis-ik-7.9.3.zip
- es-img.gz

Load the image:

docker load -i es-img.gz

Or pull the Elasticsearch image instead:

docker pull elasticsearch:7.9.3

Check the max_map_count kernel parameter
```
cat /etc/sysctl.conf
```
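The last line should read vm.max_map_count=262144. The value only takes effect after the server has been rebooted (the reboot step above).

Preparing the virtual network and mount directories

# Create the virtual network
docker network create es-net

# Mount directories for node1
mkdir -p -m 777 /var/lib/es/node1/plugins
mkdir -p -m 777 /var/lib/es/node1/data

# Mount directories for node2
mkdir -p -m 777 /var/lib/es/node2/plugins
mkdir -p -m 777 /var/lib/es/node2/data

# Mount directories for node3
mkdir -p -m 777 /var/lib/es/node3/plugins
mkdir -p -m 777 /var/lib/es/node3/data

Starting the Elasticsearch cluster

node1: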

docker run -d \
  --name=node1 \
  --restart=always \
  --net es-net \
  -p 9200:9200 \
  -p 9300:9300 \
  -v /var/lib/es/node1/plugins:/usr/share/elasticsearch/plugins \
  -v /var/lib/es/node1/data:/usr/share/elasticsearch/data \
  -e node.name=node1 \
  -e node.master=true \
  -e network.host=node1 \
  -e discovery.seed_hosts=node1,node2,node3 \
  -e cluster.initial_master_nodes=node1 \
  -e cluster.name=es-cluster \
  -e "ES_JAVA_OPTS=-Xms256m -Xmx256m" \
  elasticsearch:7.9.3

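Parameter notes:

Port 9200: the HTTP port that clients use to talk to a node.

Port 9300: the transport port that the cluster nodes use to talk to each other.

-v: mounts the two directories created above.

-e: sets environment variables:

node.name: the node's unique name within the cluster
node.master: whether the node is eligible to be elected master
network.host: the address of the current node
discovery.seed_hosts: addresses of the nodes to contact during discovery
cluster.initial_master_nodes: names of the master-eligible nodes used to bootstrap a brand-new cluster
cluster.name: the cluster name
ES_JAVA_OPTS: JVM options

For other settings see: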

https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html

node2:

docker run -d \
  --name=node2 \
  --restart=always \
  --net es-net \
  -p 9201:9200 \
  -p 9301:9300 \
  -v /var/lib/es/node2/plugins:/usr/share/elasticsearch/plugins \
  -v /var/lib/es/node2/data:/usr/share/elasticsearch/data \
  -e node.name=node2 \
  -e node.master=true \
  -e network.host=node2 \
  -e discovery.seed_hosts=node1,node2,node3 \
  -e cluster.initial_master_nodes=node1 \
  -e cluster.name=es-cluster \
  -e "ES_JAVA_OPTS=-Xms256m -Xmx256m" \
  elasticsearch:7.9.3

node3:

docker run -d \
  --name=node3 \
  --restart=always \
  --net es-net \
  -p 9202:9200 \
  -p 9302:9300 \
  -v /var/lib/es/node3/plugins:/usr/share/elasticsearch/plugins \
  -v /var/lib/es/node3/data:/usr/share/elasticsearch/data \
  -e node.name=node3 \
  -e node.master=true \
  -e network.host=node3 \
  -e discovery.seed_hosts=node1,node2,node3 \
  -e cluster.initial_master_nodes=node1 \
  -e cluster.name=es-cluster \
  -e "ES_JAVA_OPTS=-Xms256m -Xmx256m" \
  elasticsearch:7.9.3

Start the nodes and test access

http://192.168.64.181:9200

http://192.168.64.181:9200/_cat/nodes
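
The same check can be made from code. The sketch below is only an illustration using the JDK's built-in HTTP client (Java 11 or later); the class and method names are made up for this example, and the host and port are the ones used in this walkthrough.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch (Java 11+): fetch the node list that the browser URL above returns.
// CatNodes, catNodesUri and fetch are hypothetical names for this example.
class CatNodes {
    /** Builds the _cat/nodes URL; ?v asks the cat API to include a header row. */
    static URI catNodesUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/_cat/nodes?v");
    }

    /** Sends a GET request and returns the plain-text response body. */
    static String fetch(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

Calling CatNodes.fetch(CatNodes.catNodesUri("192.168.64.181", 9200)) against a healthy cluster should return one line per node; it throws if the server cannot be reached.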

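Chrome extension: elasticsearch-head

The elasticsearch-head project provides an intuitive UI for inspecting the cluster, its shards, and its data. The simplest way to install elasticsearch-head is as a Chrome extension.

Installation steps:

Download the Chrome extension from the elasticsearch-head repository
https://github.com/mobz/elasticsearch-head/raw/master/crx/es-head.crx
In Chrome, choose "More tools" > "Extensions"
On the Extensions page, make sure "Developer mode" is switched on
Click "Load unpacked" and select the directory that holds the unpacked extension
Click the elasticsearch-head icon in the browser to open the head UI and connect to http://192.168.64.181:9200/
If the installation fails or the directory cannot be loaded, install the extension directly from the Chrome Web Store (this may require a proxy).

(screenshot)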

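Elasticsearch - IK Chinese analyzer

Installing the IK analyzer

Download the IK analyzer package from its project repository; the version must match the Elasticsearch version:
https://github.com/medcl/elasticsearch-analysis-ik

The Elasticsearch version used here is 7.9.3.

Alternatively, use the Gitee mirror:
https://gitee.com/mirrors/elasticsearch-analysis-ik

Download elasticsearch-analysis-ik-7.9.3.zip and copy it to /root/.

(There is no need to download it again here: it was already uploaded while preparing the server.)

Install the IK analyzer on all three nodes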

cd ~/

# Copy the IK analyzer into the three es containers
docker cp elasticsearch-analysis-ik-7.9.3.zip node1:/root/
docker cp elasticsearch-analysis-ik-7.9.3.zip node2:/root/
docker cp elasticsearch-analysis-ik-7.9.3.zip node3:/root/

# Install the IK analyzer in node1
docker exec -it node1 elasticsearch-plugin install file:///root/elasticsearch-analysis-ik-7.9.3.zip

# Install the IK analyzer in node2
docker exec -it node2 elasticsearch-plugin install file:///root/elasticsearch-analysis-ik-7.9.3.zip

# Install the IK analyzer in node3
docker exec -it node3 elasticsearch-plugin install file:///root/elasticsearch-analysis-ik-7.9.3.zip

# Restart the three es containers
docker restart node1 node2 node3

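Note: while installing the IK analyzer, do not press Enter right away. The installer first unpacks the archive and then asks whether to continue (y/N); type y and press Enter.

Checking the installation

Open http://192.168.64.181:9200/_cat/plugins in a browser.

The page should list the IK analyzer (version 7.9.3) for node1 through node3.

If the plugin does not work, uninstall it and install it again: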

docker exec -it node1 elasticsearch-plugin remove analysis-ik

docker exec -it node2 elasticsearch-plugin remove analysis-ik

docker exec -it node3 elasticsearch-plugin remove analysis-ik

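Testing IK tokenization

IK provides two analyzers: ik_max_word and ik_smart

ik_max_word: splits the text at the finest granularity. For example, "中华人民共和国国歌" is split into "中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国国, 国歌", exhausting the possible combinations. Suited to Term Query.

ik_smart: splits the text at the coarsest granularity. For example, "中华人民共和国国歌" is split into "中华人民共和国, 国歌". Suited to Phrase queries.

ik_max_word test

Run the following test in head:

Click the "复合查询" (Any Request) tab [+]

Send a POST request to http://192.168.64.181:9200/_analyze with the following JSON in the request body: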

{
  "analyzer": "ik_max_word",
  "text": "中华人民共和国国歌"
}

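Submit the request; the pane on the right lists the individual tokens (10 in the author's run).

(screenshot)

ik_smart test

Run the following test in head:

Click the "复合查询" (Any Request) tab [+]

Send a POST request to http://192.168.64.181:9200/_analyze with the following JSON in the request body: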

{
  "analyzer":"ik_smart",
  "text":"中华人民共和国国歌"
}

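Submit the request; this time the pane on the right shows only 2 tokens.

(screenshot)

Elasticsearch - Working with ES through Kibana

Pulling the Kibana image

docker pull kibana:7.9.3

Skip this step if the image was already loaded earlier.

Starting the Kibana container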

docker run \
-d \
--name kibana \
--net es-net \
-p 5601:5601 \
-e ELASTICSEARCH_HOSTS='["http://node1:9200","http://node2:9200","http://node3:9200"]' \
--restart=always \
kibana:7.9.3

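Once it is running, open Kibana in a browser and go to Dev Tools

http://192.168.64.181:5601/

On first launch Kibana asks whether you want to try more of its components; choose the option on the right to skip them, otherwise it keeps downloading extra resources.

(screenshot)

If Kibana misbehaves, open the ES-head UI, click Refresh on the Overview tab, delete every index whose name starts with .kibana, and restart Kibana.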

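Indices

Storing a large amount of data in a single index degrades performance; splitting the data into shards addresses this.

With one shard of the index on each node, the data is spread over the shards of several nodes, and each shard holds less data, which improves I/O performance.

Each shard is a self-contained index. The data is distributed across the shards, so each shard stores a different subset of the data. A search queries the shards in parallel and merges the results.

If a node goes down and its shard becomes unavailable, part of the data can no longer be searched.

Creating replicas of each shard solves this problem.

Index replicas

With several replicas per shard, the replica shards on the other nodes keep working even if one node goes down, so the data stays available.

How shards work:

Data in a primary shard is copied to its replica shards
Searches are load-balanced across the shard copies, which increases throughput
When a primary shard fails, one of its replica shards is automatically promoted to primary

We can create the products index with this layout, pairing each shard with two replicas.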

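Creating an index

Use PUT to create a new index

Create an index named products to store product data.

Shard and replica settings:

number_of_shards: number of primary shards; the default is 1 since Elasticsearch 7.0 (it was 5 in earlier versions)
number_of_replicas: number of replicas per primary shard; the default is 1

We have three nodes, so we create one shard on each node, and each shard gets one replica on each of the other two nodes.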
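
To make the arithmetic behind this layout explicit, here is a small sketch. The class and method names are invented for illustration, and the even distribution is an assumption that matches the three-node layout described above.

```java
// Sketch: how many shard copies the cluster has to place (hypothetical names).
class ShardMath {
    /** Total shard copies = primaries * (1 primary + replicas per primary). */
    static int totalShardCopies(int numberOfShards, int numberOfReplicas) {
        return numberOfShards * (1 + numberOfReplicas);
    }

    /** Copies per node if the copies are spread evenly, rounded up. */
    static int copiesPerNode(int totalCopies, int nodes) {
        return (totalCopies + nodes - 1) / nodes;
    }

    public static void main(String[] args) {
        int total = totalShardCopies(3, 2);
        System.out.println(total + " shard copies, " + copiesPerNode(total, 3) + " per node");
    }
}
```

With 3 shards and 2 replicas, every node ends up holding one copy of every shard, so any single node can serve a search on its own.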

Create the index and name it products

PUT /products
{
  "settings": {
    "number_of_shards": 3, 
    "number_of_replicas": 2
  }
}

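Click the triangle in the upper-right corner to submit. This is equivalent to sending a PUT request to http://192.168.64.181:9200/products.

Go back to the ES-head UI, filter by index name, and look at the products index.

(screenshot)

Thick borders mark primary shards; thin borders mark replica shards.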

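Mappings (data structure)

Much like a database table schema, the data in an index is divided into fields, and each field needs a data type and other attributes.

A mapping is the definition and description of the field structure of an index.

Common field types:

Numeric types:
- byte, short, integer, long
- float, double
- unsigned_long (requires Elasticsearch 7.10 or later)
String types:
- text: analyzed (tokenized)
- keyword: not analyzed; suited to email addresses, host names, postal codes, and the like
Date and time types:
- date

Type reference:

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html

Creating the mapping

Create a mapping in the products index; a mapping is the set of field definitions.

Analyzer settings:

analyzer: when a document is added to the index, text fields are tokenized by this analyzer and the tokens are written to the inverted index
search_analyzer: at search time, the query keywords are tokenized by this analyzer

At query time the keywords are analyzed with search_analyzer if it is set; otherwise the field's analyzer is used.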
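
The fallback rule can be written down in a few lines. This is a simplified sketch of the rule as stated here, not Elasticsearch's own code; the class and method names are hypothetical, and null stands for "setting not present".

```java
import java.util.Optional;

// Sketch of the rule above: prefer search_analyzer, fall back to analyzer.
class AnalyzerChoice {
    static String searchTimeAnalyzer(String analyzer, String searchAnalyzer) {
        // null means the mapping does not define search_analyzer for the field
        return Optional.ofNullable(searchAnalyzer).orElse(analyzer);
    }

    public static void main(String[] args) {
        // title: indexed with ik_max_word, searched with ik_smart
        System.out.println(searchTimeAnalyzer("ik_max_word", "ik_smart"));
        // a field that only sets analyzer uses it for searching as well
        System.out.println(searchTimeAnalyzer("ik_max_word", null));
    }
}
```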

This is equivalent to sending the request to http://192.168.64.181:9200/products/_mapping

# Define the mapping (data structure)
PUT /products/_mapping
{
  "properties": {
    "id": {
      "type": "long"
    },
    "title": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    },
    "category": {
      "type": "text",
      "analyzer": "ik_smart",
      "search_analyzer": "ik_smart"
    },
    "price": {
      "type": "float"
    },
    "city": {
      "type": "text",
      "analyzer": "ik_smart",
      "search_analyzer": "ik_smart"
    },
    "barcode": {
      "type": "keyword"
    }
  }
}

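Mapping reference:

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html

View the mapping:

GET /products/_mapping

Adding documents

Every document gets a document id named _id. It can be generated automatically or set explicitly; the id of the data record is commonly used as the document id.

# Add documents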
PUT /products/_doc/10033
{
  "id":"10033",
  "title":"SONOS PLAY:5(gen2) 新一代PLAY:5无线智能音响系统 WiFi音箱家庭,潮酷数码会场",
  "category":"潮酷数码会场",
  "price":"3980.01",
  "city":"上海",
  "barcode":"527848718459"
}


PUT /products/_doc/10034
{
  "id":"10034",
  "title":"天猫魔盒 M13网络电视机顶盒 高清电视盒子wifi 64位硬盘播放器",
  "category":"潮酷数码会场",
  "price":"398.00",
  "city":"浙江杭州",
  "barcode":"522994634119"
}



PUT /products/_doc/10035
{
  "id":"10035",
  "title":"BOSE SoundSport耳塞式运动耳机 重低音入耳式防脱降噪音乐耳机",
  "category":"潮酷数码会场",
  "price":"860.00",
  "city":"浙江杭州",
  "barcode":"526558749068"
}



PUT /products/_doc/10036
{
  "id":"10036",
  "title":"【送支架】Beats studio Wireless 2.0无线蓝牙录音师头戴式耳机",
  "category":"潮酷数码会场",
  "price":"2889.00",
  "city":"上海",
  "barcode":"37147009748"
}


PUT /products/_doc/10037
{
  "id":"10037",
  "title":"SONOS PLAY:1无线智能音响系统 美国原创WiFi连接 家庭桌面音箱",
  "category":"潮酷数码会场",
  "price":"1580.01",
  "city":"上海",
  "barcode":"527783392239"
}

The _id value can also be generated automatically (POST without an id in the path)

POST /products/_doc
{
  "id":"10027",
  "title":"vivo X9前置双摄全网通4G美颜自拍超薄智能手机大屏vivox9",
  "category":"手机会场",
  "price":"2798.00",
  "city":"广东东莞",
  "barcode":"541396973568"
}

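Back in the ES-head UI, open the Browser tab and look at the products index; it should show the 6 documents added above.

(screenshot)

_type is _doc for every document. Types were a way of categorizing documents and are now deprecated.

View a specific document:

GET /products/_doc/10037

(screenshot)

The _source field holds the document content, but not the tokenization result.

View the tokens indexed for a field of a specific document (typeless form of the endpoint):

GET /products/_termvectors/10037?fields=title
GET /products/_termvectors/10037?fields=category
GET /products/_termvectors/10037?fields=city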

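Updating documents

Data in the underlying index cannot be changed in place; an update actually removes the old version and indexes the document again.

Two ways to update:

PUT: replaces the whole document
POST: can update a subset of the fields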
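
The difference between the two can be modelled with plain maps. The sketch below is an in-memory illustration only (hypothetical class and method names, top-level fields only); it does not talk to Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: full replacement (PUT) versus partial update (POST _update) on a document's fields.
class UpdateSemantics {
    /** PUT /index/_doc/id: the stored source becomes exactly the new body. */
    static Map<String, Object> replace(Map<String, Object> stored, Map<String, Object> body) {
        return new LinkedHashMap<>(body);
    }

    /** POST /index/_update/id with "doc": listed fields are overwritten, the rest are kept. */
    static Map<String, Object> partialUpdate(Map<String, Object> stored, Map<String, Object> doc) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        merged.putAll(doc);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("title", "SONOS PLAY:1");
        stored.put("price", "1580.01");
        // A PUT that sends only the price drops the title
        System.out.println(replace(stored, Map.of("price", "9999.99")));
        // A partial update keeps the title and changes the price
        System.out.println(partialUpdate(stored, Map.of("price", "8888.88")));
    }
}
```

This is why the PUT example resends every field of the document, while the POST example sends only price and city.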

Update the price field with PUT:

# Update a document - full replacement
PUT /products/_doc/10037
{
  "id":"10037",
  "title":"SONOS PLAY:1无线智能音响系统 美国原创WiFi连接 家庭桌面音箱",
  "category":"潮酷数码会场",
  "price":"9999.99",
  "city":"上海",
  "barcode":"527783392239"
}

View the document:

GET /products/_doc/10037

Update the price and city fields with POST:

# Update a document - partial update
POST /products/_update/10037
{
  "doc": {
    "price":"8888.88",
    "city":"深圳"
  }
}

View the document:

GET /products/_doc/10037

Deleting a document

DELETE /products/_doc/10037

Deleting all documents

POST /products/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

Deleting the index

# Delete the products index
DELETE /products

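You can try re-creating the products index with different shard and replica values

Elasticsearch - Search

Importing test data

To test search we first import test data: 3160 product records. A sample record looks like this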

{ "index": {"_index": "pditems", "_id": "536563"}}
{ "id":"536563","brand":"联想","title":"联想(Lenovo)小新Air13 Pro 13.3英寸14.8mm超轻薄笔记本电脑","sell_point":"清仓！仅北京，武汉仓有货！","price":"6688.0","barcode":"","image":"/images/server/images/portal/air13/little4.jpg","cid":"163","status":"1","created":"2015-03-08 21:33:18","updated":"2015-04-11 20:38:38"}

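Exporting the pd_item table from the database

The course materials contain a pditems.json file in the \elasticsearch\pditems directory; have it ready.

Creating the index and mapping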

PUT /pditems
{
  "settings": {
    "number_of_shards": 3, 
    "number_of_replicas": 2
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "brand": {
        "type": "text",
        "analyzer": "ik_smart"
      },
      "title": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "sell_point": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "price": {
        "type": "float"
      },
      "image": {
        "type": "keyword"
      },
      "cid": {
        "type": "long"
      },
      "status": {
        "type": "byte"
      },
      "created": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      },
      "updated": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    } 
  }
}

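Check in ES-head that the pditems index exists.

(screenshot)

Importing the data

The pditems folder uploaded earlier contains the prepared JSON file.

On the server, change into the folder that contains pditems.json and run the bulk import: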

curl -XPOST 'localhost:9200/pditems/_bulk' \
    -H 'Content-Type:application/json' \
    --data-binary @pditems.json
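
The file sent to _bulk is newline-delimited JSON: an action line followed by a source line for every document, and the body has to end with a newline. As a rough sketch of how such a body could be assembled (hypothetical class and method names; ids and index names are assumed to need no JSON escaping):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: build a _bulk request body in the same shape as pditems.json.
class BulkBody {
    /** One action line plus one source line, each terminated by a newline. */
    static String indexAction(String index, String id, String sourceJson) {
        String action = "{\"index\":{\"_index\":\"" + index + "\",\"_id\":\"" + id + "\"}}";
        return action + "\n" + sourceJson + "\n";
    }

    /** Concatenates the action/source pairs for all documents, keyed by document id. */
    static String build(String index, Map<String, String> sourceById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> doc : sourceById.entrySet()) {
            body.append(indexAction(index, doc.getKey(), doc.getValue()));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("536563", "{\"id\":\"536563\",\"brand\":\"联想\"}");
        System.out.print(build("pditems", docs));
    }
}
```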

Viewing the data

Search all 3160 documents in the pditems index:

GET /pditems/_search
{
  "query": {
    "match_all": {}
  },
  "size": 3160
}

Searching all data

# Search all data in the pditems index (without "size", only the first 10 hits are returned)
POST /pditems/_search
{
  "query": {
    "match_all": {}
  }
}

Keyword search

# Find products in the pditems index whose title contains "电脑" (computer)
POST /pditems/_search
{
  "query": {
    "match": {
      "title": "电脑"
    }
  }
}

Filtering search results

# Products priced at 2000 or more whose title contains "电脑"
POST /pditems/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "电脑"
          }
        }
      ],

      "filter": [
        {
          "range": {
            "price": {
              "gte": "2000"
            }
          }
        }
      ]
    }
  }
}

Highlighting search results

POST /pditems/_search
{
  "query": {
    "multi_match": {
      "query": "手机",
      "fields": ["title", "sell_point"]
    }
  },
  "highlight": {
    "pre_tags": ["<i class=\"highlight\">"],
    "post_tags": ["</i>"],
    "fields": {
      "title": {},
      "sell_point": {
        "pre_tags": ["<em>"],
        "post_tags": ["</em>"]
      }
    }
  }
}

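Elasticsearch - Spring Data Elasticsearch - CRUD API

Spring Data Elasticsearch is the Spring Data module for developing against the Elasticsearch search engine. It provides:

Template objects: a high-level API for storing, searching, and sorting documents and for building aggregations.

Repositories: for example, a Repository lets developers express queries by defining an interface with custom method names.

Creating the project

Create a new Spring Boot project named es and add the elasticsearch dependency.
Edit pom.xml: set the Spring Boot version to 2.3.2.RELEASE and add the Lombok dependency by hand.

Edit application.yml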

spring:
  elasticsearch:
    rest:
      uris:
        - http://192.168.64.181:9200
        - http://192.168.64.181:9201
        - http://192.168.64.181:9202

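Kibana is no longer needed, so its container can be removed.

Elasticsearch Repositories are very convenient: extend the interface and declare abstract methods.

Creating the students index in Elasticsearch

Before running the tests, create the students index in elasticsearch-head:

Use the request method PUT and the path students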

PUT /students
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2,
    "index.max_ngram_diff":30,
    "analysis": {
      "analyzer": {
        "ngram_analyzer": {
          "tokenizer": "ngram_tokenizer"
        }
      },
      "tokenizer": {
        "ngram_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 30,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "text",
        "analyzer": "ngram_analyzer"
      },
      "gender": {
        "type": "keyword"
      },
      "birthDate": {
        "type": "date",
        "format": "yyyy-MM-dd"
      }
    }
  }
}

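Creating the Student entity class

Create the entity.Student class under the es package.
Define the fields id (Long), name (String), gender (Character), and birthDate (String).
Add the three Lombok annotations (@Data, @NoArgsConstructor, @AllArgsConstructor).
Add the @Document configuration:

@Document names the index and can also be used to create it automatically; usually the index is created by hand.

Add the @Document annotation with the index name (indexName), the number of shards (shards), and the number of replicas (replicas).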
```
@Document(indexName = "students",shards = 3,replicas = 2)
```
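Add the document id annotation: the student number is used as the document id (_id), so put @Id on the id field.
Add the field annotation: put @Field("birthDate") on birthDate to set the name of the index field that the property maps to; the annotation can be omitted when the field name and the property name are the same.

Creating the StudentRepository interface

Unlike the Mapper used before, we now write a Repository (the principle is much like a Mapper).
Create the StudentRepository interface under the es package, extending ElasticsearchRepository with the type parameters Student and Long.
Spring Data Repository convention: only the interface and its abstract methods have to be declared. No implementation is written by hand; the Repository infrastructure already implements the data access code.
Declare the abstract methods:

Spring Data derives the query from the method name, so there is a naming rule: each property name inside the method name must start with an upper-case letter
- Search for a keyword in name: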
- 在 name 中搜索关键词：
```
List<Student> findByName(String nameKey);
```
- Search in name or birthDate:
```
List<Student> findByNameOrBirthDate(String name,String birthDate);
```
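
To see why the capitalization rule matters, here is a toy version of how a method name could be split into property names. It is only an illustration with made-up names; Spring Data's real parser is considerably more involved and supports many more keywords.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch: split "findByNameOrBirthDate" into the properties it refers to.
class MethodNameSketch {
    static List<String> properties(String methodName) {
        if (!methodName.startsWith("findBy")) {
            throw new IllegalArgumentException("expected a findBy... method name");
        }
        String predicate = methodName.substring("findBy".length());
        List<String> result = new ArrayList<>();
        // Split on the keywords And / Or when an upper-case letter follows,
        // which is how the start of the next property is recognized.
        for (String part : predicate.split("(?:And|Or)(?=[A-Z])")) {
            // Turn "BirthDate" back into the property name "birthDate"
            result.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(properties("findByNameOrBirthDate"));
    }
}
```

If the property were written in lower case inside the method name, the keyword boundary could no longer be told apart from the property text, which is the reason for the rule above.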

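Creating the test class

Create Test:

Create the Test1 test class under the test…es package.
Add @SpringBootTest to the class and inject StudentRepository (as the field r).
Set the log level to trace in application.yml.

Create:

Add a method test1.
In test1, create 10 Student objects. The id is a Long, so the literal needs a trailing L; birthDate has to follow the date format strictly, so single-digit months and days must be zero-padded. Then add the objects to the index by calling save on r.
Run it and check the students index in the Browser tab of ES-head; it should contain the data.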
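
One way to avoid hand-written padding mistakes is to let java.time produce the string. This is a small sketch with hypothetical names; the pattern matches the yyyy-MM-dd format declared for birthDate in the students mapping.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Sketch: produce zero-padded birthDate strings for the yyyy-MM-dd mapping format.
class BirthDates {
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    static String format(int year, int month, int day) {
        // LocalDate.of also rejects impossible dates such as month 13
        return LocalDate.of(year, month, day).format(FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(format(2000, 1, 5)); // 2000-01-05
    }
}
```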

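Update:

Add a method test2.
An update is really another save: saving an object overwrites the existing document with the same ID, so create one object whose ID equals that of a document created earlier.
Save it by calling save on r.
Run it and check the students index in the Browser tab of ES-head; the document should show the new values.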

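Read:

Add a method test3.
Query a single object:

To avoid a NullPointerException, findById returns an Optional. Call findById on r and keep the result in o1, which may contain a Student; then check with isPresent whether a value exists and, if so, get the Student from o1 and print it.
Query multiple objects:

Call findAll on r and keep the returned Iterable in all; iterate over it with a for loop and print each element.
Run it and compare the console output with the students index shown in the Browser tab of ES-head.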
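
The Optional pattern from the single-object query can be tried without a cluster. In the sketch below an in-memory map stands in for the repository; the class, the id 9527, and the name are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch: the isPresent / get pattern, with a map standing in for the repository.
class OptionalLookup {
    private final Map<Long, String> students = new HashMap<>();

    void save(Long id, String name) {
        students.put(id, name);
    }

    /** Returns an empty Optional instead of null when the id is unknown. */
    Optional<String> findById(Long id) {
        return Optional.ofNullable(students.get(id));
    }

    public static void main(String[] args) {
        OptionalLookup r = new OptionalLookup();
        r.save(9527L, "Alice");
        Optional<String> o1 = r.findById(9527L);
        if (o1.isPresent()) {
            System.out.println(o1.get());
        }
        // Shorter form of the same check; runs only when a value exists
        r.findById(1L).ifPresent(System.out::println);
    }
}
```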

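Delete:

Add a method test4.
Call deleteById on r with the ID of the document.
Run it and check in the Browser tab of ES-head that the document is gone from the students index.

Querying with the declared methods:

Add a method test5.
Call findByName to find the documents whose name contains a given string, and print them in a loop.
Console output means it works.
Add a method test6.
Call findByNameOrBirthDate to find the documents whose name contains a given string or whose birth date is a given day, and print them in a loop.
Console output means it works.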

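Building queries with Criteria

Wrapping search conditions in Criteria

In Spring Data Elasticsearch, SearchOperations can run more complex queries; these operations take a Query object that wraps the query.

There are three kinds of Query in Spring Data Elasticsearch:

CriteriaQuery
StringQuery
NativeSearchQuery

In most cases CriteriaQuery covers our query needs. Below are two Criteria query examples.

Creating the StudentSearcher class

Create the StudentSearcher class under the es package and annotate it with @Component so that Spring creates the instance.
Inject the object used to run a CriteriaQuery: ElasticsearchOperations
Define a private method exec that wraps the condition in a CriteriaQuery, runs it, and returns a List of Student.
Define two search methods:
- findByName returns a List of Student and takes the keyword nameKey. It wraps the keyword condition in a Criteria object created with the field name (name), calls exec, and returns the result.
- findByBirthDate (called from Test2 below) follows the same pattern on the birthDate field; the original post does not list its parameters.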

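Creating the Test2 test class

Create a test class Test2 under the test…es package
Add the @SpringBootTest annotation and inject StudentSearcher
Add a test method test1 annotated with JUnit Jupiter's @Test; call findByName, which returns a List of Student, and print each student in a loop.
Add a test method test2, essentially the same as test1 but calling findByBirthDate; remember to change the arguments.
Run the tests and check the console output.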

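Paging with Spring Data ES

Pageable - wraps the paging parameters sent to the server
Page - wraps one page of search results returned from the server, together with the paging metadata

In StudentRepository, change the method that searches name for a keyword so that it returns a Page and takes an additional Pageable parameter (from the org.springframework.data.domain package).
Back in the Test1 test class, modify test5: create a Pageable at the top that sets the page number and the page size.
Change the type of the returned value to Page and pass the Pageable p as an extra argument.
Because there is little data, change all names in test1 to start with the same character and run test1 again
Run test5 and check the console output.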
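
The numbers behind a page request are simple to work out by hand. The sketch below is plain arithmetic with hypothetical names; it assumes zero-based page numbers, which is how Spring Data's Pageable counts pages.

```java
// Sketch: the arithmetic behind a page request (zero-based page numbers).
class PageMath {
    /** Offset of the first hit on the given page. */
    static long offset(int pageNumber, int pageSize) {
        return (long) pageNumber * pageSize;
    }

    /** Number of pages needed to show totalHits results, rounded up. */
    static int totalPages(long totalHits, int pageSize) {
        return (int) ((totalHits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        // Page 2 with 20 hits per page starts at hit 40
        System.out.println(offset(2, 20));
        // 3160 products at 20 per page
        System.out.println(totalPages(3160, 20));
    }
}
```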

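Implementing the search bar of the pd store

Open the rabbitmq project and set the working directory of pd-web to the pd-web module.

Opening localhost in a browser brings up the pd store; the goal here is to make its search bar work.

Steps:

Edit pom.xml

Add the Spring Data Elasticsearch and Lombok dependencies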

		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
		</dependency>
		<dependency>
			<groupId>org.projectlombok</groupId>
			<artifactId>lombok</artifactId>
		</dependency>

Edit application.yml

Add the connection settings

spring:  
  elasticsearch:
    rest:
      uris: 
        - http://192.168.64.181:9200
        - http://192.168.64.181:9201
        - http://192.168.64.181:9202

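Creating the Item entity class

The entity class wraps the results returned from the ES server.

Give Item the fields id, brand, title, sellPoint, price, and image, and add the three Lombok annotations.
Add the @Document annotation with the index name pditems.
Add @Id on id and @Field("sell_point") on sellPoint.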

package com.pd.pojo;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;

@Document(indexName = "pditems")
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Item {
    @Id
    private Long id;
    private String brand;
    private String title;
    @Field("sell_point")
    private String sellPoint;
    private String price;
    private String image;
}

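Creating the ItemRepository interface

Create the es.ItemRepository interface under the pd package, extending the ElasticsearchRepository interface with the type parameters Item and Long (the type of the id).
Declare the search method findByTitleOrSellPoint: it returns a Page of Item and takes two String keywords and a Pageable. Make sure Page and Pageable are imported from org.springframework.data.domain.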

package com.pd.es;

import com.pd.pojo.Item;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

public interface ItemRepository extends ElasticsearchRepository<Item, Long> {

    Page<Item> findByTitleOrSellPoint(String key1, String key2, Pageable pageable);

}

Creating the SearchService interface

Create the SearchService interface under the pd.service package.
Declare the method search, which returns a Page of Item and takes the keyword key and a Pageable.
Create the implementation class in the impl package, annotate it with @Service so that Spring can inject it, and inject ItemRepository as r.
Override the method: return the result of calling the repository method on r, passing key for both keywords together with the Pageable

package com.pd.service;

import com.pd.pojo.Item;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;

public interface SearchService {
    Page<Item> search(String key, Pageable pageable);
}

package com.pd.service.impl;

import com.pd.es.ItemRepository;
import com.pd.pojo.Item;
import com.pd.service.SearchService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.stereotype.Service;

// Registered as a bean so that SearchController can have it injected
@Service
public class SearchServiceImpl implements SearchService {
    @Autowired
    private ItemRepository r;

    @Override
    public Page<Item> search(String key, Pageable pageable) {
        // The same keyword is matched against both title and sell_point
        return r.findByTitleOrSellPoint(key, key, pageable);
    }
}

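Creating the SearchController class

Create SearchController under the pd.controller package and annotate it with @Controller.
Inject the SearchService interface.
Define the search method: it returns a String and takes a Model (the object used to pass data to the view), the keyword key, and a Pageable. It handles GET requests on "/search/toSearch.html"; the query parameters are key, page, and size.
Method body:

Call search on searchService with key and pageable to obtain a Page, add it to the model under the name "page", and return "/search.jsp"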

package com.pd.controller;

import com.pd.pojo.Item;
import com.pd.service.SearchService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;

@Controller
public class SearchController {
    @Autowired
    private SearchService searchService;
    // e.g. /search/toSearch.html?key=手机&page=0&size=10
    @GetMapping("/search/toSearch.html")
    public String search(Model model, String key, Pageable pageable) {
        // Model carries data to the view
        Page<Item> page = searchService.search(key, pageable);
        model.addAttribute("page",page);
        return "/search.jsp";
    }
}

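Editing search.jsp

Open search.jsp: Ctrl + left-click "/search.jsp"

On line 23, change list to page.content;

Change line 21 to 搜索结果 > ${param.key} (search results > keyword)

Restart the service and search for 手机 (mobile phone) in the store's search bar; products should be listed (missing images are expected).

After line 45 of search.jsp, add the paging links:

Show the previous/next link only when page has a previous/next page; the link parameters are key, page.number, and page.size.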

<div>
    <c:if test="${page.hasPrevious()}">
          <a href="?key=${param.key}&page=${page.number-1}&size=${page.size}">上一页</a>
    </c:if>
    <c:if test="${page.hasNext()}">
          <a href="?key=${param.key}&page=${page.number+1}&size=${page.size}">下一页</a>
    </c:if>
</div>

Editing header.jsp

Open webapp/commons/header.jsp

On line 12, change q to param.key
On line 13, change placeholder=" ${q}" to value="${param.key}"
Restart the service; the keyword typed into the search box is now retained.

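Adding highlighting

With the Repository API as used here, the highlighted results come back as SearchHit objects in a List rather than in a Page, so the paging metadata is lost.

Go back to the ItemRepository interface and add the highlight configuration.

Add the @Highlight annotation on the method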

    @Highlight(
            parameters = @HighlightParameters(
                    preTags = "<em>",
                    postTags = "</em>"
            ),
            fields = @HighlightField(name = "title")
    )

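Change the method's return type to List<SearchHit<Item>>; after that, the methods in the service interface, its implementation, and the controller all have to change as well.

Modifying SearchController

In short: create a List of Item (an ArrayList), iterate over the returned hits, take the Item from each hit, and replace its title with the highlighted title. After the loop, add the list to the model under the name "items".

Remove the existing model.addAttribute call and rewrite the method:

First define a method hlTitle that builds the highlighted title: it returns a String and takes the List of title fragments. Create a StringBuilder sb, loop over the List, and append each fragment to sb. When the loop ends, return sb.toString().
Create a List to hold the Item objects wrapped by the SearchHits. Loop over the List of SearchHit, take the item of each hit, use hlTitle to join the highlighted fragments of the "title" field into a single String, replace the item's original title with the highlighted one, and add the item to the List.
Use model.addAttribute to pass the List to the JSP under the name "items".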

    @GetMapping("/search/toSearch.html")
    public String search(Model model, String key, Pageable pageable) {
        // Model carries data to the view
        List<SearchHit<Item>> list = searchService.search(key, pageable);
        List<Item> items = new ArrayList<>();
        for (SearchHit<Item> sh : list) {
            Item item = sh.getContent();
            // Highlighted fragments of the title field
            List<String> fragments = sh.getHighlightField("title");
            // Keep the original title when only sell_point matched
            if (fragments != null && !fragments.isEmpty()) {
                item.setTitle(hlTitle(fragments));
            }
            items.add(item);
        }

        model.addAttribute("items", items);
        model.addAttribute("page", pageable);

        return "/search.jsp";
    }

    private String hlTitle(List<String> title){
        StringBuilder sb = new StringBuilder();
        for (String s : title){
            sb.append(s);
        }
        return sb.toString();
    }

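Editing search.jsp

On line 23, change page.content to items.
Update the previous/next links around line 45; the controller now puts the Pageable into the model under the name "page".
Rename the properties: number becomes pageNumber and size becomes pageSize.
Change the paging conditions:

A previous page exists only from the second page on. Page numbers are zero-based, so replace hasPrevious() with pageNumber > 0.

The next-page condition is different. No total page count is returned, so we compare the number of items on the current page with page.pageSize: only a full page can be followed by another page. When the total is an exact multiple of the page size, this allows one extra click onto an empty last page. The condition becomes items.size() == page.pageSize.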
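
The two conditions can be expressed as small helper functions. This is only a sketch of the logic used in the JSP, with hypothetical names; it is not part of the pd-web code.

```java
// Sketch: link conditions when only the current page and its items are known.
class PagerLinks {
    /** Page numbers are zero-based, so every page after the first has a previous page. */
    static boolean hasPrevious(int pageNumber) {
        return pageNumber > 0;
    }

    /** Heuristic: a full page may be followed by another one; a short page is the last. */
    static boolean mayHaveNext(int itemsOnPage, int pageSize) {
        return itemsOnPage == pageSize;
    }

    public static void main(String[] args) {
        System.out.println(hasPrevious(0));      // first page: no previous link
        System.out.println(mayHaveNext(20, 20)); // full page: show the next link
        System.out.println(mayHaveNext(7, 20));  // short page: last page
    }
}
```

The heuristic trades accuracy for simplicity: it never hides a page that exists, but it can offer one link to an empty page, which is why the empty-page message is needed.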

Because the last page can be empty, add a check at the start of the forEach loop:
```
<c:if test="${items.size() == 0}">
    <h2>后面没有商品了!</h2>
</c:if>
```

Highlighting the keyword in the product title

In search.jsp, inside the head tag around line 15, add the highlight style
```
<style>
    div.describe em {
        color: #f00;
    }
</style>
```
Start the application and test; highlighted keywords mean it works.
