😊 Hi, I'm Xiaohang, a literary youth who is going bald and getting stronger.
🔔 This article is a hands-on walkthrough of search matching with Elasticsearch. Follow along!
🔔 Let's grind together!
Preface:
One day, while browsing the dating site Zhenai, it suddenly hit me that I still don't have a girlfriend. I'd even forgotten I'm a guy! But that's beside the point. The point is that the site failed to match me with my ideal partner, so I decided to build a search-matching feature myself and find her on my own.
I. Database Design
The SQL is as follows.
Dictionary table (the user's city, interests, and so on):
CREATE TABLE `data_dict` (
`id` bigint NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
`node_name` varchar(50) NOT NULL COMMENT 'Node name',
`parent_id` bigint NOT NULL DEFAULT '0' COMMENT 'Parent ID',
`type` int NOT NULL COMMENT 'Type: 0 = city; 1 = interest',
`node_level` int NOT NULL COMMENT 'Node level',
`show_status` int NOT NULL COMMENT 'Visible: 1 = shown; 0 = hidden',
`sort` int NOT NULL COMMENT 'Sort order',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
II. Project Setup
Project structure:
Setting up the project is not covered again here; see the earlier hands-on tutorials.
III. Implementation
1. Parent-child nodes:
Modify the DataDictEntity class:
- add the logic-delete annotation
- add a children property
package com.example.demo.entity;
import com.baomidou.mybatisplus.annotation.TableField;
import com.baomidou.mybatisplus.annotation.TableId;
import com.baomidou.mybatisplus.annotation.TableLogic;
import com.baomidou.mybatisplus.annotation.TableName;
import java.io.Serializable;
import java.util.Date;
import java.util.List;
import com.fasterxml.jackson.annotation.JsonInclude;
import lombok.Data;
/**
*
*
* @author Liu
* @email 1531137510@qq.com
* @date 2022-10-06 20:48:15
*/
@Data
@TableName("data_dict")
public class DataDictEntity implements Serializable {
private static final long serialVersionUID = 1L;
/**
* Primary key
*/
@TableId
private Long id;
/**
* Node name
*/
private String nodeName;
/**
* Parent ID
*/
private Long parentId;
/**
* Type: 0 = city; 1 = interest
*/
private Integer type;
/**
* Node level
*/
private Integer nodeLevel;
/**
* Visible: 1 = shown; 0 = hidden
*/
@TableLogic(value = "1", delval = "0")
private Integer showStatus;
/**
* Sort order
*/
private Integer sort;
@JsonInclude(JsonInclude.Include.NON_EMPTY) // skip serialization when empty, which simplifies front-end handling
@TableField(exist = false) // not a column in the table
private List<DataDictEntity> children;
}
The logic-delete values can also be set globally in the configuration file (note: logic-delete-value is the value that marks a row as deleted, matching delval = "0" in the annotation):
mybatis-plus:
  mapper-locations: classpath:/mapper/*.xml
  global-config:
    db-config:
      id-type: auto # auto-increment primary key
      logic-delete-value: 0
      logic-not-delete-value: 1
Next we write the endpoints.
Controller layer
ApiController:
@RestController
public class ApiController {
@Autowired
DataDictService dataDictService;
@GetMapping("/list/tree")
public Result<List<DataDictEntity>> listWithTree() {
List<DataDictEntity> entities = dataDictService.listWithTree();
return new Result<List<DataDictEntity>>().ok(entities);
}
}
Service layer
DataDictServiceImpl:
/**
* Tree query
*/
@Override
public List<DataDictEntity> listWithTree() {
// 1. Fetch all nodes (a single database query; the tree is assembled in memory)
List<DataDictEntity> entities = baseMapper.selectList(null);
// 2. Assemble the tree
return entities.stream().filter(node -> node.getParentId() == 0) // keep only top-level nodes
.peek((nodeEntity) -> {
nodeEntity.setChildren(getChildrens(nodeEntity, entities)); // recursively attach each top-level node's children
}).sorted(Comparator.comparingInt(node -> (node.getSort() == null ? 0 : node.getSort()))).collect(Collectors.toList());
}
/**
* Recursively collect child nodes
*/
private List<DataDictEntity> getChildrens(DataDictEntity root, List<DataDictEntity> all) {
return all.stream().filter(node -> root.getId().equals(node.getParentId())) // find root's children
.peek(dept -> {
dept.setChildren(getChildrens(dept, all)); // attach their children in turn
}).sorted(Comparator.comparingInt(node -> (node.getSort() == null ? 0 : node.getSort()))).collect(Collectors.toList());
}
The logic is explained in the comments above.
Insert some test data:
INSERT INTO `data_dict` VALUES (1, '1', 0, 0, 1, 1, 2);
INSERT INTO `data_dict` VALUES (2, '1-1', 1, 0, 2, 1, 1);
INSERT INTO `data_dict` VALUES (3, '1-1-1', 2, 0, 3, 1, 1);
INSERT INTO `data_dict` VALUES (4, '2', 0, 0, 1, 1, 1);
INSERT INTO `data_dict` VALUES (5, '3', 0, 0, 1, 0, 1);
Open the API testing tool Apifox:
Send a GET request: http://localhost:8080/list/tree
Response:
{
"code": 0,
"msg": "success",
"data": [
{
"id": 4,
"nodeName": "2",
"parentId": 0,
"type": 0,
"nodeLevel": 1,
"showStatus": 1,
"sort": 1
},
{
"id": 1,
"nodeName": "1",
"parentId": 0,
"type": 0,
"nodeLevel": 1,
"showStatus": 1,
"sort": 2,
"children": [
{
"id": 2,
"nodeName": "1-1",
"parentId": 1,
"type": 0,
"nodeLevel": 2,
"showStatus": 1,
"sort": 1,
"children": [
{
"id": 3,
"nodeName": "1-1-1",
"parentId": 2,
"type": 0,
"nodeLevel": 3,
"showStatus": 1,
"sort": 1
}
]
}
]
}
]
}
If the tree data rarely changes and is not critical, consider caching it to speed up queries.
For a detailed Redis caching walkthrough, see the earlier post: integrating an external API + performance tuning.
Since this is an ordinary scenario with little data to cache, a third-party cache is overkill; Spring Cache is enough.
1. Enable caching
@SpringBootApplication
@EnableCaching
public class DemoApplication {
public static void main(String[] args) {
SpringApplication.run(DemoApplication.class, args);
}
}
2. Add the @Cacheable annotation
/**
* Tree query
* value: the cache name
* key: an explicit key, which Spring recommends, written in SpEL (Spring Expression Language)
* sync = true guards against cache breakdown
*/
@Cacheable(value = {"data_dict"}, key = "#root.method.name", sync = true)
@Override
public List<DataDictEntity> listWithTree() {
// 1. Fetch all nodes (a single database query; the tree is assembled in memory)
List<DataDictEntity> entities = baseMapper.selectList(null);
log.info("Queried the database!");
// 2. Assemble the tree
return entities.stream().filter(node -> node.getParentId() == 0) // keep only top-level nodes
.peek((nodeEntity) -> {
nodeEntity.setChildren(getChildrens(nodeEntity, entities)); // recursively attach each top-level node's children
}).sorted(Comparator.comparingInt(node -> (node.getSort() == null ? 0 : node.getSort()))).collect(Collectors.toList());
}
Open the API docs and test.
After calling the method twice, the log shows:
Queried the database!
# it appeared only once!
To plug in a third-party cache instead, add its dependency alongside spring-boot-starter-cache, then set spring.cache.type in the configuration file:
<dependency>
    <!-- third-party cache dependency -->
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-cache</artifactId>
</dependency>
That's all for caching here.
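Under the hood, @Cacheable with sync = true is roughly a synchronized cache-aside lookup: on a miss, only one caller loads from the database while the rest wait. A dependency-free sketch of that behavior (ConcurrentHashMap stands in for the cache manager; loadFromDb is a hypothetical loader, not a Spring API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheAsideDemo {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    public final AtomicInteger dbHits = new AtomicInteger();

    // computeIfAbsent runs the loader at most once per key even under concurrent
    // access -- the same guarantee sync = true gives against cache breakdown
    // (many threads stampeding the DB for one hot key at the same moment).
    public Object get(String key) {
        return cache.computeIfAbsent(key, this::loadFromDb);
    }

    private Object loadFromDb(String key) {
        dbHits.incrementAndGet(); // stands in for the "Queried the database!" log line
        return "value-for-" + key;
    }

    public static void main(String[] args) {
        CacheAsideDemo demo = new CacheAsideDemo();
        demo.get("listWithTree");
        demo.get("listWithTree");              // second call served from cache
        System.out.println(demo.dbHits.get()); // 1
    }
}
```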
2. Search engine:
Preparation:
(1) Pull Elasticsearch (storage and retrieval) and Kibana (visual exploration).
The two versions must match:
docker pull elasticsearch:7.4.2
docker pull kibana:7.4.2
(2) Configuration:
# Mount directories from the container onto /mydata on the Linux host,
# so editing /mydata edits the files inside Docker
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
# allow remote access to ES from any host
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
# recursively grant permissions; ES needs access
chmod -R 777 /mydata/elasticsearch/
(3) Start Elasticsearch:
# 9200 is the HTTP client port, 9300 the cluster transport port
# discovery.type=single-node runs a single-node cluster
# ES_JAVA_OPTS caps heap usage; in production this might be 32G
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2
# start elasticsearch on boot
docker update elasticsearch --restart=always
(4) Start Kibana:
# ELASTICSEARCH_HOSTS points Kibana at ES on port 9200; 5601 is Kibana's web UI port
docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.56.10:9200 -p 5601:5601 -d kibana:7.4.2
# start kibana on boot
docker update kibana --restart=always
(5) Test
Check the Elasticsearch version info: http://192.168.56.10:9200
{
"name": "66718a266132",
"cluster_name": "elasticsearch",
"cluster_uuid": "xhDnsLynQ3WyRdYmQk5xhQ",
"version": {
"number": "7.4.2",
"build_flavor": "default",
"build_type": "docker",
"build_hash": "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
"build_date": "2019-10-28T20:40:44.881551Z",
"build_snapshot": false,
"lucene_version": "8.2.0",
"minimum_wire_compatibility_version": "6.8.0",
"minimum_index_compatibility_version": "6.0.0-beta1"
},
"tagline": "You Know, for Search"
}
Show Elasticsearch node info: http://192.168.56.10:9200/_cat/nodes
127.0.0.1 14 99 25 0.29 0.40 0.22 dilm * 66718a266132
66718a266132 is the node listed above; the * marks the master node.
Visit Kibana: http://192.168.56.10:5601/app/kibana
To harden ES a little, set a password:
Edit the elasticsearch.yml file (6.2 and earlier need X-Pack installed separately; newer releases ship with it):
vim /mydata/elasticsearch/config/elasticsearch.yml
## add:
xpack.security.enabled: true
xpack.license.self_generated.type: basic
xpack.security.transport.ssl.enabled: true
Restart the ES service:
docker restart elasticsearch
Enter the elasticsearch container and initialize the passwords from its bin directory:
docker exec -it elasticsearch /bin/bash
/usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
# Passwords must be set for elastic, apm_system, kibana, kibana_system, logstash_system, beats_system and remote_monitoring_user, so this takes a while; be patient. It is normal that typed passwords are not echoed.
Here we set the password to: 123456
Test the new password:
Open http://192.168.56.10:9200 in a browser
- user: elastic
- password: 123456
exit # leave the previous container
# enter the kibana container
docker exec -it kibana /bin/bash
vi config/kibana.yml
# append to the end of kibana.yml:
elasticsearch.username: "elastic"
elasticsearch.password: "123456"
# restart kibana
exit
docker restart kibana
Install the IK analyzer:
Every language is tokenized by the Standard Analyzer by default, which handles Chinese poorly, so we install a Chinese analyzer.
Check your Elasticsearch version by visiting http://192.168.56.10:9200, then pick the matching IK release.
Version mapping:
| IK version | ES version |
|---|---|
| master | 7.x -> master |
| 6.x | 6.x |
| 5.x | 5.x |
| 1.10.6 | 2.4.6 |
| 1.9.5 | 2.3.5 |
| 1.8.1 | 2.2.1 |
| 1.7.0 | 2.1.1 |
| 1.5.0 | 2.0.0 |
| 1.2.6 | 1.0.0 |
| 1.2.5 | 0.90.x |
| 1.1.3 | 0.20.x |
| 1.0.0 | 0.16.2 -> 0.19.0 |
We already mapped the container's /usr/share/elasticsearch/plugins directory to /mydata/elasticsearch/plugins on the host, so just download elasticsearch-analysis-ik-7.4.2.zip and unzip it into that folder. Remember to restart the elasticsearch container afterwards.
Once installed, test the analyzer.
Open the Kibana Dev Tools console:
GET _analyze
{
"analyzer": "ik_smart",
"text":"小航是中国人"
}
Output:
{
"tokens" : [
{
"token" : "小",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "航",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_CHAR",
"position" : 2
},
{
"token" : "中国人",
"start_offset" : 3,
"end_offset" : 6,
"type" : "CN_WORD",
"position" : 3
}
]
}
"小航" was not recognized as a word!
That won't do. We want "小航" treated as one token, so we set up a custom dictionary:
Install Nginx:
# create a folder to hold nginx
cd /mydata/
mkdir nginx
# run nginx:1.10 once just to grab its config files for the volume mapping; docker pulls the image first if needed
docker run -p 80:80 --name nginx -d nginx:1.10
# copy the config files out of the container into the current directory
docker container cp nginx:/etc/nginx .
# check that the files landed under /mydata; if so, the copy succeeded and the container can go
docker stop nginx
docker rm nginx
# rename the copied directory so it won't clash with the new nginx container
mv nginx conf
# recreate the nginx folder and move conf inside it
mkdir nginx
mv conf nginx/
# start the real nginx with the volumes mapped
docker run -p 80:80 --name nginx \
-v /mydata/nginx/html:/usr/share/nginx/html \
-v /mydata/nginx/logs:/var/log/nginx \
-v /mydata/nginx/conf/:/etc/nginx \
-d nginx:1.10
# create a folder under nginx's html directory
cd /mydata/nginx/html
mkdir es
cd es
# create fenci.txt, append "小航", and check it
echo '小航' >> ./fenci.txt
cat fenci.txt
With nginx running, check the file is reachable:
http://192.168.56.10/es/fenci.txt
Then edit IKAnalyzer.cfg.xml under /mydata/elasticsearch/plugins/elasticsearch-analysis-ik-7.4.2/config: uncomment the remote dictionary entry and point it at that URL.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension settings</comment>
<!-- local extension dictionary -->
<entry key="ext_dict"></entry>
<!-- local extension stop-word dictionary -->
<entry key="ext_stopwords"></entry>
<!-- remote extension dictionary -->
<entry key="remote_ext_dict">http://192.168.56.10/es/fenci.txt</entry>
<!-- remote extension stop-word dictionary -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
Restart ES (don't skip this!):
docker restart elasticsearch
Test again:
GET _analyze
{
"analyzer": "ik_smart",
"text":"小航是中国人"
}
Output:
{
"tokens" : [
{
"token" : "小航",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "中国人",
"start_offset" : 3,
"end_offset" : 6,
"type" : "CN_WORD",
"position" : 2
}
]
}
Nice!
Integrating Elasticsearch
If you are not familiar with basic ES operations, learn those first! A quick-start ES guide may follow later; this post only covers environment setup and integration.
Java can talk to ES in two ways:
1) Port 9300: TCP
- spring-data-elasticsearch: transport-api.jar
- transport-api.jar varies with the Spring Boot version and cannot track ES versions
- deprecated in 7.x, removed after 8
2) Port 9200: HTTP
- JestClient: unofficial, slow to update
- RestTemplate: hand-rolled HTTP requests; most ES operations must be wrapped yourself, which is tedious
- HttpClient: same problem
- Elasticsearch-Rest-Client: the official RestClient; wraps ES operations in a clean, layered API that is easy to pick up
We go with Elasticsearch-Rest-Client (elasticsearch-rest-high-level-client); docs: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high.html
1. Add the dependencies (Spring Boot defaults to ES 7.6 here, which differs from ours, so exclude it and re-import 7.4.2):
<properties>
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
<!-- elasticsearch start -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>${elasticsearch.version}</version>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>${elasticsearch.version}</version>
<exclusions>
<exclusion>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
</exclusion>
<exclusion>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- elasticsearch end -->
2. Configuration:
application.yml:
elasticsearch:
  schema: http
  host: 192.168.56.10
  port: 9200
  username: elastic
  password: 123456
Write the ElasticSearchConfig configuration class:
package com.example.demo.config;
import lombok.Data;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.*;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
/**
* @author xh
* @Date 2022/10/8
*/
@Data
@Configuration
@ConfigurationProperties(prefix = "elasticsearch")
public class ElasticSearchConfig {
public static final RequestOptions COMMON_OPTIONS;
static {
RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
// the default response buffer limit is 100MB; lower it to 30MB here
builder.setHttpAsyncResponseConsumerFactory(
new HttpAsyncResponseConsumerFactory
.HeapBufferedResponseConsumerFactory(30 * 1024 * 1024));
COMMON_OPTIONS = builder.build();
}
private String schema;
private String host;
private Integer port;
private String username;
private String password;
@Bean
public RestHighLevelClient client() {
// Elasticsearch requires basic auth
final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
// set the username and password
credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(username, password));
// create the rest client via the builder, configuring the http client's HttpClientConfigCallback
RestClientBuilder builder = RestClient.builder(new HttpHost(host, port, schema))
.setHttpClientConfigCallback(httpClientBuilder -> {
httpClientBuilder.disableAuthCaching();
return httpClientBuilder.setDefaultCredentialsProvider(credentialsProvider);
});
return new RestHighLevelClient(builder);
}
}
3. Test:
@SpringBootTest
class DemoApplicationTests {
@Autowired
RestHighLevelClient client;
/**
* Verify the elasticsearch client bean loads
*/
@Test
void contextLoads() {
System.out.println(client);
}
/**
* Index-creation test
**/
@Test
public void indexData() throws IOException {
// target index
IndexRequest indexRequest = new IndexRequest("users");
indexRequest.id("1");
User user = new User();
user.setUsername("张三");
Gson gson = new Gson();
String jsonString = gson.toJson(user);
//content to save: the JSON payload and its content type
indexRequest.source(jsonString, XContentType.JSON);
//create the index (if needed) and save the document
IndexResponse index = client.index(indexRequest, ElasticSearchConfig.COMMON_OPTIONS);
System.out.println(index);
}
@Data
class User {
private String username;
}
}
Run it:
org.elasticsearch.client.RestHighLevelClient@47248a48
The elasticsearch client bean loaded into the Spring context successfully.
IndexResponse[index=users,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]
The index was created and the document stored.
Database and index design
Create a new table, data_info:
CREATE TABLE `data_info` (
`id` bigint NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
`title` varchar(255) NOT NULL COMMENT 'Title',
`info` text NOT NULL COMMENT 'Details',
`img` varchar(255) DEFAULT NULL COMMENT 'Title image',
`likes` bigint NOT NULL DEFAULT '0' COMMENT 'Like count',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=18 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
Create the data_info index:
PUT data_info
{
"mappings":{
"properties": {
"dataId":{ "type": "long" },
"dataTitle": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer":"ik_smart"
},
"dataInfo": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer":"ik_smart"
},
"dataLike":{ "type":"long" },
"dataImg":{
"type": "keyword",
"index": false,
"doc_values": false
},
"node": {
"type": "nested",
"properties": {
"nodeId": {"type": "long" },
"nodeName": {
"type": "keyword",
"index": false,
"doc_values": false
}
}
}
}
}
}
Field-by-field notes on the mapping:
PUT data_info
{
"mappings":{
"properties": {
"dataId":{ "type": "long" }, # record ID
"dataTitle": { # title
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer":"ik_smart"
},
"dataInfo": { # summary text
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer":"ik_smart"
},
"dataLike":{ "type":"long" }, # like count
"dataImg":{ # preview image
"type": "keyword",
"index": false, # not searchable; no index is built, the field is only returned for display
"doc_values": false # not aggregatable (defaults to true)
},
"node": { # node info
"type": "nested",
"properties": {
"nodeId": {"type": "long" },
"nodeName": {
"type": "keyword",
"index": false,
"doc_values": false
}
}
}
}
}
}
Inserting data
Add a new save endpoint to ApiController:
@Autowired
DataDictService dataDictService;
@Autowired
DataInfoService dataInfoService;
@Autowired
RestHighLevelClient client;
@PostMapping("/save")
public Result<String> saveData(@RequestBody List<ESModel> esModels) {
boolean flag = dataInfoService.saveDatas(esModels);
if(flag) {
// TODO only make records searchable after review
flag = esUpdate(esModels);
}
if(flag) {
return new Result<String>().ok("数据保存成功!");
} else {
return new Result<String>().error("数据保存失败!");
}
}
private boolean esUpdate(List<ESModel> esModel) {
// 1. build one index request per record for the data_info index
BulkRequest bulkRequest = new BulkRequest();
for (ESModel model : esModel) {
// target index
IndexRequest indexRequest = new IndexRequest("data_info");
// document id
indexRequest.id(model.getDataId().toString());
Gson gson = new Gson();
String jsonString = gson.toJson(model);
indexRequest.source(jsonString, XContentType.JSON);
// add
bulkRequest.add(indexRequest);
}
// save everything in one bulk call
BulkResponse bulk;
try {
bulk = client.bulk(bulkRequest, ElasticSearchConfig.COMMON_OPTIONS);
} catch (IOException e) {
e.printStackTrace();
return false; // the whole bulk request failed
}
boolean hasFailures = bulk.hasFailures();
if(hasFailures){
List<String> collect = Arrays.stream(bulk.getItems()).map(BulkItemResponse::getId).collect(Collectors.toList());
log.error("ES bulk insert failed for ids: {}", collect);
}
return !hasFailures;
}
The details are in the comments, so no further commentary here.
DataInfoServiceImpl:
package com.example.demo.service.impl;
import com.example.demo.entity.DataDictEntity;
import com.example.demo.vo.ESModel;
import org.springframework.stereotype.Service;
import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;
import com.example.demo.dao.DataInfoDao;
import com.example.demo.entity.DataInfoEntity;
import com.example.demo.service.DataInfoService;
import java.util.ArrayList;
import java.util.List;
@Service("dataInfoService")
public class DataInfoServiceImpl extends ServiceImpl<DataInfoDao, DataInfoEntity> implements DataInfoService {
@Override
public boolean saveDatas(List<ESModel> esModels) {
List<DataInfoEntity> dataInfoEntities = new ArrayList<>();
for (ESModel esModel : esModels) {
DataInfoEntity dataInfoEntity = new DataInfoEntity();
dataInfoEntity.setImg(esModel.getDataImg());
dataInfoEntity.setInfo(esModel.getDataInfo());
dataInfoEntity.setLikes(0L);
dataInfoEntity.setTitle(esModel.getDataTitle());
dataInfoEntities.add(dataInfoEntity);
baseMapper.insert(dataInfoEntity);
esModel.setDataId(dataInfoEntity.getId());
esModel.setDataLike(dataInfoEntity.getLikes());
}
// return saveBatch(dataInfoEntities);
return true;
}
}
TODO: the batch handling here needs optimizing; punting on that for now!
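One way to tackle that TODO is to insert in chunks instead of row by row. A hypothetical partitioning helper (the batch size of 500 is an arbitrary choice; MyBatis-Plus's saveBatch also accepts a batch size directly, so this sketch mainly illustrates the idea of chunking both the DB insert and the ES BulkRequest):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchDemo {
    // Split a list into consecutive chunks of at most `size` elements.
    static <T> List<List<T>> partition(List<T> list, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < list.size(); i += size) {
            chunks.add(new ArrayList<>(list.subList(i, Math.min(i + size, list.size()))));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1201; i++) ids.add(i);
        List<List<Integer>> chunks = partition(ids, 500);
        System.out.println(chunks.size());        // 3
        System.out.println(chunks.get(2).size()); // 201
    }
}
```

Each chunk would then become one saveBatch call and one BulkRequest, keeping any single request a bounded size.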
Start the project and test.
Test data:
[
{
"dataTitle": "title",
"dataInfo": "dataInfo",
"dataImg": "dataImg",
"nodes": [
{
"nodeId": 1,
"nodeName": "1"
}
]
}
]
Response:
{
"code": 0,
"msg": "success",
"data": "数据保存成功!"
}
Check the result in the ES console:
Query:
GET /data_info/_search
{
"query": {"match_all": {}}
}
Result:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "data_info",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"dataId" : 1,
"dataTitle" : "title",
"dataInfo" : "dataInfo",
"dataLike" : 0,
"dataImg" : "dataImg",
"nodes" : [
{
"nodeId" : 1,
"nodeName" : "1"
}
]
}
}
]
}
}
Perfect!
Searching the data
First, think through what the search criteria might be:
Full-text search: dataTitle, dataInfo
Sorting: dataLike (like count)
Filtering: node.id
Aggregation: node
keyword=小航&
sort=dataLike_desc/asc&
node=3:4
Hmm, these requirements look a bit thin; they won't quite exercise every concept.
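The parameter contract above (keyword, sort, node) can be exercised with a tiny standalone parser before any Spring wiring. The class and method names here are illustrative only, not part of the project:

```java
import java.util.HashMap;
import java.util.Map;

public class ParamDemo {
    // Parse "keyword=小航&sort=dataLike_desc&node=3:4" style query strings.
    static Map<String, String> parse(String qs) {
        Map<String, String> out = new HashMap<>();
        for (String pair : qs.split("&")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) out.put(kv[0], kv[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> p = parse("keyword=abc&sort=dataLike_desc&node=3:4");
        // node=3:4 means "filter by node ids 3 and 4"
        String[] nodeIds = p.get("node").split(":");
        System.out.println(p.get("sort"));  // dataLike_desc
        System.out.println(nodeIds.length); // 2
    }
}
```

In the real endpoint, Spring MVC binds these parameters onto SearchParam automatically; this sketch just pins down the wire format.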
Add another batch of test data:
[
{
"dataTitle": "速度还是觉得还是觉得合适机会减少",
"dataInfo": "网络新词 网络上经常会出现一些新词,比如“蓝瘦香菇”,蓝瘦香菇默认情况下会被分词,分词结果如下所示 蓝,瘦,香菇 这样的分词会导致搜索出很多不相关的结果,在这种情况下,我们使用扩展词库",
"dataImg": "dataImg",
"nodes": [
{
"nodeId": 1,
"nodeName": "节点1"
}
]
}
]
Write the DSL query:
GET /data_info/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "速度",
"fields": [
"dataTitle",
"dataInfo"
]
}
}
],
"filter": {
"nested": {
"path": "nodes",
"query": {
"bool": {
"must": [
{
"term": {
"nodes.nodeId": {
"value": 1
}
}
}
]
}
}
}
}
}
},
"sort": [
{
"dataLike": {
"order": "desc"
}
}
],
"from": 0,
"size": 5,
"highlight": {
"fields": {
"dataTitle": {},
"dataInfo": {}
},
"pre_tags": "<b style='color:red'>",
"post_tags": "</b>"
}
}
Result:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "data_info",
"_type" : "_doc",
"_id" : "17",
"_score" : null,
"_source" : {
"dataId" : 17,
"dataTitle" : "速度还是觉得还是觉得合适机会减少",
"dataInfo" : "网络新词 网络上经常会出现一些新词,比如“蓝瘦香菇”,蓝瘦香菇默认情况下会被分词,分词结果如下所示 蓝,瘦,香菇 这样的分词会导致搜索出很多不相关的结果,在这种情况下,我们使用扩展词库",
"dataLike" : 0,
"dataImg" : "dataImg",
"nodes" : [
{
"nodeId" : 1,
"nodeName" : "节点1"
}
]
},
"highlight" : {
"dataTitle" : [
"<b style='color:red'>速度</b>还是觉得还是觉得合适机会减少"
]
},
"sort" : [
0
]
}
]
}
}
Now the same DSL from Java.
SearchParam, the request parameters:
package com.example.demo.vo;
import lombok.Data;
import java.util.List;
/**
* @author xh
* @Date 2022/10/12
*/
@Data
public class SearchParam {
/** Full-text keyword from the page: keyword=小航 */
private String keyword;
/** Sort condition: sort=dataLike_desc/asc */
private String sort;
/** Filter by nodes: node=3:4 */
private List<Long> nodes;
/** Page number */
private Integer pageNum = 1;
/** The raw query string */
private String _queryString;
}
SearchResult, the response body:
package com.example.demo.vo;
import com.example.demo.entity.DataInfoEntity;
import lombok.Data;
import java.util.List;
/**
* @author xh
* @Date 2022/10/12
*/
@Data
public class SearchResult {
/** All matching DataInfos */
private List<DataInfoEntity> dataInfos;
/** Current page number */
private Integer pageNum;
/** Total record count */
private Long total;
/** Total page count */
private Integer totalPages;
}
Since each record's tags also need to be displayed, DataInfoEntity gets a nodeNames field:
@Data
@TableName("data_info")
public class DataInfoEntity implements Serializable {
private static final long serialVersionUID = 1L;
/**
* Primary key
*/
@TableId(type = IdType.AUTO)
private Long id;
/**
* Title
*/
private String title;
/**
* Details
*/
private String info;
/**
* Title image
*/
private String img;
/**
* Like count
*/
private Long likes;
/**
* Tags
*/
@TableField(exist = false)
private List<String> nodeNames;
}
Write the search endpoint in ApiController:
@Autowired
DataDictService dataDictService;
@Autowired
DataInfoService dataInfoService;
@Autowired
RestHighLevelClient client;
public static final Integer PAGE_SIZE = 5;
@GetMapping("/search")
public Result<SearchResult> getSearchPage(SearchParam searchParam, HttpServletRequest request) {
// TODO encrypt request params && anti-crawler protection
// capture the raw query string
searchParam.set_queryString(request.getQueryString());
SearchResult result = getSearchResult(searchParam);
return new Result<SearchResult>().ok(result);
}
/**
* Run the search
*/
public SearchResult getSearchResult(SearchParam searchParam) { // build the response from the incoming request
SearchResult searchResult= null;
// build the search request from the parameters
SearchRequest request = buildSearchRequest(searchParam);
try {
SearchResponse searchResponse = client.search(request,
ElasticSearchConfig.COMMON_OPTIONS);
// wrap the ES response into our result type
searchResult = buildSearchResult(searchParam,searchResponse);
} catch (IOException e) {
e.printStackTrace();
}
return searchResult;
}
private SearchResult buildSearchResult(SearchParam searchParam, SearchResponse searchResponse) {
SearchResult result = new SearchResult();
SearchHits hits = searchResponse.getHits();
//1. wrap the matched records
if (hits.getHits()!=null&&hits.getHits().length>0){
List<DataInfoEntity> dataInfoEntities = new ArrayList<>();
for (SearchHit hit : hits) {
// fetch the source JSON and parse it into an ESModel
String sourceAsString = hit.getSourceAsString();
Gson gson = new Gson();
ESModel esModel = gson.fromJson(sourceAsString, new TypeToken<ESModel>() {
}.getType());
// convert ESModel to DataInfoEntity
DataInfoEntity dataInfoEntity = new DataInfoEntity();
dataInfoEntity.setTitle(esModel.getDataTitle());
dataInfoEntity.setInfo(esModel.getDataInfo());
dataInfoEntity.setImg(esModel.getDataImg());
dataInfoEntity.setId(esModel.getDataId());
dataInfoEntity.setLikes(esModel.getDataLike());
dataInfoEntity.setNodeNames(esModel.getNodes().stream()
.map(ESModel.Node::getNodeName).collect(Collectors.toList()));
//apply highlighting
if (!StringUtils.isEmpty(searchParam.getKeyword())) {
HighlightField dataTitle = hit.getHighlightFields().get("dataTitle");
if(dataTitle != null) {
String highLight = dataTitle.getFragments()[0].string();
dataInfoEntity.setTitle(highLight);
}
HighlightField dataInfo = hit.getHighlightFields().get("dataInfo");
if(dataInfo != null) {
String highLight = dataInfo.getFragments()[0].string();
dataInfoEntity.setInfo(highLight);
}
}
dataInfoEntities.add(dataInfoEntity);
}
result.setDataInfos(dataInfoEntities);
}
//2. pagination info
//2.1 current page
result.setPageNum(searchParam.getPageNum());
//2.2 total records
long total = hits.getTotalHits().value;
result.setTotal(total);
//2.3 total pages
Integer totalPages = (int)total % PAGE_SIZE == 0 ?
(int)total / PAGE_SIZE : (int)total / PAGE_SIZE + 1;
result.setTotalPages(totalPages);
return result;
}
}
/**
* Build the query
*/
private SearchRequest buildSearchRequest(SearchParam searchParam) {
// builds the DSL
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
//1. bool query
BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
//1.1 bool must
if (!StringUtils.isEmpty(searchParam.getKeyword())) {
boolQueryBuilder.must(
QueryBuilders.multiMatchQuery(searchParam.getKeyword(), "dataTitle", "dataInfo")
);
}
// 1.2 filter nested
List<Long> nodes = searchParam.getNodes();
BoolQueryBuilder queryBuilder = new BoolQueryBuilder();
if (nodes!=null && nodes.size() > 0) {
nodes.forEach(nodeId ->{
queryBuilder.must(QueryBuilders.termQuery("nodes.nodeId", nodeId));
});
}
NestedQueryBuilder nestedQueryBuilder = QueryBuilders.nestedQuery("nodes", queryBuilder, ScoreMode.None);
boolQueryBuilder.filter(nestedQueryBuilder);
//1.3 bool query done
searchSourceBuilder.query(boolQueryBuilder);
//2. sort, e.g. sort=dataLike_desc/asc
if (!StringUtils.isEmpty(searchParam.getSort())) {
String[] sortSplit = searchParam.getSort().split("_");
searchSourceBuilder.sort(sortSplit[0], "asc".equalsIgnoreCase(sortSplit[1]) ? SortOrder.ASC : SortOrder.DESC);
}
//3. paginate the results
searchSourceBuilder.from((searchParam.getPageNum() - 1) * PAGE_SIZE);
searchSourceBuilder.size(PAGE_SIZE);
//4. highlight
if (!StringUtils.isEmpty(searchParam.getKeyword())) {
HighlightBuilder highlightBuilder = new HighlightBuilder();
highlightBuilder.field("dataTitle");
highlightBuilder.field("dataInfo");
highlightBuilder.preTags("<b style='color:red'>");
highlightBuilder.postTags("</b>");
searchSourceBuilder.highlighter(highlightBuilder);
}
log.debug("built DSL: {}", searchSourceBuilder.toString());
SearchRequest request = new SearchRequest(new String[]{"data_info"}, searchSourceBuilder);
return request;
}
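Two pieces of the request/response building above are easy to get subtly wrong and easy to unit-test in isolation: splitting the sort parameter and computing the page count. A standalone sketch (class name is illustrative; the ceiling-division form is equivalent to the ternary used in buildSearchResult):

```java
public class SearchMathDemo {
    static final int PAGE_SIZE = 5;

    // "dataLike_desc" -> field "dataLike", descending; anything but "asc" falls back to desc
    static String[] parseSort(String sort) {
        String[] parts = sort.split("_");
        String order = parts.length > 1 && "asc".equalsIgnoreCase(parts[1]) ? "asc" : "desc";
        return new String[]{parts[0], order};
    }

    // Ceiling division: 6 records at 5 per page is 2 pages
    static int totalPages(long total) {
        return (int) ((total + PAGE_SIZE - 1) / PAGE_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(parseSort("dataLike_desc")[1]); // desc
        System.out.println(totalPages(6));                 // 2
        System.out.println(totalPages(5));                 // 1
    }
}
```

Note that the production code indexes sortSplit[1] unconditionally, so a bare sort=dataLike would throw; the length check here is the cheap guard for that.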
Test the endpoint.
Send a GET request: http://localhost:8080/search?keyword=速度&sort=dataLike_desc&nodes=1
Response:
{
"code": 0,
"msg": "success",
"data": {
"dataInfos": [
{
"id": 17,
"title": "<b style='color:red'>速度</b>还是觉得还是觉得合适机会减少",
"info": "网络新词 网络上经常会出现一些新词,比如“蓝瘦香菇”,蓝瘦香菇默认情况下会被分词,分词结果如下所示 蓝,瘦,香菇 这样的分词会导致搜索出很多不相关的结果,在这种情况下,我们使用扩展词库",
"img": "dataImg",
"likes": 0,
"nodeNames": [
"节点1"
]
}
],
"pageNum": 1,
"total": 1,
"totalPages": 1
}
}
And we're done!