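Preface:
Applications often need search features, and sometimes large-scale log search and analysis as well. Spring Boot integrates Spring Data Elasticsearch, which makes adding search support very convenient.
Elasticsearch is a distributed search service that exposes a RESTful API and is built on Lucene. It spreads data across multiple shards to keep it safe and provides automatic resharding. Large sites such as GitHub use Elasticsearch as their search service.
Elasticsearch Java REST Client - reference documentation
Spring Data Elasticsearch - reference documentation
Special thanks: 遇见狂神说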
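1. Overview
1.1 Comparison with relational databases
Elasticsearch is document-oriented and uses JSON as the serialization format for documents.
An Elasticsearch cluster can contain multiple indices (databases), each index can contain multiple types (tables), each type holds multiple documents (rows), and each document has multiple fields (columns). Note that types are deprecated in Elasticsearch 7.x, the version used in this article.
The comparison with a relational database looks like this:
Relational DB | Elasticsearch |
---|---|
Database | Index (indices) |
Table | Type (deprecated) |
Row | Document |
Column | Field |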
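1.2 Physical design
Behind the scenes, Elasticsearch splits each index into multiple shards, and each shard can be moved between the servers in the cluster.
A running Elasticsearch instance is called a node. A cluster consists of one or more nodes that share the same cluster.name setting, and together they carry the data and the load.
1.3 Logical design
An index type contains multiple documents, say document 1 and document 2. Once a document has been indexed, it can be located through this path:
index » type » document id
This combination identifies one specific document. (Note that the id does not have to be an integer; it is actually a string.)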
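- Document
In Elasticsearch, a document is the smallest unit of data that can be indexed and searched.
A document has several important properties:
• Self-contained: a document holds both the fields and their values, that is, key:value pairs.
• Hierarchical: a document can contain sub-documents.
• Flexible structure: a document does not depend on a predefined schema. Fields can be added or omitted freely, but the type of each field still matters.
- Type
A type is a logical container for documents, just as a table is a container for rows in a relational database.
The definition of the fields in a type is called a mapping.
- Index
An index is a container for mapping types; an Elasticsearch index is a very large collection of documents.
An index stores the fields of its mapping types and other settings, and all of this is stored on the shards.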
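1.4 How it works
A cluster has at least one node, and a node is a single Elasticsearch process that can hold many indices. Before Elasticsearch 7.0 a newly created index had 5 primary shards by default; since 7.0 the default is 1. Each primary shard gets one replica shard by default.
In a three-node cluster, for example, a primary shard and its replica are never placed on the same node, so data is not lost if one node goes down.
A shard is in fact a Lucene index: a directory of files containing an inverted index. The inverted index lets Elasticsearch find the documents that contain a given keyword without scanning every document.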
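1.5 Inverted index
Elasticsearch uses a structure called an inverted index, provided by Lucene underneath.
This structure is well suited to fast full-text search. An inverted index consists of the list of all unique terms that appear in the documents and, for each term, the list of documents that contain it.
For example, suppose there are two documents with the following content:
# Content of document 1
Study every day, good good up to forever
# Content of document 2
To forever, study every day, good good up
To build the inverted index, first split each document into individual words (also called terms or tokens), then build a sorted list of all unique terms, and finally record which documents each term appears in.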
term | doc_1 | doc_2 |
---|---|---|
Study | ✓ | ✗ |
To | ✗ | ✓ |
every | ✓ | ✓ |
forever | ✓ | ✓ |
day | ✓ | ✓ |
study | ✗ | ✓ |
good | ✓ | ✓ |
to | ✓ | ✗ |
up | ✓ | ✓ |
To search for to forever, just look up the documents that contain each term.
term | doc_1 | doc_2 |
---|---|---|
to | ✓ | ✗ |
forever | ✓ | ✓ |
total | 2 | 1 |
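Both documents match, but the first one matches better than the second because it contains both terms.
With no other conditions, every document that contains one of the keywords is returned.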
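The steps above can be sketched in plain Java. This is a minimal illustration, not how Lucene implements its index: the class name `InvertedIndexSketch`, the naive tokenizer, and the "count matching terms" score are all assumptions made for this example. Like the table, it keeps the original case, so `Study`, `study`, `To`, and `to` are distinct terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Naive tokenizer (assumption): split on anything that is not a letter or digit, keep case
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Returns docId -> number of query terms found in that document
    Map<Integer, Integer> search(String query) {
        Map<Integer, Integer> hits = new TreeMap<>();
        for (String term : tokenize(query)) {
            for (int docId : postings.getOrDefault(term, new TreeSet<>())) {
                hits.merge(docId, 1, Integer::sum);
            }
        }
        return hits;
    }

    // Index holding the two example documents from the text
    static InvertedIndexSketch sample() {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Study every day, good good up to forever");
        index.add(2, "To forever, study every day, good good up");
        return index;
    }

    public static void main(String[] args) {
        // Document 1 contains both query terms, document 2 only "forever"
        System.out.println(sample().search("to forever")); // {1=2, 2=1}
    }
}
```

Real analyzers usually lowercase terms, and Elasticsearch ranks hits with a relevance formula rather than a plain term count, so this only mirrors the table.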
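2. Deployment and testing
2.1 Deploying Elasticsearch
- Pull the image
docker pull elasticsearch:7.6.2
- Create the container
Port 9200 is the HTTP port, and port 9300 is the TCP transport port used for communication between nodes.
docker run -e "discovery.type=single-node" -e ES_JAVA_OPTS="-Xms512m -Xmx512m" -d -p 9200:9200 -p 9300:9300 --name es elasticsearch:7.6.2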
Startup error:
ERROR: [1] bootstrap checks failed [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least
Fix:
Check max_map_count on the Docker host:
cat /proc/sys/vm/max_map_count
65530
Raise max_map_count (as root, on the host):
sysctl -w vm.max_map_count=262144
- Test
Open http://Server-IP:9200 and check that the Elasticsearch response page appears.
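2.2 Deploying the visualization tool Elasticsearch-head
- Pull the image
docker pull mobz/elasticsearch-head:5
- Create the container
docker run -d -p 9100:9100 --name head mobz/elasticsearch-head:5
- Fix cross-origin requests
Enter the Elasticsearch container and edit the configuration file elasticsearch.yml.
Append the following settings at the end of the file:
http.cors.enabled: true
http.cors.allow-origin: "*"
Restart the service.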
- When viewing or modifying index data, the following error may also appear:
{"error":"Content-Type header [application/x-www-form-urlencoded] is not supported","status":406}
Fix:
• Enter the head container
• Install vim
Configure a package mirror in mainland China:
mv /etc/apt/sources.list /etc/apt/sources.list.bak
echo "deb http://mirrors.163.com/debian/ jessie main non-free contrib" >> /etc/apt/sources.list
echo "deb http://mirrors.163.com/debian/ jessie-proposed-updates main non-free contrib" >> /etc/apt/sources.list
echo "deb-src http://mirrors.163.com/debian/ jessie main non-free contrib" >> /etc/apt/sources.list
echo "deb-src http://mirrors.163.com/debian/ jessie-proposed-updates main non-free contrib" >> /etc/apt/sources.list
Update the package sources:
apt-get update
Install vim:
apt-get install vim
• Go to the _site directory and edit vendor.js
① Line 6886: change contentType: "application/x-www-form-urlencoded" to contentType: "application/json;charset=UTF-8"
② Line 7573: change var inspectData = s.contentType === "application/x-www-form-urlencoded" && to var inspectData = s.contentType === "application/json;charset=UTF-8" &&
- Test
Open http://Server-IP:9100 and check that the Elasticsearch-head page appears.
2.3 Deploying the visualization tool Kibana
- Pull the image
docker pull kibana:7.6.2
- Create the container
docker run -d -e ELASTICSEARCH_HOSTS=http://39.105.80.221:9200 -p 5601:5601 --name kibana kibana:7.6.2
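- Change the Elasticsearch address and switch the UI to Chinese
Enter the container.
Address: edit kibana.yml and set elasticsearch.hosts to the address of the Elasticsearch service.
Localization: edit kibana.yml and append i18n.locale: "zh-CN" at the end of the file.
- Test
Open http://Server-IP:5601 and check that the Kibana page appears.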
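2.4 Installing the IK analyzer
What is the IK analyzer?
Tokenization splits a piece of Chinese or English text into individual keywords. When searching, Elasticsearch tokenizes the input, tokenizes the data in the database or index, and then matches the two. The default tokenization of Chinese treats every single character as a word, which does not meet real-world needs, so the Chinese analyzer IK is installed to solve this.
IK provides two analyzers: ik_smart, which produces the coarsest split (fewest tokens), and ik_max_word, which produces the finest-grained split.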
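To make the difference between a coarse and a fine-grained split concrete, here is a small dictionary-based sketch. It is not IK's actual algorithm: the class `SegmentationSketch`, both methods, and the tiny dictionary are assumptions for illustration. The coarse mode greedily takes the longest dictionary word at each position, and the fine-grained mode emits every dictionary word it can find.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SegmentationSketch {
    // Coarse split: at each position take the longest dictionary word,
    // falling back to a single character when nothing matches
    static List<String> coarse(String text, Set<String> dict) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = text.length();
            while (end > start + 1 && !dict.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end));
            start = end;
        }
        return tokens;
    }

    // Fine-grained split: emit every dictionary word found at every start offset, longest first
    static List<String> fineGrained(String text, Set<String> dict) {
        List<String> tokens = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            for (int end = text.length(); end > start; end--) {
                String candidate = text.substring(start, end);
                if (dict.contains(candidate)) {
                    tokens.add(candidate);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical mini dictionary; IK ships a much larger one
        Set<String> dict = Set.of("中国共产党", "中国", "国共", "共产党", "共产", "党");
        System.out.println(coarse("中国共产党", dict));
        System.out.println(fineGrained("中国共产党", dict));
    }
}
```

The same greedy idea also shows why a custom dictionary matters: a word that is missing from the dictionary falls apart into single characters.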
- Enter the elasticsearch container
- Install wget
yum -y install wget
- Create an ik directory under the plugins directory
mkdir ik
- Enter the ik directory and download the matching version with wget
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.6.2/elasticsearch-analysis-ik-7.6.2.zip
- Unzip the archive
unzip elasticsearch-analysis-ik-7.6.2.zip
- Delete the archive
rm -rf elasticsearch-analysis-ik-7.6.2.zip
- Verify
Restart the elasticsearch container, enter it again, and run this command in the bin directory:
elasticsearch-plugin list
If ik is listed, the installation succeeded.
- Test
Enter the following commands in the Kibana Dev Tools console:
GET _analyze
{
  "analyzer": "ik_smart",
  "text": "中国共产党"
}
GET _analyze
{
  "analyzer": "ik_max_word",
  "text": "中国共产党"
}
Sending each request returns a different response:
{ "tokens" : [ { "token" : "中国共产党", "start_offset" : 0, "end_offset" : 5, "type" : "CN_WORD", "position" : 0 } ] }
{ "tokens" : [ { "token" : "中国共产党", "start_offset" : 0, "end_offset" : 5, "type" : "CN_WORD", "position" : 0 }, { "token" : "中国", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 1 }, { "token" : "国共", "start_offset" : 1, "end_offset" : 3, "type" : "CN_WORD", "position" : 2 }, { "token" : "共产党", "start_offset" : 2, "end_offset" : 5, "type" : "CN_WORD", "position" : 3 }, { "token" : "共产", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 4 }, { "token" : "党", "start_offset" : 4, "end_offset" : 5, "type" : "CN_CHAR", "position" : 5 } ] }
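2.5 Adding a custom dictionary
- Enter the elasticsearch container
- Fix garbled Chinese characters
Edit ~/.vimrc and append the following settings at the end of the file:
set fileencodings=utf-8,ucs-bom,gb18030,gbk,gb2312,cp936
set termencoding=utf-8
set encoding=utf-8
Save and exit.
- Go to the IK plugin installation directory
- Go to the config directory
- Create a dic file
touch caixukun.dic
- Edit the dic file and add the custom terms, one per line
蔡徐坤
鸡你太美
- Edit IKAnalyzer.cfg.xml
<entry key="ext_dict">caixukun.dic</entry>
- Restart the elasticsearch container
- Test
Enter the following command in the Kibana Dev Tools console:
GET _analyze
{
  "analyzer": "ik_smart",
  "text": "蔡徐坤鸡你太美"
}
Default response: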
{ "tokens" : [ { "token" : "蔡", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0 }, { "token" : "徐", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1 }, { "token" : "坤", "start_offset" : 2, "end_offset" : 3, "type" : "CN_CHAR", "position" : 2 }, { "token" : "鸡", "start_offset" : 3, "end_offset" : 4, "type" : "CN_CHAR", "position" : 3 }, { "token" : "你", "start_offset" : 4, "end_offset" : 5, "type" : "CN_CHAR", "position" : 4 }, { "token" : "太美", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 5 } ] }
Response after adding the custom dictionary:
{ "tokens" : [ { "token" : "蔡徐坤", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 0 }, { "token" : "鸡你太美", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 1 } ] }
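3. REST style
REST is an architectural style for software, not a standard. It provides a set of design principles and constraints and is mainly used for software in which clients and servers interact.
Software designed in this style tends to be simpler, better layered, and easier to extend with mechanisms such as caching.
Basic REST commands:
method | URL | description |
---|---|---|
PUT | localhost:9200/{index}/{type}/{id} | Create a document with the given id |
POST | localhost:9200/{index}/{type} | Create a document |
POST | localhost:9200/{index}/{type}/{id}/_update | Update a document |
DELETE | localhost:9200/{index}/{type}/{id} | Delete a document |
GET | localhost:9200/{index}/{type}/{id} | Get a document by id |
POST | localhost:9200/{index}/{type}/_search | Query all data |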
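As a rough sketch, the commands in the table can be built with the JDK's own `java.net.http` API (Java 11+). The class `RestRequestSketch`, its helper names, and the base URL are assumptions for this example. It uses the typeless `_doc` and `_update` endpoints that the Elasticsearch 7.x deprecation warnings recommend instead of a custom type name, and it only builds the requests without sending them.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

class RestRequestSketch {
    // Assumption: a single node reachable on localhost
    static final String BASE = "http://localhost:9200";

    // Builds a JSON request; a null body means "no request body"
    static HttpRequest json(String method, String path, String body) {
        return HttpRequest.newBuilder(URI.create(BASE + path))
                .header("Content-Type", "application/json")
                .method(method, body == null ? BodyPublishers.noBody() : BodyPublishers.ofString(body))
                .build();
    }

    static HttpRequest createWithId(String index, String id, String doc) {
        return json("PUT", "/" + index + "/_doc/" + id, doc);
    }

    static HttpRequest update(String index, String id, String partialDoc) {
        return json("POST", "/" + index + "/_update/" + id, "{\"doc\":" + partialDoc + "}");
    }

    static HttpRequest delete(String index, String id) {
        return json("DELETE", "/" + index + "/_doc/" + id, null);
    }

    static HttpRequest get(String index, String id) {
        return json("GET", "/" + index + "/_doc/" + id, null);
    }

    static HttpRequest searchAll(String index) {
        return json("POST", "/" + index + "/_search", "{\"query\":{\"match_all\":{}}}");
    }

    public static void main(String[] args) {
        HttpRequest request = createWithId("test3", "1", "{\"name\":\"test\",\"age\":10}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send a request, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running cluster.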
3.1 Basic test
- Enter the following command in the Kibana Dev Tools console:
PUT /test1/type1/1
{
  "name": "蔡徐坤",
  "age": 10
}
• Explanation of the command:
PUT: create command
test1: index
type1: type
1: id
"name": "蔡徐坤": field
"age": 10: field
- Send the request
The response looks like this:
#! Deprecation: [types removal] Specifying types in document index requests is deprecated, use the typeless endpoints instead (/{index}/_doc/{id}, /{index}/_doc, or /{index}/_create/{id}).
{ "_index" : "test1", "_type" : "type1", "_id" : "1", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 0, "_primary_term" : 1 }
- Open head to view the newly created index
3.2 Creating an index with a mapping
- Enter the following command in the Kibana Dev Tools console:
PUT /test2
{
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "age": { "type": "long" },
      "birthday": { "type": "date" }
    }
  }
}
- Send the request
The response looks like this:
{ "acknowledged" : true, "shards_acknowledged" : true, "index" : "test2" }
- Open head to view the newly created index
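3.3 Viewing the default mapping
If the field types of a document are not specified, Elasticsearch infers default types for them automatically.
- Enter the following command in the Kibana Dev Tools console: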
PUT /test3/_doc/1
{
"name": "蔡徐坤",
"age": 10,
"birthday": "1998-08-02"
}
- Send the request
The response looks like this:
{ "_index" : "test3", "_type" : "_doc", "_id" : "1", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 0, "_primary_term" : 1 }
- Enter the following command in the console:
GET test3
- Send the request
The response looks like this:
{ "test3" : { "aliases" : { }, "mappings" : { "properties" : { "age" : { "type" : "long" }, "birthday" : { "type" : "date" }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } }, "settings" : { "index" : { "creation_date" : "1596476421598", "number_of_shards" : "1", "number_of_replicas" : "1", "uuid" : "Rh3Z67EpSPSOUbz1lmgB7g", "version" : { "created" : "7060299" }, "provided_name" : "test3" } } } }
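The inferred mapping in this response (age as long, birthday as date, name as text with a keyword sub-field) can be approximated with a few rules. The sketch below is a simplification and not Elasticsearch's implementation: the class `DynamicMappingSketch` is hypothetical, and date detection is reduced to "parses as an ISO yyyy-MM-dd date", whereas Elasticsearch checks a configurable list of date formats.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashMap;
import java.util.Map;

class DynamicMappingSketch {
    // Maps one JSON value to the field type that dynamic mapping would roughly choose
    static String inferType(Object value) {
        if (value instanceof Integer || value instanceof Long) {
            return "long";
        }
        if (value instanceof Float || value instanceof Double) {
            return "float";
        }
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof String) {
            try {
                LocalDate.parse((String) value); // simplified date detection (assumption)
                return "date";
            } catch (DateTimeParseException e) {
                return "text"; // Elasticsearch also adds a keyword sub-field for strings
            }
        }
        return "object";
    }

    static Map<String, String> inferMapping(Map<String, Object> doc) {
        Map<String, String> mapping = new LinkedHashMap<>();
        doc.forEach((field, value) -> mapping.put(field, inferType(value)));
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "蔡徐坤");
        doc.put("age", 10);
        doc.put("birthday", "1998-08-02");
        System.out.println(inferMapping(doc)); // {name=text, age=long, birthday=date}
    }
}
```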
3.4 Updating
Updates are done with a POST command.
- Enter the following command in the Kibana Dev Tools console:
POST /test3/_doc/1/_update
{
"doc": {
"name": "坤坤"
}
}
- Send the request
The response looks like this:
#! Deprecation: [types removal] Specifying types in document update requests is deprecated, use the endpoint /{index}/_update/{id} instead.
{
  "_index" : "test3",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2, // incremented by each update
  "result" : "updated",
  "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 },
  "_seq_no" : 1,
  "_primary_term" : 1
}
The version number has changed.
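3.5 Deleting
Deletion is done with a DELETE command. The command below deletes the whole test1 index.
- Enter the following command in the Kibana Dev Tools console:
DELETE test1
- Send the request
The response looks like this:
{ "acknowledged" : true }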
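3.6 Additional commands
The GET _cat commands return a lot of information about the current Elasticsearch cluster.
- Check the cluster health
GET _cat/health
- List the indices with details
GET _cat/indices?v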
4. Basic document operations
4.1 Adding data with PUT
- Enter the following command in the Kibana Dev Tools console:
PUT /stars/user/1
{
  "name": "蔡徐坤",
  "age": "22",
  "desc": "鸡你太美",
  "tags": ["唱","跳","rap","篮球"]
}
- Send the request
The response looks like this:
{ "_index" : "stars", "_type" : "user", "_id" : "1", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 0, "_primary_term" : 1 }
- Add user 2
PUT /stars/user/2
{
  "name": "吴亦凡",
  "age": "29",
  "desc": "大碗宽面",
  "tags": ["加拿大","电鳗","说唱","嘻哈"]
}
- Add user 3
PUT /stars/user/3
{
  "name": "梁非凡",
  "age": "40",
  "desc": "也啦你",
  "tags": ["桌面清理大师","警察","啵嘴"]
}
- Open head to view the newly created index
4.2 Querying data with GET
- Simple query
GET stars/user/1
{
"_index" : "stars",
"_type" : "user",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"name" : "蔡徐坤",
"age" : "22",
"desc" : "鸡你太美",
"tags" : [
"唱",
"跳",
"rap",
"篮球"
]
}
}
- Complex query
Match documents that contain a keyword:
GET stars/user/_search?q=name:吴亦凡
{
"took" : 64,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.313365,
"hits" : [
{
"_index" : "stars",
"_type" : "user",
"_id" : "2",
"_score" : 2.313365, // relevance score
"_source" : {
"name" : "吴亦凡",
"age" : "29",
"desc" : "大碗宽面",
"tags" : [
"加拿大",
"电鳗",
"说唱",
"嘻哈"
]
}
},
{
"_index" : "stars",
"_type" : "user",
"_id" : "3",
"_score" : 0.4471386,
"_source" : {
"name" : "梁非凡",
"age" : "40",
"desc" : "吔*啦你",
"tags" : [
"桌面清理大师",
"警察",
"啵嘴"
]
}
}
]
}
}
4.3 Updating data with POST
- Enter the following command in the Kibana Dev Tools console:
POST /stars/user/1/_update
{
  "doc": {
    "name": "坤坤"
  }
}
- Send the request
The response looks like this:
{
  "_index" : "stars",
  "_type" : "user",
  "_id" : "1",
  "_version" : 2, // incremented by each update
  "result" : "updated",
  "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 },
  "_seq_no" : 3,
  "_primary_term" : 1
}
4.4 Deleting data with DELETE
5. Advanced queries
5.1 Basic query
- Request
GET stars/user/_search
{
  "query": {
    "match": {
      "name": "凡" // keyword
    }
  }
}
- Response
{ "took" : 0, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 0.4471386, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 0.4471386, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] } }, { "_index" : "stars", "_type" : "user", "_id" : "3", "_score" : 0.4471386, "_source" : { "name" : "梁非凡", "age" : "40", "desc" : "吔*啦你", "tags" : [ "桌面清理大师", "警察", "啵嘴" ] } } ] } }
5.2 Returning only selected fields
- Request
GET stars/user/_search
{
  "query": {
    "match": {
      "name": "凡"
    }
  },
  "_source": ["name", "desc"] // fields to return
}
- Response
{ "took" : 2, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 0.4471386, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 0.4471386, "_source" : { "name" : "吴亦凡", "desc" : "大碗宽面" } }, { "_index" : "stars", "_type" : "user", "_id" : "3", "_score" : 0.4471386, "_source" : { "name" : "梁非凡", "desc" : "吔*啦你" } } ] } }
5.3 Sorting results
- Request
GET stars/user/_search
{
  "query": {
    "match": {
      "name": "凡"
    }
  },
  "sort": [
    {
      "age.keyword": {
        "order": "desc" // descending
      }
    }
  ]
}
- Response
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : null, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "3", "_score" : null, "_source" : { "name" : "梁非凡", "age" : "40", "desc" : "吔*啦你", "tags" : [ "桌面清理大师", "警察", "啵嘴" ] }, "sort" : [ "40" ] }, { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : null, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] }, "sort" : [ "29" ] } ] } }
5.4 Paginating results
- Request
GET stars/user/_search
{
  "query": {
    "match": {
      "name": "凡"
    }
  },
  "_source": ["name", "desc"],
  "from": 0, // offset of the first hit
  "size": 1 // number of hits to return
}
- Response
{ "took" : 3, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 0.4471386, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 0.4471386, "_source" : { "name" : "吴亦凡", "desc" : "大碗宽面" } } ] } }
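The from and size parameters behave like an offset and a limit over the sorted hits. A minimal sketch of that arithmetic follows; the class `PaginationSketch` and the 1-based page numbering are assumptions for this example.

```java
import java.util.List;

class PaginationSketch {
    // Converts a 1-based page number into the "from" offset
    static int from(int page, int size) {
        return (page - 1) * size;
    }

    // Returns the slice of hits that from/size would select, clamped to the list bounds
    static <T> List<T> page(List<T> hits, int from, int size) {
        int start = Math.min(Math.max(from, 0), hits.size());
        int end = Math.min(start + size, hits.size());
        return hits.subList(start, end);
    }

    public static void main(String[] args) {
        List<String> hits = List.of("吴亦凡", "梁非凡");
        System.out.println(page(hits, from(1, 1), 1)); // first page
        System.out.println(page(hits, from(2, 1), 1)); // second page
    }
}
```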
5.5 Multi-condition queries
must: equivalent to AND in a relational database
- Request
GET stars/user/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "吴亦凡" } },
        { "match": { "age": "29" } }
      ]
    }
  }
}
- Response
{ "took" : 5, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 3.2941942, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 3.2941942, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] } } ] } }
should: equivalent to OR in a relational database
- Request
GET stars/user/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": "吴亦凡" } },
        { "match": { "age": "29" } }
      ]
    }
  }
}
- Response
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 3.2941942, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 3.2941942, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] } }, { "_index" : "stars", "_type" : "user", "_id" : "3", "_score" : 0.4471386, "_source" : { "name" : "梁非凡", "age" : "40", "desc" : "吔*啦你", "tags" : [ "桌面清理大师", "警察", "啵嘴" ] } } ] } }
must_not: equivalent to NOT in a relational database
- Request
GET stars/user/_search
{
  "query": {
    "bool": {
      "must_not": [
        { "match": { "age": "29" } }
      ]
    }
  }
}
- Response
{ "took" : 2, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 0.0, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "3", "_score" : 0.0, "_source" : { "name" : "梁非凡", "age" : "40", "desc" : "吔*啦你", "tags" : [ "桌面清理大师", "警察", "啵嘴" ] } }, { "_index" : "stars", "_type" : "user", "_id" : "1", "_score" : 0.0, "_source" : { "name" : "坤坤", "age" : "22", "desc" : "鸡你太美", "tags" : [ "唱", "跳", "rap", "篮球" ] } } ] } }
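The three bool clauses map onto boolean combinators, which the following sketch reproduces over the three documents in the stars index. Everything here is illustrative: `BoolQuerySketch` is a hypothetical class, the name match is simplified to "shares at least one character with the query" (which is how the default analyzer's character-by-character split of Chinese text behaves in the responses above), and scoring and result order are not modeled.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class BoolQuerySketch {
    static final class User {
        final String name;
        final String age;

        User(String name, String age) {
            this.name = name;
            this.age = age;
        }
    }

    // The three documents indexed earlier (user 1 was renamed to 坤坤 by the update)
    static final List<User> USERS = List.of(
            new User("坤坤", "22"), new User("吴亦凡", "29"), new User("梁非凡", "40"));

    // Simplified match on name: true if any character of the query occurs in the name
    static Predicate<User> nameMatches(String query) {
        return user -> query.chars().anyMatch(c -> user.name.indexOf(c) >= 0);
    }

    static Predicate<User> ageIs(String age) {
        return user -> user.age.equals(age);
    }

    static List<String> names(Predicate<User> query) {
        return USERS.stream().filter(query).map(user -> user.name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Predicate<User> name = nameMatches("吴亦凡");
        Predicate<User> age = ageIs("29");
        System.out.println("must:     " + names(name.and(age)));  // AND
        System.out.println("should:   " + names(name.or(age)));   // OR
        System.out.println("must_not: " + names(age.negate()));   // NOT
    }
}
```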
5.6 Filtering query results
- Request
GET stars/user/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "凡" } }
      ],
      "filter": [
        {
          "range": {
            "age": {
              "gte": 10, // at least 10 years old
              "lte": 30 // at most 30 years old
            }
          }
        }
      ]
    }
  }
}
- Response
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 0.4471386, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 0.4471386, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] } } ] } }
5.7 Matching several terms
- Request
GET stars/user/_search
{
  "query": {
    "match": {
      "tags": "唱 跳" // separate multiple terms with spaces
    }
  }
}
- Response
{ "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 1.7137355, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "1", "_score" : 1.7137355, "_source" : { "name" : "坤坤", "age" : "22", "desc" : "鸡你太美", "tags" : [ "唱", "跳", "rap", "篮球" ] } }, { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 0.4471386, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] } } ] } }
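5.8 Exact queries
About analysis:
- term: looks up the exact term in the inverted index, without analyzing the query
- match: analyzes the query text first, then searches with the resulting terms
Two field types:
- text: is processed by the analyzer
- keyword: is not processed by the analyzer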
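The interplay between the two query kinds and the two field types can be sketched as follows. This is a toy model, not Elasticsearch code: `TermVsMatchSketch` is hypothetical, and the "analyzer" is assumed to only lowercase and split on whitespace.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class TermVsMatchSketch {
    // Stand-in analyzer (assumption): lowercase, then split on whitespace
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    // term query on a text field: the query is NOT analyzed, the indexed value is
    static boolean termOnText(String fieldValue, String query) {
        return analyze(fieldValue).contains(query);
    }

    // term query on a keyword field: the whole original string is the only term
    static boolean termOnKeyword(String fieldValue, String query) {
        return fieldValue.equals(query);
    }

    // match query on a text field: both sides are analyzed, any shared term is a hit
    static boolean matchOnText(String fieldValue, String query) {
        List<String> indexedTerms = analyze(fieldValue);
        return analyze(query).stream().anyMatch(indexedTerms::contains);
    }

    public static void main(String[] args) {
        String value = "Study every day";
        System.out.println(termOnText(value, "Study"));              // false: index holds "study"
        System.out.println(termOnText(value, "study"));              // true
        System.out.println(termOnKeyword(value, "Study every day")); // true
        System.out.println(matchOnText(value, "STUDY hard"));        // true
    }
}
```

This is also why the Java search example later in the article runs its term query against name.keyword rather than name.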
5.9 Highlighting
- Request
GET stars/user/_search
{
  "query": {
    "match": {
      "name": "吴亦凡"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}
- Response (matched terms are wrapped in <em> highlight tags)
{ "took" : 96, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 2, "relation" : "eq" }, "max_score" : 2.313365, "hits" : [ { "_index" : "stars", "_type" : "user", "_id" : "2", "_score" : 2.313365, "_source" : { "name" : "吴亦凡", "age" : "29", "desc" : "大碗宽面", "tags" : [ "加拿大", "电鳗", "说唱", "嘻哈" ] }, "highlight" : { "name" : [ "<em>吴</em><em>亦</em><em>凡</em>" ] } }, { "_index" : "stars", "_type" : "user", "_id" : "3", "_score" : 0.4471386, "_source" : { "name" : "梁非凡", "age" : "40", "desc" : "吔*啦你", "tags" : [ "桌面清理大师", "警察", "啵嘴" ] }, "highlight" : { "name" : [ "梁非<em>凡</em>" ] } } ] } }
5.10 Custom highlight tags
- Request
GET stars/user/_search
{
"query": {
"match": {
"name": "吴亦凡"
}
},
"highlight": {
"pre_tags": "<p class='key' style='color:red'>",
"post_tags": "</p>",
"fields": {
"name": {}
}
}
}
- Response
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.313365,
"hits" : [
{
"_index" : "stars",
"_type" : "user",
"_id" : "2",
"_score" : 2.313365,
"_source" : {
"name" : "吴亦凡",
"age" : "29",
"desc" : "大碗宽面",
"tags" : [
"加拿大",
"电鳗",
"说唱",
"嘻哈"
]
},
"highlight" : {
"name" : [
"<p class='key' style='color:red'>吴</p><p class='key' style='color:red'>亦</p><p class='key' style='color:red'>凡</p>"
]
}
},
{
"_index" : "stars",
"_type" : "user",
"_id" : "3",
"_score" : 0.4471386,
"_source" : {
"name" : "梁非凡",
"age" : "40",
"desc" : "吔*啦你",
"tags" : [
"桌面清理大师",
"警察",
"啵嘴"
]
},
"highlight" : {
"name" : [
"梁非<p class='key' style='color:red'>凡</p>"
]
}
}
]
}
}
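The highlight fragments in these responses can be reproduced with a few lines of string handling. The sketch below is only a model of the visible behavior, not of the Elasticsearch highlighter: `HighlightSketch` is a hypothetical class, and it assumes that each character is its own token, which is how the name field was analyzed here.

```java
class HighlightSketch {
    // Wraps every character of fieldValue that also occurs in the query with the given tags
    static String highlight(String fieldValue, String query, String preTag, String postTag) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fieldValue.length(); i++) {
            char c = fieldValue.charAt(i);
            if (query.indexOf(c) >= 0) {
                out.append(preTag).append(c).append(postTag);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pre = "<p class='key' style='color:red'>";
        String post = "</p>";
        System.out.println(highlight("吴亦凡", "吴亦凡", pre, post));
        System.out.println(highlight("梁非凡", "吴亦凡", pre, post));
    }
}
```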
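6. Integrating Spring Boot with Elasticsearch
6.1 Environment setup
- Add the dependency
Make sure the Elasticsearch client version matches the server version.
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
- Write the configuration class
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.RestClients;
import org.springframework.data.elasticsearch.config.AbstractElasticsearchConfiguration;
@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {
    @Override
    @Bean
    public RestHighLevelClient elasticsearchClient() {
        // Address of the Elasticsearch node to connect to
        final ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo("39.105.80.221:9200")
                .build();
        return RestClients.create(clientConfiguration).rest();
    }
}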
6.2 Index operations
- Creating an index
@SpringBootTest
class ElasticApplicationTests {
    @Autowired
    RestHighLevelClient elasticsearchClient;
    /**
     * Tests creating an index
     */
    @Test
    void test01() throws IOException {
        // Build the request
        CreateIndexRequest request = new CreateIndexRequest("test_index");
        // Execute the request with the client
        CreateIndexResponse response = elasticsearchClient.indices().create(request, RequestOptions.DEFAULT);
        System.out.println(response);
    }
}
- Checking whether an index exists
@SpringBootTest
class ElasticApplicationTests {
    @Autowired
    RestHighLevelClient elasticsearchClient;
    /**
     * Tests checking whether an index exists
     */
    @Test
    void test02() throws IOException {
        // Build the request
        GetIndexRequest request = new GetIndexRequest("test_index");
        // Execute the request with the client
        boolean response = elasticsearchClient.indices().exists(request, RequestOptions.DEFAULT);
        System.out.println(response);
    }
}
- Deleting an index
@SpringBootTest
class ElasticApplicationTests {
    @Autowired
    RestHighLevelClient elasticsearchClient;
    /**
     * Tests deleting an index
     */
    @Test
    void test03() throws IOException {
        // Build the request
        DeleteIndexRequest request = new DeleteIndexRequest("test_index");
        // Execute the request with the client
        AcknowledgedResponse response = elasticsearchClient.indices().delete(request, RequestOptions.DEFAULT);
        System.out.println(response.isAcknowledged());
    }
}
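6.3 Document operations
The tests below use an objectMapper field and a User class that are not shown here; they are assumed to be a Jackson ObjectMapper injected with @Autowired and a simple class with a name and an age.
- Adding a document
@Test
void test04() throws IOException {
    // Create the object
    User user = new User("testUser", 18);
    // Build the request
    IndexRequest request = new IndexRequest("test_index");
    // Set the document id
    request.id("1");
    // Set the request timeout
    request.timeout(TimeValue.timeValueSeconds(1));
    // Serialize the object to JSON and put it into the request
    request.source(objectMapper.writeValueAsString(user), XContentType.JSON);
    // Send the request with the client
    IndexResponse response = elasticsearchClient.index(request, RequestOptions.DEFAULT);
    System.out.println(response.toString());
    System.out.println(response.status());
}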
- Checking whether a document exists
@Test
void test05() throws IOException {
    // Build the request
    GetRequest request = new GetRequest("test_index", "1");
    // Send the request with the client
    boolean response = elasticsearchClient.exists(request, RequestOptions.DEFAULT);
    System.out.println(response);
}
- Getting a document
@Test
void test06() throws IOException {
    // Build the request
    GetRequest request = new GetRequest("test_index", "1");
    // Send the request with the client
    GetResponse response = elasticsearchClient.get(request, RequestOptions.DEFAULT);
    System.out.println(response.getSourceAsString());
}
- Updating a document
@Test
void test07() throws IOException {
    // Create the object
    User user = new User("testUser", 28);
    // Build the request
    UpdateRequest request = new UpdateRequest("test_index", "1");
    request.doc(objectMapper.writeValueAsString(user), XContentType.JSON);
    // Send the request with the client
    UpdateResponse response = elasticsearchClient.update(request, RequestOptions.DEFAULT);
    System.out.println(response.status());
}
- Deleting a document
@Test
void test08() throws IOException {
    // Build the request
    DeleteRequest request = new DeleteRequest("test_index", "1");
    // Set the request timeout
    request.timeout(TimeValue.timeValueSeconds(1));
    // Send the request with the client
    DeleteResponse response = elasticsearchClient.delete(request, RequestOptions.DEFAULT);
    System.out.println(response.status());
}
- Bulk-inserting documents
@Test
void test09() throws IOException {
    // Build the request
    BulkRequest request = new BulkRequest();
    // Set the timeout
    request.timeout(TimeValue.timeValueSeconds(10));
    // Create the batch of data
    ArrayList<User> users = new ArrayList<>();
    users.add(new User("testUser02", 20));
    users.add(new User("testUser03", 21));
    users.add(new User("testUser04", 22));
    users.add(new User("testUser05", 23));
    users.add(new User("testUser06", 24));
    // Add each item to the bulk request, using the list position as the document id
    for (int i = 0; i < users.size(); i++) {
        request.add(
                new IndexRequest("test_index")
                        .id("" + i)
                        .source(objectMapper.writeValueAsString(users.get(i)), XContentType.JSON)
        );
    }
    // Send the request with the client
    BulkResponse responses = elasticsearchClient.bulk(request, RequestOptions.DEFAULT);
    // false means every item succeeded
    System.out.println(responses.hasFailures());
}
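For orientation, the bulk request above corresponds to the newline-delimited JSON body of the REST _bulk endpoint: one action line followed by one source line per document, each ending with a newline. The sketch below builds such a body by hand; the class `BulkBodySketch` and its method are hypothetical, and the documents are passed in as ready-made JSON strings rather than serialized from User objects.

```java
import java.util.List;

class BulkBodySketch {
    // Builds the NDJSON body for indexing the given JSON documents with ids 0, 1, 2, ...
    static String bulkBody(String index, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < jsonDocs.size(); i++) {
            // Action and metadata line
            body.append("{\"index\":{\"_index\":\"").append(index)
                .append("\",\"_id\":\"").append(i).append("\"}}\n");
            // Document source line; the final newline is required by the bulk format
            body.append(jsonDocs.get(i)).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        System.out.print(bulkBody("test_index", List.of(
                "{\"name\":\"testUser02\",\"age\":20}",
                "{\"name\":\"testUser03\",\"age\":21}")));
    }
}
```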
- Searching documents
@Test
void test10() throws IOException {
    // Build the request
    SearchRequest request = new SearchRequest("test_index");
    // Build the search source
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Exact term query on the keyword sub-field of name
    sourceBuilder.query(QueryBuilders.termQuery("name.keyword", "testUser02"));
    // Set the timeout
    sourceBuilder.timeout(TimeValue.timeValueSeconds(60));
    request.source(sourceBuilder);
    // Send the request with the client
    SearchResponse response = elasticsearchClient.search(request, RequestOptions.DEFAULT);
    System.out.println(objectMapper.writeValueAsString(response.getHits()));
    for (SearchHit hit : response.getHits().getHits()) {
        System.out.println("----------");
        System.out.println(hit.getSourceAsMap());
    }
}
7. Hands-on project: JD.com search
7.1 Environment setup
- Add the dependencies
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.13.1</version>
</dependency>
- Configuration file
server:
  port: 8080
spring:
  thymeleaf:
    cache: false # disable the thymeleaf cache
- controller
@Controller
public class IndexController {
    @GetMapping({"/", "/index"})
    public String index() {
        return "index";
    }
}