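Recap
The previous article, Quickly Downloading, Installing, and Running ElasticSearch and Kibana Containers with Docker, covered the installation of ElasticSearch and Kibana in detail. This article introduces ElasticSearch's core concepts, its REST API, and its integration with Spring Boot.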
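Core concepts
Inverted Index
Inverted index: a data structure that maps terms to documents. Each term produced by analyzing a document becomes an index key that points to the set of ids of the documents containing it. A query looks up its terms to get the matching document ids directly and then fetches those documents, which is far faster than scanning every document.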
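The term-to-document-id mapping described above can be sketched in a few lines of plain Java. This is only an illustration, not how ElasticSearch stores its index: the class name `InvertedIndexSketch` is hypothetical, and splitting on whitespace stands in for a real analyzer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // term -> sorted set of ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Index a document: lowercase it and split on whitespace (a stand-in for a real analyzer)
    public void add(int docId, String text) {
        for (String term : text.toLowerCase().split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the ids of all documents that contain the term (empty if none)
    public SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "redmi note 12");
        index.add(2, "redmi note 13");
        index.add(3, "xiaomi 12");
        System.out.println(index.search("redmi")); // [1, 2]
        System.out.println(index.search("12"));    // [1, 3]
    }
}
```

A lookup touches only the entry for the queried term, so its cost does not grow with the number of documents that do not contain it.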
index
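Index: a collection that organizes and stores documents of the same kind, similar to a table in MySQL. An index also carries settings such as how its documents are sharded.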
mapping
Mapping: defines the structure of the documents in an index, including the type of each field.
data type
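Data type: common ElasticSearch field types include Binary, Boolean, Date, Geopoint, IP, Keyword, Numeric, Object, and Text. A Text field is analyzed (split into terms) when it is indexed, whereas a Keyword field is indexed as a single unanalyzed value.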
document
Document: a unit of data stored in an index, similar to a row in a MySQL table.
analyzer
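Analyzer: defines how a text field is split into terms. ElasticSearch's default analyzer is Standard, which splits Chinese text into single characters and is therefore a poor fit for Chinese. The IK analyzer plugin can be used as the Chinese analyzer instead.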
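To see why the Standard analyzer is a poor fit for Chinese, the toy tokenizer below imitates its visible behavior on mixed text: every Han character becomes its own token, while runs of letters or digits stay together. It is a rough approximation written for illustration only; `SingleCharTokenizerSketch` is a hypothetical name and none of this is ElasticSearch or Lucene code.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleCharTokenizerSketch {
    // Emits one token per Han character; runs of other letters or digits form one lowercase token
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            if (!han && Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
                continue;
            }
            // A Han character or a separator ends the word collected so far
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (han) {
                tokens.add(new String(Character.toChars(cp)));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("美利坚合众国")); // [美, 利, 坚, 合, 众, 国]
        System.out.println(tokenize("小米12 Pro"));   // [小, 米, 12, pro]
    }
}
```

With single-character terms, a search for one character matches any document containing it anywhere, which is why a dictionary-based analyzer such as IK gives more useful results for Chinese.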
Operating ElasticSearch from Kibana
Index operations
We use product information as the running example.
- Create the product index
PUT good
{
"mappings": {
"properties": {
"goodNumber": {
"type": "keyword",
"index": false
},
"category": {
"type": "keyword"
},
"brand": {
"type": "keyword"
},
"goodName": {
"type": "text"
},
"price": {
"type": "double"
},
"store": {
"type": "integer"
},
"image": {
"type": "keyword",
"index": false
},
"upTime": {
"type": "date"
}
}
}
}
- Get index information
GET good
- Get the index mapping
GET good/_mapping
- Delete the index
DELETE good
Document operations
- Add a document (without specifying a document id)
POST good/_doc
{
"brand": "小米",
"category": "手机",
"goodName": "小米12",
"goodNumber": "xm123",
"image": "xiaomi.png",
"price": 1500,
"store": 20,
"upTime": 1715571178304
}
- Add a document (with a specified document id)
POST good/_doc/3001
{
"brand": "小米",
"category": "手机",
"goodName": "小米12 pro",
"goodNumber": "xm1234",
"image": "xiaomi.png",
"price": 1800,
"store": 30,
"upTime": 1715571178304
}
- Add documents in bulk
POST /_bulk
{"index": {"_index": "good"}}
{"brand": "红米","category": "手机","goodName": "红米 note 12","goodNumber": "hm123","image": "hongmi.png","price": 1500,"store": 35,"upTime": 1715571178304}
{"index": {"_index": "good"}}
{"brand": "红米","category": "手机","goodName": "红米 note 13","goodNumber": "hm124","image": "hongmi.png","price": 1700,"store": 35,"upTime": 1715571178304}
{"index": {"_index": "good"}}
{"brand": "红米","category": "手机","goodName": "红米 note 14","goodNumber": "hm125","image": "hongmi.png","price": 1800,"store": 35,"upTime": 1715571178304}
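The `_bulk` body is newline-delimited JSON: each document is preceded by an action line, and every line, including the last one, must end with a newline. A minimal sketch of assembling such a body by hand follows. It assumes each document is already serialized as single-line JSON and that the index name needs no escaping; `BulkBodySketch` and `bulkIndexBody` are hypothetical names, and in practice a client library would normally build the request.

```java
import java.util.List;

public class BulkBodySketch {
    // Builds the NDJSON body for POST /_bulk: one action line plus one source line per document
    public static String bulkIndexBody(String index, List<String> jsonDocs) {
        StringBuilder body = new StringBuilder();
        for (String doc : jsonDocs) {
            body.append("{\"index\":{\"_index\":\"").append(index).append("\"}}\n");
            body.append(doc).append('\n'); // every line, including the last, ends with a newline
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String body = bulkIndexBody("good", List.of(
                "{\"goodName\":\"红米 note 12\",\"price\":1500}",
                "{\"goodName\":\"红米 note 13\",\"price\":1700}"));
        System.out.print(body);
    }
}
```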
- Get a document by id
GET good/_doc/3001
- Update a document
POST good/_update/3001
{
"doc": {
"upTime" : 1715571178306
}
}
- Delete a document by id
DELETE good/_doc/3001
Full-text search
- Query all documents
GET good/_search
{
"query": {
"match_all": {}
}
}
- Query by a list of ids
GET good/_search
{
"query": {
"ids": {
"values": ["id1","id2"]
}
}
}
- Prefix query (the prefix is matched against indexed terms without being analyzed)
GET good/_search
{
"query": {
"prefix": {
"goodName": {
"value": "小"
}
}
}
}
- Range query
GET good/_search
{
"query": {
"range": {
"price": {
"gte": 1000,
"lte": 2000
}
}
}
}
- Exact match on a single value (the field should preferably be a keyword field)
GET good/_search
{
"query": {
"term": {
"brand": {
"value": "小米"
}
}
}
}
- Exact match on multiple values (the field should preferably be a keyword field)
GET good/_search
{
"query": {
"terms": {
"brand": [
"小米",
"华为"
]
}
}
}
- Single-field full-text match query (the query text is analyzed before matching)
GET good/_search
{
"query": {
"match": {
"goodName": "米"
}
}
}
- Multi-field full-text match query
GET good/_search
{
"query": {
"multi_match": {
"query": "米家",
"fields": ["brand","goodName"]
}
}
}
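- Phrase match (the query text is still analyzed, but its terms must appear next to each other and in the same order)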
GET good/_search
{
"query": {
"match_phrase": {
"goodName": "小米12"
}
}
}
- Filter (filter clauses do not contribute to the relevance score)
GET good/_search
{
"query": {
"bool": {
"filter": [
{
"term": {
"brand": "华为"
}
}
]
}
}
}
- Compound (bool) query
GET good/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"goodName": "米"
}
},
{
"term": {
"brand": "红米"
}
}
]
}
}
}
- Highlighting
GET good/_search
{
"query": {
"match": {
"goodName": "荣耀"
}
},
"highlight": {
"fields": {
"goodName": {
"pre_tags": [
"<p>"
],
"post_tags": [
"</p>"
]
}
}
}
}
- Sorting
GET good/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"price": {
"order": "desc"
}
}
]
}
- Pagination
GET good/_search
{
"query": {
"match_all": {}
},
"from": 0,
"size": 10
}
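In the pagination request above, `from` is a zero-based offset rather than a page number, and by default ElasticSearch rejects a search whose `from + size` exceeds 10,000 (the `index.max_result_window` setting). The helper below is a small sketch of converting a 1-based page number into `from`; `PageOffsetSketch` and its method are hypothetical names, and the limit is hard-coded on the assumption that the index keeps the default setting.

```java
public class PageOffsetSketch {
    // Default value of the index.max_result_window setting
    static final int MAX_RESULT_WINDOW = 10_000;

    // Converts a 1-based page number into the zero-based "from" offset
    public static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        long from = (long) (page - 1) * size; // long avoids int overflow for large page numbers
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds " + MAX_RESULT_WINDOW);
        }
        return (int) from;
    }

    public static void main(String[] args) {
        System.out.println(from(1, 10)); // 0
        System.out.println(from(3, 10)); // 20
    }
}
```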
- Aggregation
GET good/_search
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"grandTermAgg": {
"terms": {
"field": "brand"
}
}
}
}
- Nested aggregation
GET good/_search
{
"query": {
"match_all": {}
},
"size": 0,
"aggs": {
"grandTermAgg": {
"terms": {
"field": "brand"
},
"aggs": {
"priceAvgAgg": {
"avg": {
"field": "price"
}
},
"priceMinAgg": {
"min": {
"field": "price"
}
},
"priceMaxAgg": {
"max": {
"field": "price"
}
}
}
}
}
}
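What the nested aggregation computes can be reproduced in memory: documents are grouped into one bucket per brand (the `terms` aggregation), and the average, minimum, and maximum price are then computed inside each bucket. The sketch below uses plain Java streams rather than the ElasticSearch API; `NestedAggSketch` and `Item` are hypothetical names, and the sample prices are the ones from the documents indexed earlier.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class NestedAggSketch {
    // Minimal stand-in for a product document
    public static final class Item {
        private final String brand;
        private final double price;

        public Item(String brand, double price) {
            this.brand = brand;
            this.price = price;
        }

        public String brand() { return brand; }
        public double price() { return price; }
    }

    // Bucket by brand, then summarize the price inside each bucket.
    // The TreeMap only gives a stable print order; it is not the bucket order ElasticSearch returns.
    public static Map<String, DoubleSummaryStatistics> priceStatsByBrand(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(
                Item::brand, TreeMap::new, Collectors.summarizingDouble(Item::price)));
    }

    public static void main(String[] args) {
        List<Item> items = List.of(
                new Item("小米", 1500), new Item("小米", 1800),
                new Item("红米", 1500), new Item("红米", 1700), new Item("红米", 1800));
        priceStatsByBrand(items).forEach((brand, s) -> System.out.println(
                brand + ": count=" + s.getCount() + ", avg=" + s.getAverage()
                        + ", min=" + s.getMin() + ", max=" + s.getMax()));
    }
}
```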
Operating ElasticSearch from Spring Boot with the Java client
- Initialize a Spring Boot project
The Spring Boot version used here is 3.2.5.
- Add the Maven dependencies
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>co.elastic.clients</groupId>
<artifactId>elasticsearch-java</artifactId>
<version>8.13.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.17.0</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
If the project throws ClassNotFoundException: jakarta.json.spi.JsonProvider at startup, add the jakarta.json-api dependency as well:
<dependency>
<groupId>jakarta.json</groupId>
<artifactId>jakarta.json-api</artifactId>
<version>2.0.1</version>
</dependency>
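- Configure the ElasticSearch connection in application.yml
es:
  ip: es ip
  port: es port
  user: es username
  pwd: es password
  ca_path: D:\ca\http_ca.crt # path to the ES CA certificate (see the previous article for how to obtain it); change it to your own path
- Write the configuration class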
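import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import co.elastic.clients.transport.ElasticsearchTransport;
import co.elastic.clients.transport.TransportUtils;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.elasticsearch.client.RestClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSession;
import java.io.File;
import java.io.IOException;

@Configuration
public class ElasticConfig {
    @Value("${es.ip}")
    private String ip;
    @Value("${es.port}")
    private int port;
    @Value("${es.user}")
    private String user;
    @Value("${es.pwd}")
    private String pwd;
    @Value("${es.ca_path}")
    private String caPath;

    @Bean
    public ElasticsearchClient elasticsearchClient() {
        try {
            // Build an SSLContext that trusts the ElasticSearch CA certificate
            SSLContext sslContext = TransportUtils.sslContextFromHttpCaCrt(new File(caPath));
            // Basic authentication with the configured user name and password
            BasicCredentialsProvider credsProv = new BasicCredentialsProvider();
            credsProv.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(user, pwd));
            RestClient restClient = RestClient
                    .builder(new HttpHost(ip, port, "https"))
                    .setHttpClientConfigCallback(hc -> hc
                            .setSSLContext(sslContext)
                            .setDefaultCredentialsProvider(credsProv)
                            .setSSLHostnameVerifier(new MyHostnameVerifier()))
                    .build();
            // Create the transport and the API client
            ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
            return new ElasticsearchClient(transport);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Accepts every host name, which disables host name verification; use only in development
    private static class MyHostnameVerifier implements HostnameVerifier {
        @Override
        public boolean verify(String s, SSLSession sslSession) {
            return true;
        }
    }
}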
- Write the product class
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.Date;

@Data
@NoArgsConstructor
@AllArgsConstructor
public class Good {
    private String id;
    private String goodNumber;
    private String category;
    private String brand;
    private String goodName;
    private Double price;
    private Integer store;
    private String image;
    private Date upTime;
}
- Inject the ElasticSearch client into the Spring Boot test class
@Autowired
private ElasticsearchClient esClient;
- Create the index
/**
 * Create the index
 */
@Test
void createIndex() throws IOException {
    // Mapping definition, the same one used in the Kibana example
    String mapping = """
            {
              "mappings": {
                "properties": {
                  "goodNumber": {"type": "keyword", "index": false},
                  "category": {"type": "keyword"},
                  "brand": {"type": "keyword"},
                  "goodName": {"type": "text"},
                  "price": {"type": "double"},
                  "store": {"type": "integer"},
                  "image": {"type": "keyword", "index": false},
                  "upTime": {"type": "date"}
                }
              }
            }
            """;
    // try-with-resources closes the stream even if the request fails
    try (InputStream inputStream = new ByteArrayInputStream(mapping.getBytes(StandardCharsets.UTF_8))) {
        CreateIndexRequest createIndexRequest = CreateIndexRequest.of(i -> i.index("good").withJson(inputStream));
        boolean acknowledged = esClient.indices().create(createIndexRequest).acknowledged();
        System.out.println("Create result: " + acknowledged);
    }
}
- Get index information
/**
 * Get index information
 *
 * @throws IOException if the request fails
 */
@Test
void getIndex() throws IOException {
    GetIndexRequest getIndexRequest = GetIndexRequest.of(i -> i.index("good"));
    IndexState index = esClient.indices().get(getIndexRequest).get("good");
    System.out.println("Index information: " + index);
}
- Get the index mapping
/**
 * Get the index mapping
 *
 * @throws IOException if the request fails
 */
@Test
void getIndexMapping() throws IOException {
    GetMappingRequest getMappingRequest = GetMappingRequest.of(m -> m.index("good"));
    GetMappingResponse mapping = esClient.indices().getMapping(getMappingRequest);
    System.out.println("Mapping: " + mapping);
}
- Delete the index
/**
 * Delete the index
 *
 * @throws IOException if the request fails
 */
@Test
void deleteIndex() throws IOException {
    DeleteIndexRequest deleteIndexRequest = DeleteIndexRequest.of(i -> i.index("good"));
    boolean acknowledged = esClient.indices().delete(deleteIndexRequest).acknowledged();
    System.out.println("Delete result: " + acknowledged);
}
- Create a document without specifying an id
/**
 * Create a document without specifying an id
 */
@Test
void createDoc() throws IOException {
    Good good = new Good(null, "g-123456", "手机", "华为", "荣耀V20", 2000.0, 100, "phong.png", new Date());
    // ElasticSearch generates the document id
    IndexRequest<Good> goodDoc = IndexRequest.of(d -> d.index("good").document(good));
    Result result = esClient.index(goodDoc).result();
    System.out.println("Create document result: " + result);
}
- Create a document with a specified id
/**
 * Create a document with a specified id
 */
@Test
void createDocAssignId() throws IOException {
    Good good = new Good(null, "g-123457", "手机", "华为", "荣耀V30", 2500.0, 100, "phong.png", new Date());
    IndexRequest<Good> goodDoc = IndexRequest.of(d -> d.index("good").id("1").document(good));
    Result result = esClient.index(goodDoc).result();
    System.out.println("Create document result: " + result);
}
- Create several documents
/**
 * Create several documents with one bulk request
 */
@Test
void createDocs() throws IOException {
    Good good1 = new Good(null, "g-123458", "手机", "华为", "荣耀V40", 30000.0, 100, "phong.png", new Date());
    Good good2 = new Good(null, "g-123459", "手机", "华为", "荣耀V50", 3500.0, 100, "phong.png", new Date());
    Good good3 = new Good(null, "g-123460", "手机", "华为", "荣耀V60", 4000.0, 100, "phong.png", new Date());
    // One index operation per document
    List<BulkOperation> operations = new ArrayList<>();
    BulkOperation operation1 = BulkOperation.of(b -> b.index(i -> i.index("good").document(good1)));
    BulkOperation operation2 = BulkOperation.of(b -> b.index(i -> i.index("good").document(good2)));
    BulkOperation operation3 = BulkOperation.of(b -> b.index(i -> i.index("good").document(good3)));
    operations.add(operation1);
    operations.add(operation2);
    operations.add(operation3);
    BulkRequest bulkRequest = BulkRequest.of(b -> b.operations(operations));
    List<BulkResponseItem> items = esClient.bulk(bulkRequest).items();
    System.out.println("Bulk result: " + items);
}
- Get a single document
/**
 * Get a single document
 */
@Test
void getDoc() throws IOException {
    GetRequest getRequest = GetRequest.of(g -> g.index("good").id("3001"));
    GetResponse<Good> good = esClient.get(getRequest, Good.class);
    System.out.println("Document: " + good);
}
- Update a document
/**
 * Update a document
 */
@Test
void updateDoc() throws IOException {
    // Partial document: only the price is set
    Good good = new Good(null, null, null, null, null, 2580.0, null, null, null);
    // Type the request with Good so that it matches the Good.class argument below
    UpdateRequest<Good, Good> updateRequest = UpdateRequest.of(u -> u.index("good").id("1").doc(good));
    Result result = esClient.update(updateRequest, Good.class).result();
    System.out.println("Update result: " + result);
}
- Delete a document
/**
 * Delete a document
 */
@Test
void deleteDoc() throws IOException {
    DeleteRequest deleteRequest = DeleteRequest.of(d -> d.index("good").id("1"));
    Result result = esClient.delete(deleteRequest).result();
    System.out.println("Delete result: " + result);
}
- Query all documents
/**
 * Query all documents
 */
@Test
void queryAllDocs() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good").query(q -> q.matchAll(m -> m)));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Query by a list of ids
/**
 * Query by a list of ids
 */
@Test
void termQueryByIds() throws IOException {
    List<String> ids = Arrays.asList("AEwCcI8B5Qoto6bjuLsK", "AUwCcI8B5Qoto6bjuLsK");
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good").query(q -> q.ids(i -> i.values(ids))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Prefix query
/**
 * Prefix query (the prefix is matched against indexed terms without being analyzed)
 */
@Test
void termQueryByPrefix() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.prefix(m -> m.field("goodName").value("小"))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Range query
/**
 * Range query
 */
@Test
void termQueryByRange() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good").query(q -> q.range(r -> r.field("price")
            .gte(JsonData.of(1000)).lte(JsonData.of(2000)).boost(2.0F))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Exact match on a single value (the field should preferably be a keyword field)
/**
 * Exact match on a single value (the field should preferably be a keyword field)
 */
@Test
void termQueryByField() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.term(t -> t.field("brand").value("小米"))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Exact match on multiple values (the field should preferably be a keyword field)
/**
 * Exact match on multiple values (the field should preferably be a keyword field)
 */
@Test
void termQueryByFields() throws IOException {
    List<FieldValue> brands = Arrays.asList(FieldValue.of("小米"), FieldValue.of("华为"));
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good").query(q -> q.terms(t -> t.field("brand")
            .terms(qv -> qv.value(brands)))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Single-field full-text match query
/**
 * Single-field full-text match query (the query text is analyzed before matching)
 */
@Test
void matchQueryByField() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.match(m -> m.field("goodName").query("米"))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Multi-field full-text match query
/**
 * Multi-field full-text match query
 */
@Test
void matchQueryByFields() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.multiMatch(m -> m.fields("goodName", "brand").query("米家"))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Phrase match
/**
 * Phrase match (the query text is still analyzed, but its terms must be adjacent and in the same order)
 */
@Test
void matchQueryByPhrase() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.matchPhrase(p -> p.field("goodName").query("小米12"))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Filter (filter clauses do not contribute to the relevance score)
/**
 * Filter (filter clauses do not contribute to the relevance score)
 */
@Test
void filterQuery() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.bool(b -> b.filter(f -> f.term(t -> t.field("brand").value("华为"))))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Compound (bool) query
/**
 * Compound (bool) query: both clauses must match
 */
@Test
void boolQuery() throws IOException {
    Query q1 = Query.of(q -> q.match(ma -> ma.field("goodName").query("米")));
    Query q2 = Query.of(q -> q.term(ma -> ma.field("brand").value("红米")));
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good").query(q -> q.bool(b -> b.must(q1, q2))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Highlighting
/**
 * Highlighting: wrap matched terms in goodName with the given tags
 */
@Test
void highlightQuery() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.match(m -> m.field("goodName").query("荣耀")))
            .highlight(h -> h.fields("goodName", hf -> hf.preTags(List.of("<p>")).postTags(List.of("</p>")))));
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Sorting
/**
 * Sorting: order by price, highest first
 */
@Test
void queryToSort() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.matchAll(m -> m))
            .sort(so -> so.field(FieldSort.of(f -> f.field("price").order(SortOrder.Desc))))
    );
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Pagination
/**
 * Pagination: the first 10 hits
 */
@Test
void queryToPage() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.matchAll(m -> m))
            .from(0).size(10)
    );
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Aggregation
/**
 * Aggregation: one bucket per brand
 */
@Test
void queryToAgg() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.matchAll(m -> m))
            .size(0) // return only the aggregation, no hits
            .aggregations("grandTermAgg", a -> a.terms(t -> t.field("brand")))
    );
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
- Nested aggregation
/**
 * Nested aggregation: average, minimum, and maximum price per brand
 */
@Test
void queryToNestAgg() throws IOException {
    SearchRequest searchRequest = SearchRequest.of(s -> s.index("good")
            .query(q -> q.matchAll(m -> m))
            .size(0) // return only the aggregation, no hits
            .aggregations("grandTermAgg", a -> a
                    .terms(t -> t.field("brand"))
                    .aggregations("priceAvgAgg", na1 -> na1.avg(g -> g.field("price")))
                    .aggregations("priceMinAgg", na1 -> na1.min(g -> g.field("price")))
                    .aggregations("priceMaxAgg", na1 -> na1.max(g -> g.field("price")))
            )
    );
    SearchResponse<Good> search = esClient.search(searchRequest, Good.class);
    System.out.println("Search result: " + search);
}
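Using the IK analyzer
- Download the IK analyzer
Download address: IK analyzer. Note that the IK analyzer version must match the ElasticSearch version; the version downloaded here is 8.13.2.
- Unzip it and upload it to the Linux host
- Copy the unzipped IK analyzer into the plugins directory of the ElasticSearch container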
docker cp /opt/elasticsearch-analysis-ik-8.13.2/ a1dd2cdfb62b:/usr/share/elasticsearch/plugins/
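Here /opt/elasticsearch-analysis-ik-8.13.2/ is where the IK analyzer is stored on the host, a1dd2cdfb62b is the container id, and /usr/share/elasticsearch/plugins/ is the ElasticSearch plugin directory.
- Restart the ElasticSearch container
docker restart <container name>
- Use the analyzers
# Compare the standard, ik_max_word, and ik_smart analyzers on the same text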
POST _analyze
{
"analyzer": "standard",
"text": "美利坚合众国"
}
POST _analyze
{
"analyzer": "ik_max_word",
"text": "美利坚合众国"
}
POST _analyze
{
"analyzer": "ik_smart",
"text": "美利坚合众国"
}
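ik_max_word is the fine-grained analyzer: it emits every word it can find in the text, including overlapping ones. ik_smart is the coarse-grained analyzer: it splits the text into the fewest, longest words.
- Add a product description field to the product index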
PUT good/_mapping
{
"properties": {
"desp": {
"type": "text",
"analyzer": "ik_max_word",
"search_analyzer": "ik_smart"
}
}
}
- Add product data
POST good/_doc
{
"brand": "华为",
"category": "手机",
"goodName": "华为 mate 70",
"desp": "华为 mate 70 是中华人民共和国一款非常优秀的手机",
"goodNumber": "hw1234",
"image": "huawei.png",
"price": 9000,
"store": 1000,
"upTime": 1715571178304
}
- Test full-text search
GET good/_search
{
"query": {
"match": {
"desp": "中华人民共和国的手机"
}
}
}