A One-Stop Guide to Elasticsearch: Learn It in Ten Minutes

歪比巴卜˙Ꙫ˙

Last modified 2023-05-09 10:57:40

Tags: elasticsearch, search engine, java, big data, spring boot

First published 2023-05-09 10:24:47

Article link: https://blog.csdn.net/qq_59567619/article/details/130561755

Notes:

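1.1. I won't spend time introducing ES here. If you are new to it, see "Getting Started" in "Elasticsearch: The Definitive Guide" on the Elastic site (基础入门 | Elasticsearch: 权威指南 | Elastic).

1.2. If you are not sure how to install ES, see "Elastisearch-介绍及安装 · 语雀" (an introduction and installation guide on Yuque).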

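2. ES Clients

Elasticsearch is a real-time, distributed, open-source full-text search and analytics engine. It is accessed through a RESTful web service interface and stores data as schema-less JSON (JavaScript Object Notation) documents. It is built on the Java programming language, so it can run on different platforms. It lets users search large volumes of data at very high speed.

2.1. Three ways to operate ES:

Use the elasticsearch-head plugin
Call the RESTful interface that Elasticsearch provides directly
Use the client API that Elasticsearch provides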

RESTful
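
As a rough illustration of the second approach (calling the REST interface directly), the sketch below builds a search request with the JDK's own java.net.http API (Java 11+). The base URL http://localhost:9200, the class name EsRestRequestSketch, and the method name searchRequest are assumptions made for illustration. The request is only constructed here; actually sending it needs a running cluster.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds (but does not send) a _search request.
final class EsRestRequestSketch {

    // Assumed address of a local, unsecured node.
    static final String BASE_URL = "http://localhost:9200";

    static HttpRequest searchRequest(String index, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + index + "/_search"))
                .header("Content-Type", "application/json")
                // The search API accepts POST as well as GET with a body.
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request =
                searchRequest("same_city_shop", "{\"query\":{\"match_all\":{}}}");
        System.out.println(request.method() + " " + request.uri());
        // To send it against a running cluster:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

POST is used here because the JDK client has no dedicated builder method for a GET with a body; the query examples later in this article are written with GET, as in the Kibana console.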

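3. Integrating ES with Spring Boot

Integrating ES with Spring Boot takes three main steps:

Add the ES dependencies
Configure ES
Demonstrate basic ES operations

Add the dependencies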

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.1.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.1.0</version>
</dependency>

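Create the ES configuration

Configure the ES parameters in the configuration file. The key=value syntax below is the properties format, so it belongs in application.properties; in application.yml the same keys would have to be written as nested YAML: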

elasticsearch.host=localhost
elasticsearch.port=9200
elasticsearch.connTimeout=3000
elasticsearch.socketTimeout=5000
elasticsearch.connectionRequestTimeout=500

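Create the RestHighLevelClient

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Exposes a single RestHighLevelClient bean built from the elasticsearch.* settings.
@Configuration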
public class ElasticsearchConfiguration {

    @Value("${elasticsearch.host}")
    private String host;

    @Value("${elasticsearch.port}")
    private int port;

    @Value("${elasticsearch.connTimeout}")
    private int connTimeout;

    @Value("${elasticsearch.socketTimeout}")
    private int socketTimeout;

    @Value("${elasticsearch.connectionRequestTimeout}")
    private int connectionRequestTimeout;

    @Bean(destroyMethod = "close", name = "client")
    public RestHighLevelClient initRestClient() {
        RestClientBuilder builder = RestClient.builder(new HttpHost(host, port))
                .setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                        .setConnectTimeout(connTimeout)
                        .setSocketTimeout(socketTimeout)
                        .setConnectionRequestTimeout(connectionRequestTimeout));
        return new RestHighLevelClient(builder);
    }
}

Basic ES operations

Create an index

PUT same_city_shop  (PUT <index name>)

Create the mappings, which are comparable to defining the column types of a MySQL table

PUT /same_city_shop
{
  "mappings": {
    "properties": {
      "shopId": {
        "type": "long"
      },
      "shopName": {
        "type": "text",
        "analyzer": "ik_max_word",   // analyzer (provided by the IK analysis plugin); see the documentation for details
        "search_analyzer": "ik_smart"
      },
      "shopLogo": {
        "type": "keyword"
      },
      "shopCoordinate": {
        "type": "geo_point"          // geographic location; see the documentation for details
      },
      "monthlySales": {
        "type": "integer"
      },
      "startingFee": {
        "type": "double"
      },
      "deliveryFee": {
        "type": "double"
      },
      "status": {
        "type": "integer"
      }
    }
  }
}

Delete an index

DELETE same_city_shop

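Document operations

I won't walk through them one by one here. If you are unfamiliar with them, see the article "ElasticSearch 文档的增删改查都不会？" (on creating, reading, updating, and deleting ElasticSearch documents).

ES query statements

Term-level queries (term, terms)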


#term

# Similar to = in MySQL: an exact match, so the search value must equal the indexed field value in full
GET same_city_shop/_search
{
  "query": {
    "term": {
      "status": 1
    }
  }
}

#terms

# Similar to IN in MySQL: an exact match against any value in the list
GET same_city_shop/_search
{
  "query": {
   "terms": {
     "status": [1,2]
    }
  }
}

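Full-text search (match, match_all, multi_match, match_phrase, match_phrase_prefix)

#match

# Matches the value against text or a phrase; loosely similar to LIKE in MySQL, except that the query text is analyzed into terms first, so matching happens on terms rather than on substrings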
GET same_city_shop/_search
{
  "query": {
    "match": {
      "shopName": "智慧药房"
    }
  }
}

#match_all

# Matches all documents
GET same_city_shop/_search
{
  "query": {
    "match_all": {}
  }
}

#multi_match

# Matches text or a phrase against one or more fields. This example returns documents whose status or shopId is 1, similar to OR in MySQL
GET same_city_shop/_search
{
  "query": {
    "multi_match": {
      "query": 1,
      "fields": ["status","shopId"]
    }
  }
}

#match_phrase

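# Phrase query. match_phrase analyzes the search keywords into terms. Every resulting term must appear among the analyzed terms of the searched field, in the same order and, by default, consecutively.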

GET same_city_shop/_search
{
  "query": {
    "match_phrase":{
      "shopName": "药房"
    }
  }
}

#match_phrase_prefix

# Search-as-you-type fuzzy query. Similar to match_phrase, except that the last term of the query string is treated as a prefix. This example matches DYXAJFHSP followed by a term that starts with 12P
GET product/_search
{
  "query":{
    "match_phrase_prefix": {
      "prodNameRecommendInitial": "DYXAJFHSP 12P"
    }
  }
}
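
The difference between the two phrase queries above can be pictured on plain token lists. The sketch below is a toy model, not how Elasticsearch implements it: it assumes the field and the query have already been split into lowercase tokens (real analyzers such as ik_max_word tokenize differently), and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model of phrase matching over already-analyzed token lists.
final class PhraseMatchSketch {

    // match_phrase idea: all query tokens appear consecutively and in order.
    static boolean matchPhrase(List<String> fieldTokens, List<String> queryTokens) {
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(fieldTokens, queryTokens) >= 0;
    }

    // match_phrase_prefix idea: same, but the last query token only has to be a prefix.
    static boolean matchPhrasePrefix(List<String> fieldTokens, List<String> queryTokens) {
        int n = queryTokens.size();
        if (n == 0) {
            return false;
        }
        for (int start = 0; start + n <= fieldTokens.size(); start++) {
            boolean ok = true;
            // All tokens except the last must match exactly.
            for (int i = 0; i < n - 1 && ok; i++) {
                ok = fieldTokens.get(start + i).equals(queryTokens.get(i));
            }
            // The last token is compared as a prefix.
            if (ok && fieldTokens.get(start + n - 1).startsWith(queryTokens.get(n - 1))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("dyxajfhsp", "12pabc", "box");
        List<String> query = Arrays.asList("dyxajfhsp", "12p");
        System.out.println(matchPhrase(field, query));       // exact phrase
        System.out.println(matchPhrasePrefix(field, query)); // last token as prefix
    }
}
```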

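Fuzzy queries

#wildcard

# Works on the same principle as prefix, except that it does not compare only the beginning of the term; it supports more complex matching patterns.
# It uses standard shell-style wildcards: ? matches any single character, and * matches zero or more characters.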
GET product/_search
{
  "query": {
    "wildcard": {
      "prodNameRecommendInitial": "999*"
    }
  }
}
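
To make the meaning of ? and * concrete, here is a minimal sketch that translates such a pattern into a java.util.regex pattern and tests it against a single term. It only illustrates the pattern semantics described above; the class and method names are hypothetical, and Elasticsearch evaluates wildcard queries against indexed terms with its own machinery.

```java
import java.util.regex.Pattern;

// Illustrates shell-style wildcard semantics with a regex translation.
final class WildcardSketch {

    // Translates ? and * into regex equivalents and quotes every other character.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');      // exactly one character
            } else if (c == '*') {
                regex.append(".*");     // zero or more characters
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The pattern has to cover the whole term, not just part of it.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("999*", "999abc"));
        System.out.println(matches("dy?", "dy"));
    }
}
```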

Range queries

#range

# Range query. gt: greater than; gte: greater than or equal to; lt: less than; lte: less than or equal to
GET same_city_shop/_search
{
  "query": {
    "range": {
      "deliveryFee": {
        "gte": 10,
        "lte": 20
      }
    }
  }
}

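# When must and should appear in the same bool query, the should clauses no longer have to match (minimum_should_match defaults to 0 once a must or filter clause is present); they only influence scoring. Here the 999 condition does not restrict the results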
GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "prodNameZh": "江中"
          }
        }
      ],
      "should": [
        {
          "match": {
            "prodNameZh": "999"
          }
        }
      ]
    }
  }
}

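If must and should need to take effect together, put a must at the outermost level and wrap the should conditions in a bool query inside that must.
boost: the weight; the higher the weight, the higher the matching document ranks.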

GET product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "prodNameZh": {
                    "query": "999感冒灵",
                    "boost": 10
                  }
                }
              },
              {
                "match_phrase": {
                  "briefZh": "效果很好"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
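
The interaction between must and should described above comes down to the default value of minimum_should_match. The sketch below models that rule only: each clause is reduced to a boolean saying whether the document satisfied it, must_not and scoring are left out, and the class and method names are hypothetical.

```java
import java.util.List;

// Simplified model of how a bool query decides whether a document matches.
final class BoolQuerySketch {

    // Each list holds, per clause, whether the document satisfied that clause.
    // minimumShouldMatch == null means "use the default".
    static boolean matches(List<Boolean> must, List<Boolean> filter,
                           List<Boolean> should, Integer minimumShouldMatch) {
        // Every must and filter clause is required.
        if (must.contains(false) || filter.contains(false)) {
            return false;
        }
        int required;
        if (minimumShouldMatch != null) {
            required = minimumShouldMatch;
        } else {
            // Default: 1 if there are should clauses and no must/filter, otherwise 0.
            required = (!should.isEmpty() && must.isEmpty() && filter.isEmpty()) ? 1 : 0;
        }
        long satisfied = should.stream().filter(b -> b).count();
        return satisfied >= required;
    }

    public static void main(String[] args) {
        // must matched, should did not: the document still matches by default.
        System.out.println(matches(List.of(true), List.of(), List.of(false), null));
        // Same document with minimum_should_match set to 1.
        System.out.println(matches(List.of(true), List.of(), List.of(false), 1));
    }
}
```

Under this model, nesting the should clauses in a bool query inside must works because the inner bool has no must or filter of its own, so its default is 1. Setting minimum_should_match to 1 on the outer query is another way to require a should match.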

filter containing should

# Similar to an OR query in MySQL: a document matches if it satisfies either condition, or both
GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "wildcard": {
                  "prodNameRecommendInitial": "dy*"
                }
              },
              {
                "term": {
                  "status": 1
                }
              }
            ]
          }
        }
      ]
    }
  }
}

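nested queries

The nested type is a special object type. What makes it special is how arrays of objects are indexed: each object in the array is indexed on its own, so that the objects can be queried independently of one another. It is typically used for fields that hold arrays of JSON objects.

# nested query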
GET product/_search
{
  "query": {
    "nested": {
      "path": "category",
      "query": {
        "bool": {
          "should": [
            {
              "terms": {
                "category.categoryId": [
                  437
                ]
              }
            }
          ]
        }
      }
    }
  }
}
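
Why independent indexing matters can be shown with a small in-memory comparison. With a plain object array, the values of each field are effectively pooled across all objects, so a query can match an id from one object and a name from another; with nested, both conditions must hold inside the same object. The sketch below is only a model of that difference. The name field and all class and method names are hypothetical; only category.categoryId appears in the query above.

```java
import java.util.Arrays;
import java.util.List;

// Contrasts flattened object arrays with nested objects, in memory.
final class NestedVsFlattenedSketch {

    // Hypothetical category object; "name" is an assumed field.
    static final class Category {
        final long categoryId;
        final String name;

        Category(long categoryId, String name) {
            this.categoryId = categoryId;
            this.name = name;
        }
    }

    // Flattened: each condition is checked against the values of all objects.
    static boolean flattenedMatch(List<Category> cats, long id, String name) {
        boolean anyId = cats.stream().anyMatch(c -> c.categoryId == id);
        boolean anyName = cats.stream().anyMatch(c -> c.name.equals(name));
        return anyId && anyName;
    }

    // Nested: both conditions must hold within one and the same object.
    static boolean nestedMatch(List<Category> cats, long id, String name) {
        return cats.stream().anyMatch(c -> c.categoryId == id && c.name.equals(name));
    }

    public static void main(String[] args) {
        List<Category> cats = Arrays.asList(
                new Category(437, "cold"), new Category(500, "vitamins"));
        System.out.println(flattenedMatch(cats, 437, "vitamins")); // cross-object match
        System.out.println(nestedMatch(cats, 437, "vitamins"));    // no single object has both
    }
}
```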

Using filter and must_not together


# must_not and filter sit at the same level. must_not means the document must not match the clause, similar to NOT IN in MySQL
GET product/_search
{
  "query":{
    "bool": {
      "filter": [
        {
          "term": {
            "status": 1
          }
        },
        {
          "term": {
            "hasStock":true
          }
        },
        {
          "term": {
            "prodType": 0
          }
        },
        {
          "terms": {
            "prodId": [3,2]
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "prodId":2
          }
        }
      ]
    }
  }
}

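Geolocation queries

ES provides two ways to represent geographic locations:
(1) Points given as latitude-longitude coordinates use the geo_point field type.
(2) Complex geographic shapes defined in GeoJSON format use the geo_shape field type.

# Geographic location: latitude-longitude point, geo_point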
PUT /same_city_shop
{
  "mappings": {
    "properties": {
      "shopCoordinate":{
        "type": "geo_point"
      }
    }
  }
}

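# In plain words: given the longitude (lon) and latitude (lat) of the current position, find the documents within 10 km whose shopName matches 樱武健康商城 and whose status is 1, and sort them by distance
_geo_distance: sorts by distance from the given point (here, the current position)
order: the sort order
unit: the distance unit
mode: how to reduce a field that holds several points to one sort value: min, max, median, or avg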

GET same_city_product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "shopName": "樱武健康商城"
          }
        },
        {
          "match": {
            "status": "1"
          }
        }
      ],
      "filter": [
        {
          "geo_distance": {
            "distance": "10km",
            "shopCoordinate": {
              "lat": 31.346997,
              "lon": 121.509799
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "shopCoordinate": {
          "lat": 31.346997,
          "lon": 121.509799
        },
        "order": "asc",
        "unit": "km",
        "mode": "min"
      }
    }
  ],
  "from": 0,
  "size": 5
}
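
What the geo_distance filter and the _geo_distance sort compute can be approximated in a few lines of plain Java. The sketch below uses the haversine formula with a mean Earth radius of 6371 km, which is an assumption made here; Elasticsearch's own distance calculation may give slightly different numbers. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory approximation of "within a radius, nearest first".
final class GeoDistanceSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, an approximation

    // Great-circle distance between two lat/lon points, in kilometres.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1.0 to guard against rounding just above the asin domain.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Keeps the points within radiusKm of the origin, nearest first.
    // Each point is a {lat, lon} array.
    static List<double[]> withinSorted(double lat, double lon, double radiusKm,
                                       List<double[]> points) {
        List<double[]> result = new ArrayList<>();
        for (double[] p : points) {
            if (haversineKm(lat, lon, p[0], p[1]) <= radiusKm) {
                result.add(p);
            }
        }
        result.sort(Comparator.comparingDouble(p -> haversineKm(lat, lon, p[0], p[1])));
        return result;
    }

    public static void main(String[] args) {
        List<double[]> shops = new ArrayList<>();
        shops.add(new double[] {31.40, 121.51});
        shops.add(new double[] {31.35, 121.51});
        shops.add(new double[] {32.50, 121.50});
        for (double[] p : withinSorted(31.346997, 121.509799, 10.0, shops)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```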