Spring Data Elasticsearch 5.2.2 Documentation (Partial Translation)

Latest recommended article published on 2024-04-14 20:46:03

吃头孢不喝酒

Tags: database

Original link: https://docs.spring.io/spring-data/elasticsearch/reference/index.html

Mostly machine translated

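1. Overview

Spring Data for Elasticsearch is part of the umbrella Spring Data project, which aims to provide a familiar and consistent Spring-based programming model for new datastores while retaining store-specific features and capabilities.

The Spring Data Elasticsearch project provides integration with the Elasticsearch search engine. Its key functional areas are a POJO-centric model for interacting with Elasticsearch documents and easily writing a repository-style data access layer.

Clients: connecting to and configuring various HTTP clients.

ElasticsearchTemplate and ReactiveElasticsearchTemplate: helper classes that provide object mapping between ES index operations and POJOs.

Object Mapping: a feature-rich, annotation-driven object mapper.

Entity Callbacks: callbacks before and after save, update, and delete.

Data Repositories: repository interfaces, including support for custom queries.

Join-Types, Routing, Scripting: integration with Elasticsearch-specific features.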

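2. Learning

2-1 Elasticsearch Clients

This chapter describes the configuration and usage of the Elasticsearch client implementations supported by Spring.

Spring Data Elasticsearch operates on an Elasticsearch client, provided by the Elasticsearch client libraries, that is connected to a single Elasticsearch node or a cluster. Although the Elasticsearch client can be used directly to work with the cluster, applications using Spring Data Elasticsearch normally use the higher-level abstractions of Elasticsearch Operations and Elasticsearch Repositories.

2-1-1 Imperative Rest Client

To use the imperative (non-reactive) client, a configuration bean must be configured like this: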

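import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.elc.ElasticsearchConfiguration;

@Configuration
public class MyClientConfig extends ElasticsearchConfiguration {

	@Override
	public ClientConfiguration clientConfiguration() {
		return ClientConfiguration.builder()   // for a detailed description of the builder methods, see "Client Configuration"
			.connectedTo("localhost:9200")
			.build();
	}
}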

The ElasticsearchConfiguration class allows further configuration by overriding, for example, the jsonpMapper() or transportOptions() methods.

The following beans can then be injected into other Spring components:

import org.springframework.beans.factory.annotation.Autowired;
@Autowired
ElasticsearchOperations operations;      // an implementation of ElasticsearchOperations

@Autowired
ElasticsearchClient elasticsearchClient; // the co.elastic.clients.elasticsearch.ElasticsearchClient that is used

@Autowired
RestClient restClient;                   // the low-level RestClient from the Elasticsearch libraries

@Autowired
JsonpMapper jsonpMapper;                 // the JsonpMapper used by the Elasticsearch Transport

Basically, one should just use the ElasticsearchOperations instance to interact with the Elasticsearch cluster. When using repositories, this instance is used under the hood as well.

2-1-2 Reactive Rest Client

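When working with the reactive stack, the configuration must be derived from a different class:

import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.elc.ReactiveElasticsearchConfiguration;

@Configuration
public class MyClientConfig extends ReactiveElasticsearchConfiguration {

	@Override
	public ClientConfiguration clientConfiguration() {
		return ClientConfiguration.builder()           // for a detailed description of the builder methods, see "Client Configuration"
			.connectedTo("localhost:9200")
			.build();
	}
}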

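The ReactiveElasticsearchConfiguration class allows further configuration by overriding, for example, the jsonpMapper() or transportOptions() methods.

The following beans can then be injected into other Spring components:

@Autowired
ReactiveElasticsearchOperations operations; // an implementation of ReactiveElasticsearchOperations

@Autowired
ReactiveElasticsearchClient elasticsearchClient;    // the ReactiveElasticsearchClient that is used; a reactive implementation based on the Elasticsearch client implementation

@Autowired
RestClient restClient;    // the low-level RestClient from the Elasticsearch libraries

@Autowired
JsonpMapper jsonpMapper;    // the JsonpMapper used by the Elasticsearch Transport

Basically, one should just use the ReactiveElasticsearchOperations instance to interact with the Elasticsearch cluster. When using repositories, this instance is used under the hood as well.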

2-1-3 Client Configuration

Client behaviour can be changed via the ClientConfiguration, which allows setting options for SSL, connect and socket timeouts, headers, and other parameters.

Example 1. Client Configuration

import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.support.HttpHeaders;

import static org.springframework.data.elasticsearch.client.elc.ElasticsearchClients.*;

HttpHeaders httpHeaders = new HttpHeaders();
httpHeaders.add("some-header", "on every request");                     (1)

ClientConfiguration clientConfiguration = ClientConfiguration.builder()
  .connectedTo("localhost:9200", "localhost:9291")                      (2)
  .usingSsl()                                                           (3)
  .withProxy("localhost:8888")                                          (4)
  .withPathPrefix("ela")                                                (5)
  .withConnectTimeout(Duration.ofSeconds(5))                            (6)
  .withSocketTimeout(Duration.ofSeconds(3))                             (7)
  .withDefaultHeaders(httpHeaders)                                      (8)
  .withBasicAuth(username, password)                                    (9)
  .withHeaders(() -> {                                                  (10)
    HttpHeaders headers = new HttpHeaders();
    headers.add("currentTime", LocalDateTime.now().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME));
    return headers;
  })
  .withClientConfigurer(                                                (11)
    ElasticsearchClientConfigurationCallback.from(clientBuilder -> {
  	  // ...
      return clientBuilder;
  	}))
  . // ... other options
  .build();

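(1) Define default headers, if they need to be customized.
(2) Use the builder to provide cluster addresses, set default HttpHeaders, or enable SSL.
(3) Optionally enable SSL. There are overloads of this function that can take an SSLContext or, as an alternative, the fingerprint of the certificate as it is output by Elasticsearch 8 on startup.
(4) Optionally set a proxy.
(5) Optionally set a path prefix, mostly used when different clusters sit behind a reverse proxy.
(6) Set the connection timeout.
(7) Set the socket timeout.
(8) Optionally set headers.
(9) Add basic authentication.
(10) A function can be specified that is called every time before a request is sent to Elasticsearch. Here, as an example, the current time is written to a header.
(11) A function that configures the created client (see Client configuration callbacks); it can be added multiple times.

Important

Adding a header supplier as shown in the example above allows injecting headers that may change over time, such as authentication JWT tokens. If this is used in the reactive setup, the supplier function must not block!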
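
One way to meet the non-blocking requirement is to keep the changing value in memory and refresh it somewhere else. The following is a minimal sketch that uses only the JDK. CachedTokenHeaders, updateToken, and the Authorization header layout are hypothetical names chosen for illustration and are not part of Spring Data Elasticsearch.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper for illustration; not a Spring Data Elasticsearch class.
class CachedTokenHeaders implements Supplier<Map<String, String>> {

    // Latest token: written by a refresh task, read by the supplier.
    private final AtomicReference<String> token = new AtomicReference<>("");

    // Meant to be called from a background refresh task, never from the request path.
    void updateToken(String newToken) {
        token.set(newToken);
    }

    // Only reads the cached value, so it returns immediately and does not block.
    @Override
    public Map<String, String> get() {
        return Map.of("Authorization", "Bearer " + token.get());
    }

    public static void main(String[] args) {
        CachedTokenHeaders headers = new CachedTokenHeaders();
        headers.updateToken("example-token");
        System.out.println(headers.get());
    }
}
```

In a real configuration, the function passed to withHeaders would copy these entries into an HttpHeaders instance, while the token refresh runs on a separate scheduler.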

2-1-3-1 Client configuration callbacks

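The ClientConfiguration class offers the most commonly used parameters for configuring the client. If these are not enough, the user can add callback functions by using the withClientConfigurer(ClientConfigurationCallback<?>) method.

The following callbacks are provided:

2-1-3-1-1 Configuration of the low level Elasticsearch RestClient

This callback provides an org.elasticsearch.client.RestClientBuilder that can be used to configure the Elasticsearch RestClient: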

ClientConfiguration.builder()
    .withClientConfigurer(ElasticsearchClients.ElasticsearchRestClientConfigurationCallback.from(restClientBuilder -> {
        // configure the Elasticsearch RestClient
        return restClientBuilder;
    }))
    .build();

2-1-3-1-2 Configuration of the HttpAsyncClient used by the low level Elasticsearch RestClient

This callback provides an org.apache.http.impl.nio.client.HttpAsyncClientBuilder to configure the HttpClient that is used by the RestClient.

ClientConfiguration.builder()
    .withClientConfigurer(ElasticsearchClients.ElasticsearchHttpClientConfigurationCallback.from(httpAsyncClientBuilder -> {
        // configure the HttpAsyncClient
        return httpAsyncClientBuilder;
    }))
    .build();

2-1-4 Client Logging

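To see what is actually sent to and received from the server, Request / Response logging on the transport level needs to be turned on as outlined in the snippet below. This can be enabled in the Elasticsearch client by setting the level of the tracer package to "trace" (see www.elastic.co/guide/en/elasticsearch/client/java-api-client/current/java-rest-low-usage-logging.html).

Enable transport layer logging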

<logger name="tracer" level="trace"/>

2-2 Elasticsearch Object Mapping

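Spring Data Elasticsearch object mapping is the process that maps a Java object (the domain entity) to the JSON representation stored in Elasticsearch and back. The class used internally for this mapping is MappingElasticsearchConverter.

2-2-1 Meta Model Object Mapping

The metamodel-based approach uses domain type information for reading from and writing to Elasticsearch. This allows Converter instances to be registered for specific domain type mappings.

2-2-1-1 Mapping Annotation Overview

The MappingElasticsearchConverter uses metadata to drive the mapping of objects to documents. The metadata is taken from the entity's properties, which can be annotated.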

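The following annotations are available:

@Document: applied at the class level to indicate that this class is a candidate for mapping to the database. The most important attributes are (check the API documentation for the complete list of attributes):

indexName: the name of the index in which to store this entity. It can contain a SpEL template expression such as "log-#{T(java.time.LocalDate).now().toString()}".

createIndex: flag indicating whether to create the index on repository bootstrapping. The default value is true. See Automatic creation of indices with the corresponding mapping.

@Id: applied at the field level to mark the field used for identity purposes.

@Transient, @ReadOnlyProperty, @WriteOnlyProperty: see the section "Controlling which properties are written to and read from Elasticsearch" below for details.

@PersistenceConstructor: marks the constructor (even a package-protected one) to use when instantiating the object from the database. Constructor arguments are mapped by name to the key values in the retrieved document.

@Field: applied at the field level and defines properties of the field. Most of the attributes map to the respective Elasticsearch mapping definitions (the following list is not complete; check the annotation Javadoc for a complete reference):

name: the name of the field as it is represented in the Elasticsearch document. If not set, the Java field name is used.

type: the field type, one of Text, Keyword, Long, Integer, Short, Byte, Double, Float, Half_Float, Scaled_Float, Date, Date_Nanos, Boolean, Binary, Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, Ip_Range, Object, Nested, Ip, TokenCount, Percolator, Flattened, Search_As_You_Type. See Elasticsearch Mapping Types. If the field type is not specified, it defaults to FieldType.Auto. This means that no mapping entry is written for the property and that Elasticsearch adds a mapping entry dynamically when the first data for this property is stored (check the Elasticsearch documentation for the dynamic mapping rules).

format: one or more built-in date formats, see the next section Date format mapping.

pattern: one or more custom date formats, see the next section Date format mapping.

store: flag indicating whether the original field value should be stored in Elasticsearch. The default value is false.

analyzer, searchAnalyzer, normalizer: used to specify custom analyzers and a normalizer.

@GeoPoint: marks a field as geo_point datatype. Can be omitted if the field is an instance of the GeoPoint class.

@ValueConverter: defines a class to be used to convert the given property. In contrast to a registered Spring Converter, this only converts the annotated property and not every property of the given type.

The mapping metadata infrastructure is defined in the separate spring-data-commons project, which is technology agnostic.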
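
As a sketch of how these annotations fit together, the following entity uses only attributes named in the list above. The class name Article, the index name articles, and all field names are made up for illustration.

```java
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

// Hypothetical entity; stored in the (assumed) index "articles".
@Document(indexName = "articles")
public class Article {

    // Marks the field used as the document identifier.
    @Id
    private String id;

    // Full-text field analyzed with the built-in "standard" analyzer.
    @Field(type = FieldType.Text, analyzer = "standard")
    private String title;

    // Stored under the name "author_name" instead of the Java field name.
    @Field(name = "author_name", type = FieldType.Keyword)
    private String author;

    // No type given: defaults to FieldType.Auto, so no mapping entry is
    // written and Elasticsearch adds one dynamically on first use.
    @Field
    private String remarks;

    // getters and setters omitted
}
```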

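2-2-1-2 Controlling which properties are written to and read from Elasticsearch

This section details the annotations that define whether the value of a property is written to or read from Elasticsearch.

@Transient: a property annotated with this annotation is not written to the mapping, its value is not sent to Elasticsearch, and when documents are returned from Elasticsearch, this property is not set in the resulting entity.

@ReadOnlyProperty: a property with this annotation does not have its value written to Elasticsearch, but when data is returned, the property is filled with the value returned in the document from Elasticsearch. One use case is runtime fields defined in the index mapping.

@WriteOnlyProperty: a property with this annotation has its value stored in Elasticsearch, but it is not set with any value when a document is read. This can be used, for example, for synthesized fields that should go into the Elasticsearch index but are not used elsewhere.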
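
The three annotations differ only in which direction a property takes part in. The following plain Java enum restates the rules above as a small truth table. PropertyMode and its field names are hypothetical and exist only for this illustration; they are not Spring Data Elasticsearch types.

```java
// Illustrative model only; not part of Spring Data Elasticsearch.
enum PropertyMode {
    READ_WRITE(true, true),    // no annotation
    TRANSIENT(false, false),   // @Transient
    READ_ONLY(false, true),    // @ReadOnlyProperty
    WRITE_ONLY(true, false);   // @WriteOnlyProperty

    // true if the property value is sent to Elasticsearch
    final boolean writtenToElasticsearch;
    // true if the property is populated when a document is read
    final boolean readFromElasticsearch;

    PropertyMode(boolean written, boolean read) {
        this.writtenToElasticsearch = written;
        this.readFromElasticsearch = read;
    }

    public static void main(String[] args) {
        for (PropertyMode mode : values()) {
            System.out.println(mode + ": written=" + mode.writtenToElasticsearch
                    + ", read=" + mode.readFromElasticsearch);
        }
    }
}
```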

2-2-1-3 Date format mapping

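Properties that derive from TemporalAccessor or are of type java.util.Date must either have a @Field annotation of type FieldType.Date, or a custom converter must be registered for this type. This paragraph describes the use of FieldType.Date.

Two attributes of the @Field annotation define which date format information is written to the mapping (also see Elasticsearch Built In Formats and Elasticsearch Custom Date Formats):

The format attribute is used to define at least one of the predefined formats. If it is not defined, a default value of date_optional_time and epoch_millis is used.

The pattern attribute can be used to add additional custom format strings. If you want to use only custom date formats, you must set the format attribute to empty {}.

The following table shows the different attributes and the mapping created from their values:

Note: if you use a custom date format, you need to use uuuu for the year instead of yyyy. This is due to a change in Elasticsearch 7.

For the complete list of predefined values and their patterns, see the code of the org.springframework.data.elasticsearch.annotations.DateFormat enum.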
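
The rules above can be illustrated with plain JDK code. The sketch below is not the library's implementation: DateFormatSketch and its methods are hypothetical, and it assumes that built-in formats are listed before custom patterns. It joins the names with "||", which is how Elasticsearch separates multiple formats in a mapping, and formats a date with a pattern that uses uuuu for the year.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Illustrative only; not part of Spring Data Elasticsearch.
class DateFormatSketch {

    // Combines built-in format names and custom patterns into one
    // mapping format string, separated by "||".
    static String mappingFormat(List<String> formats, List<String> patterns) {
        List<String> all = new ArrayList<>(formats);
        all.addAll(patterns);
        return String.join("||", all);
    }

    // Formats a date with a custom pattern; the pattern should use uuuu for the year.
    static String formatDate(LocalDate date, String pattern) {
        return DateTimeFormatter.ofPattern(pattern).format(date);
    }

    public static void main(String[] args) {
        // Default formats plus one custom pattern.
        System.out.println(mappingFormat(
                List.of("date_optional_time", "epoch_millis"), List.of("dd.MM.uuuu")));
        // format set to empty {}: only the custom pattern remains.
        System.out.println(mappingFormat(List.of(), List.of("dd.MM.uuuu")));
        System.out.println(formatDate(LocalDate.of(2024, 4, 14), "dd.MM.uuuu"));
    }
}
```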

2-2-1-4 Range types

If a field is annotated with a type of one of Integer_Range, Float_Range, Long_Range, Double_Range, Date_Range, or Ip_Range, the field must be an instance of a class that will be mapped to an Elasticsearch range, for example:

class SomePersonData {

    @Field(type = FieldType.Integer_Range)
    private ValidAge validAge;

    // getter and setter
}

class ValidAge {
    @Field(name="gte")
    private Integer from;

    @Field(name="lte")
    private Integer to;

    // getter and setter
}

As an alternative, Spring Data Elasticsearch provides a Range<T> class, so the previous example can be written as:

class SomePersonData {

    @Field(type = FieldType.Integer_Range)
    private Range<Integer> validAge;

    // getter and setter
}

Supported classes for the type <T> are Integer, Long, Float, Double, Date, and classes that implement the TemporalAccessor interface.

2-2-1-5 Mapped field names

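Without further configuration, Spring Data Elasticsearch uses the property name of an object as the field name in Elasticsearch. This can be changed for individual fields by using the @Field annotation on that property.

It is also possible to define a FieldNamingStrategy in the configuration of the client (Elasticsearch Clients). If, for example, a SnakeCaseFieldNamingStrategy is configured, the property sampleProperty of the object is mapped to sample_property in Elasticsearch. A FieldNamingStrategy applies to all entities; it can be overridden by setting a specific name with @Field on a property.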
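
To make the sampleProperty to sample_property example concrete, here is a simplified camelCase to snake_case conversion in plain Java. It is only a sketch of the naming idea: SnakeCaseSketch is a hypothetical class, and the actual SnakeCaseFieldNamingStrategy may treat edge cases such as consecutive capitals or digits differently.

```java
// Illustrative only; not the library's SnakeCaseFieldNamingStrategy.
class SnakeCaseSketch {

    // Inserts '_' before each upper-case letter (except at position 0)
    // and lower-cases that letter.
    static String toSnakeCase(String propertyName) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < propertyName.length(); i++) {
            char c = propertyName.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) {
                    result.append('_');
                }
                result.append(Character.toLowerCase(c));
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("sampleProperty"));
    }
}
```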

2-2-1-6 Non-field-backed properties

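Normally the properties used in an entity are fields of the entity class. There might be cases when a property value is calculated in the entity and should be stored in Elasticsearch. In this case, the getter method (getProperty()) can be annotated with the @Field annotation; in addition, the method must be annotated with @AccessType(AccessType.Type.PROPERTY). The third annotation needed in this case is @WriteOnlyProperty, as such a value is only written to Elasticsearch. A full example: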

@Field(type = Keyword)
@WriteOnlyProperty
@AccessType(AccessType.Type.PROPERTY)
public String getProperty() {
	return "some value that is calculated here";
}

2-2-1-7 Other property annotations

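@IndexedIndexName
This annotation can be set on a String property of an entity. The property is not written to the mapping, it is not stored in Elasticsearch, and its value is not read from an Elasticsearch document. After an entity is persisted, for example with a call to ElasticsearchOperations.save(T entity), the entity returned from that call contains the name of the index the entity was saved to in that property. This is useful when the index name is set dynamically by a bean, or when writing to a write alias.

Putting some value into such a property does not set the index into which an entity is stored!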

2-2-2 Mapping Rules

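2-2-2-1 Type Hints

Mapping uses type hints embedded in the document sent to the server to allow generic type mapping. Those type hints are represented as _class attributes within the document and are written for each aggregate root.

Example 1. Type Hints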

public class Person {    （1）
  @Id String id;
  String firstname;
  String lastname;
}

{
  "_class" : "com.example.Person",    （1）
  "id" : "cb7bef",
  "firstname" : "Sarah",
  "lastname" : "Connor"
}

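（1） By default the domain type's class name is used for the type hint.

Type hints can be configured to hold custom information. Use the @TypeAlias annotation to do this.

Note: Make sure to add types with @TypeAlias to the initial entity set (AbstractElasticsearchConfiguration#getInitialEntitySet) so that entity information is already available when data is first read from the store.

Example 2. Type Hints with Alias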

@TypeAlias("human")    （1）
public class Person {

  @Id String id;
  // ...
}

{
  "_class" : "human",    （1）
  "id" : ...
}

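（1） The configured alias is used when writing the entity.

Note: Type hints will not be written for nested objects unless the property's type is Object, an interface, or the actual value type does not match the property's declaration.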
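
The reason aliased types must be known before the first read can be seen in a small stand-in for the lookup. This sketch uses only the JDK and is not the library's code: TypeHintSketch, register, and resolve are hypothetical names. A class-name hint can be resolved by loading the class, whereas an alias such as "human" can only be resolved if it was registered beforehand.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only; a tiny stand-in for type hint resolution.
class TypeHintSketch {

    // Hypothetical entity used for the demonstration.
    static class Person {}

    // alias -> entity class, filled before the first document is read
    private final Map<String, Class<?>> aliases = new HashMap<>();

    void register(String alias, Class<?> type) {
        aliases.put(alias, type);
    }

    // Resolves the _class entry of a document to a Java class.
    Class<?> resolve(Map<String, ?> document) {
        String hint = (String) document.get("_class");
        Class<?> known = aliases.get(hint);
        if (known != null) {
            return known; // registered alias
        }
        try {
            return Class.forName(hint); // default hint: fully qualified class name
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unknown type hint: " + hint, e);
        }
    }

    public static void main(String[] args) {
        TypeHintSketch hints = new TypeHintSketch();
        hints.register("human", Person.class);
        System.out.println(hints.resolve(Map.of("_class", "human")));
        System.out.println(hints.resolve(Map.of("_class", "java.lang.String")));
    }
}
```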

2-2-2-2 Disabling Type Hints

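Disabling the writing of type hints may be necessary when the index to be used already exists without type hints defined in its mapping and with the mapping mode set to strict. In that case, writing the type hint produces an error, because the field cannot be added automatically.

Type hints can be disabled for the whole application by overriding the writeTypeHints() method in a configuration class derived from AbstractElasticsearchConfiguration (see Elasticsearch Clients).

As an alternative, they can be disabled for a single index with the @Document annotation:

@Document(indexName = "index", writeTypeHint = WriteTypeHint.FALSE)

Warning: We strongly advise against disabling type hints. Only do this if you are forced to. Disabling type hints can lead to documents not being retrieved correctly from Elasticsearch in case of polymorphic data, or document retrieval may fail completely.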

2-2-2-3 Geospatial Types

Geospatial types like Point and GeoPoint are converted into lat/lon pairs.

Example 3. Geospatial types

public class Address {
  String city, street;
  Point location;
}

{
  "city" : "Los Angeles",
  "street" : "2800 East Observatory Road",
  "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
}

2-2-2-4 GeoJson Types

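Spring Data Elasticsearch supports the GeoJson types by providing a GeoJson interface and implementations for the different geometries. They are mapped to Elasticsearch documents according to the GeoJson specification. The corresponding properties of the entity are specified as geo_shape in the index mapping when the index mappings are written (also see the Elasticsearch documentation).

Example 4. GeoJson types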

public class Address {

  String city, street;
  GeoJsonPoint location;
}

{
  "city": "Los Angeles",
  "street": "2800 East Observatory Road",
  "location": {
    "type": "Point",
    "coordinates": [-118.3026284, 34.118347]
  }
}

The following GeoJson types are implemented:
GeoJsonPoint
GeoJsonMultiPoint
GeoJsonLineString
GeoJsonMultiLineString
GeoJsonPolygon
GeoJsonMultiPolygon
GeoJsonGeometryCollection
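
Note the different coordinate order in Example 3 and Example 4: the lat/lon object names its values, while a GeoJson Point lists longitude first. The following plain Java sketch builds both shapes for the same location. GeoOrderSketch and its methods are hypothetical helpers for illustration, not library classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; shows the two document shapes side by side.
class GeoOrderSketch {

    // Shape used for Point and GeoPoint: named lat and lon values.
    static Map<String, Object> asLatLon(double lat, double lon) {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("lat", lat);
        location.put("lon", lon);
        return location;
    }

    // Shape used for a GeoJson Point: coordinates are [longitude, latitude].
    static Map<String, Object> asGeoJsonPoint(double lat, double lon) {
        Map<String, Object> location = new LinkedHashMap<>();
        location.put("type", "Point");
        location.put("coordinates", List.of(lon, lat));
        return location;
    }

    public static void main(String[] args) {
        System.out.println(asLatLon(34.118347, -118.3026284));
        System.out.println(asGeoJsonPoint(34.118347, -118.3026284));
    }
}
```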

2-2-2-5 Collections

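For values inside Collections, the same mapping rules apply as for aggregate roots when it comes to type hints and custom conversions.

Example 5. Collections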

public class Person {

  // ...

  List<Person> friends;

}

{
  // ...

  "friends" : [ { "firstname" : "Kyle", "lastname" : "Reese" } ]
}

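2-2-2-6 Maps

For values inside Maps, the same mapping rules apply as for aggregate roots when it comes to type hints and custom conversions. However, the map key needs to be a String to be processed by Elasticsearch.

Example 6. Maps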

public class Person {

  // ...

  Map<String, Address> knownLocations;

}

{
  // ...

  "knownLocations" : {
    "arrivedAt" : {
       "city" : "Los Angeles",
       "street" : "2800 East Observatory Road",
       "location" : { "lat" : 34.118347, "lon" : -118.3026284 }
     }
  }
}

2-2-3 Custom Conversions

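Looking at the configuration from the previous section, ElasticsearchCustomConversions allows registering specific rules for mapping domain and simple types.

Example 7. Meta Model Object Mapping Configuration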

@Configuration
public class Config extends ElasticsearchConfiguration  {

	@NonNull
	@Override
	public ClientConfiguration clientConfiguration() {
		return ClientConfiguration.builder() //
				.connectedTo("localhost:9200") //
				.build();
	}

  @Bean
  @Override
  public ElasticsearchCustomConversions elasticsearchCustomConversions() {
    return new ElasticsearchCustomConversions(
      Arrays.asList(new AddressToMap(), new MapToAddress()));    （1）
  }

  @WritingConverter                                                （2）
  static class AddressToMap implements Converter<Address, Map<String, Object>> {

    @Override
    public Map<String, Object> convert(Address source) {

      LinkedHashMap<String, Object> target = new LinkedHashMap<>();
      target.put("ciudad", source.getCity());
      // ...

      return target;
    }
  }

  @ReadingConverter                                            （3）
  static class MapToAddress implements Converter<Map<String, Object>, Address> {

    @Override
    public Address convert(Map<String, Object> source) {

      // ...
      return address;
    }
  }
}

{
  "ciudad" : "Los Angeles",
  "calle" : "2800 East Observatory Road",
  "localidad" : { "lat" : 34.118347, "lon" : -118.3026284 }
}

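（1） Add the Converter implementations.

（2） Set up the Converter used for writing the DomainType to Elasticsearch.

（3） Set up the Converter used for reading the DomainType from the search result.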
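
The writing and reading converters are expected to be inverses of each other. The sketch below shows that round trip with plain JDK types, using java.util.function.Function in place of Spring's Converter interface so that it runs without any dependencies. The Address class here is a hypothetical two-field stand-in, and only the ciudad and calle keys from the JSON above are handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative only; Function stands in for Spring's Converter interface.
class AddressConversionSketch {

    // Hypothetical, simplified domain type.
    static final class Address {
        final String city;
        final String street;

        Address(String city, String street) {
            this.city = city;
            this.street = street;
        }
    }

    // Writing direction: domain object to document map with custom keys.
    static final Function<Address, Map<String, Object>> ADDRESS_TO_MAP = source -> {
        Map<String, Object> target = new LinkedHashMap<>();
        target.put("ciudad", source.city);
        target.put("calle", source.street);
        return target;
    };

    // Reading direction: document map back to the domain object.
    static final Function<Map<String, Object>, Address> MAP_TO_ADDRESS = source ->
            new Address((String) source.get("ciudad"), (String) source.get("calle"));

    public static void main(String[] args) {
        Address original = new Address("Los Angeles", "2800 East Observatory Road");
        Map<String, Object> document = ADDRESS_TO_MAP.apply(original);
        Address restored = MAP_TO_ADDRESS.apply(document);
        System.out.println(document);
        System.out.println(restored.city + ", " + restored.street);
    }
}
```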