Elasticsearch Document Update API: Details, Internals, and Examples

Latest recommended article published on 2024-07-25 18:20:01

中间件兴趣圈

Article link: https://blog.csdn.net/prestigeding/article/details/83932181

This article describes the single-document update API in detail. The update methods on the client are:

public final UpdateResponse update(UpdateRequest updateRequest, RequestOptions options) throws IOException
public final void updateAsync(UpdateRequest updateRequest, RequestOptions options, ActionListener<UpdateResponse> listener)
The key class to understand is UpdateRequest.
1. UpdateRequest in Detail
[Figure: core class diagram of UpdateRequest]

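Let's first look at the core fields of UpdateRequest:
protected ShardId shardId: the shard on which the request is executed.
protected String index: the index, comparable to a database in a relational database.
private String type: the type name, comparable to a table in a relational database.
private String id: the document ID. A document is comparable to a row in a relational database, and the id to its primary key.
private String routing: the routing value, which defaults to the id. Elasticsearch picks the shard with hash(routing) % number_of_primary_shards.
private String parent: the ID of the parent document, used for parent/child relations.
Script script: updates the document with a script.
private String[] fields: the document fields to return after the update; nothing is returned by default. Deprecated and replaced by fetchSourceContext.
private FetchSourceContext fetchSourceContext: the _source filtering configuration for the document returned after the update. Unlike fields, fetchSourceContext supports wildcard expressions for matching field names. It is covered in detail in "Elasticsearch Document Get API: Details, Internals, and Examples".
private long version = Versions.MATCH_ANY: the version number.
private VersionType versionType = VersionType.INTERNAL: the version type, either internal or external; the default is internal.
private int retryOnConflict = 0: the number of retries when the update hits a version conflict.
private RefreshPolicy refreshPolicy = RefreshPolicy.NONE: the refresh policy. NONE means no refresh is forced after the request.
private ActiveShardCount waitForActiveShards = ActiveShardCount.DEFAULT: the number of active shard copies to wait for before the operation executes; covered in detail in "Elasticsearch Document Get API: Details, Internals, and Examples".
private IndexRequest upsertRequest: the document to index when the target document does not exist yet, similar to a saveOrUpdate operation. It is used together with a script; later sections describe it in detail.
private boolean scriptedUpsert = false: whether the script performs the upsert, i.e. whether it also runs when the document does not exist yet.
private boolean docAsUpsert = false: whether to use saveOrUpdate mode with doc itself as the upsert document (docAsUpsert=true combined with doc).
private boolean detectNoop = true: whether to detect no-op updates; described in detail below.
private IndexRequest doc: the partial document; the update uses this request by default.
From the above, there are essentially three ways to update: script, upsert, and doc (plain update).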
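
The routing formula in the field list above can be illustrated with a small sketch. It is only an illustration: String.hashCode() is an assumption standing in for the hash function Elasticsearch really uses, and the class and method names are hypothetical.

```java
// Simplified shard-routing sketch. String.hashCode() is a stand-in
// (an assumption) for the hash function Elasticsearch really uses.
final class RoutingSketch {

    /** Returns the primary shard number for a routing value. */
    static int shardFor(String routing, int primaryShardCount) {
        if (primaryShardCount <= 0) {
            throw new IllegalArgumentException("primaryShardCount must be positive");
        }
        // floorMod keeps the result in [0, primaryShardCount) even when
        // hashCode() is negative.
        return Math.floorMod(routing.hashCode(), primaryShardCount);
    }

    public static void main(String[] args) {
        // With no explicit routing, the document id serves as the routing value.
        String id = "10";
        System.out.println("doc " + id + " -> shard " + shardFor(id, 5));
    }
}
```

Because the shard depends on the number of primary shards, the same routing value must be used for every read and write of a document.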

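2. A Closer Look at the Elasticsearch Update API
2.1 Script Updates
Elasticsearch can update documents with scripts (painless); for the syntax see https://www.elastic.co/guide/en/elasticsearch/painless/current/index.html. This section does not go into the syntax; a later dedicated chapter will explain it in detail.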

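2.2 Partial Field Updates (Plain Update)
The update API also accepts a partial document (containing only some of the fields of _source), which is merged into the existing document (a simple recursive merge: inner objects are merged, while core key/value pairs and arrays are replaced). To replace the existing document entirely, use the Index API instead. The following partial update adds a new field to the existing document (Java API calls are given later in this article):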

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}
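
The merge rule described above (objects merged recursively, scalars and arrays replaced) can be sketched in plain Java. This is a minimal illustration over java.util.Map, not Elasticsearch's own implementation; the class name and sample fields are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative recursive merge of a partial document into a source map.
final class PartialMergeSketch {

    /** Merges changes into source in place and returns source. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> source, Map<String, Object> changes) {
        for (Map.Entry<String, Object> e : changes.entrySet()) {
            Object oldValue = source.get(e.getKey());
            Object newValue = e.getValue();
            if (oldValue instanceof Map && newValue instanceof Map) {
                // Both sides are objects: merge them field by field.
                merge((Map<String, Object>) oldValue, (Map<String, Object>) newValue);
            } else {
                // Scalars and arrays are replaced wholesale; new keys are added.
                source.put(e.getKey(), newValue);
            }
        }
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Hangzhou");
        address.put("zip", "310000");
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("name", "old_name");
        source.put("tags", Arrays.asList("a", "b"));
        source.put("address", address);

        Map<String, Object> addressChange = new LinkedHashMap<>();
        addressChange.put("city", "Shanghai");
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "new_name");
        doc.put("tags", Arrays.asList("c"));
        doc.put("address", addressChange);

        // name and tags are replaced, address.city is replaced, address.zip survives.
        System.out.println(merge(source, doc));
    }
}
```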

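If both doc and script are specified, script takes precedence and doc is ignored. A good practice with the update API is to use script (painless) updates, which a later chapter will cover in detail.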

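2.3 Detecting No-op Updates (Is This Request Worth Applying?)
This feature controls whether an update is still executed when the submitted request does not change the existing document's data; detection is on by default. With detectNoop=true, a request that changes nothing returns the result noop (no operation). With detectNoop=false, every request is executed and the version number is incremented.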
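
The behaviour just described can be modelled with a few lines of plain Java. This is a sketch under simplifying assumptions (a flat document, an in-memory store); StoredDoc and update are hypothetical names, not Elasticsearch classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy model of detectNoop for flat documents.
final class NoopSketch {

    /** A stored document: its _source plus a version counter. */
    static final class StoredDoc {
        final Map<String, Object> source = new HashMap<>();
        long version = 1;
    }

    /** Applies a flat partial document and returns "noop" or "updated". */
    static String update(StoredDoc stored, Map<String, Object> doc, boolean detectNoop) {
        boolean changed = false;
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            boolean missing = !stored.source.containsKey(e.getKey());
            if (missing || !Objects.equals(stored.source.get(e.getKey()), e.getValue())) {
                changed = true;
            }
        }
        if (detectNoop && !changed) {
            return "noop"; // nothing is written, the version stays the same
        }
        stored.source.putAll(doc);
        stored.version++; // every executed update bumps the version
        return "updated";
    }

    public static void main(String[] args) {
        StoredDoc stored = new StoredDoc();
        stored.source.put("user", "dingw2");

        Map<String, Object> sameData = new HashMap<>();
        sameData.put("user", "dingw2");

        System.out.println(update(stored, sameData, true) + " v" + stored.version);
        System.out.println(update(stored, sameData, false) + " v" + stored.version);
    }
}
```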

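2.4 Save or Update (Upserts)
If the document does not exist yet, the content of the upsert element is inserted as a new document. Elasticsearch supports two modes, scripted_upsert and doc_as_upsert, with scripted_upsert taking precedence. They are controlled through UpdateRequest#scriptedUpsert and UpdateRequest#docAsUpsert.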
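
The decision between updating and inserting can be sketched as follows. This is a simplified model with an in-memory map as the index; the Request class and apply method are hypothetical, the merge is shallow, and scripted upserts are deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified upsert decision logic; scripted upserts are not modelled.
final class UpsertSketch {

    /** Hypothetical request holding only the upsert-related fields. */
    static final class Request {
        Map<String, Object> doc;     // partial document
        Map<String, Object> upsert;  // document to insert when none exists yet
        boolean docAsUpsert;         // use doc itself as the upsert document
    }

    /** Applies the request to an in-memory index; returns "created" or "updated". */
    static String apply(Map<String, Map<String, Object>> index, String id, Request request) {
        Map<String, Object> existing = index.get(id);
        if (existing != null) {
            if (request.doc != null) {
                existing.putAll(request.doc); // shallow merge, enough for this sketch
            }
            return "updated";
        }
        Map<String, Object> toInsert = request.docAsUpsert ? request.doc : request.upsert;
        if (toInsert == null) {
            // Neither an upsert document nor docAsUpsert: the update cannot proceed.
            throw new IllegalStateException("document missing: " + id);
        }
        index.put(id, new HashMap<>(toInsert));
        return "created";
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> index = new HashMap<>();
        Request request = new Request();
        request.doc = new HashMap<>();
        request.doc.put("user", "dingw");
        request.docAsUpsert = true;

        System.out.println(apply(index, "11", request)); // first call inserts
        System.out.println(apply(index, "11", request)); // second call updates
    }
}
```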

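2.5 Core Parameters at a Glance
The main parameters of the update API:

Parameter	Description
retry_on_conflict	Elasticsearch uses versions for optimistic concurrency control. This is the number of retries allowed after a version conflict; once retry_on_conflict retries are exhausted, an exception is thrown.
routing	Routing value used to pick the shard.
timeout	How long to wait for the shard to become available.
wait_for_active_shards	Number of active shard copies required before the operation executes.
refresh	Refresh policy.
_source	Controls whether and how the updated source is returned in the response. By default the updated source is not returned. For source filtering, see "Elasticsearch Document Get API: Details, Internals, and Examples".
version	Version, used for optimistic concurrency control.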

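Note: the update API supports internal versioning only. External versioning (version types external and external_gte) and forced versioning (version type force) are not supported by the update API, because they would leave Elasticsearch version numbers out of sync with the external system.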
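
To make the interplay of version and retry_on_conflict concrete, here is a sketch of optimistic locking with retries. It is a single-threaded simulation, not Elasticsearch code: VersionedStore, pendingConflicts, and VersionConflictException are hypothetical names introduced for illustration.

```java
import java.util.function.UnaryOperator;

// Optimistic-locking retry loop in the spirit of retry_on_conflict.
final class RetryOnConflictSketch {

    static final class VersionConflictException extends RuntimeException {
        VersionConflictException(String message) {
            super(message);
        }
    }

    /** In-memory stand-in for one versioned document. */
    static final class VersionedStore {
        private String value;
        private long version = 1;
        int pendingConflicts; // how many competing writes to simulate

        VersionedStore(String value) {
            this.value = value;
        }

        long version() {
            return version;
        }

        String value() {
            return value;
        }

        /** Writes only if the caller saw the current version. */
        void write(String newValue, long expectedVersion) {
            if (pendingConflicts > 0) {
                pendingConflicts--;
                version++; // simulate another writer getting in first
            }
            if (expectedVersion != version) {
                throw new VersionConflictException(
                        "expected version " + expectedVersion + " but was " + version);
            }
            value = newValue;
            version++;
        }
    }

    /** Read-modify-write, retried up to retryOnConflict times after a conflict. */
    static long update(VersionedStore store, UnaryOperator<String> change, int retryOnConflict) {
        for (int attempt = 0; ; attempt++) {
            long seen = store.version();
            String updated = change.apply(store.value());
            try {
                store.write(updated, seen);
                return store.version();
            } catch (VersionConflictException e) {
                if (attempt >= retryOnConflict) {
                    throw e; // retries exhausted
                }
            }
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore("a");
        store.pendingConflicts = 1;
        long version = update(store, s -> s + "!", 2);
        System.out.println(store.value() + " at version " + version);
    }
}
```

With retryOnConflict left at its default of 0, the first conflict is reported to the caller straight away.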

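3. Update API Usage Examples
This section does not yet show a demo of script-based updates; a later article will cover Elasticsearch painless scripts in a dedicated chapter.

3.1 Regular Update (Updating Some Fields)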

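// Imports needed by the examples in this section. EsClient, testGet() and
// buildIndexRequest() are helpers of the test class that are not shown in this article.
import java.util.HashMap;
import java.util.Map;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public static void testUpdate_partial() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        // Target document: index "twitter", type "_doc", id "10".
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "10");
        // The partial document carries only the fields to change.
        IndexRequest indexRequest = new IndexRequest("twitter", "_doc", "10");
        Map<String, String> source = new HashMap<>();
        source.put("user", "dingw2");
        indexRequest.source(source);
        request.doc(indexRequest);
        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
        // Read the document back to verify the change.
        testGet();
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}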

Result: calling the get API shows that the user field is now dingw2, i.e. the update succeeded.

3.2 With detectNoop Enabled (detectNoop=true) and Unchanged Data

public static void testUpdate_noop() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "10");
        // Skip the write when the data is identical to the stored document.
        request.detectNoop(true);
        // buildIndexRequest() supplies the same data the document already holds.
        request.doc(buildIndexRequest());

        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

Result:

{
   "_shards": {
        "total": 0,
        "successful": 0,
        "failed": 0
   },
   "_index": "twitter",
   "_type": "_doc",
   "_id": "10",
   "_version": 6,
   "result": "noop"
}

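The telltale signs are that result is noop and every field under _shards is 0, meaning the action was not executed on any shard, and the document's _version does not change.

3.3 With detectNoop Disabled (detectNoop=false) and Unchanged Data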

public static void testUpdate_no_noop() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "10");
        // Always execute the update, even when the data is unchanged.
        request.detectNoop(false);
        request.doc(buildIndexRequest());
        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

Result:

{
   "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
   },
   "_index": "twitter",
   "_type": "_doc",
   "_id": "10",
   "_version": 7,
   "result": "updated"
}

Here result=updated shows that an update was executed and the version number was incremented by 1; _shards reports how the operation went on each shard copy.

3.4 saveOrUpdate Mode (Upsert)

/**
 * Update when the original record does not exist, using saveOrUpdate mode.
 */
public static void testUpdate_upsert() {
    RestHighLevelClient client = EsClient.getClient();
    try {
        UpdateRequest request = new UpdateRequest("twitter", "_doc", "11");
        IndexRequest indexRequest = new IndexRequest("twitter", "_doc", "11");
        Map<String, String> source = new HashMap<>();
        source.put("user", "dingw");
        source.put("post_date", "2009-11-17T14:12:12");
        source.put("message", "hello,update upsert。");

        indexRequest.source(source);
        request.doc(indexRequest);
        // Insert doc as a new document if id 11 does not exist yet.
        request.docAsUpsert(true);
        UpdateResponse result = client.update(request, RequestOptions.DEFAULT);
        System.out.println(result);
    } catch (Throwable e) {
        e.printStackTrace();
    } finally {
        EsClient.close(client);
    }
}

Result:

{
   "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
   },
   "_index": "twitter",
   "_type": "_doc",
   "_id": "11",
   "_version": 1,
   "result": "created"
}

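The key point in this response is result: created, which shows that the operation inserted a new document.
That concludes the Document API walkthrough. This section covered the key points of the Document Update API and how it works, and closed with demos showing how to use the Update API in Java.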
