Basic Elasticsearch Operations


weixin_30496431

Tags: json, big data

Original link: http://www.cnblogs.com/the-silverwing/p/10271688.html


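Retrieving Multiple Documents

When you need several documents, combining the retrievals into one request with the multi-get (mget) API is faster than fetching the documents one by one. The mget API expects a docs array as its parameter, where each element contains the metadata of a document to retrieve: _index, _type, and _id. To retrieve one or more specific fields, name them in the _source parameter: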

GET /_mget
{
   "docs" : [
      {
         "_index" : "website",
         "_type" :  "blog",
         "_id" :    2
      },
      {
         "_index" : "website",
         "_type" :  "pageviews",
         "_id" :    1,
         "_source": "views"
      }
   ]
}

Response body:

{
   "docs" : [
      {
         "_index" :   "website",
         "_id" :      "2",
         "_type" :    "blog",
         "found" :    true,
         "_source" : {
            "text" :  "This is a piece of cake...",
            "title" : "My first external blog entry"
         },
         "_version" : 10
      },
      {
         "_index" :   "website",
         "_id" :      "1",
         "_type" :    "pageviews",
         "found" :    true,
         "_version" : 2,
         "_source" : {
            "views" : 2
         }
      }
   ]
}

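As a minimal sketch of how a client might assemble the request above and read its response, the following Python uses only the standard library and makes no network call. The helper names `build_mget_body` and `found_sources` are hypothetical and belong to no Elasticsearch client; the sample response is copied from the example above. It follows the typed request format used in this article; newer Elasticsearch versions have removed mapping types, so `_type` would be omitted there.

```python
import json


def build_mget_body(specs):
    """Build an mget request body from (index, type, id, source_fields) tuples.

    source_fields may be None, in which case the whole _source is requested.
    """
    docs = []
    for index, doc_type, doc_id, source_fields in specs:
        doc = {"_index": index, "_type": doc_type, "_id": doc_id}
        if source_fields is not None:
            doc["_source"] = source_fields
        docs.append(doc)
    return {"docs": docs}


def found_sources(response):
    """Map (index, type, id) to _source for every document that was found."""
    return {
        (d["_index"], d["_type"], d["_id"]): d["_source"]
        for d in response["docs"]
        if d.get("found")
    }


body = build_mget_body([
    ("website", "blog", 2, None),
    ("website", "pageviews", 1, "views"),
])
print(json.dumps(body))

# Sample response copied from the article (not fetched from a live cluster).
sample_response = {
    "docs": [
        {"_index": "website", "_id": "2", "_type": "blog", "found": True,
         "_source": {"text": "This is a piece of cake...",
                     "title": "My first external blog entry"},
         "_version": 10},
        {"_index": "website", "_id": "1", "_type": "pageviews", "found": True,
         "_version": 2, "_source": {"views": 2}},
    ]
}
print(found_sources(sample_response))
```

Checking the found flag per document matters because mget answers each entry of docs separately, so one missing document does not prevent the others from being returned.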
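The bulk API, which performs multiple create, index, update, or delete operations in a single request, uses a request body format slightly different from other requests:

{ action: { metadata }}\n
{ request body        }\n
{ action: { metadata }}\n
{ request body        }\n
...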

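This format is like a stream of valid one-line JSON documents joined together by newline (\n) characters. Two points are important:

Every line must end with a newline character (\n), including the last line. These newlines are the markers that separate one line from the next.
The lines cannot contain unescaped newline characters, because these would interfere with parsing. This means the JSON cannot be pretty-printed.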
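
The two rules above can be sketched in a few lines of Python. The helper name `to_bulk_body` is hypothetical; the only library behavior relied on is that `json.dumps` without an `indent` argument writes each document on a single line and escapes newlines inside string values.

```python
import json


def to_bulk_body(lines):
    """Serialize dicts into a newline-delimited body, one JSON document per line."""
    # json.dumps without indent never emits a raw newline: a newline inside a
    # string value is written as the two characters backslash and n.
    # Appending "\n" to every line also terminates the last one.
    return "".join(json.dumps(line) + "\n" for line in lines)


body = to_bulk_body([
    {"index": {"_index": "website", "_type": "blog"}},
    {"title": "Line one\nline two"},
])
print(repr(body))
```

Passing `indent=2` to `json.dumps` here would spread each document over several lines, which is exactly the pretty-printed form the bulk format rules out.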

The action/metadata line specifies which action to perform on which document.

The action must be one of the following:

create

Create a document only if it does not already exist. See Creating a New Document for details.

index

Create a new document or replace an existing document. See Indexing a Document and Updating a Whole Document for details.

update

Perform a partial update on a document. See Partial Updates to Documents for details.

delete

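Delete a document. See Deleting a Document for details.

The metadata should specify the _index, _type, and _id of the document to be indexed, created, updated, or deleted.

For example, a delete request looks like this: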

{ "delete": { "_index": "website", "_type": "blog", "_id": "123" }}

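The request body line consists of the document _source itself: the fields and values that the document contains. It is required for index and create operations, which makes sense: you must supply the document to index.

It is also required for update operations, and it should consist of the same request body that you would pass to the update API: doc, upsert, script, and so on. A delete operation takes no request body line.

{ "create":  { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title":    "My first blog post" }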

If no _id is specified, an ID is generated automatically:

{ "index": { "_index": "website", "_type": "blog" }}
{ "title":    "My second blog post" }
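
The rule about which actions carry a request body line can be sketched as a small builder. The function `bulk_lines` and the two constant sets are hypothetical names used for illustration only; the builder prepares lines and does not talk to Elasticsearch.

```python
import json

ACTIONS_WITH_BODY = {"create", "index", "update"}
ACTIONS_WITHOUT_BODY = {"delete"}


def bulk_lines(action, metadata, body=None):
    """Return the one or two lines that a single bulk sub-request contributes."""
    if action in ACTIONS_WITH_BODY:
        if body is None:
            raise ValueError(f"{action} requires a request body line")
        return [{action: metadata}, body]
    if action in ACTIONS_WITHOUT_BODY:
        if body is not None:
            raise ValueError("delete must not be followed by a request body line")
        return [{action: metadata}]
    raise ValueError(f"unknown bulk action: {action}")


meta = {"_index": "website", "_type": "blog", "_id": "123"}
lines = []
lines += bulk_lines("delete", meta)
lines += bulk_lines("create", meta, {"title": "My first blog post"})
# No _id in the metadata: Elasticsearch generates one for this document.
lines += bulk_lines("index", {"_index": "website", "_type": "blog"},
                    {"title": "My second blog post"})
for line in lines:
    print(json.dumps(line))
```

Rejecting a body after delete on the client side is a design choice of this sketch: a stray body line after a delete would otherwise be read as the next action/metadata line.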

Putting all of these operations together, a complete bulk request has the following form:

POST /_bulk
{ "delete": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "create": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title":    "My first blog post" }
{ "index":  { "_index": "website", "_type": "blog" }}
{ "title":    "My second blog post" }
{ "update": { "_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict" : 3} }
{ "doc" : {"title" : "My updated blog post"} }

The Elasticsearch response contains an items array, which lists the result of each request in the same order as the requests were made:

{
   "took": 4,
   "errors": false,
   "items": [
      { "delete": { "_index": "website", "_type": "blog", "_id": "123", "_version": 2, "status": 200, "found": true }},
      { "create": { "_index": "website", "_type": "blog", "_id": "123", "_version": 3, "status": 201 }},
      { "create": { "_index": "website", "_type": "blog", "_id": "EiwfApScQiiy7TIKFxRCTw", "_version": 1, "status": 201 }},
      { "update": { "_index": "website", "_type": "blog", "_id": "123", "_version": 4, "status": 200 }}
   ]
}

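Because the items array preserves request order, a client can match each result to the action it sent by position. The sketch below assumes the response shape shown above; `pair_results` is a hypothetical helper, and the sample data is copied from the article.

```python
def pair_results(requested_actions, response):
    """Pair each requested action with its result, relying on response order."""
    pairs = []
    for requested, item in zip(requested_actions, response["items"]):
        # Each item is a single-key dict: {reported_action: result}.
        reported, result = next(iter(item.items()))
        pairs.append((requested, reported, result["_id"], result["status"]))
    return pairs


# Sample response copied from the article (not fetched from a live cluster).
sample_response = {
    "took": 4,
    "errors": False,
    "items": [
        {"delete": {"_index": "website", "_type": "blog", "_id": "123",
                    "_version": 2, "status": 200, "found": True}},
        {"create": {"_index": "website", "_type": "blog", "_id": "123",
                    "_version": 3, "status": 201}},
        {"create": {"_index": "website", "_type": "blog",
                    "_id": "EiwfApScQiiy7TIKFxRCTw", "_version": 1, "status": 201}},
        {"update": {"_index": "website", "_type": "blog", "_id": "123",
                    "_version": 4, "status": 200}},
    ],
}

pairs = pair_results(["delete", "create", "index", "update"], sample_response)
for pair in pairs:
    print(pair)
```

In the sample response the third sub-request was sent as index but is reported under the create key, which is why the sketch matches by position and not by action name. Position is also the only way to find the automatically generated ID of the document that was sent without an _id.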
Each sub-request is executed independently, so the failure of one sub-request does not affect the success of the others. If any sub-request fails, the top-level errors flag is set to true, and the error details are reported under the relevant request:

POST /_bulk
{ "create": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title":    "Cannot create - it already exists" }
{ "index":  { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title":    "But we can update it" }


In the response, we can see that creating document 123 failed because it already exists, but the subsequent index request, also on document 123, succeeded:

{
   "took": 3,
   "errors": true,
   "items": [
      { "create": { "_index": "website", "_type": "blog", "_id": "123", "status": 409, "error": "DocumentAlreadyExistsException [[website][4] [blog][123]: document already exists]" }},
      { "index": { "_index": "website", "_type": "blog", "_id": "123", "_version": 5, "status": 200 }}
   ]
}

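	"errors": true means that one or more requests failed.
	"status": 409 is the HTTP status code reported for the failed request, `409 CONFLICT`.
	"error" holds the message explaining why the request failed.
	The second request succeeded, with the HTTP status code `200 OK`.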

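This also means that bulk requests are not atomic: they cannot be used to implement transactions. Each request is processed separately, so the success or failure of one request does not affect the others.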
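
Since sub-requests succeed or fail independently, a client usually checks the top-level errors flag first and then collects the items that carry an error. The following is a sketch under the response shape shown above; `failed_items` is a hypothetical helper, and the sample data is copied from the article.

```python
def failed_items(response):
    """Return (position, action, _id, status, error) for each failed sub-request."""
    # The top-level flag lets the caller skip the scan when everything succeeded.
    if not response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(response["items"]):
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append(
                (position, action, result["_id"], result["status"], result["error"])
            )
    return failures


# Sample response copied from the article (not fetched from a live cluster).
sample_response = {
    "took": 3,
    "errors": True,
    "items": [
        {"create": {"_index": "website", "_type": "blog", "_id": "123",
                    "status": 409,
                    "error": "DocumentAlreadyExistsException "
                             "[[website][4] [blog][123]: document already exists]"}},
        {"index": {"_index": "website", "_type": "blog", "_id": "123",
                   "_version": 5, "status": 200}},
    ],
}

failures = failed_items(sample_response)
for failure in failures:
    print(failure)
```

The returned positions index into the original list of sub-requests, so a caller can decide per item whether to retry, log, or ignore the failure without resending the operations that already succeeded.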

Reposted from: https://www.cnblogs.com/the-silverwing/p/10271688.html
