Bulk API
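Disclaimer: this article was translated and summarized from the official ES documentation. When reposting, please credit the author: https://blog.csdn.net/qingmou_csdn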
The bulk API makes it possible to perform many index/delete operations in a single API call. This can greatly increase indexing speed.
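A bulk request body is newline-delimited JSON: one action line, followed by a source line for every action except delete. As a rough sketch of how such a body could be assembled in Python, the helper below serializes a list of operations. The name build_bulk_body and the (action, metadata, source) triple format are assumptions made for illustration; they are not part of any Elasticsearch client.

```python
import json

def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into a bulk request body.

    `source` must be None for delete, which takes no source line.
    Hypothetical helper for illustration only.
    """
    lines = []
    for action, metadata, source in operations:
        if action not in ("index", "create", "delete", "update"):
            raise ValueError(f"unknown bulk action: {action}")
        # delete must have no source; every other action must have one.
        if (action == "delete") != (source is None):
            raise ValueError(f"{action}: unexpected or missing source line")
        lines.append(json.dumps({action: metadata}))
        if source is not None:
            lines.append(json.dumps(source))
    # Every line, including the last one, ends with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ("index", {"_index": "gaoyh", "_type": "test", "_id": "1"},
     {"index_test": "value_of_index_test"}),
    ("delete", {"_index": "gaoyh", "_type": "test", "_id": "2"}, None),
    ("create", {"_index": "gaoyh", "_type": "test", "_id": "3"},
     {"create_test": "value_of_create_test"}),
])
print(body, end="")
```

The printed body has five lines: two for index, one for delete, and two for create.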
DELETE gaoyh
POST _bulk
{"index":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"index_test":"value_of_index_test"}
{"delete":{"_index":"gaoyh","_type":"test","_id":"2"}}
{"create":{"_index":"gaoyh","_type":"test","_id":"3"}}
{"create_test":"value_of_create_test"}
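The response is as follows:
(document 1 is created in the index with the given source (201);
document 2 is not found in the index (404);
document 3 is created in the index (201))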
{
"took": 152,
"errors": false,
"items": [
{
"index": {
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1,
"status": 201
}
},
{
"delete": {
"_index": "gaoyh",
"_type": "test",
"_id": "2",
"_version": 1,
"result": "not_found",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1,
"status": 404
}
},
{
"create": {
"_index": "gaoyh",
"_type": "test",
"_id": "3",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1,
"status": 201
}
}
]
}
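Each entry in items is a single-key object whose key is the action name. A minimal sketch of reading the per-item outcome from such a response in Python could look like the following. The function name summarize_items is hypothetical, and the response dictionary is a trimmed copy of the one above that keeps only the fields used here.

```python
def summarize_items(response):
    """Return one (action, _id, status, result) tuple per bulk item."""
    summary = []
    for item in response["items"]:
        # Each item looks like {"index": {...}} or {"delete": {...}}: exactly one key.
        (action, detail), = item.items()
        summary.append((action, detail["_id"], detail["status"], detail.get("result")))
    return summary

# Trimmed copy of the response shown above (assumption: other fields are not needed).
response = {
    "took": 152,
    "errors": False,
    "items": [
        {"index": {"_id": "1", "result": "created", "status": 201}},
        {"delete": {"_id": "2", "result": "not_found", "status": 404}},
        {"create": {"_id": "3", "result": "created", "status": 201}},
    ],
}

for row in summarize_items(response):
    print(row)
```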
GET gaoyh/test/_mget
{
"ids":["1","2","3"]
}
The response is as follows:
{
"docs": [
{
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"_version": 1,
"found": true,
"_source": {
"index_test": "value_of_index_test"
}
},
{
"_index": "gaoyh",
"_type": "test",
"_id": "2",
"found": false
},
{
"_index": "gaoyh",
"_type": "test",
"_id": "3",
"_version": 1,
"found": true,
"_source": {
"create_test": "value_of_create_test"
}
}
]
}
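If the same bulk request is executed again at this point, the response is as follows:
(errors: true;
document 1 is updated and its version number is incremented by 1 (200);
document 2 is not found (404);
document 3 hits a version conflict and is not created (409); this is the difference between index and create.)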
{
"took": 5,
"errors": true,
"items": [
{
"index": {
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"_version": 2,
"result": "updated",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1,
"status": 200
}
},
{
"delete": {
"_index": "gaoyh",
"_type": "test",
"_id": "2",
"_version": 1,
"result": "not_found",
"_shards": {
"total": 2,
"successful": 2,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1,
"status": 404
}
},
{
"create": {
"_index": "gaoyh",
"_type": "test",
"_id": "3",
"status": 409,
"error": {
"type": "version_conflict_engine_exception",
"reason": "[test][3]: version conflict, document already exists (current version [1])",
"index_uuid": "7uz9hJ59RSWM3ZDHlpP8JA",
"shard": "4",
"index": "gaoyh"
}
}
}
]
}
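In the two responses shown so far, the delete that returned 404 carries no error object and did not set errors to true on its own, while the failed create carries an error object with a type and a reason. Assuming that pattern holds, a small Python sketch for collecting the failed items might look like this. The name failed_items is hypothetical, and the response is a trimmed copy of the one above.

```python
def failed_items(response):
    """Return (action, _id, status, error type) for items that carry an "error" object."""
    if not response.get("errors"):
        return []  # nothing failed, no need to scan the items
    failures = []
    for item in response["items"]:
        for action, detail in item.items():
            if "error" in detail:
                failures.append(
                    (action, detail["_id"], detail["status"], detail["error"]["type"])
                )
    return failures

# Trimmed copy of the second response (assumption: other fields are not needed).
second_response = {
    "took": 5,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "result": "updated", "status": 200}},
        {"delete": {"_id": "2", "result": "not_found", "status": 404}},
        {"create": {"_id": "3", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}

print(failed_items(second_response))
```

Only the create on document 3 is reported; the 404 delete is treated as a non-error outcome.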
Adding an update action:
POST _bulk
{"index":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"index_test":"value_of_index_test"}
{"update":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"doc":{"update_test":"value_of_update_test"}}
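The response is:
first, the index action updates document 1: the version number is incremented by 1, result: updated, status: 200;
then the update action updates document 1 again: the version number is incremented by 1, result: updated, status: 200.
A GET on document 1 returns: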
{
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"_version": 4,
"found": true,
"_source": {
"index_test": "value_of_index_test",
"update_test": "value_of_update_test"
}
}
--------------------------------------------------------------------------------
If the following request is executed:
POST _bulk
{"update":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"doc":{"index_test":"value_of_update_test"}}
the content of index_test is updated, and the _source of document 1 becomes:
"_source": {
"index_test": "value_of_update_test",
"update_test": "value_of_update_test"
}
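Note: an index action replaces document 1 outright, regardless of whether document 1 existed before or what it contained. The stored document, including its _source, becomes exactly what the index action supplied.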
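The contrast between replacing (index) and merging (update with doc) can be sketched with plain Python dictionaries. This is a toy model, not Elasticsearch code: apply_index and apply_partial_doc are hypothetical names, and the recursive merging of nested objects is an assumption about how partial documents are merged; the flat documents in this article only exercise the top level.

```python
import copy

def apply_index(existing, new_source):
    """index: the previous source is ignored; the new source replaces it."""
    return copy.deepcopy(new_source)

def apply_partial_doc(existing, doc):
    """update with "doc": merge the given fields into the stored source."""
    merged = copy.deepcopy(existing)
    for key, value in doc.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Assumption: nested objects are merged field by field.
            merged[key] = apply_partial_doc(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

source = apply_index({}, {"index_test": "value_of_index_test"})
source = apply_partial_doc(source, {"update_test": "value_of_update_test"})
source = apply_partial_doc(source, {"index_test": "value_of_update_test"})
print(source)
# {'index_test': 'value_of_update_test', 'update_test': 'value_of_update_test'}

replaced = apply_index(source, {"index_test": "value_of_index_test"})
print(replaced)
# {'index_test': 'value_of_index_test'}
```

After the final index call, update_test is gone, which matches the note above: index does not keep any field of the previous document.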
DELETE gaoyh
POST _bulk
{"index":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"index_test":"value_of_index_test"}
{"delete":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"update":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"doc":{"update_test":"value_of_update_test"}}
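The response is as follows:
(document 1 is created as version 1 (201); document 1 is deleted, producing version 2 (200); the update of document 1 fails (404) because document 1 can no longer be found)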
{
"took": 149,
"errors": true,
"items": [
{
"index": {
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1,
"status": 201
}
},
{
"delete": {
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"_version": 2,
"result": "deleted",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1,
"status": 200
}
},
{
"update": {
"_index": "gaoyh",
"_type": "test",
"_id": "1",
"status": 404,
"error": {
"type": "document_missing_exception",
"reason": "[test][1]: document missing",
"index_uuid": "HTERVHQJSEOqBKpcGSVkWQ",
"shard": "3",
"index": "gaoyh"
}
}
}
]
}
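The response above suggests that actions on the same document take effect in the order they are listed: the update fails because the delete before it has already removed the document. The toy Python model below reproduces the status codes seen in this article with an in-memory dictionary. It is only a sketch: run_bulk is a hypothetical name, the result strings for failures are simplified labels, and versions, shards, and no-op updates are not modeled.

```python
def run_bulk(store, operations):
    """Apply (action, _id, payload) triples in order to an in-memory dict.

    Returns a list of (action, _id, status, result) tuples. Toy model only.
    """
    results = []
    for action, doc_id, payload in operations:
        exists = doc_id in store
        if action == "index":
            store[doc_id] = dict(payload)  # add or replace
            status, result = (200, "updated") if exists else (201, "created")
        elif action == "create":
            if exists:
                status, result = 409, "version_conflict"  # simplified label
            else:
                store[doc_id] = dict(payload)
                status, result = 201, "created"
        elif action == "delete":
            if exists:
                del store[doc_id]
                status, result = 200, "deleted"
            else:
                status, result = 404, "not_found"
        elif action == "update":
            if exists:
                store[doc_id].update(payload["doc"])  # flat merge of the partial doc
                status, result = 200, "updated"
            else:
                status, result = 404, "document_missing"  # simplified label
        else:
            raise ValueError(f"unknown bulk action: {action}")
        results.append((action, doc_id, status, result))
    return results

store = {}
results = run_bulk(store, [
    ("index", "1", {"index_test": "value_of_index_test"}),
    ("delete", "1", None),
    ("update", "1", {"doc": {"update_test": "value_of_update_test"}}),
])
for row in results:
    print(row)
```

The three printed rows carry statuses 201, 200, and 404, in that order, and the store is empty afterwards.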
Summary:
1. The last line of data must end with a newline character \n. Pay attention to the format: the body is newline-delimited JSON, so each action line and each source line must occupy exactly one line.
2. The possible actions are index, create, delete, and update, where:
- index and create expect a source on the next line, and have the same semantics as the op_type parameter of the standard Index API (that is, create fails if a document with the same index, type, and ID already exists, whereas index adds or replaces the document as necessary);
- delete does not expect a source on the following line, and has the same semantics as the standard Delete API;
- update expects the partial doc, upsert, and script, together with their options, to be specified on the next line.
For example:
DELETE gaoyh
POST _bulk
{"update":{"_index":"gaoyh","_type":"test","_id":"1"}}
{"script":{"source":"ctx._source.counter += params.count","lang":"painless","params":{"count":4}},"scripted_upsert":true,"upsert":{"counter":1}}
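Result:
(With scripted_upsert set to true, document 1 does not exist, so it is first created from the upsert content with counter set to 1, and the script is then run against it; document 1's counter is therefore 5.
With scripted_upsert set to false, the missing document 1 is created from the upsert content with counter set to 1 and the script is not run, so document 1's counter is 1; only after the same bulk request is executed a second time does the counter become 5.)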
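The two behaviours described above can be mimicked with a small Python model. It is a sketch of the described semantics, not Elasticsearch code: scripted_update is a hypothetical name, and the Painless script is replaced by an ordinary Python function (add_count) standing in for ctx._source.counter += params.count.

```python
def scripted_update(store, doc_id, script, params, upsert, scripted_upsert):
    """Toy model of an update that carries "script", "upsert" and "scripted_upsert"."""
    if doc_id in store:
        # Document exists: the script runs against the stored source.
        script(store[doc_id], params)
        return
    # Document is missing: start from a copy of the upsert content.
    source = dict(upsert)
    if scripted_upsert:
        # With scripted_upsert, the script also runs on the freshly created document.
        script(source, params)
    store[doc_id] = source

def add_count(source, params):
    # Python stand-in for: ctx._source.counter += params.count
    source["counter"] += params["count"]

store_true = {}
scripted_update(store_true, "1", add_count, {"count": 4}, {"counter": 1}, scripted_upsert=True)
print(store_true["1"]["counter"])   # 5

store_false = {}
scripted_update(store_false, "1", add_count, {"count": 4}, {"counter": 1}, scripted_upsert=False)
print(store_false["1"]["counter"])  # 1
scripted_update(store_false, "1", add_count, {"count": 4}, {"counter": 1}, scripted_upsert=False)
print(store_false["1"]["counter"])  # 5
```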