1. bulk
POST /_bulk
{"delete":{"_index":"index_test2","_type":"product1","_id":"1"}}
{"create":{"_index":"index_test2","_type":"product1","_id":"1"}}
{"test_name":"name1","test_age":15}
{"index":{"_index":"index_test2","_type":"product1","_id":"2"}}
{"test_name":"name2","test_age":20}
{"update":{"_index":"index_test2","_type":"product1","_id":"1","retry_on_conflict":3}}
{"doc":{"name":"jieke"}}
{
  "took": 287,
  "errors": false,
  "items": [
    {
      "delete": {
        "_index": "index_test2",
        "_type": "product1",
        "_id": "1",
        "_version": 1,
        "result": "not_found",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 404
      }
    },
    {
      "create": {
        "_index": "index_test2",
        "_type": "product1",
        "_id": "1",
        "_version": 2,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 1,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "index": {
        "_index": "index_test2",
        "_type": "product1",
        "_id": "2",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    },
    {
      "update": {
        "_index": "index_test2",
        "_type": "product1",
        "_id": "1",
        "_version": 3,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 2,
        "_primary_term": 1,
        "status": 200
      }
    }
  ]
}
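Which types of operations can a bulk request perform?
(1) delete: deletes a document; it needs only one JSON line (the action line)
(2) create: equivalent to PUT /index/type/id/_create; forces creation and fails if the document already exists
(3) index: an ordinary PUT operation; it either creates the document or fully replaces an existing one
(4) update: performs a partial update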
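The line counts for the four operation types can be sketched in Python. This is only an illustration: the helper name bulk_entry and its parameters are hypothetical and not part of any Elasticsearch client. It returns the JSON objects one operation contributes to the bulk body: one for delete, two for the others.

```python
def bulk_entry(op, index, doc_id, body=None, doc_type=None):
    """Return the JSON objects (as dicts) that one bulk operation contributes.

    Hypothetical helper for illustration; doc_type mirrors the `_type`
    field used in the request above and is left out when not given.
    """
    meta = {"_index": index, "_id": doc_id}
    if doc_type is not None:
        meta["_type"] = doc_type
    action = {op: meta}

    if op == "delete":
        return [action]                  # action line only
    if op in ("create", "index"):
        return [action, body]            # action line + full document source
    if op == "update":
        return [action, {"doc": body}]   # action line + partial document
    raise ValueError(f"unsupported bulk operation: {op}")
```

For example, bulk_entry("update", "index_test2", "1", {"name": "jieke"}, doc_type="product1") yields the same two objects as the update lines in the request above, apart from retry_on_conflict.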
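bulk: caveats and things to know
The bulk API is strict about JSON syntax: each JSON object must sit on a single line with no line breaks inside it, and consecutive JSON objects must be separated by a newline. The last line of the body must also end with a newline.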
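A minimal sketch of producing a body that satisfies these rules, assuming the operations are already available as Python dicts. The function name to_ndjson is made up for this example. json.dumps without an indent argument never emits a raw line break, so each object stays on one line.

```python
import json


def to_ndjson(objects):
    """Serialize each object onto exactly one line and end the body with a newline."""
    # Without `indent`, json.dumps writes a compact single-line form;
    # newlines inside string values are escaped as \n, not emitted raw.
    lines = [json.dumps(obj) for obj in objects]
    return "\n".join(lines) + "\n"


body = to_ndjson([
    {"index": {"_index": "index_test2", "_id": "2"}},
    {"test_name": "name2", "test_age": 20},
])
```

The resulting string could be sent as the body of POST /_bulk; Elasticsearch expects the Content-Type application/x-ndjson for it.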
Within a bulk request, the failure of any single operation does not affect the others; the response reports the error for each failed item.
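Because failures are reported per item, the caller has to inspect the response rather than rely on the HTTP status alone. The sketch below assumes the response has been parsed into a dict with the same shape as the response shown earlier, and that a failed item carries an "error" object alongside its "status". The function name failed_items and the sample response are illustrative only.

```python
def failed_items(bulk_response):
    """Return (position, operation, status, error) for every failed bulk item."""
    failures = []
    if not bulk_response.get("errors"):
        return failures  # top-level flag says nothing failed
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item has a single key: the operation name.
        for op, result in item.items():
            if "error" in result:
                failures.append((position, op, result.get("status"), result["error"]))
    return failures


# Made-up response: the create fails, the index still succeeds.
sample = {
    "errors": True,
    "items": [
        {"create": {"_id": "1", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
        {"index": {"_id": "2", "status": 201, "result": "created"}},
    ],
}
failures = failed_items(sample)
```

The position in the items array matches the order of operations in the request, so it can be used to find which operation to retry.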
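A bulk request is loaded into memory in full, so if it is too large, performance actually gets worse. You therefore need to experiment repeatedly to find the best bulk size. A common starting point is 1,000 to 5,000 documents per request, increased gradually. Measured by size, 5 to 15 MB per request is a good target.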
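One way to keep requests inside such limits is to batch on both document count and byte size. The sketch below assumes each operation has already been serialized to its NDJSON lines (one string per operation, ending in a newline). The name chunk_operations and the default limits are assumptions chosen from the ranges above, not values prescribed by Elasticsearch.

```python
def chunk_operations(operations, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk bodies holding at most max_docs operations and about max_bytes bytes.

    `operations` is an iterable of strings; each string is one complete
    operation (its action line plus optional source line, newline-terminated).
    A single operation larger than max_bytes is still sent, alone.
    """
    batch, batch_bytes = [], 0
    for op in operations:
        op_bytes = len(op.encode("utf-8"))
        # Flush before adding if either limit would be exceeded.
        if batch and (len(batch) >= max_docs or batch_bytes + op_bytes > max_bytes):
            yield "".join(batch)
            batch, batch_bytes = [], 0
        batch.append(op)
        batch_bytes += op_bytes
    if batch:
        yield "".join(batch)
```

Tuning then means changing max_docs and max_bytes between runs and comparing indexing throughput.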
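Why doesn't bulk use JSON that spans multiple lines?
If the body were ordinary multi-line JSON, the whole request would first have to be parsed into JSON objects before any value could be read. The data held in memory would at least double, which costs a lot of memory and CPU. With one JSON object per line, the body can instead be split on newlines and handled one line at a time.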
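The idea can be illustrated with a small Python sketch. It is not Elasticsearch's actual implementation. It only shows that a newline-delimited body lets a reader parse the short action line and leave the document source as raw, unparsed text. The function name split_bulk_body is hypothetical.

```python
import json


def split_bulk_body(body):
    """Yield (operation, metadata, raw_source) without parsing document sources."""
    lines = iter(body.split("\n"))
    for line in lines:
        if not line:
            continue  # the trailing newline leaves an empty last element
        action = json.loads(line)   # small metadata line: cheap to parse
        op = next(iter(action))     # the single key is the operation name
        # delete has no source line; for other operations keep the next line as raw text
        source = None if op == "delete" else next(lines)
        yield op, action[op], source


body = (
    '{"delete":{"_id":"1"}}\n'
    '{"index":{"_id":"2"}}\n'
    '{"test_name":"name2","test_age":20}\n'
)
parsed = list(split_bulk_body(body))
```

Only the metadata lines go through json.loads here; the source line is passed along as the original string.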