Elasticsearch -- batch create, update, and delete with bulk

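A bulk request is loaded into memory in full, so performance drops if the request is too large. A common approach is to increase the request size gradually and look for the balance point, typically somewhere between 5 and 15 MB.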
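
One way to keep each request under a chosen size is to group operations by their encoded byte length. The following is a minimal sketch, not an official client feature: the function name batch_by_size and the max_bytes parameter are hypothetical, and it assumes every operation has already been serialized to its bulk lines.

```python
from typing import Iterable, Iterator, List


def batch_by_size(operations: Iterable[str], max_bytes: int) -> Iterator[str]:
    """Group pre-serialized bulk operations into bodies of at most max_bytes.

    Each element of `operations` is one complete operation: its action line
    plus its data line if it has one, each terminated by a newline.
    """
    batch: List[str] = []
    size = 0
    for op in operations:
        op_size = len(op.encode("utf-8"))
        # Flush before the batch would exceed the limit. A single operation
        # larger than max_bytes is still emitted, alone in its own body.
        if batch and size + op_size > max_bytes:
            yield "".join(batch)
            batch, size = [], 0
        batch.append(op)
        size += op_size
    if batch:
        yield "".join(batch)
```

Starting with a small max_bytes and raising it between test runs is one way to look for the balance point described above.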

  • Syntax: except for delete, every operation takes two JSON strings, in the following form:

{"action":{"metadata"}}

{"data"}

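action is one of the following:

  1. delete: deletes a document; takes only one JSON string
  2. create: equivalent to PUT /index/type/id/_create; forces creation and returns an error if the given id already exists
  3. index: an ordinary PUT; either creates the document or fully replaces an existing one
  4. update: performs a partial update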

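The metadata options are: {"_index":"xxx","_type":"xxx","_id":"x","retry_on_conflict":5}. retry_on_conflict applies only to update; older releases spell it _retry_on_conflict, as in the example below.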
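
The two-line layout described above can be produced with the standard json module. This is a minimal sketch; the helper name bulk_lines is hypothetical and is not part of any Elasticsearch client.

```python
import json
from typing import Optional

ACTIONS = {"delete", "create", "index", "update"}


def bulk_lines(action: str, metadata: dict, data: Optional[dict] = None) -> str:
    """Serialize one bulk operation as newline-terminated JSON lines."""
    if action not in ACTIONS:
        raise ValueError(f"unknown bulk action: {action}")
    if action == "delete" and data is not None:
        raise ValueError("delete takes no data line")
    if action != "delete" and data is None:
        raise ValueError(f"{action} needs a data line")

    def compact(obj: dict) -> str:
        # Compact separators guarantee the JSON stays on a single line.
        return json.dumps(obj, ensure_ascii=False, separators=(",", ":"))

    lines = [compact({action: metadata})]
    if data is not None:
        lines.append(compact(data))
    # Every line, including the last one, ends with a newline.
    return "\n".join(lines) + "\n"
```

Concatenating the strings returned for each operation gives the request body, for example bulk_lines("update", {"_id": "1"}, {"doc": {"tag": "girl"}}).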

  • Example:

POST _bulk
{"delete":{"_index":"accounts","_type":"person","_id":"1"}}
{"create":{"_index":"accounts","_type":"person","_id":"1"}}
{"user":"zeng","age":10,"salary":20000,"title":"工程师","desc":"数据库管理","tag":"girl","tags":"xyz"}
{"create":{"_index":"accounts","_type":"person","_id":"1"}}
{"user":"zeng","age":10,"salary":20000,"title":"工程师","desc":"数据库管理","tag":"girl","tags":"xyz"}
{"index":{"_index":"accounts","_type":"person","_id":"1"}}
{"user":"zeng","age":10,"salary":20000,"title":"工程师","desc":"数据库管理","tag":"boy"}
{"update":{"_index":"accounts","_type":"person","_id":"1","_retry_on_conflict":5}}
{"doc":{"tag":"girl"}}
Response:
{
  "took": 23,
  "errors": true,
  "items": [
    {
      "delete": {
        "found": true,
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_version": 11,
        "result": "deleted",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    },
    {
      "create": {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_version": 12,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": true,
        "status": 201
      }
    },
    {
      "create": {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "status": 409, // forced create: returns an error because the given id already exists
        "error": {
          "type": "version_conflict_engine_exception",
          "reason": "[person][1]: version conflict, document already exists (current version [12])",
          "index_uuid": "qgPjJUv6Sm-VBesoOMlNPA",
          "shard": "3",
          "index": "accounts"
        }
      }
    },
    {
      "index": {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_version": 13,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "created": false,
        "status": 200
      }
    },
    {
      "update": {
        "_index": "accounts",
        "_type": "person",
        "_id": "1",
        "_version": 14,
        "result": "updated",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "status": 200
      }
    }
  ]
}
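
Because a bulk response reports success or failure per item, the top-level "errors" flag only says that at least one item failed. The sketch below walks a response shaped like the one above and collects the failed items; the function name failed_items is hypothetical, and the response is assumed to be already parsed into a dict.

```python
from typing import List, Optional, Tuple


def failed_items(response: dict) -> List[Tuple[int, str, Optional[int], str]]:
    """Return (position, action, status, error type) for every failed item."""
    failures = []
    if not response.get("errors"):
        return failures
    for position, item in enumerate(response.get("items", [])):
        # Each item holds a single key: the action that was executed.
        for action, result in item.items():
            if "error" not in result:
                continue
            error = result["error"]
            # The error is an object in the response shown above; fall back
            # to its string form in case it is a plain message.
            error_type = error.get("type") if isinstance(error, dict) else str(error)
            failures.append((position, action, result.get("status"), error_type))
    return failures
```

For the response above, this reports the second create (position 2) with status 409 and type version_conflict_engine_exception. The position matches the order of operations in the request.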
  • Error 1:

{
  "error": {
    "root_cause": [
      {
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: script or doc is missing;"
      }
    ],
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: script or doc is missing;"
  },
  "status": 400
}

This usually means the data line of an update operation is malformed. Check that the JSON string has the following form:

{"doc":{"data"}}
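
This mistake can be caught on the client before the request is sent. The check below is a small sketch with a hypothetical name, check_update_data; it mirrors the wording of the validation message ("script or doc is missing") and does not reproduce the full server-side validation.

```python
def check_update_data(data: dict) -> None:
    """Raise ValueError if an update data line has neither 'doc' nor 'script'."""
    if "doc" not in data and "script" not in data:
        raise ValueError("update data line needs a 'doc' or 'script' key")


# Passes: the partial document is wrapped in "doc".
check_update_data({"doc": {"tag": "girl"}})
```

Passing the bare fields, such as {"tag": "girl"}, raises ValueError, which corresponds to the 400 response shown above.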

  • Error 2:

{
  "error": {
    "root_cause": [
      {
        "type": "json_e_o_f_exception",
        "reason": "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@694f1218; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@694f1218; line: 1, column: 3]"
      }
    ],
    "type": "json_e_o_f_exception",
    "reason": "Unexpected end-of-input: expected close marker for Object (start marker at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@694f1218; line: 1, column: 1])\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@694f1218; line: 1, column: 3]"
  },
  "status": 500
}

Check the JSON format of the request: each JSON string must have its pretty-printing removed and occupy exactly one line.
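
If the JSON was copied from a pretty-printed source, re-serializing it is a simple way to collapse it onto one line. A minimal sketch with a hypothetical helper name, to_single_line:

```python
import json


def to_single_line(pretty_json: str) -> str:
    """Parse a (possibly multi-line) JSON string and emit it on one line."""
    return json.dumps(
        json.loads(pretty_json),
        ensure_ascii=False,      # keep non-ASCII text such as "工程师" readable
        separators=(",", ":"),   # no spaces or line breaks between tokens
    )


pretty = """{
  "doc": {
    "tag": "girl"
  }
}"""
print(to_single_line(pretty))
```

Line breaks inside string values are not a problem, because json.dumps writes them as the escape sequence \n rather than as literal newlines.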

  • Why not use a more natural, more readable JSON format?

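The main reason is that a more readable format (a JSON array, for example) would have to be parsed into an in-memory copy of the request data (such as a JsonArray object). That raises memory usage and makes the ES JVM run garbage collection frequently, which hurts performance badly.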

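With the current format, ES only needs to split the request on line breaks and forward each piece to the corresponding node. No memory is wasted on an extra copy, and performance is much better.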
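
The idea can be illustrated with a simplified model. The sketch below is not Elasticsearch's implementation: the function name route_bulk is hypothetical, and zlib.crc32 stands in for the real routing hash. It shows that only the short action lines need to be parsed, while data lines are passed along as raw text.

```python
import json
import zlib
from collections import defaultdict
from typing import Dict, List


def route_bulk(body: str, num_shards: int) -> Dict[int, List[str]]:
    """Split a bulk body on newlines and group the raw lines by target shard."""
    lines = body.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    per_shard: Dict[int, List[str]] = defaultdict(list)
    i = 0
    while i < len(lines):
        action_line = lines[i]
        # Only the small action line is parsed.
        action, meta = next(iter(json.loads(action_line).items()))
        # Stand-in hash, NOT the function Elasticsearch uses.
        shard = zlib.crc32(str(meta["_id"]).encode("utf-8")) % num_shards
        per_shard[shard].append(action_line)
        i += 1
        if action != "delete":
            # The data line is forwarded untouched; it is never parsed here.
            per_shard[shard].append(lines[i])
            i += 1
    return dict(per_shard)
```

In this model the coordinating step never builds an object tree for the documents themselves, which is the memory saving the paragraph above describes.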
