How to fix the ElasticSearch error: Found unrecoverable error [xxx] returned Bad Request(400) - failed to parse

20/03/27 10:50:53 WARN scheduler.TaskSetManager: Lost task 27.2 in stage 0.0 (TID 71, xxx11, executor 5): org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: Found unrecoverable error [xxx] returned Bad Request(400) - failed to parse; Bailing out..
        at org.elasticsearch.hadoop.rest.RestClient.processBulkResponse(RestClient.java:251)
        at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:203)
        at org.elasticsearch.hadoop.rest.RestRepository.tryFlush(RestRepository.java:222)
        at org.elasticsearch.hadoop.rest.RestRepository.flush(RestRepository.java:244)
        at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:198)
        at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:161)
        at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:67)
        at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:107)
        at org.elasticsearch.spark.rdd.EsSpark$$anonfun$doSaveToEs$1.apply(EsSpark.scala:107)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:121)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$11.apply(Executor.scala:407)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1408)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:413)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:748)

The above is the full error message.

I searched online for quite a while and couldn't find a working solution.

Below is the Scala code. It reads data from HDFS and writes it to ES via Spark, using the EsSpark class:

    import org.apache.spark.sql.SparkSession
    import org.elasticsearch.spark.rdd.EsSpark

    val sc = SparkSession.builder().appName("step1")
      //.master("local[*]")
      .config("es.index.auto.create", "true")
      .config("pushdown", "true")
      .config("es.nodes", "xxx0,xxx1,xxx2,xxx3")
      .config("es.port", "9200")
      .config("es.nodes.wan.only", "true")
      .config("es.batch.write.retry.wait", "500")
      .config("es.batch.write.retry.count", "50")
      .config("es.batch.size.bytes", "300000000")
      .config("es.batch.size.entries", "10000")
      .config("es.batch.write.refresh", "false")
      .config("es.batch.write.retry.count", "60")
      .config("es.http.timeout", "10m")
      .config("es.http.retries", "50")
      .config("es.action.heart.beat.lead", "50")
      .getOrCreate()
    val today = args(0)
    val jsonRdd = sc.sparkContext.textFile(s"xxx")
        .map(line=>{
          val arr = line.replace("\"", "").split(",", -1)
          val color = arr(0).split("_")(1) match {
            case "0" => "蓝色"
            case "1" => "黄色"
            case "2" => "黑色"
            case "3" => "白色"
            case "4" => "渐变绿色"
            case "5" => "黄绿双拼色"
            case "6" => "蓝白渐变色"
            case _ => "未确定"
          }

          // Build the document as a Map, dropping fields with empty values
          Map(
            "vehicle_color"->color,
            "vehicle_number"->arr(0).split("_")(0),
            "total_count" -> arr(1),
            "average_park_time" -> arr(2),
            "average_park_fee" -> arr(3),
            "park1" -> arr(4),
            "park2" -> arr(5),
            "park3" -> arr(6),
            "pay_way" -> arr(7)
          ).filter(_._2 != "")

        })

    EsSpark.saveToEs(jsonRdd, "xxx")        // this method works fine
//    EsSpark.saveJsonToEs(jsonRdd, "xxx")  // this method triggers the error above

saveToEs serializes the Map records itself and writes them straight into ES.
saveJsonToEs expects the RDD to already contain JSON strings; my RDD holds Maps, so ES receives documents it cannot parse and answers with Bad Request(400) - failed to parse.
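
If you really do want to use saveJsonToEs, serialize each Map into a JSON string before writing. Below is a minimal sketch, assuming json4s for the serialization (it ships as a Spark dependency); jsonStrRdd is a name I made up, and "xxx" remains the placeholder index from the code above:

    import org.json4s.DefaultFormats
    import org.json4s.jackson.Serialization

    // Serialize each Map into a JSON string so that ES can parse the documents
    val jsonStrRdd = jsonRdd.map { record =>
      implicit val formats: DefaultFormats.type = DefaultFormats
      Serialization.write(record) // e.g. {"vehicle_color":"蓝色","total_count":"12",...}
    }

    EsSpark.saveJsonToEs(jsonStrRdd, "xxx") // same placeholder index as above

Either way, the root cause is the same: saveJsonToEs sends each record as-is in the bulk request body, so anything that is not valid JSON makes ES reply with 400 - failed to parse.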