Detailed error log:
es报错org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 55.0 failed 4 times, most recent failure: Lost task 0.3 in stage 55.0 (TID 4643, 192.168.1.203, executor 3): org.elasticsearch.hadoop.EsHadoopException: Could not write all entries for bulk operation [47/10813]. Error sample (first [5] error messages):
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
Bailing out...
at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.flush(BulkProcessor.java:475)
at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.add(BulkProcessor.java:106)
at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:187)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:168)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:67)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:100)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Analyzing the log, the key line is: failed to parse [edgeEasyPovertyStartDate]. Spark hit a parse failure on the field "edgeEasyPovertyStartDate" while writing to Elasticsearch. I checked the ES index mapping for this field: it was declared as integer, while the actual data is a date. After recreating the ES index with the edgeEasyPovertyStartDate field type changed from integer to date, the write succeeded.
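The fix above (recreating the index with the correct mapping) can be sketched as follows. Only the field name comes from the error log; the index name and the commented-out client calls are assumptions for illustration:

```python
# Sketch of the corrected mapping. The index name "poverty_index" is a
# hypothetical placeholder; only the field name comes from the error log.
corrected_mapping = {
    "mappings": {
        "properties": {
            # Was "integer" before; the data is actually a date,
            # so declare it as "date" when recreating the index.
            "edgeEasyPovertyStartDate": {"type": "date"},
        }
    }
}

# With the official elasticsearch Python client, applying it would look
# roughly like this (exact signature varies between client versions):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://192.168.1.203:9200")
#   es.indices.delete(index="poverty_index", ignore_unavailable=True)
#   es.indices.create(index="poverty_index", body=corrected_mapping)
```

Note that an existing field's type cannot be changed in place in Elasticsearch; the index must be recreated (or reindexed) with the new mapping, which is why the fix above deletes and recreates it.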
Many errors when writing from Spark to Elasticsearch are caused by a mismatch between the ES index field type and the actual data type.
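One way to catch such mismatches before a job fails is a pre-flight check that compares the types a job intends to write against the types declared in the index mapping. The helper below is a hypothetical sketch (not part of elasticsearch-hadoop), and the Spark-to-ES type table is partial and illustrative only:

```python
# Hypothetical pre-flight check: compare the field types a Spark job intends
# to write against the types declared in the ES index mapping.
# Partial, illustrative mapping from Spark SQL type names to ES field types.
SPARK_TO_ES = {
    "IntegerType": "integer",
    "LongType": "long",
    "StringType": "text",
    "TimestampType": "date",
    "DateType": "date",
}

def find_mismatches(spark_schema, es_properties):
    """Return (field, declared_es_type, expected_es_type) for each conflict.

    spark_schema:  dict of field name -> Spark SQL type name
    es_properties: the "properties" section of the ES index mapping
    """
    mismatches = []
    for field, spark_type in spark_schema.items():
        expected = SPARK_TO_ES.get(spark_type)
        actual = es_properties.get(field, {}).get("type")
        if expected and actual and expected != actual:
            mismatches.append((field, actual, expected))
    return mismatches

# The failing case from the log: the index declared integer, the data is a date.
print(find_mismatches(
    {"edgeEasyPovertyStartDate": "DateType"},
    {"edgeEasyPovertyStartDate": {"type": "integer"}},
))
# → [('edgeEasyPovertyStartDate', 'integer', 'date')]
```

Running a check like this against the index mapping before calling saveToEs would surface the integer/date conflict up front instead of failing the bulk write mid-job.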