Spark write to ES fails with "Could not write all entries for bulk operation [47/10813]"

Full error log:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 55.0 failed 4 times, most recent failure: Lost task 0.3 in stage 55.0 (TID 4643, 192.168.1.203, executor 3): org.elasticsearch.hadoop.EsHadoopException: Could not write all entries for bulk operation [47/10813]. Error sample (first [5] error messages):
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]
failed to parse [edgeEasyPovertyStartDate]

Bailing out...
at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.flush(BulkProcessor.java:475)
at org.elasticsearch.hadoop.rest.bulk.BulkProcessor.add(BulkProcessor.java:106)
at org.elasticsearch.hadoop.rest.RestRepository.doWriteToIndex(RestRepository.java:187)
at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:168)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:67)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.elasticsearch.spark.sql.EsSparkSQL$$anonfun$saveToEs$1.apply(EsSparkSQL.scala:101)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:100)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

From the log, the message failed to parse [edgeEasyPovertyStartDate] tells us that Spark hit a parsing problem on the field edgeEasyPovertyStartDate while writing to ES, so I checked that field's type in the ES index mapping.

The mapping declared it as integer, while the data in that field is actually a date. Recreating the ES index with edgeEasyPovertyStartDate changed from integer to date resolved the error.
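To verify this kind of mismatch, the field mapping can be inspected and the index recreated through the ES REST API. A minimal sketch, assuming Elasticsearch 7+ and a hypothetical index name my_index (the post does not show the real one); on 6.x and earlier the mapping body must be nested under a document type, and any existing data must be reindexed into the new index:

```
# Inspect the current mapping of the problem field
curl -XGET 'http://localhost:9200/my_index/_mapping/field/edgeEasyPovertyStartDate?pretty'

# A field's type cannot be changed in place: drop (or reindex) and recreate
curl -XDELETE 'http://localhost:9200/my_index'

curl -XPUT 'http://localhost:9200/my_index' -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "properties": {
      "edgeEasyPovertyStartDate": { "type": "date", "format": "yyyy-MM-dd||epoch_millis" }
    }
  }
}'
```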

Many errors when writing from Spark to ES come down to a mismatch between the ES index mapping and the actual data types. A safer pattern is to make the DataFrame schema agree with the mapping before writing, as in the sketch below.
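For illustration, a minimal Scala sketch of the write path, assuming a hypothetical data source and index name (neither appears in the post): it casts the column to Spark's DateType with to_date before calling saveToEs from elasticsearch-spark, so the value shipped to ES agrees with the date mapping.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, to_date}
import org.elasticsearch.spark.sql._ // adds saveToEs to DataFrame

object WritePovertyToEs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("write-to-es").getOrCreate()

    // Hypothetical source; the post does not show how the DataFrame is built
    val df = spark.read.json("/data/poverty.json")

    // Cast the string column to DateType so it matches the "date" mapping in ES
    val fixed = df.withColumn(
      "edgeEasyPovertyStartDate",
      to_date(col("edgeEasyPovertyStartDate"), "yyyy-MM-dd")
    )

    // "my_index/_doc" and the node address are placeholders
    fixed.saveToEs("my_index/_doc", Map(
      "es.nodes" -> "192.168.1.203",
      "es.port"  -> "9200"
    ))
  }
}
```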
