org.apache.spark.util.SparkFatalException

Shockang

Article link: https://blog.csdn.net/Shockang/article/details/119063441

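Preface

This article belongs to the column 《Spark异常问题汇总》 (a collection of Spark exceptions and fixes), which is the author's original work. Please credit the source when quoting, and point out any shortcomings or errors in the comments. Thank you!

For the column's table of contents and references, see Spark异常问题汇总.

Problem Description

While building dimension tables, a join between two dimension tables failed with the following error: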

java.util.concurrent.ExecutionException: org.apache.spark.util.SparkFatalException
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:206)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$doExecuteBroadcast$2.apply$mcVI$sp(BroadcastExchangeExec.scala:152)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$doExecuteBroadcast$2.apply(BroadcastExchangeExec.scala:150)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$doExecuteBroadcast$2.apply(BroadcastExchangeExec.scala:150)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.immutable.Range.foreach(Range.scala:160)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:150)
        at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:387)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:158)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:154)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:169)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:166)
        at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:154)
        at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:117)
        at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:211)
        at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:101)
        at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:189)
        at org.apache.spark.sql.execution.ProjectExec.consume(basicPhysicalOperators.scala:41)
        at org.apache.spark.sql.execution.ProjectExec.doConsume(basicPhysicalOperators.scala:71)
        at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:189)
        at org.apache.spark.sql.execution.FilterExec.consume(basicPhysicalOperators.scala:91)
        at org.apache.spark.sql.execution.FilterExec.doConsume(basicPhysicalOperators.scala:216)
        at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:189)
        at org.apache.spark.sql.execution.FileSourceScanExec.consume(DataSourceScanExec.scala:165)
        at org.apache.spark.sql.execution.ColumnarBatchScan$class.produceBatches(ColumnarBatchScan.scala:144)
        at org.apache.spark.sql.execution.ColumnarBatchScan$class.doProduce(ColumnarBatchScan.scala:83)
        at org.apache.spark.sql.execution.FileSourceScanExec.doProduce(DataSourceScanExec.scala:165)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:90)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:169)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:166)
        at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.FileSourceScanExec.produce(DataSourceScanExec.scala:165)
        at org.apache.spark.sql.execution.FilterExec.doProduce(basicPhysicalOperators.scala:131)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:90)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:169)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:166)
        at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.FilterExec.produce(basicPhysicalOperators.scala:91)
        at org.apache.spark.sql.execution.ProjectExec.doProduce(basicPhysicalOperators.scala:51)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:90)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:169)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:166)
        at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.ProjectExec.produce(basicPhysicalOperators.scala:41)
        at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:96)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:90)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:169)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:166)
        at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:40)
        at org.apache.spark.sql.execution.ProjectExec.doProduce(basicPhysicalOperators.scala:51)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:90)
        at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:85)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:169)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)

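Root Cause

The failure occurs while the broadcast side of the join is being built (BroadcastExchangeExec.doExecuteBroadcast in the stack trace): the dataset is too large, and creating the broadcast timed out. SparkFatalException is itself only a wrapper around a fatal error raised inside the broadcast future (for example an OutOfMemoryError), so it is worth checking the wrapped cause in the full log as well.

Solution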

spark.sql.broadcastTimeout

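The default is 5 minutes (300 seconds). Try increasing this value and rerunning the job. Whether waiting that long is worthwhile depends on the job as a whole.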
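As a minimal sketch of raising the timeout, the snippet below renders `--conf` flags for `spark-submit`. The helper name `conf_flags`, the script name `your_job.py`, and the 1200-second value are illustrative assumptions, not details of the original job; only the property name comes from the text above.

```python
def conf_flags(settings):
    """Render Spark properties as spark-submit `--conf key=value` arguments."""
    flags = []
    for key, value in settings.items():
        flags.extend(["--conf", f"{key}={value}"])
    return flags


# Hypothetical value: raise the broadcast timeout from the 300 s default to 1200 s.
broadcast_timeout_flags = conf_flags({"spark.sql.broadcastTimeout": 1200})

# Assemble the command line; `your_job.py` is a placeholder script name.
print(" ".join(["spark-submit", *broadcast_timeout_flags, "your_job.py"]))
```

Inside an already running session, the same property can also be changed with `spark.conf.set("spark.sql.broadcastTimeout", "1200")`.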

spark.sql.autoBroadcastJoinThreshold=-1

Setting the threshold to -1 turns off automatic broadcast joins.
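A sketch of switching off automatic broadcast for a single session, assuming the job runs as Spark SQL: the helper below renders `SET` statements that can be placed before the join query. `set_statements` is a hypothetical helper name; the -1 value is the one given above.

```python
def set_statements(settings):
    """Render Spark properties as Spark SQL `SET key=value;` statements."""
    return [f"SET {key}={value};" for key, value in settings.items()]


# -1 disables automatic broadcast joins, so the planner picks another
# join strategy (typically a sort-merge join) for these tables.
disable_auto_broadcast = set_statements({"spark.sql.autoBroadcastJoinThreshold": -1})

print("\n".join(disable_auto_broadcast))
```

An explicit broadcast hint in the query still requests a broadcast join even with the threshold at -1, so any such hint on the large table would need to be removed as well.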
