Caused by: org.apache.spark.SparkException: java.util.concurrent.ExecutionException: Exception thrown by job

This post describes a failure encountered while running SQL with Hive on Spark in a test environment. The root cause turned out to be the empty table ods_knvv_full, which made the job fail; after the author removed the empty table from the query, it ran successfully. Keywords: class-loading exception, task cancellation, join operations.

Hive on Spark failed with: Caused by: org.apache.spark.SparkException: java.util.concurrent.ExecutionException: Exception thrown by job

Caused by: java.lang.ClassNotFoundException: org.antlr.runtime.tree.Tree
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
	... 54 more
2023-10-20 10:21:48,263 INFO cluster.YarnClusterScheduler: Cancelling stage 5
2023-10-20 10:21:48,265 INFO cluster.YarnClusterScheduler: Killing all running tasks in stage 5: Stage cancelled
2023-10-20 10:21:48,265 INFO scheduler.DAGScheduler: ShuffleMapStage 5 (UnionRDD (Map 1 (1), Map 6 (1))) failed in 1.204 s due to Job aborted due to stage failure: Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5.0 (TID 11, hadoop103, executor 2): UnknownReason
Driver stacktrace:
2023-10-20 10:21:48,266 ERROR client.RemoteDriver: Failed to run job ce5f418f-d59b-48c1-b3f7-96221c0c4e96
java.util.concurrent.ExecutionException: Exception thrown by job
	at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:282)
	at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:287)
	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382)
	at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 5.0 failed 4 times, most recent failure: Lost task 0.3 in stage 5.0 (TID 11, hadoop103, executor 2): UnknownReason
Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2023)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:1972)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:1971)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1971)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:950)
	at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:950)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:950)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2203)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2152)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2141)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
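
The innermost cause is the ClassNotFoundException: org.antlr.runtime.tree.Tree belongs to the ANTLR 3 runtime (antlr-runtime-3.x.jar, which Hive distributions ship under $HIVE_HOME/lib), so the Spark executors on the test cluster cannot see that jar. As a quick session-level test, the jar can be shipped with the job by hand; this is a hedged sketch, and the exact path and version are assumptions that must be checked against the installation:

add jar /opt/hive/lib/antlr-runtime-3.5.2.jar;  -- hypothetical path/version; check $HIVE_HOME/lib

Whether an added jar reaches already-running Spark executors depends on the Hive and Spark versions, so it may be necessary to reconnect and re-run the query afterwards.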

I found that even

set hive.auto.convert.join=false;

no longer helped (a few related join-conversion settings are sketched after the query below). The SQL is as follows:

with customer as (
    select kunnr,
           adrnr,
           mcod3,
           name1,
           sortl,
           stras
    from ods_kna1_full
    where dt = '2023-10-19'
    and is_deleted = 0
)
,customer_append01 as(
    select kunnr,
           dtype,
           fdliy,
           sonly,
           case
               when dtype is not null and fdliy is not null
                   then concat(dtype, '/', fdliy)
               else dtype
           end as cmtyp
    from ods_zkna1_append01_full
    where dt = '2023-10-19'
)
,so1 as (
    select kunnr,
           vwerk
    from ods_knvv_full
    where dt = '2023-10-19'
)
,address as (
    select addrnumber,
           location,
           name1,
           street,
           house_num1
    from ods_adrc_full
    where dt = '2023-10-19'
)
insert overwrite table dim_it_cutm partition (dt='2023-10-19')
select customer.kunnr,
       customer.name1,
       customer.mcod3,
       if(address.addrnumber is null, customer.stras, concat(nvl(address.street, ''), ' ', nvl(address.house_num1, ''))) stras,
       customer.adrnr,
       so1.vwerk,
       address.location,
       if(customer_append01.sonly = 'X', 'Yes', customer_append01.sonly) sonly,
       customer_append01.cmtyp
from customer
left join customer_append01 on customer.kunnr = customer_append01.kunnr
left join so1 on customer.kunnr = so1.kunnr
left join address on customer.adrnr = address.addrnumber;
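
For reference, hive.auto.convert.join controls whether Hive rewrites a shuffle join into a broadcast (map) join when one side is small enough, and an empty table is the extreme case of a small side. A hedged sketch of the settings commonly toggled together when map-join conversion misbehaves (all are standard Hive properties, but whether they help here is an assumption, since disabling auto-conversion alone did not):

set hive.auto.convert.join=false;                    -- disable automatic map-join conversion
set hive.auto.convert.join.noconditionaltask=false;  -- also disable the unconditional map-join task
set hive.mapjoin.smalltable.filesize=25000000;       -- size threshold (bytes) below which a table counts as "small"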

The data volume is small, and the same query runs fine in the prd environment, but it kept failing in test. It then occurred to me that ods_knvv_full is an empty table, so I tried removing it from the query, and the job succeeded. Why would that be?
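
A quick way to confirm that the table really is empty for the partition is a partition-level count (plain HiveQL, using only the table name already shown above):

-- if this returns 0, the left join on so1 can only ever produce NULLs
select count(*) from ods_knvv_full where dt = '2023-10-19';

The rewritten query, with the empty-table join commented out and the vwerk column stubbed with null: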

with customer as (
    select kunnr,
           adrnr,
           mcod3,
           name1,
           sortl,
           stras
    from ods_kna1_full
    where dt = '2023-10-19'
    and is_deleted = 0
)
,customer_append01 as(
    select kunnr,
           dtype,
           fdliy,
           sonly,
           case
               when dtype is not null and fdliy is not null
                   then concat(dtype, '/', fdliy)
               else dtype
           end as cmtyp
    from ods_zkna1_append01_full
    where dt = '2023-10-19'
)
,so1 as (
    select kunnr,
           vwerk
    from ods_knvv_full
    where dt = '2023-10-19'
)
,address as (
    select addrnumber,
           location,
           name1,
           street,
           house_num1
    from ods_adrc_full
    where dt = '2023-10-19'
)
insert overwrite table dim_it_cutm partition (dt='2023-10-19')
select customer.kunnr,
       customer.name1,
       customer.mcod3,
       if(address.addrnumber is null, customer.stras, concat(nvl(address.street, ''), ' ', nvl(address.house_num1, ''))) stras,
       customer.adrnr,
--     so1.vwerk,
       null as vwerk,
       address.location,
       if(customer_append01.sonly = 'X', 'Yes', customer_append01.sonly) sonly,
       customer_append01.cmtyp
from customer
left join customer_append01 on customer.kunnr = customer_append01.kunnr
-- left join so1 on customer.kunnr = so1.kunnr
left join address on customer.adrnr = address.addrnumber;
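
A plausible explanation (an assumption, not verified against the Hive source): the failure is really a classpath problem, not a data problem. With the empty table present, Hive plans the query differently (a zero-size side is a prime candidate for map-join conversion), and deserializing that plan on the Spark executors touches parser classes such as org.apache.hadoop.hive.ql.parse.ASTNode, which extends org.antlr.runtime.tree.CommonTree and therefore needs org.antlr.runtime.tree.Tree at class-load time. On the prd cluster that jar is evidently visible to the executors; on test it is not, so only the plan shape that needs it fails. Removing the empty table avoids that code path but silently drops vwerk; a more durable fix is to put the ANTLR runtime on the executor classpath. A hedged sketch (the jar path is an assumption, it must exist at that path on every worker node, and for Hive on Spark the spark.* settings are only picked up when a new Spark session starts):

set spark.executor.extraClassPath=/opt/hive/lib/antlr-runtime-3.5.2.jar;  -- hypothetical path; check $HIVE_HOME/lib

With the class resolvable on the executors, the original query with the so1 join should run unchanged, even against the empty partition.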