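Background:
A newly written Azkaban scheduled job runs fine when triggered manually, but fails with the following error when it runs on its schedule: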
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - Job failed with java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper cannot be cast to org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerDirectAccess
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - 2022-05-11 06:13:14,838 ERROR [cc03d3bf-accd-4779-b30b-613ec97f72f8 main] status.SparkJobMonitor (SessionState.java:printError(1250)) - Job failed with java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper cannot be cast to org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerDirectAccess
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - java.util.concurrent.ExecutionException: Exception thrown by job
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:282)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:287)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:382)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at java.util.concurrent.FutureTask.run(FutureTask.java:266)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
11-05-2022 06:13:14 CST dwt_to_ads_bi_user_daily_active INFO - at java.lang.Thread.run(Thread.java:748)
Running the HQL manually returns results every time.
Problem found:
This happens when the map-join hash table is empty.
Official explanation:
Solution:
Before executing the HQL, add the following command:
set hive.mapjoin.optimized.hashtable=false;
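As a minimal sketch of where the setting goes: `set` only affects the current Hive session, so it should sit in the same script (or the same `hive -e` / `hive -f` invocation) as the statement the Azkaban job runs. The query below is a hypothetical placeholder, since the original HQL is not shown; the table and column names are invented for illustration.

```sql
-- Workaround from above: disable the optimized map-join hash table for this session.
set hive.mapjoin.optimized.hashtable=false;

-- Hypothetical stand-in for the job's real statement (all names are assumptions).
insert overwrite table ads_bi_user_daily_active
select a.dt,
       count(distinct a.user_id) as active_users
from dwt_user_active a
join dim_user u
  on a.user_id = u.user_id
where a.dt = '2022-05-10'
group by a.dt;
```

Keeping the setting at the top of the job's script limits the change to this one job and leaves the cluster-wide default untouched.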