Spark Error Handling Series: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 312.0 failed 4 times, most recent failure: Lost task 0.3 in stage 312.0 TID 9203,dn-005, executor 236: java.io.FileNotFoundException: File does not exist ... It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
I. Full Error Message
- org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 312.0 failed 4 times, most recent failure: Lost task 0.3 in stage 312.0 (TID 9203,dn-005, executor 236): java.io.FileNotFoundException: File does not exist: hdfs://…/dwh/dwd/optics_i/datetime=20230107/part-00002-a0t96718-c113-4u66-a833-fb2a4dfu62dd.c000.zstd.parquet
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
at org.apache.spark.sql.execution.datasources.FileScanRDD$…
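This article analyzes the Spark error 'Job aborted due to stage failure: Task ... FileNotFoundException'. The job fails because an HDFS file that Spark expected to read no longer exists at the listed path, typically because the table's files were rewritten or deleted after Spark cached the file listing. The fix is to check whether the file has been updated or deleted and then refresh the table metadata by running the REFRESH TABLE command in SQL, or to recreate the Dataset/DataFrame involved.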
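The refresh-and-retry idea can be sketched in PySpark as follows. This is a minimal illustration, not the article's own code: the helper name `read_with_refresh`, its parameters, and the string check used to detect a stale file listing are all assumptions. It relies only on `spark.table` and `spark.catalog.refreshTable`, and it rebuilds the DataFrame after the refresh, since the error message notes that the Dataset/DataFrame may need to be recreated.

```python
def read_with_refresh(spark, table_name, action, max_retries=1):
    """Run action(df) against a table, refreshing cached metadata on stale-file errors.

    spark       -- an active SparkSession
    table_name  -- name of the table to read (hypothetical, supplied by the caller)
    action      -- callable that takes a DataFrame and triggers a Spark job
    max_retries -- how many times to refresh and retry before giving up
    """
    attempt = 0
    while True:
        try:
            # Recreate the DataFrame on every attempt so it uses the current file listing.
            return action(spark.table(table_name))
        except Exception as exc:
            message = str(exc)
            # Assumption: a stale listing surfaces with one of these markers in the message.
            stale = "FileNotFoundException" in message or "REFRESH TABLE" in message
            if not stale or attempt >= max_retries:
                raise
            # Equivalent to running: REFRESH TABLE <table_name>
            spark.catalog.refreshTable(table_name)
            attempt += 1
```

With a real session this might be called as `read_with_refresh(spark, "dwd.optics_i", lambda df: df.count())`, where the table name is a guess based on the path in the log. If the file is still missing after the refresh, the error is re-raised, which usually means an upstream job is overwriting or deleting the partition while it is being read.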



