Problem
During development I ran into a `Failed to get broadcast_0_piece0 of broadcast_0` exception:
20/07/03 15:58:50 ERROR Utils: Exception encountered
org.apache.spark.SparkException: Failed to get broadcast_0_piece0 of broadcast_0
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:178)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:150)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:150)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:150)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:222)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:139)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:212)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Cause & Fix
After checking the worker nodes, I found the cause: the first job had not finished executing when a second SparkContext was opened, which invalidated the broadcast blocks of the first.
The fix is to make the SparkContext a singleton. The singleton code is shown below (eager initialization via a static block):
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// The Spark configuration should be written as a singleton,
// so the whole application shares one JavaSparkContext.
public class SeanSparkConfig {
    public static JavaSparkContext sparkContext;
    public static String exameDateFilePath = "data/demos/exam/exam.txt";

    static {
        // Eager initialization: the context is created exactly once, when the class loads.
        // Flip this flag to submit to a standalone cluster instead of running locally.
        boolean runLocalFlag = true;
        SparkConf sparkconf;
        if (runLocalFlag) {
            sparkconf = new SparkConf().setAppName("Example").setMaster("local[2]");
        } else {
            sparkconf = new SparkConf().setAppName("Example")
                    .setMaster("spark://localhost:7077")
                    .setJars(new String[] {"/Users/sean/Documents/Gitrep/bigdata/spark/target/spark-demo.jar"});
        }
        sparkContext = new JavaSparkContext(sparkconf);
    }
}
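As an alternative to the eager static-block approach above, the same one-instance guarantee can be obtained lazily with Java's initialization-on-demand holder idiom: the JVM's class-loading rules make the first `getInstance()` call thread-safe without explicit locking. A minimal sketch of the idiom follows; the `LazySingleton` class is hypothetical, standing in for a class that would hold the real `JavaSparkContext`:

```java
// Initialization-on-demand holder idiom: Holder.INSTANCE is created only
// when getInstance() is first called, and JVM class-initialization rules
// guarantee this happens exactly once, even under concurrent access.
public class LazySingleton {
    private LazySingleton() {}  // prevent outside instantiation

    private static class Holder {
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    public static LazySingleton getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Every call returns the same instance.
        System.out.println(LazySingleton.getInstance() == LazySingleton.getInstance());
    }
}
```

For a long-lived context the eager version in the post is equally fine; the holder idiom mainly helps when you want to defer the (expensive) context creation until it is actually needed.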
Usage: simply reference the singleton directly.
JavaRDD<String> examLinesRDD = SeanSparkConfig.sparkContext.textFile(SeanSparkConfig.exameDateFilePath);