apache spark java: Spark job fails with java.io.NotSerializableException: org.apache.spark.SparkContext...

I am facing the above exception when I try to apply a method (computeDwt) on an RDD[(Int, ArrayBuffer[(Int, Double)])] input.

I am even using the extends Serialization option to serialize objects in Spark. Here is the code snippet.

input: series: RDD[(Int, ArrayBuffer[(Int, Double)])]

DWTsample extends Serialization is a class having a computeDwt function.

sc: SparkContext

val kk: RDD[(Int, List[Double])] = series.map(t => (t._1, new DWTsample().computeDwt(sc, t._2)))

Error:

org.apache.spark.SparkException: Job failed: java.io.NotSerializableException: org.apache.spark.SparkContext
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:556)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:503)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:361)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
    at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)

Could anyone suggest what the problem might be and what should be done to overcome this issue?

Solution

The line

series.map(t => (t._1, new DWTsample().computeDwt(sc, t._2)))

references the SparkContext (sc) but SparkContext isn't serializable. SparkContext is designed to expose operations that are run on the driver; it can't be referenced/used by code that's run on workers.

You'll have to restructure your code so that sc isn't referenced inside your map function closure.
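For illustration, here is a minimal sketch of one possible restructuring, assuming computeDwt does not actually need the SparkContext and can operate on the local ArrayBuffer alone (the class body below is a hypothetical placeholder, not the asker's actual implementation):

    import scala.collection.mutable.ArrayBuffer
    import org.apache.spark.rdd.RDD

    // Hypothetical sketch: drop the SparkContext parameter so the map
    // closure only captures serializable objects.
    class DWTsample extends Serializable {
      def computeDwt(data: ArrayBuffer[(Int, Double)]): List[Double] = {
        // ... the actual DWT computation on the local buffer goes here ...
        data.map(_._2).toList // placeholder body
      }
    }

    // sc is no longer captured by the closure; only the DWTsample
    // instance is serialized and shipped to the workers.
    val kk: RDD[(Int, List[Double])] =
      series.map(t => (t._1, new DWTsample().computeDwt(t._2)))

If computeDwt genuinely needs Spark operations, those should instead be moved out of the map and expressed as separate RDD transformations issued from the driver.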
