[Spark] object not serializable (class: A)

 

The exception message is as follows:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:345)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:335)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
    at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:371)
    at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:370)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
    at org.apache.spark.rdd.RDD.map(RDD.scala:370)
    at com.sangfor.sdp.hbase.bulkload.BulkLoadData$$anonfun$main$1.apply$mcVI$sp(BulkLoadData.scala:86)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    at com.sangfor.sdp.hbase.bulkload.BulkLoadData$.main(BulkLoadData.scala:84)
    at com.sangfor.sdp.hbase.bulkload.BulkLoadData.main(BulkLoadData.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.NotSerializableException: com.sdp.hbase.entity.IndexRowkeyMetaData
Serialization stack:
    - object not serializable (class: com.sangfor.sdp.hbase.entity.IndexRowkeyMetaData, value: com.sangfor.sdp.hbase.entity.IndexRowkeyMetaData@4745bcc6)
    - field (class: com.sangfor.sdp.hbase.bulkload.BulkLoadData$$anonfun$main$1$$anonfun$5, name: indexRowkeyMetaData$1, type: class com.sangfor.sdp.hbase.entity.IndexRowkeyMetaData)
    - object (class com.sangfor.sdp.hbase.bulkload.BulkLoadData$$anonfun$main$1$$anonfun$5, <function1>)
    at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
    at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:342)
    ... 23 more

 

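This happens because, when Spark distributes a task, it has to serialize the function passed to the operation (here, the one given to rdd.map) together with every object that function references, so that they can be sent from the driver to the executors. If one of those objects is not serializable, as with IndexRowkeyMetaData in the trace above, it cannot be sent across processes and Spark fails with "Task not serializable".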
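
The failure can be reproduced without a cluster. The following is a minimal sketch, assuming a stand-in IndexRowkeyMetaData class whose single field is hypothetical (the real class is not shown in the original). It Java-serializes a function that captures a non-serializable object, which is roughly what the JavaSerializerInstance.serialize call in the trace does to the closure. The names ClosureCheck, trySerialize and buildRowkeyFn are made up for illustration.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for the entity in the stack trace; the field is an assumption.
// Note that it does NOT implement java.io.Serializable.
class IndexRowkeyMetaData(val prefix: String)

object ClosureCheck {
  // Java-serialize a value into memory.
  // Right(number of bytes) on success, Left(exception message) on failure.
  def trySerialize(value: AnyRef): Either[String, Int] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try {
      out.writeObject(value)
      out.flush()
      Right(buffer.size())
    } catch {
      case e: NotSerializableException => Left(e.getMessage)
    } finally {
      out.close()
    }
  }

  // The returned function captures `meta`, much like the function passed to
  // rdd.map captured indexRowkeyMetaData in the trace.
  def buildRowkeyFn(meta: IndexRowkeyMetaData): Int => String =
    (i: Int) => meta.prefix + "_" + i

  def main(args: Array[String]): Unit = {
    val fn = buildRowkeyFn(new IndexRowkeyMetaData("idx"))
    println(fn(1))            // the function works locally
    println(trySerialize(fn)) // but serializing it fails on the captured object
  }
}
```

The function itself runs fine in the local JVM; it is only the attempt to serialize it, together with the captured object, that fails.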

There are two solutions:

One is to have the entity class implement the java.io.Serializable interface.
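
As a rough illustration of this first option, the sketch below declares a hypothetical version of the entity class (its two fields are assumptions, not the real definition) that implements java.io.Serializable, and checks that an instance survives a Java serialization round trip. SerializableFix and roundTrip are illustrative names only.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical version of the entity class; the fields are assumptions.
// Implementing java.io.Serializable is the actual fix.
class IndexRowkeyMetaData(val table: String, val indexColumn: String)
  extends java.io.Serializable

object SerializableFix {
  // Serialize a value to bytes and read it back, as happens when an object
  // is shipped from one JVM to another.
  def roundTrip[T <: AnyRef](value: T): T = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()

    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val copy = roundTrip(new IndexRowkeyMetaData("t_index", "user_id"))
    println(copy.table + " / " + copy.indexColumn)
  }
}
```

Every non-transient field of the class has to be serializable as well; otherwise the same NotSerializableException is raised, this time naming the field's class.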

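The other is to switch Spark to Kryo serialization and register the class:

sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

sparkConf.registerKryoClasses(Array(classOf[com.sdp.hbase.entity.IndexRowkeyMetaData]))

This specifies the serializer Spark uses and the classes registered with it. Note that Spark serializes closures with Java serialization regardless of spark.serializer, so this setting governs data records (for example shuffled or cached ones). An object captured by a closure, as in the trace above, most likely still has to implement java.io.Serializable.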
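
For completeness, here is a sketch of the Kryo configuration with its import, assuming Spark is on the classpath. The IndexRowkeyMetaData class below is a hypothetical stand-in for the real entity, the application name is only a guess taken from the class name in the stack trace, and KryoConf and build are illustrative names.

```scala
import org.apache.spark.SparkConf

// Hypothetical stand-in for com.sdp.hbase.entity.IndexRowkeyMetaData; fields are assumptions.
class IndexRowkeyMetaData(val table: String, val indexColumn: String)

object KryoConf {
  // Build a SparkConf that uses Kryo for data serialization and registers the entity class.
  def build(): SparkConf =
    new SparkConf()
      .setAppName("BulkLoadData") // assumed name, not given in the original
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[IndexRowkeyMetaData]))

  def main(args: Array[String]): Unit = {
    val conf = build()
    // Print the effective serializer setting to confirm the configuration took effect.
    println(conf.get("spark.serializer"))
  }
}
```

The resulting SparkConf would then be passed to the SparkContext or SparkSession builder as usual.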

 

Reposted from: https://www.cnblogs.com/yankang/p/10582686.html
