Spark Streaming: "This is because the DStream object is being referred to from within the closure"

Yesterday I reorganized the project code, and then it started failing with:

ERROR StreamingContext: Error starting the context, marking it as stopped

java.io.NotSerializableException: DStream checkpointing has been enabled but the DStreams with their functions are not serializable

com.gac.xs6.core.impl.EnAdapterImpl$

java.io.NotSerializableException: Object of org.apache.spark.streaming.kafka010.DirectKafkaInputDStream$DirectKafkaInputDStreamCheckpointData is being serialized possibly as a part of closure of an RDD operation. This is because the DStream object is being referred to from within the closure. Please rewrite the RDD operation inside this DStream to avoid this. This has been enforced to avoid bloating of Spark tasks with unnecessary objects.

19/06/05 10:03:05 org.apache.spark.internal.Logging$class.logError(Logging.scala:91) ERROR Utils: Exception encountered

java.io.NotSerializableException: Object of org.apache.spark.streaming.dstream.DStreamCheckpointData is being serialized possibly as a part of closure of an RDD operation. This is because the DStream object is being referred to from within the closure. Please rewrite the RDD operation inside this DStream to avoid this. This has been enforced to avoid bloating of Spark tasks with unnecessary objects.

The cause might be a serialization issue with the MySQL database configuration:

@Singleton
class MysqlConfiguration extends Serializable {

  private val config: Config = ConfigFactory.load()

  lazy val mysqlConf: Config = config.getConfig("mysql")
  lazy val mysqlJdbcUrl: String = mysqlConf.getString("jdbcUrl")
  lazy val mysqlJdbcDriver: String = mysqlConf.getString("jdbcDriver")
  lazy val dataSourceUser: String = mysqlConf.getString("dataSource.user")
  lazy val dataSourcePassword: String = mysqlConf.getString("dataSource.password")
  lazy val dataSourceSize: Int = mysqlConf.getInt("dataSource.size")
}

But I had already made this class serializable, as shown above.

@Singleton
class NationApplication @Inject()(sparkConf: SparkConfiguration,
                                  kafkaConf: KafkaConfiguration,
                                  hbaseConf: HbaseConfiguration,
                                  sparkContext: NaSparkContext[SparkContext],
                                  mysqlConf: MysqlConfiguration,
                                  source: NaDStreamSource[Option[(String, String)]],
                                  adapter: SourceDecode,
                                  sanitizer: DataSanitizer,
                                  eventsExact: EventsExact,
                                  eventsCheck: EventsCheck,
                                  anomalyExact: AnomalyExact,
                                  anomalyDetection: AnomalyDetection
                                 ) extends Serializable with Logging {

I added the MySQL configuration to this class, and then referenced mysqlConf inside eventStream.foreachRDD:

def saveAnomaly(eventStream: DStream[(String, EventUpdate)]): Unit = {
  eventStream.foreachRDD((x: RDD[(String, EventUpdate)]) => {
    if (!x.isEmpty()) {
      x.foreachPartition { res: Iterator[(String, EventUpdate)] => {
        // val mysqlConf = new MysqlConfiguration
        val c = new GacJdbcUtil(mysqlConf) // obtain the MySQL database configuration
        ...
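The root cause is how Scala closures capture fields: referencing `mysqlConf` inside `foreachPartition` compiles to `this.mysqlConf`, so the closure captures the entire `NationApplication`, including its non-serializable DStream members. A minimal, Spark-free sketch (all class names here are hypothetical stand-ins, not the real project classes) reproduces the failure with plain Java serialization:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for a non-serializable member such as a DStream source
class DStreamSource

class Conf extends Serializable

class App extends Serializable {
  val source = new DStreamSource // non-serializable field, like a DStream
  val conf   = new Conf          // serializable field, like MysqlConfiguration

  // Referencing the field `conf` really means `this.conf`,
  // so this closure captures the whole App (and its DStreamSource)
  def badClosure: () => Conf = () => conf

  // Copying the field to a local val first captures only the Conf itself
  def goodClosure: () => Conf = {
    val localConf = conf
    () => localConf
  }
}

object ClosureDemo {
  // Returns true if the object survives Java serialization
  def serializable(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch { case _: NotSerializableException => false }

  def main(args: Array[String]): Unit = {
    val app = new App
    println(serializable(app.badClosure))  // false: closure dragged in `this`
    println(serializable(app.goodClosure)) // true: only the Conf was captured
  }
}
```

This is the same check Spark performs when checkpointing is enabled, which is why the error appears at `StreamingContext` start rather than at runtime.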

Conclusion:

Inside foreachRDD, create a fresh configuration with `new` instead of using the one passed in through the class. Referencing the constructor parameter mysqlConf inside the closure captures the enclosing NationApplication, including its DStream members, which is exactly what the NotSerializableException complains about.
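Two safe patterns fit this conclusion, sketched below without Spark dependencies. `JdbcHelper`, the hardcoded config values, and the `(String, String)` tuples are hypothetical stand-ins for `GacJdbcUtil`, the real ConfigFactory-backed settings, and `(String, EventUpdate)` records:

```scala
// Simplified stand-in for GacJdbcUtil: pretends to write rows and reports the count
class JdbcHelper(url: String, user: String, password: String) {
  def save(rows: Iterator[(String, String)]): Int = rows.length
}

// Simplified stand-in for MysqlConfiguration with hardcoded values
class MysqlConfiguration extends Serializable {
  lazy val mysqlJdbcUrl: String = "jdbc:mysql://localhost:3306/test"
  lazy val dataSourceUser: String = "root"
  lazy val dataSourcePassword: String = "secret"
}

object SaveFix {
  // Pattern 1: copy the needed fields into local vals BEFORE building the
  // closure, so it captures plain Strings instead of `this`
  def partitionWriter(conf: MysqlConfiguration): Iterator[(String, String)] => Int = {
    val url      = conf.mysqlJdbcUrl
    val user     = conf.dataSourceUser
    val password = conf.dataSourcePassword
    rows => new JdbcHelper(url, user, password).save(rows)
  }

  // Pattern 2 (the post's fix): construct the configuration inside the
  // partition function itself, so nothing from the driver is captured
  val freshConfWriter: Iterator[(String, String)] => Int = { rows =>
    val conf = new MysqlConfiguration
    new JdbcHelper(conf.mysqlJdbcUrl, conf.dataSourceUser, conf.dataSourcePassword).save(rows)
  }
}
```

Pattern 2 is what the commented-out `val mysqlConf = new MysqlConfiguration` line in the original code enables; pattern 1 keeps dependency injection but still avoids capturing the enclosing class.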
