How to fix the java.lang.NullPointerException thrown when referencing a SparkSession inside a Spark operator

The exception

My code is as follows:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SparkSession}

object Test {
	def main(args: Array[String]): Unit = {
		val sparkConf = new SparkConf().setAppName("test").setMaster("local[*]")
		val sc = new SparkContext(sparkConf)

		val spark = SparkSession.builder().appName("test").master("local[*]").getOrCreate()

		val df_tb_test1: DataFrame = Prop.readTab("tb_test1", spark)

		df_tb_test1.foreach(cols => {
			val col1 = cols.getString(0)
			Prop.readTab("tb_test2", spark)	// using the SparkSession again inside this foreach
		})
	}
}

When the spark object is used inside a Spark operator like this, the job fails with a NullPointerException:

java.lang.NullPointerException
	at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:135)
	at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:133)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:632)
	at com.mycase.test.TestSpark$.queryPersonByAge(TestSpark.scala:39)
	at com.mycase.test.TestSpark$$anonfun$1.apply(TestSpark.scala:28)
	at com.mycase.test.TestSpark$$anonfun$1.apply(TestSpark.scala:27)
	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
	at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
	at scala.collection.AbstractIterator.to(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
	at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
	at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:936)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1$$anonfun$13.apply(RDD.scala:936)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
	at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2069)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
	at org.apache.spark.scheduler.Task.run(Task.scala:108)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
18/12/29 21:44:32 ERROR Executor: Exception in task 2.0 in stage 0.0 (TID 2)
18/12/29 21:44:32 ERROR Executor: Exception in task 7.0 in stage 0.0 (TID 7)

Cause of the exception

To use a SparkSession inside a Spark operator, we first have to be clear about where the operator and the SparkSession each live:

[Figure: the SparkSession lives on the driver, while operator closures run inside the executor on each worker]

The SparkSession clearly exists only on the driver, while our operators run inside the executor on each worker. When the job executes, the operator's closure is serialized and shipped to the executors, but the SparkSession's transient internals (such as the sessionState that the stack trace points at) do not travel with it. So although the closure still holds a SparkSession reference on the executor side, its internals are null, and calling any method on it throws a NullPointerException.
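
To make this split visible, here is a minimal, self-contained sketch (the object name and the toy dataset are mine, and it assumes local[*] mode) that prints which thread the driver code and the operator closure each run on:

import org.apache.spark.sql.SparkSession

object WhereDoesItRun {
	def main(args: Array[String]): Unit = {
		val spark = SparkSession.builder().appName("where-does-it-run").master("local[*]").getOrCreate()

		println(s"driver code runs on:   ${Thread.currentThread().getName}")	// prints "main"

		spark.range(4).rdd.foreachPartition { _ =>
			// this closure is serialized and executed by executor task threads,
			// not by the driver's main thread
			println(s"operator code runs on: ${Thread.currentThread().getName}")
		}

		spark.stop()
	}
}

The line inside foreachPartition prints from executor task threads (named like "Executor task launch worker-0") rather than from main: that is the environment operator code runs in, and where the shipped SparkSession's internals come up null.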

Solutions

Here are two ways I can offer to deal with this:

  1. collect the data you want to foreach over first, turning it into a local array, then loop over that array on the driver (note the caveat after the code)
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

object Test {
	def main(args: Array[String]): Unit = {
		val sparkConf = new SparkConf().setAppName("test").setMaster("local[*]")
		val sc = new SparkContext(sparkConf)

		val spark = SparkSession.builder().appName("test").master("local[*]").getOrCreate()

		// collect brings every (col1, col2) pair back to the driver as a local array
		val arr_tb_test1: Array[(String, String)] = Prop.readTab("tb_test1", spark).rdd
			.map(row => (row(0).toString, row(1).toString))
			.collect()

		// this foreach now runs on the driver, where the SparkSession is valid
		arr_tb_test1.foreach(cols => {
			val col1 = cols._1
			Prop.readTab("tb_test2", spark)
		})
	}
}
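
One caveat with this approach: collect materializes the whole of tb_test1 in driver memory, and the subsequent loop runs serially on the driver, so it is only practical when the table is reasonably small.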
  2. put spark and sc into a singleton object, and obtain them through that object
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SparkSession}

class Demo {
	val sparkConf = new SparkConf().setAppName("test").setMaster("local[*]")
	val sc = new SparkContext(sparkConf)

	val spark = SparkSession.builder().appName("test").master("local[*]").getOrCreate()
}

object Demo {
	private val demo = new Demo
	def getContext: Demo = demo
}

object Test {
	def main(args: Array[String]): Unit = {
		val df_tb_test1: DataFrame = Prop.readTab("tb_test1", Demo.getContext.spark)
		df_tb_test1.foreach(cols => {
			val col1 = cols.getString(0)
			Prop.readTab("tb_test2", Demo.getContext.spark)	// fetch the SparkSession through the singleton inside the foreach
		})
	}
}
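
A note of caution on the singleton approach: the Demo object is initialized lazily in whichever JVM first touches it. In local[*] mode the driver and executors share a single JVM, so Demo.getContext.spark inside the foreach resolves to the already-built session and works. On a real cluster, each executor JVM would try to build its own SparkContext and SparkSession, which Spark does not generally support, so treat this pattern as a local-mode workaround rather than a cluster-safe design.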