Spark SQL: NullPointerException when calling save()

I've been studying Spark SQL recently and ran into a problem I don't know how to approach. Without further ado, here's the code:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class MyGenericLoadAndSave {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setMaster("local")
                .setAppName("MyGenericLoadAndSave");
        SparkContext sc = new SparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        // Load the sample Parquet file and project two columns
        DataFrame df = sqlContext.read().load("E:\\data\\users.parquet");
        DataFrame result = df.select("name", "favorite_color");
        System.out.println("result:" + result);

        if (result != null && result.write() != null) {
            System.out.println("write:" + result.write());
            // Write the projection back out (Parquet is the default format)
            result.write().save("E:\\data\\usersResult.parquet");
        }
    }
}

The expected result is a new output file under the data directory on the E: drive, but what actually happened was a bit of a surprise. Again, without further ado, here's the exception:

result:[name: string, favorite_color: string]
write:org.apache.spark.sql.DataFrameWriter@24f545bc
16/04/11 17:18:08 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 86.4 KB, free 86.4 KB)
16/04/11 17:18:08 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 19.3 KB, free 105.7 KB)
16/04/11 17:18:08 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:51207 (size: 19.3 KB, free: 2.4 GB)
16/04/11 17:18:08 INFO SparkContext: Created broadcast 1 from save at MyGenericLoadAndSave.java:27
16/04/11 17:18:09 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 225.3 KB, free 331.0 KB)
16/04/11 17:18:09 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 19.3 KB, free 350.3 KB)
16/04/11 17:18:09 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:51207 (size: 19.3 KB, free: 2.4 GB)
16/04/11 17:18:09 INFO SparkContext: Created broadcast 2 from save at MyGenericLoadAndSave.java:27
Exception in thread "main" java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
    at org.apache.hadoop.fs.FileUtil.execCommand(FileUtil.java:1097)
    at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:582)
    at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getPermission(RawLocalFileSystem.java:557)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$buildInternalScan$1$$anon$1$$anonfun$9.apply(ParquetRelation.scala:344)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$buildInternalScan$1$$anon$1$$anonfun$9.apply(ParquetRelation.scala:337)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$buildInternalScan$1$$anon$1.<init>(ParquetRelation.scala:337)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$buildInternalScan$1.apply(ParquetRelation.scala:327)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anonfun$buildInternalScan$1.apply(ParquetRelation.scala:327)
    at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2189)
    at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation.buildInternalScan(ParquetRelation.scala:326)
    at org.apache.spark.sql.sources.HadoopFsRelation.buildInternalScan(interfaces.scala:661)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:131)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$10.apply(DataSourceStrategy.scala:131)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:292)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$pruneFilterProject$1.apply(DataSourceStrategy.scala:291)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.pruneFilterProjectRaw(DataSourceStrategy.scala:370)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.pruneFilterProject(DataSourceStrategy.scala:287)
    at org.apache.spark.sql.execution.datasources.DataSourceStrategy$.apply(DataSourceStrategy.scala:127)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:59)
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:47)
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:45)
    at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:52)
    at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:52)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:53)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation.run(InsertIntoHadoopFsRelation.scala:108)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:256)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
    at main.java.local.loadAndSave.MyGenericLoadAndSave.main(MyGenericLoadAndSave.java:27)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
16/04/11 17:18:09 INFO SparkContext: Invoking stop() from shutdown hook

What a glaring NullPointerException. No big deal, though: we've all been dealing with NPEs since we first started learning Java, so this should be a piece of cake. A null pointer simply means one of your objects is null. And yet:

result:[name: string, favorite_color: string]
write:org.apache.spark.sql.DataFrameWriter@24f545bc

The objects I printed are clearly not null, so the NPE must be thrown somewhere inside the save() call itself. Is this a Spark problem? I'm fairly confused at this point, so any criticism or pointers are welcome.
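
One observation from the stack trace: the NPE is thrown from java.lang.ProcessBuilder.start inside org.apache.hadoop.util.Shell.runCommand while Hadoop's RawLocalFileSystem checks local file permissions, not from the DataFrame or DataFrameWriter objects. On Windows this pattern is often associated with a missing winutils.exe / unset hadoop.home.dir rather than a null object in user code. Below is a minimal, speculative sketch of that workaround; the diagnosis is only a guess from the trace, and the path E:\\hadoop (whose bin subfolder would contain winutils.exe) is a hypothetical example, not something from the original post.

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class MyGenericLoadAndSaveWithHadoopHome {

    public static void main(String[] args) {
        // Hypothetical path: point hadoop.home.dir at a directory whose bin\
        // subfolder contains winutils.exe, before any Spark/Hadoop code runs.
        System.setProperty("hadoop.home.dir", "E:\\hadoop");

        SparkConf conf = new SparkConf()
                .setMaster("local")
                .setAppName("MyGenericLoadAndSave");
        SparkContext sc = new SparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc);

        DataFrame df = sqlContext.read().load("E:\\data\\users.parquet");
        DataFrame result = df.select("name", "favorite_color");

        // Same save() call as above; if the NPE really comes from Hadoop's
        // shell-based permission check on Windows, it should no longer fire here.
        result.write().save("E:\\data\\usersResult.parquet");

        sc.stop();
    }
}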
