I am trying to convert a Spark RDD to a Spark SQL DataFrame using toDF(). I have used this function successfully many times, but in this case I get a compiler error:
error: value toDF is not a member of org.apache.spark.rdd.RDD[com.example.protobuf.SensorData]
Here is my code:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
// SensorData is an auto-generated class
import com.example.protobuf.SensorData

object MyApplication {
  // Loads the sensor data; implementation omitted here
  def loadSensorDataToRdd(): RDD[SensorData] = ???

  def main(argv: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("My application")
    conf.set("io.compression.codecs", "com.hadoop.compression.lzo.LzopCodec")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
    val sensorDataRdd = loadSensorDataToRdd()
    val sensorDataDf = sensorDataRdd.toDF() // <-- the compiler error is reported here
  }
}
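My guess is that the problem lies with the SensorData class, which is a Java class auto-generated from a Protocol Buffer definition. What do I need to do to convert the RDD to a DataFrame?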
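
For context on that guess: in the Spark 1.x API, the toDF() conversion brought in by sqlContext.implicits._ applies to RDDs whose element type is a Scala Product (a case class or tuple), which a generated Java class is not. Below is a minimal sketch of one possible workaround, not a confirmed fix: map each element to a case class first. The SensorData class in the sketch is only a stand-in so that the block compiles on its own, and the getters getSensorId and getValue, the case class SensorRow, and the object ToDfSketch are all hypothetical names; the real generated class will have different accessors.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SQLContext}

// Stand-in for the generated protobuf class (assumption: the real class is a
// plain Java class that does not extend scala.Product). Getter names are hypothetical.
class SensorData(id: String, value: Double) extends Serializable {
  def getSensorId: String = id
  def getValue: Double = value
}

// Case class mirroring the fields wanted as DataFrame columns.
// It is declared at top level, not inside a method, so that a TypeTag is available.
case class SensorRow(sensorId: String, value: Double)

object ToDfSketch {
  // Copy each SensorData into a SensorRow, after which toDF() resolves.
  def toDataFrame(sqlContext: SQLContext, rdd: RDD[SensorData]): DataFrame = {
    import sqlContext.implicits._
    rdd.map(d => SensorRow(d.getSensorId, d.getValue)).toDF()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("toDF sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    val rdd = sc.parallelize(Seq(new SensorData("a", 1.0), new SensorData("b", 2.5)))
    val df = toDataFrame(sqlContext, rdd)
    df.printSchema() // column names come from the SensorRow field names

    sc.stop()
  }
}
```

The explicit mapping keeps the column names and types under the author's control, at the cost of listing each field by hand; whether that is practical depends on how many fields the real message has.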