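According to the official Spark 1.6.0 documentation, an RDD can be converted to a DataFrame in two ways.
1. Using reflection
First obtain an RDD of some entity class (a case class), then call toDF() on it.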
// Requires: case class Person(name: String, age: Int) and import sqlContext.implicits._
val people = sc.textFile("examples/src/main/resources/people.txt")
  .map(_.split(","))
  .map(p => Person(p(0), p(1).trim.toInt))
  .toDF()
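The snippet above depends on a few pieces that are not shown. The following is a minimal self-contained sketch of the reflection approach, assuming a local master and a small in-memory data set in place of people.txt; the object name ReflectionExample, the helper parseLine, and the sample records are hypothetical and exist only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Defined at top level: a case class declared inside the method that calls
// toDF() can fail to compile because no TypeTag is available for it.
case class Person(name: String, age: Int)

object ReflectionExample {
  // Hypothetical helper: parses one "name, age" record into a Person.
  def parseLine(line: String): Person = {
    val p = line.split(",")
    Person(p(0), p(1).trim.toInt)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ReflectionExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // Brings the implicit RDD-to-DataFrame conversion (toDF) into scope.
    import sqlContext.implicits._

    // Assumed sample data standing in for people.txt.
    val lines = sc.parallelize(Seq("Michael, 29", "Andy, 30"))
    val people = lines.map(parseLine).toDF()

    // The column names and types are inferred from the fields of Person.
    people.printSchema()
    people.show()
    sc.stop()
  }
}
```

In spark-shell, sc and sqlContext already exist, so only the case class, the implicits import, and the toDF() call are needed.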
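2. Specifying the schema programmatically
  1. Create an RDD of Rows from the original RDD.
  2. Create a schema, represented by a StructType, that matches the structure of the Rows.
  3. Apply the schema to the RDD of Rows with sqlContext.createDataFrame(rowRDD, schema).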
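import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StructType, StructField, StringType}

// Here people is the raw RDD[String], not the DataFrame from approach 1.
val people = sc.textFile("examples/src/main/resources/people.txt")
// The schema is encoded in a string.
val schemaString = "name age"
// Generate the schema based on schemaString.
val schema =
  StructType(
    schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))
// Convert records of the RDD (people) to Rows.
val rowRDD = people.map(_.split(",")).map(p => Row(p(0), p(1).trim))
// Apply the schema to the RDD.
val peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema)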
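The example above declares every column as StringType. As a variation, the three steps can also be sketched with a typed schema and a follow-up SQL query. This is an illustrative sketch only, assuming a local master and in-memory sample data; the object name ProgrammaticSchemaExample, the helper toRow, and the sample records are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object ProgrammaticSchemaExample {
  // Step 2: a schema whose field order and types match the Rows built by toRow.
  val schema: StructType = StructType(Seq(
    StructField("name", StringType, nullable = true),
    StructField("age", IntegerType, nullable = true)))

  // Hypothetical helper: the age is converted to Int so that it matches IntegerType.
  def toRow(line: String): Row = {
    val p = line.split(",")
    Row(p(0), p(1).trim.toInt)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ProgrammaticSchemaExample").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Assumed sample data standing in for people.txt.
    val lines = sc.parallelize(Seq("Michael, 29", "Andy, 30"))

    // Step 1: build the RDD of Rows.
    val rowRDD = lines.map(toRow)
    // Step 3: apply the schema to the RDD of Rows.
    val peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema)

    // Register the DataFrame as a temporary table and query it with SQL.
    peopleDataFrame.registerTempTable("people")
    val names = sqlContext.sql("SELECT name FROM people WHERE age > 29")
      .collect()
      .map(_.getString(0))
    names.foreach(println)
    sc.stop()
  }
}
```

The values in each Row must agree with the declared types; a mismatch such as a String value under IntegerType is usually reported only when the data is actually read, not when createDataFrame is called.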