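List("a", "b", "c", "d") is a list of four elements, each of which is treated as a record with a single field, so the result set shows one element per row.
To get the expected output, a row has to contain four fields. So we wrap the values in a tuple, List(("a", "b", "c", "d")), which represents one row with four fields.
In the same way, a list with two rows is written as List(("a1", "b1", "c1", "d1"), ("a2", "b2", "c2", "d2")).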
scala> val list = sc.parallelize(List(("a", "b", "c", "d"))).toDF()
list: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: string, _4: string]
scala> list.show
+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
|  a|  b|  c|  d|
+---+---+---+---+
scala> val list = sc.parallelize(List(("a1","b1","c1","d1"),("a2","b2","c2","d2"))).toDF
list: org.apache.spark.sql.DataFrame = [_1: string, _2: string, _3: string, _4: string]
scala> list.show
+---+---+---+---+
| _1| _2| _3| _4|
+---+---+---+---+
| a1| b1| c1| d1|
| a2| b2| c2| d2|
+---+---+---+---+
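
The difference between the two shapes can also be checked in plain Scala, without Spark. The following is only a minimal sketch: the object name RowShapes and the helper toSingleRow are hypothetical and are not part of the original answer. It shows that the flat list has four one-value elements, while the wrapped list has a single element with four fields, which is what ends up as one row with columns _1 to _4.

```scala
object RowShapes {
  // A flat list: four elements, each a single String,
  // which corresponds to four records with one field each.
  val flat: List[String] = List("a", "b", "c", "d")

  // A list holding one 4-tuple: a single element with four fields,
  // which corresponds to one record with four fields.
  val wrapped: List[(String, String, String, String)] =
    List(("a", "b", "c", "d"))

  // Hypothetical helper (an assumption for illustration): convert a flat
  // four-element list into the single-tuple form, or None if the list
  // does not have exactly four elements.
  def toSingleRow(xs: List[String]): Option[List[(String, String, String, String)]] =
    xs match {
      case List(a, b, c, d) => Some(List((a, b, c, d)))
      case _                => None
    }

  def main(args: Array[String]): Unit = {
    println(s"flat: ${flat.size} records")
    println(s"wrapped: ${wrapped.size} record(s) with ${wrapped.head.productArity} fields")
  }
}
```

If the input arrives as a flat list, something like toSingleRow could be applied before calling sc.parallelize(...).toDF, assuming the list is known to hold exactly four values.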