I. Converting a Java List to a DataFrame or Dataset
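    import scala.beans.BeanProperty

    // createDataFrame(list, beanClass) discovers columns through JavaBean getters,
    // so each field is annotated with @BeanProperty to generate them.
    case class CaseJava(
      @BeanProperty var num: String,
      @BeanProperty var id: String,
      @BeanProperty var start_time: String,
      @BeanProperty var istop_time: String)

    // Assumes an existing SparkSession named `spark`.
    val listData: java.util.List[CaseJava] = new java.util.ArrayList[CaseJava]
    listData.add(new CaseJava("11", "22", "33", "44"))

    val dataFrame = spark.createDataFrame(listData, classOf[CaseJava])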
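The heading also mentions Dataset, so here is a minimal sketch of building a typed Dataset straight from a `java.util.List`. The names `Trip`, `JavaListToDataset`, and `toDataset` are hypothetical and used only for illustration; the sketch assumes Spark 2.x or later, where `SparkSession.createDataset` has an overload that accepts a `java.util.List` together with an `Encoder`.

```scala
import java.util.{ArrayList => JArrayList, List => JList}
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Trip(num: String, id: String)

object JavaListToDataset {
  // Builds a typed Dataset from a Java list, passing the encoder explicitly.
  def toDataset(spark: SparkSession, data: JList[Trip]): Dataset[Trip] =
    spark.createDataset(data)(Encoders.product[Trip])

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("JavaListToDataset")
      .master("local[2]")
      .getOrCreate()

    val listData: JList[Trip] = new JArrayList[Trip]()
    listData.add(Trip("11", "22"))

    val ds = toDataset(spark, listData)
    ds.show()

    // A Dataset of a case class converts to an untyped DataFrame with toDF().
    ds.toDF().printSchema()

    spark.stop()
  }
}
```

Passing `Encoders.product[Trip]` explicitly keeps the helper independent of `import spark.implicits._`; with that import in scope, the encoder argument could be left implicit instead.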
II. Converting a Scala MutableList to a DataFrame or Dataset
1. Method 1:
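    import org.apache.spark.sql.SparkSession
    import scala.collection.mutable

    case class TestPerson(name: String, age: Long, salary: Double)

    val spark = SparkSession.builder().appName("Spark-SQL").master("local[2]").getOrCreate()
    import spark.implicits._

    val tom = TestPerson("Tom Hanks", 37, 35.5)
    val sam = TestPerson("Sam Smith", 40, 40.5)

    val PersonList = mutable.MutableList[TestPerson]()
    // Add data to the list
    PersonList += tom
    PersonList += sam

    // Copy into an immutable List, then convert to a Dataset[TestPerson].
    // Seq(PersonList).toDS() would instead build a Dataset of lists
    // (a single element), not a Dataset of TestPerson.
    val personDS = PersonList.toList.toDS()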
2. Method 2:
    import org.apache.spark.sql.SparkSession

    case class TestPerson(name: String, age: Long, salary: Double)

    val spark = SparkSession.builder().appName("List to Dataset").master("local[*]").getOrCreate()

    val tom = TestPerson("Tom Hanks", 37, 35.5)
    val sam = TestPerson("Sam Smith", 40, 40.5)

    // mutable.MutableList[TestPerson]() is not required;
    // an immutable List is cleaner.
    val PersonList = List(tom, sam)

    import spark.implicits._
    PersonList.toDS().show()
3. Method 3:
    import scala.collection.mutable

    case class TestPerson(name: String, age: Long, salary: Double)

    // Assumes an existing SparkSession named `spark`;
    // toDS() and toDF() come from its implicits.
    import spark.implicits._

    val tom = TestPerson("Tom Hanks", 37, 35.5)
    val sam = TestPerson("Sam Smith", 40, 40.5)

    val PersonList = mutable.MutableList[TestPerson]()
    PersonList += tom
    PersonList += sam

    // Typed Dataset[TestPerson]
    val personDS = PersonList.toDS()
    println(personDS.getClass)
    personDS.show()

    // Untyped DataFrame with columns name, age, salary
    val personDF = PersonList.toDF()
    println(personDF.getClass)
    personDF.show()
    personDF.select("name", "age").show()
For more details, see: https://stackoverflow.com/questions/39397652/convert-scala-list-to-dataframe-or-dataset