SparkSQL 1.x: ways to create a DataFrame

1. Creating a DataFrame via reflection (Scala case class)

	import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{DataFrame, SQLContext}

    // The case class must be defined at top level, outside the method that uses it
    case class People(name: String, age: Int)

    val conf = new SparkConf().setMaster("local[2]").setAppName("CreateDataFrameByReflection")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val file: RDD[String] = sc.textFile("E://demo.txt")
    // Split each line on spaces and map the fields onto the case class
    val peopleRdd: RDD[People] = file.map(_.split(" ")).map(p => People(p(0), p(1).toInt))
    import sqlContext.implicits._
    val peopleDF: DataFrame = peopleRdd.toDF()
    peopleDF.show()
    // Spark 1.x API (createOrReplaceTempView is its Spark 2.x replacement)
    peopleDF.registerTempTable("people")
    val frame: DataFrame = sqlContext.sql("select * from people")
    frame.show()
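
Besides SQL on a temp table, the same query can be written with the DataFrame DSL. A minimal sketch, continuing from the code above (the age threshold 18 is only illustrative):

	// Equivalent query using the DataFrame DSL instead of a temp table
    peopleDF.select("name", "age").filter($"age" > 18).show()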

2. Creating a DataFrame with an explicit StructType schema

	import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.{DataFrame, Row, SQLContext}
    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    val conf: SparkConf = new SparkConf().setAppName("CreateDataFrameByStructType").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val file: RDD[String] = sc.textFile("E://demo.txt")
    // Turn each line into a Row; the field order must match the schema below
    val rowRDD = file.map(_.split(" ")).map(x => Row(x(0), x(1).toInt))
    // The schema can equally be built from a List(...) of StructFields
    val schema = StructType(
      StructField("name", StringType, true) ::
      StructField("age", IntegerType, true) :: Nil
    )
    val peopleDF: DataFrame = sqlContext.createDataFrame(rowRDD, schema)
    peopleDF.show()
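
The explicit schema pays off when the column layout is only known at runtime. As a quick check, the attached schema can be inspected and the DataFrame queried just like the reflection-based one. A minimal sketch reusing peopleDF from above (the temp table name people2 and the age filter are only illustrative):

	// Verify the schema that was attached to the rows
    peopleDF.printSchema()
    peopleDF.registerTempTable("people2")
    sqlContext.sql("select name from people2 where age >= 20").show()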

3. Creating a DataFrame by loading JSON files, CSV files, JDBC sources, etc.

	import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.{DataFrame, SQLContext}

    val conf: SparkConf = new SparkConf().setAppName("CreateDataFrameFromJson").setMaster("local[2]")
    val sc = new SparkContext(conf)
    sc.setLogLevel("WARN")
    val sqlContext = new SQLContext(sc)
    // Load, option 1: format-specific shortcut
    val frame: DataFrame = sqlContext.read.json("E://people.json")
    // Load, option 2: generic format/load API
    //val frame: DataFrame = sqlContext.read.format("json").load("E://people.json")
    //val frame: DataFrame = sqlContext.read.parquet("E://people.parquet")
    frame.registerTempTable("people")
    sqlContext.sql("select * from people").show()
    // Save in various formats
    frame.write.json("E://test.json")
    // write.csv only exists from Spark 2.0 on; in 1.x use the spark-csv package
    frame.write.format("com.databricks.spark.csv").save("E://test.csv")
    frame.write.parquet("E://test.parquet")
    frame.write.format("json").save("E://test1.json")
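
Running the save calls a second time fails because the output paths already exist; a save mode controls that behavior. A minimal sketch (Overwrite is one option; Append, Ignore, and ErrorIfExists also exist):

	import org.apache.spark.sql.SaveMode

    // Overwrite any existing output instead of failing on a re-run
    frame.write.mode(SaveMode.Overwrite).json("E://test.json")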

Loading data via JDBC

	import java.util.Properties
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val conf = new SparkConf().setMaster("local[2]").setAppName("LoadJdbcDataSource")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val url = "jdbc:mysql://ip:3306/mk"
    val table = "user"
    val properties = new Properties()
    properties.setProperty("user", "your_user")
    properties.setProperty("password", "your_password")
    properties.setProperty("driver", "com.mysql.jdbc.Driver")
    // Option 1: read.jdbc shortcut
    val df = sqlContext.read.jdbc(url, table, properties)
    // Option 2: generic format/option API
    //val jdbcDF = sqlContext.read.format("jdbc")
    //  .option("url", "jdbc:mysql://localhost:3306/<database>")
    //  .option("driver", "com.mysql.jdbc.Driver")
    //  .option("dbtable", "<table>")
    //  .option("user", "<user>")
    //  .option("password", "<password>")
    //  .load()
    df.registerTempTable("dbs")
    sqlContext.sql("select count(1) from dbs").show()