Using Spark SQL
This post shows how Spark SQL table registration differs before and after Spark 1.4 when compiling in an IDE.
Before Spark 1.4
Before version 1.4, tables were registered with the registerAsTable method:
import org.apache.spark.{SparkConf, SparkContext}

case class Person(name: String, age: Int)

def main(args: Array[String]): Unit = {
  val sconf = new SparkConf().setMaster("local[5]").setAppName("SQL")
  val sc = new SparkContext(sconf)
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  // implicitly convert the RDD of Person into a SchemaRDD
  import sqlContext._
  val people = sc.textFile("D:\\123.txt").map(_.split(","))
    .map(p => Person(p(0), p(1).trim.toInt))
  people.registerAsTable("people")
  val res = sqlContext.sql("select name, age from people")
  res.map(t => t(0) + " " + t(1)).collect().foreach(println)
}
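Both examples assume that D:\123.txt holds comma-separated name,age records, one per line; a hypothetical sample:

Michael,29
Andy,30
Justin,19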
Spark 1.4 and later
From 1.4 on, registerTempTable replaces registerAsTable, and the implicit conversions move from sqlContext._ to sqlContext.implicits._:
import org.apache.spark.{SparkConf, SparkContext}

case class Person(name: String, age: Int)

def main(args: Array[String]): Unit = {
  val sconf = new SparkConf().setMaster("local[5]").setAppName("SQL")
  val sc = new SparkContext(sconf)
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  // implicitly convert the RDD of Person into a DataFrame
  import sqlContext.implicits._
  val people = sc.textFile("D:\\123.txt").map(_.split(","))
    .map(p => Person(p(0), p(1).trim.toInt)).toDF()
  people.registerTempTable("people")
  val res = sqlContext.sql("select name, age from people")
  res.map(t => t(0) + " " + t(1)).collect().foreach(println)
}
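For comparison, the same query can also be written with the DataFrame DSL instead of a SQL string; a minimal sketch reusing the people DataFrame from above:

// equivalent projection via the DataFrame API rather than raw SQL
val res2 = people.select("name", "age")
res2.show() // prints the first rows in tabular form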
One final note: when writing Scala interactively on the cluster (for example in the spark-shell), the line

case class Person(name: String, age: Int)

can live anywhere in the program, but in an IDE it must be placed outside the main method, or even outside the object entirely: a case class defined inside a method has no TypeTag available, so the implicit conversion to a DataFrame fails to compile.
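A minimal sketch of a layout that compiles in an IDE (the object name SparkSQLDemo is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

// defined at the top level, outside the object, so Scala reflection
// can produce the TypeTag that toDF() needs
case class Person(name: String, age: Int)

object SparkSQLDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setMaster("local[5]").setAppName("SQL"))
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.implicits._
    val people = sc.textFile("D:\\123.txt").map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt)).toDF()
    people.registerTempTable("people")
    sqlContext.sql("select name, age from people").show()
    sc.stop()
  }
}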