The entry point to programming Spark with the Dataset and DataFrame API.
In environments where a session has already been created (e.g. REPL, notebooks), use the builder
to get the existing session:
SparkSession.builder().getOrCreate()
The builder can also be used to create a new session:
SparkSession.builder
  .master("local")
  .appName("Word Count")
  .config("spark.some.config.option", "some-value")
  .getOrCreate()
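As a minimal sketch of how the builder is typically used end to end, the program below creates a local session and runs a small word count over an in-memory list. It assumes Spark SQL is on the classpath; the object name `WordCountSketch`, the helper `countWords`, and the sample input are hypothetical and exist only for illustration.

```scala
import org.apache.spark.sql.SparkSession

object WordCountSketch {
  // Hypothetical helper: counts whitespace-separated words in the given lines.
  def countWords(spark: SparkSession, lines: Seq[String]): Map[String, Long] = {
    import spark.implicits._
    lines.toDS()
      .flatMap(_.split("\\s+"))   // one row per word, in a column named "value"
      .filter(_.nonEmpty)
      .groupBy("value")
      .count()
      .collect()
      .map(row => row.getString(0) -> row.getLong(1))
      .toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")          // assumption: run locally with one thread
      .appName("Word Count")
      .getOrCreate()
    try {
      countWords(spark, Seq("to be or not to be")).foreach(println)
    } finally {
      spark.stop()
    }
  }
}
```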
SparkSession is the entry point to Spark SQL.
def getOrCreate(): SparkSession = synchronized {
  // Get the session from current thread's active session.
  var session = activeThreadSession.get()
  if ((session ne null) && !session.sparkContext.isStopped) {
    options.foreach { case (k, v) => session.sessionState.conf.setConfString(k, v) }
    if (options.nonEmpty) {
      logWarning("Using an existing SparkSession; some configuration may not take effect.")
    }
    return session
  }
  ...
  // If the current thread does not have an active session, get it from the global session.
  session = defaultSession.get()
  ...
}
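The reuse behavior in the excerpt above can be observed from user code. The following is a minimal sketch, assuming Spark SQL is on the classpath and no other session is running in the JVM; the object name `GetOrCreateDemo` and its `run` helper are hypothetical, and `spark.sql.shuffle.partitions` is used only as an example of a runtime SQL option.

```scala
import org.apache.spark.sql.SparkSession

object GetOrCreateDemo {
  // Returns (whether both calls yielded the same instance,
  //          the option value as seen by the reused session).
  def run(): (Boolean, String) = {
    val first = SparkSession.builder()
      .master("local[1]")
      .appName("GetOrCreateDemo")
      .getOrCreate()

    // A second builder call on the same thread finds the existing session
    // instead of creating a new one; its options are applied to that session.
    val second = SparkSession.builder()
      .config("spark.sql.shuffle.partitions", "4")
      .getOrCreate()

    val result = (first eq second, second.conf.get("spark.sql.shuffle.partitions"))
    first.stop()
    result
  }

  def main(args: Array[String]): Unit = {
    val (same, partitions) = run()
    println(s"same instance: $same, spark.sql.shuffle.partitions: $partitions")
  }
}
```

The warning logged in the excerpt is worth keeping in mind: options passed to the second builder are applied to the existing session, but settings that are fixed when the underlying SparkContext starts may not take effect.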
Getting metadata information through SparkSession
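import org.apache.spark.sql.SparkSession

// Prefer the current thread's active session, falling back to the global default.
val spark = SparkSession.getActiveSession
  .orElse(SparkSession.getDefaultSession)
  .getOrElse(sys.error("No active or default SparkSession"))
// sessionState is an unstable, internal-facing API; its methods may differ across Spark versions.
val tableIdent = spark.sessionState.sqlParser.parseTableIdentifier("t2")
val flag = spark.sessionState.catalog.isTempView(tableIdent)
println("flag:" + flag)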
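The snippet above goes through `sessionState`, which is not a stable API. As an alternative sketch (not the approach used in these notes), the same question can be asked through the public `spark.catalog` interface. It assumes Spark SQL is on the classpath; the object name `TempViewCheck`, the helper `isTempViewPublic`, and the view `t2` registered in `main` are hypothetical and serve only to make the example self-contained.

```scala
import org.apache.spark.sql.SparkSession

object TempViewCheck {
  // Hypothetical helper: true if a temporary view with this name is visible
  // in the session, using only the public Catalog API.
  def isTempViewPublic(spark: SparkSession, name: String): Boolean =
    spark.catalog.listTables().collect().exists(t => t.name == name && t.isTemporary)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("TempViewCheck")
      .getOrCreate()
    try {
      // Register a small temporary view so there is something to look up.
      spark.range(3).createOrReplaceTempView("t2")
      println("flag:" + isTempViewPublic(spark, "t2"))
    } finally {
      spark.stop()
    }
  }
}
```

Listing every table is heavier than a direct lookup, so this form suits small catalogs or exploratory use rather than hot paths.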
Literal(2)