simba.sql("Select * from b") calls the sql() method in SparkSession.scala:
def sql(sqlText: String): DataFrame = {
  Dataset.ofRows(self, sessionState.sqlParser.parsePlan(sqlText))
}
The Dataset.ofRows() method:
def ofRows(sparkSession: SparkSession, logicalPlan: LogicalPlan): DataFrame = {
  val qe = sparkSession.sessionState.executePlan(logicalPlan)
  qe.assertAnalyzed()
  new Dataset[Row](sparkSession, qe, RowEncoder(qe.analyzed.schema))
}
The executePlan() method in SessionState.scala:
def executePlan(plan: LogicalPlan): QueryExecution = new QueryExecution(sparkSession, plan)
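At this point a QueryExecution object has been created (QueryExecution is a class, not a method); it drives the execution of this SparkSQL relational query.
ofRows() then calls qe.assertAnalyzed(), which is defined in QueryExecution.scala: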
def assertAnalyzed(): Unit = {
  try sparkSession.sessionState.analyzer.checkAnalysis(analyzed) catch {
    case e: AnalysisException =>
      val ae = new AnalysisException(e.message, e.line, e.startPosition, Some(analyzed))
      ae.setStackTrace(e.getStackTrace)
      throw ae
  }
}
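This checks whether analysis of the plan raises an exception. If checkAnalysis() throws an AnalysisException, it is rethrown with the analyzed plan attached and the original stack trace preserved.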
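The catch-and-rethrow pattern in assertAnalyzed() can be tried without Spark. The following is only a minimal sketch: Plan, ToyAnalysisException and the checkAnalysis() rule are hypothetical stand-ins invented for illustration, not Spark's real classes or checks.

```scala
// Hypothetical stand-in for a logical plan; NOT Spark's LogicalPlan.
final case class Plan(description: String)

// Hypothetical stand-in for AnalysisException, with the same four fields
// that assertAnalyzed() copies across.
class ToyAnalysisException(
    val message: String,
    val line: Option[Int] = None,
    val startPosition: Option[Int] = None,
    val plan: Option[Plan] = None)
  extends Exception(message)

object AssertAnalyzedSketch {
  // Made-up rule: a plan that still mentions an unresolved node is invalid.
  def checkAnalysis(plan: Plan): Unit =
    if (plan.description.contains("Unresolved")) {
      throw new ToyAnalysisException("Table or view not found", Some(1), Some(14))
    }

  // Same shape as the method above: rethrow with the plan attached,
  // keeping the stack trace of the original failure.
  def assertAnalyzed(analyzed: Plan): Unit =
    try checkAnalysis(analyzed) catch {
      case e: ToyAnalysisException =>
        val ae = new ToyAnalysisException(e.message, e.line, e.startPosition, Some(analyzed))
        ae.setStackTrace(e.getStackTrace)
        throw ae
    }
}
```

In this sketch the caller that catches the exception can read the failing plan from its plan field, while the stack trace still points at the place where the check failed.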
Finally, new Dataset[Row](sparkSession, qe, RowEncoder(qe.analyzed.schema)) invokes the Dataset constructor:
class Dataset[T] private[sql](
    @transient val sparkSession: SparkSession,
    @DeveloperApi @InterfaceStability.Unstable @transient val queryExecution: QueryExecution,
    encoder: Encoder[T])
This yields the DataFrame through which the query's data is obtained, and the call chain of sql() ends here.
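The whole chain traced above (sql() -> parsePlan() -> executePlan() -> assertAnalyzed() -> new Dataset) can be summarized in a toy model. This is a sketch under loud assumptions: every class here is a hypothetical, heavily simplified stand-in rather than Spark code, and the fixed rows and the executions counter are invented. It also illustrates that, for a SELECT query like the one here, building the Dataset does not run the query; the work happens when an action such as collect() is called.

```scala
object SqlFlowSketch {
  // Hypothetical stand-ins; none of these are Spark's real classes.
  final case class LogicalPlan(sqlText: String)

  final class QueryExecution(val logical: LogicalPlan) {
    // Invented counter: how many times the query has actually been run.
    var executions = 0

    // Computed once, on first access.
    lazy val analyzed: LogicalPlan = logical

    // Forces analysis, as ofRows() does before constructing the Dataset.
    def assertAnalyzed(): Unit = { val _ = analyzed }

    // Pretend execution: returns fixed rows and records that work was done.
    def run(): Seq[String] = { executions += 1; Seq("row1", "row2") }
  }

  final class Dataset(val queryExecution: QueryExecution) {
    // Only an action triggers execution in this model.
    def collect(): Seq[String] = queryExecution.run()
  }

  def parsePlan(sqlText: String): LogicalPlan = LogicalPlan(sqlText.trim)

  def executePlan(plan: LogicalPlan): QueryExecution = new QueryExecution(plan)

  def ofRows(plan: LogicalPlan): Dataset = {
    val qe = executePlan(plan)
    qe.assertAnalyzed()
    new Dataset(qe)
  }

  def sql(sqlText: String): Dataset = ofRows(parsePlan(sqlText))
}
```

Calling SqlFlowSketch.sql("Select * from b") in this model returns a Dataset whose executions counter is still 0; the counter only increases once collect() is invoked.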