What the getCallSite method in SparkContext does
First, see the figure below:
The source code is as follows:
def getCallSite(skipClass: String => Boolean = sparkInternalExclusionFunction): CallSite = {
  // Keep crawling up the stack trace until we find the first function not inside of the spark
  // package. We track the last (shallowest) contiguous Spark method. This might be an RDD
  // transformation, a SparkContext function (such as parallelize), or anything else that leads
  // to instantiation of an RDD. We also track the first (deepest) user method, file, and line.
  // Note: the stack trace is ordered from the most recent call outward, so the "last" Spark
  // method is the Spark frame that user code called directly, and the "first" user method is
  // the user frame that made that call.
  var lastSparkMethod = "<unknown>"
  var firstUserFile = "<unknown>"
  var firstUserLine = 0
  var insideSpark = true
  var callStack = new ArrayBuffer[String]() :+ "<unknown>"
  Thread.currentThread.getStackTrace().foreach { ste: StackTraceElement =>
    // When running under some profilers, the current stack trace might contain some bogus
    // frames. This is intended to ensure that we don't crash in these situations by
    // ignoring any frames that we can't examine.
    // Proceed only if the stack element is non-null, its method name is non-null, and the
    // method name does not contain "getStackTrace".
    if (ste != null && ste.getMethodName != null
      && !ste.getMethodName.contains("getStackTrace")) {
      if (insideSpark) {
        // skipClass returns true when the class should be skipped; with the default
        // function, that means a Spark-internal class or a Scala class.
        if (skipClass(ste.getClassName)) {
          // If the frame's method name is <init>, assign the simple class name to
          // lastSparkMethod; otherwise assign the method name.
          lastSparkMethod = if (ste.getMethodName == "<init>") {
            // Spark method is a constructor; get its class name
            ste.getClassName.substring(ste.getClassName.lastIndexOf('.') + 1)
          } else {
            ste.getMethodName
          }
          callStack(0) = ste.toString // Put last Spark method on top of the stack trace.
        } else {
          if (ste.getFileName != null) {
            firstUserFile = ste.getFileName
            if (ste.getLineNumber >= 0) {
              firstUserLine = ste.getLineNumber
            }
          }
          callStack += ste.toString
          insideSpark = false
        }
      } else {
        callStack += ste.toString
      }
    }
  }
  val callStackDepth = System.getProperty("spark.callstack.depth", "20").toInt
  val shortForm =
    if (firstUserFile == "HiveSessionImpl.java") {
      // To be more user friendly, show a nicer string for queries submitted from the JDBC
      // server.
      "Spark JDBC Server Query"
    } else {
      s"$lastSparkMethod at $firstUserFile:$firstUserLine"
    }
  val longForm = callStack.take(callStackDepth).mkString("\n")
  CallSite(shortForm, longForm)
}
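
To make the walk easier to follow, here is a minimal sketch that replays the same loop over a hand-built list of stack frames, so it runs without Spark. It is an illustration under stated assumptions, not Spark's implementation: the object CallSiteSketch, the helper isInternal, the method callSiteOf, the com.example.WordCount class and all file names and line numbers are hypothetical. In particular, isInternal is only a rough stand-in for sparkInternalExclusionFunction, and the sketch leaves out the "getStackTrace" filter, the spark.callstack.depth limit and the HiveSessionImpl.java special case.

```scala
import scala.collection.mutable.ArrayBuffer

object CallSiteSketch {
  // Simplified stand-in for Spark's CallSite: a short and a long description.
  final case class CallSite(shortForm: String, longForm: String)

  // Hypothetical skip rule for this sketch only: treat org.apache.spark.* and scala.*
  // classes as internal. This is an assumption, not Spark's actual rule.
  def isInternal(className: String): Boolean =
    className.startsWith("org.apache.spark.") || className.startsWith("scala.")

  // Same walk as getCallSite, but over an explicit frame list (most recent call first).
  def callSiteOf(frames: Seq[StackTraceElement],
                 skipClass: String => Boolean = isInternal): CallSite = {
    var lastSparkMethod = "<unknown>"
    var firstUserFile = "<unknown>"
    var firstUserLine = 0
    var insideSpark = true
    val callStack = ArrayBuffer("<unknown>")

    frames.foreach { ste =>
      if (ste != null && ste.getMethodName != null) {
        if (insideSpark) {
          if (skipClass(ste.getClassName)) {
            // Still inside Spark: remember this frame as the latest Spark method seen.
            lastSparkMethod =
              if (ste.getMethodName == "<init>") {
                // Constructor: use the simple class name instead of "<init>".
                ste.getClassName.substring(ste.getClassName.lastIndexOf('.') + 1)
              } else {
                ste.getMethodName
              }
            callStack(0) = ste.toString
          } else {
            // First user frame: record where user code called into Spark.
            if (ste.getFileName != null) {
              firstUserFile = ste.getFileName
              if (ste.getLineNumber >= 0) firstUserLine = ste.getLineNumber
            }
            callStack += ste.toString
            insideSpark = false
          }
        } else {
          // Every frame after the first user frame goes into the long form as-is.
          callStack += ste.toString
        }
      }
    }
    CallSite(s"$lastSparkMethod at $firstUserFile:$firstUserLine", callStack.mkString("\n"))
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical frames, ordered the way getStackTrace returns them.
    val frames = Seq(
      new StackTraceElement("org.apache.spark.rdd.RDD", "withScope", "RDD.scala", 1),
      new StackTraceElement("org.apache.spark.rdd.RDD", "map", "RDD.scala", 2),
      new StackTraceElement("com.example.WordCount$", "main", "WordCount.scala", 12),
      new StackTraceElement("com.example.WordCount", "main", "WordCount.scala", -1)
    )
    println(callSiteOf(frames).shortForm) // prints "map at WordCount.scala:12"
  }
}
```

The short form names the Spark method that user code called directly (map) together with the user file and line that made the call, which is the "lastSparkMethod at firstUserFile:firstUserLine" string built at the end of getCallSite.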