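# Spark Scheduling Policies
Spark currently has two scheduling policies: FIFO (first come, first served) and FAIR (fair scheduling). A scheduling policy simply sorts the objects waiting to be scheduled and then schedules them in that order. The sorting interface is shown below: it compares two schedulable objects and returns true when s1 should be scheduled before s2.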
private[spark] trait SchedulingAlgorithm {
  def comparator(s1: Schedulable, s2: Schedulable): Boolean
}
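To see how a Boolean comparator of this shape drives the ordering, here is a minimal sketch. `Item`, `Algorithm`, `ByPriority` and `ComparatorDemo` are simplified stand-ins invented for illustration, not Spark classes; the only assumption is that the queue is sorted with the comparator as a "less than" function.

```scala
// Simplified stand-ins for illustration only; these are not Spark's real classes.
final case class Item(name: String, priority: Int)

trait Algorithm {
  // Returns true when s1 should be scheduled before s2.
  def comparator(s1: Item, s2: Item): Boolean
}

// A trivial policy: the smaller priority value goes first.
object ByPriority extends Algorithm {
  override def comparator(s1: Item, s2: Item): Boolean = s1.priority < s2.priority
}

object ComparatorDemo {
  // Sorting the waiting queue with the comparator yields the scheduling order.
  def schedule(queue: Seq[Item], algorithm: Algorithm): Seq[Item] =
    queue.sortWith(algorithm.comparator)

  def main(args: Array[String]): Unit = {
    val queue = Seq(Item("c", 3), Item("a", 1), Item("b", 2))
    println(schedule(queue, ByPriority).map(_.name).mkString(", ")) // prints "a, b, c"
  }
}
```

Swapping in a different `Algorithm` changes the order without touching the code that walks the queue, which is the point of keeping the policy behind a single comparison method.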
SchedulingAlgorithm has two implementations, FIFOSchedulingAlgorithm and FairSchedulingAlgorithm.
/**
 * FIFO ordering: compare by priority first, then by the corresponding stage.
 * The higher priority goes first; when priorities are equal, the earlier stage goes first.
 */
private[spark] class FIFOSchedulingAlgorithm extends SchedulingAlgorithm {
  override def comparator(s1: Schedulable, s2: Schedulable): Boolean = {
    // Generally, a smaller priority value means a higher priority.
    val priority1 = s1.priority
    val priority2 = s2.priority
    var res = math.signum(priority1 - priority2)
    if (res == 0) {
      // Equal priority: the earlier stage (smaller stage ID) goes first.
      val stageId1 = s1.stageId
      val stageId2 = s2.stageId
      res = math.signum(stageId1 - stageId2)
    }
    // true means s1 is scheduled before s2.
    res < 0
  }
}
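Note:
You can rewrite this comparison to match your own definition of priority. Keep one thing in mind: when both the priority and the stage ID are equal, the comparator returns false, so s1 is not ranked ahead of s2 and the relative order of the two is left to the sort that calls the comparator.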
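The FIFO rule can be tried out in isolation with a small sketch. `Sched` and `FifoSketch` are hypothetical stand-ins that carry only the two fields the FIFO comparison reads; the comparison itself follows the same steps as the FIFO comparator above.

```scala
// Hypothetical stand-in for a schedulable; only the fields FIFO looks at.
final case class Sched(name: String, priority: Int, stageId: Int)

object FifoSketch {
  // Same decision rule as the FIFO comparator: priority first, then stage ID.
  def fifoBefore(s1: Sched, s2: Sched): Boolean = {
    var res = math.signum(s1.priority - s2.priority)
    if (res == 0) {
      res = math.signum(s1.stageId - s2.stageId)
    }
    res < 0
  }

  def main(args: Array[String]): Unit = {
    val queue = Seq(
      Sched("p1-s3", priority = 1, stageId = 3),
      Sched("p0-s1", priority = 0, stageId = 1),
      Sched("p1-s2", priority = 1, stageId = 2))

    // Smaller priority value first, then smaller stage ID.
    println(queue.sortWith(fifoBefore).map(_.name).mkString(", ")) // prints "p0-s1, p1-s2, p1-s3"

    // For a full tie the rule answers false in both directions.
    val a = Sched("a", priority = 2, stageId = 5)
    val b = Sched("b", priority = 2, stageId = 5)
    println(fifoBefore(a, b)) // prints "false"
    println(fifoBefore(b, a)) // prints "false"
  }
}
```

A custom policy only needs to replace the body of `fifoBefore`; the surrounding sort stays the same.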
private[spark] class FairSchedulingAlgorithm extends SchedulingAlgorithm {
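  override def comparator(s1: Schedulable, s2: Schedulable): Boolean = {
    // minShare: the minimum share, which can be understood as the minimum resources
    // (CPU cores) a schedulable should receive. A schedulable that is still running
    // fewer tasks than its minShare is scheduled ahead of one that has reached it.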
val