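Earlier posts in this series analyzed the core data structures of the Kubernetes scheduler (SchedulerCache, ScheduleAlgorithm, SchedulerExtender, and Framework) and the core implementation of the scoring, scheduling, and preemption flows. This post is the last one currently planned for the series, and it also summarizes what I have learned about scheduling at this stage.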
The whole series is now published on Yuque at https://www.yuque.com/baxiaoshi/tyado3/. Thanks for sharing it, and feel free to add me on WeChat to discuss.
1. Binder
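The Binder is responsible for passing the scheduler's decision to the apiserver, that is, binding a pod to the node that was selected for it.
1.1 Building the binder
scheduler/factory builds a default binder: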
func getBinderFunc(client clientset.Interface, extenders []algorithm.SchedulerExtender) func(pod *v1.Pod) Binder {
	defaultBinder := &binder{client}
	return func(pod *v1.Pod) Binder {
		for _, extender := range extenders {
			if extender.IsBinder() && extender.IsInterested(pod) {
				return extender
			}
		}
		return defaultBinder
	}
}
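The selection logic above can be tried out in isolation. The following is a minimal sketch, not the scheduler's actual code: Pod, Binder, Extender, defaultBinder, gpuExtender, and selectBinder are all hypothetical stand-ins introduced for illustration, and only the shape of the closure follows getBinderFunc.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for *v1.Pod.
type Pod struct {
	Name   string
	Labels map[string]string
}

// Binder is a simplified stand-in for the scheduler's Binder interface.
type Binder interface {
	Bind(pod *Pod, node string) error
}

// Extender is a simplified stand-in for algorithm.SchedulerExtender.
type Extender interface {
	Binder
	IsBinder() bool
	IsInterested(pod *Pod) bool
}

// defaultBinder plays the role of the built-in binder.
type defaultBinder struct{}

func (defaultBinder) Bind(pod *Pod, node string) error {
	fmt.Printf("default binder: %s -> %s\n", pod.Name, node)
	return nil
}

// gpuExtender is a made-up extender that only claims pods labelled gpu=true.
type gpuExtender struct{}

func (gpuExtender) IsBinder() bool             { return true }
func (gpuExtender) IsInterested(pod *Pod) bool { return pod.Labels["gpu"] == "true" }
func (gpuExtender) Bind(pod *Pod, node string) error {
	fmt.Printf("gpu extender: %s -> %s\n", pod.Name, node)
	return nil
}

// selectBinder mirrors the shape of getBinderFunc: the first extender that
// is a binder and is interested in the pod wins, otherwise the default is used.
func selectBinder(extenders []Extender) func(pod *Pod) Binder {
	def := defaultBinder{}
	return func(pod *Pod) Binder {
		for _, e := range extenders {
			if e.IsBinder() && e.IsInterested(pod) {
				return e
			}
		}
		return def
	}
}

func main() {
	getBinder := selectBinder([]Extender{gpuExtender{}})

	gpuPod := &Pod{Name: "train-0", Labels: map[string]string{"gpu": "true"}}
	webPod := &Pod{Name: "web-0"}

	_ = getBinder(gpuPod).Bind(gpuPod, "node-1")
	_ = getBinder(webPod).Bind(webPod, "node-2")
}
```

Because the binder is chosen per pod, one extender can take over binding for a subset of pods while everything else still goes through the default path.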
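1.2 The binder interface implementation
The binder implementation is very simple: it only needs to call the apiserver's bind interface for the pod to complete the binding.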
// Implement Binder interface
var _ Binder = &binder{}

// Bind just does a POST binding RPC.
func (b *binder) Bind(binding *v1.Binding) error {
	klog.V(3).Infof("Attempting to bind %v to %v", binding.Name, binding.Target.Name)
	return b.Client.CoreV1().Pods(binding.Namespace).Bind(binding)
}
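Since Binder is a single-method interface, it is easy to swap in another implementation, for example in a test. The sketch below assumes a trimmed-down Binding type and a made-up recordingBinder that keeps bindings in memory instead of calling the apiserver; neither exists in the scheduler source.

```go
package main

import (
	"errors"
	"fmt"
)

// Binding is a hypothetical, trimmed-down stand-in for v1.Binding.
type Binding struct {
	Namespace string
	Name      string
	Target    string // name of the node the pod should be bound to
}

// Binder is a simplified stand-in for the scheduler's Binder interface.
type Binder interface {
	Bind(b *Binding) error
}

// recordingBinder is a made-up implementation that records bindings in
// memory instead of POSTing them to the apiserver.
type recordingBinder struct {
	bound map[string]string // "namespace/name" -> node
}

// Compile-time check, same idiom as in the scheduler source.
var _ Binder = &recordingBinder{}

func (r *recordingBinder) Bind(b *Binding) error {
	if b.Target == "" {
		return errors.New("binding has no target node")
	}
	key := b.Namespace + "/" + b.Name
	if node, ok := r.bound[key]; ok {
		return fmt.Errorf("pod %s is already bound to %s", key, node)
	}
	r.bound[key] = b.Target
	return nil
}

func main() {
	r := &recordingBinder{bound: map[string]string{}}
	err := r.Bind(&Binding{Namespace: "default", Name: "web-0", Target: "node-1"})
	fmt.Println("error:", err, "node:", r.bound["default/web-0"])
}
```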
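1.3 The surprising timing of bind
The bind operation lives in Scheduler.bind. After Framework.RunBindPlugins is called, the default bind runs only when the returned status is not Success but Skip. I honestly don't follow the reasoning here: any bind plugin added later would also have to return Skip for the default bind to run. The most plausible reading is that Skip means no bind plugin claimed the pod, so the scheduler falls back to its own binder.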
bindStatus := sched.Framework.RunBindPlugins(ctx, state, assumed, targetNode)
var err error
if !bindStatus.IsSuccess() {
	if bindStatus.Code() == framework.Skip {
		// Only when every bind plugin skipped is the pod bound through the apiserver.
		err = sched.GetBinder(assumed).Bind(&v1.Binding{
			ObjectMeta: metav1.ObjectMeta{Namespace: assumed.Namespace, Name: assumed.Name, UID: assumed.UID},
			Target: v1.ObjectReference{
				Kind: "Node",
				Name: targetNode,
			},
		})
	} else {
		err = fmt.Errorf("Bind failure, code: %d: %v", bindStatus.Code(), bindStatus.Message())
	}
}
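To make the three outcomes of that branch easier to see, here is a small self-contained sketch. It assumes that running the bind plugins yields Skip when there are no plugins or when every plugin skips, which is how the snippet above treats that code. Status, BindPlugin, runBindPlugins, and bind are simplified, hypothetical stand-ins, not the framework's real types.

```go
package main

import "fmt"

// Code is a simplified stand-in for the framework's status code.
type Code int

const (
	Success Code = iota
	Error
	Skip
)

// Status is a simplified stand-in for framework.Status.
type Status struct {
	code Code
	msg  string
}

// BindPlugin is a hypothetical plugin signature used only in this sketch.
type BindPlugin func(pod, node string) Status

// runBindPlugins returns the result of the first plugin that does not skip.
// With no plugins, or when every plugin skips, the result is Skip.
func runBindPlugins(plugins []BindPlugin, pod, node string) Status {
	for _, p := range plugins {
		st := p(pod, node)
		if st.code == Skip {
			continue
		}
		return st
	}
	return Status{code: Skip}
}

// bind reports who performed the binding: a plugin, or the default binder
// as the fallback when every plugin skipped.
func bind(plugins []BindPlugin, pod, node string) (string, error) {
	st := runBindPlugins(plugins, pod, node)
	switch st.code {
	case Success:
		return "plugin", nil
	case Skip:
		// Nobody claimed the pod, so the default binder takes over.
		return "default binder", nil
	default:
		return "", fmt.Errorf("bind failure, code: %d: %v", st.code, st.msg)
	}
}

func main() {
	skip := func(pod, node string) Status { return Status{code: Skip} }
	ok := func(pod, node string) Status { return Status{code: Success} }

	fmt.Println(bind(nil, "web-0", "node-1"))
	fmt.Println(bind([]BindPlugin{skip, ok}, "web-0", "node-1"))
}
```

Read this way, Skip is less a failure than a signal that nobody handled the pod, so a bind plugin that does handle a pod returns Success and the default binder is never reached.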
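2 Overview of the core flow of the scheduling components
2.1 Scheduler initialization
2.1.1 Initializing scheduler parameters
The scheduler's parameter defaults are now all collected in defaultSchedulerOptions. More settings will probably follow this pattern in the future, rather than being scattered across the different stages of building the parameters.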
var defaultSchedulerOptions = schedulerOptions{
	schedulerName: v1.DefaultSchedulerName,
	schedulerAlgorithmSource: schedulerapi.SchedulerAlgorithmSource{
		Provider: defaultAlgorithmSourceProviderName(),
	},
	hardPodAffinitySymmetricWeight: v1.DefaultHardPodAffinitySymmetricWeight,
	disablePreemption:              false,
	percentageOfNodesToScore:       schedulerapi.DefaultPercentageOfNodesToScore,
	bindTimeoutSeconds:             BindTimeoutSeconds,
	podInitialBackoffSeconds:       int64(internalqueue.DefaultPodInitialBackoffDuration.Seconds()),
	podMaxBackoffSeconds:           int64(internalqueue.DefaultPodMaxBackoffDuration.Seconds()),
}
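A defaults struct like this pairs naturally with the functional options pattern: start from a copy of the defaults and let each option override one field. The sketch below is an illustration under that assumption. The option names (withName, withPreemptionDisabled, withBindTimeoutSeconds), the newOptions helper, and the numeric default values are placeholders chosen for the example, not values taken from the scheduler source.

```go
package main

import "fmt"

// schedulerOptions is a trimmed-down copy of the struct shown above.
type schedulerOptions struct {
	schedulerName      string
	disablePreemption  bool
	bindTimeoutSeconds int64
}

// option mutates a schedulerOptions value.
type option func(*schedulerOptions)

// The option constructors below are hypothetical names for this sketch.
func withName(name string) option {
	return func(o *schedulerOptions) { o.schedulerName = name }
}

func withPreemptionDisabled(disabled bool) option {
	return func(o *schedulerOptions) { o.disablePreemption = disabled }
}

func withBindTimeoutSeconds(seconds int64) option {
	return func(o *schedulerOptions) { o.bindTimeoutSeconds = seconds }
}

// All defaults live in one place; the timeout value is only illustrative.
var defaultSchedulerOptions = schedulerOptions{
	schedulerName:      "default-scheduler",
	disablePreemption:  false,
	bindTimeoutSeconds: 100,
}

// newOptions copies the defaults and then applies the overrides in order.
func newOptions(opts ...option) schedulerOptions {
	o := defaultSchedulerOptions // struct copy, the package-level defaults stay untouched
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := newOptions(withName("my-scheduler"), withPreemptionDisabled(true))
	fmt.Printf("%+v\n", o)
	fmt.Printf("%+v\n", defaultSchedulerOptions)
}
```

Callers only mention the fields they want to change, and adding a new parameter means adding one field, one default, and one option function.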
2.1.2 Initializing the plugin factory registry
The initialization of the plugin factory registry is split into