How Does Deployment Work? (Part 1)

In the previous two articles,

How Does ReplicaSet Work? (Part 1)

How Does ReplicaSet Work? (Part 2)

we walked through how ReplicaSet works: to keep the number of Pods at the desired count, it continuously creates and deletes Pods. In short, to handle every ReplicaSet quickly and flexibly, it uses a queue:

  • The producer side listens for ReplicaSet and Pod change events and adds the corresponding ReplicaSet controller key to the queue
  • The consumer side takes a ReplicaSet controller key from the queue, fetches the controller from the cache, and performs the appropriate operations
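The producer/consumer pattern above can be sketched with a tiny deduplicating FIFO. This is an illustrative stand-in for client-go's workqueue, not the real API; names here are invented:

```go
package main

import "fmt"

// queue is a minimal FIFO of controller keys ("namespace/name") that
// deduplicates keys already waiting, loosely mimicking workqueue's
// behavior. An illustrative sketch, not client-go.
type queue struct {
	items   []string
	present map[string]bool
}

func newQueue() *queue {
	return &queue{present: map[string]bool{}}
}

// Add enqueues a key unless it is already pending.
func (q *queue) Add(key string) {
	if q.present[key] {
		return
	}
	q.present[key] = true
	q.items = append(q.items, key)
}

// Get pops the oldest key; ok is false when the queue is empty.
func (q *queue) Get() (key string, ok bool) {
	if len(q.items) == 0 {
		return "", false
	}
	key = q.items[0]
	q.items = q.items[1:]
	delete(q.present, key)
	return key, true
}

func main() {
	q := newQueue()
	// Producer side: event handlers enqueue the owning controller's key.
	q.Add("default/rs-test")
	q.Add("default/rs-test") // duplicate event, deduplicated
	q.Add("kube-system/dns")

	// Consumer side: a worker drains the queue and syncs each controller.
	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("sync", key)
	}
}
```

The real workqueue adds rate limiting and marks keys in-flight with Done(), but the produce/dedupe/consume shape is the same.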

Now that we have ReplicaSet, can it be used directly to roll out releases?

Let's try it: use a ReplicaSet to create 3 pods with image version v1.

✗ kubectl get pod               
NAME                       READY   STATUS    RESTARTS   AGE
rs-test-6kvbb              1/1     Running   0          58s
rs-test-qts2s              1/1     Running   0          58s
rs-test-x2h7t              1/1     Running   0          58s

Then change the image version in the ReplicaSet manifest to v2; the pods are not updated.

✗ kubectl get pod               
NAME                       READY   STATUS    RESTARTS   AGE
rs-test-6kvbb              1/1     Running   0          3m32s
rs-test-qts2s              1/1     Running   0          3m32s
rs-test-x2h7t              1/1     Running   0          3m32s

That is because ReplicaSet only compares pod labels; it does not check the image version. Now delete one instance and let the ReplicaSet recreate it: the new instance runs image v2, while the original two instances are still on v1.

✗ kubectl get pod               
NAME                       READY   STATUS    RESTARTS   AGE
rs-test-6kvbb              1/1     Running   0          4m57s
rs-test-qts2s              1/1     Running   0          4m57s
rs-test-tccxq              1/1     Running   0          74s   # recreated after deletion

So to upgrade versions with a bare ReplicaSet, the procedure would be:

  1. Update the version in the ReplicaSet manifest to the new one, e.g. v1 -> v2
  2. Delete the Pods managed by the ReplicaSet one by one, according to some policy
  3. Wait for the ReplicaSet to bring up new Pod instances
  4. Wait until all old-version pods are deleted and the new version is up
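The manual procedure above amounts to a replace-and-wait loop. A toy simulation (the types here are illustrative stand-ins, not Kubernetes APIs):

```go
package main

import "fmt"

// pod is a toy stand-in for a Pod, reduced to its image version.
type pod struct{ image string }

// replicaSet is a toy ReplicaSet: a template image plus managed pods.
type replicaSet struct {
	template string
	pods     []pod
}

// reconcile recreates missing pods from the current template, which is
// all a real ReplicaSet does: it never touches already-running pods.
func (rs *replicaSet) reconcile(replicas int) {
	for len(rs.pods) < replicas {
		rs.pods = append(rs.pods, pod{image: rs.template})
	}
}

// manualRollout mimics the steps above: bump the template, then delete
// pods one by one and let the ReplicaSet recreate each from the new
// template.
func manualRollout(rs *replicaSet, newImage string, replicas int) {
	rs.template = newImage // step 1: v1 -> v2
	for i := 0; i < replicas; i++ {
		rs.pods = rs.pods[1:]  // step 2: delete one old pod
		rs.reconcile(replicas) // step 3: wait for the recreation
	}
}

func main() {
	rs := &replicaSet{template: "v1"}
	rs.reconcile(3)
	manualRollout(rs, "v2", 3)
	fmt.Println(rs.pods) // all three pods now run v2
}
```

Deployment automates exactly this loop (and more, such as pausing and rolling back), which is what the rest of the article explores.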

This is tedious and far from automated, so Kubernetes uses Deployment to manage ReplicaSets and provide graceful rollouts.

Deployment

The Kubernetes documentation defines Deployment as follows:

A Deployment provides declarative updates for Pods and ReplicaSets.

Reading the Source

Entry Point

As usual, we start from the Deployment controller's entry point. In controller-manager's NewControllerInitializers() we find the function that starts the Deployment controller:

func startDeploymentController(ctx ControllerContext) (http.Handler, bool, error) {
	dc, err := deployment.NewDeploymentController(
		ctx.InformerFactory.Apps().V1().Deployments(),
		ctx.InformerFactory.Apps().V1().ReplicaSets(),
		ctx.InformerFactory.Core().V1().Pods(),
		ctx.ClientBuilder.ClientOrDie("deployment-controller"),
	)
	if err != nil {
		return nil, true, fmt.Errorf("error creating Deployment controller: %v", err)
	}
	go dc.Run(int(ctx.ComponentConfig.DeploymentController.ConcurrentDeploymentSyncs), ctx.Stop)
	return nil, true, nil
}

Three informers are passed in via the context: one for Deployments, one for ReplicaSets, and one for Pods, delivering events for the respective resources. Now look at the constructor NewDeploymentController(); readers who have seen the ReplicaSet articles will find the structure familiar:

  • Event handler functions are registered on each of the three injected informers
  • A sync handler is set: dc.syncHandler = dc.syncDeployment
  • A new rate-limiting queue is created, fed through dc.enqueueDeployment = dc.enqueue

// NewDeploymentController creates a new DeploymentController.
func NewDeploymentController(dInformer appsinformers.DeploymentInformer, rsInformer appsinformers.ReplicaSetInformer, podInformer coreinformers.PodInformer, client clientset.Interface) (*DeploymentController, error) {
	eventBroadcaster := record.NewBroadcaster()
	eventBroadcaster.StartStructuredLogging(0)
	eventBroadcaster.StartRecordingToSink(&v1core.EventSinkImpl{Interface: client.CoreV1().Events("")})

	if client != nil && client.CoreV1().RESTClient().GetRateLimiter() != nil {
		if err := ratelimiter.RegisterMetricAndTrackRateLimiterUsage("deployment_controller", client.CoreV1().RESTClient().GetRateLimiter()); err != nil {
			return nil, err
		}
	}
	dc := &DeploymentController{
		client:        client,
		eventRecorder: eventBroadcaster.NewRecorder(scheme.Scheme, v1.EventSource{Component: "deployment-controller"}),
		queue:         workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), "deployment"),
	}
	dc.rsControl = controller.RealRSControl{
		KubeClient: client,
		Recorder:   dc.eventRecorder,
	}

	dInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    dc.addDeployment,
		UpdateFunc: dc.updateDeployment,
		// This will enter the sync loop and no-op, because the deployment has been deleted from the store.
		DeleteFunc: dc.deleteDeployment,
	})
	rsInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    dc.addReplicaSet,
		UpdateFunc: dc.updateReplicaSet,
		DeleteFunc: dc.deleteReplicaSet,
	})
	podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		DeleteFunc: dc.deletePod,
	})

	dc.syncHandler = dc.syncDeployment
	dc.enqueueDeployment = dc.enqueue

	dc.dLister = dInformer.Lister()
	dc.rsLister = rsInformer.Lister()
	dc.podLister = podInformer.Lister()
	dc.dListerSynced = dInformer.Informer().HasSynced
	dc.rsListerSynced = rsInformer.Informer().HasSynced
	dc.podListerSynced = podInformer.Informer().HasSynced
	return dc, nil
}

As with ReplicaSet, we will cover Deployment in two parts: this part covers what gets written into the queue, and the next part covers what the consumer does with it.

Event Handling

Deployment Add/Update/Delete

Handling adds, updates, and deletes of Deployment objects is very simple: on receiving a change event, the Deployment is written straight into the queue.


func (dc *DeploymentController) addDeployment(obj interface{}) {
	d := obj.(*apps.Deployment)
	klog.V(4).InfoS("Adding deployment", "deployment", klog.KObj(d))
	dc.enqueueDeployment(d)
}

func (dc *DeploymentController) updateDeployment(old, cur interface{}) {
	oldD := old.(*apps.Deployment)
	curD := cur.(*apps.Deployment)
	klog.V(4).InfoS("Updating deployment", "deployment", klog.KObj(oldD))
	dc.enqueueDeployment(curD)
}

func (dc *DeploymentController) deleteDeployment(obj interface{}) {
	d, ok := obj.(*apps.Deployment)
	if !ok {
		tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
		if !ok {
			utilruntime.HandleError(fmt.Errorf("couldn't get object from tombstone %#v", obj))
			return
		}
		d, ok = tombstone.Obj.(*apps.Deployment)
		if !ok {
			utilruntime.HandleError(fmt.Errorf("tombstone contained object that is not a Deployment %#v", obj))
			return
		}
	}
	klog.V(4).InfoS("Deleting deployment", "deployment", klog.KObj(d))
	dc.enqueueDeployment(d)
}

enqueueDeployment() was mentioned during initialization above: dc.enqueueDeployment = dc.enqueue. Let's look at its definition and the implementation of enqueue:

// DeploymentController is responsible for synchronizing Deployment objects stored
// in the system with actual running replica sets and pods.
type DeploymentController struct {
	// To allow injection of syncDeployment for testing.
	syncHandler func(dKey string) error
	
	// used for unit testing
	enqueueDeployment func(deployment *apps.Deployment)

	// Deployments that need to be synced
	queue workqueue.RateLimitingInterface
}


func (dc *DeploymentController) enqueue(deployment *apps.Deployment) {
	key, err := controller.KeyFunc(deployment)
	if err != nil {
		utilruntime.HandleError(fmt.Errorf("couldn't get key for object %#v: %v", deployment, err))
		return
	}

	dc.queue.Add(key)
}

Nothing extra here either. For a detailed introduction to the queue, see "How Does ReplicaSet Work? (Part 1)"; it is likewise a rate-limiting queue.
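The key produced by controller.KeyFunc is client-go's "namespace/name" convention (name alone for cluster-scoped objects); that string is what actually sits in the queue, so the consumer can later split it and look the Deployment up in the lister cache. A simplified sketch of that key logic:

```go
package main

import "fmt"

// metaNamespaceKey mirrors the convention behind client-go's
// MetaNamespaceKeyFunc: "namespace/name" for namespaced objects,
// plain "name" for cluster-scoped ones. A simplified sketch, not the
// real implementation.
func metaNamespaceKey(namespace, name string) string {
	if namespace == "" {
		return name
	}
	return namespace + "/" + name
}

func main() {
	fmt.Println(metaNamespaceKey("default", "web")) // default/web
	fmt.Println(metaNamespaceKey("", "node-1"))     // node-1
}
```

Queuing the key rather than the object itself is deliberate: the worker always re-fetches the latest object from the cache, so stale copies are never processed.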

ReplicaSet Add

The handler for the ReplicaSet Add event is routine. Interestingly, the way ReplicaSetController handles Pod Add events is very similar to how DeploymentController handles ReplicaSet Add events.

First, readers who have used client-go will know that on a restart it replays all current ReplicaSets as Add events. So if a ReplicaSet is already marked for deletion (i.e. rs.DeletionTimestamp != nil), the handler simply falls through to the delete logic.

Next, for the ReplicaSet in the Add event, the handler looks up its OwnerReference (the metav1.GetControllerOf(rs) call); if the ReplicaSet's owner, some Deployment, is found, that Deployment is added to the queue.

// addReplicaSet enqueues the deployment that manages a ReplicaSet when the ReplicaSet is created.
func (dc *DeploymentController) addReplicaSet(obj interface{}) {
	rs := obj.(*apps.ReplicaSet)

	if rs.DeletionTimestamp != nil {
		// On a restart of the controller manager, it's possible for an object to
		// show up in a state that is already pending deletion.
		dc.deleteReplicaSet(rs)
		return
	}

	// If it has a ControllerRef, that's all that matters.
	if controllerRef := metav1.GetControllerOf(rs); controllerRef != nil {
		d := dc.resolveControllerRef(rs.Namespace, controllerRef)
		if d == nil {
			return
		}
		klog.V(4).InfoS("ReplicaSet added", "replicaSet", klog.KObj(rs))
		dc.enqueueDeployment(d)
		return
	}

	// Otherwise, it's an orphan. Get a list of all matching Deployments and sync
	// them to see if anyone wants to adopt it.
	ds := dc.getDeploymentsForReplicaSet(rs)
	if len(ds) == 0 {
		return
	}
	klog.V(4).InfoS("Orphan ReplicaSet added", "replicaSet", klog.KObj(rs))
	for _, d := range ds {
		dc.enqueueDeployment(d)
	}
}

Let's look closely at what happens when the ReplicaSet in the Add event has no owner, i.e. it is an unclaimed orphan: how does the handler list all matching Deployments and check whether any of them is willing to adopt it?

// getDeploymentsForReplicaSet returns a list of Deployments that potentially
// match a ReplicaSet.
func (dc *DeploymentController) getDeploymentsForReplicaSet(rs *apps.ReplicaSet) []*apps.Deployment {
	deployments, err := util.GetDeploymentsForReplicaSet(dc.dLister, rs)
	if err != nil || len(deployments) == 0 {
		return nil
	}
	// Because all ReplicaSet's belonging to a deployment should have a unique label key,
	// there should never be more than one deployment returned by the above method.
	// If that happens we should probably dynamically repair the situation by ultimately
	// trying to clean up one of the controllers, for now we just return the older one
	if len(deployments) > 1 {
		// ControllerRef will ensure we don't do anything crazy, but more than one
		// item in this list nevertheless constitutes user error.
		klog.V(4).InfoS("user error! more than one deployment is selecting replica set",
			"replicaSet", klog.KObj(rs), "labels", rs.Labels, "deployment", klog.KObj(deployments[0]))
	}
	return deployments
}

We see that GetDeploymentsForReplicaSet() is called to match every Deployment's selector against the current ReplicaSet's labels and filter the results:

  • If no Deployment matches, it really is an orphan ReplicaSet, perhaps created standalone; in any case it was not created by any existing Deployment.
  • If multiple Deployments match, a log line warns that this is a user error, and all matching Deployments are then added to the queue.
  • If exactly one Deployment matches, that is the ideal case: it adopts the ReplicaSet directly and is added to the queue.
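The matching in GetDeploymentsForReplicaSet() boils down to checking each Deployment's label selector against the ReplicaSet's labels. A simplified sketch covering only the matchLabels subset of real selector semantics (matchExpressions are omitted):

```go
package main

import "fmt"

// selectorMatches reports whether every key/value pair in the selector
// is present in the labels. This mirrors the matchLabels part of
// Kubernetes selector semantics only; a simplified sketch.
func selectorMatches(selector, labels map[string]string) bool {
	for k, v := range selector {
		if labels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	// Labels on the orphan ReplicaSet.
	rsLabels := map[string]string{"app": "web", "tier": "frontend"}

	// Hypothetical Deployments and their selectors.
	deploySelectors := map[string]map[string]string{
		"web-deploy":   {"app": "web"},
		"other-deploy": {"app": "api"},
	}
	for name, sel := range deploySelectors {
		if selectorMatches(sel, rsLabels) {
			fmt.Println("candidate owner:", name)
		}
	}
}
```

This is also why the comment above warns about overlapping selectors: two Deployments whose selectors both match the same labels would both appear as candidates.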

ReplicaSet Update

When a ReplicaSet is updated, the handler finds the Deployment that manages it and wakes that Deployment up.

  1. If the ResourceVersion is unchanged, nothing was updated and nothing is done
  2. If the OwnerReference changed, the ReplicaSet has changed owners, so the previous Deployment must also be woken up to resync
  3. If the OwnerReference is unchanged, the same Deployment still manages this ReplicaSet, and it is woken up to react to the ReplicaSet's change
  4. If there is no OwnerReference, the ReplicaSet is an orphan; check whether any Deployment would adopt it

// updateReplicaSet figures out what deployment(s) manage a ReplicaSet when the ReplicaSet
// is updated and wake them up. If the anything of the ReplicaSets have changed, we need to
// awaken both the old and new deployments. old and cur must be *apps.ReplicaSet
// types.
func (dc *DeploymentController) updateReplicaSet(old, cur interface{}) {
	curRS := cur.(*apps.ReplicaSet)
	oldRS := old.(*apps.ReplicaSet)
	if curRS.ResourceVersion == oldRS.ResourceVersion {
		// Periodic resync will send update events for all known replica sets.
		// Two different versions of the same replica set will always have different RVs.
		return
	}

	curControllerRef := metav1.GetControllerOf(curRS)
	oldControllerRef := metav1.GetControllerOf(oldRS)
	controllerRefChanged := !reflect.DeepEqual(curControllerRef, oldControllerRef)
	if controllerRefChanged && oldControllerRef != nil {
		// The ControllerRef was changed. Sync the old controller, if any.
		if d := dc.resolveControllerRef(oldRS.Namespace, oldControllerRef); d != nil {
			dc.enqueueDeployment(d)
		}
	}

	// If it has a ControllerRef, that's all that matters.
	if curControllerRef != nil {
		d := dc.resolveControllerRef(curRS.Namespace, curControllerRef)
		if d == nil {
			return
		}
		klog.V(4).InfoS("ReplicaSet updated", "replicaSet", klog.KObj(curRS))
		dc.enqueueDeployment(d)
		return
	}

	// Otherwise, it's an orphan. If anything changed, sync matching controllers
	// to see if anyone wants to adopt it now.
	labelChanged := !reflect.DeepEqual(curRS.Labels, oldRS.Labels)
	if labelChanged || controllerRefChanged {
		ds := dc.getDeploymentsForReplicaSet(curRS)
		if len(ds) == 0 {
			return
		}
		klog.V(4).InfoS("Orphan ReplicaSet updated", "replicaSet", klog.KObj(curRS))
		for _, d := range ds {
			dc.enqueueDeployment(d)
		}
	}
}

ReplicaSet Delete

Nothing special to call out here: when a ReplicaSet is deleted, the Deployment in its OwnerReference is woken up to handle it.

// deleteReplicaSet enqueues the deployment that manages a ReplicaSet when
// the ReplicaSet is deleted. obj could be an *apps.ReplicaSet, or
// a DeletionFinalStateUnknown marker item.
func (dc *DeploymentController) deleteReplicaSet(obj interface{}) {
	rs, ok := obj.(*apps.ReplicaSet)

	// When a delete is dropped, the relist will notice a pod in the store not
	// in the list, leading to the insertion of a tombstone object which contains
	// the deleted key/value. Note that this value might be stale. If the ReplicaSet
	// changed labels the new deployment will not be woken up till the periodic resync.
	if !ok {
		tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
		if !ok {
			utilruntime.HandleError(fmt.Errorf("couldn't get object from tombstone %#v", obj))
			return
		}
		rs, ok = tombstone.Obj.(*apps.ReplicaSet)
		if !ok {
			utilruntime.HandleError(fmt.Errorf("tombstone contained object that is not a ReplicaSet %#v", obj))
			return
		}
	}

	controllerRef := metav1.GetControllerOf(rs)
	if controllerRef == nil {
		// No controller should care about orphans being deleted.
		return
	}
	d := dc.resolveControllerRef(rs.Namespace, controllerRef)
	if d == nil {
		return
	}
	klog.V(4).InfoS("ReplicaSet deleted", "replicaSet", klog.KObj(rs))
	dc.enqueueDeployment(d)
}

Pod Delete

This is the one event handler unique to Deployment. In principle, Deployment is the controller of ReplicaSets and only needs to watch ReplicaSet behavior; it should not need to watch Pod deletions separately.

The comments tell us this logic is used when the Deployment's Strategy is set to Recreate. From the definition of Recreate, all existing Pods must be killed before new Pods are created.

const (
	// Kill all existing pods before creating new ones.
	RecreateDeploymentStrategyType DeploymentStrategyType = "Recreate"

	// Replace the old ReplicaSets by new one using rolling update i.e gradually scale down the old ReplicaSets and scale up the new one.
	RollingUpdateDeploymentStrategyType DeploymentStrategyType = "RollingUpdate"
)

With the default RollingUpdate strategy, the Deployment only needs to control its ReplicaSets.

// deletePod will enqueue a Recreate Deployment once all of its pods have stopped running.
func (dc *DeploymentController) deletePod(obj interface{}) {
	pod, ok := obj.(*v1.Pod)

	// When a delete is dropped, the relist will notice a pod in the store not
	// in the list, leading to the insertion of a tombstone object which contains
	// the deleted key/value. Note that this value might be stale. If the Pod
	// changed labels the new deployment will not be woken up till the periodic resync.
	if !ok {
		tombstone, ok := obj.(cache.DeletedFinalStateUnknown)
		if !ok {
			utilruntime.HandleError(fmt.Errorf("couldn't get object from tombstone %#v", obj))
			return
		}
		pod, ok = tombstone.Obj.(*v1.Pod)
		if !ok {
			utilruntime.HandleError(fmt.Errorf("tombstone contained object that is not a pod %#v", obj))
			return
		}
	}
	klog.V(4).InfoS("Pod deleted", "pod", klog.KObj(pod))
	if d := dc.getDeploymentForPod(pod); d != nil && d.Spec.Strategy.Type == apps.RecreateDeploymentStrategyType {
		// Sync if this Deployment now has no more Pods.
		rsList, err := util.ListReplicaSets(d, util.RsListFromClient(dc.client.AppsV1()))
		if err != nil {
			return
		}
		podMap, err := dc.getPodMapForDeployment(d, rsList)
		if err != nil {
			return
		}
		numPods := 0
		for _, podList := range podMap {
			numPods += len(podList)
		}
		if numPods == 0 {
			dc.enqueueDeployment(d)
		}
	}
}

On a Pod delete event, the handler traces the Pod's OwnerReference to its ReplicaSet, then traces that ReplicaSet's OwnerReference to a Deployment; if that is non-nil, the Pod is ultimately managed by a Deployment.

Then, if that Deployment's Strategy is also set to Recreate, the handler counts the Pods the Deployment still manages. Only when the count reaches zero may new Pods be created, so only then is the Deployment added to the queue for further processing.
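The gate just described reduces to a simple count across the Deployment's ReplicaSets; only a zero total triggers the enqueue. A sketch using toy stand-ins for the controller's types:

```go
package main

import "fmt"

// podCount sums the pods still tracked for each ReplicaSet of a
// Deployment, mirroring the numPods tally in deletePod. The map is a
// toy stand-in for the controller's podMap (ReplicaSet -> pod names).
func podCount(podMap map[string][]string) int {
	n := 0
	for _, pods := range podMap {
		n += len(pods)
	}
	return n
}

// shouldEnqueueRecreate mirrors the gate in deletePod: a Recreate
// Deployment is only resynced once all of its pods are gone.
func shouldEnqueueRecreate(strategy string, podMap map[string][]string) bool {
	return strategy == "Recreate" && podCount(podMap) == 0
}

func main() {
	podMap := map[string][]string{"rs-old": {"pod-a"}}
	fmt.Println(shouldEnqueueRecreate("Recreate", podMap)) // false: a pod remains

	delete(podMap, "rs-old")
	fmt.Println(shouldEnqueueRecreate("Recreate", podMap)) // true: all pods gone
}
```

This is why the handler ignores Pod deletes for RollingUpdate Deployments: rolling updates are driven purely through ReplicaSet scaling, and no pods-reached-zero condition is ever needed.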

Summary

The Deployment controller mainly does the following:

  1. Watch for changes: the Deployment controller registers a Deployment informer, a ReplicaSet informer, and a Pod informer in controller-manager, which deliver Deployment, ReplicaSet, and Pod events respectively. In pkg/controller/deployment/deployment_controller.go, NewDeploymentController is initialized with these three informers to watch Deployment, ReplicaSet, and Pod events, covering ADD, UPDATE, and DELETE. Pod is special in that only DELETE is watched, for 3+3+1=7 event handlers in total.
  2. React to changes: once the Deployment controller receives a change notification, it puts the change onto the queue for processing.
  3. Sync state: covered in the next part.

