client-go util: workqueue

Latest recommended article published on 2024-08-18 10:24:25

来自万古的忧伤

Categories: Kubernetes source code analysis; client-go source code analysis. Tags: fifo, kubernetes, client-go

Article link: https://blog.csdn.net/weixin_45413603/article/details/108047999

This article is a set of reading notes on the book 《Kubernetes 源码剖析》 (Kubernetes Source Code Analysis). Book overview: http://www.broadview.com.cn/book/6104

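1. Introduction

workqueue is the work queue of client-go. Compared with an ordinary FIFO (first-in, first-out) queue, the Kubernetes workqueue is somewhat more complex to implement; its main features are marking and deduplication.
Queue features

Ordered	Items are processed in the order in which they were added.
Deduplication	The same item is never processed more than once at the same time; for example, an item that is added several times before it is processed is processed only once.
Concurrency	Multiple producers and multiple consumers.
Marking	Tracks whether an item is being processed, and allows an item to be re-enqueued while it is being processed.
Notification	The ShutDown method signals the queue to stop accepting new items and notifies the metrics goroutine to exit.
Delay	Supports a delaying queue, which puts an item into the queue after a delay.
Rate limiting	Supports a rate-limiting queue, which rate-limits items as they are enqueued and limits how often an item is re-enqueued.
Metrics	Exposes metrics that can be used for Prometheus monitoring.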

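workqueue supports three kinds of queues and provides three interfaces; the different implementations suit different scenarios, as described below.

Interface	FIFO queue interface: a first-in, first-out queue with deduplication.
DelayingInterface	Delaying queue interface: wraps Interface and puts an item into the queue after a delay.
RateLimitingInterface	Rate-limiting queue interface: wraps DelayingInterface and rate-limits items as they are enqueued.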

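2. FIFO queue

(1) Data structures

// Interface is the standard method set of the FIFO queue.
type Interface interface {
	// Add adds item to the queue.
	Add(item interface{})
	// Len returns the length of the queue.
	Len() int
	// Get returns the item at the head of the queue and reports whether the queue has been shut down.
	Get() (item interface{}, shutdown bool)
	// Done marks item as finished processing.
	Done(item interface{})
	// ShutDown shuts the queue down.
	ShutDown()
	// ShuttingDown reports whether the queue is shutting down.
	ShuttingDown() bool
}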

func NewNamed(name string) *Type {
	rc := clock.RealClock{}
	return newQueue(
		rc,
		globalMetricsFactory.newQueueMetrics(name, rc),
		defaultUnfinishedWorkUpdatePeriod,
	)
}
func newQueue(c clock.Clock, metrics queueMetrics, updatePeriod time.Duration) *Type {
	t := &Type{
		clock:                      c,
		dirty:                      set{},
		processing:                 set{},
		cond:                       sync.NewCond(&sync.Mutex{}),
		metrics:                    metrics,
		unfinishedWorkUpdatePeriod: updatePeriod,
	}
	// Start a goroutine that periodically refreshes the unfinished-work metrics; it exits once the queue is shutting down.
	go t.updateUnfinishedWorkLoop()
	return t
}
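type Type struct {
	// queue defines the order in which items are processed. Every element of queue should be in the dirty set and not in the processing set.
	queue []t
	// dirty holds all of the items that need to be processed.
	dirty set
	// processing holds the items that are currently being processed. These items may be in the dirty set at the same time. When we finish processing an item and remove it from this set, we check whether it is in the dirty set and, if so, add it to the queue.
	processing set
	cond *sync.Cond
	shuttingDown bool
	// metrics exposes metrics about the queue.
	metrics queueMetrics
	// unfinishedWorkUpdatePeriod is how often the unfinished-work metrics are refreshed.
	unfinishedWorkUpdatePeriod time.Duration
	clock                      clock.Clock
}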

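(2) Workflow

The most important fields in the FIFO queue's data structure are queue, dirty and processing. queue is a slice that actually stores the items and keeps them in order. dirty is critical: it provides deduplication and guarantees that an item is processed only once even if it is added many times (under concurrency). processing implements the marking mechanism: it records whether an item is currently being processed. The source code is easiest to follow with these workqueue features in mind.
[Figure]

Add the items 1, 2 and 3 to the FIFO queue with the Add method. The queue and dirty fields now each hold 1, 2 and 3, and processing is empty. The Get method then returns the first item in, which is 1. The queue and dirty fields now each hold 2 and 3, and item 1 is placed in processing to mark that it is being processed. When processing is finished, the Done method marks the item as done and 1 is removed from processing.

As shown in the figure below, the FIFO queue normally runs in a concurrent setting. How does it guarantee, under high concurrency, that an item added many times before it is processed is still processed only once?
[Figure]
Suppose goroutine A obtains item 1 with Get, so item 1 is added to processing. At the same time goroutine B adds another item 1 with Add. Because the same item is already in processing, the second 1 is not appended to queue directly; it is only recorded in dirty. At this point dirty holds 1, 2 and 3, and processing holds 1. When goroutine A marks the item as done with Done, the queue checks dirty; because dirty contains 1, item 1 is appended to the tail of queue. Note that dirty and processing are both implemented as hash maps, so they do not need to preserve order; they only need to guarantee uniqueness.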
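The walkthrough above can be replayed with a small, single-goroutine model. This is only a sketch under stated assumptions: miniQueue and its methods are hypothetical names, the items are plain ints, and locking, metrics and shutdown handling are left out on purpose.

```go
package main

import "fmt"

// miniQueue is a hypothetical, lock-free model of the queue, dirty and
// processing fields. It is not the client-go implementation.
type miniQueue struct {
	queue      []int
	dirty      map[int]struct{}
	processing map[int]struct{}
}

func newMiniQueue() *miniQueue {
	return &miniQueue{dirty: map[int]struct{}{}, processing: map[int]struct{}{}}
}

func (q *miniQueue) Add(item int) {
	if _, ok := q.dirty[item]; ok {
		return // already waiting to be processed: deduplicated
	}
	q.dirty[item] = struct{}{}
	if _, ok := q.processing[item]; ok {
		return // currently being processed: Done will re-enqueue it
	}
	q.queue = append(q.queue, item)
}

// Get assumes the queue is not empty; the real Get blocks instead.
func (q *miniQueue) Get() int {
	item := q.queue[0]
	q.queue = q.queue[1:]
	q.processing[item] = struct{}{}
	delete(q.dirty, item)
	return item
}

func (q *miniQueue) Done(item int) {
	delete(q.processing, item)
	if _, ok := q.dirty[item]; ok {
		q.queue = append(q.queue, item)
	}
}

func main() {
	q := newMiniQueue()
	q.Add(1)
	q.Add(2)
	q.Add(3)
	item := q.Get() // item 1 moves from queue/dirty to processing
	q.Add(1)        // recorded in dirty only, because 1 is being processed
	q.Add(1)        // dropped, because 1 is already dirty
	fmt.Println(item, q.queue) // 1 [2 3]
	q.Done(item)               // 1 is dirty, so it goes to the tail of queue
	fmt.Println(q.queue)       // [2 3 1]
}
```

However many times item 1 is added while it is being processed, it is queued again exactly once, and only after Done is called.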

(3) Implementation

type empty struct{}
type t interface{}
type set map[t]empty
// set is effectively map[interface{}]struct{}.
// has reports whether item is in the set.
func (s set) has(item t) bool {
	_, exists := s[item]
	return exists
}
// insert adds item to the set.
func (s set) insert(item t) {
	s[item] = empty{}
}
// delete removes item from the set.
func (s set) delete(item t) {
	delete(s, item)
}
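func (q *Type) Add(item interface{}) {
	// Hold the lock so that Add is safe for concurrent use.
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	// Ignore new items once the queue is shutting down.
	if q.shuttingDown {
		return
	}
	// If the item is already in the dirty set it is waiting to be processed, so return (deduplication).
	if q.dirty.has(item) {
		return
	}
	// Update the metrics.
	q.metrics.add(item)
	// Record the item in the dirty set.
	q.dirty.insert(item)
	// If the item is currently being processed, return; Done will re-enqueue it.
	if q.processing.has(item) {
		return
	}
	// Append the item to the queue.
	q.queue = append(q.queue, item)
	// Signal wakes one goroutine waiting on the condition variable, if there is any. The caller is allowed, but not required, to hold c.L during the call; here the lock is held.
	q.cond.Signal()
}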
// Len returns the current length of the queue.
func (q *Type) Len() int {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	return len(q.queue)
}
func (q *Type) Get() (item interface{}, shutdown bool) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	// While the queue is empty and not shutting down, wait for an item to be added.
	for len(q.queue) == 0 && !q.shuttingDown {
		q.cond.Wait()
	}
	// The loop only exits when the queue is non-empty or shutting down, so an empty queue here means shutdown: return shutdown=true.
	if len(q.queue) == 0 {
		// We must be shutting down.
		return nil, true
	}
	// Take the element at index 0 and remove it from the queue (first in, first out).
	item, q.queue = q.queue[0], q.queue[1:]
	// Update the metrics for this Get.
	q.metrics.get(item)
	// Mark the item as being processed.
	q.processing.insert(item)
	// Remove the item from the dirty set.
	q.dirty.delete(item)
	return item, false
}
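// Done marks item as done processing, and if it has been marked as dirty again while it was being processed, it will be re-added to the queue for re-processing.
func (q *Type) Done(item interface{}) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	// Update the metrics for the completed item.
	q.metrics.done(item)
	// Processing has finished, so remove the item from the processing set.
	q.processing.delete(item)
	// Check whether the item was added again while it was being processed.
	if q.dirty.has(item) {
		// It is still in the dirty set, so put it back on the queue.
		q.queue = append(q.queue, item)
		// Wake one goroutine that is waiting on the queue.
		q.cond.Signal()
	}
}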
// ShutDown will cause q to ignore all new items added to it. As soon as the worker goroutines have drained the existing items in the queue, they will be instructed to exit.
func (q *Type) ShutDown() {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	q.shuttingDown = true
	q.cond.Broadcast()
}

func (q *Type) ShuttingDown() bool {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	// Return shuttingDown, i.e. whether the queue is currently shutting down.
	return q.shuttingDown
}
// updateUnfinishedWorkLoop periodically refreshes the unfinished-work metrics. It is started as a goroutine when the queue is created and exits once the queue is shutting down.
func (q *Type) updateUnfinishedWorkLoop() {
	// Create a ticker that fires every unfinishedWorkUpdatePeriod.
	t := q.clock.NewTicker(q.unfinishedWorkUpdatePeriod)
	defer t.Stop()
	// Run one iteration per tick.
	for range t.C() {
		// The anonymous function updates the metrics under the lock and returns false once the queue is shutting down, which ends the loop.
		if !func() bool {
			q.cond.L.Lock()
			defer q.cond.L.Unlock()
			if !q.shuttingDown {
				q.metrics.updateUnfinishedWork()
				return true
			}
			return false

		}() {
			return
		}
	}
}
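The blocking behaviour of Get and the effect of ShutDown can be tried out without client-go. The sketch below is an assumption-laden miniature: condQueue and drain are hypothetical names, items are strings, and the dirty and processing sets are omitted so that only the sync.Cond pattern remains.

```go
package main

import (
	"fmt"
	"sync"
)

// condQueue is a hypothetical queue that keeps only the sync.Cond handling
// of the FIFO queue: Signal on Add, Wait in Get, Broadcast on ShutDown.
type condQueue struct {
	cond         *sync.Cond
	items        []string
	shuttingDown bool
}

func newCondQueue() *condQueue {
	return &condQueue{cond: sync.NewCond(&sync.Mutex{})}
}

func (q *condQueue) Add(item string) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	if q.shuttingDown {
		return // new items are ignored after ShutDown
	}
	q.items = append(q.items, item)
	q.cond.Signal()
}

func (q *condQueue) Get() (string, bool) {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	for len(q.items) == 0 && !q.shuttingDown {
		q.cond.Wait()
	}
	if len(q.items) == 0 {
		return "", true // empty and shutting down
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item, false
}

func (q *condQueue) ShutDown() {
	q.cond.L.Lock()
	defer q.cond.L.Unlock()
	q.shuttingDown = true
	q.cond.Broadcast()
}

// drain is a worker loop: it handles items until Get reports shutdown.
func drain(q *condQueue) []string {
	var got []string
	for {
		item, shutdown := q.Get()
		if shutdown {
			return got
		}
		got = append(got, item)
	}
}

func main() {
	q := newCondQueue()
	done := make(chan []string)
	go func() { done <- drain(q) }()
	q.Add("a")
	q.Add("b")
	q.ShutDown()
	q.Add("c")          // ignored: the queue is shutting down
	fmt.Println(<-done) // [a b]
}
```

Items that were queued before ShutDown are still handed out; the worker sees shutdown=true only after the queue has been drained.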

3. Delaying queue

(1) Data structures

The delaying queue wraps the FIFO queue interface and adds the AddAfter method to the existing functionality. It works by inserting the item into the FIFO queue after a delay. The data structures are as follows.

// DelayingInterface is an Interface that can Add an item at a later time. This makes it easier to requeue items after failures without ending up in a hot-loop.
// It is a new layer wrapped around the FIFO queue.
type DelayingInterface interface {
	Interface
	// AddAfter adds an item to the workqueue after the indicated duration has passed
	AddAfter(item interface{}, duration time.Duration)
}
type delayingType struct {
	Interface
	// clock tracks time for delayed firing
	clock clock.Clock
	// stopCh lets us signal a shutdown to the waiting loop
	stopCh chan struct{}
	// stopOnce guarantees we only signal shutdown a single time
	stopOnce sync.Once
	// heartbeat ensures we wait no more than maxWait before firing
	heartbeat clock.Ticker

	// waitingForAddCh is a buffered channel that feeds waitingForAdd
	waitingForAddCh chan *waitFor
	// metrics counts the number of retries
	metrics retryMetrics
}
type waitFor struct {
	data    t
	readyAt time.Time
	// index in the priority queue (heap)
	index int
}
// waitForPriorityQueue implements a priority queue for waitFor items.
// waitForPriorityQueue implements heap.Interface.
// The item occurring next in time (i.e., the item with the smallest readyAt) is at the root (index 0).
// Peek returns this minimum item at index 0. Pop returns the minimum item after it has been removed from the queue and placed at index Len()-1 by container/heap.
// Push adds an item at index Len(), and container/heap percolates it into the correct location.
type waitForPriorityQueue []*waitFor

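(2) Workflow

AddAfter takes an item and a duration (the delay), which specifies how long to wait before the item is inserted into the FIFO queue. If duration is less than or equal to 0, the item is inserted into the FIFO queue immediately.
The most important field of delayingType is waitingForAddCh, a buffered channel whose default capacity is 1000. AddAfter does not block while the buffer has room; it blocks only once 1000 entries are waiting in the channel. The entries in waitingForAddCh are consumed by the waitingLoop function, which runs in its own goroutine for the lifetime of the queue.
[Figure]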

(3) Implementation

// AddAfter adds the given item to the work queue after the given delay
func (q *delayingType) AddAfter(item interface{}, duration time.Duration) {
	// don't add if we're already shutting down
	if q.ShuttingDown() {
		return
	}

	q.metrics.retry()

	// immediately add things with no delay
	if duration <= 0 {
		q.Add(item)
		return
	}
	select {
	case <-q.stopCh:
		// unblock if ShutDown() is called
	case q.waitingForAddCh <- &waitFor{data: item, readyAt: q.clock.Now().Add(duration)}:
	}
}
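// waitingLoop is started as a goroutine when the delaying queue is created.
// It runs until the workqueue is shut down and keeps a check on the list of items to be added.
func (q *delayingType) waitingLoop() {
	defer utilruntime.HandleCrash()

	// Make a placeholder channel to use when there are no items in our list
	never := make(<-chan time.Time)

	waitingForQueue := &waitForPriorityQueue{}
	// heap.Init establishes the heap invariants required by the other routines in the heap package.
	// It is idempotent with respect to the heap invariants and may be called whenever they may have been invalidated.
	// The complexity is O(n), where n = h.Len().
	heap.Init(waitingForQueue)

	waitingEntryByData := map[t]*waitFor{}

	for {
		if q.Interface.ShuttingDown() {
			return
		}

		now := q.clock.Now()

		// Add ready entries
		for waitingForQueue.Len() > 0 {
			// Look at the first (earliest) waitFor entry.
			entry := waitingForQueue.Peek().(*waitFor)
			if entry.readyAt.After(now) {
				break
			}
			// heap.Pop removes and returns the minimum element (according to Less) from the heap.
			// The complexity is O(log n), where n = h.Len().
			// Pop is equivalent to Remove(h, 0).
			entry = heap.Pop(waitingForQueue).(*waitFor)
			// Add the item to the queue and delete it from the map.
			q.Add(entry.data)
			delete(waitingEntryByData, entry.data)
		}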

		// Set up a wait for the first item's readyAt (if one exists)
		nextReadyAt := never
		if waitingForQueue.Len() > 0 {
			entry := waitingForQueue.Peek().(*waitFor)
			nextReadyAt = q.clock.After(entry.readyAt.Sub(now))
		}

		select {
		case <-q.stopCh:
			return

		case <-q.heartbeat.C():
			// continue the loop, which will add ready items

		case <-nextReadyAt:
			// continue the loop, which will add ready items

		// A new entry arrived on waitingForAddCh.
		case waitEntry := <-q.waitingForAddCh:
			// readyAt is still in the future, so the entry has to wait.
			if waitEntry.readyAt.After(q.clock.Now()) {
				// Insert it into the priority queue.
				insert(waitingForQueue, waitingEntryByData, waitEntry)
			} else {
				// No delay is left, so add it to the queue directly.
				q.Add(waitEntry.data)
			}

			drained := false
			for !drained {
				select {
				case waitEntry := <-q.waitingForAddCh:
					if waitEntry.readyAt.After(q.clock.Now()) {
						insert(waitingForQueue, waitingEntryByData, waitEntry)
					} else {
						q.Add(waitEntry.data)
					}
				default:
					drained = true
				}
			}
		}
	}
}

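4. Rate-limiting queue

The rate-limiting queue wraps the delaying queue and FIFO queue interfaces. On top of the existing functionality, the rate-limiting queue interface (RateLimitingInterface) adds the AddRateLimited, Forget and NumRequeues methods. The key part is not RateLimitingInterface itself but the rate limiter interface (RateLimiter), for which four rate-limiting algorithms are provided. The principle is that the rate-limiting queue uses the delaying queue to postpone the insertion of an item, which achieves rate limiting.

type RateLimiter interface {
	// When gets an item and gets to decide how long that item should wait
	// In other words, it returns how long the given item should wait.
	When(item interface{}) time.Duration
	// Forget indicates that an item is finished being retried.  Doesn't matter whether its for perm failing
	// or for success, we'll stop tracking it
	// It releases the given item and clears its requeue count.
	Forget(item interface{})
	// NumRequeues returns back how many failures the item has had
	// In other words, it returns the requeue count of the given item.
	NumRequeues(item interface{}) int
}

Note: there is a very important concept here, the rate-limiting cycle. A rate-limiting cycle is the time from the call to AddRateLimited until Forget has finished. Once Forget has been called for an item, its requeue count is cleared.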

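(1) Token bucket algorithm

The token bucket algorithm is implemented with the third-party Go library golang.org/x/time/rate. The algorithm maintains a "bucket" that holds tokens. Tokens are added to the bucket at a fixed rate until it is full, and any extra tokens are discarded. (The book describes the bucket as empty at the start; the limiter in golang.org/x/time/rate is documented as starting full.) Every item takes one token from the bucket. Only an item that gets a token is allowed through (accepted); an item that does not get a token has to wait. The token bucket algorithm limits the rate by controlling how tokens are handed out.
[Figure]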

// DefaultControllerRateLimiter is a no-arg constructor for the default rate limiter of a workqueue. It has both overall and per-item rate limiting: the overall limiter is a token bucket and the per-item limiter is exponential.
func DefaultControllerRateLimiter() RateLimiter {
	return NewMaxOfRateLimiter(
		NewItemExponentialFailureRateLimiter(5*time.Millisecond, 1000*time.Second),
		// 10 qps, 100 bucket size.  This is only for retry speed and its only the overall factor (not per item)
		// That is, 10 tokens are added per second and the bucket holds at most 100.
		&BucketRateLimiter{Limiter: rate.NewLimiter(rate.Limit(10), 100)},
	)
}

// BucketRateLimiter adapts a standard token bucket to the workqueue RateLimiter API.
// This is the token bucket limiter.
type BucketRateLimiter struct {
	*rate.Limiter
}

var _ RateLimiter = &BucketRateLimiter{}

func (r *BucketRateLimiter) When(item interface{}) time.Duration {
	return r.Limiter.Reserve().Delay()
}

func (r *BucketRateLimiter) NumRequeues(item interface{}) int {
	return 0
}

func (r *BucketRateLimiter) Forget(item interface{}) {
}

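(2) Per-item exponential backoff algorithm

The exponential backoff algorithm uses the requeue count of an item as the exponent: as the requeue count grows, the delay grows exponentially, but it never exceeds maxDelay. The requeue count is tracked per rate-limiting cycle, which is the time from the call to AddRateLimited until Forget has finished. Once Forget has been called for an item, its requeue count is cleared.

// ItemExponentialFailureRateLimiter is the per-item exponential failure rate limiter.
// It implements the exponential backoff algorithm.
type ItemExponentialFailureRateLimiter struct {
	failuresLock sync.Mutex
	// failures is the number of times each item has been requeued.
	failures     map[interface{}]int
	// baseDelay is the initial delay (5ms in the default limiter).
	baseDelay time.Duration
	// maxDelay is the upper bound on the delay (1000s in the default limiter).
	maxDelay  time.Duration
}
// Within one rate-limiting cycle, an item that has not been seen before is delayed by baseDelay; if the same item is added again within the cycle, its delay grows exponentially, up to a maximum of maxDelay.
// Assume baseDelay is 1*time.Millisecond and
// maxDelay is 1000*time.Second.
// Suppose the same item is added 10 times with AddRateLimited within one rate-limiting cycle.
// The first time, the item is inserted through the delaying queue's AddAfter method with a delay of 1ms (baseDelay).
// The second time the delay is 2ms, the third time 4ms, the fourth time 8ms,
// the fifth time 16ms, and so on; the tenth time the delay is 512ms. The delay never exceeds 1000s (maxDelay).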
func (r *ItemExponentialFailureRateLimiter) When(item interface{}) time.Duration {
	r.failuresLock.Lock()
	defer r.failuresLock.Unlock()
	// failures holds the requeue count of the item; it is incremented each time the item is added through AddRateLimited.
	exp := r.failures[item]
	r.failures[item] = r.failures[item] + 1

	// The backoff is capped such that 'calculated' value never overflows.

	backoff := float64(r.baseDelay.Nanoseconds()) * math.Pow(2, float64(exp))
	if backoff > math.MaxInt64 {
		return r.maxDelay
	}
	// If the computed delay exceeds maxDelay, return maxDelay; with the default limiter the longest delay is therefore 1000s.
	calculated := time.Duration(backoff)
	if calculated > r.maxDelay {
		return r.maxDelay
	}

	return calculated
}
func (r *ItemExponentialFailureRateLimiter) Forget(item interface{}) {
	r.failuresLock.Lock()
	defer r.failuresLock.Unlock()

	delete(r.failures, item)
}

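(3) Counter algorithm

The counter algorithm is the simplest of the rate-limiting algorithms. Its principle is to limit the number of items allowed through in a period of time, for example at most 100 items per minute. Each inserted item increments the counter by 1; once the counter reaches the threshold of 100 while still inside the rate-limiting cycle, no more items are allowed through.

In workqueue the idea takes the form of a fast and a slow retry rate: the first maxFastAttempts attempts for an item use fastDelay, and later attempts use slowDelay.

type ItemFastSlowRateLimiter struct {
	failuresLock sync.Mutex
	// failures is the requeue count of each item.
	failures     map[interface{}]int
	// maxFastAttempts controls when to switch from the fast rate to the slow rate.
	maxFastAttempts int
	// fastDelay and slowDelay are the fast and slow delays.
	fastDelay       time.Duration
	slowDelay       time.Duration
}
// Assume fastDelay is 5*time.Millisecond, slowDelay is 10*time.Second and maxFastAttempts is 3.
// If the same item is added 4 times with AddRateLimited within one rate-limiting cycle, the first 3 insertions use the fast rate defined by fastDelay.
// Once maxFastAttempts is exceeded, the 4th insertion uses the slow rate defined by slowDelay.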
func (r *ItemFastSlowRateLimiter) When(item interface{}) time.Duration {
	r.failuresLock.Lock()
	defer r.failuresLock.Unlock()

	r.failures[item] = r.failures[item] + 1

	if r.failures[item] <= r.maxFastAttempts {
		return r.fastDelay
	}

	return r.slowDelay
}

func (r *ItemFastSlowRateLimiter) NumRequeues(item interface{}) int {
	r.failuresLock.Lock()
	defer r.failuresLock.Unlock()

	return r.failures[item]
}

func (r *ItemFastSlowRateLimiter) Forget(item interface{}) {
	r.failuresLock.Lock()
	defer r.failuresLock.Unlock()

	delete(r.failures, item)
}

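(4) Hybrid mode

Hybrid mode simply combines several rate limiters. MaxOfRateLimiter holds a list of limiters: NumRequeues returns the largest requeue count reported by any of them, and Forget is forwarded to all of them.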

type MaxOfRateLimiter struct {
	limiters []RateLimiter
}

func NewMaxOfRateLimiter(limiters ...RateLimiter) RateLimiter {
	return &MaxOfRateLimiter{limiters: limiters}
}

func (r *MaxOfRateLimiter) NumRequeues(item interface{}) int {
	ret := 0
	for _, limiter := range r.limiters {
		curr := limiter.NumRequeues(item)
		if curr > ret {
			ret = curr
		}
	}

	return ret
}

func (r *MaxOfRateLimiter) Forget(item interface{}) {
	for _, limiter := range r.limiters {
		limiter.Forget(item)
	}
}