Implementation Analysis and Source-Code Walkthrough of the Reflector Mechanism in the Informer

SRExianxian

Last modified: 2024-08-01 09:46:33

Category: kubernetes. Tags: kubernetes, golang

First published: 2024-07-31 16:15:52

Article link: https://blog.csdn.net/xsw164711368/article/details/140824957

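1. Background

In the informer mechanism (see the figure below), the reflector plays an important role. This article looks at how the reflector works and how it is implemented, starting from the source code.

[Figure: informer architecture diagram]

1.1 The role of the reflector

As the figure above shows, the reflector is the link between two sides: upstream it talks to the apiserver to fetch data, and downstream it talks to the DeltaFIFO to store what it fetched. Specifically, it does two things:

1. Through the list-watch mechanism, it fetches the full set of objects of a resource from the apiserver and then watches the apiserver for change events (Add/Update/Delete) on that resource.
2. It adds those Add/Update/Delete events to the DeltaFIFO queue.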

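2. reflector.go source analysis

The reflector source is located at k8s.io/client-go/tools/cache/reflector.go

2.1 The Reflector struct

The most important fields are store and listerWatcher (of type ListerWatcher), used to store events and to fetch events, respectively.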

// Reflector watches a specified resource and causes all changes to be reflected in the given store.
type Reflector struct {
	// The destination to sync up with the watch source
	store Store  				// stores the objects/events obtained through the listerWatcher
	// listerWatcher is used to perform lists and watches.
	listerWatcher ListerWatcher  		// full list of the resource's objects, plus watch for incremental events
	......
}

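2.2 The reflector constructor

The constructor is simple: NewReflector delegates to NewNamedReflector, which fills in the struct fields.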

func NewReflector(lw ListerWatcher, expectedType interface{}, store Store, resyncPeriod time.Duration) *Reflector {
	return NewNamedReflector(naming.GetNameFromCallsite(internalPackages...), lw, expectedType, store, resyncPeriod)
}

// NewNamedReflector same as NewReflector, but with a specified name for logging
func NewNamedReflector(name string, lw ListerWatcher, expectedType interface{}, store Store, resyncPeriod time.Duration) *Reflector {
	r := &Reflector{
		name:          name,
		listerWatcher: lw,
		store:         store,
		expectedType:  reflect.TypeOf(expectedType),
		period:        time.Second,			// retry interval: if listing fails, wait this long before trying again
		resyncPeriod:  resyncPeriod,
		clock:         &clock.RealClock{},
	}
	return r
}

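2.3 Starting the reflector

Once created, the reflector must be started before it can keep receiving events from the apiserver. After it starts, it runs in a loop until it receives the stop signal on stopCh.

The key part of Run() is the call to r.ListAndWatch(stopCh), which we look at in detail next.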

// Run starts a watch and handles watch events. Will restart the watch if it is closed.
// Run will exit when stopCh is closed.
func (r *Reflector) Run(stopCh <-chan struct{}) {

	wait.Until(func() { 				// keep looping until stopCh signals a stop
		if err := r.ListAndWatch(stopCh); err != nil {	// start the List/Watch mechanism
			utilruntime.HandleError(err)
		}
	}, r.period, stopCh)
}
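
To make the looping behaviour of Run() concrete, here is a minimal sketch that uses only the Go standard library. The helper `until` is a hypothetical, simplified stand-in for wait.Until (it is not the client-go implementation), and `runLoop` plays the role of Run() with a counter in place of ListAndWatch.

```go
package main

import (
	"fmt"
	"time"
)

// until is a simplified, hypothetical stand-in for wait.Until: it calls f,
// waits for period, and repeats until stopCh is closed.
func until(f func(), period time.Duration, stopCh <-chan struct{}) {
	for {
		// Check for a stop request before every run.
		select {
		case <-stopCh:
			return
		default:
		}
		f()
		// Wait for the period to elapse, or return early on stop.
		select {
		case <-stopCh:
			return
		case <-time.After(period):
		}
	}
}

// runLoop mimics Reflector.Run: the function passed to until stands in for
// ListAndWatch. It closes stopCh after maxRuns invocations and returns how
// many times the function was called.
func runLoop(maxRuns int) int {
	stopCh := make(chan struct{})
	runs := 0
	until(func() {
		runs++
		if runs == maxRuns {
			close(stopCh)
		}
	}, time.Millisecond, stopCh)
	return runs
}

func main() {
	fmt.Println("ListAndWatch invocations:", runLoop(3))
}
```

The point of the sketch is that a return from ListAndWatch does not end the reflector: the loop simply waits for the period and calls it again, and only a closed stopCh ends it.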

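2.4 ListAndWatch: listing and watching a resource

The function is long, but it does two core things. First, it calls r.listerWatcher.List(options) to fetch the full set of objects and records the resource version. Then it calls r.listerWatcher.Watch(options) to watch for changes that happen after that version; the changes are delivered through the chan Event returned by ResultChan() of watch.Interface. Finally, r.watchHandler(w, &resourceVersion, resyncerrc, stopCh) checks each event's type (Added/Modified/Deleted) and updates the data in the store accordingly.

ResourceVersion is an important field. Every resource in Kubernetes has it, and it identifies the version of the resource object. Each time the object is modified, the Kubernetes API server changes its ResourceVersion, so that client-go can tell from the ResourceVersion during a Watch whether the object has changed.

Take fetching all Pods as an example: what is fetched is controlled by the ResourceVersion parameter in options. If ResourceVersion is "0", all Pod data is fetched (as the source comment in the listing notes, this List may be served from cache). If ResourceVersion is non-zero, fetching continues from that resource version. Even if the network is interrupted while data is being received, the client can later tell from the resource version what it has already received, much like resuming an interrupted file download.

// ListAndWatch first lists all items and get the resource version at the moment of call,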
// and then use the resource version to watch.
// It returns error if ListAndWatch didn't even try to initialize watch.
func (r *Reflector) ListAndWatch(stopCh <-chan struct{}) error {
	
	var resourceVersion string

	// Explicitly set "0" as resource version - it's fine for the List()
	// to be served from cache and potentially be delayed relative to
	// etcd contents. Reflector framework will catch up via Watch() eventually.
	// when listing a resource, first set ResourceVersion to "0"
	options := metav1.ListOptions{ResourceVersion: "0"}

	if err := func() error {
		initTrace := trace.New("Reflector ListAndWatch", trace.Field{"name", r.name})
		defer initTrace.LogIfLong(10 * time.Second)
		var list runtime.Object
		var err error
		listCh := make(chan struct{}, 1)
		panicCh := make(chan interface{}, 1)
		go func() {
			defer func() {
				if r := recover(); r != nil {
					panicCh <- r
				}
			}()
			// Attempt to gather list in chunks, if supported by listerWatcher, if not, the first
			// list request will return the full response.
			pager := pager.New(pager.SimplePageFunc(func(opts metav1.ListOptions) (runtime.Object, error) {
				// fetch the full data of the resource, i.e. all objects of that resource
				return r.listerWatcher.List(opts)
			}))
			if r.WatchListPageSize != 0 {
				pager.PageSize = r.WatchListPageSize
			}
			// Pager falls back to full list if paginated list calls fail due to an "Expired" error.
			list, err = pager.List(context.Background(), options)
			close(listCh)
		}()
		select {
		case <-stopCh:
			return nil
		case r := <-panicCh:
			panic(r)
		// wait for r.listerWatcher.List(opts) to finish
		case <-listCh:
		}
		if err != nil {
			return fmt.Errorf("%s: Failed to list %v: %v", r.name, r.expectedType, err)
		}
		initTrace.Step("Objects listed")
		listMetaInterface, err := meta.ListAccessor(list)
		if err != nil {
			return fmt.Errorf("%s: Unable to understand list result %#v: %v", r.name, list, err)
		}
		// extract the resource version
		resourceVersion = listMetaInterface.GetResourceVersion()
		initTrace.Step("Resource version extracted")
		// meta.ExtractList converts the list (a runtime.Object) into a slice of objects ([]runtime.Object)
		items, err := meta.ExtractList(list)
		if err != nil {
			return fmt.Errorf("%s: Unable to understand list result %#v (%v)", r.name, list, err)
		}
		initTrace.Step("Objects extracted")
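		// store the full set of listed objects in the store (DeltaFIFO), replacing what already exists; syncWith does this via the store's Replace(), and the data ultimately ends up in a threadSafeMap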
		if err := r.syncWith(items, resourceVersion); err != nil {
			return fmt.Errorf("%s: Unable to sync list result: %v", r.name, err)
		}
		initTrace.Step("SyncWith done")
		// once the data has been stored, update the resource version
		r.setLastSyncResourceVersion(resourceVersion)
		initTrace.Step("Resource version updated")
		return nil
	}(); err != nil {
		return err
	}

	// resync logic
	resyncerrc := make(chan error, 1)
	cancelCh := make(chan struct{})
	defer close(cancelCh)
	go func() {
		resyncCh, cleanup := r.resyncChan()
		defer func() {
			cleanup() // Call the last one written into cleanup
		}()
		for {
			select {
			case <-resyncCh:
			case <-stopCh:
				return
			case <-cancelCh:
				return
			}
			if r.ShouldResync == nil || r.ShouldResync() {
				klog.V(4).Infof("%s: forcing resync", r.name)
				if err := r.store.Resync(); err != nil {
					resyncerrc <- err
					return
				}
			}
			cleanup()
			resyncCh, cleanup = r.resyncChan()
		}
	}()

	for {
		// give the stopCh a chance to stop the loop, even in case of continue statements further down on errors
		select {
		case <-stopCh:
			return nil
		default:
		}

		timeoutSeconds := int64(minWatchTimeout.Seconds() * (rand.Float64() + 1.0))
		options = metav1.ListOptions{
			ResourceVersion: resourceVersion,
			// We want to avoid situations of hanging watchers. Stop any wachers that do not
			// receive any events within the timeout window.
			TimeoutSeconds: &timeoutSeconds,
			// To reduce load on kube-apiserver on watch restarts, you may enable watch bookmarks.
			// Reflector doesn't assume bookmarks are returned at all (if the server do not support
			// watch bookmarks, it will ignore this field).
			AllowWatchBookmarks: true,
		}
		// create a watcher; the events it receives are delivered through the chan Event returned by ResultChan() of watch.Interface
		w, err := r.listerWatcher.Watch(options)
		if err != nil {
			switch err {
			case io.EOF:
				// watch closed normally
			case io.ErrUnexpectedEOF:
				klog.V(1).Infof("%s: Watch for %v closed with unexpected EOF: %v", r.name, r.expectedType, err)
			default:
				utilruntime.HandleError(fmt.Errorf("%s: Failed to watch %v: %v", r.name, r.expectedType, err))
			}
			// If this is "connection refused" error, it means that most likely apiserver is not responsive.
			// It doesn't make sense to re-list all objects because most likely we will be able to restart
			// watch where we ended.
			// If that's the case wait and resend watch request.
			if utilnet.IsConnectionRefused(err) {
				time.Sleep(time.Second)
				continue
			}
			return nil
		}
		// handle the incremental change events from the watch and put them into the DeltaFIFO
		if err := r.watchHandler(w, &resourceVersion, resyncerrc, stopCh); err != nil {
			if err != errorStopRequested {
				switch {
				case apierrs.IsResourceExpired(err):
					klog.V(4).Infof("%s: watch of %v ended with: %v", r.name, r.expectedType, err)
				default:
					klog.Warningf("%s: watch of %v ended with: %v", r.name, r.expectedType, err)
				}
			}
			return nil
		}
	}
}
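
The list-then-watch pattern and the "resume from a resource version" idea can be sketched without any Kubernetes dependency. Everything in the example is hypothetical and simplified: `fakeServer` stands in for the apiserver, versions are plain integers, and `watch` returns a slice instead of a streaming channel.

```go
package main

import "fmt"

// event is a hypothetical, simplified change record: the version at which the
// change happened plus the key of the affected object.
type event struct {
	version int
	key     string
}

// fakeServer is a toy stand-in for the apiserver that keeps an ordered change log.
type fakeServer struct {
	log []event
}

// list returns every key seen so far together with the latest version,
// which plays the role of the list's resourceVersion.
func (s *fakeServer) list() ([]string, int) {
	keys := make([]string, 0, len(s.log))
	version := 0
	for _, e := range s.log {
		keys = append(keys, e.key)
		version = e.version
	}
	return keys, version
}

// watch returns only the changes that happened after the given version.
func (s *fakeServer) watch(after int) []event {
	var out []event
	for _, e := range s.log {
		if e.version > after {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	srv := &fakeServer{log: []event{{1, "pod-a"}, {2, "pod-b"}}}

	// Step 1: full list, remembering the version it corresponds to.
	keys, rv := srv.list()
	fmt.Println("listed:", keys, "resourceVersion:", rv)

	// Changes happen while the client is not watching.
	srv.log = append(srv.log, event{3, "pod-c"}, event{4, "pod-d"})

	// Step 2: watch from the remembered version; only newer changes arrive.
	for _, e := range srv.watch(rv) {
		rv = e.version // mirrors *resourceVersion = newResourceVersion
		fmt.Println("event:", e.key, "resourceVersion:", rv)
	}
}
```

Because the client always carries the last version it processed, a restarted watch picks up where the previous one stopped instead of re-listing everything.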

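2.5 The watchHandler method

watchHandler takes events from the chan Event returned by ResultChan(), checks the event type (Added/Modified/Deleted), and calls the corresponding method of the Store interface to update the data in the DeltaFIFO.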

// watchHandler watches w and keeps *resourceVersion up to date.
func (r *Reflector) watchHandler(w watch.Interface, resourceVersion *string, errc chan error, stopCh <-chan struct{}) error {
	start := r.clock.Now()
	eventCount := 0

	// Stopping the watcher should be idempotent and if we return from this function there's no way
	// we're coming back in with the same watch interface.
	defer w.Stop()

loop:
	for {
		select {
		case <-stopCh:
			return errorStopRequested
		case err := <-errc:
			return err
		// take an event from ResultChan, check its type (Added/Modified/Deleted), and call the matching Store method to update the data in the DeltaFIFO
		case event, ok := <-w.ResultChan():
			if !ok {
				break loop
			}
			if event.Type == watch.Error {
				return apierrs.FromObject(event.Object)
			}
			if e, a := r.expectedType, reflect.TypeOf(event.Object); e != nil && e != a {
				utilruntime.HandleError(fmt.Errorf("%s: expected type %v, but watch event object had type %v", r.name, e, a))
				continue
			}
			meta, err := meta.Accessor(event.Object)
			if err != nil {
				utilruntime.HandleError(fmt.Errorf("%s: unable to understand watch event %#v", r.name, event))
				continue
			}
			// get the resource version carried by the new event
			newResourceVersion := meta.GetResourceVersion()
			switch event.Type {
			// an Added event was received
			case watch.Added:
				// put the object into the store
				err := r.store.Add(event.Object)
				if err != nil {
					utilruntime.HandleError(fmt.Errorf("%s: unable to add watch event object (%#v) to store: %v", r.name, event.Object, err))
				}
			// a Modified event was received
			case watch.Modified:
				err := r.store.Update(event.Object)
				if err != nil {
					utilruntime.HandleError(fmt.Errorf("%s: unable to update watch event object (%#v) to store: %v", r.name, event.Object, err))
				}
			// a Deleted event was received
			case watch.Deleted:
				// TODO: Will any consumers need access to the "last known
				// state", which is passed in event.Object? If so, may need
				// to change this.
				err := r.store.Delete(event.Object)
				if err != nil {
					utilruntime.HandleError(fmt.Errorf("%s: unable to delete watch event object (%#v) from store: %v", r.name, event.Object, err))
				}
			case watch.Bookmark:
				// A `Bookmark` means watch has synced here, just update the resourceVersion
			default:
				utilruntime.HandleError(fmt.Errorf("%s: unable to understand watch event %#v", r.name, event))
			}
			// update LastSyncResourceVersion
			*resourceVersion = newResourceVersion
			r.setLastSyncResourceVersion(newResourceVersion)
			eventCount++
		}
	}

	watchDuration := r.clock.Since(start)
	if watchDuration < 1*time.Second && eventCount == 0 {
		return fmt.Errorf("very short watch: %s: Unexpected watch close - watch lasted less than a second and no items received", r.name)
	}
	klog.V(4).Infof("%s: Watch close - %v total %v items received", r.name, r.expectedType, eventCount)
	return nil
}
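
The dispatch logic of watchHandler can be illustrated with a small standalone sketch. All types here (`eventType`, `object`, `event`, `mapStore`) are hypothetical simplifications, and a plain map stands in for the Store; error handling, type checks and the stop channel are left out.

```go
package main

import "fmt"

type eventType string

const (
	added    eventType = "ADDED"
	modified eventType = "MODIFIED"
	deleted  eventType = "DELETED"
	bookmark eventType = "BOOKMARK"
)

// object is a hypothetical stand-in for a runtime.Object with metadata.
type object struct {
	name            string
	resourceVersion string
}

type event struct {
	typ eventType
	obj object
}

// mapStore is a toy store keyed by object name.
type mapStore map[string]object

// handle drains ch until it is closed, applies each event to the store, and
// returns the resource version of the last event it processed.
func handle(ch <-chan event, store mapStore) string {
	last := ""
	for ev := range ch {
		switch ev.typ {
		case added, modified:
			store[ev.obj.name] = ev.obj
		case deleted:
			delete(store, ev.obj.name)
		case bookmark:
			// A bookmark carries no object change, only a newer version.
		default:
			fmt.Println("unable to understand event type:", ev.typ)
		}
		// As in watchHandler, the version is advanced after every event.
		last = ev.obj.resourceVersion
	}
	return last
}

func main() {
	ch := make(chan event, 4)
	ch <- event{added, object{"pod-a", "10"}}
	ch <- event{modified, object{"pod-a", "11"}}
	ch <- event{added, object{"pod-b", "12"}}
	ch <- event{deleted, object{"pod-b", "13"}}
	close(ch)

	store := mapStore{}
	rv := handle(ch, store)
	fmt.Println("objects in store:", len(store), "last resourceVersion:", rv)
}
```

Note that a Bookmark event changes nothing in the store but still moves the resource version forward, which is what lets a later watch restart from a more recent point.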

The overall flow is roughly as follows:

[Figure: flowchart]

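3. listwatch.go source analysis

As shown above, the reflector relies heavily on the list-watch mechanism. Next, let's look at how that mechanism is implemented at the source level.

The listwatch.go source is located at vendor/k8s.io/client-go/tools/cache/listwatch.go

From the analysis above, the reflector calls the List() and Watch() methods of the ListerWatcher interface, so we start from that interface.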

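3.1 The watch.Interface interface

The listing first shows the ListerWatcher, Lister and Watcher interfaces from listwatch.go. watch.go then defines the watch.Interface interface, which has two methods: Stop() stops the watch, and ResultChan() returns the chan Event through which events are delivered.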

// ListerWatcher is any object that knows how to perform an initial list and start a watch on a resource.
type ListerWatcher interface {  // ListerWatcher combines the Lister and Watcher interfaces
	Lister
	Watcher
}


// Lister is any object that knows how to perform an initial list.
type Lister interface {
	// List should return a list type object; the Items field will be extracted, and the
	// ResourceVersion field will be used to start the watch in the right place.
	List(options metav1.ListOptions) (runtime.Object, error)           // List() returns the list of objects
}

// Watcher is any object that knows how to start a watch on a resource.
type Watcher interface {
	// Watch should begin a watch at the specified version.
	Watch(options metav1.ListOptions) (watch.Interface, error)    // Watch() returns a watch.Interface
}



// Interface can be implemented by anything that knows how to watch and report changes.
type Interface interface {
	// Stops watching. Will close the channel returned by ResultChan(). Releases
	// any resources used by the watch.
	Stop()

	// Returns a chan which will receive all the events. If an error occurs
	// or Stop() is called, this channel will be closed, in which case the
	// watch should be completely cleaned up.
	ResultChan() <-chan Event                          // watch.Interface delivers the events it receives through this chan Event
}

3.2 The emptyWatch type

The emptyWatch type implements watch.Interface.


// emptyWatch is defined as a chan Event, i.e. a channel that carries Events
type emptyWatch chan Event


// emptyWatch defines both Stop() and ResultChan(), and thereby implements watch.Interface

// Stop implements Interface
func (w emptyWatch) Stop() {
}

// ResultChan implements Interface
func (w emptyWatch) ResultChan() <-chan Event {
	return chan Event(w)  // return the channel that carries events
}


// NewEmptyWatch returns a watch interface that returns no results and is closed.
// May be used in certain error conditions where no information is available but
// an error is not warranted.
func NewEmptyWatch() Interface {
	ch := make(chan Event)
	close(ch)
	return emptyWatch(ch)
}
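
NewEmptyWatch returns a channel that is already closed. The sketch below, with the hypothetical names `newClosedWatch` and `drain`, shows why that works: a receive on a closed channel returns immediately with ok == false, so a consumer loop like the one in watchHandler exits at once instead of blocking.

```go
package main

import "fmt"

// event is a hypothetical stand-in for watch.Event.
type event struct{ kind string }

// newClosedWatch mirrors the idea of NewEmptyWatch: the channel is created
// and closed before it is handed to the consumer.
func newClosedWatch() <-chan event {
	ch := make(chan event)
	close(ch)
	return ch
}

// drain counts the events it receives until the channel is closed.
func drain(ch <-chan event) int {
	n := 0
	for {
		_, ok := <-ch
		if !ok {
			// Same condition that triggers "break loop" in watchHandler.
			return n
		}
		n++
	}
}

func main() {
	fmt.Println("events received:", drain(newClosedWatch()))
}
```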

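3.3 Definition of Event

An Event has two parts: the event type (Added, Modified, Deleted, Bookmark or Error) and a runtime object. Put simply, it says what happened to which resource object.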

// EventType defines the possible types of events.
type EventType string

const (
	Added    EventType = "ADDED"
	Modified EventType = "MODIFIED"
	Deleted  EventType = "DELETED"
	Bookmark EventType = "BOOKMARK"
	Error    EventType = "ERROR"

	DefaultChanSize int32 = 100
)

// Event represents a single event to a watched resource.
// +k8s:deepcopy-gen=true
type Event struct {
	Type EventType

	// Object is:
	//  * If Type is Added or Modified: the new state of the object.
	//  * If Type is Deleted: the state of the object immediately before deletion.
	//  * If Type is Bookmark: the object (instance of a type being watched) where
	//    only ResourceVersion field is set. On successful restart of watch from a
	//    bookmark resourceVersion, client is guaranteed to not get repeat event
	//    nor miss any events.
	//  * If Type is Error: *api.Status is recommended; other types may make sense
	//    depending on context.
	Object runtime.Object
}

3.4 The ListWatch struct

The ListWatch struct implements the ListerWatcher interface.

The watch mechanism relies on HTTP chunked transfer, which is discussed in more detail later.

// ListWatch knows how to list and watch a set of apiserver resources.  It satisfies the ListerWatcher interface.
// It is a convenience function for users of NewReflector, etc.
// ListFunc and WatchFunc must not be nil
type ListWatch struct {			// definition of the ListWatch struct
	ListFunc  ListFunc		// function used to list the full set of objects
	WatchFunc WatchFunc		// function used to watch incremental events
	// DisableChunking requests no chunking for this list watcher.
	DisableChunking bool
}


// ListWatch defines the List() and Watch() methods, and thereby implements the ListerWatcher interface
// List a set of apiserver resources
func (lw *ListWatch) List(options metav1.ListOptions) (runtime.Object, error) {
	if !lw.DisableChunking {
		return pager.New(pager.SimplePageFunc(lw.ListFunc)).List(context.TODO(), options)
	}
	return lw.ListFunc(options)
}

// Watch a set of apiserver resources
func (lw *ListWatch) Watch(options metav1.ListOptions) (watch.Interface, error) {
	return lw.WatchFunc(options)
}


// ListFunc knows how to list resources
type ListFunc func(options metav1.ListOptions) (runtime.Object, error)		// definition of the ListFunc type

// WatchFunc knows how to watch resources
type WatchFunc func(options metav1.ListOptions) (watch.Interface, error)		// definition of the WatchFunc type
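
The design of ListWatch, a struct holding two function fields that together satisfy an interface, makes it easy to plug in fakes. The following is a minimal standalone sketch of that pattern; the lowercase types and `newFakeListWatch` are hypothetical simplifications that drop ListOptions and use strings in place of runtime.Object and watch.Interface.

```go
package main

import "fmt"

// lister and watcher are simplified, hypothetical versions of the Lister and
// Watcher interfaces.
type lister interface {
	List() ([]string, error)
}

type watcher interface {
	Watch() (<-chan string, error)
}

// listerWatcher is the combination of both, like ListerWatcher.
type listerWatcher interface {
	lister
	watcher
}

// listWatch satisfies listerWatcher by delegating to its two function fields.
type listWatch struct {
	listFunc  func() ([]string, error)
	watchFunc func() (<-chan string, error)
}

func (lw *listWatch) List() ([]string, error)       { return lw.listFunc() }
func (lw *listWatch) Watch() (<-chan string, error) { return lw.watchFunc() }

// Compile-time check that *listWatch implements listerWatcher.
var _ listerWatcher = (*listWatch)(nil)

// newFakeListWatch builds a listWatch backed by in-memory data.
func newFakeListWatch(items, changes []string) *listWatch {
	return &listWatch{
		listFunc: func() ([]string, error) { return items, nil },
		watchFunc: func() (<-chan string, error) {
			ch := make(chan string, len(changes))
			for _, c := range changes {
				ch <- c
			}
			close(ch)
			return ch, nil
		},
	}
}

func main() {
	var lw listerWatcher = newFakeListWatch([]string{"pod-a"}, []string{"pod-b added"})
	items, _ := lw.List()
	fmt.Println("list:", items)
	ch, _ := lw.Watch()
	for c := range ch {
		fmt.Println("watch:", c)
	}
}
```

A consumer such as the reflector only sees the interface, so the same code path works whether the functions talk to a real apiserver or to test data.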

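The ListWatch constructor takes a resource name (resource), a namespace (namespace), a field selector (fieldSelector), and a Getter.
The Getter is used to send HTTP requests to the apiserver and is covered below.

The Watch() call inside watchFunc establishes a long-lived HTTP connection to the apiserver and receives the resource change events that the apiserver sends. Watch is implemented with HTTP chunked transfer encoding.
When client-go calls the apiserver, the apiserver sets Transfer-Encoding to chunked in the HTTP response header, indicating chunked transfer encoding. After receiving this, the client keeps the connection to the server open and waits for the next chunk (that is, the next resource event). For a detailed introduction to chunked transfer, see this post: https://juejin.cn/post/7133865158304071694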
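
The chunked streaming idea can be tried out with the Go standard library alone. The sketch below is not how client-go or the apiserver is implemented; it assumes a hypothetical `watchEvent` JSON shape and a test server that flushes one event per chunk, while the client decodes events from the response body as they arrive.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// watchEvent is a hypothetical, simplified event payload.
type watchEvent struct {
	Type string `json:"type"`
	Name string `json:"name"`
}

// streamHandler writes one JSON event at a time and flushes after each, so
// the response is streamed without a Content-Length.
func streamHandler(events []watchEvent) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		enc := json.NewEncoder(w)
		for _, ev := range events {
			if err := enc.Encode(ev); err != nil {
				return
			}
			flusher.Flush()
		}
	}
}

// consume reads events from the stream until the server ends the response.
func consume(url string) ([]watchEvent, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	fmt.Println("Transfer-Encoding:", resp.TransferEncoding)

	dec := json.NewDecoder(resp.Body)
	var got []watchEvent
	for {
		var ev watchEvent
		if err := dec.Decode(&ev); err == io.EOF {
			return got, nil
		} else if err != nil {
			return got, err
		}
		got = append(got, ev)
	}
}

// demo starts a local test server and consumes its event stream.
func demo() ([]watchEvent, error) {
	events := []watchEvent{
		{Type: "ADDED", Name: "pod-a"},
		{Type: "MODIFIED", Name: "pod-a"},
	}
	srv := httptest.NewServer(streamHandler(events))
	defer srv.Close()
	return consume(srv.URL)
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, ev := range got {
		fmt.Println(ev.Type, ev.Name)
	}
}
```

In a real watch the server keeps the response open and writes a new chunk whenever a resource changes; here the handler returns after two events so the example terminates.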


// NewListWatchFromClient creates a new ListWatch from the specified client, resource, namespace and field selector.
func NewListWatchFromClient(c Getter, resource string, namespace string, fieldSelector fields.Selector) *ListWatch {
	optionsModifier := func(options *metav1.ListOptions) {
		options.FieldSelector = fieldSelector.String()
	}
	return NewFilteredListWatchFromClient(c, resource, namespace, optionsModifier)  		// this actually calls NewFilteredListWatchFromClient()
}

// NewFilteredListWatchFromClient creates a new ListWatch from the specified client, resource, namespace, and option modifier.
// Option modifier is a function takes a ListOptions and modifies the consumed ListOptions. Provide customized modifier function
// to apply modification to ListOptions with a field selector, a label selector, or any other desired options.
func NewFilteredListWatchFromClient(c Getter, resource string, namespace string, optionsModifier func(options *metav1.ListOptions)) *ListWatch {
	listFunc := func(options metav1.ListOptions) (runtime.Object, error) {			// initialize the listFunc function
		optionsModifier(&options)
		return c.Get().
			Namespace(namespace).
			Resource(resource).
			VersionedParams(&options, metav1.ParameterCodec).
			Do().
			Get()					// call the apiserver's GET API through the REST client
	}
	watchFunc := func(options metav1.ListOptions) (watch.Interface, error) {		// initialize the watchFunc function
		options.Watch = true
		optionsModifier(&options)
		return c.Get().
			Namespace(namespace).
			Resource(resource).
			VersionedParams(&options, metav1.ParameterCodec).
			Watch()
	}							// call the apiserver's watch API through the REST client
	return &ListWatch{ListFunc: listFunc, WatchFunc: watchFunc}
}

3.5 Getter

The Getter interface provides a Get() method that returns a *restclient.Request, which is used to issue RESTful API requests.

// Getter interface knows how to access Get method from RESTClient.
type Getter interface {
	Get() *restclient.Request
}

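4. Summary

4.1 How the reflector runs

(1) After the informer starts, it calls the reflector's Run() function.
(2) Once running, the reflector executes r.ListAndWatch(stopCh) in a loop until it receives the stop signal on stopCh.
(3) It then calls r.listerWatcher.List(opts) to fetch all objects of the resource.
(4) It then calls items, err := meta.ExtractList(list); meta.ExtractList converts the returned list (a runtime.Object) into a slice of objects ([]runtime.Object).
(5) It then calls r.syncWith(items, resourceVersion), which stores the full set of listed objects in the store (DeltaFIFO) and replaces existing objects via Replace(); looking further into syncWith shows that the data ultimately ends up in a threadSafeMap.
(6) It then calls r.listerWatcher.Watch(options) to create a watcher; the events it receives are delivered through the chan Event returned by ResultChan() of watch.Interface.
(7) It then calls r.watchHandler(w, &resourceVersion, resyncerrc, stopCh), which takes resource events from the chan Event and puts them into the DeltaFIFO.

4.2 listwatch call flow in the source

(1) Listing resource objects: Reflector -> ListerWatcher interface -> ListFunc -> Getter interface -> restclient.Request (Do().Get())
(2) Watching resource change events: Reflector -> ListerWatcher interface -> WatchFunc -> Getter interface -> restclient.Request (Watch())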