sriov-network-device-plugin code analysis

Code location

GitHub - k8snetworkplumbingwg/sriov-network-device-plugin: SRIOV network device plugin for Kubernetes
https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin.git

Main flow (see main() in main.go)

  1. Read the plugin configuration (/etc/pcidp/config.json) and initialize the service handle

    type resourceManager struct {
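      cliParams // system (command-line) parameters
      pluginWatchMode bool // plugin mode, determined via the directory '/var/lib/kubelet/plugins_registry'; the plugin registration mode is the one currently in use
      rFactory        types.ResourceFactory // factory interface that wraps the main service calls
      configList      []*types.ResourceConfig // parsed contents of /etc/pcidp/config.json; supplies the config for step 3
      resourceServers []types.ResourceServer // step 3 initializes each service as a ResourceServer and appends it here; used later to start the resource services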
      deviceProviders map[types.DeviceType]types.DeviceProvider
    }
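  2. Collect the host's PCI information and group it by class (currently two types: NetDeviceType and AcceleratorType)

    Devices are filtered through NetDeviceType.AddTargetDevices and AcceleratorType.AddTargetDevices
  3. Service initialization

    Filter the devices according to the configuration file, initialize each matching set as a resource service, and expose the interface below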

    type ResourceFactory interface {
      GetResourceServer(ResourcePool) (ResourceServer, error)
      GetDefaultInfoProvider(string, string) DeviceInfoProvider
      GetSelector(string, []string) (DeviceSelector, error)
      GetResourcePool(rc *ResourceConfig, deviceList []PciDevice) (ResourcePool, error)
      GetRdmaSpec(string) RdmaSpec
      GetDeviceProvider(DeviceType) DeviceProvider
      GetDeviceFilter(*ResourceConfig) (interface{}, error)
      GetNadUtils() NadUtils
    }
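  4. Start the services

    Each resource service creates its own sock file under /var/lib/kubelet/plugins_registry
    The service handle is:
    type resourceServer struct {
      resourcePool       types.ResourcePool // resource pool served by this server
      pluginWatch        bool // plugin watch mode enabled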
      endPoint           string // Socket file
      sockPath           string // Socket file path
      resourceNamePrefix string
      grpcServer         *grpc.Server
      termSignal         chan bool
      updateSignal       chan bool
      stopWatcher        chan bool
      checkIntervals     int // health check intervals in seconds
    }
  5. Wait for the termination signal
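
Steps 1 and 3 above (parse the configuration, then filter the discovered devices per resource) can be sketched roughly as follows. This is a simplified illustration, not the plugin's code: the types `resourceConfig`, `resourceConfList` and `pciDevice`, the helper functions, and the reduced set of selectors (`vendors`, `drivers`) are all assumptions made for the example; the real types live in the plugin's `types` package and carry more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resourceConfig is a hypothetical, reduced stand-in for types.ResourceConfig.
type resourceConfig struct {
	ResourceName string `json:"resourceName"`
	Selectors    struct {
		Vendors []string `json:"vendors"`
		Drivers []string `json:"drivers"`
	} `json:"selectors"`
}

// resourceConfList models the top-level "resourceList" key assumed here.
type resourceConfList struct {
	ResourceList []resourceConfig `json:"resourceList"`
}

// pciDevice is a hypothetical, minimal description of a discovered device.
type pciDevice struct {
	Address, Vendor, Driver string
}

// parseConfig decodes the raw JSON configuration (step 1).
func parseConfig(raw []byte) ([]resourceConfig, error) {
	var list resourceConfList
	if err := json.Unmarshal(raw, &list); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	return list.ResourceList, nil
}

// contains reports whether want is in set; an empty set matches everything.
func contains(set []string, want string) bool {
	if len(set) == 0 {
		return true
	}
	for _, s := range set {
		if s == want {
			return true
		}
	}
	return false
}

// selectDevices keeps the devices that satisfy every selector of rc (step 3).
func selectDevices(rc resourceConfig, devs []pciDevice) []pciDevice {
	var out []pciDevice
	for _, d := range devs {
		if contains(rc.Selectors.Vendors, d.Vendor) && contains(rc.Selectors.Drivers, d.Driver) {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	// Inline sample config; the plugin would read it from /etc/pcidp/config.json.
	raw := []byte(`{"resourceList":[{"resourceName":"sriov_net","selectors":{"vendors":["8086"],"drivers":["iavf"]}}]}`)
	configs, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	// Made-up device list standing in for the PCI scan of step 2.
	devs := []pciDevice{
		{"0000:03:02.0", "8086", "iavf"},
		{"0000:03:02.1", "8086", "vfio-pci"},
		{"0000:04:00.0", "15b3", "mlx5_core"},
	}
	for _, rc := range configs {
		fmt.Println(rc.ResourceName, len(selectDevices(rc, devs)))
	}
}
```

Each resulting device set would then back one resource server, which is what step 4 starts.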

Kubelet interfaces

deviceplugin:
• registerapi:  k8s.io/kubelet/pkg/apis/pluginregistration/v1 
  • Transport: gRPC
  • PROTO: k8s.io/kubelet/pkg/apis/pluginregistration/v1/api.proto
• pluginapi:  k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1
  • Transport: gRPC
  • PROTO: k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1/api.proto

Service flow

Main logic: server.go

type resourceServer struct {
  resourcePool       types.ResourcePool // resource pool interface
  pluginWatch        bool  // plugin watch mode enabled
  endPoint           string // Socket file (kubelet)
  sockPath           string // Socket file path (resourceServer)
  resourceNamePrefix string // resource name prefix
  grpcServer         *grpc.Server // gRPC server
  termSignal         chan bool
  updateSignal       chan bool
  stopWatcher        chan bool
  checkIntervals     int // health check interval in seconds (default: 20)
}
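  1. Get the service's resource name and save the devices in the resource pool under /var/run/k8s.cni.cncf.io/devinfo/dp/

  2. Listen on the service's own socket under /var/lib/kubelet/plugins_registry/ (gRPC flow; transport: unix)

  3. Register the service (gRPC flow)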

    pluginapi.RegisterDevicePluginServer(rs.grpcServer, rs)

This interface is implemented by the resource server and called by kubelet:
type DevicePluginServer interface {
  GetDevicePluginOptions(context.Context, *Empty) (*DevicePluginOptions, error)
  ListAndWatch(*Empty, DevicePlugin_ListAndWatchServer) error
  Allocate(context.Context, *AllocateRequest) (*AllocateResponse, error)
  PreStartContainer(context.Context, *PreStartContainerRequest) (*PreStartContainerResponse, error)
}
  4. Start the service (gRPC flow)

Kubelet events

ListAndWatch

Allocate

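Kubelet interface testing

Debugging

Questions

  • How are resources defined on a node?

Advertise Extended Resources for a Node | Kubernetes

  • How are resources assigned to a container?

Assign Extended Resources to a Container | Kubernetes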
