OpenShift 4 - Using KubeletConfig and ContainerRuntimeConfig to Modify the Kubelet and CRI-O Configuration of Cluster Nodes

Note: this article was verified in an OpenShift 4.6 environment.

Kubelet, KubeletConfig, and the Kubelet Config Controller

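As described in the article "OpenShift 4 - How to Use the Machine Config Operator to Modify the CoreOS Configuration of Cluster Nodes", the Machine Config Controller in OpenShift includes a sub-component named the Kubelet Config Controller. This component accepts CRD-based KubeletConfig objects and applies them to the Kubelet on the matching nodes. In other words, the Kubelet on the nodes of an OpenShift 4 cluster is configured through this component.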

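During installation of an OpenShift cluster, a default Kubelet configuration is supplied through Ignition. After installation, we can modify the configuration used by a node's Kubelet and thereby change the Kubelet's runtime parameters.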

Viewing a Node's Kubelet Configuration

  1. Run the following command to list the worker nodes in the cluster.
$ oc get node -l node-role.kubernetes.io/worker
NAME                                              STATUS   ROLES    AGE     VERSION
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready    worker   3h32m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready    worker   3h32m   v1.19.0+d59ce34
  1. Inspect one of the worker Node objects. The "Allocatable" section reflects part of the configuration used by that node's Kubelet; "pods: 250" is the maximum number of pods this Kubelet can run.
$ oc describe node <WORKER_NODE> | grep Allocatable -A7
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         15500m
  ephemeral-storage:           114381692328
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      63991700Ki
  pods:                        250
  1. List the MachineConfig objects in the cluster. The output shows that "01-worker-kubelet" is the one that sets up the Kubelet on worker nodes. Also confirm that there are currently only 2 objects named "99-worker-XXXXX" and 2 named "rendered-worker-XXXXX".
$ oc get machineconfig
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                                   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
00-worker                                                   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
01-master-container-runtime                                 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
01-master-kubelet                                           0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
01-worker-container-runtime                                 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
01-worker-kubelet                                           0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
99-master-ea3d87d1-a5df-4137-a3a3-849915e40cdd-registries   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
99-master-ssh                                                                                          3.1.0             6h9m
99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-registries   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             6h
99-worker-ssh                                                                                          3.1.0             6h9m
rendered-master-2613a048ee6bb4b27621cfff3c44a676            99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             6h
rendered-master-c5bfe43313bf45eb9abd3e8422421b6d            0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             3h30m
rendered-worker-46a5c3ba1b88f2b312aa349e71f4a0fa            0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             3h30m
rendered-worker-c74c310336d86b894dec0c5b49743ebd            99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             6h
  1. Inspect the machineconfig named 01-worker-kubelet and confirm that the kubelet.conf file is located at "/etc/kubernetes/kubelet.conf".
$ oc describe machineconfig 01-worker-kubelet | grep '\--config'
      --config=/etc/kubernetes/kubelet.conf \
  1. Run the following command to open a debug shell on a worker node.
$ oc debug node/<WORKER-NODE-NAME>
Starting pod/ip-10-0-150-145ap-southeast-1computeinternal-debug ...
To use host binaries, run `chroot /host`
  1. Inside the node, view the contents of kubelet.conf. It contains "maxPods: 250", which matches the "pods: 250" value seen in the Allocatable output earlier.
sh-4.4# chroot /host
sh-4.4# more /etc/kubernetes/kubelet.conf
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  x509:
    clientCAFile: /etc/kubernetes/kubelet-ca.crt
  anonymous:
    enabled: false
cgroupDriver: systemd
cgroupRoot: /
clusterDNS:
  - 172.30.0.10
clusterDomain: cluster.local
containerLogMaxSize: 50Mi
maxPods: 250
kubeAPIQPS: 50
kubeAPIBurst: 100
rotateCertificates: true
serializeImagePulls: false
staticPodPath: /etc/kubernetes/manifests
systemCgroups: /system.slice
systemReserved:
  cpu: 500m
  memory: 1Gi
  ephemeral-storage: 1Gi
featureGates:
  LegacyNodeRoleBehavior: false
  NodeDisruptionExclusion: true
  RotateKubeletServerCertificate: true
  SCTPSupport: true
  ServiceNodeExclusion: true
  SupportPodPidsLimit: true
serverTLSBootstrap: true
  1. Exit from the worker node (once to leave the chroot, once to end the debug pod).
sh-4.4# exit
sh-4.4# exit
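
The cross-check performed in the steps above (maxPods in kubelet.conf versus pods under Allocatable) can also be sketched in Python. This is only an illustration under stated assumptions: the two sample strings are abridged copies of the outputs shown above, and the helper names top_level_scalars and allocatable are made up for this example rather than part of any OpenShift tooling.

```python
# Sketch: cross-check maxPods in kubelet.conf against the node's Allocatable pods.
# The sample strings are abridged copies of the outputs shown in this article.

KUBELET_CONF = """\
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
maxPods: 250
systemReserved:
  cpu: 500m
  memory: 1Gi
"""

DESCRIBE_NODE = """\
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         15500m
  memory:                      63991700Ki
  pods:                        250
"""


def top_level_scalars(text):
    """Return top-level 'key: value' pairs; nested (indented) lines are skipped."""
    result = {}
    for line in text.splitlines():
        if not line or line[0].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if value.strip():
            result[key.strip()] = value.strip()
    return result


def allocatable(text):
    """Return the indented entries that follow the 'Allocatable:' header."""
    result = {}
    in_block = False
    for line in text.splitlines():
        if line.startswith("Allocatable:"):
            in_block = True
            continue
        if in_block:
            if not line.startswith(" "):
                break  # the block ends at the next unindented line
            key, _, value = line.strip().partition(":")
            result[key] = value.strip()
    return result


max_pods = int(top_level_scalars(KUBELET_CONF)["maxPods"])
alloc_pods = int(allocatable(DESCRIBE_NODE)["pods"])
print(f"kubelet.conf maxPods={max_pods}, Allocatable pods={alloc_pods}")
```

The line-based parsing is deliberately naive and only handles the flat layout shown here; for real files a YAML parser would be the safer choice.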

Modifying a Node's Kubelet Configuration

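  1. Create a file named change-maxPods-cr.yaml with the following content. The KubeletConfig object sets the Kubelet's maxPods to 500, and it takes effect only for nodes in a MachineConfigPool labeled "custom-kubelet: large-pods".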
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-max-pods
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: large-pods
  kubeletConfig:
    maxPods: 500
  1. View the machineconfigpool named worker and confirm that it does not have the "custom-kubelet=large-pods" label.
$ oc get machineconfigpool worker --show-labels
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE   LABELS
worker   rendered-worker-1e4ff665b30dd1099a34d6e636654353   True      False      False      2              2                   2                     0                      8h    machineconfiguration.openshift.io/mco-built-in=
  1. Add the "custom-kubelet=large-pods" label to the worker machineconfigpool.
$ oc label machineconfigpool worker custom-kubelet=large-pods
  1. Create the KubeletConfig object.
$ oc create -f change-maxPods-cr.yaml
$ oc get kubeletconfig
NAME           AGE
set-max-pods   7s
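  1. Watch the update status of the worker nodes (for example with "oc get node -l node-role.kubernetes.io/worker -w"). OpenShift updates the Kubelet configuration of the 2 worker nodes one at a time. While a node is being updated it is temporarily in the "SchedulingDisabled" state so that no new pods are scheduled onto it. In the end both worker nodes return to a plain "Ready" state.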
NAME                                              STATUS                        ROLES    AGE     VERSION
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready                         worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready,SchedulingDisabled      worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     NotReady,SchedulingDisabled   worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     NotReady,SchedulingDisabled   worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready,SchedulingDisabled      worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready,SchedulingDisabled      worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready                         worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready                         worker   7h19m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready                         worker   7h19m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      worker   7h19m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      worker   7h19m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready                         worker   7h20m   v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal     Ready                         worker   7h20m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      worker   7h20m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      worker   7h20m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      worker   7h20m   v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal   Ready                         worker   7h21m   v1.19.0+d59ce34
  1. List the machineconfig objects again and confirm that there are now 3 objects named "99-worker-XXXXXX" and 3 named "rendered-worker-XXXXX". Judging by AGE, "99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-kubelet" and "rendered-worker-1e4ff665b30dd1099a34d6e636654353" are the newly created ones; the former contains the Kubelet configuration change for worker nodes.
$ oc get machineconfig
NAME                                                        GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                                   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
00-worker                                                   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
01-master-container-runtime                                 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
01-master-kubelet                                           0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
01-worker-container-runtime                                 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
01-worker-kubelet                                           0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
99-master-ea3d87d1-a5df-4137-a3a3-849915e40cdd-registries   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
99-master-ssh                                                                                          3.1.0             8h
99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-kubelet      0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             30m
99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-registries   0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             8h
99-worker-ssh                                                                                          3.1.0             8h
rendered-master-2613a048ee6bb4b27621cfff3c44a676            99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             8h
rendered-master-c5bfe43313bf45eb9abd3e8422421b6d            0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             5h35m
rendered-worker-1e4ff665b30dd1099a34d6e636654353            0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             30m
rendered-worker-46a5c3ba1b88f2b312aa349e71f4a0fa            0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2   3.1.0             5h35m
rendered-worker-c74c310336d86b894dec0c5b49743ebd            99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             8h
  1. Check the status of the kubeletconfig object named set-max-pods; "type: Success" indicates that it was applied successfully.
$ oc get kubeletconfig set-max-pods -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2020-12-05T15:37:47Z"
    message: Success
    status: "True"
    type: Success
  1. Check the worker node's Kubelet configuration again; it now shows "pods: 500".
$ oc describe node <WORKER_NODE> | grep Allocatable -A7
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         15500m
  ephemeral-storage:           114381692328
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      63991700Ki
  pods:                        500
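
To make the role of the label step clearer, here is a small Python sketch that builds the same manifest as change-maxPods-cr.yaml and checks whether its machineConfigPoolSelector would select the worker pool before and after labeling. The function names build_kubelet_config and selector_matches are hypothetical helpers for this illustration; the matching logic is the usual Kubernetes matchLabels rule (every listed key/value pair must be present), and the real selection is of course done by the controller, not by this code.

```python
import json


def build_kubelet_config(name, pool_labels, max_pods):
    """Build a KubeletConfig manifest with the same fields as change-maxPods-cr.yaml."""
    return {
        "apiVersion": "machineconfiguration.openshift.io/v1",
        "kind": "KubeletConfig",
        "metadata": {"name": name},
        "spec": {
            "machineConfigPoolSelector": {"matchLabels": dict(pool_labels)},
            "kubeletConfig": {"maxPods": max_pods},
        },
    }


def selector_matches(manifest, pool_labels):
    """True if every matchLabels entry is present on the pool (equality-based matching)."""
    wanted = manifest["spec"]["machineConfigPoolSelector"]["matchLabels"]
    return all(pool_labels.get(k) == v for k, v in wanted.items())


manifest = build_kubelet_config("set-max-pods", {"custom-kubelet": "large-pods"}, 500)

# Labels of the worker pool before and after `oc label machineconfigpool worker ...`.
before = {"machineconfiguration.openshift.io/mco-built-in": ""}
after = {**before, "custom-kubelet": "large-pods"}

print(json.dumps(manifest, indent=2))
print("matches before labeling:", selector_matches(manifest, before))
print("matches after labeling:", selector_matches(manifest, after))
```

Without the label on the pool the selector matches nothing, which is why the KubeletConfig has no effect until the machineconfigpool is labeled.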

CRI-O, ContainerRuntimeConfig, and the Machine Config Controller

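OpenShift 4 clusters use CRI-O as their container runtime, and the CRI-O configuration file on each node is /etc/crio/crio.conf. To change a node's CRI-O parameters, store the customized settings in a ContainerRuntimeConfig object, a CRD type provided by OpenShift. When OpenShift detects a new ContainerRuntimeConfig, it generates a corresponding MachineConfig object named rendered-xxxx from its contents. The Machine Config Controller then hands that MachineConfig to the Machine Config Daemon on every affected node, and the daemon applies the configuration change on that node.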

Viewing a Node's CRI-O Configuration

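  1. Using the method shown earlier in this article, open a debug shell on a master node, then check the "pids_limit" parameter in the CRI-O configuration file and confirm that it defaults to 1024. Finally, exit from the master node.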
sh-4.4# cat /etc/crio/crio.conf | grep -v "#"  | sed '/^$/d' |grep -i pids_limit
pids_limit = 1024
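
The grep/sed pipeline above drops comment and blank lines and then looks for pids_limit. A rough Python equivalent is sketched below. The CRIO_CONF text is a hypothetical, abridged stand-in for the real file (only "pids_limit = 1024" is taken from this article), and active_settings is an illustrative helper name. One difference to be aware of: this sketch skips lines that start with "#", whereas grep -v "#" drops any line containing "#".

```python
# Hypothetical, abridged crio.conf text; only "pids_limit = 1024" is taken from the article.
CRIO_CONF = """\
[crio.runtime]

# Maximum number of processes allowed in a container.
# pids_limit = 0
pids_limit = 1024
"""


def active_settings(text):
    """Yield (key, value) for lines that are not blank, comments, or [section] headers."""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            yield key.strip(), value.strip()


settings = dict(active_settings(CRIO_CONF))
pids_limit = int(settings["pids_limit"])
print("pids_limit =", pids_limit)
```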

Modifying a Node's CRI-O Configuration

  1. Add a label to the master machineconfigpool. The ContainerRuntimeConfig object is applied to the machines in the pool that carries this label.
$ oc label machineconfigpool master debug-crio=config-log-and-pid
machineconfigpool.machineconfiguration.openshift.io/master labeled
  1. Create a file named ContainerRuntimeConfig.yaml with the following content. The ContainerRuntimeConfig object it defines changes the pidsLimit and logLevel parameters of the nodes' container runtime.
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
 name: set-log-and-pid
spec:
 machineConfigPoolSelector:
   matchLabels:
     debug-crio: config-log-and-pid
 containerRuntimeConfig:
   pidsLimit: 2048
   logLevel: debug
  1. Create the ContainerRuntimeConfig object.
$ oc create -f ContainerRuntimeConfig.yaml
containerruntimeconfig.machineconfiguration.openshift.io/set-log-and-pid created
$ oc get ContainerRuntimeConfig
NAME              AGE
set-log-and-pid   5s
  1. List the MachineConfig objects and confirm that OpenShift has generated a new MachineConfig named rendered-master-xxx in which "pids_limit" is "2048".
$ oc get MachineConfigs | grep rendered
rendered-master-1eac183c39006eab3480e3acfc9ba8db                  99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             27h
rendered-master-d08c556ab53f07069ff0c46e741de224                  99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             50s
rendered-worker-92b2d6fba537bbf64d3493dcc2b6a207                  99eb744f5094224edb60d88ca85d607ab151ebdf   3.1.0             27h
 
$ python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.argv[1]))" $(oc get MachineConfig/rendered-master-d08c556ab53f07069ff0c46e741de224 -o YAML | grep -B4 crio.conf | grep source | tail -n 1 | cut -d, -f2) | grep pid
    pids_limit = 2048
  1. Watch the update status of the master nodes and confirm that the masters in the cluster are updated one after another.
$ oc get node -l node-role.kubernetes.io/master -w
NAME                                              STATUS                        ROLES    AGE   VERSION
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   NotReady,SchedulingDisabled   master   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   NotReady,SchedulingDisabled   master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      master   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-178-197.ap-southeast-1.compute.internal   Ready                         worker   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      master   26h   v1.19.0+d59ce34
ip-10-0-157-96.ap-southeast-1.compute.internal    Ready                         worker   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   NotReady,SchedulingDisabled   master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      master   26h   v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   NotReady,SchedulingDisabled   master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready,SchedulingDisabled      master   26h   v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal   Ready                         master   26h   v1.19.0+d59ce34
  1. Repeat the steps from the previous subsection to view /etc/crio/crio.conf again and confirm that "pids_limit" has been changed to 2048.
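
The python3 one-liner used earlier in this section URL-decodes the crio.conf content embedded in the rendered MachineConfig. The sketch below spells that idea out as a standalone script. It is an illustration under assumptions: the source string is a hypothetical stand-in built locally (a real one would come from the MachineConfig's file entry for crio.conf), the log_level line in it is invented for the example, and decode_data_url and find_int_setting are made-up helper names.

```python
import re
import urllib.parse

# Hypothetical stand-in for the `source` field of the crio.conf entry in a rendered
# MachineConfig: a "data:," URL whose payload is percent-encoded.
source = "data:," + urllib.parse.quote(
    '[crio.runtime]\npids_limit = 2048\nlog_level = "debug"\n'
)


def decode_data_url(url):
    """Return the decoded payload of a percent-encoded 'data:,' URL."""
    prefix, sep, payload = url.partition(",")
    if not sep or not prefix.startswith("data:"):
        raise ValueError("not a data URL")
    if prefix.endswith(";base64"):
        raise ValueError("base64 payloads are not handled in this sketch")
    return urllib.parse.unquote(payload)


def find_int_setting(conf_text, key):
    """Return the integer value of 'key = N' in conf_text, or None if absent."""
    pattern = rf"^\s*{re.escape(key)}\s*=\s*(\d+)\s*$"
    match = re.search(pattern, conf_text, re.MULTILINE)
    return int(match.group(1)) if match else None


crio_conf = decode_data_url(source)
print(find_int_setting(crio_conf, "pids_limit"))
```

Checking the rendered MachineConfig this way confirms the intended value before the nodes finish rolling; the final check on the node itself remains the authoritative one.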

References

https://access.redhat.com/documentation/zh-cn/openshift_container_platform/4.5/html-single/scalability_and_performance/index
https://docs.openshift.com/container-platform/4.5/scalability_and_performance/recommended-host-practices.html
https://www.redhat.com/en/blog/red-hat-openshift-container-platform-4-now-defaults-cri-o-underlying-container-engine
