OpenShift 4.x HOL Tutorial Collection
Note: this article has been verified on OpenShift 4.6.
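Kubelet, KubeletConfig, and KubeletConfigController
The article "OpenShift 4 - How to Modify Cluster Node CoreOS Configuration with the Machine Config Operator" mentions that the Machine Config Controller in OpenShift includes a sub-component named Kubelet Config Controller. This component accepts CRD-based KubeletConfig objects and applies them to the kubelet on the matching nodes. In other words, the kubelet on the nodes of an OpenShift 4 cluster is configured through this component.
During installation of an OpenShift cluster, a default kubelet configuration is delivered through Ignition. After installation we can change the configuration that a node's kubelet uses, and thereby change the kubelet's runtime parameters.
View a node's kubelet configuration
- Run the following command to list the worker nodes in the current cluster.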
$ oc get node -l node-role.kubernetes.io/worker
NAME STATUS ROLES AGE VERSION
ip-10-0-150-145.ap-southeast-1.compute.internal Ready worker 3h32m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready worker 3h32m v1.19.0+d59ce34
- Inspect one worker Node object. The "Allocatable" section reflects part of the configuration used by that node's kubelet; "pods: 250" is the maximum number of pods this kubelet can run.
$ oc describe node <WORKER_NODE> | grep Allocatable -A7
Allocatable:
attachable-volumes-aws-ebs: 25
cpu: 15500m
ephemeral-storage: 114381692328
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 63991700Ki
pods: 250
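When this check has to be repeated across several nodes, it can help to turn the "Allocatable" block into a dictionary. The following is a minimal sketch, not part of the original lab: the function name `parse_allocatable` is hypothetical, and the two-space indentation in the sample text is an assumption about how `oc describe node` formats the block.

```python
# Sample text based on the `oc describe node` output above
# (the two-space indentation is an assumption).
SAMPLE = """\
Allocatable:
  attachable-volumes-aws-ebs: 25
  cpu: 15500m
  ephemeral-storage: 114381692328
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 63991700Ki
  pods: 250
"""


def parse_allocatable(text):
    """Return the key/value pairs listed under the 'Allocatable:' header."""
    result = {}
    in_block = False
    for line in text.splitlines():
        if line.strip() == "Allocatable:":
            in_block = True
            continue
        if in_block:
            # The block ends at the first line that is not an indented "key: value".
            if not line.startswith(" ") or ":" not in line:
                break
            key, value = line.split(":", 1)
            result[key.strip()] = value.strip()
    return result


allocatable = parse_allocatable(SAMPLE)
print(allocatable["pods"])  # prints 250
```

The values are kept as strings because they mix plain integers with Kubernetes quantities such as "15500m" and "63991700Ki".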
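- List the MachineConfig objects in this OpenShift cluster. The output shows that "01-worker-kubelet" sets up the kubelet environment on worker nodes. Also note that at this point there are only 2 machineconfig objects named "99-worker-XXXXX" and 2 named "rendered-worker-XXXXX".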
$ oc get machineconfig
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
00-worker 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
01-master-container-runtime 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
01-master-kubelet 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
01-worker-container-runtime 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
01-worker-kubelet 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
99-master-ea3d87d1-a5df-4137-a3a3-849915e40cdd-registries 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
99-master-ssh 3.1.0 6h9m
99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-registries 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 6h
99-worker-ssh 3.1.0 6h9m
rendered-master-2613a048ee6bb4b27621cfff3c44a676 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 6h
rendered-master-c5bfe43313bf45eb9abd3e8422421b6d 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 3h30m
rendered-worker-46a5c3ba1b88f2b312aa349e71f4a0fa 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 3h30m
rendered-worker-c74c310336d86b894dec0c5b49743ebd 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 6h
- Inspect the machineconfig named 01-worker-kubelet and confirm that the kubelet.conf file is located at "/etc/kubernetes/kubelet.conf".
$ oc describe machineconfig 01-worker-kubelet | grep '\--config'
--config=/etc/kubernetes/kubelet.conf \
- Run the following command to enter a worker node.
$ oc debug node/<WORKER_NODE>
Starting pod/ip-10-0-150-145ap-southeast-1computeinternal-debug ...
To use host binaries, run `chroot /host`
- Inside the node, view the contents of kubelet.conf. It contains "maxPods: 250", which matches the "pods: 250" value seen in the Allocatable output above.
sh-4.4# chroot /host
sh-4.4# more /etc/kubernetes/kubelet.conf
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  x509:
    clientCAFile: /etc/kubernetes/kubelet-ca.crt
  anonymous:
    enabled: false
cgroupDriver: systemd
cgroupRoot: /
clusterDNS:
  - 172.30.0.10
clusterDomain: cluster.local
containerLogMaxSize: 50Mi
maxPods: 250
kubeAPIQPS: 50
kubeAPIBurst: 100
rotateCertificates: true
serializeImagePulls: false
staticPodPath: /etc/kubernetes/manifests
systemCgroups: /system.slice
systemReserved:
  cpu: 500m
  memory: 1Gi
  ephemeral-storage: 1Gi
featureGates:
  LegacyNodeRoleBehavior: false
  NodeDisruptionExclusion: true
  RotateKubeletServerCertificate: true
  SCTPSupport: true
  ServiceNodeExclusion: true
  SupportPodPidsLimit: true
serverTLSBootstrap: true
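The "systemReserved" entry also explains the "cpu: 15500m" value in the Allocatable output: Kubernetes computes allocatable resources by subtracting the reserved amounts from the node's capacity. The arithmetic below is a sketch under a stated assumption: the node's CPU capacity is not shown in this article, so 16 cores is assumed here. The helper `parse_cpu_millicores` is a hypothetical name, and no kubeReserved value is subtracted because kubelet.conf above does not set one.

```python
def parse_cpu_millicores(quantity):
    """Convert a Kubernetes CPU quantity such as '500m' or '16' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


ASSUMED_CAPACITY_CPU = "16"   # assumption: capacity is not shown in the article
SYSTEM_RESERVED_CPU = "500m"  # from systemReserved in kubelet.conf above

allocatable_cpu = (
    parse_cpu_millicores(ASSUMED_CAPACITY_CPU)
    - parse_cpu_millicores(SYSTEM_RESERVED_CPU)
)
print(f"{allocatable_cpu}m")  # prints 15500m
```

Memory and ephemeral storage follow the same idea, but the kubelet additionally subtracts its eviction thresholds there, so the numbers do not reduce to a single subtraction.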
- Exit from the OpenShift worker node.
sh-4.4# exit
sh-4.4# exit
Modify a node's kubelet configuration
- Create a file named change-maxPods-cr.yaml with the following content. The KubeletConfig object sets the kubelet's maxPods to 500, and it only applies to nodes in a MachineConfigPool that carries the "custom-kubelet: large-pods" label.
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-max-pods
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: large-pods
  kubeletConfig:
    maxPods: 500
- Inspect the machineconfigpool named worker and confirm that it does not yet have the "custom-kubelet=large-pods" label.
$ oc get machineconfigpool worker --show-labels
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE LABELS
worker rendered-worker-1e4ff665b30dd1099a34d6e636654353 True False False 2 2 2 0 8h machineconfiguration.openshift.io/mco-built-in=
- Modify the worker machineconfigpool by adding the "custom-kubelet=large-pods" label.
$ oc label machineconfigpool worker custom-kubelet=large-pods
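The reason the label matters is the "matchLabels" selector in the KubeletConfig: a pool is selected only if it carries every key/value pair listed there. The sketch below illustrates that rule in plain Python. It is not the Machine Config Operator's code, and `selector_matches` is a hypothetical helper name.

```python
def selector_matches(match_labels, pool_labels):
    """True if the pool carries every key/value pair required by matchLabels."""
    return all(pool_labels.get(key) == value for key, value in match_labels.items())


# Selector taken from change-maxPods-cr.yaml.
selector = {"custom-kubelet": "large-pods"}

# Labels on the worker pool before and after the `oc label` command.
labels_before = {"machineconfiguration.openshift.io/mco-built-in": ""}
labels_after = {**labels_before, "custom-kubelet": "large-pods"}

print(selector_matches(selector, labels_before))  # prints False
print(selector_matches(selector, labels_after))   # prints True
```

Extra labels on the pool do not prevent a match; only the pairs named in the selector are compared.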
- Create the KubeletConfig object.
$ oc create -f change-maxPods-cr.yaml
$ oc get kubeletconfig
NAME AGE
set-max-pods 7s
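- Watch the update status of the worker nodes. OpenShift updates the kubelet configuration of the 2 worker nodes one at a time. While a node is being updated it is temporarily in the "SchedulingDisabled" state, so that no new pods are scheduled onto it. In the end both worker nodes return to a plain "Ready" status.
$ oc get node -l node-role.kubernetes.io/worker -w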
NAME STATUS ROLES AGE VERSION
ip-10-0-150-145.ap-southeast-1.compute.internal Ready worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal NotReady,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal NotReady,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready worker 7h19m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready worker 7h19m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h19m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready worker 7h20m v1.19.0+d59ce34
ip-10-0-190-1.ap-southeast-1.compute.internal Ready worker 7h20m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h20m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h20m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready,SchedulingDisabled worker 7h20m v1.19.0+d59ce34
ip-10-0-150-145.ap-southeast-1.compute.internal Ready worker 7h21m v1.19.0+d59ce34
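The one-at-a-time behaviour can be checked mechanically from the watch output. The sketch below uses a condensed copy of the rows above (node name and STATUS column only) and reports the order in which nodes were cordoned and the largest number cordoned at the same moment. The function names are hypothetical, and the row list is abbreviated by hand rather than parsed from the live command.

```python
# Condensed (name, status) rows taken from the watch output above.
ROWS = [
    ("ip-10-0-150-145.ap-southeast-1.compute.internal", "Ready"),
    ("ip-10-0-190-1.ap-southeast-1.compute.internal", "Ready,SchedulingDisabled"),
    ("ip-10-0-190-1.ap-southeast-1.compute.internal", "NotReady,SchedulingDisabled"),
    ("ip-10-0-190-1.ap-southeast-1.compute.internal", "Ready,SchedulingDisabled"),
    ("ip-10-0-190-1.ap-southeast-1.compute.internal", "Ready"),
    ("ip-10-0-150-145.ap-southeast-1.compute.internal", "Ready,SchedulingDisabled"),
    ("ip-10-0-150-145.ap-southeast-1.compute.internal", "Ready"),
]


def is_cordoned(status):
    """True if the STATUS column contains SchedulingDisabled."""
    return "SchedulingDisabled" in status.split(",")


def cordon_order(rows):
    """Node names in the order they first appeared as SchedulingDisabled."""
    order = []
    for name, status in rows:
        if is_cordoned(status) and name not in order:
            order.append(name)
    return order


def max_cordoned_at_once(rows):
    """Largest number of nodes that were cordoned at the same time."""
    state = {}
    peak = 0
    for name, status in rows:
        state[name] = is_cordoned(status)
        peak = max(peak, sum(state.values()))
    return peak


print(cordon_order(ROWS))
print(max_cordoned_at_once(ROWS))  # prints 1
```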
- List the machineconfig objects and confirm that there are now 3 objects named "99-worker-XXXXXX" and 3 named "rendered-worker-XXXXX". Judging by the AGE column, "99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-kubelet" and "rendered-worker-1e4ff665b30dd1099a34d6e636654353" are the newly created machineconfig objects. Of these, "99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-kubelet" contains the kubelet configuration change for worker nodes.
$ oc get machineconfig
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
00-worker 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
01-master-container-runtime 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
01-master-kubelet 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
01-worker-container-runtime 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
01-worker-kubelet 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
99-master-ea3d87d1-a5df-4137-a3a3-849915e40cdd-registries 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
99-master-ssh 3.1.0 8h
99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-kubelet 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 30m
99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-registries 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 8h
99-worker-ssh 3.1.0 8h
rendered-master-2613a048ee6bb4b27621cfff3c44a676 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 8h
rendered-master-c5bfe43313bf45eb9abd3e8422421b6d 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 5h35m
rendered-worker-1e4ff665b30dd1099a34d6e636654353 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 30m
rendered-worker-46a5c3ba1b88f2b312aa349e71f4a0fa 0157b684b81eb5cbbe4e37d7b7e018ce5d5967d2 3.1.0 5h35m
rendered-worker-c74c310336d86b894dec0c5b49743ebd 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 8h
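Besides reading the AGE column, the new objects can be found by comparing the name lists from the two `oc get machineconfig` runs. The sketch below does this with a set difference. Only the worker-related names from the two listings in this article are included, and the variable names are hypothetical.

```python
# Worker-related MachineConfig names from the first listing.
before = {
    "00-worker",
    "01-worker-container-runtime",
    "01-worker-kubelet",
    "99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-registries",
    "99-worker-ssh",
    "rendered-worker-46a5c3ba1b88f2b312aa349e71f4a0fa",
    "rendered-worker-c74c310336d86b894dec0c5b49743ebd",
}

# The same listing after the KubeletConfig was created.
after = before | {
    "99-worker-cdf0041e-c96c-401d-9881-5c8243a58991-kubelet",
    "rendered-worker-1e4ff665b30dd1099a34d6e636654353",
}

# Names present only in the second listing are the newly created objects.
new_objects = sorted(after - before)
for name in new_objects:
    print(name)
```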
- Check the status of the kubeletconfig object named set-max-pods. "type: Success" means it has been applied successfully.
$ oc get kubeletconfig set-max-pods -o yaml
...
status:
  conditions:
  - lastTransitionTime: "2020-12-05T15:37:47Z"
    message: Success
    status: "True"
    type: Success
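For scripted checks the same condition can be read from JSON instead of YAML. The sketch below assumes the object was fetched with `-o json` in place of `-o yaml` and works on a hand-written sample that mirrors the status shown above. `is_successful` is a hypothetical helper name.

```python
import json

# Sample mirroring the status block above, as `-o json` would render it.
SAMPLE_JSON = """
{
  "status": {
    "conditions": [
      {
        "lastTransitionTime": "2020-12-05T15:37:47Z",
        "message": "Success",
        "status": "True",
        "type": "Success"
      }
    ]
  }
}
"""


def is_successful(obj):
    """True if the object has a condition of type Success with status "True"."""
    conditions = obj.get("status", {}).get("conditions", [])
    return any(
        cond.get("type") == "Success" and cond.get("status") == "True"
        for cond in conditions
    )


kubelet_config = json.loads(SAMPLE_JSON)
print(is_successful(kubelet_config))  # prints True
```

Note that the condition status is the string "True", not a boolean, which is why the comparison is against a string.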
- Check the worker node again; the Allocatable section now shows "pods: 500".
$ oc describe node <WORKER_NODE> | grep Allocatable -A7
Allocatable:
attachable-volumes-aws-ebs: 25
cpu: 15500m
ephemeral-storage: 114381692328
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 63991700Ki
pods: 500
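CRI-O, ContainerRuntimeConfig, and MachineConfigController
OpenShift 4 clusters use CRI-O as the container runtime, and the CRI-O configuration file on a node is /etc/crio/crio.conf. To change a node's CRI-O parameters, the custom settings are stored in a ContainerRuntimeConfig object, which is a CRD type in OpenShift. When OpenShift detects a new ContainerRuntimeConfig, it generates a corresponding MachineConfig object named rendered-xxxx. The MachineConfigController then sends that MachineConfig to the MachineConfigDaemon on every affected node, and the daemon applies the change to the node's configuration.
View a node's CRI-O configuration
- Enter a master node using the method described earlier in this article, then look at the "pids_limit" parameter in the CRI-O configuration file and confirm that the default is 1024. Finally, exit from the OpenShift master node.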
sh-4.4# cat /etc/crio/crio.conf | grep -v "#" | sed '/^$/d' |grep -i pids_limit
pids_limit = 1024
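The grep pipeline above drops comments and blank lines and then searches for the key. A rough Python equivalent is sketched below. The sample text is hypothetical apart from the "pids_limit = 1024" line, and `read_setting` is a made-up helper name. Unlike `grep -v "#"`, this version only skips lines that start with "#".

```python
# Hypothetical excerpt; only the pids_limit line comes from the article.
SAMPLE_CONF = """\
# Settings for the OCI runtime
[crio.runtime]

# Maximum number of processes allowed in a container.
pids_limit = 1024
"""


def read_setting(conf_text, key):
    """Return the value of `key` from 'key = value' lines, ignoring comments."""
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip().lower() == key.lower():
            return value.strip().strip('"')
    return None


print(read_setting(SAMPLE_CONF, "pids_limit"))  # prints 1024
```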
Modify a node's CRI-O configuration
- Add a label to the master machineconfigpool. The ContainerRuntimeConfig object will be applied to the machines in the pool that carries this label.
$ oc label machineconfigpool master debug-crio=config-log-and-pid
machineconfigpool.machineconfiguration.openshift.io/master labeled
- Create a file named ContainerRuntimeConfig.yaml with the following content. The ContainerRuntimeConfig object it defines changes the pidsLimit and logLevel parameters of the node's container runtime.
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: set-log-and-pid
spec:
  machineConfigPoolSelector:
    matchLabels:
      debug-crio: config-log-and-pid
  containerRuntimeConfig:
    pidsLimit: 2048
    logLevel: debug
- Create the ContainerRuntimeConfig object.
$ oc create -f ContainerRuntimeConfig.yaml
containerruntimeconfig.machineconfiguration.openshift.io/set-log-and-pid created
$ oc get ContainerRuntimeConfig
NAME AGE
set-log-and-pid 5s
- List the MachineConfigs and confirm that OpenShift has generated a new MachineConfig named rendered-master-xxx in which the "pids_limit" parameter is "2048".
$ oc get MachineConfigs | grep rendered
rendered-master-1eac183c39006eab3480e3acfc9ba8db 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 27h
rendered-master-d08c556ab53f07069ff0c46e741de224 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 50s
rendered-worker-92b2d6fba537bbf64d3493dcc2b6a207 99eb744f5094224edb60d88ca85d607ab151ebdf 3.1.0 27h
$ python3 -c "import sys, urllib.parse; print(urllib.parse.unquote(sys.argv[1]))" $(oc get MachineConfig/rendered-master-d08c556ab53f07069ff0c46e741de224 -o YAML | grep -B4 crio.conf | grep source | tail -n 1 | cut -d, -f2) | grep pid
pids_limit = 2048
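The one-liner above works because the MachineConfig stores the crio.conf file as a URL-encoded data URL in the "source" field: `cut -d, -f2` takes everything after the comma and `urllib.parse.unquote` decodes it. The sketch below shows the same decoding step in isolation. The sample source string is hypothetical and much shorter than a real one, and `decode_data_url` is a made-up helper name.

```python
from urllib.parse import unquote

# Hypothetical, shortened stand-in for the "source" field of crio.conf.
SAMPLE_SOURCE = "data:,%5Bcrio.runtime%5D%0Apids_limit%20%3D%202048%0A"


def decode_data_url(source):
    """Decode a percent-encoded data URL and return its payload as text."""
    header, sep, payload = source.partition(",")
    if not header.startswith("data:") or not sep:
        raise ValueError("not a data URL")
    if header.endswith(";base64"):
        raise ValueError("base64 data URLs are not handled in this sketch")
    return unquote(payload)


decoded = decode_data_url(SAMPLE_SOURCE)
# Same filter as the trailing `grep pid` in the shell pipeline.
pid_lines = [line for line in decoded.splitlines() if "pid" in line]
print(pid_lines)  # prints ['pids_limit = 2048']
```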
- Watch the update status of the master nodes and confirm that the masters in the cluster are updated one after another.
$ oc get node -l node-role.kubernetes.io/master -w
NAME STATUS ROLES AGE VERSION
ip-10-0-134-103.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal Ready,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal NotReady,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal NotReady,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal Ready,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-178-197.ap-southeast-1.compute.internal Ready worker 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-157-96.ap-southeast-1.compute.internal Ready worker 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal NotReady,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal Ready,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-178-236.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-221-178.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal NotReady,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal Ready,SchedulingDisabled master 26h v1.19.0+d59ce34
ip-10-0-134-103.ap-southeast-1.compute.internal Ready master 26h v1.19.0+d59ce34
- Follow the steps in the previous subsection to view /etc/crio/crio.conf again and confirm that "pids_limit" has been changed to 2048.
References
https://access.redhat.com/documentation/zh-cn/openshift_container_platform/4.5/html-single/scalability_and_performance/index
https://docs.openshift.com/container-platform/4.5/scalability_and_performance/recommended-host-practices.html
https://www.redhat.com/en/blog/red-hat-openshift-container-platform-4-now-defaults-cri-o-underlying-container-engine