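1. What is topologyKey?
First, what is topologyKey? It is the key of a node label: nodes that carry the same value for that label are treated as one topology domain (a host, a zone, a region, and so on), and affinity rules are evaluated per domain.
In principle, topologyKey can be any legal label key. For performance and security reasons, however, the following restrictions apply: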
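- For pod affinity, and for pod anti-affinity with the hard condition requiredDuringSchedulingIgnoredDuringExecution, topologyKey must not be empty;
- For pod anti-affinity with requiredDuringSchedulingIgnoredDuringExecution, the admission controller LimitPodHardAntiAffinityTopology restricts topologyKey to kubernetes.io/hostname; to use a custom topology there, modify the admission controller or simply disable it;
- For pod anti-affinity with the soft condition preferredDuringSchedulingIgnoredDuringExecution, an empty topologyKey is interpreted as "all topologies", which is limited to the combination of the three built-in keys kubernetes.io/hostname, failure-domain.beta.kubernetes.io/zone and failure-domain.beta.kubernetes.io/region;
Apart from the cases above, topologyKey can be any legal label key.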
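To make the idea of a topology domain concrete, here is a minimal Python sketch. It is not scheduler code: the function name topology_domains and the three-node cluster are made up for illustration. It only shows how one set of nodes splits into different domains depending on which label key is used as topologyKey.

```python
from collections import defaultdict


def topology_domains(nodes, topology_key):
    """Group node names by the value of the topology_key label.

    Nodes that do not carry the label belong to no domain and are skipped.
    """
    domains = defaultdict(list)
    for name, labels in nodes.items():
        if topology_key in labels:
            domains[labels[topology_key]].append(name)
    return dict(domains)


# Hypothetical cluster: node names and label values are illustrative only.
nodes = {
    "node-a": {"kubernetes.io/hostname": "node-a", "zone": "foo"},
    "node-b": {"kubernetes.io/hostname": "node-b", "zone": "foo"},
    "node-c": {"kubernetes.io/hostname": "node-c", "zone": "bar"},
}

# With topologyKey "zone" there are two domains; with the hostname key
# every node is its own domain.
print(topology_domains(nodes, "zone"))
print(topology_domains(nodes, "kubernetes.io/hostname"))
```

The same affinity rule therefore means "same zone" or "same host" purely as a function of the key chosen.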
2. How topologyKey works
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: failure-domain.beta.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: failure-domain.beta.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: k8s.gcr.io/pause:2.0
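Above is a topologyKey demo together with the scheduling diagram I drew for it. In detail:
- First, podAffinity requires the pod to be scheduled into the same topology domain as a running pod labeled security=S1, and the granularity is the zone.
  So in the diagram both zone=foo and zone=bar are candidates.
  Important: the pod must end up in one zone or the other, zone=foo or zone=bar, wherever a security=S1 pod is running; a placement that ignores the zone boundary does not satisfy the rule.
- Next, podAntiAffinity prefers not to schedule the pod onto a node that runs a pod labeled security=S2 (host granularity in the diagram). Note that host granularity needs topologyKey: kubernetes.io/hostname in the anti-affinity term; with the zone key used in the manifest above, the preference applies to the whole zone.
- Putting this together, the pod is expected to land on node-3 or node-7 in the diagram.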
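The interplay of the hard affinity rule and the soft anti-affinity rule can be sketched in Python as a filter step followed by a scoring step. This is a simplified model under stated assumptions, not the real scheduler: the helper names (domain_pods, has_label, rank_nodes), the three-node cluster, and the "subtract the weight" scoring are all hypothetical, and the cluster is not the one in the diagram.

```python
ZONE = "failure-domain.beta.kubernetes.io/zone"
HOST = "kubernetes.io/hostname"


def domain_pods(node, nodes, pods, topology_key):
    """Pods running on any node that shares `node`'s value for topology_key."""
    value = nodes[node].get(topology_key)
    if value is None:
        return []
    return [p for p in pods if nodes[p["node"]].get(topology_key) == value]


def has_label(pod_list, key, value):
    """True if at least one pod in pod_list carries the label key=value."""
    return any(p["labels"].get(key) == value for p in pod_list)


def rank_nodes(nodes, pods, affinity_key, anti_key, weight=100):
    """Filter on required affinity to security=S1, then score preferred
    anti-affinity to security=S2 (simplified: subtract the weight on a clash)."""
    scores = {}
    for node in nodes:
        if not has_label(domain_pods(node, nodes, pods, affinity_key), "security", "S1"):
            continue  # hard rule not met: the node is not a candidate at all
        clash = has_label(domain_pods(node, nodes, pods, anti_key), "security", "S2")
        scores[node] = -weight if clash else 0
    return scores


# Hypothetical cluster: n1 and n2 share zone foo, n3 is alone in zone bar.
nodes = {
    "n1": {HOST: "n1", ZONE: "foo"},
    "n2": {HOST: "n2", ZONE: "foo"},
    "n3": {HOST: "n3", ZONE: "bar"},
}
pods = [
    {"node": "n1", "labels": {"security": "S1"}},
    {"node": "n2", "labels": {"security": "S2"}},
]

host_level = rank_nodes(nodes, pods, ZONE, HOST)  # anti-affinity per host
zone_level = rank_nodes(nodes, pods, ZONE, ZONE)  # anti-affinity per zone
print(host_level)
print(zone_level)
```

In this toy cluster n3 is filtered out in both runs because its zone has no S1 pod. With a host-level anti-affinity key only n2 is penalized; with a zone-level key both n1 and n2 are penalized equally, so the preference no longer distinguishes between them.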
Additional note:
The operators available in pod affinity and anti-affinity label selectors are: In, NotIn, Exists, DoesNotExist
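As a rough illustration of what the four operators mean, the following Python sketch evaluates a single matchExpressions entry against a pod's labels. The function name matches is hypothetical; the semantics shown (in particular, NotIn also matching when the key is absent) follow Kubernetes label-selector behavior as I understand it, so treat this as a sketch rather than a reference implementation.

```python
def matches(labels, key, operator, values=()):
    """Evaluate one label-selector expression against a dict of labels."""
    if operator == "In":
        return key in labels and labels[key] in values
    if operator == "NotIn":
        # A missing key also satisfies NotIn.
        return key not in labels or labels[key] not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {operator}")


pod_labels = {"security": "S1", "app": "ngx-new"}

print(matches(pod_labels, "security", "In", ["S1"]))
print(matches(pod_labels, "security", "NotIn", ["S2"]))
print(matches(pod_labels, "tier", "Exists"))
print(matches(pod_labels, "tier", "DoesNotExist"))
```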
3. Hands-on: PodAffinity
1. Environment: one pod is running on centos-2.shared with the label app=ngx-new
[root@centos-1 dingqishi]# kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
ngx-new-cb79d555-2c7qq 1/1 Running 0 44h 10.244.1.7 centos-2.shared <none> <none> app=ngx-new,pod-template-hash=cb79d555
2. Edit deploy-with-required-podAffinity.yaml. We want these pods to be scheduled into the same zone as the node running the pod labeled app=ngx-new, that is, onto any node in that zone
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-with-pod-affinity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      name: myapp
      labels:
        app: myapp
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:  # hard affinity: schedule into the zone of a node running a pod labeled app=ngx-new
          - labelSelector:
              matchExpressions:
              - {key: app, operator: In, values: ["ngx-new"]}
            topologyKey: zone
      containers:
      - name: myapp
        image: nginx
3. Label the two nodes and place them in different zones; the new pods are expected to be scheduled only onto centos-2.shared
kubectl label nodes centos-2.shared zone=foo
kubectl label nodes centos-3.shared zone=bar
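Note: if centos-3.shared were also labeled zone=foo, new pods could be scheduled onto it as well,
because centos-3.shared and centos-2.shared would then be in the same location (zone)
4. Apply the deploy-with-required-podAffinity.yaml above; the new pods are all scheduled onto centos-2.shared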
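The expectation in this step can be checked with a small Python model before touching the cluster. This is only a sketch: eligible_nodes is a hypothetical helper, it models nothing but the required pod affinity rule (taints, resources and the rest of the scheduler are ignored), and it assumes centos-1.shared carries no zone label, as the node listing in this walkthrough suggests.

```python
def eligible_nodes(nodes, pods, key, value, topology_key):
    """Nodes whose topology domain already hosts a pod labeled key=value."""
    domains = {
        nodes[p["node"]][topology_key]
        for p in pods
        if p["labels"].get(key) == value and topology_key in nodes[p["node"]]
    }
    return sorted(
        name for name, labels in nodes.items()
        if labels.get(topology_key) in domains
    )


nodes = {
    "centos-1.shared": {},               # assumed: no zone label
    "centos-2.shared": {"zone": "foo"},
    "centos-3.shared": {"zone": "bar"},
}
pods = [
    {"name": "ngx-new-cb79d555-2c7qq", "node": "centos-2.shared",
     "labels": {"app": "ngx-new"}},
]

# Different zones: only the zone hosting the ngx-new pod qualifies.
before = eligible_nodes(nodes, pods, "app", "ngx-new", "zone")

# Relabel centos-3.shared into zone foo, mirroring `kubectl label --overwrite`.
nodes["centos-3.shared"]["zone"] = "foo"
after = eligible_nodes(nodes, pods, "app", "ngx-new", "zone")

print(before)
print(after)
```

A node without the zone label never becomes eligible in this model, which is why centos-1.shared stays out in both cases.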
[root@centos-1 dingqishi]# kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
myapp-with-pod-affinity-778f46bf4-92fxq 1/1 Running 0 16m 10.244.1.2 centos-2.shared <none> <none> app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-gwv6z 1/1 Running 0 16m 10.244.1.3 centos-2.shared <none> <none> app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-lcvz5 1/1 Running 0 16m 10.244.1.4 centos-2.shared <none> <none> app=myapp,pod-template-hash=778f46bf4
ngx-new-cb79d555-2c7qq 1/1 Running 0 44h 10.244.1.7 centos-2.shared <none> <none> app=ngx-new,pod-template-hash=cb79d555
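5. Change the label on centos-3.shared to zone=foo as well, then run kubectl delete -f on the manifest and apply it again.
Both centos-2.shared and centos-3.shared now receive pods, as expected
# Change the label on centos-3.shared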
[root@centos-1 dingqishi]# kubectl label nodes centos-3.shared zone=foo --overwrite
node/centos-3.shared labeled
[root@centos-1 dingqishi]# kubectl get node --show-labels
NAME STATUS ROLES AGE VERSION LABELS
centos-1.shared Ready master 16d v1.16.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=centos-1.shared,kubernetes.io/os=linux,node-role.kubernetes.io/master=
centos-2.shared Ready <none> 16d v1.16.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=centos-2.shared,kubernetes.io/os=linux,zone=foo
centos-3.shared Ready <none> 16d v1.16.3 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/arch=amd64,kubernetes.io/hostname=centos-3.shared,kubernetes.io/os=linux,zone=foo
# Apply again
[root@centos-1 dingqishi]# kubectl apply -f deploy-with-required-podAffinity.yaml
deployment.apps/myapp-with-pod-affinity created
# Check pod placement: both centos-2.shared and centos-3.shared receive pods, as expected
[root@centos-1 dingqishi]# kubectl get pod -o wide --show-labels
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES LABELS
myapp-with-pod-affinity-778f46bf4-2rqnj 1/1 Running 0 34s 10.244.2.8 centos-3.shared <none> <none> app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-fjfpr 1/1 Running 0 34s 10.244.2.7 centos-3.shared <none> <none> app=myapp,pod-template-hash=778f46bf4
myapp-with-pod-affinity-778f46bf4-tb8v7 1/1 Running 0 34s 10.244.1.5 centos-2.shared <none> <none> app=myapp,pod-template-hash=778f46bf4
ngx-new-cb79d555-2c7qq 1/1 Running 0 44h 10.244.1.7 centos-2.shared <none> <none> app=ngx-new,pod-template-hash=cb79d555
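Key points:
- Pod affinity is the affinity of a pod toward other pods: it expresses whether a pod wants to be scheduled into the same area as related pods (the area can be a node, a rack, or a data center)
- What counts as the same area is defined by the topologyKey covered in this section (the examples here ran on a v1.16.3 cluster)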