I. Environment setup
Create a deployment from the nginx:v1.7.9 image to start with.
[root@hdss7-21 ~]# kubectl create deployment nginx-dp --image=harbor.od.com:180/public/nginx:v1.7.9 -n kube-system
deployment.apps/nginx-dp created
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system |grep nginx-dp
nginx-dp-6f8459c455-hpz7s 1/1 Running 0 5m44s 172.7.21.8 hdss7-21.host.com <none> <none>
Scale out to 2 replicas. Under normal scheduling, one pod is created on each of hdss7-21 and hdss7-22.
[root@hdss7-21 ~]# kubectl scale deployment nginx-dp --replicas=2 -n kube-system
deployment.extensions/nginx-dp scaled
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system |grep nginx-dp
nginx-dp-6f8459c455-hpz7s 1/1 Running 0 6m9s 172.7.21.8 hdss7-21.host.com <none> <none>
nginx-dp-6f8459c455-znqnq 1/1 Running 0 8s 172.7.22.2 hdss7-22.host.com <none> <none>
Export the deployment to a YAML file.
[root@hdss7-21 ~]# kubectl get deployment nginx-dp -o yaml -n kube-system >nginx-dp.yaml
[root@hdss7-21 ~]# vi nginx-dp.yaml   # delete the default fields that are not needed
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx-dp
  name: nginx-dp
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-dp
  template:
    metadata:
      labels:
        app: nginx-dp
    spec:
      containers:
      - image: harbor.od.com:180/public/nginx:v1.7.9
        imagePullPolicy: IfNotPresent
        name: nginx
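II. Taints and tolerations explained
1. Demonstrating that pods cannot start on a node with a NoSchedule taint
Scale down to one replica, so that one of the nodes runs no pod. Add a taint to that node and check whether a pod can still start there.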
[root@hdss7-21 ~]# kubectl scale deployment nginx-dp --replicas=1 -n kube-system
deployment.extensions/nginx-dp scaled
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system |grep nginx-dp
nginx-dp-6f8459c455-hpz7s 1/1 Running 0 170m 172.7.21.8 hdss7-21.host.com <none> <none>
[root@hdss7-22 ~]# kubectl taint node hdss7-22.host.com qd=bxj:NoSchedule   # taint syntax: key=value:effect
View the taints
[root@hdss7-22 ~]# vi nodes-taints.tmpl
{{printf "%-50s %-12s\n" "Node" "Taint"}}
{{- range .items}}
{{- if $taint := (index .spec "taints") }}
{{- .metadata.name }}{{ "\t" }}
{{- range $taint }}
{{- .key }}={{ .value }}:{{ .effect }}{{ "\t" }}
{{- end }}
{{- "\n" }}
{{- end}}
{{- end}}
[root@hdss7-22 ~]# kubectl get nodes -o go-template-file="./nodes-taints.tmpl"
Node Taint
hdss7-22.host.com qd=bxj:NoSchedule
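As a rough cross-check of the template above, the same listing can be sketched in Python over the JSON that `kubectl get nodes -o json` prints. The function name `format_taints` and the sample data are illustrative assumptions, not part of the original setup; the sample only mirrors the taint created above.

```python
def format_taints(node_list):
    """Return one line per tainted node: the name, then key=value:effect entries."""
    lines = []
    for node in node_list.get("items", []):
        taints = node.get("spec", {}).get("taints") or []
        if not taints:
            continue  # like the template, skip nodes that carry no taints
        entries = [
            # The Go template prints "<no value>" for a missing field; mimic that.
            f'{t["key"]}={t.get("value", "<no value>")}:{t["effect"]}'
            for t in taints
        ]
        lines.append(node["metadata"]["name"] + "\t" + "\t".join(entries))
    return lines


# Hypothetical sample shaped like `kubectl get nodes -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "hdss7-21.host.com"}, "spec": {}},
        {
            "metadata": {"name": "hdss7-22.host.com"},
            "spec": {"taints": [{"key": "qd", "value": "bxj", "effect": "NoSchedule"}]},
        },
    ]
}

for line in format_taints(sample):
    print(line)
```

On a real cluster the dictionary would come from `json.loads` applied to the kubectl output rather than from a literal.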
Scale back up to two replicas and check whether a pod can start on hdss7-22.
[root@hdss7-21 ~]# kubectl scale deployment nginx-dp --replicas=2 -n kube-system
deployment.extensions/nginx-dp scaled
Both pods still start on hdss7-21.
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system |grep nginx-dp
nginx-dp-6f8459c455-hpz7s 1/1 Running 0 3h9m 172.7.21.8 hdss7-21.host.com <none> <none>
nginx-dp-6f8459c455-x58p5 1/1 Running 0 111s 172.7.21.9 hdss7-21.host.com <none> <none>
Summary: once a node carries a NoSchedule taint, pods from a deployment that declares no matching toleration are no longer scheduled to that node.
2. Configuring a toleration so that pods can be scheduled to the node despite its taint
Edit nginx-dp.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx-dp
  name: nginx-dp
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-dp
  template:
    metadata:
      labels:
        app: nginx-dp
    spec:
      tolerations:
      - key: qd
        value: bxj
        effect: NoSchedule
      containers:
      - image: harbor.od.com:180/public/nginx:v1.7.9
        imagePullPolicy: IfNotPresent
        name: nginx
The key part is this stanza:
      tolerations:
      - key: qd
        value: bxj
        effect: NoSchedule
It means: the pod's toleration is compared with the node's taints, and if a taint has exactly the same key=value:effect, that is, it is identical to qd=bxj:NoSchedule, the toleration allows pods from nginx-dp.yaml to be deployed on that node.
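The matching rule described above can be sketched in Python as follows. This is a simplified model, not the scheduler's code: the function name `toleration_matches` is hypothetical, and the `Exists` operator and the "empty effect matches every effect" rule are standard Kubernetes behavior that the original walkthrough does not exercise.

```python
def toleration_matches(toleration, taint):
    """Return True if a single toleration tolerates a single taint."""
    operator = toleration.get("operator", "Equal")  # Equal is the default

    # An empty effect in the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False

    key = toleration.get("key", "")
    if key:
        if key != taint["key"]:
            return False
    elif operator != "Exists":
        return False  # an empty key is only meaningful together with Exists

    if operator == "Exists":
        return True  # any value is accepted

    # Equal: the values must be identical (a missing value counts as "").
    return toleration.get("value", "") == taint.get("value", "")


node_taint = {"key": "qd", "value": "bxj", "effect": "NoSchedule"}
print(toleration_matches({"key": "qd", "value": "bxj", "effect": "NoSchedule"}, node_taint))
print(toleration_matches({"key": "qd", "value": "other", "effect": "NoSchedule"}, node_taint))
```

The first call models the toleration from nginx-dp.yaml; the second changes only the value, which is enough to make the default `Equal` comparison fail.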
Apply the file and check nginx-dp: a pod has now been deployed on hdss7-22.
[root@hdss7-21 ~]# kubectl apply -f nginx-dp.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
deployment.extensions/nginx-dp configured
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system |grep nginx-dp
nginx-dp-6d99d9c6b9-ftmsz 1/1 Running 0 21s 172.7.22.2 hdss7-22.host.com <none> <none>
nginx-dp-6d99d9c6b9-lbs7s 1/1 Running 0 13s 172.7.21.10 hdss7-21.host.com <none> <none>
nginx-dp-6f8459c455-hpz7s 0/1 Terminating 0 3h28m <none> hdss7-21.host.com <none> <none>
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system |grep nginx-dp
nginx-dp-6d99d9c6b9-ftmsz 1/1 Running 0 25s 172.7.22.2 hdss7-22.host.com <none> <none>
nginx-dp-6d99d9c6b9-lbs7s 1/1 Running 0 17s 172.7.21.10 hdss7-21.host.com <none> <none>
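Summary: for a NoSchedule taint, as long as nginx-dp.yaml declares an identical toleration, pods from that file may be deployed on the tainted node.
3. The NoExecute effect
We have seen NoSchedule: if a pod does not declare a toleration for the taint, the system will not schedule that pod onto a node carrying the taint.
NoExecute defines eviction behavior, for example to cope with node failures: pods that do not tolerate the taint are not scheduled to the node, and pods already running there are evicted.
Scale the deployment down to zero replicas, remove the qd=bxj:NoSchedule taint, and create a NoExecute taint.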
[root@hdss7-21 ~]# kubectl scale deployment nginx-dp --replicas=0 -n kube-system
deployment.extensions/nginx-dp scaled
[root@hdss7-22 ~]# kubectl taint node hdss7-22.host.com qd-
node/hdss7-22.host.com untainted
[root@hdss7-22 ~]# kubectl taint node hdss7-22.host.com qd=bxj:NoExecute
node/hdss7-22.host.com tainted
Edit nginx-dp.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx-dp
  name: nginx-dp
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-dp
  template:
    metadata:
      labels:
        app: nginx-dp
    spec:
      tolerations:
      - key: qd
        value: bxj
        effect: NoExecute
      containers:
      - image: harbor.od.com:180/public/nginx:v1.7.9
        imagePullPolicy: IfNotPresent
        name: nginx
Scale to 2 replicas. nginx-dp runs on hdss7-21, and all of its pods end up on hdss7-21.
[root@hdss7-21 ~]# kubectl apply -f nginx-dp.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
deployment.extensions/nginx-dp configured
[root@hdss7-21 ~]# kubectl apply -f nginx-dp.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
deployment.extensions/nginx-dp configured
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
blackbox-exporter-6b87d9d798-4qw6h 1/1 Running 1 13h 172.7.21.7 hdss7-21.host.com <none> <none>
cadvisor-kgpzc 1/1 Running 0 66m 10.4.7.21 hdss7-21.host.com <none> <none>
coredns-6b6c4f9648-bpg6k 1/1 Running 29 37d 172.7.21.6 hdss7-21.host.com <none> <none>
heapster-85c94856f7-8zfkf 1/1 Running 25 37d 172.7.21.4 hdss7-21.host.com <none> <none>
kube-state-metrics-6bc667c8b9-fg7jt 1/1 Running 0 15m 172.7.21.9 hdss7-21.host.com <none> <none>
kubernetes-dashboard-7977cc79db-dxz7r 1/1 Running 38 37d 172.7.21.3 hdss7-21.host.com <none> <none>
nginx-dp-6d99d9c6b9-knfwf 1/1 Running 0 8m12s 172.7.21.11 hdss7-21.host.com <none> <none>
nginx-dp-6d99d9c6b9-qhplg 1/1 Running 0 8m12s 172.7.21.10 hdss7-21.host.com <none> <none>
node-exporter-c92vd 1/1 Running 1 32h 10.4.7.21 hdss7-21.host.com <none> <none>
traefik-ingress-xs4md 1/1 Running 34 49d 172.7.21.2 hdss7-21.host.com <none> <none>
[root@hdss7-21 ~]# kubectl scale deployment nginx-dp --replicas=2 -n kube-system
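Summary: NoExecute is also a taint effect. Once a node carries a NoExecute taint, pods that do not tolerate it cannot be scheduled to that node, and pods already running there are evicted immediately. In the run above neither pod landed on hdss7-22, even though nginx-dp.yaml declares qd=bxj:NoExecute, identical to the node's taint. Note that the pods still carry the 6d99d9c6b9 template hash seen in step 2, which suggests they were created from the earlier NoSchedule-toleration template, so this output does not show a matching toleration being ineffective. In Kubernetes, a pod whose toleration matches a NoExecute taint can be scheduled to the node and keeps running there. A toleration only permits placement and does not force it, so the scheduler may still pick another node.
What NoExecute is used for:
Pods that do not tolerate a NoExecute taint are evicted from the node. This is useful in production when a compute node has to be taken offline for maintenance: first add a NoExecute taint so that the node is drained of its containers, then run kubectl delete node to remove it from the cluster. That is the sound way to do it.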
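The draining behavior can be illustrated with a small Python sketch. It is a simplified model under stated assumptions: tolerations are compared by exact equality only (the default `Equal` operator), `tolerationSeconds` is ignored, and the pod names are made up for the example.

```python
NO_EXECUTE = "NoExecute"


def pods_to_evict(running_pods, new_taint):
    """Return the sorted names of pods that the new taint would evict.

    running_pods maps pod name -> set of (key, value, effect) tolerations.
    new_taint is a (key, value, effect) tuple.
    """
    effect = new_taint[2]
    if effect != NO_EXECUTE:
        return []  # NoSchedule only affects new placements, not running pods
    return sorted(
        name
        for name, tolerations in running_pods.items()
        if new_taint not in tolerations
    )


# Hypothetical pods running on the node that is about to be drained.
pods = {
    "nginx-dp-a": set(),
    "monitoring-agent": {("qd", "bxj", "NoExecute")},
}

print(pods_to_evict(pods, ("qd", "bxj", "NoExecute")))   # ['nginx-dp-a']
print(pods_to_evict(pods, ("qd", "bxj", "NoSchedule")))  # []
```

Only the pod without a matching toleration is listed for eviction; the same taint with the NoSchedule effect would leave both pods in place.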
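4. Understanding what taints are for
Scenario 1:
Once started, Prometheus easily uses 2 GB of memory, and in production it is typically around 20 GB. For that reason Prometheus usually gets a dedicated compute node in production that runs nothing but Prometheus. No other container should be scheduled there, and even if one were, it could not start because no resources are left. In this case, taint the Prometheus node and let only Prometheus tolerate the taint, so that nothing except Prometheus can be scheduled onto it.
Scenario 2:
The company buys a batch of machines with SAS disks, SATA disks, and SSDs (high performance) and turns them into nodes. By labeling, the SAS nodes are grouped together, the SATA nodes are grouped together, and the SSD nodes are grouped together, so that high-concurrency, high-request workloads are scheduled onto the SSD (high performance) nodes while rarely used workloads go to the SAS and SATA (low performance) nodes.
Example: give the SAS machines (compute nodes) the label disktype=sas and the SSD machines (compute nodes) the label disktype=ssd. Scheduling can then be steered by hand: to place an IO-intensive container on SSD, add a nodeSelector (label selector) for disktype=ssd to the pod spec so that it selects the SSD nodes.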
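The label selection in scenario 2 comes down to a subset check: a node qualifies when its labels contain every key/value pair of the selector. A minimal Python sketch, with made-up node names and a hypothetical helper `nodes_matching`:

```python
def nodes_matching(selector, nodes):
    """Return the names of nodes whose labels contain every selector pair."""
    return [
        name
        for name, labels in nodes.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


# Hypothetical nodes with the disktype labels from the example.
nodes = {
    "node-a": {"disktype": "ssd"},
    "node-b": {"disktype": "sas"},
    "node-c": {"disktype": "ssd", "zone": "rack1"},
}

print(nodes_matching({"disktype": "ssd"}, nodes))  # ['node-a', 'node-c']
```

Extra labels on a node (such as `zone` here) do not prevent a match, and an empty selector matches every node.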
5. Multiple taints on a node
5.1. Adding multiple taints
[root@hdss7-21 kube-apiserver]# kubectl scale deployment nginx-dp --replicas=0 -n kube-system
[root@hdss7-22 rules.d]# kubectl describe node |grep -i taint   # hdss7-22 has no taints
Taints: <none>
[root@hdss7-21 kube-apiserver]# kubectl taint node hdss7-21.host.com qd=bxj:NoSchedule
node/hdss7-21.host.com tainted
[root@hdss7-21 kube-apiserver]# kubectl taint node hdss7-21.host.com qd=bxz:NoSchedule   # the same key with the same effect is not allowed twice
error: Node hdss7-21.host.com already has qd taint(s) with same effect(s) and --overwrite is false
[root@hdss7-21 kube-apiserver]# kubectl taint node hdss7-21.host.com qd-
node/hdss7-21.host.com untainted
A taint may or may not have a value (key=<no value>:effect).
[root@hdss7-21 kube-apiserver]# kubectl taint node hdss7-21.host.com bxj=:NoSchedule
node/hdss7-21.host.com tainted
[root@hdss7-21 kube-apiserver]# kubectl taint node hdss7-21.host.com bxz=:NoSchedule
node/hdss7-21.host.com tainted
View the taints
[root@hdss7-21 kube-apiserver]# kubectl describe node hdss7-21 |grep -i taint   # grep shows only one of the taints
Taints: bxj:NoSchedule
The template shows all of them
[root@hdss7-21 ~]# kubectl get nodes -o go-template-file="./nodes-taints.tmpl"
Node Taint
hdss7-21.host.com bxz=<no value>:NoSchedule bxj=<no value>:NoSchedule
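To make the taint syntax used in this section concrete, here is a small Python sketch that splits a `key=value:effect` string (the value may be empty, as in `bxj=:NoSchedule`) and reproduces the "same key and same effect" conflict reported by kubectl above. The helper names `parse_taint` and `conflicts` are hypothetical, and the removal form `key-` is deliberately not handled.

```python
def parse_taint(spec):
    """Split 'key=value:effect' into a (key, value, effect) tuple."""
    head, sep, effect = spec.rpartition(":")
    if not sep or not head or not effect:
        raise ValueError(f"invalid taint spec: {spec!r}")
    key, _, value = head.partition("=")  # value is "" when nothing follows '='
    return key, value, effect


def conflicts(existing, new):
    """True if a taint with the same key and the same effect already exists."""
    return any(key == new[0] and effect == new[2] for key, _, effect in existing)


node_taints = [parse_taint("qd=bxj:NoSchedule")]

print(parse_taint("bxj=:NoSchedule"))                           # ('bxj', '', 'NoSchedule')
print(conflicts(node_taints, parse_taint("qd=bxz:NoSchedule")))  # True
print(conflicts(node_taints, parse_taint("bxz=:NoSchedule")))    # False
```

The second call corresponds to the rejected `qd=bxz:NoSchedule` command: the value differs, but key and effect are the same, so it counts as a conflict.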
5.2. Scheduling with two taints
hdss7-22 currently has no taints, while hdss7-21 carries the bxj and bxz taints. If the deployment tolerates both bxj and bxz, can its pods be scheduled to hdss7-21?
[root@hdss7-21 ~]# vi nginx-dp.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx-dp
  name: nginx-dp
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-dp
  template:
    metadata:
      labels:
        app: nginx-dp
    spec:
      tolerations:
      - key: bxj
        effect: NoSchedule
      - key: bxz
        effect: NoSchedule
      containers:
      - image: harbor.od.com:180/public/nginx:v1.7.9
        imagePullPolicy: IfNotPresent
        name: nginx
[root@hdss7-21 ~]# kubectl apply -f nginx-dp.yaml
deployment.extensions/nginx-dp configured
[root@hdss7-21 ~]# kubectl scale deployment nginx-dp --replicas=4 -n kube-system
deployment.extensions/nginx-dp scaled
Pods can be scheduled to both hdss7-21 and hdss7-22.
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-dp-747869cdcb-5q8lr 1/1 Running 0 112s 172.7.21.5 hdss7-21.host.com <none> <none>
nginx-dp-747869cdcb-cntz5 1/1 Running 0 112s 172.7.21.6 hdss7-21.host.com <none> <none>
nginx-dp-747869cdcb-j98sn 1/1 Running 0 2m55s 172.7.21.4 hdss7-21.host.com <none> <none>
nginx-dp-747869cdcb-xxxq2 1/1 Running 0 2m55s 172.7.22.10 hdss7-22.host.com <none> <none>
Remove one of the tolerations (bxz) from nginx-dp.yaml
[root@hdss7-21 ~]# vi nginx-dp.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: nginx-dp
  name: nginx-dp
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-dp
  template:
    metadata:
      labels:
        app: nginx-dp
    spec:
      tolerations:
      - key: bxj
        effect: NoSchedule
      containers:
      - image: harbor.od.com:180/public/nginx:v1.7.9
        imagePullPolicy: IfNotPresent
        name: nginx
[root@hdss7-21 ~]# kubectl apply -f nginx-dp.yaml
deployment.extensions/nginx-dp configured
Pods can now be scheduled only to hdss7-22.
[root@hdss7-21 ~]# kubectl get pods -o wide -n kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-dp-747869cdcb-j98sn 0/1 Terminating 0 5m32s 172.7.21.4 hdss7-21.host.com <none> <none>
nginx-dp-747869cdcb-xxxq2 1/1 Running 0 5m32s 172.7.22.10 hdss7-22.host.com <none> <none>
nginx-dp-855d84b6b8-95wl7 0/1 ContainerCreating 0 3s 172.7.22.16 hdss7-22.host.com <none> <none>
nginx-dp-855d84b6b8-h5xhs 1/1 Running 0 6s 172.7.22.11 hdss7-22.host.com <none> <none>
node-exporter-c92vd 1/1 Running 3 3d3h 10.4.7.21 hdss7-21.host.com <none> <none>
node-exporter-m9r4z 1/1 Running 1 42h 10.4.7.22 hdss7-22.host.com <none> <none>
Summary: if even one of a node's taints is not tolerated, the pod is not scheduled to that node.
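The "every taint must be tolerated" rule can be sketched in Python as a set difference. This is a simplified model that compares taints and tolerations as exact (key, value, effect) tuples, which corresponds to the default `Equal` operator used in the YAML above; the helper names are hypothetical.

```python
def untolerated_taints(node_taints, pod_tolerations):
    """Return the node taints that no pod toleration covers (sorted)."""
    return sorted(set(node_taints) - set(pod_tolerations))


def can_schedule(node_taints, pod_tolerations):
    """A pod fits only if not a single taint is left untolerated."""
    return not untolerated_taints(node_taints, pod_tolerations)


# Taints as set up in 5.1; tolerations as in the two versions of nginx-dp.yaml.
hdss7_21 = [("bxj", "", "NoSchedule"), ("bxz", "", "NoSchedule")]
hdss7_22 = []
both = [("bxj", "", "NoSchedule"), ("bxz", "", "NoSchedule")]
only_bxj = [("bxj", "", "NoSchedule")]

print(can_schedule(hdss7_21, both))            # True
print(can_schedule(hdss7_21, only_bxj))        # False
print(untolerated_taints(hdss7_21, only_bxj))  # [('bxz', '', 'NoSchedule')]
print(can_schedule(hdss7_22, only_bxj))        # True
```

With only the bxj toleration, the bxz taint remains uncovered on hdss7-21, so the untainted hdss7-22 is the only node left, which matches the pod listing above.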
6. Taints on all nodes
7. Kubernetes nodeSelector explained
Labels are an important concept in K8s. They act as identifiers, and the associations between Services, Deployments, and Pods are all made through labels. Every node has labels as well, and label-based policies can bind pods to the nodes that carry the corresponding labels.
nodeSelector
nodeSelector is the simplest form of constraint. It is a field of PodSpec. Use --show-labels to view the labels of the current nodes:
$ kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
minikube Ready <none> 1m v1.10.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=minikube
If no extra node labels have been added, you see the default labels shown above. Use the kubectl label node command to add labels to a given node:
$ kubectl label node minikube disktype=ssd
node/minikube labeled
$ kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
minikube Ready <none> 5m v1.10.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,disktype=ssd,kubernetes.io/host
You can also remove a given label with kubectl label node by appending a - to the label key:
$ kubectl label node minikube disktype-
node/minikube labeled
$ kubectl get node --show-labels
NAME STATUS ROLES AGE VERSION LABELS
minikube Ready <none> 23m v1.10.0 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=minikube
Create a test pod and bind it to a node with the nodeSelector option:
$ cat nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disktype: ssd
$ kubectl create -f nginx.yaml
pod/nginx created
Check which node the pod was scheduled to. It is the minikube node carrying the disktype=ssd label that we specified:
$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx 1/1 Running 0 1m 172.18.0.4 minikube