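I. Resource Limits
- In production, pods need resource limits. A pod is made up of containers, and containers share the host's kernel and resources, so without limits a single pod can take a disproportionate share of a node's CPU and memory.
- The official Kubernetes documentation defines and explains resource management for pods and containers: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
- The fields that control resources in a pod's YAML file are:
#resources is the resource settings field of each container
##requests is the amount of a resource reserved for the container; the scheduler uses it to pick a node
##limits is the upper bound, i.e. the most of a resource the container may use
spec.containers[].resources.limits.cpu //CPU limit
spec.containers[].resources.limits.memory //memory limit
spec.containers[].resources.requests.cpu //CPU requested when the container is created
spec.containers[].resources.requests.memory //memory requested when the container is created
Example:
- Create the YAML file
#The YAML file below defines two containers in a single pod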
[root@master demo]# cat pod2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: db
    image: mysql
    env:
    - name: MYSQL_ROOT_PASSWORD
      value: "password"
    resources:
      requests:
        memory: "64Mi"   ##requested memory: 64Mi
        cpu: "250m"      ##requested CPU: 250m, a quarter of one core
      limits:
        memory: "128Mi"  ##memory limit for this container: 128Mi
        cpu: "500m"      ##CPU limit for this container: 500m, half of one core
  - name: wp
    image: wordpress
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
[root@master demo]#
- Create the pod
##Both apply and create can create resources; apply can also update an existing resource from the file
[root@master demo]# kubectl apply -f pod2.yaml
pod/frontend created
- View the events. The steps of creating a resource are recorded as events, so errors that occur during creation can be found there.
[root@master demo]# kubectl describe pod frontend
Name: frontend
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: 192.168.43.103/192.168.43.103
Start Time: Tue, 12 May 2020 10:14:15 +0800
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"frontend","namespace":"default"},"spec":{"containers":[{"env":[{"name...
Status: Running
IP: 172.17.60.2
Containers:
db:
Container ID: docker://b51bcdd6f7962d3f0fa73c2cd93fca5d58fd7ebe74eb428cc5fb91b4f0935929
Image: mysql
Image ID: docker-pullable://mysql@sha256:61a2a33f4b8b4bc93b7b6b9e65e64044aaec594809f818aeffbff69a893d1944
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 12 May 2020 10:14:26 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 500m
memory: 128Mi
Requests:
cpu: 250m
memory: 64Mi
Environment:
MYSQL_ROOT_PASSWORD: password
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-h4tl7 (ro)
wp:
Container ID: docker://668b88b91a49acaf0278d628e2a158b21ba0ecc26fced51288eb8cb243b4589a
Image: wordpress
Image ID: docker-pullable://wordpress@sha256:c3312ab9d4b35148c3ae6f6e06ca3a999850c4aa34dbe310856c52311ec06a93
Port: <none>
Host Port: <none>
State: Running
Started: Tue, 12 May 2020 10:14:33 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 500m
memory: 128Mi
Requests:
cpu: 250m
memory: 64Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-h4tl7 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-h4tl7:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-h4tl7
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 21s default-scheduler Successfully assigned default/frontend to 192.168.43.103
Normal Pulling 19s kubelet, 192.168.43.103 pulling image "mysql"
Normal Pulled 10s kubelet, 192.168.43.103 Successfully pulled image "mysql"
Normal Created 10s kubelet, 192.168.43.103 Created container
Normal Started 10s kubelet, 192.168.43.103 Started container
Normal Pulling 10s kubelet, 192.168.43.103 pulling image "wordpress"
Normal Pulled 4s kubelet, 192.168.43.103 Successfully pulled image "wordpress"
Normal Created 4s kubelet, 192.168.43.103 Created container
Normal Started 3s kubelet, 192.168.43.103 Started container
[root@master demo]#
- Check the resource status of the node
#Find out which node the pod is running on
[root@master demo]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
frontend 2/2 Running 0 14s 172.17.60.2 192.168.43.103 <none>
##Inspect the resources of the node that runs the pod
[root@master demo]# kubectl describe nodes 192.168.43.103
Name: 192.168.43.103
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=192.168.43.103
Annotations: node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Mon, 27 Apr 2020 20:46:21 +0800
Taints: <none>
Unschedulable: false
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Tue, 12 May 2020 10:23:48 +0800 Tue, 12 May 2020 09:09:37 +0800 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Tue, 12 May 2020 10:23:48 +0800 Tue, 12 May 2020 09:09:37 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Tue, 12 May 2020 10:23:48 +0800 Tue, 12 May 2020 09:09:37 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Tue, 12 May 2020 10:23:48 +0800 Mon, 27 Apr 2020 20:46:21 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Tue, 12 May 2020 10:23:48 +0800 Tue, 12 May 2020 09:09:37 +0800 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.43.103
Hostname: 192.168.43.103
Capacity:
cpu: 1
ephemeral-storage: 20470Mi
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 2911652Ki
pods: 110
Allocatable:
cpu: 1
ephemeral-storage: 19317915617
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 2809252Ki
pods: 110
System Info:
Machine ID: bf6c47173ce244fc94186bd579f13d7f
System UUID: EB0A4D56-93E1-9352-9F9E-D0F9B49FCECE
Boot ID: c5da0e09-5876-419a-ac62-689058e2e389
Kernel Version: 3.10.0-1062.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://19.3.8
Kubelet Version: v1.12.3
Kube-Proxy Version: v1.12.3
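##The section below shows the resource requests and limits
##CPU requests total 500m (50% of the node's 1 CPU) and CPU limits total 1 (100%)
##Memory requests total 128Mi and memory limits total 256Mi
##The pod has two containers, so the per-container values are added together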
Non-terminated Pods: (1 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
default frontend 500m (50%) 1 (100%) 128Mi (4%) 256Mi (9%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 500m (50%) 1 (100%)
memory 128Mi (4%) 256Mi (9%)
Events: <none>
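The totals and percentages in the output above can be checked by hand from the two containers' values and the node's Allocatable figures (cpu: 1, memory: 2809252Ki). The worked arithmetic below is a sketch; that kubectl truncates the percentage to a whole number is an inference from the figures shown, not something the output states.

```latex
\begin{align*}
\text{CPU requests} &= 250\text{m} + 250\text{m} = 500\text{m}, &
\frac{500\text{m}}{1000\text{m}} &= 50\% \\
\text{CPU limits} &= 500\text{m} + 500\text{m} = 1000\text{m} = 1, &
\frac{1000\text{m}}{1000\text{m}} &= 100\% \\
\text{Memory requests} &= 64\text{Mi} + 64\text{Mi} = 128\text{Mi} = 131072\text{Ki}, &
\frac{131072}{2809252} &\approx 4.67\% \rightarrow 4\% \\
\text{Memory limits} &= 128\text{Mi} + 128\text{Mi} = 256\text{Mi} = 262144\text{Ki}, &
\frac{262144}{2809252} &\approx 9.33\% \rightarrow 9\%
\end{align*}
```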
- Check the pod status and the namespaces
##READY 2/2 means both containers in the pod are ready
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
frontend 2/2 Running 0 27s
##Pods go into the default namespace unless another namespace is specified
[root@master demo]# kubectl get ns
NAME STATUS AGE
default Active 14d
kube-public Active 14d
kube-system Active 14d
[root@master demo]#
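II. Restart Policy
- The restart policy defines what happens to a pod's containers after they exit or fail
1.Always: always restart the container after it terminates; this is the default policy
2.OnFailure: restart the container only when it exits abnormally (non-zero exit code)
3.Never: never restart the container after it terminates
Note: Kubernetes has no operation for restarting a Pod object; a pod can only be deleted and recreated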
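The policy is set with the restartPolicy field at the pod level (spec.restartPolicy), so it applies to every container in the pod. The manifest below is a minimal sketch of the OnFailure case; the pod name foo-onfailure is hypothetical. Because the command exits with status 0, the container should not be restarted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: foo-onfailure        # hypothetical name, used only for this sketch
spec:
  restartPolicy: OnFailure   # restart only when a container exits with a non-zero code
  containers:
  - name: busybox
    image: busybox
    args:
    - /bin/sh
    - -c
    - sleep 10; exit 0       # normal exit, so no restart is expected
```

Changing exit 0 to a non-zero code would make the kubelet restart the container under this policy.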
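- Pods created by the existing controllers use the Always policy
An example follows
- Edit the YAML file that defines the pod (the restart policy is left at its default here)
##The container exits abnormally after 10 seconds; check whether it gets restarted
##Watch the RESTARTS count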
[root@master demo]# cat pod3.yaml
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  containers:
  - name: busybox
    image: busybox
    args:                      ##command arguments
    - /bin/sh
    - -c
    - sleep 10; exit 3         ##sleep 10s, then exit with status code 3
[root@master demo]#
- Create the pod and watch the restart count
##The pod status goes ContainerCreating --> Running --> Error --> Running
##and RESTARTS becomes 1
[root@master demo]# kubectl create -f pod3.yaml
pod/foo created
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 0/1 ContainerCreating 0 10s
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 1/1 Running 0 13s
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 0/1 Error 0 31s
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 1/1 Running 1 39s
[root@master demo]#
- Redefine pod3.yaml and add a restart policy so that the container is not restarted even after an abnormal exit
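The redefined file itself is not shown at this step. A plausible version, assuming the only change is a restartPolicy: Never line added under spec and everything else is the earlier pod3.yaml unchanged:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: foo
spec:
  restartPolicy: Never       # assumed addition: never restart, whatever the exit code
  containers:
  - name: busybox
    image: busybox
    args:
    - /bin/sh
    - -c
    - sleep 10; exit 3       # same abnormal exit as before
```

Because a pod named foo already exists from the previous step, it has to be deleted first (for example with kubectl delete -f pod3.yaml) before the file can be created again.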
- Create the resource and check its status
[root@master demo]# kubectl create -f pod3.yaml
pod/foo created
##Once the status is Error, the container is not restarted again
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 0/1 ContainerCreating 0 5s
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 1/1 Running 0 12s
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 1/1 Running 0 20s
[root@master demo]# kubectl get pod
NAME READY STATUS RESTARTS AGE
foo 0/1 Error 0 22s
[root@master demo]#
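III. Health Checks (Probes)
- Kubernetes checks pod health with two kinds of probes: LivenessProbe (liveness probe) and ReadinessProbe (readiness probe). The kubelet runs them periodically to diagnose the health of each container.
- The Kubernetes documentation on probes: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
- Both probes can be defined on the same container
LivenessProbe (liveness probe)
- Liveness probe: determines whether the container is alive (Running state). If the probe finds the container unhealthy, the kubelet kills it and then acts according to the container's restart policy. If a container defines no LivenessProbe, the kubelet treats the probe result as always Success.
ReadinessProbe (readiness probe)
- Readiness probe: determines whether the container's service is available (Ready state). Only a pod that has reached the Ready state can receive requests. For pods managed by a Service, the association between the Service and the pod's Endpoint is also set according to whether the pod is Ready. If the Ready state becomes false while the pod is running, the system automatically removes the pod from the Service's backend Endpoint list, and adds it back once the pod returns to Ready. This ensures that clients accessing the Service are not forwarded to a pod instance whose service is unavailable.
The Endpoints of a Service are the list of pod addresses that the Service load-balances across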
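To illustrate that the two probes can sit side by side on one container, here is a minimal sketch. The pod name, label, and timing values are hypothetical; the tcpSocket and httpGet check methods it uses are described in the next part.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo            # hypothetical name
  labels:
    app: probe-demo           # hypothetical label that a Service could select on
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    livenessProbe:            # on failure the kubelet kills the container; the restart policy then applies
      tcpSocket:
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:           # on failure the pod is taken out of the Service endpoints; no restart
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 5
```

The two probes are evaluated independently, so a container can be alive but not yet ready.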
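Three ways for a probe to check
- Liveness and readiness probes can each be configured with any of these three check methods
ExecAction
- Runs a command inside the container; an exit code of 0 means the container is healthy.
- In the example below, the command "cat /tmp/healthy" decides whether the container is running normally. The container creates /tmp/healthy at startup and deletes it 20s later. The liveness probe first runs 5s after start (initialDelaySeconds) and then every 5s. Once the file is gone the probe fails, and after repeated failures (three in a row by default) the kubelet kills and restarts the container.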
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 20; rm -rf /tmp/healthy; sleep 100s #sleep another 100s so Kubernetes has time to check the pod
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      timeoutSeconds: 5
      periodSeconds: 5
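##initialDelaySeconds: how long to wait after the container starts before the first health check, in seconds
##timeoutSeconds: how long to wait for a response after the check is sent, in seconds. A timeout counts as a failed check; when the kubelet concludes the container can no longer serve, it restarts the container.
##periodSeconds: how often the check is run, in seconds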
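TCPSocketAction
- Performs a TCP check against the container's IP address and a port; if a TCP connection can be established, the container is healthy.
- The example below checks health by opening a TCP connection to port 80 of the container: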
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-healthcheck
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 30 #wait 30s before the first check
      timeoutSeconds: 1
      periodSeconds: 5 #check every 5 seconds
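HTTPGetAction
- Sends an HTTP GET request to the container's IP address, port, and path; a response status code of at least 200 and below 400 means the container is healthy.
- In the example below, the kubelet periodically sends an HTTP request to /_status/healthz on port 80 of the container to check the application's health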
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-healthcheck
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    livenessProbe:
      httpGet:
        path: /_status/healthz #the application must serve this path; a stock nginx image returns 404 for it
        port: 80
      initialDelaySeconds: 15
      timeoutSeconds: 1
      periodSeconds: 5
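TCP and HTTP probes show up as traffic against the site, so they skew the visit counts in its logs (for web services, the TCP/HTTP probe frequency can be lowered). In production the exec check method is used quite often. Different workloads can of course use different check methods.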
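Lowering the probe frequency comes down to raising periodSeconds. The fragment below is a sketch based on the HTTPGetAction example; the value 30 is a hypothetical choice, not a recommendation from the original text.

```yaml
livenessProbe:
  httpGet:
    path: /_status/healthz
    port: 80
  initialDelaySeconds: 15
  timeoutSeconds: 1
  periodSeconds: 30   # hypothetical value: probe every 30s instead of every 5s
```

The trade-off is slower detection: with a longer period, an unhealthy container keeps running longer before the kubelet notices it.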