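Pod lifecycle
The lifecycle of a Pod object spans the time from its creation until it terminates and exits. During that time the Pod passes through several states and performs a number of operations. Creating the main container is the only required operation. The optional operations are running init containers, the post start hook, the liveness probe, the readiness probe, and the pre stop hook; whether they run depends on the Pod definition.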
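Pod phases
Whether it is created by hand, as in the previous sections, or by a controller such as Deployment, a Pod object is always in exactly one of the following phases:
- Pending: the API Server has created the Pod object and stored it in etcd, but the Pod has not been scheduled yet or is still pulling images from the registry.
- Running: the Pod has been scheduled to a node and the kubelet has created all of its containers; at least one container is running, starting, or restarting.
- Succeeded: all containers in the Pod have terminated successfully and will not be restarted; this phase is usually short-lived.
- Failed: all containers have terminated and at least one of them failed, that is, it returned a non-zero exit status or was terminated by the system.
- Unknown: the API Server cannot obtain the state of the Pod, usually because it cannot communicate with the kubelet on the node that hosts the Pod.
The phase is a high-level summary of where the Pod is in its lifecycle, not a comprehensive rollup of container or Pod state. The set of phases and their meanings are strictly defined; only the values listed above exist.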
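To make the phase definitions concrete, here is a minimal Python sketch that maps container states to a phase. It is a toy model, not the kubelet's actual algorithm: `ContainerState` and `pod_phase` are hypothetical names, it assumes `restartPolicy: Never` so that terminated containers stay terminated, and it cannot produce Unknown because that phase depends on node communication rather than on container state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerState:
    """Hypothetical, simplified view of one container's state."""
    state: str                       # "waiting", "running", or "terminated"
    exit_code: Optional[int] = None  # only meaningful when terminated


def pod_phase(scheduled: bool, containers: List[ContainerState]) -> str:
    """Toy mapping from container states to a Pod phase.

    Assumes restartPolicy: Never, so a terminated container is final.
    """
    # Not yet bound to a node, or some container has not started yet.
    if not scheduled or any(c.state == "waiting" for c in containers):
        return "Pending"
    # At least one container is still running.
    if any(c.state == "running" for c in containers):
        return "Running"
    # Everything has terminated: success only if every exit code is 0.
    if all(c.exit_code == 0 for c in containers):
        return "Succeeded"
    return "Failed"


print(pod_phase(True, [ContainerState("terminated", 0),
                       ContainerState("terminated", 137)]))
```

With one container exiting 0 and another exiting 137, the sketch reports Failed, matching the rule that a single failed container is enough.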
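Pod creation process
- The user submits a Pod Spec to the API Server through kubectl or another API client.
- The API Server tries to store the Pod object in etcd; once the write completes, the API Server returns a confirmation to the client.
- The API Server begins to reflect the state change in etcd.
- All Kubernetes components use the watch mechanism to track changes on the API Server.
- kube-scheduler (the scheduler) notices through its watcher that the API Server has created a new Pod object that is not yet bound to any worker node.
- kube-scheduler picks a worker node for the Pod object and updates the API Server with the result.
- The API Server writes the scheduling result to etcd and begins to reflect the scheduling result of this Pod object.
- The kubelet on the target worker node calls the container runtime (Docker in this environment) to start the containers on its node and reports the resulting container status back to the API Server.
- The API Server stores the Pod status in etcd.
- After etcd confirms that the write succeeded, the API Server sends the confirmation to the kubelet, which receives the event through it.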
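The watch-driven handoff in the steps above can be imitated with a few callbacks. The following Python sketch is only an in-memory toy: `ToyApiServer`, `make_scheduler`, `make_kubelet`, and the node name `node-1` are all hypothetical, and none of this uses a real Kubernetes client. It shows only the order in which the client, the scheduler, and the kubelet react to changes they observe.

```python
from typing import Callable, Dict, List


class ToyApiServer:
    """In-memory stand-in for API Server plus etcd (hypothetical)."""

    def __init__(self) -> None:
        self.store: Dict[str, dict] = {}
        self.watchers: List[Callable[[dict], None]] = []
        self.log: List[str] = []

    def watch(self, callback: Callable[[dict], None]) -> None:
        self.watchers.append(callback)

    def write(self, pod: dict, who: str) -> None:
        # Persist first, then notify every watcher of the change.
        self.store[pod["name"]] = dict(pod)
        self.log.append(f"{who}: node={pod.get('node')} status={pod.get('status')}")
        for notify in list(self.watchers):
            notify(dict(pod))


def make_scheduler(api: ToyApiServer, nodes: List[str]) -> Callable[[dict], None]:
    def on_change(pod: dict) -> None:
        # React only to Pods that are not bound to a node yet.
        if pod.get("node") is None:
            pod["node"] = nodes[0]  # trivial "scheduling" decision
            api.write(pod, "scheduler")
    return on_change


def make_kubelet(api: ToyApiServer, node_name: str) -> Callable[[dict], None]:
    def on_change(pod: dict) -> None:
        # React only to Pods bound to this node that are not running yet.
        if pod.get("node") == node_name and pod.get("status") != "Running":
            pod["status"] = "Running"  # pretend the containers started
            api.write(pod, f"kubelet@{node_name}")
    return on_change


api = ToyApiServer()
api.watch(make_scheduler(api, ["node-1"]))
api.watch(make_kubelet(api, "node-1"))
api.write({"name": "demo", "node": None, "status": "Pending"}, "client")
print("\n".join(api.log))
```

No component calls another directly; each one reacts to state it observes through the shared store, which is the property the watch mechanism provides.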
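Pod restart policy
The restartPolicy field controls whether the containers in a Pod are restarted after they exit. It takes one of three values:
- Always: restart the container whenever it exits; this is the default. For a container that keeps failing, the kubelet restarts it with an exponential back-off: each delay is double the previous one (10 seconds, 20 seconds, 40 seconds, and so on) up to a cap of 300 seconds.
- OnFailure: restart the container only when it exits with an error (a non-zero exit status).
- Never: never restart the container.
Once a Pod has been scheduled to a node it stays on that node; containers are restarted in place according to restartPolicy, and Kubernetes does not move the Pod to another node. If the node itself fails, a controller such as Deployment creates a replacement Pod elsewhere.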
[root@master-0 ~]# kubectl explain pods.spec.restartPolicy
KIND: Pod
VERSION: v1
FIELD: restartPolicy <string>
DESCRIPTION:
Restart policy for all containers within the pod. One of Always, OnFailure,
Never. Default to Always. More info:
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy
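The restart decision and the back-off schedule can be written down in a few lines. This Python sketch is illustrative only: the function names are hypothetical, and the base delay of 10 seconds with doubling follows the schedule described in the Kubernetes documentation (10s, 20s, 40s, capped at five minutes) rather than anything read from a running cluster.

```python
from itertools import islice
from typing import Iterator


def should_restart(policy: str, exit_code: int) -> bool:
    """Decide whether a container that just exited is restarted."""
    if policy == "Always":
        return True
    if policy == "OnFailure":
        return exit_code != 0
    if policy == "Never":
        return False
    raise ValueError(f"unknown restartPolicy: {policy!r}")


def restart_delays(base: float = 10.0, factor: float = 2.0,
                   cap: float = 300.0) -> Iterator[float]:
    """Yield successive back-off delays in seconds, capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


print(list(islice(restart_delays(), 7)))
# [10.0, 20.0, 40.0, 80.0, 160.0, 300.0, 300.0]
```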
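Pod termination
A Pod should shut down gracefully. When a request to delete or stop the Pod is submitted, every container in the Pod is sent the termination signal SIGTERM (signal 15). The grace period defaults to 30 seconds and can be set explicitly, for example with terminationGracePeriodSeconds in the Pod spec. Any container still running when the grace period expires is sent SIGKILL and terminated by force.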
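The same two-step shutdown (SIGTERM, wait for the grace period, then SIGKILL) can be reproduced for an ordinary process. The Python sketch below is an analogy on a POSIX system, not kubelet code; `stop_gracefully` is a hypothetical helper and the child process is just a sleeping Python interpreter.

```python
import signal
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, grace_seconds: float = 30.0) -> str:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGTERM)
    try:
        proc.wait(timeout=grace_seconds)
        return "terminated"      # exited on its own within the grace period
    except subprocess.TimeoutExpired:
        proc.kill()              # SIGKILL cannot be caught or ignored
        proc.wait()
        return "killed"


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_gracefully(child, grace_seconds=5.0))
```

A process that handles SIGTERM and exits promptly never sees SIGKILL; one that ignores it is removed only after the full grace period, which is why a hanging shutdown delays Pod deletion by that amount.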
Pod probes
Both the liveness probe and the readiness probe support the following three handler types:
- ExecAction
- TCPSocketAction
- HTTPGetAction
- ExecAction probe syntax
[root@master-0 ~]# kubectl explain pods.spec.containers
   lifecycle    <Object>    # post start and pre stop hooks
     Actions that the management system should take in response to container
     lifecycle events. Cannot be updated.

   livenessProbe    <Object>
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   name    <string> -required-
     Name of the container specified as a DNS_LABEL. Each container in a pod
     must have a unique name (DNS_LABEL). Cannot be updated.

   ports    <[]Object>
     List of ports to expose from the container. Exposing a port here gives the
     system additional information about the network connections a container
     uses, but is primarily informational. Not specifying a port here DOES NOT
     prevent that port from being exposed. Any port which is listening on the
     default "0.0.0.0" address inside a container will be accessible from the
     network. Cannot be updated.

   readinessProbe    <Object>
     Periodic probe of container service readiness. Container will be removed
     from service endpoints if the probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

[root@master-0 ~]# kubectl explain pods.spec.containers.livenessProbe    # readinessProbe works the same way
KIND:     Pod
VERSION:  v1

RESOURCE: livenessProbe <Object>

DESCRIPTION:
     Periodic probe of container liveness. Container will be restarted if the
     probe fails. Cannot be updated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

     Probe describes a health check to be performed against a container to
     determine whether it is alive or ready to receive traffic.

FIELDS:
   exec    <Object>
     One and only one of the following should be specified. Exec specifies the
     action to take.

   failureThreshold    <integer>    # number of probe retries
     Minimum consecutive failures for the probe to be considered failed after
     having succeeded. Defaults to 3. Minimum value is 1.

   httpGet    <Object>
     HTTPGet specifies the http request to perform.

   initialDelaySeconds    <integer>    # delay after container start before probing begins
     Number of seconds after the container has started before liveness probes
     are initiated. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

   periodSeconds    <integer>    # interval between probes
     How often (in seconds) to perform the probe. Default to 10 seconds. Minimum
     value is 1.

   successThreshold    <integer>
     Minimum consecutive successes for the probe to be considered successful
     after having failed. Defaults to 1. Must be 1 for liveness and startup.
     Minimum value is 1.

   tcpSocket    <Object>
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

   timeoutSeconds    <integer>    # probe timeout
     Number of seconds after which the probe times out. Defaults to 1 second.
     Minimum value is 1. More info:
     https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#container-probes

[root@master-0 ~]# vim liveness-exec.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec-pod
  namespace: default
spec:
  containers:
  - name: liveness-exec-pod-container
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    # Create the marker file, remove it after 30 seconds, then keep the container alive.
    command: ["/bin/sh","-c","touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 36000"]
    livenessProbe:
      exec:
        # The probe succeeds only while the marker file exists.
        command: ["test","-e","/tmp/healthy"]
      initialDelaySeconds: 1
      periodSeconds: 3

[root@master-0 ~]# kubectl create -f liveness-exec.yaml
pod/liveness-exec-pod created
[root@master-0 ~]# kubectl describe pods liveness-exec-pod
Name:         liveness-exec-pod
Namespace:    default
Priority:     0
Node:         slave-1.shared/10.211.55.37
Start Time:   Tue, 07 Apr 2020 02:41:40 -0400
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.244.1.6
IPs:
  IP:  10.244.1.6
Containers:
  liveness-exec-pod-container:
    Container ID:  docker://58fecd244cf28a7753c6b9f72ddeabc895cead4ed568aeb1115034fbbd8cd666
    Image:         busybox:latest
    Image ID:      docker-pullable://busybox@sha256:b26cd013274a657b86e706210ddd5cc1f82f50155791199d29b9e86e935ce135
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sh
      -c
      touch /tmp/healthy; sleep30; rm -f /tmp/healthy; sleep 36000
    State:          Running
      Started:      Tue, 07 Apr 2020 02:42:56 -0400
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Tue, 07 Apr 2020 02:42:17 -0400
      Finished:     Tue, 07 Apr 2020 02:42:56 -0400
    Ready:          True
    Restart Count:  2
    Liveness:       exec [test -e /tmp/healthy] delay=1s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lbf5s (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-lbf5s:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lbf5s
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                From                     Message
  ----     ------     ----               ----                     -------
  Normal   Scheduled  <unknown>          default-scheduler        Successfully assigned default/liveness-exec-pod to slave-1.shared
  Normal   Pulled     13s (x3 over 89s)  kubelet, slave-1.shared  Container image "busybox:latest" already present on machine
  Normal   Created    13s (x3 over 89s)  kubelet, slave-1.shared  Created container liveness-exec-pod-container
  Normal   Started    13s (x3 over 89s)  kubelet, slave-1.shared  Started container liveness-exec-pod-container
  Warning  Unhealthy  4s (x9 over 88s)   kubelet, slave-1.shared  Liveness probe failed:
  Normal   Killing    4s (x3 over 82s)   kubelet, slave-1.shared  Container liveness-exec-pod-container failed liveness probe, will be restarted
[root@master-0 ~]# kubectl get pods -w
NAME                READY   STATUS    RESTARTS   AGE
liveness-exec-pod   1/1     Running   3          2m12s
pod-demo            2/2     Running   7          5d23h
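The interaction between failureThreshold and successThreshold is easier to see as a small state machine. The Python sketch below is a simplified model under stated assumptions: `ProbeTracker` is a hypothetical class, the defaults (3 and 1) are taken from the kubectl explain output for these fields, and the real kubelet additionally handles timing, timeouts, and initial delays that are left out here.

```python
class ProbeTracker:
    """Track consecutive probe results and derive a healthy/unhealthy verdict."""

    def __init__(self, failure_threshold: int = 3, success_threshold: int = 1,
                 initially_healthy: bool = True) -> None:
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.healthy = initially_healthy  # assumption: start in the healthy state
        self._failures = 0
        self._successes = 0

    def record(self, ok: bool) -> bool:
        """Feed one probe result and return the current verdict."""
        if ok:
            self._successes += 1
            self._failures = 0  # a success breaks the failure streak
            if not self.healthy and self._successes >= self.success_threshold:
                self.healthy = True
        else:
            self._failures += 1
            self._successes = 0  # a failure breaks the success streak
            if self.healthy and self._failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


tracker = ProbeTracker()
results = [tracker.record(ok) for ok in (True, False, False, True, False, False, False)]
print(results)
# [True, True, True, True, True, True, False]
```

Two failures followed by a success reset the streak, so only the final run of three consecutive failures flips the verdict. For a liveness probe that flip is the point at which the container would be restarted.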
- TCPSocketAction probe syntax
[root@master-0 ~]# kubectl explain pods.spec.containers.livenessProbe.tcpSocket
KIND:     Pod
VERSION:  v1

RESOURCE: tcpSocket <Object>

DESCRIPTION:
     TCPSocket specifies an action involving a TCP port. TCP hooks not yet
     supported

     TCPSocketAction describes an action based on opening a socket

FIELDS:
   host    <string>
     Optional: Host name to connect to, defaults to the pod IP.

   port    <string> -required-
     Number or name of the port to access on the container. Number must be in
     the range 1 to 65535. Name must be an IANA_SVC_NAME.
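A TCP socket probe amounts to "can a connection be opened to this host and port within the timeout". The Python sketch below approximates that check; `tcp_probe` is a hypothetical helper and the one-second default mirrors the timeoutSeconds default, but this is not how the kubelet is implemented internally.

```python
import socket


def tcp_probe(host: str, port: int, timeout_seconds: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established in time."""
    try:
        # Opening the connection is the whole check; no data is exchanged.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False
```

For example, `tcp_probe("127.0.0.1", 80)` reports whether anything is accepting connections on local port 80. It says nothing about whether the application behind the port is answering requests correctly, which is the gap the HTTP handler fills.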
- HTTPGetAction probe syntax
[root@master-0 ~]# kubectl explain pods.spec.containers.livenessProbe.httpGet
KIND:     Pod
VERSION:  v1

RESOURCE: httpGet <Object>

DESCRIPTION:
     HTTPGet specifies the http request to perform.

     HTTPGetAction describes an action based on HTTP Get requests.

FIELDS:
   host    <string>    # host
     Host name to connect to, defaults to the pod IP. You probably want to set
     "Host" in httpHeaders instead.

   httpHeaders    <[]Object>    # custom headers to send in the request
     Custom headers to set in the request. HTTP allows repeated headers.

   path    <string>    # URL path
     Path to access on the HTTP server.

   port    <string> -required-    # port
     Name or number of the port to access on the container. Number must be in
     the range 1 to 65535. Name must be an IANA_SVC_NAME.

   scheme    <string>
     Scheme to use for connecting to the host. Defaults to HTTP.

[root@master-0 ~]# vim liveness-httpget.yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-httpget-container
  namespace: default
spec:
  containers:
  - name: liveness-httpget-container
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    livenessProbe:
      httpGet:
        # The port is referenced by the name declared under ports.
        port: http
        path: /index.html
      initialDelaySeconds: 1
      periodSeconds: 3

[root@master-0 ~]# kubectl create -f liveness-httpget.yaml
pod/liveness-httpget-container created
[root@master-0 ~]# kubectl describe pods liveness-httpget-container
Name:         liveness-httpget-container
Namespace:    default
Priority:     0
Node:         slave-1.shared/10.211.55.37
Start Time:   Tue, 07 Apr 2020 05:07:44 -0400
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.244.1.7
IPs:
  IP:  10.244.1.7
Containers:
  liveness-httpget-container:
    Container ID:   docker://398df0adc6b016d2013a78410a0a0a2bb333ea1216f8f1672048fc55cce9cf59
    Image:          nginx:latest
    Image ID:       docker-pullable://nginx@sha256:282530fcb7cd19f3848c7b611043f82ae4be3781cb00105a1d593d7e6286b596
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 07 Apr 2020 05:07:44 -0400
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:http/index.html delay=1s timeout=1s period=3s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lbf5s (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-lbf5s:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lbf5s
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age        From                     Message
  ----    ------     ----       ----                     -------
  Normal  Scheduled  <unknown>  default-scheduler        Successfully assigned default/liveness-httpget-container to slave-1.shared
  Normal  Pulled     66s        kubelet, slave-1.shared  Container image "nginx:latest" already present on machine
  Normal  Created    66s        kubelet, slave-1.shared  Created container liveness-httpget-container
  Normal  Started    66s        kubelet, slave-1.shared  Started container liveness-httpget-container
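An HTTP GET probe counts as successful when the response status is at least 200 and below 400. The Python sketch below approximates that rule with the standard library; `http_get_probe` is a hypothetical helper, the URL is whatever the caller passes in, and details of the real probe such as custom headers are omitted.

```python
import urllib.error
import urllib.request

# Bypass any proxy configured in the environment; a probe targets the Pod directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_get_probe(url: str, timeout_seconds: float = 1.0) -> bool:
    """Return True if GET url answers with a status in the range 200-399."""
    try:
        with _opener.open(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTPError (4xx/5xx) is a URLError; timeouts and refused
        # connections surface as URLError or OSError.
        return False
```

Pointed at the nginx Pod from the manifest above, the equivalent call would look like `http_get_probe("http://10.244.1.7:80/index.html")`, using the Pod IP shown in the describe output. Deleting index.html inside the container would turn the result into False and, for a liveness probe, eventually trigger a restart.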
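When a liveness probe fails, the kubelet restarts the container. When a readiness probe fails, the Pod is marked as not ready and is removed from the endpoints of its Services, so it receives no traffic through them.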
[root@master-0 ~]# cat readiness-httpget.yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-httpget-container
  namespace: default
spec:
  containers:
  - name: readiness-httpget-container
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    ports:
    - name: http
      containerPort: 80
    readinessProbe:
      httpGet:
        port: http
        path: /index.html
      initialDelaySeconds: 1
      periodSeconds: 3
Lifecycle hooks
Typically used together with GitLab.
[root@master-0 ~]# kubectl explain pods.spec.containers.lifecycle
KIND: Pod
VERSION: v1
RESOURCE: lifecycle <Object>
DESCRIPTION:
Actions that the management system should take in response to container
lifecycle events. Cannot be updated.
Lifecycle describes actions that the management system should take in
response to container lifecycle events. For the PostStart and PreStop
lifecycle handlers, management of the container blocks until the action is
complete, unless the container process fails, in which case the handler is
aborted.
FIELDS:
postStart <Object> # runs right after the container starts
PostStart is called immediately after a container is created. If the
handler fails, the container is terminated and restarted according to its
restart policy. Other management of the container blocks until the hook
completes. More info:
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
preStop <Object> # runs right before the container is terminated
PreStop is called immediately before a container is terminated due to an
API request or management event such as liveness/startup probe failure,
preemption, resource contention, etc. The handler is not called if the
container crashes or exits. The reason for termination is passed to the
handler. The Pod's termination grace period countdown begins before the
PreStop hooked is executed. Regardless of the outcome of the handler, the
container will eventually terminate within the Pod's termination grace
period. Other management of the container blocks until the hook completes
or until the termination grace period is reached. More info:
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks
[root@master-0 ~]# cat poststart-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: poststart-pod
  namespace: default
spec:
  containers:
  - name: busybox-httpd
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    lifecycle:
      postStart:
        exec:
          command: ['mkdir','-p','/data/web/html']
    command: ['/bin/sh','-c','sleep 36000']
    # The form below is problematic: the postStart hook only runs after the
    # container has started, so /data/web/html may not exist yet when httpd
    # starts and it fails. The main command must not depend on the hook.
    #command: ["/bin/httpd"]
    #args: ["-f","-h","/data/web/html"]
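Because the postStart hook and the main command are not ordered relative to each other, a main process that needs something the hook creates has to tolerate its absence for a while. One possible workaround, sketched here in Python purely as an illustration, is to poll for the path before proceeding; `wait_for_path` is a hypothetical helper and this pattern is an assumption of this sketch, not something the manifest above does.

```python
import os
import time


def wait_for_path(path: str, timeout_seconds: float = 30.0,
                  poll_seconds: float = 0.2) -> bool:
    """Poll until `path` exists; return False if the timeout elapses first."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

An entrypoint could call `wait_for_path("/data/web/html")` and start serving only when it returns True. Creating the directory in an init container or in the image itself avoids the race entirely and is usually the simpler choice.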