[CKA Exam Notes] 10. Health Checks (Probes)

戴陵FL

Last modified: 2022-07-10 18:09:00

Category: Container Technology. Tags: docker, operations, kubernetes, containers, cloud native

First published: 2022-07-10 18:04:55

Article link: https://blog.csdn.net/weixin_41755556/article/details/125681031

Contents

Lab environment
I. Probe overview
II. Liveness probe
III. Readiness probe
- Lab: the command probe method

Lab environment

A cluster that has already been initialized:
(vms21) 192.168.26.21 - master1
(vms22) 192.168.26.22 - worker1
(vms23) 192.168.26.23 - worker2

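I. Probe overview

In a k8s cluster, a Deployment acts as a controller: if a pod on a worker node fails and is no longer Running, the Deployment recreates it for us.
Now suppose the pod still reports Running, but the application inside it has stopped working and can no longer serve requests. The Deployment will not replace this pod, so how do we handle that case? We can probe each pod to check whether it is actually able to serve. Depending on what happens when a problem is found, probes fall into two types:
1. liveness probe: if a problem is found, restart (restart here means recreating the container inside the pod)
2. readiness probe: if a problem is found, do not restart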

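II. Liveness probe

Probe methods available to a liveness probe:
1. command (the exec field in YAML): runs a command inside the container. The command's output is ignored; only whether it succeeded matters. Exit status 0 counts as a successful probe, anything else as a failure. After failureThreshold consecutive failures the container is restarted
2. httpGet: sends an HTTP GET request to the service. A response code from 200 to 399 counts as success; any other code, or a timeout, counts as a failure
3. tcpSocket: attempts a TCP three-way handshake with the service. If the connection can be established the probe succeeds; otherwise it fails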
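
The success rules for the command and httpGet methods can be sketched in a few lines of Python. This is only an illustration of the rules, not how the kubelet is implemented; the function names exec_probe_ok and http_probe_ok are made up for this sketch.

```python
import subprocess
import sys


def exec_probe_ok(command):
    """command-style check: success means the command exited with status 0."""
    result = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def http_probe_ok(status_code):
    """httpGet-style check: response codes 200 to 399 count as success."""
    return 200 <= status_code < 400


if __name__ == "__main__":
    # Stand-ins for a probe command that succeeds and one that fails.
    ok_cmd = [sys.executable, "-c", "raise SystemExit(0)"]
    bad_cmd = [sys.executable, "-c", "raise SystemExit(1)"]
    print(exec_probe_ok(ok_cmd), exec_probe_ok(bad_cmd))
    print(http_probe_ok(200), http_probe_ok(404))
```

Note that the output of the command plays no role in either check; only the exit status or the response code is inspected.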

Lab 1: the command probe method

(1) Pull the busybox image

nerdctl pull busybox

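(2) Create a pod with a grace period of 0, using the busybox image
The container's main process is overridden to create /tmp/healthy, sleep 30s, delete /tmp/healthy, and then sleep 100000s
Probes are defined under spec.containers. Here we define a livenessProbe that uses the command method, with a command that reads /tmp/healthy
If /tmp/healthy exists, the probe command succeeds; otherwise it fails
initialDelaySeconds: how many seconds to wait after the container starts before probing begins
periodSeconds: how many seconds between probes
Some other parameters:
timeoutSeconds: how long to wait for a probe to finish, default 1 second, minimum 1 second; a probe that takes longer counts as a failure. It matters most for the httpGet and tcpSocket network probes, but since Kubernetes 1.20 it is also enforced for command probes
failureThreshold: once the pod has started, the number of consecutive probe failures after which Kubernetes gives up. For a liveness probe, giving up means restarting the container; for a readiness probe, the pod is marked as not ready. Default 3, minimum 1
successThreshold: after a failure, the minimum number of consecutive successes needed for the probe to count as successful again. Default 1, minimum 1; must be 1 for liveness probes
The YAML file is as follows: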

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: pod1
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: liveness
    image: busybox
    imagePullPolicy: IfNotPresent
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 100000
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5 # wait 5s after the container starts before the first probe
      periodSeconds: 5 # probe every 5s
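
To make failureThreshold and successThreshold more concrete, here is a small Python sketch of the counting logic. It is a simplified model under stated assumptions: the class name ProbeState is invented for illustration, and the state starts out healthy, which is not exactly how a real readiness probe begins.

```python
from dataclasses import dataclass


@dataclass
class ProbeState:
    failure_threshold: int = 3  # Kubernetes default
    success_threshold: int = 1  # Kubernetes default; must be 1 for liveness
    healthy: bool = True        # simplification: start in the healthy state
    _failures: int = 0          # consecutive failures seen so far
    _successes: int = 0         # consecutive successes seen so far

    def record(self, probe_ok: bool) -> bool:
        """Feed in one probe result and return the resulting health state."""
        if probe_ok:
            self._failures = 0
            self._successes += 1
            if not self.healthy and self._successes >= self.success_threshold:
                self.healthy = True
        else:
            self._successes = 0
            self._failures += 1
            if self.healthy and self._failures >= self.failure_threshold:
                self.healthy = False
        return self.healthy


if __name__ == "__main__":
    state = ProbeState()
    for result in [False, False, True, False, False, False]:
        print(result, "->", state.record(result))
```

A single success in the middle resets the failure counter, so only three failures in a row flip the state. For a liveness probe, that flip is the point at which the container would be restarted.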

(3) Create the pod

kubectl apply -f xxx.yaml

(4) List the files in /tmp inside the pod's container; the healthy file is there

kubectl exec -it pod1 -- ls /tmp

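After 30s /tmp/healthy is deleted, so it no longer shows up under /tmp and the probe command starts failing
The probe runs every 5s and now fails each time. The default failureThreshold is 3, so it takes 3 consecutive failures (spanning roughly 10 to 15s) before the kubelet gives up. In this run that happened about 47s after the pod was created, and because the grace period is 0 the container was restarted immediately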
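
The timing can be checked with a short Python sketch. It assumes an idealized schedule in which probes run exactly at initialDelaySeconds and then every periodSeconds; the real kubelet adds some jitter and the container needs time to start, so observed values will be a little later. The function name restart_time is invented for this example.

```python
def restart_time(initial_delay, period, failure_threshold, unhealthy_from):
    """Return the time (s) of the probe that hits failure_threshold in a row.

    Idealized model (an assumption, not the kubelet's exact scheduling):
    probes run at initial_delay, initial_delay + period, ... and a probe
    fails when it runs at or after unhealthy_from.
    """
    t = initial_delay
    consecutive_failures = 0
    while True:
        if t >= unhealthy_from:
            consecutive_failures += 1
            if consecutive_failures == failure_threshold:
                return t
        else:
            consecutive_failures = 0
        t += period


if __name__ == "__main__":
    # Lab values: initialDelaySeconds=5, periodSeconds=5, failureThreshold=3.
    # The file is removed at about 30s; try both sides of the 30s probe.
    print(restart_time(5, 5, 3, unhealthy_from=30))
    print(restart_time(5, 5, 3, unhealthy_from=30.5))
```

In this model the restart lands at 40s if the probe at the 30s mark already fails, and at 45s if it just slips through, which is in line with the roughly 47s observed once startup overhead is added.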

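Lab 2: the httpGet probe method

(1) Delete pod1 from Lab 1 and create a new pod using the nginx image
The livenessProbe uses the httpGet method: path is the URL path to request, the port is 80, and the scheme is HTTP
failureThreshold is 3, and initialDelaySeconds makes the probe wait 10s after the container starts
periodSeconds sets a 10s interval between probes, timeoutSeconds sets a 10s timeout, and with successThreshold set to 1 a single successful probe counts as success
The YAML file is as follows: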

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: pod2
spec:
  containers:
  - name: liveness
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /index.html # served from /usr/share/nginx/html/index.html inside the container, not from the host
        port: 80
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10

(2) Create the pod

kubectl apply -f xxx.yaml

(3) List /usr/share/nginx/html inside the pod's container; index.html is there

kubectl exec -it pod2 -- ls /usr/share/nginx/html

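About 10s after the pod is created, we delete index.html so that the probe can no longer fetch the page
Note that path: /index.html in the httpGet probe refers to /usr/share/nginx/html/index.html inside the container, not to a file on the host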

kubectl exec -it pod2 -- rm -rf /usr/share/nginx/html/index.html

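The probe waits 10s after the container starts and then runs every 10s, so the first probe after the deletion (at roughly the 20s mark) can no longer fetch index.html
nginx answers a request for a missing file with a 404 right away, so each probe fails immediately and the 10s timeoutSeconds never comes into play. With failureThreshold at 3, the third consecutive failure arrives at roughly 40s, and the container is then restarted. The restarted container starts from the image again, so index.html is back and the probe succeeds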
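
As a rough check on these numbers, the bounds on when the restart happens can be worked out from periodSeconds and failureThreshold alone. The Python sketch below assumes every probe fails instantly (an error response, not a hang) and that probes are spaced exactly periodSeconds apart; detection_window is a name made up for this example.

```python
def detection_window(unhealthy_from, period, failure_threshold):
    """Bounds (s) on the time of the failure_threshold-th failed probe.

    Assumes instant failures and exact spacing: the first failing probe
    lands somewhere between unhealthy_from and unhealthy_from + period,
    and the remaining ones follow at `period` intervals.
    """
    earliest = unhealthy_from + (failure_threshold - 1) * period
    latest = unhealthy_from + failure_threshold * period
    return earliest, latest


if __name__ == "__main__":
    # Lab 2: index.html removed at about 10s, periodSeconds=10, failureThreshold=3.
    print(detection_window(10, 10, 3))
```

For this lab the window is 30s to 40s after pod creation. A probe that hangs until timeoutSeconds expires, instead of getting an immediate error response, would push these times later.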

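Lab 3: the tcpSocket probe method

(1) Delete pod2 from Lab 2 and create a new pod using the nginx image, with a grace period of 0
The livenessProbe uses the tcpSocket method
The port under tcpSocket is set to 80: if a three-way handshake with the container's port 80 succeeds, the probe succeeds
The YAML file is as follows: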

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: pod3
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: liveness
    image: nginx
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      tcpSocket:
        port: 80
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10

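(2) Create the pod
(3) What happens if we change the tcpSocket port in the YAML file to something other than 80?
The container runs the nginx image, which listens on port 80 by default, so the handshake on port 80 succeeds. If the probe targets a port nothing is listening on, the handshake cannot be established, the probe fails, and the container is restarted over and over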

...
    livenessProbe:
      failureThreshold: 3
      tcpSocket:
        port: 808
      initialDelaySeconds: 10
...

Recreate the pod and you will see it restart again and again
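
The same effect can be reproduced locally without a cluster. The Python sketch below mimics a tcpSocket check with a plain TCP connect: it succeeds while something is listening on the port and fails once nothing is. The helper name tcp_probe_ok and the use of a throwaway listener on 127.0.0.1 are assumptions made for this illustration.

```python
import socket


def tcp_probe_ok(host, port, timeout=1.0):
    """tcpSocket-style check: success means the TCP connection was established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Start a throwaway listener on a free local port (stand-in for nginx on 80).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    print(tcp_probe_ok("127.0.0.1", port))  # something is listening
    listener.close()
    print(tcp_probe_ok("127.0.0.1", port))  # nothing listens on this port now
```

Probing port 808 on the nginx container is the second case: nothing listens there, every probe fails, and the liveness probe keeps restarting the container.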

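III. Readiness probe

The readiness probe supports the same three probe methods as the liveness probe:
1. command
2. httpGet
3. tcpSocket

The difference from the liveness probe: when a readiness probe fails, the container is not restarted. Instead, the Service stops forwarding user requests to the pod (the route from the svc to that pod is cut). The pod itself is still up, so accessing it directly (not through the svc) still works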
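
One way to picture this is as a filter on the Service's backend list. The Python sketch below is a toy model only: the pod names are hypothetical, and a real Service tracks endpoints through the API server, not through a list like this.

```python
def service_backends(pods):
    """Return the pods a Service would send traffic to: only the Ready ones."""
    return [pod["name"] for pod in pods if pod["ready"]]


if __name__ == "__main__":
    # Hypothetical pods; the third one has a failing readiness probe.
    pods = [
        {"name": "web1-a", "ready": True},
        {"name": "web1-b", "ready": True},
        {"name": "web1-c", "ready": False},
    ]
    print(service_backends(pods))          # what the svc routes to
    print([pod["name"] for pod in pods])   # pods that still exist and run
```

The not-ready pod drops out of the first list but stays in the second, which is why it can still be reached directly by its pod IP.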

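Lab: the command probe method

(1) Create a Deployment with 3 replicas
The pod template uses the nginx image with a grace period of 0
To create /tmp/healthy in the container we can use either a lifecycle hook or an init process (a postStart hook is used here). That way the nginx main process is left untouched while a separate process creates /tmp/healthy
readinessProbe defines the readiness probe. It uses the command method with a command that reads /tmp/healthy; if the command succeeds, the probe succeeds
The YAML file is as follows: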

apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: web1
  name: web1
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web1
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: web1
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - image: nginx
        name: c1
        imagePullPolicy: IfNotPresent
        lifecycle:
          postStart:
            exec:
              command: ["/bin/sh","-c","touch /tmp/healthy"]
        readinessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 2 # wait 2s after the container starts before the first probe
          periodSeconds: 5 # probe every 5s
        resources: {}
status: {}

(2) Create the Deployment; 3 pod replicas are created

kubectl apply -f xxx.yaml

(3) To make the demo easier to follow, write "111", "222", and "333" into index.html in the 3 pods respectively

kubectl exec -it web1-656db66885-njhjq -- sh -c "echo 111 > /usr/share/nginx/html/index.html"
kubectl exec -it web1-656db66885-nzjq6 -- sh -c "echo 222 > /usr/share/nginx/html/index.html"
kubectl exec -it web1-656db66885-vg7r4 -- sh -c "echo 333 > /usr/share/nginx/html/index.html"

(4) Create a Service (svc) to load-balance across the pods

kubectl expose --name=svc1 deploy web1 --port=80

(5) Look up the svc's address

kubectl get svc
# Output:
NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
svc1         ClusterIP   10.97.6.220   <none>        80/TCP    20s

(6) Request the svc in a loop once per second. Every request returns the content of one pod's index.html, and the probes all succeed

while true ; do curl 10.97.6.220 ; sleep 1 ; done

Press Ctrl+C to exit
(7) Now simulate a failure by deleting /tmp/healthy in pod web1-656db66885-vg7r4 (the one whose index.html contains 333)

kubectl exec -it web1-656db66885-vg7r4 -- rm -rf /tmp/healthy

The pod's READY column now shows it as not ready
As expected, the probe is now failing
Requests to the svc are no longer forwarded to this pod (the route from the svc to the pod is cut)

while true ; do curl 10.97.6.220 ; sleep 1 ; done

However, the readiness probe does not restart the pod, so accessing the pod directly still works

curl 10.244.70.120

The httpGet and tcpSocket methods behave the same way
