1. The two kinds of probes
readiness probe
Checks whether a container is ready. The kubelet considers a pod ready only when the containers in the pod are ready.
The readiness probe controls which pods may serve as backends for a Service (svc): if a pod is not ready, it is removed from the Service's load balancer.
liveness probe
Checks whether a container is alive. If the application inside the container runs into problems, the liveness probe detects that the container is unhealthy and notifies the kubelet, and the kubelet restarts the container.
2. Three ways to use probes
The official documentation describes three mechanisms:
exec: run a command inside the container
httpGet: make an HTTP request
tcpSocket: open a TCP connection
I personally prefer the third one, tcpSocket.
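For completeness, here is a minimal sketch of what the other two mechanisms look like on a hypothetical container (/tmp/healthy and /healthz are placeholder names for illustration, not from this post):

```yaml
readinessProbe:
  exec:                 # healthy when the command exits with status 0
    command:
    - cat
    - /tmp/healthy      # placeholder file, just for illustration
livenessProbe:
  httpGet:              # healthy on an HTTP 2xx/3xx response
    path: /healthz      # placeholder path
    port: 80
```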
3. Learning and testing the tcpSocket mechanism
The tcpSocket mechanism
This one is easy to understand.
For example, start an nginx container whose service listens on port 80.
Configure a tcpSocket probe to connect to port 80 at a set interval. If the connection succeeds, the container is reported healthy (liveness) or ready (readiness); if it fails, the container is reported unhealthy or not ready, and a failing liveness probe causes the kubelet to restart the container.
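Conceptually, the kubelet's tcpSocket check is nothing more than "can I open a TCP connection to this port". A minimal Python sketch of that logic (the function name tcp_probe is my own, not a Kubernetes API):

```python
import socket

def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Mimic a tcpSocket probe: healthy iff a TCP connect succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True          # connection established: probe passes
    except OSError:
        return False             # refused or timed out: probe fails
```

This is exactly why probing a port nothing listens on, as in the counter-example below, fails every time with "connection refused".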
A counter-example (deliberately broken):
The idea: point the tcpSocket probe at port 8080, where nothing is listening. Every connection attempt necessarily fails, so the pod is restarted over and over.
[root@k8s-master1 k8s_tanzhen]# cat tcp_ness
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 45
      periodSeconds: 20
    livenessProbe:
      tcpSocket:
        port: 8080
      initialDelaySeconds: 45
      periodSeconds: 20
[root@k8s-master1 k8s_tanzhen]#
Start an nginx pod whose container serves on port 80.
The probes are configured to connect to port 8080; the first check runs 45 s after the container starts, and checks repeat every 20 s after that.
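Besides initialDelaySeconds and periodSeconds, a probe accepts a few more standard tuning fields; a sketch with the kubelet's default values shown in comments:

```yaml
readinessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 45   # wait 45 s after container start before the first check
  periodSeconds: 20         # then check every 20 s
  timeoutSeconds: 1         # each connection attempt times out after 1 s (default 1)
  failureThreshold: 3       # consecutive failures before unhealthy (default 3)
  successThreshold: 1       # consecutive successes to flip back to healthy (default 1)
```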
Test result: the pod's container restarts continuously.
[root@k8s-master1 k8s_tanzhen]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
httpd 0/1 CrashLoopBackOff 7 18m 172.30.35.3 k8s-master3
kubectl describe shows the failures:
Warning Unhealthy 6m (x19 over 16m) kubelet, k8s-master3 Liveness probe failed: dial tcp 172.30.35.3:8080: connect: connection refused
Warning Unhealthy 2m (x15 over 16m) kubelet, k8s-master3 Readiness probe failed: dial tcp 172.30.35.3:8080: connect: connection refused
The probe keeps attempting a TCP connection to the container IP on port 8080 and fails every time, so the container restarts endlessly.
A normal configuration example
The normal configuration connects to port 80, where the service actually listens.
The idea: in theory, a long-running application will eventually drift into a broken state from which it cannot recover without a restart. Kubernetes provides liveness probes to detect and remedy exactly that situation; this is the fundamental reason to configure probes, just in case.
[root@k8s-master1 k8s_tanzhen]# cat tcp_ness
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 45
      periodSeconds: 20
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 45
      periodSeconds: 20
[root@k8s-master1 k8s_tanzhen]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
httpd 1/1 Running 0 2m 172.30.35.3 k8s-master3
A simulated test of the normal configuration
The idea: start an nginx container, then run a command that stops the nginx process. The probes check TCP port 80; once nginx is gone, the connections fail, the probes report the container unhealthy and not ready, and the kubelet restarts the container.
[root@k8s-master1 k8s_tanzhen]# cat tcp_ness
apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  containers:
  - name: httpd
    image: nginx
    args:
    - /bin/sh
    - -c
    - sleep 60;nginx -s quit
    ports:
    - containerPort: 80
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
[root@k8s-master1 k8s_tanzhen]#
What the configuration does:
After the container starts, nginx -s quit is run to stop the nginx process. (Strictly speaking, because args overrides the image's default command, the shell script becomes the container's main process and nginx never actually listens on port 80; either way, the probes cannot connect.)
Readiness and liveness checks begin 20 s after the container starts.
Around 35 s after the container starts, the probes detect that the nginx process is dead and port 80 is unreachable; the warnings look like this:
Warning Unhealthy 8s (x3 over 28s) kubelet, k8s-master3 Liveness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
Warning Unhealthy 7s (x3 over 27s) kubelet, k8s-master3 Readiness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
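A quick sanity check of the timeline: with initialDelaySeconds: 20, periodSeconds: 10, and the kubelet's default failureThreshold of 3, the earliest moment the liveness probe can condemn the container works out roughly as follows (a back-of-the-envelope sketch that ignores probe timeouts):

```python
initial_delay = 20       # initialDelaySeconds: first check at t = 20 s
period = 10              # periodSeconds: one check every 10 s
failure_threshold = 3    # kubelet default: 3 consecutive failures trigger a restart

# failing checks land at t = 20, 30, 40 s; the third one condemns the container
restart_at = initial_delay + (failure_threshold - 1) * period
print(restart_at)  # 40
```

That rough figure matches the event log below, where the container is killed shortly after the third consecutive failure.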
The full event log of the restarts:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned default/httpd to k8s-master3
Normal Pulled 1m (x2 over 2m) kubelet, k8s-master3 Successfully pulled image "nginx"
Normal Created 1m (x2 over 2m) kubelet, k8s-master3 Created container
Normal Started 1m (x2 over 2m) kubelet, k8s-master3 Started container
Warning Unhealthy 16s (x6 over 1m) kubelet, k8s-master3 Liveness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
Warning Unhealthy 15s (x8 over 1m) kubelet, k8s-master3 Readiness probe failed: dial tcp 172.30.35.3:80: connect: connection refused
Normal Pulling 5s (x3 over 2m) kubelet, k8s-master3 pulling image "nginx"
Normal Killing 5s (x2 over 1m) kubelet, k8s-master3 Killing container with id docker://httpd:Container failed liveness probe. Container will be killed and recreated.
As you can see, after the nginx process dies, the pod restarts automatically.
[root@k8s-master1 k8s_tanzhen]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
httpd 0/1 Running 4 5m 172.30.35.3 k8s-master3
This accomplishes the goal of the test.