Understanding Kubernetes Probes

weixin_26752759

Original article: https://medium.com/dev-genius/understanding-kubernetes-probes-5daaff67599a

Configuring readiness, liveness, and startup probes to detect and deal with unhealthy pods.

One of the challenges with distributed systems and microservices architecture is automatically detecting unhealthy applications, rerouting requests to other available systems, and restoring the broken components. Health checks are one way to address this challenge and ensure reliability. With Kubernetes, health checks are configured via probes to determine the state of each pod.

By default, Kubernetes simply observes the pod's lifecycle and starts routing traffic to the pod once its containers move from the Pending to the Running state. The kubelet also watches for application crashes and restarts the container to recover. Many developers assume that this basic setup is adequate, especially when the application inside the pod is managed by a daemon process manager (e.g. PM2 for Node.js). However, since Kubernetes deems a pod healthy and ready for requests as soon as all of its containers start, the application may receive traffic before it is actually ready. This can happen if the application needs to initialize some state, open database connections, or load data before handling requests. The gap between when the application is actually ready and when Kubernetes thinks it is ready becomes a problem as the deployment scales: unready applications receive traffic and send back 500 errors.

This is where Kubernetes probes come in to define when a container is ready to accept traffic and when a container should be restarted. As of Kubernetes 1.16, there are now three types of probes supported. In this post, we’ll review the different types of probes, best practices, and tools to detect deployments with potential configuration issues.

Kubernetes Probes

Kubernetes 1.15 and earlier support only readiness and liveness probes. Startup probes were added in 1.16 as an alpha feature and graduated to beta in 1.18 (WARNING: 1.16 also stopped serving several deprecated Kubernetes API versions. Use the migration guide to check for compatibility).

All probes have the following parameters:

initialDelaySeconds: number of seconds to wait after the container starts before initiating the probe
periodSeconds: how often to run the probe
timeoutSeconds: number of seconds after which the probe times out (failing the health check)
successThreshold: minimum number of consecutive successful checks for the probe to pass after having failed
failureThreshold: number of consecutive failures before the probe is marked as failed. For liveness probes, this leads to the container restarting. For readiness probes, this marks the pod as unready.
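
To make the parameters above concrete, here is a minimal sketch of a liveness probe that sets all five explicitly. The endpoint, port, and values are illustrative assumptions, not recommendations from the original article. If a field is omitted, Kubernetes falls back to its defaults (0, 10, 1, 1, and 3 respectively).

```yaml
# Fragment of a container spec; /healthz and port 8080 are hypothetical.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10  # wait 10s after the container starts before the first probe
  periodSeconds: 5         # probe every 5s
  timeoutSeconds: 2        # no response within 2s counts as a failed probe
  successThreshold: 1      # must be 1 for liveness and startup probes
  failureThreshold: 3      # restart the container after 3 consecutive failures
```

With these values, a container that stops responding is restarted roughly 15 seconds (3 x 5s) after the first failed check.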

Readiness Probes

Readiness probes are used to let kubelet know when the application is ready to accept new traffic. If the application needs some time to initialize state after the process has started, configure the readiness probe to tell Kubernetes to wait before sending new traffic. A primary use case for readiness probes is directing traffic to deployments behind a service.

Figure: Kubernetes readiness probe (source: GCP Blog)

One important thing to note about readiness probes is that they run during the pod's entire lifecycle. This means that readiness probes run not only at startup but repeatedly for as long as the pod is running. This handles situations where the application is temporarily unavailable (e.g. loading large data, waiting on external connections). In this case, we don't necessarily want to kill the application but rather wait for it to recover. Readiness probes detect this scenario, and no traffic is sent to these pods until they pass the readiness check again.
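
As a rough sketch of the "deployment behind a service" use case, the manifest below pairs a Deployment that defines a readiness probe with a Service that selects its pods. All names, the image, and the /healthz endpoint are hypothetical. Pods that fail the readiness check are removed from the Service's endpoints until they pass again, without being restarted.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example.com/web:1.0   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz           # assumed health endpoint
              port: 8080
            periodSeconds: 5           # keeps running for the pod's whole lifetime
            failureThreshold: 3        # marked unready after 3 consecutive failures
---
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web                     # only ready pods with this label receive traffic
  ports:
    - port: 80
      targetPort: 8080
```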

Liveness Probes

Liveness probes, on the other hand, are used to restart unhealthy containers. The kubelet periodically runs the liveness probe, determines the health of the container, and kills and restarts the container if it fails the liveness check. Liveness checks can help the application recover from a deadlock. Without liveness checks, Kubernetes deems a deadlocked pod healthy, since from its perspective the underlying process continues to run. With a liveness probe configured, the kubelet can detect that the application is in a bad state and restart the container to restore availability.

Figure: Kubernetes liveness probe (source: GCP Blog)

Startup Probes

Startup probes are similar to readiness probes but are only executed at startup. They are optimized for slow-starting containers or applications with unpredictable initialization processes. With readiness probes, we can configure initialDelaySeconds to determine how long to wait before probing for readiness. Now consider an application that occasionally needs to download large amounts of data or perform an expensive operation at the start of the process. Since initialDelaySeconds is a static number, we are forced to always assume the worst-case scenario (or extend failureThreshold, which may affect long-running behavior) and wait for a long time even when the application does not need to carry out long-running initialization steps. With startup probes, we can instead configure failureThreshold and periodSeconds to model this uncertainty better. For example, setting failureThreshold to 15 and periodSeconds to 5 gives the application up to 15 x 5 = 75s to start up before the probe fails and the container is restarted.
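
The 75-second startup budget from the example above could be written as the following fragment. The HTTP endpoint and port are assumptions for illustration. The two threshold values come from the text.

```yaml
# Fragment of a container spec; /healthz and port 8080 are hypothetical.
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  failureThreshold: 15   # up to 15 failed attempts are tolerated...
  periodSeconds: 5       # ...one every 5s, so 15 x 5 = 75s to finish starting
```

Liveness and readiness probes are held back until the startup probe succeeds, so a fast start is not penalized: the probe passes on the first successful check rather than waiting out the full 75 seconds.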

Configuring Probe Actions

Now that we understand the different types of probes, we can examine the three different ways to configure each probe.

HTTP

The kubelet sends an HTTP GET request to an endpoint and checks for a 2xx or 3xx response. You can reuse an existing HTTP endpoint or set up a lightweight HTTP server for probing purposes (e.g. an Express server with /healthz endpoint).

HTTP probes take in additional parameters:

host: hostname to connect to (default: pod's IP)
scheme: HTTP (default) or HTTPS
path: path on the HTTP/S server
httpHeaders: custom headers, if you need header values for authentication, CORS settings, etc.
port: name or number of the port to access the server

livenessProbe:
   httpGet:
     path: /healthz
     port: 8080
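
For comparison, here is a sketch of an HTTP probe that uses the optional parameters as well. The header name and value, and the named port, are made up for illustration. The named port assumes the container declares a port called "health" in its ports list.

```yaml
# Fragment of a container spec; header and port name are hypothetical.
readinessProbe:
  httpGet:
    scheme: HTTPS              # the kubelet does not verify the certificate for probes
    path: /healthz
    port: health               # refers to a containerPort named "health"
    httpHeaders:
      - name: X-Probe-Token    # illustrative custom header
        value: example-token
  periodSeconds: 10
```

The host field is left out on purpose so that the probe targets the pod's own IP, which is the default.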

TCP

If you just need to check whether a TCP connection can be made, you can specify a TCP probe. The pod is marked healthy if the kubelet can establish a TCP connection to the specified port. A TCP probe may be useful for a gRPC or FTP server where HTTP calls are not suitable.

readinessProbe:
   tcpSocket:
     port: 21

Command

Finally, a probe can be configured to run a command inside the container. The check passes if the command returns with exit code 0; otherwise, the pod is marked as unhealthy. This type of probe may be useful if it is not desirable to expose an HTTP server/port or if it is easier to check initialization steps via a command (e.g. check if a configuration file has been created, run a CLI command).

readinessProbe:
   exec:
     command: ["/bin/sh", "-ec", "vault status -tls-skip-verify"]
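
The "check if a configuration file has been created" case mentioned above might look like the sketch below. The file path is hypothetical, and the example assumes the container image ships the `test` utility.

```yaml
# Fragment of a container spec; the config path is illustrative.
readinessProbe:
  exec:
    # Exits 0 only if the file exists, so the pod stays unready until then.
    command: ["test", "-f", "/etc/app/config.yaml"]
  initialDelaySeconds: 5
  periodSeconds: 10
```

Because the command is executed directly rather than through a shell, pipes and redirects only work if the command is wrapped in something like `/bin/sh -c`, as in the Vault example.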

Best Practices

The exact parameters for the probes depend on your application, but here are some general best practices to get started:

For older (≤ 1.15) Kubernetes clusters, use a readiness probe with an initial delay to cover the container startup phase (use the p99 startup time for this). Keep the check lightweight, since the readiness probe executes throughout the entire lifecycle of the pod. We don't want the probe to time out because the readiness check takes a long time to compute.
For newer (≥ 1.16) Kubernetes clusters, use a startup probe for applications with unpredictable or variable startup times. The startup probe may share the same endpoint (e.g. /healthz) as the readiness and liveness probes. Set its failureThreshold higher than the other probes' to allow for longer start times, while keeping a more reasonable time to failure for the liveness and readiness checks.
Readiness and liveness probes may share the same endpoint if the readiness probe isn't used for other signaling purposes. If there's only one pod (e.g. when using a Vertical Pod Autoscaler), set the readiness probe to address the startup behavior and use the liveness probe to determine health. In this case, marking the pod unhealthy means downtime.
Readiness checks can be used in various ways to signal system degradation. For example, if the application loses its connection to the database, the readiness probe may be used to temporarily block new requests and allow the system to reconnect. Readiness probes can also be used to shift work to other pods by marking busy pods as not ready.
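
The practices above can be sketched as a single container spec in which all three probes share one endpoint but use different thresholds. Every name and number here is an illustrative assumption that should be tuned against observed startup times.

```yaml
# Fragment of a pod spec; image, port name, and endpoint are hypothetical.
containers:
  - name: app
    image: example.com/app:1.0
    ports:
      - name: http
        containerPort: 8080
    startupProbe:              # generous budget for slow or variable starts
      httpGet:
        path: /healthz
        port: http
      periodSeconds: 5
      failureThreshold: 30     # up to 30 x 5 = 150s to start
    readinessProbe:            # quick to pull a struggling pod out of rotation
      httpGet:
        path: /healthz
        port: http
      periodSeconds: 5
      failureThreshold: 2      # unready after about 10s of failures
    livenessProbe:             # slower to act, since a restart is disruptive
      httpGet:
        path: /healthz
        port: http
      periodSeconds: 10
      failureThreshold: 3      # restart after about 30s of failures
```

Giving the liveness probe a longer time to failure than the readiness probe means a briefly degraded pod is first taken out of rotation and only restarted if it does not recover.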

In short, well-defined probes generally lead to better resilience and availability. Be sure to observe the startup times and system behavior to tweak the probe settings as the applications change.

Tools

Finally, given the importance of Kubernetes probes, you can use a Kubernetes resource analysis tool to detect missing probes. These tools can be run against existing clusters or be baked into the CI/CD process to automatically reject workloads without properly configured resources.

polaris: a resource analysis tool with a nice dashboard that can also be used as a validating webhook or CLI tool.
kube-score: a static code analysis tool that works with Helm, Kustomize, and standard YAML files.
popeye: a read-only utility that scans Kubernetes clusters and reports potential issues with configurations.
