Kubernetes Certificate Monitoring with x509-certificate-exporter

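Some readers may have heard of ssl-exporter, a project that offers several ways to check SSL certificates: HTTPS certificates, certificate files, Kubernetes Secrets, and kubeconfig files. Functionally it covers most certificate-monitoring needs, but its metrics are not very rich. This article introduces a more capable Prometheus exporter: x509-certificate-exporter.

Unlike ssl-exporter, x509-certificate-exporter focuses solely on the certificates that belong to a Kubernetes cluster, including the file certificates of each component, Kubernetes TLS Secrets, and kubeconfig files, and it exposes richer metrics. Let's see how to deploy x509-certificate-exporter in KubeSphere to monitor every certificate in the cluster. It helps users catch expiring certificates in time, which keeps the system secure and stable.

In a Kubernetes environment, x509-certificate-exporter integrates seamlessly and monitors the state of all certificates in the cluster.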

Monitoring approach

Use enix's x509-certificate-exporter to monitor the certificates under /etc/kubernetes/pki and /var/lib/kubelet on every node of the cluster, together with the kubeconfig files.

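  1. Advantage: it can monitor all nodes, all kubeconfig files, and all TLS-type Secrets. Certificates outside the Kubernetes cluster can be monitored the same way, so coverage is broad and complete.
  2. Extra installation required: x509-certificate-exporter runs as 1 Deployment plus several DaemonSets, which consumes a fair amount of cluster resources.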

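The exporter obtains certificate information by watching certificate files and kubeconfig files in specified directories or paths on every node of the cluster.

For a Kubernetes cluster built with kubeadm, the following certificate files and kubeconfig files can be monitored: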

watchFiles:
- /var/lib/kubelet/pki/kubelet-client-current.pem
- /etc/kubernetes/pki/apiserver.crt
- /etc/kubernetes/pki/apiserver-etcd-client.crt
- /etc/kubernetes/pki/apiserver-kubelet-client.crt
- /etc/kubernetes/pki/ca.crt
- /etc/kubernetes/pki/front-proxy-ca.crt
- /etc/kubernetes/pki/front-proxy-client.crt
- /etc/kubernetes/pki/etcd/ca.crt
- /etc/kubernetes/pki/etcd/healthcheck-client.crt
- /etc/kubernetes/pki/etcd/peer.crt
- /etc/kubernetes/pki/etcd/server.crt
watchKubeconfFiles:
- /etc/kubernetes/admin.conf
- /etc/kubernetes/controller-manager.conf
- /etc/kubernetes/scheduler.conf
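
Before deploying the exporter, the same certificate files can be spot-checked by hand on a node. The sketch below is only an illustration: check_cert_dates is a hypothetical helper name, and it assumes the openssl CLI is installed on the node. It prints the notAfter date of each certificate file passed to it and flags files that are missing or unreadable.

```shell
#!/bin/sh
# check_cert_dates: hypothetical helper, not part of x509-certificate-exporter.
# Prints the expiry (notAfter) of each certificate file given as an argument.
check_cert_dates() {
  for f in "$@"; do
    if [ -r "$f" ]; then
      printf '%s: ' "$f"
      # openssl prints a line of the form "notAfter=<date>"
      openssl x509 -noout -enddate -in "$f"
    else
      printf '%s: missing or unreadable\n' "$f"
    fi
  done
}

# Example (run on a control-plane node):
# check_cert_dates /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/ca.crt
```

This helper only handles plain certificate files. Kubeconfig files generated by kubeadm, such as admin.conf, typically embed the client certificate as base64 data, so they would need to be decoded first.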

Download and unpack the chart

https://github.com/enix/x509-certificate-exporter/tree/main

Edit values.yaml and Chart.yaml to match your environment; the other files can be left unchanged.

Update the version information in Chart.yaml:

[root@k8s-uat-m01 x509-certificate-exporter]# cat Chart.yaml 
version: '3.19.1'
appVersion: '3.19.1'
........................................

Image location (a mirror inside China):

https://docker.aityp.com/image/docker.io/enix/x509-certificate-exporter:3.19.1

Pull commands for Docker

docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/enix/x509-certificate-exporter:3.19.1
docker tag  swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/enix/x509-certificate-exporter:3.19.1  docker.io/enix/x509-certificate-exporter:3.19.1

Pull commands for containerd

ctr images pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/enix/x509-certificate-exporter:3.19.1
ctr images tag  swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/enix/x509-certificate-exporter:3.19.1  docker.io/enix/x509-certificate-exporter:3.19.1

Edit the image settings in values.yaml

# Set the image name; the tag defaults to the appVersion in the chart
# Change the registry and repository


image:
  # -- x509-certificate-exporter image registry
  registry: swr.cn-north-4.myhuaweicloud.com
  # -- x509-certificate-exporter image repository
  repository: ddn-k8s/docker.io/enix/x509-certificate-exporter
--------------------------------------------------------------------------

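# Node selector labels: adjust to your environment; usually no change is needed
# Tolerations: adjust to your environment; usually no change is needed
# Certificate directories: adjust to your environment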


  # -- Additional environment variables for container
  env: []
  # - name: GOMAXPROCS
  #   value: "1"

  # -- [SEE README] Map to define one or many DaemonSets running hostPath exporters. Key is used as a name ; value is a map to override all default settings set by `hostPathsExporter.*`.
  daemonSets: 
    master:
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
      - effect: NoSchedule
        key:  node-role.kubernetes.io/control-plane
        operator: Exists
      watchFiles:
      - /etc/kubernetes/pki/apiserver.crt
      - /etc/kubernetes/pki/apiserver-etcd-client.crt
      - /etc/kubernetes/pki/apiserver-kubelet-client.crt
      - /etc/kubernetes/pki/ca.crt
      - /etc/kubernetes/pki/front-proxy-ca.crt
      - /etc/kubernetes/pki/front-proxy-client.crt
      # Kubeconfig file locations: adjust to your environment; this block is optional
      watchKubeconfFiles:
      - /etc/kubernetes/admin.conf
      - /etc/kubernetes/controller-manager.conf
      - /etc/kubernetes/kubelet.conf
      - /etc/kubernetes/scheduler.conf
    nodes:
      watchFiles:
      - /var/lib/kubelet/pki/kubelet-client-current.pem
      - /etc/kubernetes/pki/ca.crt

Change the Service type to NodePort so that a Prometheus instance deployed outside Kubernetes can scrape the exposed port:

[root@k8s-pre-m01 x509-certificate-exporter]# cat templates/service.yaml 
{{- if .Values.service.create }}
apiVersion: v1
kind: Service
metadata:
  name: {{ include "x509-certificate-exporter.fullname" . }}
  namespace: {{ include "x509-certificate-exporter.namespace" . }}
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9793"
    {{- with .Values.service.annotations }}
    {{- . | toYaml | trim | nindent 4 }}
    {{- end }}
  labels:
    {{- include "x509-certificate-exporter.labels" . | nindent 4 }}
    {{- with .Values.service.extraLabels }}
    {{- . | toYaml | trim | nindent 4 }}
    {{- end }}
spec:
  type: NodePort
  {{- if .Values.service.headless }}
  clusterIP: None
  {{- end }}
  ports:
  - name: metrics
    port: {{ .Values.service.port }}
    targetPort: metrics
    nodePort: 30090
  selector:
    {{- include "x509-certificate-exporter.selectorLabels" . | nindent 4 }}
{{- end }}

[root@k8s-uat-m01 x509-certificate-exporter]# helm install x509-certificate-exporter   --values values.yaml  .

[root@k8s-pre-m01 x509-certificate-exporter]# helm list
NAME                            NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                                   APP VERSION
x509-certificate-exporter       default         1               2025-10-13 11:25:47.81887836 +0800 CST  deployed        x509-certificate-exporter-3.19.1        3.19.1 


[root@k8s-m01 x509-certificate-exporter]# kubectl get pod
NAME                                         READY   STATUS             RESTARTS           AGE
x509-certificate-exporter-56567b56b9-xdmrf   1/1     Running            0                  7h50m
x509-certificate-exporter-master-5tspp       1/1     Running            0                  7h50m
x509-certificate-exporter-master-6ffhr       1/1     Running            0                  7h50m
x509-certificate-exporter-master-6q94j       1/1     Running            0                  7h50m

[root@k8s-m01 x509-certificate-exporter]# kubectl get ds
NAME                               DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                            AGE
x509-certificate-exporter-master   3         3         3       3            3           node-role.kubernetes.io/control-plane=   7h52m
[root@k8s-uat-m01 x509-certificate-exporter]# kubectl get deploy
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
x509-certificate-exporter   1/1     1            1           7h52m


[root@k8s-m01 x509-certificate-exporter]# kubectl get svc
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                        AGE
x509-certificate-exporter   ClusterIP   10.202.156.255   <none>        9793/TCP                                       8h
[root@k8s-m01 x509-certificate-exporter]# curl 10.209.156.2xx:9793/metrics
...............................................................................

[root@k8s-pre-m01 templates]# kubectl get svc
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
kubernetes                  ClusterIP   10.196.xx.x     <none>        443/TCP          618d
x509-certificate-exporter   NodePort    10.196.xxx.173   <none>        9793:30090/TCP   16h

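Request /metrics on port 30090 of any host that runs a master node to get the collected metrics. A Prometheus instance outside the cluster can then be configured with that address as a static target, while a Prometheus instance deployed inside Kubernetes can simply use service discovery.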
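
To confirm that the NodePort endpoint works before pointing Prometheus at it, the metrics text can be fetched with curl and reduced to a days-remaining list. This is a rough sketch under assumptions: expiry_report is a hypothetical helper, <node-ip> is a placeholder, and each sample line is assumed to end with the value (no trailing timestamp).

```shell
#!/bin/sh
# expiry_report: hypothetical helper, not part of the exporter.
# Reads Prometheus text-format metrics on stdin and, for every
# x509_cert_not_after sample, prints "<whole days left> <series>".
# $1 is the current time as a Unix timestamp.
expiry_report() {
  awk -v now="$1" '
    /^x509_cert_not_after[{ ]/ {
      days = int(($NF - now) / 86400)   # seconds left divided by seconds per day
      sub(/ [^ ]+$/, "")                # drop the trailing sample value
      print days, $0
    }'
}

# Example (replace <node-ip> with the address of a master node):
# curl -s "http://<node-ip>:30090/metrics" | expiry_report "$(date +%s)"
```

A negative number in the first column means the certificate has already expired.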

Configuring monitoring and alerting

This Helm chart also installs the following automatically:

  • ServiceMonitor
  • PrometheusRule

The metric to watch is:

  • x509_cert_not_after

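In Prometheus, run (x509_cert_not_after - time()) / 86400. If the query returns results, the setup works; if it returns nothing, the deployment failed.

In your alerting platform, configure an alert that fires when the remaining validity drops below 14 days.

The PromQL is: (x509_cert_not_after{filepath!=""} - time()) / 86400 < 14

To set up a Grafana dashboard, copy x509-certificate-exporter/grafana-dashboards/x509-certificate-exporter.json from the Helm package and create the dashboard by importing the JSON.

Usually we only need to care about certificates that have already expired or are about to expire. To see how many days a certificate has left, use the expression (x509_cert_not_after{filepath!=""} - time()) / 3600 / 24. The example alerting rule below uses a 28-day threshold: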

groups:
  - name: certificate_alerts
    rules:
      - alert: CertificateExpiration
        expr: ((x509_cert_not_after - time()) / 86400) < 28
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Certificate is about to expire"
          description: "Certificate for '{{ $labels.subject_CN }}' is about to expire in Kubernetes secret '{{ $labels.secret_namespace }}/{{ $labels.secret_name }}'"

Monitoring results

The exporter also comes with a rather fancy Grafana dashboard:

https://grafana.com/grafana/dashboards/13922-certificates-expiration-x509-certificate-exporter/

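Troubleshooting

# Check that all pods are running
kubectl get po -n <namespace> -o wide

# If a pod is unhealthy, check its logs; the usual cause is a certificate that is unreadable or does not exist
kubectl logs -n <namespace> <pod name>

# If a certificate is not readable, grant read permission on that node
chmod +r <certificate path>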
