黑盒监控Blackbox

黑盒监控Blackbox

Prometheus 监控分为两种:

  • 白盒监控
  • 黑盒监控

白盒监控:是指我们日常监控主机的资源用量、容器的运行状态的运行数据。

黑盒监控:常见的黑盒监控包括 HTTP探针TCP探针DnsIcmp等用于检测站点、服务的可访问性、服务的连通性,以及访问效率等。

两者比较

  • 黑盒监控是以故障为导向当故障发生时,黑盒监控能快速发现故障*。*
  • 白盒监控则侧重于主动发现或者预测潜在的问题。

一个完善的监控目标是要能够从白盒的角度发现潜在问题,能够在黑盒的角度快速发现已经发生的问题。

目前支持的应用场景:

  • ICMP 测试
    • 主机探活机制
  • TCP 测试
    • 业务组件端口状态监听
    • 应用层协议定义与监听
  • HTTP 测试
    • 定义 Request Header 信息
    • 判断 Http status / Http Respones Header / Http Body 内容
  • POST 测试
    • 接口联通性
  • SSL 证书过期时间

Blackbox Exporter 部署

Exporter Configmap 定义,可以参考下面两个链接

https://github.com/prometheus/blackbox_exporter/blob/master/CONFIGURATION.md

https://github.com/prometheus/blackbox_exporter/blob/master/example.yml

首先得声明一个 Blackbox 的 Deployment,并利用 Configmap 来为 Blackbox 提供配置文件。

Configmap:

参考 BlackBox Exporter 的 Github 提供的 示例配置文件

vim blackbox-configmap.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-exporter
  namespace: monitor
  labels:
    app: blackbox-exporter
data:
  blackbox.yml: |-
    modules:
      ## ----------- DNS 检测配置 -----------
      dns_tcp:  
        prober: dns
        dns:
          transport_protocol: "tcp"
          preferred_ip_protocol: "ip4"
          query_name: "kubernetes.default.svc.cluster.local" # 用于检测域名可用的网址
          query_type: "A" 
      ## ----------- TCP 检测模块配置 -----------
      tcp_connect:
        prober: tcp
        timeout: 5s
      ## ----------- ICMP 检测配置 -----------
      icmp:
        prober: icmp
        timeout: 5s
        icmp:
          preferred_ip_protocol: "ip4"
      ## ----------- HTTP GET 2xx 检测模块配置 -----------
      http_get_2xx:  
        prober: http
        timeout: 10s
        http:
          method: GET
          preferred_ip_protocol: "ip4"
          valid_http_versions: ["HTTP/1.1","HTTP/2"]
          valid_status_codes: [200]           # 验证的HTTP状态码,默认为2xx
          no_follow_redirects: false          # 是否不跟随重定向
      ## ----------- HTTP GET 3xx 检测模块配置 -----------
      http_get_3xx:  
        prober: http
        timeout: 10s
        http:
          method: GET
          preferred_ip_protocol: "ip4"
          valid_http_versions: ["HTTP/1.1","HTTP/2"]
          valid_status_codes: [301,302,304,305,306,307]  # 验证的HTTP状态码,默认为2xx
          no_follow_redirects: false                     # 是否不跟随重定向
      ## ----------- HTTP POST 监测模块 -----------
      http_post_2xx: 
        prober: http
        timeout: 10s
        http:
          method: POST
          preferred_ip_protocol: "ip4"
          valid_http_versions: ["HTTP/1.1", "HTTP/2"]
          #headers:                             # HTTP头设置
          #  Content-Type: application/json
          #body: '{}'                           # 请求体设置

vim blackbox-exporter.yaml

apiVersion: v1
kind: Service
metadata:
  name: blackbox-exporter
  namespace: monitor
  labels:
    k8s-app: blackbox-exporter
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9115
    targetPort: 9115
  selector:
    k8s-app: blackbox-exporter
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blackbox-exporter
  namespace: monitor
  labels:
    k8s-app: blackbox-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: blackbox-exporter
  template:
    metadata:
      labels:
        k8s-app: blackbox-exporter
    spec:
      containers:
      - name: blackbox-exporter
        image: prom/blackbox-exporter:v0.21.0
        imagePullPolicy: IfNotPresent
        args:
        - --config.file=/etc/blackbox_exporter/blackbox.yml
        - --web.listen-address=:9115
        - --log.level=info
        ports:
        - name: http
          containerPort: 9115
        resources:
          limits:
            cpu: 200m
            memory: 256Mi
          requests:
            cpu: 100m
            memory: 50Mi
        livenessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        readinessProbe:
          tcpSocket:
            port: 9115
          initialDelaySeconds: 5
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        volumeMounts:
        - name: config
          mountPath: /etc/blackbox_exporter
      volumes:
      - name: config
        configMap:
          name: blackbox-exporter
          defaultMode: 420

验证

# 部署
$ kubectl apply -f blackbox-configmap.yaml
$ kubectl apply -f blackbox-exporter.yaml
# 查看部署后的资源
$ kg all -nmonitor |grep blackbox

定义 BlackBox 在 Prometheus 抓取设置

下面抓取设置,都存放在 prometheus-config.yaml 文件中,设置可参考

https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml

DNS 监控

 - job_name: "kubernetes-dns"
      metrics_path: /probe      # 不是 metrics,是 probe
      params:
        module: [dns_tcp]       # 使用 DNS TCP 模块
      static_configs:
        - targets:
          - kube-dns.kube-system:53     # 不要省略端口号
          - 8.8.4.4:53
          - 8.8.8.8:53
          - 223.5.5.5:53
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: blackbox-exporter.monitor:9115       # 服务地址,和上面的 Service 定义保持一致

参数解释

################ DNS 服务器监控 ###################
- job_name: "kubernetes-dns"
  metrics_path: /probe
  params:
    ## 配置要使用的模块,要与blackbox exporter配置中的一致
    ## 这里使用DNS模块
    module: [dns_tcp]
  static_configs:
    ## 配置要检测的地址
    - targets:
      - kube-dns.kube-system:53
      - 8.8.4.4:53
      - 8.8.8.8:53
      - 223.5.5.5
  relabel_configs:
    ## 将上面配置的静态DNS服务器地址转换为临时变量 “__param_target”
    - source_labels: [__address__]
      target_label: __param_target
    ## 将 “__param_target” 内容设置为 instance 实例名称
    - source_labels: [__param_target]
      target_label: instance
    ## BlackBox Exporter 的 Service 地址
    - target_label: __address__
      replacement: blackbox-exporter.monitor:9115

热加载

$ curl -XPOST http://prometheus.kubernets.cn/-/reload

打开 Prometheus 的 Target 页面,就会看到 上面定义的 blackbox-k8s-service-dns 任务

graph 页面,可以使用 probe_successprobe_duration_seconds 等来检查历史结果

ICMP监控

 - job_name: icmp-status
      metrics_path: /probe
      params:
        module: [icmp]
      static_configs:
      - targets:
        - 11.0.1.91
        labels:
          group: icmp
      relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter.monitor:9115

热加载

curl -XPOST http://prometheus.kubernets.cn/-/reload

HTTP 监控(K8S 内部发现方法)

自定义发现Service监控端口和路径

 job_name: 'kubernetes-services'
      metrics_path: /probe
      params:
        module:     ## 使用HTTP_GET_2xx与HTTP_GET_3XX模块
        - "http_get_2xx"
        - "http_get_3xx"
      kubernetes_sd_configs:        ## 使用Kubernetes动态服务发现,且使用Service类型的发现
      - role: service
      relabel_configs:      ## 设置只监测Kubernetes Service中Annotation里配置了注解prometheus.io/http_probe: true的service
      - action: keep
        source_labels: [__meta_kubernetes_service_annotation_prometheus_io_http_probe]
        regex: "true"
      - action: replace
        source_labels: 
        - "__meta_kubernetes_service_name"
        - "__meta_kubernetes_namespace"
        - "__meta_kubernetes_service_annotation_prometheus_io_http_probe_port"
        - "__meta_kubernetes_service_annotation_prometheus_io_http_probe_path"
        target_label: __param_target
        regex: (.+);(.+);(.+);(.+)
        replacement: $1.$2:$3$4
      - target_label: __address__
        replacement: blackbox-exporter.monitor:9115     ## BlackBox Exporter 的 Service 地址
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: kubernetes_name   

然后,需要在Service中配置这样的annotation

annotations:
  prometheus.io/http-probe: "true"      ## 开启 HTTP 探针
  prometheus.io/http-probe-port: "8080"     ## HTTP 探针会使用 8080 端口来进行探测
  prometheus.io/http-probe-path: "/healthCheck"     ## HTTP 探针会请求  /healthCheck  路径来进行探测,以检查应用程序是否正常运行

示例:Java应用的svc:

apiVersion: v1
kind: Service
metadata:
  name: springboot
  annotations:
    prometheus.io/http-probe: "true"
    prometheus.io/http-probe-port: "8080"
    prometheus.io/http-probe-path: "/apptwo"
spec:
  type: ClusterIP
  selector:
    app: springboot
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080

热加载

curl -XPOST http://prometheus.kubernets.cn/-/reload

TCP检测

- job_name: "service-tcp-probe"
      scrape_interval: 1m
      metrics_path: /probe
      # 使用blackbox exporter配置文件的tcp_connect的探针
      params:
        module: [tcp_connect]
      kubernetes_sd_configs:
      - role: service
      relabel_configs:
      # 保留prometheus.io/scrape: "true"和prometheus.io/tcp-probe: "true"的service
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape, __meta_kubernetes_service_annotation_prometheus_io_tcp_probe]
        action: keep
        regex: true;true
      # 将原标签名__meta_kubernetes_service_name改成service_name
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        regex: (.*)
        target_label: service_name
      # 将原标签名__meta_kubernetes_service_name改成service_name
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        regex: (.*)
        target_label: namespace
      # 将instance改成 `clusterIP:port` 地址
      - source_labels: [__meta_kubernetes_service_cluster_ip, __meta_kubernetes_service_annotation_prometheus_io_http_probe_port]
        action: replace
        regex: (.*);(.*)
        target_label: __param_target
        replacement: $1:$2
      - source_labels: [__param_target]
        target_label: instance
      # 将__address__的值改成 `blackbox-exporter.monitor:9115`
      - target_label: __address__
        replacement: blackbox-exporter.monitor:9115

热加载

curl -XPOST http://prometheus.kubernets.cn/-/reload

则需要在service上添加注释必须有以下三行

annotations:
    prometheus.io/scrape: "true"        ## 这个服务是可以被采集指标的,Prometheus 可以对这个服务进行数据采集
    prometheus.io/tcp-probe: "true"     ## 开启 TCP 探针
    prometheus.io/http-probe-port: "8080"       ## HTTP 探针会使用 8080 端口来进行探测,以检查应用程序是否正常运行

示例

apiVersion: v1
kind: Service
metadata:
  name: springboot
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/tcp-probe: "true"
    prometheus.io/http-probe-port: "8080"
spec:
  type: ClusterIP
  selector:
    app: springboot
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080

Ingress服务的探测

  - job_name: 'blackbox-k8s-ingresses'
      scrape_interval: 30s
      scrape_timeout: 10s
      metrics_path: /probe
      params:
        module: [http_get_2xx]  # 使用定义的http模块
      kubernetes_sd_configs:
      - role: ingress  # ingress 类型的服务发现
      relabel_configs:
      # 只有ingress的annotation中配置了 prometheus.io/http_probe=true 的才进行发现
      - source_labels: [__meta_kubernetes_ingress_annotation_prometheus_io_http_probe]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_ingress_scheme,__address__,__meta_kubernetes_ingress_path]
        regex: (.+);(.+);(.+)
        replacement: ${1}://${2}${3}
        target_label: __param_target
      - target_label: __address__
        replacement: blackbox-exporter.monitor:9115
      - source_labels: [__param_target]
        target_label: instance
      - action: labelmap
        regex: __meta_kubernetes_ingress_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_ingress_name]
        target_label: kubernetes_name

则需要在ingress上添加注释必须有以下三行

  annotations:
    prometheus.io/http_probe: "true"
    prometheus.io/http-probe-port: '8080'
    prometheus.io/http-probe-path: '/healthz'

示例

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: java-ingress-nginx
  namespace: default
  annotations:
    prometheus.io/http_probe: "true"
    prometheus.io/http-probe-port: '8080'
    prometheus.io/http-probe-path: '/apptwo'
spec:
  ingressClassName: nginx
  rules:
    - host: java.kubernets.cn
      http:
        paths:
          - pathType: Prefix
            backend:
              service:
                name: springboot
                port:
                  number: 8080
            path: /

HTTP 监控(监控外部域名)

 - job_name: "blackbox-external-website"
      scrape_interval: 30s
      scrape_timeout: 15s
      metrics_path: /probe
      params:
        module: [http_get_2xx]
      static_configs:
      - targets:
        - https://www.baidu.com # 改为公司对外服务的域名
        - https://www.jd.com
      relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter.monitor:9115

热加载

$ curl -XPOST http://prometheus.kubernets.cn/-/reload

HTTP Post 监控(监控外部域名)

 - job_name: 'blackbox-http-post'
      metrics_path: /probe
      params:
        module: [http_post_2xx]
      static_configs:
        - targets:
          - https://www.example.com/api # 要检查的网址
      relabel_configs:
        - source_labels: [__address__]
          target_label: __param_target
        - source_labels: [__param_target]
          target_label: instance
        - target_label: __address__
          replacement: blackbox-exporter.monitor:9115

热加载

$ curl -XPOST http://prometheus.kubernets.cn/-/reload

小结

  • Blackbox Exporter是一个用于监控网络服务的开源工具。
  • 通过模拟HTTP、HTTPS、DNS、TCP等协议向服务端发送请求,并返回响应码、响应时间等信息。
  • Blackbox Exporter具有高度的灵活性,可以通过配置文件进行定制,包括对请求的频率、超时时限、请求头、请求参数等进行配置。
  • Blackbox Exporter还支持Prometheus监控系统,可以将采集到的数据自动发送给Prometheus,并进行聚合和展示
  • 16
    点赞
  • 17
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值