Setting up a Prometheus monitoring platform in a k8s environment with automatic discovery of cluster nodes

立33

Last modified 2024-05-22 17:40:45

Tags: kubernetes, prometheus, containers, experience sharing, cloud native, operations, k8s

First published 2024-05-22 17:37:18

Article link: https://blog.csdn.net/baidu_39157826/article/details/139125611

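This article assumes a working k8s cluster that uses Docker as its container runtime. Prometheus is deployed on top of that setup, so both prerequisites need to be in place before you follow these steps.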

node_exporter

node_exporter is the Prometheus exporter that collects metrics from the host system.

Configuration

Create a configuration file named node_exporter_k8s.yml

apiVersion: apps/v1
# Run as a DaemonSet so that a node_exporter container is created and started automatically whenever a new k8s node joins the cluster
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: kube-system
  labels:
    app: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # This is a test environment, so the host network is used for convenience; remove this setting to go through the k8s pod network instead
      hostNetwork: true
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: node-exporter
        image: quay.io/prometheus/node-exporter:latest
        # node_exporter runs in a container here, so --path.rootfs must point at the path where the host's root directory is mounted
        args:
        - --path.rootfs=/host
        volumeMounts:
        - name: rootfs
          mountPath: /host
      # Mount the host's root directory
      volumes:
      - name: rootfs
        hostPath:
          path: /
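
The node_exporter README suggests running with the host PID namespace and mounting the host root read-only with slave mount propagation when the exporter runs in a container. Below is a minimal sketch of how the pod spec above could be tightened along those lines. It reuses the names from the DaemonSet; the `hostPID`, `ports`, `readOnly`, and `mountPropagation` settings are my additions and are not part of the original manifest.

```yaml
# Fragment of spec.template.spec in node_exporter_k8s.yml (sketch, not a full manifest)
hostNetwork: true
hostPID: true                            # assumption: share the host PID namespace, as the README's docker example does
containers:
- name: node-exporter
  image: quay.io/prometheus/node-exporter:latest
  args:
  - --path.rootfs=/host
  ports:
  - name: metrics
    containerPort: 9100                  # node_exporter's default listen port
  volumeMounts:
  - name: rootfs
    mountPath: /host
    readOnly: true                       # the exporter only needs to read the host filesystem
    mountPropagation: HostToContainer    # mounts added on the host later become visible in the container
volumes:
- name: rootfs
  hostPath:
    path: /
```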

Apply the configuration and start

kubectl apply -f node_exporter_k8s.yml

prometheus

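Configuration

Create a configuration file named prometheus.yml. This configuration lets Prometheus automatically discover the nodes (through the node-exporter pod running on each of them) and scrape their metrics.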

# prometheus.yml
global:
  scrape_interval: 15s  # default scrape interval

scrape_configs:
  - job_name: 'node-exporter'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - kube-system  # namespace to discover pods in, here kube-system
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: node-exporter
      - source_labels: [__meta_kubernetes_pod_ip]
        action: replace
        target_label: __address__
        regex: (.*)
        replacement: $1:9100  # node-exporter's port
      - source_labels: [__meta_kubernetes_pod_node_name]
        target_label: instance
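
The job above finds nodes indirectly, through the node-exporter pods. Because the DaemonSet uses the host network, the same targets can also be discovered with the `node` role. Below is a rough sketch of that alternative. The job name is hypothetical, and it assumes node_exporter listens on port 9100 on every node and that Prometheus is allowed to list and watch `nodes`.

```yaml
# Alternative scrape job (sketch): discover nodes directly instead of pods
scrape_configs:
  - job_name: 'node-exporter-nodes'   # hypothetical job name
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      # The node role yields <node address>:<kubelet port>;
      # swap the port for node-exporter's port
      - source_labels: [__address__]
        action: replace
        regex: '(.+):\d+'
        replacement: '${1}:9100'
        target_label: __address__
```

With the `node` role, Prometheus sets the `instance` label to the node name by default, so no extra relabeling rule is needed for that.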

Create the k8s Deployment

Configuration

Create a configuration file named prometheus_k8s.yml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: kube-system
  labels:
    app: prometheus
spec:
  selector:
    matchLabels:
      name: prometheus
  template:
    metadata:
      labels:
        name: prometheus
    spec:
      hostNetwork: true
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: prometheus
        image: prom/prometheus
        volumeMounts:
        - name: config
          mountPath: /etc/prometheus/prometheus.yml
      volumes:
      - name: config
        hostPath:
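          # Path on the host to the prometheus.yml created in the previous step; the file must exist on the node that runs the pod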
          path: /k8s/prometheus/config/prometheus.yml
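
A hostPath file only works if prometheus.yml exists on whichever node the pod is scheduled to. One common way around that is to ship the file through a ConfigMap. The sketch below shows the idea; the ConfigMap name `prometheus-config` is hypothetical, and the embedded configuration simply repeats the prometheus.yml from the previous step.

```yaml
# Sketch: provide prometheus.yml through a ConfigMap instead of a hostPath file
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config        # hypothetical name
  namespace: kube-system
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - kube-system
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_label_app]
            action: keep
            regex: node-exporter
          - source_labels: [__meta_kubernetes_pod_ip]
            action: replace
            target_label: __address__
            regex: (.*)
            replacement: $1:9100
          - source_labels: [__meta_kubernetes_pod_node_name]
            target_label: instance
```

The Deployment would then mount the ConfigMap as a directory in place of the hostPath volume:

```yaml
# Fragment of spec.template.spec in prometheus_k8s.yml (sketch)
containers:
- name: prometheus
  image: prom/prometheus
  volumeMounts:
  - name: config
    mountPath: /etc/prometheus   # the ConfigMap key appears as /etc/prometheus/prometheus.yml
volumes:
- name: config
  configMap:
    name: prometheus-config
```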

Apply the configuration and start

kubectl apply -f prometheus_k8s.yml

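Once the pod is running, the Prometheus UI is reachable on port 9090 of the host. Because the Deployment uses hostNetwork, that host is the node the pod was scheduled on.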
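
If you would rather not use the host network, a NodePort Service is one possible way to reach the UI from outside the cluster. This is only a sketch: the Service name and the `nodePort` value are assumptions, and the selector relies on the `name: prometheus` label from the pod template in prometheus_k8s.yml.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus               # hypothetical name
  namespace: kube-system
spec:
  type: NodePort
  selector:
    name: prometheus             # matches the pod template label in prometheus_k8s.yml
  ports:
  - name: web
    port: 9090
    targetPort: 9090             # Prometheus' default listen port
    nodePort: 30090              # assumption: any free port in the default 30000-32767 range
```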

Troubleshooting

If Prometheus logs the following error on startup

caller=klog.go:116 level=error component=k8s_client_runtime func=ErrorDepth msg="pkg/mod/k8s.io/client-go@v0.29.3/tools/cache/reflector.go:229: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:kube-system:default\" cannot list resource \"pods\" in API group \"\" at the cluster scope"

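the cause is a permissions problem: the default ServiceAccount in kube-system, which the pod runs as, is not allowed to list pods. The fix is as follows.

1. Create a ClusterRole

Create a new ClusterRole that grants permission to get, list, and watch pods (along with nodes, services, and endpoints). Save it as prometheus-clusterrole.yaml.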

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
    - pods
    - nodes
    - services
    - endpoints
  verbs:
    - get
    - list
    - watch
- apiGroups: [""]
  resources:
    - configmaps
  verbs:
    - get
- nonResourceURLs: ["/metrics"]
  verbs:
    - get

2. Create a ClusterRoleBinding

Create a ClusterRoleBinding that binds the ClusterRole above to the ServiceAccount Prometheus runs as. Save it as prometheus-clusterrolebinding.yaml.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: default
  namespace: kube-system
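
Binding the role to `default` gives the same permissions to every pod in kube-system that runs under the default ServiceAccount. A narrower variant is to give Prometheus its own ServiceAccount. The sketch below assumes a ServiceAccount named `prometheus`, which does not appear in the original setup.

```yaml
# Sketch: dedicated ServiceAccount for Prometheus (the name is an assumption)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus               # bind to the dedicated account instead of default
  namespace: kube-system
```

For this to take effect, the Deployment's pod spec would also need `serviceAccountName: prometheus` under spec.template.spec.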

3. Apply the RBAC configuration

Apply the two YAML files above to the Kubernetes cluster:

kubectl apply -f prometheus-clusterrole.yaml
kubectl apply -f prometheus-clusterrolebinding.yaml

4. Restart Prometheus

kubectl rollout restart deployment prometheus -n kube-system
