Deploying Prometheus on k8s

Latest recommended article published on 2024-09-13 09:21:48

云纷纷

Category: Prometheus and Grafana monitoring platform. Tags: kubernetes, prometheus, containers

Article link: https://blog.csdn.net/s1440350254/article/details/141371065

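Contents

Deploying Prometheus
Getting to know the Prometheus monitoring platform

Deploying Prometheus

Prerequisite: a working k8s cluster (see the blog post on deploying a k8s cluster).

Getting to know the Prometheus monitoring platform

Create the namespace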

$ kubectl create namespace monitor

Create the RBAC rules

The RBAC setup consists of three kinds of resources, ServiceAccount, ClusterRole and ClusterRoleBinding, all defined in one YAML file.

vim prometheus-rabc.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitor
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["nodes","nodes/proxy","services","endpoints","pods"]
  verbs: ["get", "list", "watch"] 
- apiGroups: ["networking.k8s.io"]
  resources: ["ingresses"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef: 
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitor

Verify

$ kubectl apply -f prometheus-rabc.yaml
$ kubectl get sa prometheus -n monitor
NAME         SECRETS   AGE
prometheus   0         52s
$ kubectl get clusterrole prometheus
NAME         CREATED AT
prometheus   2024-08-19T20:06:06Z
$ kubectl get clusterrolebinding prometheus
NAME         ROLE                        AGE
prometheus   ClusterRole/cluster-admin   9m15s
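
The ClusterRoleBinding above points at the built-in cluster-admin role, so the prometheus ClusterRole defined in the same file is not what the ServiceAccount ends up with; to scope it to the read-only rules, roleRef.name would be prometheus instead. Either way, one way to see what the ServiceAccount can actually do is kubectl auth can-i with impersonation. The sketch below wraps that in a small helper. The function name sa_can_i is made up for illustration, and it assumes your own kubectl user is allowed to impersonate ServiceAccounts.

```shell
# Hypothetical helper: ask the API server whether the prometheus
# ServiceAccount in the monitor namespace may perform an action.
# All arguments are passed straight through to `kubectl auth can-i`.
sa_can_i() {
  kubectl auth can-i "$@" --as="system:serviceaccount:monitor:prometheus"
}
```

For example, `sa_can_i list pods` and `sa_can_i get /metrics` should both answer yes for a working scrape setup, while a yes from `sa_can_i delete deployments -n kube-system` is a sign that the binding is broader than monitoring needs.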

Create the Prometheus configuration file as a ConfigMap

vim prometheus-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval:     15s
      evaluation_interval: 15s
      external_labels:
        cluster: "kubernetes"
        
    ############ Scrape jobs ###################
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets: ['127.0.0.1:9090']
        labels:
          instance: prometheus
 
    ############ Path to the alerting rule files ###################
    rule_files:
    - /etc/prometheus/rules/*.rules

Verify

$ kubectl apply -f  prometheus-cm.yaml
$ kubectl get cm prometheus-config -n monitor
NAME                DATA   AGE
prometheus-config   1      4s

Create the Prometheus rules files as a ConfigMap

The alerting rules are also delivered through a ConfigMap.

It contains two entries: general.rules and node.rules.

Create these two additional Prometheus configuration files as follows:

vim prometheus-rules.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-rules
  namespace: monitor
data:
  general.rules: |
    groups:
    - name: general.rules
      rules:
      - alert: InstanceDown
        expr: |
          up{job=~"k8s-nodes|prometheus"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} (hostname: {{ $labels.hostname }}) has been down for more than 1 minute."

  node.rules: |
    groups:
    - name: node.rules
      rules:
      - alert: NodeFilesystemUsage
        expr: |
          100 - (node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 > 85
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Instance {{ $labels.instance }}: high usage on partition {{ $labels.mountpoint }}"
          description: "{{ $labels.instance }} (hostname: {{ $labels.hostname }}): partition {{ $labels.mountpoint }} is more than 85% used (current value: {{ $value }})"

Verify

$ kubectl apply -f  prometheus-rules.yaml
$ kubectl get cm -n monitor prometheus-rules
NAME               DATA   AGE
prometheus-rules   2      11s
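
To make the NodeFilesystemUsage expression concrete, the sketch below redoes the same arithmetic outside Prometheus for a single filesystem. The function names and the sample byte counts are illustrative only. In Prometheus the two inputs come from the node_filesystem_avail_bytes and node_filesystem_size_bytes series, which are exported by node_exporter, so the rule only produces results once such a target is scraped.

```shell
# Hypothetical helpers mirroring the rule expression
#   100 - (avail / size) * 100 > 85
# Arguments: available bytes, total size in bytes.

# Print the used percentage with one decimal place.
usage_percent() {
  awk -v avail="$1" -v size="$2" \
    'BEGIN { printf "%.1f\n", 100 - (avail / size) * 100 }'
}

# Succeed (exit status 0) when the usage is above the 85% threshold.
would_alert() {
  awk -v avail="$1" -v size="$2" \
    'BEGIN { exit !((100 - (avail / size) * 100) > 85) }'
}
```

With 10 GiB free on a 100 GiB filesystem, `usage_percent 10737418240 107374182400` prints 90.0 and `would_alert` succeeds with the same arguments. The real alert additionally requires the condition to hold for the 1m given in the rule's for field.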

Create the Prometheus Service

vim prometheus-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitor
  labels:
    k8s-app: prometheus
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 9090
    targetPort: 9090
  selector:
    k8s-app: prometheus

Verify

$ kubectl apply -f  prometheus-svc.yaml
$ kubectl get svc -n monitor prometheus
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
prometheus   ClusterIP   10.1.8.76    <none>        9090/TCP   9m29s

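Create the Prometheus Deployment

Prometheus needs persistent storage so that historical data survives a restart. Here the data is persisted on the NFS storage deployed in an earlier lesson.

The data volume is provisioned through the NFS-backed StorageClass.

For creating the StorageClass, see this blog post.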

vim prometheus-pvc.yaml

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data-pvc
  namespace: monitor
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: "nfs-storage"
  resources:
    requests:
      storage: 10Gi

vim prometheus-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitor
  labels:
    k8s-app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus
  template:
    metadata:
      labels:
        k8s-app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: docker.m.daocloud.io/prom/prometheus:v2.36.0
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 9090
        securityContext:
          runAsUser: 65534
          privileged: true
        command:
        - "/bin/prometheus"
        args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--web.enable-lifecycle"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention.time=10d"
        - "--web.console.libraries=/etc/prometheus/console_libraries"
        - "--web.console.templates=/etc/prometheus/consoles"
        resources:
          limits:
            cpu: 2000m
            memory: 2048Mi
          requests:
            cpu: 1000m
            memory: 512Mi
        readinessProbe:
          httpGet:
            path: /-/ready
            port: 9090
          initialDelaySeconds: 5
          timeoutSeconds: 10
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
        volumeMounts:
        - name: data
          mountPath: /prometheus
          subPath: prometheus
        - name: config
          mountPath: /etc/prometheus
        - name: prometheus-rules
          mountPath: /etc/prometheus/rules
      - name: configmap-reload
        image: jimmidyson/configmap-reload:v0.5.0
        imagePullPolicy: IfNotPresent
        args:
        - "--volume-dir=/etc/config"
        - "--webhook-url=http://localhost:9090/-/reload"
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 10Mi
        volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus-data-pvc
      - name: prometheus-rules
        configMap:
          name: prometheus-rules
      - name: config
        configMap:
          name: prometheus-config

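The containers section of the Deployment above defines two containers:

prometheus: the main container, which runs the Prometheus process.
configmap-reload: watches the files of the mounted ConfigMap and calls a webhook URL whenever their content changes. Prometheus can reload its configuration through an HTTP endpoint, so this container calls the Prometheus /-/reload endpoint as soon as the Prometheus ConfigMap changes, and the new configuration takes effect.

Prometheus flags used in the manifest above:

--web.enable-lifecycle: enables the HTTP lifecycle endpoints, including /-/reload, which reloads the Prometheus configuration
--config.file: path of the Prometheus configuration file, inside the container
--storage.tsdb.path: directory where Prometheus stores its data, inside the container
--storage.tsdb.retention.time: how long data is kept before it is deleted; the default is 15d
--web.console.libraries: path of the console library directory
--web.console.templates: path of the console template directory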
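
The reload that the configmap-reload sidecar performs can also be triggered by hand, which is handy when testing a configuration change. The helper below is a minimal sketch: the function name and the default base URL are assumptions, and it expects the Prometheus API to be reachable at that URL, for instance through a port-forward.

```shell
# Hypothetical helper: ask Prometheus to re-read its configuration.
# The endpoint accepts POST or PUT and is only served when Prometheus
# runs with --web.enable-lifecycle.
reload_prometheus() {
  local base_url="${1:-http://127.0.0.1:9090}"
  curl --silent --show-error --fail -X POST "${base_url}/-/reload"
}
```

With `kubectl -n monitor port-forward svc/prometheus 9090:9090` running in another terminal, calling `reload_prometheus` sends the same request as the sidecar's webhook.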

Verify

$ kubectl apply -f prometheus-pvc.yaml
$ kubectl apply -f prometheus-deploy.yaml
$ kubectl get pvc -n monitor
NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-data-pvc   Bound    pvc-95786ed1-2d43-46ca-b15c-b3dcf958a6b6   10Gi       RWX            nfs-storage    38s
$ kubectl get deploy  -n monitor
NAME         READY   UP-TO-DATE   AVAILABLE   AGE
prometheus   1/1     1            1           83s
$ kubectl get pods -n monitor
NAME                          READY   STATUS    RESTARTS   AGE
prometheus-58cf9d5989-sttk2   2/2     Running   0          100s
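
The readiness and liveness probes in the Deployment call /-/ready and /-/healthy on port 9090. The same endpoints can be queried by hand when a Pod does not reach 2/2. The sketch below assumes the API is reachable at the given base URL (for example through a port-forward); the function name is hypothetical.

```shell
# Hypothetical helper: query the two endpoints used by the probes.
# Returns non-zero as soon as one of them does not answer successfully.
check_prometheus_health() {
  local base_url="${1:-http://127.0.0.1:9090}"
  local endpoint
  for endpoint in /-/healthy /-/ready; do
    curl --silent --show-error --fail --output /dev/null \
      "${base_url}${endpoint}" || return 1
  done
}
```

A typical call is `check_prometheus_health && echo ok`, again with a port-forward to the prometheus Service open in another terminal.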

Create the Prometheus Ingress for access through an external domain name

For deploying the ingress controller, see the Ingress deployment post.

vim prometheus-ing.yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: monitor
  name: prometheus-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: prometheus.kubernets.cn
    http:
      paths:
        - pathType: Prefix
          backend:
            service:
              name: prometheus
              port:
                number: 9090
          path: /

Verify

$ kubectl apply -f prometheus-ing.yaml
$ kubectl get ing -n  monitor prometheus-ingress
NAME                 CLASS   HOSTS                     ADDRESS   PORTS   AGE
prometheus-ingress   nginx   prometheus.kubernets.cn             80      28s
$ kubectl  get svc -n ingress-nginx 
NAME                                 TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller             NodePort    10.1.231.117   <none>        80:30186/TCP,443:30153/TCP   32h
$ echo '11.0.1.92 prometheus.kubernets.cn' >> /etc/hosts
$ curl prometheus.kubernets.cn:30186
<a href="/graph">Found</a>
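
The hosts entry only exists so that the request carries the Host header that the Ingress rule matches on. A sketch of the same check without editing the hosts file is shown below; the function name is made up, and the node IP and NodePort are whatever your ingress-nginx-controller Service exposes.

```shell
# Hypothetical helper: call the ingress controller's NodePort directly and
# pass the virtual host in the Host header.
# Arguments: node IP, NodePort, optional host name.
curl_via_ingress() {
  local node_ip="$1" node_port="$2" host="${3:-prometheus.kubernets.cn}"
  curl --silent --show-error -H "Host: ${host}" "http://${node_ip}:${node_port}/"
}
```

With the values from the output above this would be `curl_via_ingress 11.0.1.92 30186`.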