Manually installing Prometheus on Kubernetes (k8s)

Introduction

Official documentation: https://prometheus.io/docs/introduction/overview/

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company. To emphasize this, and to clarify the project’s governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.

Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.

For more elaborate overviews of Prometheus, see the resources linked from the media section of the official documentation.

Features

Prometheus’s main features are:

  • a multi-dimensional data model with time series data identified by metric name and key/value pairs
  • PromQL, a flexible query language to leverage this dimensionality
  • no reliance on distributed storage; single server nodes are autonomous
  • time series collection happens via a pull model over HTTP
  • pushing time series is supported via an intermediary gateway
  • targets are discovered via service discovery or static configuration
  • multiple modes of graphing and dashboarding support

Components

The Prometheus ecosystem consists of multiple components, many of which are optional:

  • the main Prometheus server which scrapes and stores time series data
  • client libraries for instrumenting application code
  • a push gateway for supporting short-lived jobs
  • special-purpose exporters for services like HAProxy, StatsD, Graphite, etc.
  • an alertmanager to handle alerts
  • various support tools

Architecture

[Figure: Prometheus architecture diagram]

Choosing an installation method

Binary installation

Prometheus is written mainly in Go. Download the binary release from https://prometheus.io/download/ and start it directly:

./prometheus --config.file=prometheus.yml

A basic prometheus.yml looks like this:

global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
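
Once the server has been started with this configuration, one quick way to confirm that it is up and scraping itself is to call its HTTP endpoints. The sketch below assumes that Prometheus is reachable at http://localhost:9090 (its default listen address) and that curl is installed. The names PROM_URL, prom_healthy and prom_query are illustrative helpers and not part of Prometheus.

```shell
#!/usr/bin/env bash
# Assumption: Prometheus listens on its default address; override PROM_URL if not.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Succeeds (exit code 0) when the built-in health endpoint answers with HTTP 2xx.
prom_healthy() {
  curl --silent --fail --output /dev/null "${PROM_URL}/-/healthy"
}

# Runs an instant PromQL query through the HTTP API and prints the JSON reply.
prom_query() {
  curl --silent --fail --get --data-urlencode "query=$1" "${PROM_URL}/api/v1/query"
}

# Usage:
#   prom_healthy && echo "Prometheus is up"
#   prom_query 'up{job="prometheus"}'   # a value of 1 means the self-scrape succeeded
```

The `up` series is generated by Prometheus for every scrape target, so querying it with the `job` label from the configuration above is a simple end-to-end check.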

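Installing via Docker

Log in to Docker Hub and search for Prometheus.
Any Prometheus image will do, for example:
https://hub.docker.com/layers/bitnami/prometheus/2-debian-10/images/sha256-ad4ad5965bc993979299fa366b408bd07366404f3a1d3915dc6e3eab44c42a64?context=explore

Two things matter here: the image address and the command the image starts with.
Both are needed to fill in the container section of the Kubernetes manifest.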
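
The image address and start command can also be read from the image metadata instead of the Docker Hub page. This is a minimal sketch, assuming the Docker CLI is installed and the image can be pulled. The helper name image_start_command is hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: the Docker CLI is installed and the registry is reachable.
IMAGE="${IMAGE:-docker.io/bitnami/prometheus:2-debian-10}"

# Prints the image's ENTRYPOINT and CMD as JSON. In a Kubernetes container
# spec these correspond to the `command` and `args` fields respectively.
image_start_command() {
  docker pull --quiet "$1" >/dev/null &&
    docker image inspect --format '{{json .Config.Entrypoint}} {{json .Config.Cmd}}' "$1"
}

# Usage:
#   image_start_command "$IMAGE"
```

Setting only `args` in the manifest keeps the image's entrypoint and replaces its default arguments.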


Writing the Prometheus Kubernetes YAML files

Deployment

apiVersion: apps/v1
kind: Deployment # a Deployment is used, so the data must survive Pod deletion; hence the PV-backed storage
metadata:
  name: prometheus-deploy
  namespace: prometheus-ns # a dedicated namespace, so a Namespace object must be declared as well
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
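        runAsUser: 0 # the image runs as UID 1001 by default, which lacks permission on the NFS export, so the container runs as root here
      serviceAccountName: prometheus-sa # Prometheus reads cluster information from the Kubernetes API, so it needs a ServiceAccount with RBAC rules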
      containers:
      - name: prometheus-container
        image: docker.io/bitnami/prometheus:2-debian-10
        imagePullPolicy: IfNotPresent
        args:
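        - "--config.file=/prometheus/conf/prometheus.yml"  # provided by the ConfigMap mounted below
        - "--web.console.libraries=/opt/bitnami/prometheus/conf/console_libraries"  # left unchanged; uses the files shipped in the image
        - "--web.console.templates=/opt/bitnami/prometheus/conf/consoles"  # left unchanged; uses the files shipped in the image
        - "--storage.tsdb.path=/prometheus/data/"  # persisted through the PVC
        - "--storage.tsdb.retention.time=24h"  # how long samples are kept; replaces the deprecated --storage.tsdb.retention
        - "--web.enable-admin-api"  # exposes the admin API (e.g. deleting series, taking snapshots)
        - "--web.enable-lifecycle"  # enables hot reload via HTTP POST to /-/reload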
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - containerPort: 9090
          name: app-http-port
        volumeMounts:
        - mountPath: "/prometheus/data/"  # persisted data directory
          subPath: sub1
          name: data
        - mountPath: "/prometheus/conf/"
          name: config
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus-pvc
      - name: config
        configMap:
          name: prometheus-cm  # the ConfigMap that holds prometheus.yml

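
Because the container is started with --web.enable-lifecycle, the configuration can be reloaded without restarting the Pod. The following is a sketch under the assumption that PROM_URL points at the Prometheus web port (for example the node IP and NodePort of the Service) and that curl is available. PROM_URL and prom_reload are illustrative names.

```shell
#!/usr/bin/env bash
# Assumption: PROM_URL points at the Prometheus web port, e.g. http://<node-ip>:<node-port>.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Asks Prometheus to re-read its configuration file.
# Only works when the server runs with --web.enable-lifecycle.
prom_reload() {
  curl --silent --fail --request POST "${PROM_URL}/-/reload"
}

# Usage, after editing the ConfigMap and waiting for the mounted file to be updated:
#   prom_reload && echo "configuration reloaded"
```

A changed ConfigMap is not visible inside the container immediately, so a reload issued right after the edit may still pick up the old file.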
Namespace

apiVersion: v1
kind: Namespace
metadata:
  name: prometheus-ns

PersistentVolume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv  # backs the PVC
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /nfsData/prometheus
    server: 192.168.56.203  # address of the NFS server

PersistentVolumeClaim

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc  # must match claimName in the Deployment
  namespace: prometheus-ns
spec:  # access modes must be compatible with the PV
  resources:
    requests:
      storage: 10Gi  # must not exceed the PV capacity
  accessModes:
    - ReadWriteOnce

ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-cm  # the ConfigMap that holds prometheus.yml
  namespace: prometheus-ns
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: "prometheus"

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
          - targets: ["localhost:9090"]

ServiceAccount

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-sa
  namespace: prometheus-ns

ClusterRole

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-cr  # cluster-scoped, because Prometheus needs to read resources in other namespaces
rules: # adjust to what your setup actually needs
- apiGroups: [""]
  resources:
    - nodes
    - services
    - endpoints
    - pods
    - nodes/proxy
    - configmaps
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]  # access to a non-resource URL
  verbs: ["get"]

ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-crb  # binds the ServiceAccount to the ClusterRole
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-cr
subjects:
  - kind: ServiceAccount
    name: prometheus-sa
    namespace: prometheus-ns
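
Once the ServiceAccount, ClusterRole and ClusterRoleBinding exist, the granted permissions can be checked by impersonating the ServiceAccount. This sketch assumes kubectl is configured and that the calling user is allowed to impersonate ServiceAccounts. The helper name sa_can_i is hypothetical.

```shell
#!/usr/bin/env bash
# Asks the API server whether the Prometheus ServiceAccount may perform
# a verb (first argument) on a resource (second argument).
# kubectl prints "yes" or "no".
sa_can_i() {
  kubectl auth can-i "$1" "$2" --as="system:serviceaccount:prometheus-ns:prometheus-sa"
}

# Usage:
#   sa_can_i list pods
#   sa_can_i watch endpoints
```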

Service

apiVersion: v1
kind: Service  # an Ingress could be configured for access instead
metadata:
  name: prometheus-svc
  namespace: prometheus-ns  # must be the namespace of the Pods; a Service only selects Pods in its own namespace
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: app-http-port

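Deploying to the cluster

Preparing the manifests

The YAML files prepared above can be applied one by one.

They can also be combined into a single file, with the documents separated by --- lines, for example prometheus-app.yaml: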

apiVersion: v1
kind: Namespace
metadata:
  name: prometheus-ns

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /nfsData/prometheus
    server: 192.168.56.203

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc
  namespace: prometheus-ns
spec:
  resources:
    requests:
      storage: 10Gi
  accessModes:
    - ReadWriteOnce

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deploy
  namespace: prometheus-ns
spec:
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
        runAsUser: 0
      serviceAccountName: prometheus-sa
      containers:
      - name: prometheus-container
        image: docker.io/bitnami/prometheus:2-debian-10
        imagePullPolicy: IfNotPresent
        args:
        - "--config.file=/prometheus/conf/prometheus.yml"
        - "--web.console.libraries=/opt/bitnami/prometheus/conf/console_libraries"
        - "--web.console.templates=/opt/bitnami/prometheus/conf/consoles"
        - "--storage.tsdb.path=/prometheus/data/"
        - "--storage.tsdb.retention.time=24h"
        - "--web.enable-admin-api"
        - "--web.enable-lifecycle"
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
        - containerPort: 9090
          name: app-http-port
        volumeMounts:
        - mountPath: "/prometheus/data/"
          subPath: sub1
          name: data
        - mountPath: "/prometheus/conf/"
          name: config
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: prometheus-pvc
      - name: config
        configMap:
          name: prometheus-cm

---
apiVersion: v1
kind: Service
metadata:
  name: prometheus-svc
  namespace: prometheus-ns
spec:
  selector:
    app: prometheus
  type: NodePort
  ports:
  - name: web
    port: 9090
    targetPort: app-http-port

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-cm
  namespace: prometheus-ns
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).

    # Alertmanager configuration
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              # - alertmanager:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: "prometheus"

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
          - targets: ["localhost:9090"]


---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-sa
  namespace: prometheus-ns

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-cr
rules:
- apiGroups: [""]
  resources:
    - nodes
    - services
    - endpoints
    - pods
    - nodes/proxy
    - configmaps
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-cr
subjects:
  - kind: ServiceAccount
    name: prometheus-sa
    namespace: prometheus-ns

Running the deploy command

kubectl create -f prometheus-app.yaml

The resources are created and the deployment completes successfully.
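
The result of the deployment can also be checked from the command line. The sketch below assumes kubectl is configured for the cluster and uses the resource names from the manifest above. The NAMESPACE variable and the helper names are illustrative.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-prometheus-ns}"

# Blocks until the Deployment has rolled out, or fails after the timeout
# (first argument, default 120s).
wait_for_prometheus() {
  kubectl --namespace "$NAMESPACE" rollout status deployment/prometheus-deploy --timeout="${1:-120s}"
}

# Lists the Pods and Services created from the manifest.
show_prometheus_resources() {
  kubectl --namespace "$NAMESPACE" get pods,services --output wide
}

# Usage:
#   wait_for_prometheus 60s && show_prometheus_resources
```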

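Testing access

Open http://<node IP>:<NodePort> in a browser. The Service does not pin a nodePort, so Kubernetes assigns one (from the range 30000-32767 by default).

The Prometheus web UI loads successfully.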
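
The assigned port can be looked up from the Service instead of reading it off the kubectl output by hand. This is a minimal sketch, assuming kubectl is configured and the Service is named as in the manifest above. The helper names and the example IP are placeholders.

```shell
#!/usr/bin/env bash
# Prints the NodePort that Kubernetes assigned to the first port of the Service.
prometheus_node_port() {
  kubectl --namespace prometheus-ns get service prometheus-svc \
    --output jsonpath='{.spec.ports[0].nodePort}'
}

# Builds the access URL from a node IP (first argument) and the assigned NodePort.
prometheus_url() {
  printf 'http://%s:%s\n' "$1" "$(prometheus_node_port)"
}

# Usage (replace the IP with the address of any cluster node):
#   prometheus_url 192.0.2.10
```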
