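1 Introduction to Prometheus
Prometheus is an open-source service monitoring system and time-series database.
It provides a general-purpose data model and convenient interfaces for collecting, storing, and querying data.
Its core component, the Prometheus server, periodically pulls data from statically configured targets or from targets configured automatically through service discovery.
When newly pulled data exceeds the configured in-memory buffer, it is persisted to the storage device.
1.1 Prometheus architecture
1.1.1 Component functions: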
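- Monitoring agents such as node_exporter: collect host metrics across many dimensions, such as load average, CPU, memory, disk, and network.
- kubelet (cAdvisor): collects container metrics, which are also the core metrics of K8S. The per-container metrics include CPU usage and limits, filesystem read/write limits, memory usage and limits, and network packet send, receive, and drop rates.
- API Server: collects API Server performance metrics, including control queue performance, request rate, and latency.
- etcd: collects metrics for the etcd storage cluster.
- kube-state-metrics: derives many k8s-related metrics, mainly counters and metadata for resource types, including the total number of objects of a given type, resource quotas, container states, and Pod resource label series.
- Every monitored host can expose an interface for its monitoring data through a dedicated exporter and then waits for the Prometheus server to scrape it periodically.
- If alerting rules exist, the scraped data is evaluated against them. When an alert condition is met, an alert is generated and sent to Alertmanager, which aggregates and dispatches it.
- When a monitored target needs to push data on its own, the Pushgateway component can receive and temporarily store that data until the Prometheus server scrapes it.
- A target must be registered in the monitoring system beforehand before its time-series data can be collected, stored, alerted on, and displayed.
- Targets can be specified statically in the configuration, or Prometheus can manage them dynamically through service discovery.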
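To make the pull model above more concrete: an exporter serves plain-text samples over HTTP, and the Prometheus server scrapes them on a schedule. The sketch below parses a few such lines in shell. The sample text and the `metric_value` helper are assumptions made up for illustration; they are not output captured from this cluster.

```shell
# Hypothetical sample of what an exporter returns from its /metrics endpoint
# (Prometheus text exposition format). The values are invented.
sample_metrics() {
  cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_memory_MemFree_bytes Memory information field MemFree_bytes.
# TYPE node_memory_MemFree_bytes gauge
node_memory_MemFree_bytes 1.048576e+09
EOF
}

# metric_value NAME: read exposition text on stdin and print the value of the
# first unlabelled sample whose metric name is exactly NAME.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

sample_metrics | metric_value node_load1
```

Against a real node_exporter the same helper could be fed from something like `curl -s http://<node-ip>:9100/metrics`, where the node address is a placeholder.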
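2 Deploying Prometheus on k8s
2.1 Download the resources needed to deploy Prometheus
#Add the Prometheus repository to helm (only do this if your network connection is very good)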
[root@k8s-master helm]# helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
"prometheus-community" has been added to your repositories
#Download the Prometheus chart
[root@k8s-master helm]# helm pull prometheus-community/kube-prometheus-stack
[root@k8s-master helm]# ls
kube-prometheus-stack-62.6.0.tgz
================================================================
2.2 Deployment steps
Download the container images at the image paths given in the values.yaml of every chart, then push them to the harbor registry
[root@k8s-master ~]# mkdir prometheus
[root@k8s-master ~]# cd prometheus/
[root@k8s-master prometheus]# ls
grafana-11.2.0.tar kube-state-metrics-2.13.0.tar nginx-exporter-1.3.0-debian-12-r2.tar prometheus-62.6.0.tar
kube-prometheus-stack-62.6.0.tgz nginx-18.1.11.tgz node-exporter-1.8.2.tar
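Instead of reading every values.yaml by hand, one possible way to find the images is to render the chart and collect the `image:` fields. This is only a sketch: `list_images` is a hypothetical helper, and the manifest fragment in the demo is hand-written, not real chart output.

```shell
# list_images: read rendered Kubernetes manifests on stdin and print each
# distinct image reference once, sorted.
list_images() {
  awk '$1 == "image:" || ($1 == "-" && $2 == "image:") { print $NF }' \
    | tr -d "\"'" | sort -u
}

# Demo on a small hand-written manifest fragment.
list_images <<'EOF'
    containers:
      - image: "quay.io/prometheus/node-exporter:v1.8.2"
        name: node-exporter
      - name: grafana
        image: grafana/grafana:11.2.0
      - name: sidecar
        image: grafana/grafana:11.2.0
EOF
```

With the chart archive from 2.1 this could be used as `helm template kube-prometheus-stack-62.6.0.tgz | list_images`. It may miss images that are passed as command-line arguments or that belong to disabled components, so treat the result as a starting point.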
-----------------------------------------------------------------------------------
[root@k8s-master prometheus]# tar zxf kube-prometheus-stack-62.6.0.tgz
[root@k8s-master prometheus]# cd kube-prometheus-stack/
[root@k8s-master kube-prometheus-stack]# ls
Chart.lock charts Chart.yaml CONTRIBUTING.md README.md templates values.yaml
#Point the chart at the local harbor registry
[root@k8s-master kube-prometheus-stack]# vim values.yaml
imageRegistry: "reg.exam.com" #line 227 of values.yaml
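The same edit can be scripted, which helps because it is repeated for several sub-charts later. The sketch below assumes GNU sed; `set_image_registry` is a hypothetical helper, and the demo runs on a throwaway file, not on the real values.yaml.

```shell
# set_image_registry FILE REGISTRY: rewrite every "imageRegistry:" line in FILE
# so that its value becomes "REGISTRY", keeping the original indentation.
# Commented-out lines and plain "registry:" keys are left untouched.
set_image_registry() {
  sed -i -E "s|^([[:space:]]*imageRegistry:).*|\1 \"$2\"|" "$1"
}

# Demo on a temporary file that mimics the relevant part of values.yaml.
tmp=$(mktemp)
printf 'global:\n  imageRegistry: ""\n' > "$tmp"
set_image_registry "$tmp" reg.exam.com
cat "$tmp"
rm -f "$tmp"
```

Sub-charts that also use a separate `registry:` key still need that key changed by hand or with a second substitution.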
----------------------------------------------------------------------------------
#Load the images
[root@k8s-master prometheus]# docker load -i prometheus-62.6.0.tar
Loaded image: quay.io/prometheus/prometheus:v2.54.1
Loaded image: quay.io/thanos/thanos:v0.36.1
Loaded image: quay.io/prometheus/alertmanager:v0.27.0
Loaded image: quay.io/prometheus-operator/admission-webhook:v0.76.1
Loaded image: registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
Loaded image: quay.io/prometheus-operator/prometheus-operator:v0.76.1
Loaded image: quay.io/prometheus-operator/prometheus-config-reloader:v0.76.1
#Tag and push the images
[root@k8s-master prometheus]# docker tag quay.io/prometheus/prometheus:v2.54.1 reg.exam.com/prometheus/prometheus:v2.54.1
[root@k8s-master prometheus]# docker push reg.exam.com/prometheus/prometheus:v2.54.1
[root@k8s-master prometheus]# docker tag quay.io/thanos/thanos:v0.36.1 reg.exam.com/thanos/thanos:v0.36.1
[root@k8s-master prometheus]# docker push reg.exam.com/thanos/thanos:v0.36.1
[root@k8s-master prometheus]# docker tag quay.io/prometheus/alertmanager:v0.27.0 reg.exam.com/prometheus/alertmanager:v0.27.0
[root@k8s-master prometheus]# docker push reg.exam.com/prometheus/alertmanager:v0.27.0
[root@k8s-master prometheus]# docker tag quay.io/prometheus-operator/admission-webhook:v0.76.1 reg.exam.com/prometheus-operator/admission-webhook:v0.76.1
[root@k8s-master prometheus]# docker push reg.exam.com/prometheus-operator/admission-webhook:v0.76.1
[root@k8s-master prometheus]# docker tag registry.k8s.io/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6 reg.exam.com/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
[root@k8s-master prometheus]# docker push reg.exam.com/ingress-nginx/kube-webhook-certgen:v20221220-controller-v1.5.1-58-g787ea74b6
[root@k8s-master prometheus]# docker tag quay.io/prometheus-operator/prometheus-operator:v0.76.1 reg.exam.com/prometheus-operator/prometheus-operator:v0.76.1
[root@k8s-master prometheus]# docker push reg.exam.com/prometheus-operator/prometheus-operator:v0.76.1
[root@k8s-master prometheus]# docker tag quay.io/prometheus-operator/prometheus-config-reloader:v0.76.1 reg.exam.com/prometheus-operator/prometheus-config-reloader:v0.76.1
[root@k8s-master prometheus]# docker push reg.exam.com/prometheus-operator/prometheus-config-reloader:v0.76.1
#Change the registry address
[root@k8s-master prometheus]# cd kube-prometheus-stack/charts/grafana/
[root@k8s-master grafana]# pwd
/root/prometheus/kube-prometheus-stack/charts/grafana
[root@k8s-master grafana]# vim values.yaml
imageRegistry: "reg.exam.com" #line 3 of values.yaml
tag: "latest" #line 418 of values.yaml
-----------------------------------------------------------------------------
#Load the grafana image archive
[root@k8s-master prometheus]# docker load -i grafana-11.2.0.tar
Loaded image: grafana/grafana:11.2.0
Loaded image: quay.io/kiwigrid/k8s-sidecar:1.27.4
Loaded image: grafana/grafana-image-renderer:latest
Loaded image: bats/bats:v1.4.1
#Tag and push to the harbor registry
[root@k8s-master prometheus]# docker tag grafana/grafana:11.2.0 reg.exam.com/grafana/grafana:11.2.0
[root@k8s-master prometheus]# docker push reg.exam.com/grafana/grafana:11.2.0
[root@k8s-master prometheus]# docker tag quay.io/kiwigrid/k8s-sidecar:1.27.4 reg.exam.com/kiwigrid/k8s-sidecar:1.27.4
[root@k8s-master prometheus]# docker push reg.exam.com/kiwigrid/k8s-sidecar:1.27.4
[root@k8s-master prometheus]# docker tag grafana/grafana-image-renderer:latest reg.exam.com/grafana/grafana-image-renderer:latest
[root@k8s-master prometheus]# docker push reg.exam.com/grafana/grafana-image-renderer:latest
[root@k8s-master prometheus]# docker tag bats/bats:v1.4.1 reg.exam.com/bats/bats:v1.4.1
[root@k8s-master prometheus]# docker push reg.exam.com/bats/bats:v1.4.1
#Change the registry address in the config file
[root@k8s-master kube-state-metrics]# pwd
/root/prometheus/kube-prometheus-stack/charts/kube-state-metrics
[root@k8s-master kube-state-metrics]# ls
Chart.yaml README.md templates values.yaml
[root@k8s-master kube-state-metrics]# vim values.yaml
registry: reg.exam.com #line 4 of values.yaml
imageRegistry: "reg.exam.com" #line 29 of values.yaml
-------------------------------------------------------------------------------------
#Load the images
[root@k8s-master prometheus]# docker load -i kube-state-metrics-2.13.0.tar
Loaded image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0
Loaded image: quay.io/brancz/kube-rbac-proxy:v0.18.0
#Tag and push
[root@k8s-master prometheus]# docker tag registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0 reg.exam.com/kube-state-metrics/kube-state-metrics:v2.13.0
[root@k8s-master prometheus]# docker push reg.exam.com/kube-state-metrics/kube-state-metrics:v2.13.0
[root@k8s-master prometheus]# docker tag quay.io/brancz/kube-rbac-proxy:v0.18.0 reg.exam.com/brancz/kube-rbac-proxy:v0.18.0
[root@k8s-master prometheus]# docker push reg.exam.com/brancz/kube-rbac-proxy:v0.18.0
#Change the registry address in the node exporter config file
[root@k8s-master prometheus-node-exporter]# pwd
/root/prometheus/kube-prometheus-stack/charts/prometheus-node-exporter
[root@k8s-master prometheus-node-exporter]# ls
Chart.yaml ci README.md templates values.yaml
[root@k8s-master prometheus-node-exporter]# vim values.yaml
registry: reg.exam.com #line 5 of values.yaml
imageRegistry: "reg.exam.com" #line 36 of values.yaml
------------------------------------------------------------------------------
#Load the node exporter images
[root@k8s-master prometheus]# docker load -i node-exporter-1.8.2.tar
Loaded image: quay.io/prometheus/node-exporter:v1.8.2
Loaded image: quay.io/brancz/kube-rbac-proxy:v0.18.0 #already pushed
#Tag and push
[root@k8s-master prometheus]# docker tag quay.io/prometheus/node-exporter:v1.8.2 reg.exam.com/prometheus/node-exporter:v1.8.2
[root@k8s-master prometheus]# docker push reg.exam.com/prometheus/node-exporter:v1.8.2
[root@k8s-master prometheus]# docker tag quay.io/brancz/kube-rbac-proxy:v0.18.0 reg.exam.com/brancz/kube-rbac-proxy:v0.18.0
[root@k8s-master prometheus]# docker push reg.exam.com/brancz/kube-rbac-proxy:v0.18.0
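The docker tag / docker push pairs above all follow one pattern: drop the upstream registry host, if there is one, and prefix reg.exam.com. A loop can generate them. This is a sketch, not a tested mirror tool: `retarget` and `mirror` are hypothetical helpers, and by default the script only prints the commands it would run.

```shell
REGISTRY=${REGISTRY:-reg.exam.com}   # local harbor registry used in this guide
DOCKER=${DOCKER:-echo docker}        # dry run by default; set DOCKER=docker to execute

# retarget IMAGE: print IMAGE with its registry host replaced by $REGISTRY.
# The first path component counts as a registry host when it contains a dot
# or a colon, or is "localhost"; otherwise the image has no explicit host.
retarget() {
  image=$1
  first=${image%%/*}
  case $image in
    */*)
      case $first in
        *.*|*:*|localhost) image=${image#*/} ;;
      esac ;;
  esac
  printf '%s/%s\n' "$REGISTRY" "$image"
}

# mirror IMAGE...: tag each image for the local registry and push it.
mirror() {
  for src in "$@"; do
    dst=$(retarget "$src")
    $DOCKER tag "$src" "$dst"
    $DOCKER push "$dst"
  done
}

mirror quay.io/prometheus/prometheus:v2.54.1 grafana/grafana:11.2.0
```

Check the printed commands first, then rerun with `DOCKER=docker` once they look right.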
================================================================================
Create the namespace
[root@k8s-master prometheus]# kubectl create namespace kube-prometheus-stack
namespace/kube-prometheus-stack created
[root@k8s-master prometheus]# kubectl get namespaces
NAME STATUS AGE
default Active 154m
kube-node-lease Active 154m
kube-prometheus-stack Active 8s
kube-public Active 154m
kube-system Active 154m
metallb-system Active 54m
Install Prometheus with helm. Note: do not press ctrl+c while the installation is running!
[root@k8s-master prometheus]# cd kube-prometheus-stack/
# . refers to the current directory, /root/prometheus/kube-prometheus-stack
[root@k8s-master kube-prometheus-stack]# helm -n kube-prometheus-stack install kube-prometheus-stack .
NAME: kube-prometheus-stack
LAST DEPLOYED: Thu Sep 12 20:54:37 2024
NAMESPACE: kube-prometheus-stack
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace kube-prometheus-stack get pods -l "release=kube-prometheus-stack"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.
Check that all pods are running
[root@k8s-master kube-prometheus-stack]# kubectl --namespace kube-prometheus-stack get pods
NAME READY STATUS RESTARTS AGE
alertmanager-kube-prometheus-stack-alertmanager-0 2/2 Running 0 23m
kube-prometheus-stack-grafana-548c8fb6c4-29qdc 3/3 Running 0 23m
kube-prometheus-stack-kube-state-metrics-6688476957-n26gn 1/1 Running 0 23m
kube-prometheus-stack-operator-587f4b669b-8ztmk 1/1 Running 0 23m
kube-prometheus-stack-prometheus-node-exporter-j6j4t 1/1 Running 0 23m
kube-prometheus-stack-prometheus-node-exporter-pccpc 1/1 Running 0 23m
kube-prometheus-stack-prometheus-node-exporter-t77b8 1/1 Running 0 23m
prometheus-kube-prometheus-stack-prometheus-0 2/2 Running 0 23m
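Reading the READY and STATUS columns by eye works for a handful of pods. For larger listings a small filter can point out the ones that still need attention. In this sketch `not_ready` is a hypothetical helper, and the second pod in the demo input is invented to show a failing case.

```shell
# not_ready: read `kubectl get pods` output on stdin and print the name of
# every pod that is not Running or whose READY count is incomplete (e.g. 1/2).
not_ready() {
  awk 'NR > 1 {
    split($2, r, "/")
    if ($3 != "Running" || r[1] != r[2]) print $1
  }'
}

# Demo: the first data line is taken from the listing above, the second is made up.
not_ready <<'EOF'
NAME READY STATUS RESTARTS AGE
prometheus-kube-prometheus-stack-prometheus-0 2/2 Running 0 23m
example-pod-with-a-problem 1/2 Running 3 23m
EOF
```

On the cluster it could be used as `kubectl -n kube-prometheus-stack get pods | not_ready`; no output means every pod is up. Pods of finished jobs show as Completed and would be listed too.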
View the svc
[root@k8s-master kube-prometheus-stack]# kubectl -n kube-prometheus-stack get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 23m
kube-prometheus-stack-alertmanager ClusterIP 10.104.96.90 <none> 9093/TCP,8080/TCP 23m
kube-prometheus-stack-grafana ClusterIP 10.103.122.224 <none> 80/TCP 23m
kube-prometheus-stack-kube-state-metrics ClusterIP 10.104.185.222 <none> 8080/TCP 23m
kube-prometheus-stack-operator ClusterIP 10.98.25.116 <none> 443/TCP 23m
kube-prometheus-stack-prometheus ClusterIP 10.102.144.68 <none> 9090/TCP,8080/TCP 23m
kube-prometheus-stack-prometheus-node-exporter ClusterIP 10.98.117.125 <none> 9100/TCP 23m
prometheus-operated ClusterIP None <none> 9090/TCP 23m
Change how the service is exposed
[root@k8s-master kube-prometheus-stack]# kubectl -n kube-prometheus-stack edit svc kube-prometheus-stack-grafana
type: LoadBalancer #line 39 in the editor
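The interactive edit can also be done in one command with `kubectl patch`, which is easier to repeat or script. A minimal sketch: `expose_lb` is a hypothetical wrapper, and by default it only prints the command.

```shell
KUBECTL=${KUBECTL:-echo kubectl}   # dry run by default; set KUBECTL=kubectl to execute

# expose_lb NAMESPACE SERVICE: switch a Service to type LoadBalancer without
# opening an editor.
expose_lb() {
  $KUBECTL -n "$1" patch svc "$2" -p '{"spec":{"type":"LoadBalancer"}}'
}

expose_lb kube-prometheus-stack kube-prometheus-stack-grafana
```

When executed for real, note the single quotes around the JSON patch; the dry run prints it without them.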
What each svc does
alertmanager-operated: alert management
kube-prometheus-stack-grafana: displays the metrics collected by Prometheus
kube-prometheus-stack-prometheus-node-exporter: collects node-level metrics
kube-prometheus-stack-prometheus: the main Prometheus server
2.3 Log in to grafana
View the secret that holds the grafana password
[root@k8s-master helm]# kubectl -n kube-prometheus-stack get secrets kube-prometheus-stack-grafana -o yaml
apiVersion: v1
data:
admin-password: cHJvbS1vcGVyYXRvcg==
admin-user: YWRtaW4=
ldap-toml: ""
kind: Secret
metadata:
annotations:
meta.helm.sh/release-name: kube-prometheus-stack
meta.helm.sh/release-namespace: kube-prometheus-stack
creationTimestamp: "2024-09-12T12:54:47Z"
labels:
app.kubernetes.io/instance: kube-prometheus-stack
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: grafana
app.kubernetes.io/version: 11.2.0
helm.sh/chart: grafana-8.5.1
name: kube-prometheus-stack-grafana
namespace: kube-prometheus-stack
resourceVersion: "16943"
uid: d19640ae-4b79-4013-ba03-e039fc98b493
type: Opaque
Decode the password
[root@k8s-master helm]# echo -n "cHJvbS1vcGVyYXRvcg==" | base64 -d
prom-operator #password
[root@k8s-master helm]# echo "YWRtaW4=" | base64 -d
admin #user
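Copying the base64 strings by hand is easy to get wrong, so the lookup and the decode can be combined. The sketch below works on the YAML form of the secret shown above; `secret_field` is a hypothetical helper and does only naive line matching, not real YAML parsing.

```shell
# secret_field KEY: read Secret YAML (as printed by `kubectl get secret -o yaml`)
# on stdin and print the base64-decoded value stored under KEY.
secret_field() {
  awk -v key="$1:" '$1 == key { print $2; exit }' | base64 -d
}

# Demo with the two fields from the secret above.
secret_field admin-password <<'EOF'
data:
  admin-password: cHJvbS1vcGVyYXRvcg==
  admin-user: YWRtaW4=
EOF
echo
```

kubectl can also extract a single field directly, for example `kubectl -n kube-prometheus-stack get secret kube-prometheus-stack-grafana -o jsonpath='{.data.admin-password}' | base64 -d`.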
[root@k8s-master helm]# kubectl -n kube-prometheus-stack get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 29m
kube-prometheus-stack-alertmanager ClusterIP 10.104.96.90 <none> 9093/TCP,8080/TCP 29m
kube-prometheus-stack-grafana LoadBalancer 10.103.122.224 172.25.250.50 80:31471/TCP 29m
kube-prometheus-stack-kube-state-metrics ClusterIP 10.104.185.222 <none> 8080/TCP 29m
kube-prometheus-stack-operator ClusterIP 10.98.25.116 <none> 443/TCP 29m
kube-prometheus-stack-prometheus ClusterIP 10.102.144.68 <none> 9090/TCP,8080/TCP 29m
kube-prometheus-stack-prometheus-node-exporter ClusterIP 10.98.117.125 <none> 9100/TCP 29m
prometheus-operated ClusterIP None <none> 9090/TCP 29m
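#Open the assigned IP in a browser
2.4 Import dashboards
Official dashboard templates: Grafana dashboards | Grafana Labs
If a dashboard does not work, try another one
2.5 Access the Prometheus server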
[root@k8s-master helm]# kubectl -n kube-prometheus-stack edit svc kube-prometheus-stack-prometheus
type: LoadBalancer #line 48 in the editor
[root@k8s-master helm]# kubectl -n kube-prometheus-stack get svc kube-prometheus-stack-prometheus
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-prometheus-stack-prometheus LoadBalancer 10.102.144.68 172.25.250.51 9090:30607/TCP,8080:32132/TCP 43m
Open 172.25.250.51:9090 in a browser
3 Monitoring example
3.1 Create a monitored project
[root@k8s-master ~]# mkdir test
[root@k8s-master ~]# cd test/
[root@k8s-master test]# ls
nginx-18.1.11.tgz nginx-exporter-1.3.0-debian-12-r2.tar
[root@k8s-master test]# tar zxf nginx-18.1.11.tgz
[root@k8s-master test]# cd nginx/
Edit the chart to enable monitoring (the excerpts below start at lines 925, 1015, and 1046 of values.yaml)
[root@k8s-master nginx]# vim values.yaml
metrics:
  ## @param metrics.enabled Start a Prometheus exporter sidecar container
  ##
  enabled: true #change to true
...
  serviceMonitor:
    ## @param metrics.serviceMonitor.enabled Creates a Prometheus Operator ServiceMonitor (also requires `metrics.enabled` to be `true`)
    ##
    enabled: true #change to true
    ## @param metrics.serviceMonitor.namespace Namespace in which Prometheus is running
    ##
    namespace: "kube-prometheus-stack" #set the namespace
    ## @param metrics.serviceMonitor.jobLabel The name of the label on the target service to use as the job name in prometheus.
    ##
...
    labels:
      release: kube-prometheus-stack #add the label used to select this monitor
#View the labels
[root@k8s-master nginx]# kubectl -n kube-prometheus-stack get servicemonitors.monitoring.coreos.com --show-labels
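The four values changed in values.yaml can also be passed on the command line with `--set`, which leaves the chart files untouched. The parameter names come from the `@param` comments in the excerpt; `install_with_metrics` is a hypothetical wrapper that only prints the helm command unless `HELM=helm` is set.

```shell
HELM=${HELM:-echo helm}   # dry run by default; set HELM=helm to execute

# install_with_metrics RELEASE CHART_DIR: install the chart with the exporter
# sidecar and the ServiceMonitor enabled, labelled for kube-prometheus-stack.
install_with_metrics() {
  $HELM install "$1" "$2" \
    --set metrics.enabled=true \
    --set metrics.serviceMonitor.enabled=true \
    --set metrics.serviceMonitor.namespace=kube-prometheus-stack \
    --set metrics.serviceMonitor.labels.release=kube-prometheus-stack
}

install_with_metrics howe .
```

The `release` label should match the label shown by the `--show-labels` listing; otherwise Prometheus will probably not pick up the new ServiceMonitor.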
Install the chart. Before installing, make sure the images have been pushed to the registry
[root@k8s-master nginx]# ls
nginx-1.27.1-debian-12-r2.tar
#First image: nginx
[root@k8s-master nginx]# docker load -i nginx-1.27.1-debian-12-r2.tar
30f5b1069b7f: Loading layer [==================================================>] 190.1MB/190.1MB
Loaded image: bitnami/nginx:1.27.1-debian-12-r2
[root@k8s-master nginx]# docker tag bitnami/nginx:1.27.1-debian-12-r2 reg.exam.com/bitnami/nginx:1.27.1-debian-12-r2
[root@k8s-master nginx]# docker push reg.exam.com/bitnami/nginx:1.27.1-debian-12-r2
#Second image: nginx exporter
[root@k8s-master nginx]# docker load -i nginx-exporter-1.3.0-debian-12-r2.tar
016ff07f0ae3: Loading layer [==================================================>] 149.3MB/149.3MB
Loaded image: bitnami/nginx-exporter:1.3.0-debian-12-r2
[root@k8s-master nginx]# docker tag bitnami/nginx-exporter:1.3.0-debian-12-r2 reg.exam.com/bitnami/nginx-exporter:1.3.0-debian-12-r2
[root@k8s-master nginx]# docker push reg.exam.com/bitnami/nginx-exporter:1.3.0-debian-12-r2
#Install the chart
[root@k8s-master nginx]# helm install howe .
NAME: howe
LAST DEPLOYED: Thu Sep 12 21:52:15 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: nginx
CHART VERSION: 18.1.11
APP VERSION: 1.27.1
[root@k8s-master nginx]# kubectl get pods
NAME READY STATUS RESTARTS AGE
howe-nginx-54c97cb888-x5hhh 2/2 Running 0 21s
[root@k8s-master nginx]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 4h56m
howe-nginx LoadBalancer 10.102.161.61 172.25.250.52 80:30614/TCP,443:31390/TCP,9113:32254/TCP 30s
[root@k8s-master nginx]# curl 172.25.250.52
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
Load test:
[root@k8s-master nginx]# ab -c 5 -n 100 http://172.25.250.52/index.html
This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 172.25.250.52 (be patient).....done
Server Software: nginx
Server Hostname: 172.25.250.52
Server Port: 80
Document Path: /index.html
Document Length: 615 bytes
Concurrency Level: 5
Time taken for tests: 0.033 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 87000 bytes
HTML transferred: 61500 bytes
Requests per second: 2991.15 [#/sec] (mean)
Time per request: 1.672 [ms] (mean)
Time per request: 0.334 [ms] (mean, across all concurrent requests)
Transfer rate: 2541.31 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 0.1 0 1
Processing: 0 1 0.9 1 6
Waiting: 0 1 0.8 1 6
Total: 1 1 0.9 1 6
ERROR: The median and mean for the initial connection time are more than twice the standard
deviation apart. These results are NOT reliable.
Percentage of the requests served within a certain time (ms)
50% 1
66% 1
75% 1
80% 2
90% 2
95% 3
98% 6
99% 6
100% 6 (longest request)
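After the load test, the traffic should be visible in Prometheus. One way to check without the web UI is the Prometheus HTTP API. This is a sketch under assumptions: `prom_query` is a hypothetical wrapper, the metric name `nginx_http_requests_total` is what the nginx exporter is expected to expose and may differ in your version, and by default the script only prints the curl command.

```shell
PROM_URL=${PROM_URL:-http://172.25.250.51:9090}   # Prometheus address from section 2.5
CURL=${CURL:-echo curl}                            # dry run by default; set CURL=curl to execute

# prom_query EXPR: run an instant PromQL query through the Prometheus HTTP API.
prom_query() {
  $CURL -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Per-second request rate over the last minute, as seen by the nginx exporter.
prom_query 'rate(nginx_http_requests_total[1m])'
```

With `CURL=curl` the response is JSON. An empty result usually means the target has not been scraped yet or the ServiceMonitor label does not match.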