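I. Overview
By default, the kube-prometheus monitoring stack deployed in a Kubernetes cluster stores its monitoring data in an emptyDir volume. An emptyDir shares the lifecycle of its pod: the data survives a container restart, but it is lost whenever the pod is deleted or rescheduled, so it is not suitable for retaining metrics and should be replaced with persistent storage. Because kube-prometheus runs Prometheus as a StatefulSet, persistence is configured here through a StorageClass.
A StorageClass serves the following main purposes:
Dynamic volume provisioning: a StorageClass creates volumes on demand according to its defined attributes, so volumes do not have to be created and managed by hand.
Volume attribute management: a StorageClass defines attributes of the volumes it creates, such as the storage backend, reclaim policy, and mount options, so that they better match the application's storage needs. The capacity and access modes are requested by the PVC.
Storage resource management: StorageClasses group storage resources into categories, making it easy for developers to pick one that fits the application.
Each StorageClass has a provisioner, which determines which volume plugin is used to provision PVs. This field is required.
Taking NFS as an example: using NFS requires an nfs-client automatic provisioning program, referred to as the provisioner. It uses an already configured NFS server to create persistent volumes (PVs) automatically.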
II. Deploy NFS
1. Install NFS
yum -y install nfs-utils rpcbind
2. Configure the NFS export
echo "/k8s-data 192.168.1.0/24(rw,sync,no_root_squash)" > /etc/exports
3. Create the directory
mkdir /k8s-data
4. Start the NFS services
systemctl start rpcbind
systemctl start nfs-server
5. Enable the NFS services at boot
systemctl enable rpcbind
systemctl enable nfs-server
6. Apply the NFS export configuration
exportfs -arv
7. Create the data directory
mkdir -p /k8s-data/prometheus/prometheus-data
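Before moving on, it can help to confirm from a Kubernetes node that the export is visible. The helper below is only a sketch and not part of the original steps: `has_export` is a hypothetical function name, and it does nothing more than scan the text printed by `showmount -e` (shipped with nfs-utils) for a line whose first field equals the expected path.

```shell
# has_export PATH: reads `showmount -e <server>` output on stdin and
# succeeds (exit status 0) if PATH is listed as an exported directory.
# Hypothetical helper for illustration only.
has_export() {
  awk -v path="$1" '$1 == path { found = 1 } END { exit found ? 0 : 1 }'
}
```

On a node with nfs-utils installed, `showmount -e 192.168.1.120 | has_export /k8s-data && echo exported` should print `exported` once step 6 has been run on the NFS server.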
III. Create the StorageClass provisioner
3.1 Create the ServiceAccount
cat <<END > prometheus-nfs-serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus-nfs-provisioner
  namespace: monitoring
END
3.2 Bind the ServiceAccount to a role to grant it permissions
cat <<END > prometheus-nfs-clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-nfs-provisioner-clusterrolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus-nfs-provisioner
  namespace: monitoring
END
3.3 Deploy the NFS client provisioner
cat <<END > prometheus-nfs-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  name: prometheus-nfs-provisioner
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: prometheus-nfs-provisioner
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: prometheus-nfs-provisioner
    spec:
      serviceAccountName: prometheus-nfs-provisioner  # ServiceAccount the pod runs as
      containers:
        - name: nfs-provisioner
          image: registry.cn-beijing.aliyuncs.com/mydlq/nfs-subdir-external-provisioner:v4.0.0
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: prometheus-nfs-client
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: k8s.prometheus/nfs  # name of the NFS provisioner
            - name: NFS_SERVER
              value: 192.168.1.120  # NFS server address
            - name: NFS_PATH
              value: /k8s-data/prometheus/prometheus-data  # NFS shared directory
      volumes:
        - name: prometheus-nfs-client
          nfs:
            server: 192.168.1.120  # NFS server address
            path: /k8s-data/prometheus/prometheus-data  # NFS shared directory
END
3.4 Create the StorageClass object
cat <<END > prometheus-nfs-storageclass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: prometheus-data-db
provisioner: k8s.prometheus/nfs  # name of the NFS provisioner
reclaimPolicy: Retain  # keep the PV and its data files when the PVC is deleted
END
# Note: the provisioner value must match the PROVISIONER_NAME environment variable set in the NFS client provisioner Deployment.
3.5 Apply the manifests
1. Create the ServiceAccount
kubectl apply -f prometheus-nfs-serviceaccount.yaml
2. Bind the ServiceAccount to the role
kubectl apply -f prometheus-nfs-clusterrolebinding.yaml
3. Deploy the NFS client provisioner
kubectl apply -f prometheus-nfs-deployment.yaml
4. Create the StorageClass object
kubectl apply -f prometheus-nfs-storageclass.yaml
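The provisioner Deployment may need a moment to pull its image and mount the NFS share, so it can be convenient to block until the rollout finishes rather than polling by hand. The wrapper below is a sketch, not part of the original procedure: the function name `wait_for_provisioner` and the default timeout of 120s are assumptions, while `kubectl rollout status` is the standard subcommand for waiting on a Deployment.

```shell
# wait_for_provisioner [TIMEOUT]: waits until the provisioner Deployment
# from section 3.3 has finished rolling out, or fails after TIMEOUT.
# Hypothetical helper; the 120s default is an arbitrary choice.
wait_for_provisioner() {
  local timeout="${1:-120s}"
  kubectl -n monitoring rollout status deployment/prometheus-nfs-provisioner --timeout="$timeout"
}
```

Calling `wait_for_provisioner` right after step 3 returns a non-zero status if the pod never becomes ready, which is usually a hint to check the NFS server address and export path.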
3.6 Check that everything is running
1. View the ServiceAccount
[root@master01 ~]# kubectl get sa -n monitoring
NAME                         SECRETS   AGE
prometheus-nfs-provisioner   0         20s
2. View the role binding (ClusterRoleBinding is cluster-scoped, so no namespace is given)
[root@master01 ~]# kubectl get clusterrolebinding prometheus-nfs-provisioner-clusterrolebinding
NAME                                            ROLE                        AGE
prometheus-nfs-provisioner-clusterrolebinding   ClusterRole/cluster-admin   10s
3. View the pods
[root@master01 ~]# kubectl get pod -n monitoring
NAME                                          READY   STATUS    RESTARTS   AGE
alertmanager-main-0                           2/2     Running   0          2d
blackbox-exporter-8564b84f6b-vrx4t            3/3     Running   0          2d
grafana-7c96b97d5b-zg59q                      1/1     Running   0          2d
kube-state-metrics-f9f5584ff-wb9ql            3/3     Running   0          2d
prometheus-nfs-provisioner-6ffdcff49b-m4vdp   1/1     Running   0          12s
node-exporter-9pgqd                           2/2     Running   0          2d
node-exporter-ccgfs                           2/2     Running   0          2d
node-exporter-hkdqt                           2/2     Running   0          2d
node-exporter-vvs6f                           2/2     Running   0          2d
prometheus-adapter-78965cf996-tvx6b           1/1     Running   0          2d
prometheus-k8s-0                              2/2     Running   0          2d
prometheus-operator-57bb88c6d-n59zj           2/2     Running   0          2d
4. View the StorageClass object (also cluster-scoped)
[root@master01 ~]# kubectl get sc
NAME                 PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
prometheus-data-db   k8s.prometheus/nfs   Retain          Immediate           false                  15s
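To confirm that the StorageClass really provisions volumes before pointing Prometheus at it, a throwaway claim can be created first. The following is a sketch under stated assumptions: the claim name `test-nfs-claim`, the file name `test-nfs-pvc.yaml`, and the 1Mi request are made up for illustration; only the StorageClass name and the namespace come from the steps above.

```shell
# Write a minimal PVC manifest that requests a volume from the
# prometheus-data-db StorageClass. Names marked "hypothetical" are
# illustrative and do not appear in the original procedure.
cat <<END > test-nfs-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-nfs-claim  # hypothetical name
  namespace: monitoring
spec:
  storageClassName: prometheus-data-db
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Mi  # arbitrary small size for the test
END
# Then, against the cluster (not executed by this snippet):
#   kubectl apply -f test-nfs-pvc.yaml
#   kubectl get pvc test-nfs-claim -n monitoring
#   kubectl delete -f test-nfs-pvc.yaml
```

If the provisioner is working, the claim should reach the Bound status shortly after it is applied. Because the StorageClass uses reclaimPolicy Retain, deleting the test claim is expected to leave its PV and its directory on the NFS server behind, so those would need to be cleaned up by hand.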
IV. Modify the kube-prometheus deployment configuration
4.1 Add the persistent storage settings to the Prometheus manifest
# Add the following under spec in the Prometheus manifest
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: prometheus-data-db
        resources:
          requests:
            storage: 50Gi
4.2 The complete Prometheus manifest
vi prometheus-prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 2.41.0
  name: k8s
  namespace: monitoring
spec:
  retention: 180d
  alerting:
    alertmanagers:
    - apiVersion: v2
      name: alertmanager-main
      namespace: monitoring
      port: web
  enableFeatures: []
  externalLabels: {}
  image: quay.io/prometheus/prometheus:v2.41.0
  storage:  # newly added persistent storage settings
    volumeClaimTemplate:
      spec:
        storageClassName: prometheus-data-db
        resources:
          requests:
            storage: 50Gi
  nodeSelector:
    monitoring: prometheus
  podMetadata:
    labels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
      app.kubernetes.io/version: 2.41.0
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  probeNamespaceSelector: {}
  probeSelector: {}
  replicas: 1
  resources:
    requests:
      memory: 400Mi
  ruleNamespaceSelector: {}
  ruleSelector: {}
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: 2.41.0
4.3 Apply the persistent storage configuration
1. Apply the updated manifest
[root@master01 ~]# kubectl apply -f prometheus-prometheus.yaml
2. Check the pod status
[root@master01 ~]# kubectl get pod -n monitoring
NAME                                          READY   STATUS    RESTARTS   AGE
alertmanager-main-0                           2/2     Running   0          2d
blackbox-exporter-8564b84f6b-vrx4t            3/3     Running   0          2d
grafana-7c96b97d5b-zg59q                      1/1     Running   0          2d
kube-state-metrics-f9f5584ff-wb9ql            3/3     Running   0          2d
prometheus-nfs-provisioner-6ffdcff49b-m4vdp   1/1     Running   0          50s
node-exporter-9pgqd                           2/2     Running   0          2d
node-exporter-ccgfs                           2/2     Running   0          2d
node-exporter-hkdqt                           2/2     Running   0          2d
node-exporter-vvs6f                           2/2     Running   0          2d
prometheus-adapter-78965cf996-tvx6b           1/1     Running   0          2d
prometheus-k8s-0                              2/2     Running   0          10s
prometheus-operator-57bb88c6d-n59zj           2/2     Running   0          2d
4.4 Inspect the persistent storage
1. View the PV (cluster-scoped, so no namespace is given)
[root@master01 ~]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                           STORAGECLASS         REASON   AGE
pvc-d62fbce6-0124-4e76-8128-a71dc5ac450c   50Gi       RWO            Delete           Bound    monitoring/prometheus-k8s-db-prometheus-k8s-0   prometheus-data-db            10s
2. View the PVC
[root@master01 ~]# kubectl get pvc -n monitoring
NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS         AGE
prometheus-k8s-db-prometheus-k8s-0   Bound    pvc-d62fbce6-0124-4e76-8128-a71dc5ac450c   50Gi       RWO            prometheus-data-db   12s
3. View the Prometheus data files on the NFS server
[root@nfs ~]# ll /k8s-data/prometheus/prometheus-data/monitoring-prometheus-k8s-db-prometheus-k8s-0-pvc-d62fbce6-0124-4e76-8128-a71dc5ac450c/prometheus-db
total 20
drwxr-xr-x 3 root root    68 Jan  4 19:14 01HKA2XX2RZHZQY6HJJSNF891R
drwxr-xr-x 3 root root    68 Jan  4 21:00 01HKA8Y9KEAH9JS2GBG1VSHGRF
drwxr-xr-x 3 root root    68 Jan  4 23:00 01HKAFT0VE6W2KBGTVR43G08DS
drwxr-xr-x 3 root root    68 Jan  4 01:00 01HKAPNR3EMXM9Y64QJH40EJW7
drwxr-xr-x 3 root root    68 Jan  4 03:00 01HKAXHFBFNGBAF3HVRTA10WDH
drwxr-xr-x 3 root root    68 Jan  4 05:00 01HKB4D6KFJJNKSEMERMXFXSG2
drwxr-xr-x 3 root root    68 Jan  4 07:00 01HKBB8XVJFABX988Y4PZM6EC0
drwxr-xr-x 3 root root    68 Jan  4 09:00 01HKBJ4N3F1TEVTJ9KC1PFGAD3
drwxr-xr-x 3 root root    68 Jan  4 11:00 01HKBS0CBE56BZCYT7XA0TGFQ9
drwxr-xr-x 2 root root    34 Jan  4 11:00 chunks_head
-rw-r--r-- 1 root root     0 Jan  4 16:14 lock
-rw-r--r-- 1 root root 20001 Jan  4 11:51 queries.active
drwxr-xr-x 3 root root    81 Jan  4 11:00 wal
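The long directory name in the listing appears to follow from two naming rules: a StatefulSet names each claim `<volumeClaimTemplate name>-<pod name>`, and nfs-subdir-external-provisioner, with its default settings, creates one subdirectory per volume named `<namespace>-<pvcName>-<pvName>` under the shared path. The helper below is a small illustration of that second rule; the function name `pv_dir` is an assumption made for this sketch, and the values passed in are taken from the output above.

```shell
# pv_dir NAMESPACE PVC_NAME PV_NAME: prints the subdirectory name the
# provisioner is expected to create under NFS_PATH for one volume,
# assuming its default naming scheme. Hypothetical helper.
pv_dir() {
  printf '%s-%s-%s\n' "$1" "$2" "$3"
}

pv_dir monitoring prometheus-k8s-db-prometheus-k8s-0 pvc-d62fbce6-0124-4e76-8128-a71dc5ac450c
```

Run as written, this prints `monitoring-prometheus-k8s-db-prometheus-k8s-0-pvc-d62fbce6-0124-4e76-8128-a71dc5ac450c`, which matches the directory inspected in step 3. Knowing the pattern makes it easier to locate the data of a given claim on the NFS server.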