1. Installing the Prometheus Operator
1.1 Base environment
OS | k8s | docker | helm | operator | adapter |
---|---|---|---|---|---|
CentOS 7.7 | 1.23.1 | 20.10.12 | 3.8.1 | 0.69.1 | 0.11.2 |
1.2 Install the CRDs
1.2.1 Download bundle.yaml from GitHub
https://github.com/prometheus-operator/prometheus-operator/blob/v0.69.1/
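1.2.2 Create the CRDs
Copy bundle.yaml to the cluster's control-plane node and create the resources there (k is an alias for kubectl). Use create, not apply: a plain apply fails on this bundle, most likely because the CRDs exceed the annotation size limit that client-side apply relies on.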
k create -f bundle.yaml
1.2.3 Verify that the operator was created
[root@xuegod63 ]# kubectl wait --for=condition=Ready pods -l app.kubernetes.io/name=prometheus-operator -n default
pod/prometheus-operator-7bc6f6ddd4-9znz9 condition met
The bundle uses the upstream image by default. If that image cannot be pulled, the pod will not start.
[root@xuegod63 prometheus-operator]# k get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
my-release-prometheus-adapter 1/1 1 1 121m
nfs-provisioner 1/1 1 1 71d
nginx 4/4 4 4 159m
prometheus-operator 1/1 1 1 141m
In that case, edit the prometheus-operator Deployment and change the image to
quay.io/prometheus-operator/prometheus-operator:v0.69.1
1.3 Install a test service to scrape metrics from (nginx)
1.3.1 Create the nginx ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx
data:
  nginx.conf: |-
    user nginx;
    worker_processes auto;
    error_log /var/log/nginx/error.log notice;
    pid /var/run/nginx.pid;
    events {
      worker_connections 1024;
    }
    http {
      include /etc/nginx/mime.types;
      default_type application/octet-stream;
      log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';
      access_log /var/log/nginx/access.log main;
      sendfile on;
      keepalive_timeout 65;
      include /etc/nginx/conf.d/*.conf;
      server {
        listen 8080;
        location /stub_status {
          stub_status;
        }
      }
    }
1.3.2 Create the nginx Deployment
The pod runs two containers. The nginx container listens on port 8080 so that the nginx exporter can read stub_status from it. The nginx exporter container listens on port 9113 and exposes the collected data to Prometheus.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: nginx
        ports:
        - name: web
          containerPort: 8080
        volumeMounts:
        - mountPath: /etc/nginx/nginx.conf
          name: nginx
          subPath: nginx.conf
      - name: nginx-exporter
        image: nginx/nginx-prometheus-exporter
        ports:
        - containerPort: 9113
        args:
        - /usr/bin/nginx-prometheus-exporter
        - -nginx.scrape-uri=http://127.0.0.1:8080/stub_status
      volumes:
      - configMap:
          name: nginx
        name: nginx
Once the Service from 1.3.3 exists, you can check that nginx is serving stub_status with the following command:
curl <nginx-svc-ip>:8080/stub_status
[root@xuegod63 prometheus-operator]# curl 10.107.67.10:8080/stub_status
Active connections: 2
server accepts handled requests
2 2 53
Reading: 0 Writing: 1 Waiting: 1
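To make the link between this output and the exporter's metrics concrete, here is a minimal Python sketch that parses the stub_status text into a dictionary. The function name `parse_stub_status` is hypothetical, and the metric names are the ones the nginx exporter is generally understood to publish; treat the mapping as an illustration rather than the exporter's actual code.

```python
import re

# Sample taken from the curl output above.
STUB_STATUS = """Active connections: 2
server accepts handled requests
 2 2 53
Reading: 0 Writing: 1 Waiting: 1
"""


def parse_stub_status(text: str) -> dict:
    """Map stub_status text to exporter-style metric names (illustrative)."""
    active = int(re.search(r"Active connections:\s+(\d+)", text).group(1))
    # The third line carries three counters: accepts, handled, requests.
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    accepts, handled, requests = map(int, counters.groups())
    states = re.search(
        r"Reading:\s+(\d+)\s+Writing:\s+(\d+)\s+Waiting:\s+(\d+)", text
    )
    reading, writing, waiting = map(int, states.groups())
    return {
        "nginx_connections_active": active,
        "nginx_connections_accepted": accepts,
        "nginx_connections_handled": handled,
        "nginx_http_requests_total": requests,
        "nginx_connections_reading": reading,
        "nginx_connections_writing": writing,
        "nginx_connections_waiting": waiting,
    }


metrics = parse_stub_status(STUB_STATUS)
print(metrics["nginx_connections_accepted"], metrics["nginx_http_requests_total"])
```

The accepted, handled, and requests values are cumulative counters, which is why the later sections either compare them against a large target or turn them into a rate.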
1.3.3 Create the nginx Service to expose the pod
kind: Service
apiVersion: v1
metadata:
  name: nginx
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
  - port: 9113
    name: metrics
1.4 Create a ServiceMonitor to monitor the nginx service
A ServiceMonitor is Prometheus scrape configuration expressed as a Kubernetes resource. Add one for each service you want to monitor.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: metrics
    path: /metrics
1.5 Create Prometheus
1.5.1 Create RBAC and the Prometheus instance
This step creates a ServiceAccount, a ClusterRole, a ClusterRoleBinding, and a Prometheus resource.
Because the Prometheus Operator is installed, Prometheus itself can be deployed by declaring a Kubernetes resource.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: true
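The label selectors in the manifests so far form a chain: the Prometheus resource selects ServiceMonitors labelled team: frontend, the ServiceMonitor selects Services labelled app: example-app, and the endpoint is the Service port named metrics. The Python sketch below models only that matchLabels logic. The helper names `matches` and `scrape_port` are hypothetical, and namespace selection and matchExpressions are deliberately ignored.

```python
def matches(match_labels: dict, labels: dict) -> bool:
    """True when every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


# Simplified views of the objects defined in sections 1.3.3, 1.4 and 1.5.1.
service = {"labels": {"app": "example-app"}, "ports": {"web": 8080, "metrics": 9113}}
service_monitor = {
    "labels": {"team": "frontend"},
    "selector": {"app": "example-app"},
    "endpoint_port": "metrics",
}
prometheus_selector = {"team": "frontend"}  # spec.serviceMonitorSelector.matchLabels


def scrape_port(prom_selector: dict, monitor: dict, svc: dict):
    """Return the port Prometheus would scrape, or None if the chain breaks."""
    if not matches(prom_selector, monitor["labels"]):
        return None  # Prometheus ignores this ServiceMonitor
    if not matches(monitor["selector"], svc["labels"]):
        return None  # the ServiceMonitor does not select this Service
    return svc["ports"].get(monitor["endpoint_port"])


print(scrape_port(prometheus_selector, service_monitor, service))
```

If a target does not show up in Prometheus, walking this chain by hand (labels on the ServiceMonitor, labels on the Service, the port name) is usually the quickest check.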
1.5.2 Create the Prometheus Service
apiVersion: v1
kind: Service
metadata:
  name: prometheus
spec:
  type: NodePort
  ports:
  - name: web
    nodePort: 30900
    port: 9090
    protocol: TCP
    targetPort: web
  selector:
    prometheus: prometheus
Use the following command to watch the status of the Prometheus resource.
kubectl get -n default prometheus prometheus -w
NAME VERSION DESIRED READY RECONCILED AVAILABLE AGE
prometheus 1 True True 175m
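1.6 Set up pod monitoring
There are two ways to monitor the pods. One is a PodMonitor that selects pods directly by label, which works without a Service. The other is to set the **serviceMonitorSelector** field on the Prometheus resource so that it selects, by label, the ServiceMonitor created above.
Pick one of the two.
1.6.1 Select pods directly by label
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  podMetricsEndpoints:
  - port: metrics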
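1.6.2 Select the monitor by label in the Prometheus resource
Note that the manifest below uses podMonitorSelector, which selects PodMonitor objects such as the one in 1.6.1. A ServiceMonitor is selected with serviceMonitorSelector instead, as in the Prometheus resource from 1.5.1.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  podMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false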
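Prometheus is already exposed through the NodePort, so you can open its web UI and check the Targets page to see whether the pod has been discovered.
2. Installing the Prometheus Adapter
Documentation on GitHub: https://github.com/kubernetes-sigs/prometheus-adapter
2.1 Install the adapter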
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install my-release prometheus-community/prometheus-adapter
After a successful install, a pod like the following is running:
[root@xuegod63 ]# k get po
NAME READY STATUS RESTARTS AGE
my-release-prometheus-adapter-84549595d8-cmkfg 1/1 Running 0 163m
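If the image cannot be pulled, search for k8s-prometheus-adapter and try the results.
directxman12/k8s-prometheus-adapter-amd64 is known to work.
The adapter converts Prometheus metrics into metrics that Kubernetes can consume. They become reachable through the Kubernetes API, for example: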
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/*/nginx_connections_accepted" | jq
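The path passed to kubectl get --raw follows a fixed layout: API group and version, namespace, resource type, object name (or * for all), then the metric name. As a small illustration, the hypothetical Python helper below assembles such a path; it only builds the string and does not talk to the cluster.

```python
def custom_metric_path(namespace: str, resource: str, metric: str,
                       name: str = "*", version: str = "v1beta1") -> str:
    """Build a custom.metrics.k8s.io path for a namespaced resource metric."""
    return (
        f"/apis/custom.metrics.k8s.io/{version}"
        f"/namespaces/{namespace}/{resource}/{name}/{metric}"
    )


# Same path as the kubectl command above.
print(custom_metric_path("default", "services", "nginx_connections_accepted"))
```

Swapping `services` for `pods` gives the per-pod form of the same query.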
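2.2 Create a PrometheusRule for custom metrics
3. Creating an HPA that scales on a specific metric
3.1 Create the HPA
This manifest uses an older HPA API version, autoscaling/v2beta1, which Kubernetes 1.23 still serves but which was removed in 1.25. An autoscaling/v2 definition appears at the end of this article.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: nginx
      metricName: nginx_connections_accepted
      targetValue: 4000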
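For comparison, the Object metric above can be rewritten in the autoscaling/v2 layout, where the metric name moves under metric, the referenced object under describedObject, and the threshold under target. The Python sketch below performs that reshaping on a dictionary and prints JSON, which kubectl also accepts. The function name is hypothetical, and defaulting apiVersion to v1 is an assumption that holds for core objects such as a Service.

```python
import json


def object_metric_v2beta1_to_v2(metric: dict) -> dict:
    """Reshape a v2beta1 Object metric entry into the autoscaling/v2 layout."""
    old = metric["object"]
    described = dict(old["target"])
    # Assumption: the referenced object lives in the core API group.
    described.setdefault("apiVersion", "v1")
    return {
        "type": "Object",
        "object": {
            "metric": {"name": old["metricName"]},
            "describedObject": described,
            "target": {"type": "Value", "value": old["targetValue"]},
        },
    }


# The metrics entry from the manifest above.
old_metric = {
    "type": "Object",
    "object": {
        "target": {"kind": "Service", "name": "nginx"},
        "metricName": "nginx_connections_accepted",
        "targetValue": 4000,
    },
}

print(json.dumps(object_metric_v2beta1_to_v2(old_metric), indent=2))
```

The rest of the HPA spec (scaleTargetRef, minReplicas, maxReplicas) keeps the same shape between the two API versions.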
3.2 Load test and check whether the Deployment scales out
ab -n 1000 -c 1000 10.107.67.10:9113/metrics
[root@xuegod63 ]# k get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-hpa Deployment/nginx 3194/4k 1 10 4 91m
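To read the TARGETS and REPLICAS columns, it helps to recall the replica formula documented for the HPA: desired = ceil(currentReplicas × currentValue / targetValue), skipped when the ratio is within a tolerance of 1 (0.1 by default). The Python sketch below applies that formula; the function name is hypothetical, and stabilization windows and pod readiness handling are left out.

```python
import math


def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, min_replicas: int,
                     max_replicas: int, tolerance: float = 0.1) -> int:
    """Simplified HPA replica calculation for a single metric."""
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no change
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# Values from the `k get hpa` output above: 3194/4k with 4 replicas.
print(desired_replicas(4, 3194, 4000, min_replicas=1, max_replicas=10))
```

With the values shown, the result stays at 4, consistent with the output. Since nginx_connections_accepted is a cumulative counter, its value only grows while a pod lives, which is one reason a rate-based metric like the one in section 4 tends to be a better scaling signal.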
4. Editing the adapter ConfigMap to add a custom (computed) metric
4.1 Edit the adapter configuration and add the custom metric
k edit cm my-release-prometheus-adapter
apiVersion: v1
data:
  config.yaml: |-
    rules:
    - seriesQuery: '{__name__=~"^nginx_http_requests.*_total$",container!="POD",namespace!="",pod!=""}'
      resources:
        overrides:
          namespace:
            resource: namespace
          pod:
            resource: pod
      name:
        matches: (.*)_total
        as: "${1}_qps"
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[30s])) by (<<.GroupBy>>)
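prometheus-adapter does not reload its configuration dynamically. Delete the adapter pod (for example kubectl delete pod my-release-prometheus-adapter-xx) so that the replacement pod starts with the latest configuration.
The rule above exposes sum(rate(nginx_http_requests_total[30s])) by (pod) under the name nginx_http_requests_qps, which can then be queried through the Kubernetes API: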
kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_http_requests_qps' |jq
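The rule does two things: the name section renames the series, and metricsQuery is a template that the adapter fills in per request. The Python sketch below imitates both steps so the result can be inspected offline. It is an approximation of the adapter's behaviour rather than its code; the helper names and the label matcher string are hypothetical.

```python
import re


def go_repl_to_python(repl: str) -> str:
    """Translate Go-style ${1} group references into Python's \\g<1> form."""
    return re.sub(r"\$\{(\d+)\}", r"\\g<\1>", repl)


def rename_series(series: str, matches: str = r"(.*)_total",
                  as_: str = "${1}_qps"):
    """Apply the rule's name.matches / name.as pair to a series name."""
    found = re.fullmatch(matches, series)
    if found is None:
        return None  # the rule does not apply to this series
    return found.expand(go_repl_to_python(as_))


def expand_query(template: str, series: str, label_matchers: str,
                 group_by: str) -> str:
    """Fill the metricsQuery placeholders with concrete values."""
    return (
        template.replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
        .replace("<<.GroupBy>>", group_by)
    )


TEMPLATE = "sum(rate(<<.Series>>{<<.LabelMatchers>>}[30s])) by (<<.GroupBy>>)"

print(rename_series("nginx_http_requests_total"))
print(expand_query(TEMPLATE, "nginx_http_requests_total",
                   'namespace="default",pod=~"nginx-.*"', "pod"))
```

Printing the expanded query and pasting it into the Prometheus web UI is a quick way to confirm that the series carries the namespace and pod labels the rule expects.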
4.2 Create an HPA based on the custom metric
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Pods
    pods:
      metric:
        name: nginx_http_requests_qps
      target:
        type: AverageValue
        averageValue: 2000m # 2000m means 2 requests per second
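The m suffix is Kubernetes quantity notation for thousandths, so 2000m equals 2, and the 4k seen in HPA output equals 4000. The Python sketch below parses a small subset of that notation and then estimates the replica count for an AverageValue target as ceil(sum of per-pod values / averageValue). The helper names are hypothetical, only the m, k, and M suffixes are handled, and the HPA tolerance is ignored.

```python
import math
from decimal import Decimal

# Subset of Kubernetes quantity suffixes, enough for the values in this article.
_SUFFIXES = {"m": Decimal("0.001"), "k": Decimal(1000), "M": Decimal(1000000)}


def parse_quantity(text: str) -> Decimal:
    """Convert strings such as '2000m' or '4k' into a plain number."""
    suffix = text[-1]
    if suffix in _SUFFIXES:
        return Decimal(text[:-1]) * _SUFFIXES[suffix]
    return Decimal(text)


def replicas_for_average(per_pod_values, average_value: str,
                         min_replicas: int, max_replicas: int) -> int:
    """Replicas needed so the per-pod average drops to average_value."""
    total = sum((parse_quantity(v) for v in per_pod_values), Decimal(0))
    wanted = math.ceil(total / parse_quantity(average_value))
    return max(min_replicas, min(max_replicas, wanted))


# One pod serving 5 requests per second against a 2000m (2/s) target.
print(parse_quantity("2000m"))
print(replicas_for_average(["5"], "2000m", min_replicas=1, max_replicas=3))
```

With maxReplicas set to 3 as in the manifest, any load above 6 requests per second in total is capped at three pods.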