Deploying kube-prometheus on a K8S Cluster to Monitor MariaDB

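This article describes how to deploy mysql-exporter inside a K8S cluster to monitor a MySQL-compatible database (MariaDB 10.4.18 in this test): creating and authorizing a monitoring user, exposing metrics through a NodePort, configuring a ServiceMonitor, and resolving RBAC permission errors. It also gives example MySQL alert rules such as Mysql_Instance_Reboot and Mysql_High_QPS.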

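1. Introduction

The test environment uses a standalone MySQL-compatible database (MariaDB) deployed inside the K8S cluster. To expose its metrics, mysql-exporter can be deployed in one of two ways:
① outside the K8S cluster
② inside the K8S cluster
This article uses the second approach.
Note: for easier management, a separate namespace was created for the kube-prometheus exporters that monitor services external to the cluster.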

[root@k8s01 prometheus-mysql]# kubectl create ns prometheus-exporter
[root@k8s01 prometheus-mysql]# kubectl get namespace
NAME                   STATUS   AGE
cephfs                 Active   475d
default                Active   486d
kube-node-lease        Active   486d
kube-public            Active   486d
kube-system            Active   486d
kubernetes-dashboard   Active   485d
monitoring             Active   485d
prometheus-exporter    Active   2d16h

2. Authorizing mysql-exporter to collect data

① MariaDB version
mariadb> select version();
+---------------------------------------+
| version()                             |
+---------------------------------------+
| 10.4.18-MariaDB-1:10.4.18+maria~focal |
+---------------------------------------+
1 row in set (0.04 sec)
② Create the monitoring user and grant privileges

mariadb> CREATE USER 'mysqlexporter'@'%' IDENTIFIED BY 'mysqlexporter';

GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'mysqlexporter'@'%' IDENTIFIED BY 'mysqlexporter' WITH MAX_USER_CONNECTIONS 30;

GRANT SELECT ON performance_schema.* TO 'mysqlexporter'@'%' IDENTIFIED BY 'mysqlexporter';

FLUSH PRIVILEGES;
Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.01 sec)
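
Before deploying the exporter, it can help to confirm that the new account can log in from outside the database host and holds the expected privileges. The following is a minimal sketch, assuming the `mysql` command-line client is installed; `show_exporter_grants` and the `EXPORTER_PASSWORD` variable are illustrative names, not part of the original setup.

```shell
# show_exporter_grants HOST PORT
# Logs in as the monitoring user and prints the privileges it holds.
# The password is read from EXPORTER_PASSWORD (hypothetical variable) and
# passed through MYSQL_PWD so it does not appear on the command line.
show_exporter_grants() {
  host="$1"
  port="$2"
  MYSQL_PWD="${EXPORTER_PASSWORD:-mysqlexporter}" \
    mysql -h "$host" -P "$port" -u mysqlexporter \
      -e "SHOW GRANTS FOR CURRENT_USER();"
}
```

For the environment in this article the call would be `show_exporter_grants 172.16.1.11 30006`; the output should list PROCESS, REPLICATION CLIENT, and SELECT.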

3. K8S deployment

① Because this is a test setup, the metrics are exposed through a NodePort.
[root@k8s01 prometheus-mysql]# cat demployment-mysql.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: prometheus-exporter
  name: mysqld-exporter
  labels:
    app: mysqld-exporter
spec:
  selector:
    matchLabels:
      app: mysqld-exporter
  template:
    metadata:
      labels:
        app: mysqld-exporter
    spec:
      containers:
      - name: mysqld-exporter
        image: prom/mysqld-exporter:v0.14.0  # pinned: DATA_SOURCE_NAME support was removed in v0.15.0
        env:
        - name: DATA_SOURCE_NAME
          value: "mysqlexporter:mysqlexporter@(172.16.1.11:30006)/"  # DSN format: user:password@(host:port)/
        ports:
        - containerPort: 9104
          name: http

---
apiVersion: v1
kind: Service
metadata:
  namespace: prometheus-exporter
  labels:
    app: mysqld-exporter
  name: mysqld-exporter
spec:
  type: NodePort
  ports:
  - name: http
    port: 9104
    nodePort: 30043
    targetPort: http
  selector:
    app: mysqld-exporter

② Check the metrics exposed by mysql-exporter

First confirm that the pod was deployed successfully:

[root@k8s01 prometheus-mysql]# kubectl get pod -o wide -n prometheus-exporter
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
mysqld-exporter-58477844b4-rvj7w   1/1     Running   0          15s   172.30.77.2   k8s04   <none>           <none>
[root@k8s01 prometheus-mysql]# kubectl get svc -n prometheus-exporter
NAME              TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
mysqld-exporter   NodePort   10.254.228.5   <none>        9104:30043/TCP   34s

[Screenshots omitted: the exporter page and its detailed metrics]
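
The exposed metrics can also be checked from the command line instead of a browser. This is a minimal sketch, assuming `curl` and `awk` are available; `exporter_up` is an illustrative helper name. It relies on `mysql_up`, the gauge that mysqld-exporter sets to 1 when it can reach the database.

```shell
# exporter_up HOST PORT
# Succeeds (exit status 0) only when the exporter reports "mysql_up 1".
exporter_up() {
  host="$1"
  port="$2"
  # -s: no progress output, -f: treat HTTP errors as failures
  curl -sf "http://${host}:${port}/metrics" \
    | awk '$1 == "mysql_up" { found = 1; ok = ($2 == 1) }
           END { exit (found && ok) ? 0 : 1 }'
}
```

With the NodePort used above, `exporter_up "$NODE_IP" 30043 && echo "exporter can reach the database"` would perform the check, where NODE_IP stands for the address of any cluster node.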

③ Deploy prometheus-serviceMonitormysql.yaml to scrape the exporter

The namespaceSelector matches the namespace name prometheus-exporter:

[root@k8s01 prometheus-mysql]# cat prometheus-serviceMonitormysql.yaml 
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mysqld-exporter
  namespace: monitoring
  labels:
    app: mysqld-exporter
spec:
  jobLabel: mysqld-exporter
  endpoints:
  - port: http
    interval: 15s
  selector:
    matchLabels:
      app: mysqld-exporter
  namespaceSelector:
    matchNames:
    - prometheus-exporter

[root@k8s01 prometheus-mysql]# kubectl apply -f prometheus-serviceMonitormysql.yaml 
servicemonitor.monitoring.coreos.com/mysqld-exporter created
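
A ServiceMonitor finds its targets by Service label and by port name, so a quick sanity check is to list the Services that carry the selector label. This is a minimal sketch, assuming kubectl is configured for this cluster; `matching_services` is an illustrative helper name, not part of kube-prometheus.

```shell
# matching_services NAMESPACE LABEL_SELECTOR
# Lists the Services that a ServiceMonitor with this selector would pick up.
matching_services() {
  ns="$1"
  selector="$2"
  kubectl get svc -n "$ns" -l "$selector" -o name
}
```

For the manifests above, `matching_services prometheus-exporter app=mysqld-exporter` should print service/mysqld-exporter; empty output means the labels do not match and Prometheus will find no target.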
④ Use K8S RBAC to authorize prometheus-k8s to access the Services, Endpoints, and Pods in the prometheus-exporter namespace

[root@k8s01 prometheus-mysql]# cat prometheus-roleBindNewNameSpace.yaml

--- # Create the Role in the target namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-k8s
  namespace: my-namespace
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  verbs:
  - get
  - list
  - watch
--- # Bind the Role to the prometheus-k8s ServiceAccount
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-k8s
  namespace: my-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-k8s
subjects:
- kind: ServiceAccount
  name: prometheus-k8s # ServiceAccount used by the Prometheus container; kube-prometheus uses prometheus-k8s by default
  namespace: monitoring
[root@k8s01 prometheus-mysql]# kubectl apply -f prometheus-roleBindNewNameSpace.yaml 
role.rbac.authorization.k8s.io/prometheus-k8s created

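Note: replace my-namespace with your own namespace (prometheus-exporter in this article).
Until the permissions are in place, the Prometheus log fills with errors of the form "xxx is forbidden", which points to an RBAC permission problem. The configuration of the Prometheus resource object shows that Prometheus runs under a ServiceAccount named prometheus-k8s, and that ServiceAccount is bound to a ClusterRole also named prometheus-k8s (prometheus-clusterRole.yaml):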

[root@k8s01 prometheus-mysql]# kubectl logs -f prometheus-k8s-0 prometheus -n monitoring
level=info ts=2021-11-15T02:58:32.254Z caller=kubernetes.go:253 component="discovery manager notify" discovery=k8s msg="Using pod service account via in-cluster config"
level=error ts=2021-11-15T02:58:32.256Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" in the namespace \"prometheus-exporter\""
level=error ts=2021-11-15T02:58:32.256Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" in the namespace \"prometheus-exporter\""
level=info ts=2021-11-15T02:58:32.258Z caller=main.go:827 msg="Completed loading of configuration file" filename=/etc/prometheus/config_out/prometheus.env.yaml
level=error ts=2021-11-15T02:58:32.262Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:363: Failed to list *v1.Pod: pods is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"pods\" in API group \"\" in the namespace \"prometheus-exporter\""

Check the existing ClusterRole file:

[root@k8s-master manifests]# cat prometheus-clusterRole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

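The rules above clearly contain no list permission for Services or Pods, which is why the errors appear. To solve the problem, it is enough to add the missing permissions. Note that extending the ClusterRole grants them in every namespace, whereas the Role and RoleBinding shown earlier limit them to a single namespace: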

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

Re-apply the prometheus-clusterRole.yaml file:

[root@k8s01 manifests]# kubectl apply -f prometheus-clusterRole.yaml
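
Whether the ServiceAccount now holds the permissions named in the error log can be checked without waiting for Prometheus to retry. This is a minimal sketch using `kubectl auth can-i` with impersonation, assuming the current kubectl user is allowed to impersonate ServiceAccounts (a cluster admin normally is); `can_prometheus_list` is an illustrative helper name.

```shell
# can_prometheus_list NAMESPACE
# Prints one line per resource: "<resource>: yes" or "<resource>: no".
can_prometheus_list() {
  ns="$1"
  sa="system:serviceaccount:monitoring:prometheus-k8s"
  for res in services endpoints pods; do
    # "kubectl auth can-i" prints yes/no and exits non-zero on "no",
    # so the exit status is ignored and only the printed answer is used.
    answer=$(kubectl auth can-i list "$res" --as="$sa" -n "$ns" 2>/dev/null) || true
    echo "${res}: ${answer:-unknown}"
  done
}
```

Running `can_prometheus_list prometheus-exporter` should report yes for all three resources once the permissions have been applied.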

Log in to the Prometheus dashboard and check the targets: the mysqld-exporter target is now being scraped.
[Screenshot omitted: Prometheus targets page]

4. Grafana

Import dashboard template 7362 to visualize the metrics.

[Screenshot omitted: resulting Grafana dashboard]

5. MySQL alert rules

      - alert: Mysql_Instance_Reboot
        expr: mysql_global_status_uptime < 180 
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Instance_Reboot detected"
          description: "{{$labels.instance}}: Mysql_Instance_Reboot within the last 3 minutes (uptime is now {{ $value }} seconds)"
      - alert: Mysql_High_QPS
        expr: rate(mysql_global_status_questions[5m]) > 500 
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_High_QPS detected"
          description: "{{$labels.instance}}: MySQL operations exceed 500 per second (current value is: {{ $value }})"
      - alert: Mysql_Too_Many_Connections
        expr: rate(mysql_global_status_connections[5m]) > 100
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql Too Many Connections detected"
          description: "{{$labels.instance}}: Mysql Connections is more than 100 per second ,(current value is: {{ $value }})"	
      - alert: Mysql_High_Recv_Rate
        expr: rate(mysql_global_status_bytes_received[3m]) * 8 / 1024 / 1024 > 100
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_High_Recv_Rate detected"
          description: "{{$labels.instance}}: Mysql_Receive_Rate is more than 100Mbps ,(current value is: {{ $value }})"	
      - alert: Mysql_High_Send_Rate
        expr: rate(mysql_global_status_bytes_sent[3m]) * 8 / 1024 / 1024 > 100
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_High_Send_Rate detected"
          description: "{{$labels.instance}}: Mysql data Send Rate is more than 100Mbps ,(current value is: {{ $value }})"
      - alert: Mysql_Too_Many_Slow_Query
        expr: rate(mysql_global_status_slow_queries[30m]) > 3
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Too_Many_Slow_Query detected"
          description: "{{$labels.instance}}: MySQL slow queries exceed 3 per second (current value is: {{ $value }})"
      - alert: Mysql_Deadlock
        expr: mysql_global_status_innodb_deadlocks > 0
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Deadlock detected"
          description: "{{$labels.instance}}: Mysql Deadlock was found ,(current value is: {{ $value }})"			
      - alert: Mysql_Too_Many_sleep_threads
        expr: mysql_global_status_threads_running / mysql_global_status_threads_connected * 100 < 30
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_Too_Many_sleep_threads detected"
          description: "{{$labels.instance}}: only {{ $value }}% of connected MySQL threads are running; please clean up the sleeping threads"
      - alert: Mysql_innodb_Cache_insufficient
        expr: (mysql_global_status_innodb_page_size * on (instance) mysql_global_status_buffer_pool_pages{state="data"} +  on (instance) mysql_global_variables_innodb_log_buffer_size +  on (instance) mysql_global_variables_innodb_additional_mem_pool_size + on (instance)  mysql_global_status_innodb_mem_dictionary + on (instance)  mysql_global_variables_key_buffer_size + on (instance)  mysql_global_variables_query_cache_size + on (instance) mysql_global_status_innodb_mem_adaptive_hash )  / on (instance) mysql_global_variables_innodb_buffer_pool_size * 100 > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "{{$labels.instance}}: Mysql_innodb_Cache_insufficient detected"
          description: "{{$labels.instance}}: Mysql innodb_Cache was used more than 80% ,(current value is: {{ $value }})"
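
The receive and send alerts compare the rate of a byte counter against a threshold stated in Mbps, so the expression has to convert bytes per second to megabits per second: multiply by 8 to get bits, then divide by 1024 * 1024. The arithmetic can be sketched as follows, assuming the binary convention (1 Mbit = 1024 * 1024 bits); `bytes_per_sec_to_mbit` is an illustrative helper name.

```shell
# bytes_per_sec_to_mbit BYTES_PER_SECOND
# Converts a byte rate to Mbit/s, taking 1 Mbit as 1024 * 1024 bits.
bytes_per_sec_to_mbit() {
  awk -v bps="$1" 'BEGIN { printf "%.2f\n", bps * 8 / 1024 / 1024 }'
}
```

For example, `bytes_per_sec_to_mbit 13107200` prints 100.00, so a sustained rate of about 12.5 MiB per second is where a 100 Mbps threshold sits.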

Reference for the role authorization error: https://www.cnblogs.com/wangxu01/articles/11655443.html
Reference for MySQL monitoring: https://blog.csdn.net/qq_32502263/article/details/118794813
