Integrating KubeSphere with your own kube-prometheus-stack

1. Uninstall KubeSphere's customized Prometheus stack
2. Install your own Prometheus stack
3. Install KubeSphere's customized components into your Prometheus stack
4. Change KubeSphere's monitoring endpoint

Step 1. Uninstall KubeSphere's customized Prometheus stack

kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/alertmanager/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/devops/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/etcd/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/grafana/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/kube-state-metrics/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/node-exporter/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/upgrade/ 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/prometheus-rules-v1.16\+.yaml 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/prometheus-rules.yaml 2>/dev/null
kubectl -n kubesphere-system exec $(kubectl get pod -n kubesphere-system -l app=ks-installer -o jsonpath='{.items[0].metadata.name}') -- kubectl delete -f /kubesphere/kubesphere/prometheus/prometheus 2>/dev/null
kubectl delete deploy -n  kubesphere-monitoring-system prometheus-operator
kubectl delete svc -n kubesphere-monitoring-system prometheus-operator
kubectl delete prometheusrules.monitoring.coreos.com -n kubesphere-monitoring-system  prometheus-operator-rules  prometheus-k8s-rules
kubectl delete servicemonitor -n kubesphere-monitoring-system coredns kube-apiserver  kube-controller-manager  kube-scheduler kubelet prometheus-operator
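
The ten `kubectl exec` deletions above repeat the same pod lookup each time. As a minimal sketch (not part of the original procedure), the lookup can be done once and the paths looped over. The function names and the `DRY_RUN` switch are hypothetical helpers introduced here; with the default `DRY_RUN=1` the script only prints the commands it would run. The four plain `kubectl delete` commands at the end of the step are not covered and still need to be run separately.

```shell
#!/bin/sh
# Sketch: same deletions as above, with the ks-installer pod looked up once.
# DRY_RUN (hypothetical switch) defaults to 1, so commands are only printed.
DRY_RUN="${DRY_RUN:-1}"
NS=kubesphere-system
BASE=/kubesphere/kubesphere/prometheus

# Paths copied from the commands above, relative to $BASE.
TARGETS="alertmanager/ devops/ etcd/ grafana/ kube-state-metrics/ node-exporter/ upgrade/ prometheus-rules-v1.16+.yaml prometheus-rules.yaml prometheus"

# delete_cmd POD PATH: prints the delete command for one path.
delete_cmd() {
  echo "kubectl -n $NS exec $1 -- kubectl delete -f $BASE/$2"
}

uninstall_ks_prometheus() {
  if [ "$DRY_RUN" = "1" ]; then
    pod="<ks-installer-pod>"   # placeholder shown in dry-run output
  else
    pod=$(kubectl get pod -n "$NS" -l app=ks-installer -o jsonpath='{.items[0].metadata.name}')
  fi
  for t in $TARGETS; do
    cmd=$(delete_cmd "$pod" "$t")
    if [ "$DRY_RUN" = "1" ]; then
      echo "$cmd"
    else
      # Errors for already-removed resources are silenced, as in the original.
      $cmd 2>/dev/null
    fi
  done
}

uninstall_ks_prometheus
```

Running it with `DRY_RUN=0` executes the deletions instead of printing them.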

Step 2. Install your own Prometheus stack

# Add the prometheus-community Helm repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
# Update the repositories
helm repo update
# List the configured repositories
helm repo list
# Set variables
pro=kube-prometheus-stack
chart_version=35.0.0

mkdir -p /data/$pro
cd /data/$pro

# Download the chart
helm pull prometheus-community/$pro --version=$chart_version

# Extract values.yaml from the chart archive
tar zxvf $pro-$chart_version.tgz --strip-components 1 $pro/values.yaml 

cat > /data/$pro/start.sh << EOF
helm upgrade --install --create-namespace $pro $pro-$chart_version.tgz \
-f values.yaml \
-n monitoring
EOF
bash /data/kube-prometheus-stack/start.sh

KubeSphere 3.3.0 has been certified to work with the following Prometheus stack components:

Prometheus Operator v0.38.3+
Prometheus v2.20.1+
Alertmanager v0.21.0+
kube-state-metrics v1.9.6
node-exporter v0.18.1
Make sure your Prometheus stack component versions meet the requirements above, especially node-exporter and kube-state-metrics.
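
To compare an installed version against the minimums marked with "+" above, a small helper can be used. This is a sketch only: `version_ge` and `check` are hypothetical names, the comparison relies on `sort -V` (available in GNU coreutils and busybox), and the "installed" version in the example call is a made-up placeholder to be replaced with what your chart actually ships.

```shell
# version_ge A B: succeeds when version A >= version B.
# A leading "v" is stripped; ordering is delegated to `sort -V`.
version_ge() {
  a=${1#v}
  b=${2#v}
  [ "$(printf '%s\n%s\n' "$a" "$b" | sort -V | head -n 1)" = "$b" ]
}

# check NAME INSTALLED MINIMUM: prints a one-line verdict.
check() {
  if version_ge "$2" "$3"; then
    echo "$1 $2 ok (>= $3)"
  else
    echo "$1 $2 too old (< $3)"
  fi
}

# Example call; v2.35.0 is a placeholder, not a value taken from the chart.
check "Prometheus" v2.35.0 v2.20.1
```

kube-state-metrics and node-exporter are listed without a "+", so they are better compared by eye against the exact versions given.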

Troubleshooting:

1. To monitor kube-proxy, change metricsBindAddress in the kube-proxy ConfigMap

kubectl edit cm -n kube-system kube-proxy
 |
 V
metricsBindAddress: "0.0.0.0:10249"     # address and port the metrics endpoint listens on
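
kube-proxy reads its configuration at startup, so the running pods need a restart before the new address takes effect. The sketch below shows a non-interactive alternative to `kubectl edit`, assuming a kubeadm-style cluster where kube-proxy runs as a DaemonSet named `kube-proxy` in `kube-system`. Both function names are hypothetical, and `patch_kube_proxy` is only defined here, not called.

```shell
# Rewrites any metricsBindAddress line read on stdin, keeping its indentation.
set_metrics_bind_address() {
  sed -E 's|^([[:space:]]*metricsBindAddress:).*$|\1 "0.0.0.0:10249"|'
}

# Applies the rewritten ConfigMap and restarts kube-proxy (assumed DaemonSet name).
# Call it explicitly once you have reviewed the output of the filter above.
patch_kube_proxy() {
  kubectl get cm kube-proxy -n kube-system -o yaml \
    | set_metrics_bind_address \
    | kubectl apply -f -
  kubectl -n kube-system rollout restart daemonset kube-proxy
}
```

To preview the change without touching the cluster, pipe the ConfigMap through the filter alone: `kubectl get cm kube-proxy -n kube-system -o yaml | set_metrics_bind_address`.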

2. Monitoring an external etcd

  • Create the etcd-client-cert secret
cd /etc/ssl/etcd/ssl
cp admin-master01-key.pem etcd-client-key.pem       # replace admin-master01-key.pem with your actual key file name
cp admin-master01.pem etcd-client.pem

kubectl create secret generic -n monitoring etcd-client-cert \
            --from-file=etcd-ca=ca.pem \
            --from-file=etcd-client=etcd-client.pem \
            --from-file=etcd-client-key=etcd-client-key.pem

Modify the kube-prometheus-stack values.yaml:

kubeEtcd:
  enabled: true
  endpoints:
  - 192.168.11.100      # IP address of an external etcd member
  - 192.168.11.101      # IP address of an external etcd member
  - 192.168.11.102      # IP address of an external etcd member
  service:
    enabled: true
    port: 2379
    targetPort: 2379
  serviceMonitor:
    enabled: true
    interval: ""
    proxyUrl: ""
    scheme: https            # scrape over HTTPS
    insecureSkipVerify: true      # do not verify the server certificate
    serverName: "localhost"
    caFile: /etc/prometheus/secrets/etcd-client-cert/etcd-ca            # certificate path (inside the pod)
    certFile: /etc/prometheus/secrets/etcd-client-cert/etcd-client
    keyFile: /etc/prometheus/secrets/etcd-client-cert/etcd-client-key

prometheus:
  prometheusSpec:
    secrets:
    - etcd-client-cert    # mount the certificate secret into the Prometheus pod

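3. Set the Prometheus rule evaluation interval to 1m so that it matches the custom ServiceMonitors of KubeSphere 3.3.0. The rule evaluation interval should be greater than or equal to the scrape interval.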

kubectl -n monitoring patch prometheuses.monitoring.coreos.com kube-prometheus-stack-prometheus --patch '{
  "spec": {
    "evaluationInterval": "1m"
  }
}' --type=merge
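
The constraint "evaluation interval >= scrape interval" can be checked with a few lines of shell. This is a rough sketch: `to_seconds`, `interval_ok` and `current_evaluation_interval` are hypothetical helpers, and only simple durations made of one number followed by `s`, `m` or `h` are handled.

```shell
# to_seconds DURATION: converts 30s, 1m or 2h to seconds.
# Compound or millisecond durations (1m30s, 500ms) are not supported here.
to_seconds() {
  n=${1%[smh]}
  case "$1" in
    *ms) echo "unsupported duration: $1" >&2; return 1 ;;
    *s)  echo "$n" ;;
    *m)  echo $((n * 60)) ;;
    *h)  echo $((n * 3600)) ;;
    *)   echo "unsupported duration: $1" >&2; return 1 ;;
  esac
}

# interval_ok EVALUATION SCRAPE: succeeds when evaluation >= scrape.
interval_ok() {
  [ "$(to_seconds "$1")" -ge "$(to_seconds "$2")" ]
}

# Reads the value set by the patch above (defined only, not called here).
current_evaluation_interval() {
  kubectl -n monitoring get prometheuses.monitoring.coreos.com kube-prometheus-stack-prometheus \
    -o jsonpath='{.spec.evaluationInterval}'
}
```

For example, `interval_ok "$(current_evaluation_interval)" 1m && echo ok` confirms the patched value against a 1m scrape interval.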

4. Change the monitoring endpoint to your own Prometheus:

kubectl edit cm -n kubesphere-system kubesphere-config      # this change is lost after the cluster restarts
 
    monitoring:
      endpoint: http://prometheus-operated.monitoring.svc:9090

kubectl edit cc -n kubesphere-system ks-installer          # this change persists across cluster restarts
    monitoring:
      endpoint: http://prometheus-operated.monitoring.svc:9090
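
Before pointing KubeSphere at the new endpoint, it can help to confirm that Prometheus answers there. The sketch below builds the URLs for Prometheus's readiness endpoint (`/-/ready`) and a trivial `up` query through the HTTP API. The helper names are hypothetical, and `check_endpoint` is only defined, not called, because it needs `curl` and network access to the service.

```shell
ENDPOINT="http://prometheus-operated.monitoring.svc:9090"

# ready_url BASE / query_url BASE: build probe URLs, tolerating a trailing slash.
ready_url() { echo "${1%/}/-/ready"; }
query_url() { echo "${1%/}/api/v1/query?query=up"; }

# check_endpoint BASE: fails if Prometheus is not ready or the query errors out.
check_endpoint() {
  curl -fsS "$(ready_url "$1")" && curl -fsS "$(query_url "$1")"
}
```

The `.svc` name only resolves inside the cluster. From a workstation, one option is to run `kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090` in another terminal and then call `check_endpoint http://localhost:9090`.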

5. Run the following command to restart the KubeSphere API server.

kubectl -n kubesphere-system rollout restart deployment/ks-apiserver
or
kubectl rollout restart deploy -n kubesphere-system ks-installer

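6. If the charts on the KubeSphere dashboard do not render, remember to adjust the prometheusrules.monitoring.coreos.com resources, because many of the metrics are computed through recording rules.

# Get the labels that kube-prometheus-stack uses to select PrometheusRules and ServiceMonitors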
kubectl get prometheus -n monitoring kube-prometheus-stack-prometheus  -o yaml

  ruleSelector:
    matchLabels:
      release: kube-prometheus-stack     # PrometheusRules are selected by this label
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack     # ServiceMonitors are selected by this label


# Download kubernetes-prometheusRule.yaml, which contains the apiserver-related rules
wget https://raw.githubusercontent.com/kubesphere/ks-installer/master/roles/ks-monitor/files/prometheus/kubernetes/kubernetes-prometheusRule.yaml

# Edit kubernetes-prometheusRule.yaml
vi kubernetes-prometheusRule.yaml

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    release: kube-prometheus-stack     # change the labels so that kube-prometheus-stack picks up the rules in this PrometheusRule
  name: prometheus-k8s-rules
  namespace: monitoring                # change to monitoring

# Apply kubernetes-prometheusRule.yaml
kubectl apply -f kubernetes-prometheusRule.yaml
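
A quick check before applying can catch a forgotten label or namespace edit. The following is a rough sketch: `rule_file_ok` and `list_selected_rules` are hypothetical helpers, and the check is a plain text match on the file, not a YAML parse, so it only confirms that the two expected lines exist somewhere in the manifest.

```shell
# rule_file_ok FILE: succeeds when FILE contains both the release label
# and the monitoring namespace (trailing comments on those lines are allowed).
rule_file_ok() {
  grep -Eq '^[[:space:]]+release:[[:space:]]*kube-prometheus-stack([[:space:]]|$)' "$1" &&
  grep -Eq '^[[:space:]]+namespace:[[:space:]]*monitoring([[:space:]]|$)' "$1"
}

# Lists the PrometheusRules the ruleSelector shown above would match
# (defined only, not called here; needs cluster access).
list_selected_rules() {
  kubectl get prometheusrules.monitoring.coreos.com -n monitoring -l release=kube-prometheus-stack
}
```

For example, `rule_file_ok kubernetes-prometheusRule.yaml || echo "label or namespace not updated"`; after applying, `prometheus-k8s-rules` should appear in the output of `list_selected_rules`.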

7. kube-prometheus-stack-node.rules conflicts with the 'node_namespace_pod:kube_pod_info:' and node:node_num_cpu:sum recording rules in kubernetes-prometheusRule.yaml.

kubectl edit prometheusrules.monitoring.coreos.com -n monitoring kube-prometheus-stack-node.rules
# Delete the following content
    - expr: |-
        topk by(cluster, namespace, pod) (1,
          max by (cluster, node, namespace, pod) (
            label_replace(kube_pod_info{job="kube-state-metrics",node!=""}, "pod", "$1", "pod", "(.*)")
        ))
      record: 'node_namespace_pod:kube_pod_info:'
    - expr: |-
        count by (cluster, node) (sum by (node, cpu) (
          node_cpu_seconds_total{job="node-exporter"}
        * on (namespace, pod) group_left(node)
          topk by(namespace, pod) (1, node_namespace_pod:kube_pod_info:)
        ))
      record: node:node_num_cpu:sum
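
Recording rules that two rule files define under the same name can also be found mechanically. In this sketch, `record_names` and `shared_records` are hypothetical helpers that work on plain text (no YAML parsing), and the file name `node.rules.yaml` in the usage note is an assumption for wherever you export the chart's rule to.

```shell
# record_names FILE: prints every "record:" name in FILE, with quotes stripped.
record_names() {
  sed -n -E "s/^[[:space:]-]*record:[[:space:]]*//p" "$1" \
    | sed -E "s/^['\"]//; s/['\"][[:space:]]*$//"
}

# shared_records FILE_A FILE_B: prints names that occur in both files.
# Names are de-duplicated per file first, so repeats inside one file are ignored.
shared_records() {
  {
    record_names "$1" | LC_ALL=C sort -u
    record_names "$2" | LC_ALL=C sort -u
  } | LC_ALL=C sort | uniq -d
}
```

One way to use it: export the chart's rule with `kubectl get prometheusrules.monitoring.coreos.com -n monitoring kube-prometheus-stack-node.rules -o yaml > node.rules.yaml`, then run `shared_records node.rules.yaml kubernetes-prometheusRule.yaml`.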

8. Verify: data and charts now display correctly