Note: the web management container (the Kubernetes dashboard) must be deployed first;
Reference: "Deploying the dashboard on k8s" (kali_yao's blog, CSDN)
1. Overview of Prometheus
Prometheus is an open-source monitoring system originally built at SoundCloud. It is now a standalone open-source project; to underline this and to clarify the project's governance, Prometheus joined the CNCF in 2016 as its second hosted project, after Kubernetes.
In today's most common Kubernetes container-management setups, Prometheus is the usual monitoring companion; you can think of it as an open-source counterpart to Google's BorgMon.
Features of Prometheus
- A customizable multi-dimensional data model
- Very efficient storage: a sample takes only ~3.5 bytes on average
- A flexible and powerful multi-dimensional query language (PromQL)
- No dependency on distributed storage; a single server node works standalone
- Time-series collection over HTTP using a pull model
- Pushing of time series supported through a push gateway
- Scrape targets found via service discovery or static configuration
- Multiple modes of graphs and dashboards
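As a small taste of PromQL, here are two illustrative queries over metrics that node-exporter commonly exposes (the metric names are standard node-exporter names, assumed for illustration rather than taken from this deployment):

```promql
# Per-instance CPU busy fraction over the last 5 minutes
1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

# Free filesystem space as a fraction, per mountpoint
node_filesystem_avail_bytes / node_filesystem_size_bytes
```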
Images and resource files
Source: https://github.com/coreos/kube-prometheus
Download the images and import them into the private registry

```
prom/node-exporter                            v1.0.0
quay.io/coreos/prometheus-config-reloader     v0.35.1
quay.io/coreos/prometheus-operator            v0.35.1
quay.io/coreos/kube-state-metrics             v1.9.2
grafana/grafana                               6.4.3
jimmidyson/configmap-reload                   v0.3.0
quay.io/prometheus/prometheus                 v2.11.0
quay.io/prometheus/alertmanager               v0.18.0
quay.io/coreos/k8s-prometheus-adapter-amd64   v0.5.0
quay.io/coreos/kube-rbac-proxy                v0.4.1
```

Download the resource files
```
# Fetch from the official repo; this is the release-0.4 branch
~]# git clone https://github.com/prometheus-operator/kube-prometheus.git
# The checkout contains many files; it helps to sort the related yaml files
# into per-component directories
~]# cd kube-prometheus/manifests
~]# mkdir -p grafana grafana-json metrics-state node-exporter prom-adapter prom-server setup
# Organized as follows
~]# tree ./
./
|-- alertmanager
|   |-- alertmanager-alertmanager.yaml
|   |-- alertmanager-secret.yaml
|   |-- alertmanager-serviceAccount.yaml
|   |-- alertmanager-serviceMonitor.yaml
|   `-- alertmanager-service.yaml
|-- grafana
|   |-- grafana-dashboardDatasources.yaml
|   |-- grafana-dashboardDefinitions.yaml
|   |-- grafana-dashboardSources.yaml
|   |-- grafana-deployment.yaml
|   |-- grafana-serviceAccount.yaml
|   |-- grafana-serviceMonitor.yaml
|   `-- grafana-service.yaml
|-- grafana-json
|   |-- kubernetes-for-prometheus-dashboard-cn-v20201010_rev3.json
|   `-- node-exporter-dashboard_rev1.json
|-- metrics-state
|   |-- kube-state-metrics-clusterRoleBinding.yaml
|   |-- kube-state-metrics-clusterRole.yaml
|   |-- kube-state-metrics-deployment.yaml
|   |-- kube-state-metrics-roleBinding.yaml
|   |-- kube-state-metrics-role.yaml
|   |-- kube-state-metrics-serviceAccount.yaml
|   |-- kube-state-metrics-serviceMonitor.yaml
|   `-- kube-state-metrics-service.yaml
|-- node-exporter
|   |-- node-exporter-clusterRoleBinding.yaml
|   |-- node-exporter-clusterRole.yaml
|   |-- node-exporter-daemonset.yaml
|   |-- node-exporter-serviceAccount.yaml
|   |-- node-exporter-serviceMonitor.yaml
|   `-- node-exporter-service.yaml
|-- prom-adapter
|   |-- prometheus-adapter-apiService.yaml
|   |-- prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
|   |-- prometheus-adapter-clusterRoleBindingDelegator.yaml
|   |-- prometheus-adapter-clusterRoleBinding.yaml
|   |-- prometheus-adapter-clusterRoleServerResources.yaml
|   |-- prometheus-adapter-clusterRole.yaml
|   |-- prometheus-adapter-configMap.yaml
|   |-- prometheus-adapter-deployment.yaml
|   |-- prometheus-adapter-roleBindingAuthReader.yaml
|   |-- prometheus-adapter-serviceAccount.yaml
|   `-- prometheus-adapter-service.yaml
|-- prom-server
|   |-- prometheus-clusterRoleBinding.yaml
|   |-- prometheus-clusterRole.yaml
|   |-- prometheus-operator-serviceMonitor.yaml
|   |-- prometheus-prometheus.yaml
|   |-- prometheus-roleBindingConfig.yaml
|   |-- prometheus-roleBindingSpecificNamespaces.yaml
|   |-- prometheus-roleConfig.yaml
|   |-- prometheus-roleSpecificNamespaces.yaml
|   |-- prometheus-rules.yaml
|   |-- prometheus-serviceAccount.yaml
|   |-- prometheus-serviceMonitorApiserver.yaml
|   |-- prometheus-serviceMonitorCoreDNS.yaml
|   |-- prometheus-serviceMonitorKubeControllerManager.yaml
|   |-- prometheus-serviceMonitorKubelet.yaml
|   |-- prometheus-serviceMonitorKubeScheduler.yaml
|   |-- prometheus-serviceMonitor.yaml
|   `-- prometheus-service.yaml
`-- setup
    |-- 0namespace-namespace.yaml
    |-- prometheus-operator-0alertmanagerCustomResourceDefinition.yaml
    |-- prometheus-operator-0podmonitorCustomResourceDefinition.yaml
    |-- prometheus-operator-0prometheusCustomResourceDefinition.yaml
    |-- prometheus-operator-0prometheusruleCustomResourceDefinition.yaml
    |-- prometheus-operator-0servicemonitorCustomResourceDefinition.yaml
    |-- prometheus-operator-clusterRoleBinding.yaml
    |-- prometheus-operator-clusterRole.yaml
    |-- prometheus-operator-deployment.yaml
    |-- prometheus-operator-serviceAccount.yaml
    `-- prometheus-operator-service.yaml
```
Architecture diagram
The database is Prometheus; everything else is a data-collection plugin. Grafana reads the data and renders it as charts; Alertmanager is the alerting component.
Overall monitoring pipeline, in one diagram
2. Installing Prometheus
1) Download the images and import them into the private registry
```
# The image list is "repo tag" pairs, so read it pair by pair; a plain
# `for i in "..."` over the whole quoted string would not iterate per image
~]# img="prom/node-exporter v1.0.0
quay.io/coreos/prometheus-config-reloader v0.35.1
quay.io/coreos/prometheus-operator v0.35.1
quay.io/coreos/kube-state-metrics v1.9.2
grafana/grafana 6.4.3
jimmidyson/configmap-reload v0.3.0
quay.io/prometheus/prometheus v2.11.0
quay.io/prometheus/alertmanager v0.18.0
quay.io/coreos/k8s-prometheus-adapter-amd64 v0.5.0
quay.io/coreos/kube-rbac-proxy v0.4.1"
~]# while read _f _v; do
        docker pull ${_f}:${_v}
        docker tag  ${_f}:${_v} 172.17.0.98:5000/${_f##*/}:${_v}
        docker push 172.17.0.98:5000/${_f##*/}:${_v}
        docker rmi  ${_f}:${_v}
    done <<<"${img}"
# Verify
~]# curl http://172.17.0.98:5000/v2/_catalog
{"repositories":["alertmanager","configmap-reload","coredns","dashboard","etcd","flannel","grafana","k8s-prometheus-adapter-amd64","kube-apiserver","kube-controller-manager","kube-proxy","kube-rbac-proxy","kube-scheduler","kube-state-metrics","metrics-scraper","metrics-server","myos","nginx-ingress-controller","node-exporter","pause","prometheus","prometheus-config-reloader","prometheus-operator"]}
```
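The `${_f##*/}` expansion in the loop above strips everything up to the last `/`, so images from different registries and orgs all land under flat names in the private registry. A quick standalone sketch of that expansion (no docker needed):

```shell
# ${var##*/} removes the longest prefix matching "*/" (the registry/org path),
# leaving only the bare image name
_f="quay.io/coreos/prometheus-operator"
echo "${_f##*/}"                           # -> prometheus-operator

_f="prom/node-exporter"
echo "172.17.0.98:5000/${_f##*/}:v1.0.0"   # -> 172.17.0.98:5000/node-exporter:v1.0.0
```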
2) Install the base environment (namespace) for the core database
Note: before the core database can be installed, the base environment (e.g. the monitoring namespace) has to exist.
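For reference, the namespace itself is created by `setup/0namespace-namespace.yaml`, which is essentially just:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
```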
What is the Prometheus Operator?
The Prometheus Operator is essentially a set of user-defined CRDs plus the controller that implements them. The Operator watches these custom resources for changes and, driven by their definitions, automates the management of the Prometheus server itself and of its configuration. Its architecture diagram is shown below.
Why use the Prometheus Operator?
Prometheus itself provides no API for managing configuration (in particular for managing scrape targets and alerting rules), nor any convenient way to manage multiple instances, so this part usually meant writing your own code or scripts. To cut down that operational complexity, CoreOS pioneered the Operator concept, first releasing the Etcd Operator for running and managing etcd on Kubernetes, and later the Prometheus Operator.
prometheus-operator (official): https://github.com/prometheus-operator/prometheus-operator
kube-prometheus (official): https://github.com/prometheus-operator/kube-prometheus
How the two projects relate: the former contains only the Prometheus Operator, while the latter bundles the Operator together with the deployment of the related Prometheus components and a set of common ready-made monitoring definitions; the concrete components are covered below.
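To illustrate the kind of CRD the Operator introduces: a ServiceMonitor tells Prometheus which Services to scrape by label selection. A minimal sketch (a hypothetical example, not one of the bundled files):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app        # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app     # scrape Services carrying this label
  endpoints:
  - port: web              # the named port on the Service
    interval: 30s
```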
```
## Create the base environment first
# Images used by the base environment (imported above)
~]# curl http://172.17.0.98:5000/v2/configmap-reload/tags/list
{"name":"configmap-reload","tags":["v0.3.0"]}
~]# curl http://172.17.0.98:5000/v2/prometheus-config-reloader/tags/list
{"name":"prometheus-config-reloader","tags":["v0.35.1"]}
~]# curl http://172.17.0.98:5000/v2/prometheus-operator/tags/list
```

```
# Write the resource files
# There are many of them, so keep them in their own directory
~]# cd setup
# Files needed
~]# ls setup
0namespace-namespace.yaml
prometheus-operator-0alertmanagerCustomResourceDefinition.yaml
prometheus-operator-0podmonitorCustomResourceDefinition.yaml
prometheus-operator-0prometheusCustomResourceDefinition.yaml
prometheus-operator-0prometheusruleCustomResourceDefinition.yaml
prometheus-operator-0servicemonitorCustomResourceDefinition.yaml
prometheus-operator-clusterRoleBinding.yaml
prometheus-operator-clusterRole.yaml
prometheus-operator-deployment.yaml
prometheus-operator-serviceAccount.yaml
prometheus-operator-service.yaml
# Only the image registry references need changing (lines 190 and 274)
~]# vim prometheus-operator-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/name: prometheus-operator
    app.kubernetes.io/version: v0.35.1
  name: prometheus-operator
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: controller
      app.kubernetes.io/name: prometheus-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/component: controller
        app.kubernetes.io/name: prometheus-operator
        app.kubernetes.io/version: v0.35.1
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=172.17.0.98:5000/configmap-reload:v0.3.0                 # point to the local registry
        - --prometheus-config-reloader=172.17.0.98:5000/prometheus-config-reloader:v0.35.1 # point to the local registry
        image: 172.17.0.98:5000/prometheus-operator:v0.35.1                                # point to the local registry
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        securityContext:
          allowPrivilegeEscalation: false
      nodeSelector:
        beta.kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: prometheus-operator
```

Install

```
# Apply and check (kubectl applies every file in the directory, in order)
~]# kubectl apply -f setup/
~]# kubectl get namespaces
NAME                   STATUS   AGE
default                Active   5d14h
ingress-nginx          Active   3d17h
kube-node-lease        Active   5d14h
kube-public            Active   5d14h
kube-system            Active   5d14h
kubernetes-dashboard   Active   130m
monitoring             Active   2m49s
~]# kubectl -n monitoring get pod
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-75b4b59b74-72qhg   1/1     Running   0          47s
```
3) Install the database: the Prometheus server

```
# Collects, stores, and serves queries over the monitoring data
# Image used by the Prometheus server (imported above)
~]# curl http://172.17.0.98:5000/v2/prometheus/tags/list
{"name":"prometheus","tags":["v2.11.0"]}
## Prepare the resource files
~]# ls prom-server
prometheus-clusterRoleBinding.yaml
prometheus-clusterRole.yaml
prometheus-operator-serviceMonitor.yaml
prometheus-prometheus.yaml
prometheus-roleBindingConfig.yaml
prometheus-roleBindingSpecificNamespaces.yaml
prometheus-roleConfig.yaml
prometheus-roleSpecificNamespaces.yaml
prometheus-rules.yaml
prometheus-serviceAccount.yaml
prometheus-serviceMonitorApiserver.yaml
prometheus-serviceMonitorCoreDNS.yaml
prometheus-serviceMonitorKubeControllerManager.yaml
prometheus-serviceMonitorKubelet.yaml
prometheus-serviceMonitorKubeScheduler.yaml
prometheus-serviceMonitor.yaml
prometheus-service.yaml
# Only the image and version lines need attention (this file writes them separately)
~]# vim prom-server/prometheus-prometheus.yaml
14:   baseImage: 172.17.0.98:5000/prometheus
34:   version: v2.11.0
~]# vim prometheus-prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  baseImage: 172.17.0.98:5000/prometheus   # point to the private registry
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.11.0                         # image version
# Install (pointing kubectl at the directory applies everything in it)
~]# kubectl apply -f prom-server/
~]# kubectl -n monitoring get pod
NAME               READY   STATUS    RESTARTS   AGE
prometheus-k8s-0   3/3     Running   1          45s
prometheus-k8s-1   3/3     Running   1          45s
```
3. Installing the data-collection plugins
Note: the three plugins below mainly collect CPU and disk usage and container state information.
1) Installing prom-adapter
adapter
- Bridges the Prometheus server and the Kubernetes metrics API: it serves resource metrics through the API server's aggregation layer, backed by the data in the Prometheus server
Images used by the adapter (pushed above)

```
# Prepare the image (already pushed to the private registry above)
~]# curl http://172.17.0.98:5000/v2/k8s-prometheus-adapter-amd64/tags/list
{"name":"k8s-prometheus-adapter-amd64","tags":["v0.5.0"]}
# Prepare the files
~]# ls prom-adapter/
prometheus-adapter-apiService.yaml
prometheus-adapter-clusterRoleAggregatedMetricsReader.yaml
prometheus-adapter-clusterRoleBindingDelegator.yaml
prometheus-adapter-clusterRoleBinding.yaml
prometheus-adapter-clusterRoleServerResources.yaml
prometheus-adapter-clusterRole.yaml
prometheus-adapter-configMap.yaml
prometheus-adapter-deployment.yaml
prometheus-adapter-roleBindingAuthReader.yaml
prometheus-adapter-serviceAccount.yaml
prometheus-adapter-service.yaml
# Only line 28 needs to point at the private registry
~]# vim prom-adapter/prometheus-adapter-deployment.yaml
28:  image: 172.17.0.98:5000/k8s-prometheus-adapter-amd64:v0.5.0
~]# cat prom-adapter/prometheus-adapter-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-adapter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      name: prometheus-adapter
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        name: prometheus-adapter
    spec:
      containers:
      - args:
        - --cert-dir=/var/run/serving-cert
        - --config=/etc/adapter/config.yaml
        - --logtostderr=true
        - --metrics-relist-interval=1m
        - --prometheus-url=http://prometheus-k8s.monitoring.svc:9090/
        - --secure-port=6443
        image: quay.io/coreos/k8s-prometheus-adapter-amd64:v0.5.0   # change to the private registry
        name: prometheus-adapter
        ports:
        - containerPort: 6443
        volumeMounts:
        - mountPath: /tmp
          name: tmpfs
          readOnly: false
        - mountPath: /var/run/serving-cert
          name: volume-serving-cert
          readOnly: false
        - mountPath: /etc/adapter
          name: config
          readOnly: false
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: prometheus-adapter
      volumes:
      - emptyDir: {}
        name: tmpfs
      - emptyDir: {}
        name: volume-serving-cert
      - configMap:
          name: adapter-config
        name: config
# Deploy and check the container
~]# kubectl apply -f prom-adapter
~]# kubectl -n monitoring get pod
NAME                                  READY   STATUS    RESTARTS   AGE
prometheus-adapter-856854f9f6-knqtq   1/1     Running   0          6s
```
2) Installing metrics-state
- Reports the latest state of all kinds of resources (Pods, Deployments, ...)
- Images used by metrics-state

```
~]# curl http://172.17.0.98:5000/v2/kube-state-metrics/tags/list
{"name":"kube-state-metrics","tags":["v1.9.2"]}
~]# curl http://172.17.0.98:5000/v2/kube-rbac-proxy/tags/list
{"name":"kube-rbac-proxy","tags":["v0.4.1"]}
# Files
~]# ls metrics-state/
kube-state-metrics-clusterRoleBinding.yaml
kube-state-metrics-clusterRole.yaml
kube-state-metrics-deployment.yaml
kube-state-metrics-roleBinding.yaml
kube-state-metrics-role.yaml
kube-state-metrics-serviceAccount.yaml
kube-state-metrics-serviceMonitor.yaml
kube-state-metrics-service.yaml
# Only the images need attention
~]# vim metrics-state/kube-state-metrics-deployment.yaml
24:  image: 172.17.0.98:5000/kube-rbac-proxy:v0.4.1      # point to the private registry
41:  image: 172.17.0.98:5000/kube-rbac-proxy:v0.4.1      # point to the private registry
58:  image: 172.17.0.98:5000/kube-state-metrics:v1.9.2   # point to the private registry
~]# cat metrics-state/kube-state-metrics-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: kube-state-metrics
  name: kube-state-metrics
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      containers:
      - args:
        - --logtostderr
        - --secure-listen-address=:8443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:8081/
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1      # change to the private registry
        name: kube-rbac-proxy-main
        ports:
        - containerPort: 8443
          name: https-main
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 10m
            memory: 20Mi
      - args:
        - --logtostderr
        - --secure-listen-address=:9443
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:8082/
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1      # change to the private registry
        name: kube-rbac-proxy-self
        ports:
        - containerPort: 9443
          name: https-self
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 10m
            memory: 20Mi
      - args:
        - --host=127.0.0.1
        - --port=8081
        - --telemetry-host=127.0.0.1
        - --telemetry-port=8082
        image: quay.io/coreos/kube-state-metrics:v1.9.2   # change to the private registry
        name: kube-state-metrics
        resources:
          limits:
            cpu: 100m
            memory: 150Mi
          requests:
            cpu: 100m
            memory: 150Mi
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: kube-state-metrics
# Create the resources and check
~]# kubectl apply -f metrics-state/
~]# kubectl -n monitoring get pod
NAME                                  READY   STATUS    RESTARTS   AGE
kube-state-metrics-5894f64799-krvn6   3/3     Running   0          4s
```
3) Installing node-exporter
- Collects node-level data and provides it to the Prometheus server
- Images used by node-exporter

```
~]# curl http://172.17.0.98:5000/v2/node-exporter/tags/list
{"name":"node-exporter","tags":["v1.0.0"]}
~]# curl http://172.17.0.98:5000/v2/kube-rbac-proxy/tags/list
{"name":"kube-rbac-proxy","tags":["v0.4.1"]}
# Files
~]# ls node-exporter/
node-exporter-clusterRoleBinding.yaml  node-exporter-serviceAccount.yaml
node-exporter-clusterRole.yaml         node-exporter-serviceMonitor.yaml
node-exporter-daemonset.yaml           node-exporter-service.yaml
# Mind the image lines
~]# vim node-exporter/node-exporter-daemonset.yaml
27:  image: 172.17.0.98:5000/node-exporter:v1.0.0
57:  image: 172.17.0.98:5000/kube-rbac-proxy:v0.4.1
~]# vim node-exporter/node-exporter-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-exporter
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
      - args:
        - --web.listen-address=127.0.0.1:9100
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --no-collector.wifi
        - --no-collector.hwmon
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
        - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
        image: quay.io/prometheus/node-exporter:v1.0.0   # change to the private registry
        name: node-exporter
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: false
        - mountPath: /host/sys
          name: sys
          readOnly: false
        - mountPath: /host/root
          mountPropagation: HostToContainer
          name: root
          readOnly: true
      - args:
        - --logtostderr
        - --secure-listen-address=[$(IP)]:9100
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://127.0.0.1:9100/
        env:
        - name: IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1     # change to the private registry
        name: kube-rbac-proxy
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: https
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 10m
            memory: 20Mi
      hostNetwork: true
      hostPID: true
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: node-exporter
      tolerations:
      - operator: Exists
      volumes:
      - hostPath:
          path: /proc
        name: proc
      - hostPath:
          path: /sys
        name: sys
      - hostPath:
          path: /
        name: root
# Apply and check
~]# kubectl apply -f node-exporter/
~]# kubectl -n monitoring get pod
NAME                  READY   STATUS    RESTARTS   AGE
node-exporter-7h4l9   2/2     Running   0          7s
node-exporter-7vxmx   2/2     Running   0          7s
node-exporter-mr6lw   2/2     Running   0          7s
node-exporter-zg2j8   2/2     Running   0          7s
```
4. Installing the alerting plugin
1) Installing alertmanager
- The alert-handling hub of the Prometheus ecosystem
- Images used by alertmanager

```
~]# curl http://172.17.0.98:5000/v2/alertmanager/tags/list
{"name":"alertmanager","tags":["v0.18.0"]}
# Files
~]# ls alertmanager/
alertmanager-alertmanager.yaml    alertmanager-serviceMonitor.yaml
alertmanager-secret.yaml          alertmanager-service.yaml
alertmanager-serviceAccount.yaml
# Only the image needs changing
~]# vim alertmanager/alertmanager-alertmanager.yaml
09:  baseImage: 172.17.0.98:5000/alertmanager
18:  version: v0.18.0
~]# vim alertmanager-alertmanager.yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    alertmanager: main
  name: main
  namespace: monitoring
spec:
  baseImage: quay.io/prometheus/alertmanager   # change to the private registry
  nodeSelector:
    kubernetes.io/os: linux
  replicas: 3
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: alertmanager-main
  version: v0.18.0
# Create the resources and check
~]# kubectl apply -f alertmanager/
~]# kubectl -n monitoring get pod
NAME                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0   2/2     Running   0          16s
alertmanager-main-1   2/2     Running   0          16s
alertmanager-main-2   2/2     Running   0          16s
```
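The Alertmanager's routing configuration lives in `alertmanager-secret.yaml` as an encoded `alertmanager.yaml`. For orientation, a minimal sketch of what such a config looks like (the receiver name and webhook URL below are placeholders, not values from this deployment):

```yaml
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: 'default'                     # placeholder receiver name
receivers:
- name: 'default'
  webhook_configs:
  - url: 'http://example.local/alert'     # placeholder endpoint
```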
5. Installing the visualization plugin
Installing grafana
- Supports many kinds of graph and dashboard displays
- Images used by grafana

```
~]# curl http://172.17.0.98:5000/v2/grafana/tags/list
{"name":"grafana","tags":["6.4.3"]}
# Files
~]# ls grafana
grafana-dashboardDatasources.yaml  grafana-serviceAccount.yaml
grafana-dashboardDefinitions.yaml  grafana-serviceMonitor.yaml
grafana-dashboardSources.yaml      grafana-service.yaml
grafana-deployment.yaml
# Only the image needs changing
~]# vim grafana/grafana-deployment.yaml
19:  - image: 172.17.0.98:5000/grafana:6.4.3
~]# vim grafana/grafana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - image: grafana/grafana:6.4.3    # change to the private registry
        name: grafana
        ports:
        - containerPort: 3000
          name: http
        readinessProbe:
          httpGet:
            path: /api/health
            port: http
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-storage
          readOnly: false
        - mountPath: /etc/grafana/provisioning/datasources
          name: grafana-datasources
          readOnly: false
        - mountPath: /etc/grafana/provisioning/dashboards
          name: grafana-dashboards
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/apiserver
          name: grafana-dashboard-apiserver
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/cluster-total
          name: grafana-dashboard-cluster-total
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/controller-manager
          name: grafana-dashboard-controller-manager
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-cluster
          name: grafana-dashboard-k8s-resources-cluster
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-namespace
          name: grafana-dashboard-k8s-resources-namespace
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-node
          name: grafana-dashboard-k8s-resources-node
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-pod
          name: grafana-dashboard-k8s-resources-pod
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-workload
          name: grafana-dashboard-k8s-resources-workload
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/k8s-resources-workloads-namespace
          name: grafana-dashboard-k8s-resources-workloads-namespace
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/kubelet
          name: grafana-dashboard-kubelet
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/namespace-by-pod
          name: grafana-dashboard-namespace-by-pod
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/namespace-by-workload
          name: grafana-dashboard-namespace-by-workload
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/node-cluster-rsrc-use
          name: grafana-dashboard-node-cluster-rsrc-use
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/node-rsrc-use
          name: grafana-dashboard-node-rsrc-use
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/nodes
          name: grafana-dashboard-nodes
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/persistentvolumesusage
          name: grafana-dashboard-persistentvolumesusage
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/pod-total
          name: grafana-dashboard-pod-total
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/pods
          name: grafana-dashboard-pods
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/prometheus-remote-write
          name: grafana-dashboard-prometheus-remote-write
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/prometheus
          name: grafana-dashboard-prometheus
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/proxy
          name: grafana-dashboard-proxy
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/scheduler
          name: grafana-dashboard-scheduler
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/statefulset
          name: grafana-dashboard-statefulset
          readOnly: false
        - mountPath: /grafana-dashboard-definitions/0/workload-total
          name: grafana-dashboard-workload-total
          readOnly: false
      nodeSelector:
        beta.kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: grafana
      volumes:
      - emptyDir: {}
        name: grafana-storage
      - name: grafana-datasources
        secret:
          secretName: grafana-datasources
      - configMap:
          name: grafana-dashboards
        name: grafana-dashboards
      - configMap:
          name: grafana-dashboard-apiserver
        name: grafana-dashboard-apiserver
      - configMap:
          name: grafana-dashboard-cluster-total
        name: grafana-dashboard-cluster-total
      - configMap:
          name: grafana-dashboard-controller-manager
        name: grafana-dashboard-controller-manager
      - configMap:
          name: grafana-dashboard-k8s-resources-cluster
        name: grafana-dashboard-k8s-resources-cluster
      - configMap:
          name: grafana-dashboard-k8s-resources-namespace
        name: grafana-dashboard-k8s-resources-namespace
      - configMap:
          name: grafana-dashboard-k8s-resources-node
        name: grafana-dashboard-k8s-resources-node
      - configMap:
          name: grafana-dashboard-k8s-resources-pod
        name: grafana-dashboard-k8s-resources-pod
      - configMap:
          name: grafana-dashboard-k8s-resources-workload
        name: grafana-dashboard-k8s-resources-workload
      - configMap:
          name: grafana-dashboard-k8s-resources-workloads-namespace
        name: grafana-dashboard-k8s-resources-workloads-namespace
      - configMap:
          name: grafana-dashboard-kubelet
        name: grafana-dashboard-kubelet
      - configMap:
          name: grafana-dashboard-namespace-by-pod
        name: grafana-dashboard-namespace-by-pod
      - configMap:
          name: grafana-dashboard-namespace-by-workload
        name: grafana-dashboard-namespace-by-workload
      - configMap:
          name: grafana-dashboard-node-cluster-rsrc-use
        name: grafana-dashboard-node-cluster-rsrc-use
      - configMap:
          name: grafana-dashboard-node-rsrc-use
        name: grafana-dashboard-node-rsrc-use
      - configMap:
          name: grafana-dashboard-nodes
        name: grafana-dashboard-nodes
      - configMap:
          name: grafana-dashboard-persistentvolumesusage
        name: grafana-dashboard-persistentvolumesusage
      - configMap:
          name: grafana-dashboard-pod-total
        name: grafana-dashboard-pod-total
      - configMap:
          name: grafana-dashboard-pods
        name: grafana-dashboard-pods
      - configMap:
          name: grafana-dashboard-prometheus-remote-write
        name: grafana-dashboard-prometheus-remote-write
      - configMap:
          name: grafana-dashboard-prometheus
        name: grafana-dashboard-prometheus
      - configMap:
          name: grafana-dashboard-proxy
        name: grafana-dashboard-proxy
      - configMap:
          name: grafana-dashboard-scheduler
        name: grafana-dashboard-scheduler
      - configMap:
          name: grafana-dashboard-statefulset
        name: grafana-dashboard-statefulset
      - configMap:
          name: grafana-dashboard-workload-total
        name: grafana-dashboard-workload-total
# Install and check
~]# kubectl apply -f grafana/
~]# kubectl -n monitoring get pod
NAME                       READY   STATUS    RESTARTS   AGE
grafana-647d948b69-d2hv9   1/1     Running   0          19s
```
6. Publishing the service
By default the grafana Service uses a ClusterIP
- Publish it with a NodePort instead

```
# Option 1: edit the live object directly
~]# kubectl -n monitoring edit svc grafana
# Option 2: edit the resource file, then re-apply
~]# cp grafana/grafana-service.yaml ./
~]# vim grafana-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  type: NodePort          # newly added
  ports:
  - name: http
    port: 3000
    nodePort: 30000       # newly added node port
    targetPort: http
  selector:
    app: grafana
~]# kubectl apply -f grafana-service.yaml
~]# kubectl -n monitoring get service
NAME      TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)
grafana   NodePort   10.254.79.49   <none>        3000:30000/TCP
# Once published, the service can be reached directly via a node's elastic public IP
```

On first login Grafana accepts the default username/password (admin/admin) and then forces a password change.
7. Working in the UI
1) Connect the database

```
# Look up the database Service first; use the Service rather than a Pod,
# because a Pod's address changes whenever it restarts
~]# kubectl get service -n monitoring
.......
prometheus-k8s   ClusterIP   10.254.192.100   <none>   9090/TCP   3h4m
~]# curl http://prometheus-k8s:9090
```
2) Data-source address
The name is arbitrary (here: prometheus), but later dashboard imports must reference it.
For the URL, fill in the internal DNS name found above (kubectl get service -n monitoring).
Port: the default, 9090.
Note: the Service's domain name must be used here.
How access to prometheus flows internally:
A user hits the published port 30000, which maps to grafana's port 3000 inside the cluster; the grafana container then queries prometheus (inside the cluster, the Service domain name resolves directly).
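The same data source can also be provisioned from a file instead of the UI; that is what `grafana-dashboardDatasources.yaml` does in this stack. A minimal provisioning sketch using the Service DNS name above (illustrative, not the exact bundled file):

```yaml
apiVersion: 1
datasources:
- name: prometheus
  type: prometheus
  access: proxy              # the Grafana server makes the request in-cluster
  url: http://prometheus-k8s.monitoring.svc:9090
```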
Some dashboards are available out of the box.
3) Import a downloaded dashboard
Official downloads: Dashboards | Grafana Labs (grafana.com)
Note: the file to import is either of the following

```
~]# ls grafana-json
kubernetes-for-prometheus-dashboard-cn-v20201010_rev3.json
node-exporter-dashboard_rev1.json
```

Click the spot shown and enter the template ID
Import the file; the data source is the prometheus source defined just now
4) Adjust the displayed time range
The default is twelve hours (change it to 1 hour)