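1. Introduction to kube-state-metrics
kube-state-metrics listens to the API Server and generates metrics about the state of resource objects such as Deployments, Nodes, and Pods. Note that kube-state-metrics only exposes these metrics and does not store them, so Prometheus can be used to scrape and store the data. It focuses on workload-related metadata such as the state of Deployments, Pods, and replicas: How many replicas were scheduled? How many are currently available? How many Pods are in the running/stopped/terminated state? How many times has a Pod restarted? How many Jobs are currently running?
Project page: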
https://github.com/kubernetes/kube-state-metrics
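As a rough illustration of how one of the questions above ("how many Pods are running?") maps onto the exposed data, the sketch below counts kube_pod_status_phase samples in Prometheus text-format output. The function name count_pods_in_phase is hypothetical, and the usage line assumes the /metrics endpoint has been made reachable on localhost:8080 (for example through a port-forward).

```shell
# count_pods_in_phase PHASE
# Reads Prometheus text-format metrics on stdin and prints how many
# kube_pod_status_phase samples for PHASE have the value 1.
# (Hypothetical helper, not part of kube-state-metrics.)
count_pods_in_phase() {
  awk -v phase="$1" '
    index($0, "kube_pod_status_phase{") == 1 &&
    index($0, "phase=\"" phase "\"") > 0 &&
    $NF == 1 { n++ }
    END { print n + 0 }
  '
}

# Usage (assumes the metrics endpoint is reachable on localhost:8080):
# curl -s http://localhost:8080/metrics | count_pods_in_phase Running
```

In practice Prometheus answers this kind of question with a query; the shell version only shows what the raw samples look like.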
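2. Deploying kube-state-metrics
Upload the image archive kube-state-metrics_1_9_0.tar.gz to the k8s-master and k8s-node nodes, then load the image on each node:
docker load -i kube-state-metrics_1_9_0.tar.gz
On the k8s-master node, create the following YAML files: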
cat > kube-state-metrics-deploy.yaml <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        # image: gcr.io/google_containers/kube-state-metrics-amd64:v1.3.1
        image: quay.io/coreos/kube-state-metrics:v1.9.0
        ports:
        - containerPort: 8080
EOF
cat > kube-state-metrics-rbac.yaml <<EOF
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
  verbs: ["list", "watch"]
- apiGroups: ["extensions"]
  resources: ["daemonsets", "deployments", "replicasets"]
  verbs: ["list", "watch"]
# kube-state-metrics v1.9.0 reads DaemonSets, Deployments and ReplicaSets
# through the apps API group, so they are granted here as well.
- apiGroups: ["apps"]
  resources: ["statefulsets", "daemonsets", "deployments", "replicasets"]
  verbs: ["list", "watch"]
- apiGroups: ["batch"]
  resources: ["cronjobs", "jobs"]
  verbs: ["list", "watch"]
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: kube-system
EOF
cat > kube-state-metrics-svc.yaml <<EOF
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  ports:
  - name: kube-state-metrics
    port: 8080
    protocol: TCP
  selector:
    app: kube-state-metrics
EOF
kubectl apply -f kube-state-metrics-rbac.yaml
kubectl apply -f kube-state-metrics-deploy.yaml
kubectl apply -f kube-state-metrics-svc.yaml
Run kubectl get pods -n kube-system; output like the following indicates that the deployment succeeded:
kube-state-metrics-588f4fdfdb-b78fs 1/1 Running 0 60s
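The manual check above can be scripted. The following is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...); the function name ksm_pod_ready is hypothetical.

```shell
# ksm_pod_ready
# Reads `kubectl get pods` output on stdin and succeeds (exit 0) if at
# least one kube-state-metrics pod is Running with all containers ready.
# (Hypothetical helper; assumes the default kubectl column layout.)
ksm_pod_ready() {
  awk '
    $1 ~ /^kube-state-metrics-/ {
      split($2, ready, "/")            # READY column, e.g. "1/1"
      if ($3 == "Running" && ready[1] == ready[2]) ok = 1
    }
    END { exit ok ? 0 : 1 }
  '
}

# Usage:
# kubectl get pods -n kube-system | ksm_pod_ready && echo "kube-state-metrics is up"
```

kubectl rollout status deployment/kube-state-metrics -n kube-system is an alternative that blocks until the Deployment has finished rolling out.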
The dashboard Kubernetes Cluster (Prometheus)-1577674936972.json can be imported into Grafana.
You can also test a query on the Prometheus Graph page.
For example, plot the memory requests of running Pods:
sum(kube_pod_container_resource_requests_memory_bytes) by (namespace, pod, node)
* on (namespace, pod) group_left() (sum(kube_pod_status_phase{phase="Running"}) by (pod, namespace) == 1)
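A query can also be run from the command line through the Prometheus HTTP API instead of the Graph page. The sketch below is one way to do that; the function name prom_query is hypothetical, and the Prometheus address in the usage line is an assumption that has to be replaced with the address of your own server.

```shell
# prom_query BASE_URL QUERY
# Sends an instant query to the Prometheus HTTP API (/api/v1/query).
# -G makes curl append the --data-urlencode payload to the URL as a query
# string, so the PromQL expression does not need to be escaped by hand.
# (Hypothetical helper.)
prom_query() {
  curl -sG "$1/api/v1/query" --data-urlencode "query=$2"
}

# Usage (http://localhost:9090 is an assumed Prometheus address):
# prom_query http://localhost:9090 'sum(kube_pod_status_phase{phase="Running"})'
```

The response is JSON; the samples are found under data.result.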
3. Metrics exposed by kube-state-metrics
(1) The metric categories of kube-state-metrics include:
Job Metrics
Node Metrics
Pod Metrics
Service Metrics
Namespace Metrics
Endpoint Metrics
(2) Taking Pods as an example:
kube_pod_container_info
kube_pod_container_resource_limits
kube_pod_container_resource_limits_cpu_cores
kube_pod_container_resource_limits_memory_bytes
kube_pod_container_resource_requests
kube_pod_container_resource_requests_cpu_cores
kube_pod_container_resource_requests_memory_bytes
kube_pod_container_status_ready
kube_pod_container_status_restarts_total
kube_pod_container_status_running
kube_pod_container_status_terminated
kube_pod_container_status_terminated_reason
kube_pod_container_status_waiting
kube_pod_container_status_waiting_reason
kube_pod_created
kube_pod_info
kube_pod_init_container_info
kube_pod_init_container_status_last_terminated_reason
kube_pod_init_container_status_ready
kube_pod_init_container_status_restarts_total
kube_pod_init_container_status_running
kube_pod_init_container_status_terminated
kube_pod_init_container_status_terminated_reason
kube_pod_init_container_status_waiting
kube_pod_init_container_status_waiting_reason
kube_pod_labels
kube_pod_owner
kube_pod_restart_policy
kube_pod_start_time
kube_pod_status_phase
kube_pod_status_ready
kube_pod_status_scheduled
kube_pod_status_scheduled_time
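To show how one of the Pod metrics listed above can be read, the sketch below sums kube_pod_container_status_restarts_total per Pod from Prometheus text-format output. The function name restarts_by_pod is hypothetical, and the parsing assumes each sample line carries namespace and pod labels.

```shell
# restarts_by_pod
# Reads Prometheus text-format metrics on stdin and prints one line per
# Pod: "<namespace>/<pod> <total restarts across its containers>".
# (Hypothetical helper; assumes namespace and pod labels on every sample.)
restarts_by_pod() {
  awk '
    index($0, "kube_pod_container_status_restarts_total{") == 1 {
      ns = $0;  sub(/.*namespace="/, "", ns);  sub(/".*/, "", ns)
      pod = $0; sub(/.*[{,]pod="/, "", pod);   sub(/".*/, "", pod)
      total[ns "/" pod] += $NF         # last field is the sample value
    }
    END { for (k in total) print k, total[k] }
  ' | sort
}

# Usage (assumes the metrics endpoint is reachable on localhost:8080):
# curl -s http://localhost:8080/metrics | restarts_by_pod
```

Inside Prometheus the same aggregation would be expressed as a sum over the metric grouped by namespace and pod.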
The Service defined in kube-state-metrics-svc.yaml carries the annotation prometheus.io/scrape: 'true', so Prometheus automatically discovers kube-state-metrics under its kubernetes-service-endpoints job and starts scraping the metrics; no further configuration is needed.
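Automatic discovery only works if the Prometheus configuration already contains a job that selects Services by this annotation. The original Prometheus configuration is not shown here, so the following is only a sketch of what such a job commonly looks like. The file name prometheus-scrape-sketch.yaml is hypothetical, and the relabeling is reduced to the single rule that matters for the annotation.

```shell
# Writes a minimal example scrape job to a hypothetical file.
# Assumption: this mirrors a typical kubernetes-service-endpoints job;
# your actual Prometheus configuration may contain more relabel rules.
cat > prometheus-scrape-sketch.yaml <<'EOF'
scrape_configs:
- job_name: kubernetes-service-endpoints
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Keep only endpoints whose Service is annotated prometheus.io/scrape: 'true'
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
EOF
```

With a rule like this, any Service carrying the annotation is scraped, which is why kube-state-metrics needs no job of its own.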