Kubernetes Resource Monitoring
1. metrics-server
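1.1 Overview
Metrics-Server is the aggregator of core monitoring data in a cluster and replaces the earlier Heapster.
Container metrics come mainly from the cAdvisor service built into the kubelet. With Metrics-Server in place, users can read this monitoring data through the standard Kubernetes API.
The Metrics API only serves current metric values; it does not store historical data.
The Metrics API is served under /apis/metrics.k8s.io/ and is maintained in k8s.io/metrics.
metrics-server must be deployed before this API can be used. It collects its data by calling the kubelet Summary API.
Metrics-server provides core metrics through the metrics.k8s.io API and reports only the CPU and memory usage of Nodes and Pods. Custom metrics are handled by other components such as Prometheus.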
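Because the Metrics API is served by the regular API server, it can be queried directly once metrics-server is running. The sketch below assumes the API is exposed as version v1beta1 and that kubectl is already configured for the cluster. The helper names metrics_path and metrics_get are invented for illustration.

```shell
# Build the Metrics API path for a resource kind ("nodes" or "pods").
# An optional second argument restricts pod metrics to one namespace.
# Assumption: the API is served as version v1beta1.
metrics_path() {
  local kind="$1" namespace="${2:-}"
  if [ -n "$namespace" ]; then
    printf '/apis/metrics.k8s.io/v1beta1/namespaces/%s/%s\n' "$namespace" "$kind"
  else
    printf '/apis/metrics.k8s.io/v1beta1/%s\n' "$kind"
  fi
}

# Fetch the raw JSON from the API server (needs a working kubectl context).
metrics_get() {
  kubectl get --raw "$(metrics_path "$@")"
}
```

For example, metrics_get nodes returns the current node metrics, and metrics_get pods kube-system returns the pod metrics of one namespace.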
1.2 Deployment
Download the deployment manifest:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Pull the image and push it to the Harbor registry.
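The pull, tag and push steps can be sketched as a small helper. The upstream image name k8s.gcr.io/metrics-server/metrics-server:v0.5.0 and the target project reg.westos.org/library are assumptions (the latter is borrowed from the Dashboard section of this document). The function names are hypothetical.

```shell
# Map an upstream image reference to the same name:tag under a private
# registry project, e.g.
#   k8s.gcr.io/metrics-server/metrics-server:v0.5.0
#   -> reg.westos.org/library/metrics-server:v0.5.0
mirror_ref() {
  local src="$1" registry="$2"
  # ${src##*/} keeps only the part after the last slash (name:tag).
  printf '%s/%s\n' "$registry" "${src##*/}"
}

# Pull the upstream image, retag it and push it to the private registry.
mirror_image() {
  local src="$1" dst
  dst="$(mirror_ref "$src" "$2")"
  docker pull "$src" && docker tag "$src" "$dst" && docker push "$dst"
}
```

A call such as mirror_image k8s.gcr.io/metrics-server/metrics-server:v0.5.0 reg.westos.org/library would then make the image available to the cluster nodes, provided docker is logged in to the registry.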
Edit the manifest so that the port settings shown below use 4443:
[root@server1 ~]# cat components.yaml
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    image: metrics-server:v0.5.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      httpGet:
        path: /livez
        port: https
        scheme: HTTPS
      periodSeconds: 10
    name: metrics-server
    ports:
    - containerPort: 4443
Apply the manifest. The metrics-server pod reaches Running but never becomes Ready:
[root@server1 ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
metrics-server-5567648887-gs6qw 0/1 Running 0 5m26s
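Fix the x509 certificate error. Enable serving-certificate bootstrap in the kubelet config on each node (typically by adding serverTLSBootstrap: true), restart the kubelet, then approve the resulting CSRs: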
[root@server1 ~]# vim /var/lib/kubelet/config.yaml
[root@server1 ~]# systemctl restart kubelet
[root@server1 ~]# kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-kwhxf 61s kubernetes.io/kubelet-serving system:node:server2 Pending
csr-m9bkn 3s kubernetes.io/kubelet-serving system:node:server4 Pending
[root@server1 ~]# kubectl certificate approve csr-kwhxf
certificatesigningrequest.certificates.k8s.io/csr-kwhxf approved
[root@server1 ~]# kubectl certificate approve csr-m9bkn
certificatesigningrequest.certificates.k8s.io/csr-m9bkn approved
[root@server1 ~]# kubectl get csr
NAME AGE SIGNERNAME REQUESTOR CONDITION
csr-kwhxf 85s kubernetes.io/kubelet-serving system:node:server2 Approved,Issued
csr-m9bkn 27s kubernetes.io/kubelet-serving system:node:server4 Approved,Issued
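With more than a few nodes, approving each CSR by hand gets tedious. The following is a minimal sketch that picks out the pending kubelet-serving requests from the kubectl get csr listing and approves them. The function names are hypothetical. It assumes the kubelet config change above is serverTLSBootstrap: true, which the text does not show explicitly.

```shell
# Assumption: each node's /var/lib/kubelet/config.yaml contains
#   serverTLSBootstrap: true
# so that every kubelet requests a serving certificate from the cluster CA.

# Read `kubectl get csr` output on stdin and print the names of CSRs that
# are still Pending and target the kubelet-serving signer.
pending_kubelet_csrs() {
  awk '$3 == "kubernetes.io/kubelet-serving" && $NF == "Pending" { print $1 }'
}

# Approve every pending kubelet-serving CSR in the cluster.
approve_pending_kubelet_csrs() {
  local name
  kubectl get csr | pending_kubelet_csrs | while read -r name; do
    kubectl certificate approve "$name"
  done
}
```

Approving in bulk skips the manual check of the REQUESTOR column, so it is only reasonable when every pending request is known to come from a trusted node.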
Fix metrics-server failing to resolve node names by adding the nodes to a hosts block in the CoreDNS configuration:
kubectl edit configmap coredns -n kube-system
hosts {
    172.25.16.1 server1
    172.25.16.2 server2
    172.25.16.4 server4
    fallthrough
}
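The hosts block can also be generated from the node list instead of being typed by hand. This is a sketch under the assumption that each node reports an InternalIP address. Both function names are made up for illustration. The generated block still has to be pasted inside the server block of the Corefile (the .:53 { ... } section) in the coredns ConfigMap.

```shell
# Print "IP NAME" for every node, one per line (needs a working kubectl).
node_addresses() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{" "}{.metadata.name}{"\n"}{end}'
}

# Read "IP NAME" pairs on stdin and print a CoreDNS hosts block.
# fallthrough passes names not listed here on to the next plugin.
hosts_block() {
  printf 'hosts {\n'
  while read -r ip name; do
    [ -n "$ip" ] && printf '    %s %s\n' "$ip" "$name"
  done
  printf '    fallthrough\n}\n'
}
```

Running node_addresses | hosts_block prints a block in the same shape as the one above.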
Get node metrics:
[root@server1 ~]# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
server1 161m 8% 1284Mi 66%
server2 79m 3% 482Mi 25%
server4 101m 5% 647Mi 34%
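The kubectl top output is plain text, so it is easy to post-process. As an illustration only, the hypothetical helper below lists nodes whose memory usage is above a given percentage. It assumes the five-column layout shown above.

```shell
# Read `kubectl top node` output on stdin and print the names of nodes whose
# MEMORY% (fifth column) is above the threshold given as the first argument.
nodes_over_memory() {
  awk -v limit="$1" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)            # strip the trailing percent sign
    if (pct + 0 > limit + 0) print $1
  }'
}
```

For example, kubectl top node | nodes_over_memory 50 would report only server1 for the figures above.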
Inspect the metrics-server Service:
[root@server1 ~]# kubectl -n kube-system describe svc metrics-server
Name: metrics-server
Namespace: kube-system
Labels: k8s-app=metrics-server
Annotations: <none>
Selector: k8s-app=metrics-server
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.102.154.210
IPs: 10.102.154.210
Port: https 443/TCP
TargetPort: https/TCP
Endpoints: 10.244.179.78:4443
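2. Dashboard Visual Monitoring
2.1 Overview
Dashboard gives users a web UI for viewing information about the cluster. With Kubernetes Dashboard, users can deploy containerized applications, monitor application status, troubleshoot problems, and manage Kubernetes resources.
2.2 Deployment
Pull the images and push them to the private registry: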
[root@server1 dash]# docker pull kubernetesui/dashboard:v2.3.1
[root@server1 dash]# docker pull kubernetesui/metrics-scraper:v1.0.6
[root@server1 dash]# docker tag kubernetesui/metrics-scraper:v1.0.6 reg.westos.org/library/metrics-scraper:v1.0.6
[root@server1 dash]# docker tag kubernetesui/dashboard:v2.3.1 reg.westos.org/library/dashboard:v2.3.1
[root@server1 dash]# docker push reg.westos.org/library/dashboard:v2.3.1
[root@server1 dash]# docker push reg.westos.org/library/metrics-scraper:v1.0.6
kubectl apply -f recommended.yaml
Check the deployment:
[root@server1 dash]# kubectl -n kubernetes-dashboard get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.101.84.145 <none> 8000/TCP 34s
kubernetes-dashboard ClusterIP 10.110.109.97 <none> 443/TCP 34s
[root@server1 dash]# kubectl -n kubernetes-dashboard get all
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-6875fdf695-lmshg 1/1 Running 0 41s
pod/kubernetes-dashboard-55c66865b7-j66xk 1/1 Running 0 41s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.101.84.145 <none> 8000/TCP 41s
service/kubernetes-dashboard ClusterIP 10.110.109.97 <none> 443/TCP 41s
Change the Service type to LoadBalancer:
[root@server1 dash]# kubectl -n kubernetes-dashboard edit svc kubernetes-dashboard
[root@server1 dash]# kubectl -n kubernetes-dashboard get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
dashboard-metrics-scraper ClusterIP 10.101.84.145 <none> 8000/TCP 8m2s
kubernetes-dashboard LoadBalancer 10.110.109.97 172.25.16.10 443:31222/TCP 8m2s
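The same change can be made without opening an editor by patching the Service. This is a sketch with hypothetical function names. Note that an EXTERNAL-IP such as 172.25.16.10 is only assigned when the cluster has a load-balancer implementation (on bare metal, something like MetalLB); otherwise the field stays pending.

```shell
# Print the patch document that sets a Service's type.
svc_type_patch() {
  printf '{"spec":{"type":"%s"}}\n' "$1"
}

# Apply the patch to the kubernetes-dashboard Service.
set_dashboard_svc_type() {
  kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard \
    -p "$(svc_type_patch "$1")"
}
```

Calling set_dashboard_svc_type LoadBalancer has the same effect as editing spec.type interactively.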
Get the token of the dashboard ServiceAccount:
kubectl -n kubernetes-dashboard describe secrets kubernetes-dashboard-token-srbdc
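The secret name ends in a random suffix (srbdc here), so it differs on every cluster. A minimal sketch for looking it up and printing only the token follows. It assumes a Kubernetes version that still auto-creates a token Secret for each ServiceAccount. The function names are invented for illustration.

```shell
# Secret data is base64-encoded; decode a value read from stdin.
decode_secret_value() {
  base64 -d
}

# Look up the token Secret of the kubernetes-dashboard ServiceAccount
# and print the decoded bearer token.
dashboard_token() {
  local ns=kubernetes-dashboard secret
  secret="$(kubectl -n "$ns" get sa kubernetes-dashboard -o jsonpath='{.secrets[0].name}')"
  kubectl -n "$ns" get secret "$secret" -o jsonpath='{.data.token}' | decode_secret_value
}
```

On Kubernetes 1.24 and later, token Secrets are no longer created automatically. There, kubectl -n kubernetes-dashboard create token kubernetes-dashboard issues a short-lived token instead.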
Authorization: by default the dashboard has no permission to operate on the cluster, so it must be granted access.
RBAC authorization:
[root@server1 dash]# cat rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
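The binding only takes effect after the file is applied. The sketch below applies it and then asks the API server whether the ServiceAccount may now do everything. It assumes rbac.yaml is in the current directory. The function names are hypothetical.

```shell
# Build the RBAC user name that identifies a ServiceAccount.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Apply the ClusterRoleBinding, then check the resulting permissions by
# impersonating the dashboard ServiceAccount. Prints "yes" or "no".
apply_and_check_dashboard_rbac() {
  kubectl apply -f rbac.yaml &&
    kubectl auth can-i '*' '*' \
      --as="$(sa_user kubernetes-dashboard kubernetes-dashboard)"
}
```

Binding cluster-admin gives the dashboard full control of the cluster. That is convenient in a lab, but a narrower role is the safer choice elsewhere.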
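Access the dashboard in a browser through the external IP of the Service (https://172.25.16.10 in this setup) and log in with the token.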