Installing kube-prometheus release-0.11 on Kubernetes v1.23.3
Node information
[root@cloud-1 kube-prometheus-release-0.11]
NAME      STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                           KERNEL-VERSION                 CONTAINER-RUNTIME
cloud-1   Ready    control-plane,master   39h   v1.23.3   172.17.4.40   <none>        Rocky Linux 8.7 (Green Obsidian)   4.18.0-425.10.1.el8_7.x86_64   containerd://1.6.19
Download kube-prometheus release-0.11
https://github.com/prometheus-operator/kube-prometheus
For the supported Kubernetes versions of each release, see the following document
https://github.com/prometheus-operator/kube-prometheus/blob/main/README.md
Deployment
kubectl apply --server-side -f manifests/setup
kubectl apply -f manifests/
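The upstream README places a wait step between these two commands, so that the CustomResourceDefinitions are registered before the resources that depend on them are created. A minimal sketch of that sequence follows; the function name deploy_kube_prometheus is hypothetical, and it assumes the current directory is the root of the kube-prometheus checkout.

```shell
# Hypothetical helper: apply the CRDs and namespace first, wait for the CRDs,
# then apply the remaining manifests.
deploy_kube_prometheus() {
  kubectl apply --server-side -f manifests/setup || return 1
  # Block until every CustomResourceDefinition reports the Established condition.
  kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring || return 1
  kubectl apply -f manifests/
}
```

Call it as deploy_kube_prometheus once kubectl points at the target cluster.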
Check the pod status
kubectl get pod -n monitoring
[root@cloud-1 kube-prometheus-release-0.11]
NAME                                  READY   STATUS              RESTARTS   AGE
alertmanager-main-0                   2/2     Running             0          15h
alertmanager-main-1                   2/2     Running             0          15h
alertmanager-main-2                   2/2     Running             0          15h
blackbox-exporter-746c64fd88-hb99r    3/3     Running             0          16h
grafana-5fc7f9f55d-977cj              1/1     Running             0          16h
kube-state-metrics-6c8846558c-hwvd2   2/3     ImagePullBackOff    0          16h
node-exporter-bqb7b                   2/2     Running             0          16h
prometheus-adapter-6455646bdc-9zncj   0/1     Running             0          16h
prometheus-adapter-b4f7dc9b6-p5m9x    0/1     ErrImageNeverPull   0          2m52s
prometheus-adapter-b4f7dc9b6-zmw6t    0/1     ErrImageNeverPull   0          2m51s
prometheus-k8s-0                      2/2     Running             0          15h
prometheus-k8s-1                      2/2     Running             0          15h
prometheus-operator-f59c8b954-lg9sm   2/2     Running             0          16h
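In a longer listing the problem pods are easy to miss. The sketch below filters the output of kubectl get pod down to pods that are not Running or whose READY count is incomplete. The function name unhealthy_pods is hypothetical, and it assumes the default column order (NAME, READY, STATUS).

```shell
# Hypothetical helper: read `kubectl get pod` output on stdin and print the
# name, READY count and status of every pod that is not fully ready.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column: ready containers / total containers
    if ($3 != "Running" || ready[1] != ready[2])
      print $1, $2, $3
  }'
}
```

Usage: kubectl get pod -n monitoring | unhealthy_pods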
kubectl -n monitoring describe pod prometheus-adapter-b4f7dc9b6-p5m9x
kubectl -n monitoring describe pod kube-state-metrics-6c8846558c-hwvd2
prometheus-adapter and kube-state-metrics fail to pull their images
kubectl -n monitoring describe pod kube-state-metrics-6c8846558c-hwvd2
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Warning  Failed   54m (x38 over 15h)      kubelet  Failed to pull image "k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.5.0": rpc error: code = Unknown desc = failed to pull and unpack image "k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.5.0": failed to resolve reference "k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.5.0": failed to do request: Head "https://k8s.gcr.io/v2/kube-state-metrics/kube-state-metrics/manifests/v2.5.0": dial tcp 64.233.188.82:443: i/o timeout
  Normal   Pulling  9m34s (x175 over 16h)   kubelet  Pulling image "k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.5.0"
  Normal   BackOff  4m35s (x3858 over 16h)  kubelet  Back-off pulling image "k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.5.0"
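The events show that the node cannot reach k8s.gcr.io. To see up front which images the manifests reference, and therefore which ones would need to be loaded by hand, something like the following may help. The function name list_manifest_images is hypothetical, and the sketch assumes each image is declared on its own "image:" line, as in the release manifests.

```shell
# Hypothetical helper: print the unique container images referenced by the
# YAML files under a directory (default: manifests).
list_manifest_images() {
  dir="${1:-manifests}"
  grep -rhE '^[[:space:]]*(-[[:space:]]*)?image:' "$dir" \
    | sed -E 's/^[[:space:]]*(-[[:space:]]*)?image:[[:space:]]*//' \
    | tr -d "\"'" \
    | sort -u
}
```

Usage: list_manifest_images manifests | grep k8s.gcr.io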
The images that cannot be pulled directly have been uploaded here
Link: https://pan.baidu.com/s/1cKeDTc7hc0ozIKkhEmYSig
Extraction code: opxf
Import the images
ctr -n=k8s.io image import /root/kube-state-metricsv2.5.0
ctr -n=k8s.io image import /root/prometheus-adapterv0.9.1
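The images have to land in containerd's k8s.io namespace, because that is the namespace the kubelet reads. The sketch below imports several archives in one go and then lists the stored image references so the result can be checked. The function name import_images is hypothetical. On a multi-node cluster the import would presumably have to be repeated on every node that can schedule these pods.

```shell
# Hypothetical helper: import each image archive into the k8s.io namespace,
# stop at the first failure, then print the image references now present.
import_images() {
  for tarball in "$@"; do
    ctr -n k8s.io images import "$tarball" || return 1
  done
  ctr -n k8s.io images ls -q
}
```

Usage: import_images /root/kube-state-metricsv2.5.0 /root/prometheus-adapterv0.9.1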
[root@cloud-1 kube-prometheus-release-0.11]
NAME                                  READY   STATUS    RESTARTS   AGE
alertmanager-main-0                   2/2     Running   0          16h
alertmanager-main-1                   2/2     Running   0          16h
alertmanager-main-2                   2/2     Running   0          16h
blackbox-exporter-746c64fd88-hb99r    3/3     Running   0          16h
grafana-5fc7f9f55d-977cj              1/1     Running   0          16h
kube-state-metrics-59c8b66df6-dg76g   3/3     Running   0          21m
node-exporter-bqb7b                   2/2     Running   0          16h
prometheus-adapter-b4f7dc9b6-p5m9x    1/1     Running   0          35m
prometheus-adapter-b4f7dc9b6-zmw6t    1/1     Running   0          35m
prometheus-k8s-0                      2/2     Running   0          16h
prometheus-k8s-1                      2/2     Running   0          16h
prometheus-operator-f59c8b954-lg9sm   2/2     Running   0          16h
Change the Service type
kubectl -n monitoring edit svc grafana
kubectl -n monitoring edit svc prometheus-k8s
In each Service, change type to NodePort
type: NodePort
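The same change can be made without opening an editor. A minimal sketch using kubectl patch, which also prints the node port(s) Kubernetes allocated; the function name expose_nodeport is hypothetical.

```shell
# Hypothetical helper: switch a Service in the monitoring namespace to
# NodePort, then print the allocated node port(s).
expose_nodeport() {
  svc="$1"
  kubectl -n monitoring patch svc "$svc" -p '{"spec":{"type":"NodePort"}}' || return 1
  kubectl -n monitoring get svc "$svc" -o jsonpath='{.spec.ports[*].nodePort}'
}
```

Usage: expose_nodeport grafana, then expose_nodeport prometheus-k8s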
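Allow ingress through the network policies (release-0.11 ships NetworkPolicy objects that otherwise block access through the NodePort)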
cat network-policy-allow-all-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  namespace: monitoring
  name: allow-all-ingress
spec:
  podSelector: {}
  ingress:
  - {}
  policyTypes:
  - Ingress
kubectl apply -f network-policy-allow-all-ingress.yaml
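Once the policy is applied, the services should answer on the node IP. The sketch below probes a health endpoint with curl. The function name check_endpoint is hypothetical, and the node ports are placeholders to be replaced with the values shown by kubectl -n monitoring get svc. Grafana exposes /api/health and Prometheus exposes /-/healthy.

```shell
# Hypothetical helper: probe a URL and report whether it answers with a
# successful HTTP status within five seconds.
check_endpoint() {
  url="$1"
  if curl -fsS --max-time 5 -o /dev/null "$url"; then
    echo "OK   $url"
  else
    echo "FAIL $url"
    return 1
  fi
}
```

Usage: check_endpoint http://172.17.4.40:<grafana-nodeport>/api/health and check_endpoint http://172.17.4.40:<prometheus-nodeport>/-/healthy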