Container metrics reported by kubelet keep inexplicably missing label values such as image, pod, and name


Container monitoring metrics in a k8s cluster have labels but no values

Symptoms

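The values of the following labels cannot be retrieved: container, image, pod, name.
Everything was fine right after the cluster was set up, but after it had been running for a while, one label value after another went missing.
The container metrics from kubelet are incomplete.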
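
One way to confirm the symptom is to count the affected samples directly. The sketch below assumes the metrics are available in Prometheus text format on stdin; the function name `count_empty_label` is hypothetical. Series for plain cgroups (for example the root cgroup) legitimately carry empty labels, so a nonzero count is only a hint, not proof.

```shell
# Hypothetical helper: count container_* samples whose label ($1) has an
# empty value. Reads Prometheus text-format metrics from stdin.
count_empty_label() {
  label="$1"
  # Keep only container_* sample lines, then count those where the label
  # appears as label="" (preceded by "{" or "," so that e.g. "name" does
  # not match "namespace" or "container_name").
  grep -E '^container_[a-z_]+\{' \
    | grep -c -E "[{,]${label}=\"\"" || true
}

# Example usage (assumes kubectl access; m1 is the node name seen in the log):
#   kubectl get --raw "/api/v1/nodes/m1/proxy/metrics/cadvisor" | count_empty_label image
```

Running it for each of container, image, pod, and name before and after the fix gives a rough measure of whether the label values have come back.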

Root cause investigation

Since these metrics are reported by kubelet, I checked the kubelet log on the affected node and found that kubelet was logging errors continuously.
On CentOS:

cat /var/log/messages  |grep "Failed"

Jul 19 17:37:51 m1 kubelet: E0719 17:37:51.724736     869 manager.go:1084] Failed to create existing container: /kubepods/besteffort/podcbdefb29-9b15-4be8-9b13-164b57549227/a1de8a3dc42f0ca1bbd965c9c7a3aaadc4a5accafa9959527e21acd12313db6c: failed to identify the read-write layer ID for container "a1de8a3dc42f0ca1bbd965c9c7a3aaadc4a5accafa9959527e21acd12313db6c". - open /var/lib/docker/image/overlay2/layerdb/mounts/a1de8a3dc42f0ca1bbd965c9c7a3aaadc4a5accafa9959527e21acd12313db6c/mount-id: no such file or directory
Jul 19 17:37:51 m1 kubelet: E0719 17:37:51.727490     869 manager.go:1084] Failed to create existing container: /kubepods/besteffort/podd48732a1-52f4-4a39-be7a-eaa9f174fbb9/582d3f2cfdcfa16797515991302c1509648d1e95946ea9a12fbf5f75cc2488ca: failed to identify the read-write layer ID for container "582d3f2cfdcfa16797515991302c1509648d1e95946ea9a12fbf5f75cc2488ca". - open /var/lib/docker/image/overlay2/layerdb/mounts/582d3f2cfdcfa16797515991302c1509648d1e95946ea9a12fbf5f75cc2488ca/mount-id: no such file or directory
Jul 19 17:37:51 m1 kubelet: E0719 17:37:51.729932     869 manager.go:1084] Failed to create existing container: /kubepods/besteffort/pode39a19ee6224e5d144913aec7b4d2615/619e8652e8228663b821e7a2babf22811931227868f3e0b9fe88cde146e07117: failed to identify the read-write layer ID for container "619e8652e8228663b821e7a2babf22811931227868f3e0b9fe88cde146e07117". - open /var/lib/docker/image/overlay2/layerdb/mounts/619e8652e8228663b821e7a2babf22811931227868f3e0b9fe88cde146e07117/mount-id: no such file or directory
Jul 19 17:37:51 m1 kubelet: E0719 17:37:51.732715     869 manager.go:1084] Failed to create existing container: /kubepods/burstable/pod869271af-b3b8-4188-9839-dd3eb41e892f/30b6f17363a39c0e2518b23ad2ad6eed673cc86dd444fcd51dcbe4c8e1d38b56: failed to identify the read-write layer ID for container "30b6f17363a39c0e2518b23ad2ad6eed673cc86dd444fcd51dcbe4c8e1d38b56". - open /var/lib/docker/image/overlay2/layerdb/mounts/30b6f17363a39c0e2518b23ad2ad6eed673cc86dd444fcd51dcbe4c8e1d38b56/mount-id: no such file or directory
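
To see how many containers are affected, the IDs can be pulled out of these error lines. This is a minimal sketch that assumes the log format shown above; the function name `failed_container_ids` is hypothetical.

```shell
# Hypothetical helper: print the unique container IDs named in
# "failed to identify the read-write layer ID" errors read from stdin.
failed_container_ids() {
  grep 'failed to identify the read-write layer ID' \
    | sed -n 's/.*layerdb\/mounts\/\([0-9a-f]\{64\}\)\/mount-id.*/\1/p' \
    | sort -u
}

# Example usage:
#   failed_container_ids < /var/log/messages
#   docker ps -q --no-trunc    # compare against the running containers
```

If the IDs belong to containers that are still running, their mount-id files presumably exist somewhere, which suggests kubelet is looking under the wrong directory rather than at stale containers.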

Because kubelet obtains its monitoring metrics from cAdvisor, this error is related to cAdvisor. Sure enough, the cadvisor project has a similar issue:
cAdvisor occasionally gets into a state where it has no container metadata

Solution

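Restarting docker, as suggested in the cadvisor issue, does not fix the problem for good; it comes back.
Adding the --docker-root flag to kubelet resolved it.
The default value of docker-root is /var/lib/docker. That default did not work in this environment because our company's docker directory is not under /var/lib/docker but in a different location. After setting the flag to the correct directory and restarting kubelet, the problem was resolved.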
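
The check described above can be sketched as follows. Docker reports its data directory through `docker info --format '{{.DockerRootDir}}'`; the helper name `docker_root_flag` is hypothetical and only turns that value into the flag kubelet would need.

```shell
# Hypothetical helper: given Docker's actual root directory ($1), print the
# kubelet flag that would be needed, or nothing if the root is empty or
# already equals the default /var/lib/docker.
docker_root_flag() {
  actual="${1%/}"   # drop a trailing slash, if any
  if [ -n "$actual" ] && [ "$actual" != "/var/lib/docker" ]; then
    printf -- '--docker-root=%s\n' "$actual"
  fi
}

# Example usage on the affected node:
#   docker_root_flag "$(docker info --format '{{.DockerRootDir}}')"
```

No output means the default already matches and the missing labels probably have a different cause.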

--docker-root string                                                                                       DEPRECATED: docker root is read from docker info (this is a fallback, default: /var/lib/docker) (default "/var/lib/docker")

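1. Modify the kubelet startup parameters
The k8s components are managed by systemctl, so the kubelet configuration can be found under /etc/systemd/system.
Append --docker-root=/var/lib/it/docker to the ExecStart line in kubelet.service.
2. After the change, reload the systemd unit files and restart kubelet

systemctl daemon-reload
systemctl restart kubelet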

The modified kubelet.service

[root@n1 system]# cat kubelet.service 
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=http://kubernetes.io/docs/

[Service]
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.51.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests"
Environment="KUBELET_PRIVILEGED_ARGS="
Environment="KUBELET_RUNTIME_ARGS=--pod-infra-container-image=docker.io/kubernetes/pause:go"
Environment="KUBELET_CUSTOM_ARGS="

ExecStart=/usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --register-node $KUBELET_PRIVILEGED_ARGS $KUBELET_RUNTIME_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_DNS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_CUSTOM_ARGS --docker-root=/var/lib/it/docker
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target
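
As a quick sanity check on the edited unit file, the flag value can be read back and compared with Docker's actual root directory. This is a sketch only; the function name `unit_docker_root` is hypothetical, and it assumes the flag is written in the `--docker-root=VALUE` form on the ExecStart line, as in the file above.

```shell
# Hypothetical helper: print the --docker-root value found on the ExecStart
# line of a unit file ($1); prints nothing if the flag is absent.
unit_docker_root() {
  grep '^ExecStart=' "$1" \
    | sed -n 's/.*--docker-root=\([^[:space:]]*\).*/\1/p'
}

# Example usage:
#   unit_docker_root /etc/systemd/system/kubelet.service
#   docker info --format '{{.DockerRootDir}}'   # the two values should match
```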