Introduction:
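When Prometheus scrapes the kubelet, it reports the following 403 error:
or a 401 error.
There are two causes. kube-controller-manager and kube-scheduler listen on 127.0.0.1, so requests from any other IP are rejected; changing that address to 0.0.0.0 lets Prometheus reach them. The kubelet, for its part, needs token webhook authentication enabled so that it accepts the ServiceAccount token Prometheus presents.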
Change steps:
- Back up the following files on the master node
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml
/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
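The backup step could be sketched as follows. This is only an illustration: `backup_file` and `BACKUP_DIR` are hypothetical names, and the backup directory is an assumption. The copies are written outside /etc/kubernetes/manifests because, as far as I know, the kubelet treats every non-hidden file in that directory as a static pod manifest, so a backup left next to the original could start a duplicate pod.

```shell
#!/bin/bash
# Hypothetical backup directory (an assumption, not from the original article).
BACKUP_DIR="${BACKUP_DIR:-/root/k8s-conf-backup}"

# Hypothetical helper: copy one file to $BACKUP_DIR/<name>.<YYYYMMDD>,
# preserving mode and timestamps. Missing files are skipped with a note.
backup_file() {
    local src="$1"
    if [ -f "$src" ]; then
        mkdir -p "$BACKUP_DIR"
        cp -p "$src" "$BACKUP_DIR/$(basename "$src").$(date +%Y%m%d)"
    else
        echo "skip: $src not found" >&2
    fi
}

for f in \
    /etc/kubernetes/manifests/kube-controller-manager.yaml \
    /etc/kubernetes/manifests/kube-scheduler.yaml \
    /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
do
    backup_file "$f"
done
```

To roll back, copy the dated file over the original path and, for the kubelet drop-in, run `systemctl daemon-reload` and `systemctl restart kubelet` again.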
- Change the listen address. After a yaml file is modified, the corresponding pod restarts automatically
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -e "s/- --address=127.0.0.1/- --address=0.0.0.0/" -i /etc/kubernetes/manifests/kube-scheduler.yaml
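To confirm that the substitution matched, one option is to print the `--address` flag from each manifest afterwards. The sketch below assumes the manifests use the `--address` flag that the sed commands target; `show_bind_address` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the --address flag found in a manifest,
# or a note if the flag is absent (in which case sed changed nothing).
show_bind_address() {
    local manifest="$1"
    grep -o -- '--address=[^[:space:]]*' "$manifest" \
        || echo "no --address flag in $manifest"
}

for m in \
    /etc/kubernetes/manifests/kube-controller-manager.yaml \
    /etc/kubernetes/manifests/kube-scheduler.yaml
do
    # Only inspect manifests that exist on this machine.
    if [ -f "$m" ]; then
        echo "$m: $(show_bind_address "$m")"
    fi
done
```

If a manifest still shows 127.0.0.1, or has no `--address` flag at all, the sed pattern did not match and the file needs to be checked by hand.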
- Modify the 10-kubeadm.conf file on the master node
KUBEADM_SYSTEMD_CONF=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
sed -e "/cadvisor-port=0/d" -i "$KUBEADM_SYSTEMD_CONF"
if ! grep -q "authentication-token-webhook=true" "$KUBEADM_SYSTEMD_CONF"; then
    sed -e "s/--authorization-mode=Webhook/--authentication-token-webhook=true --authorization-mode=Webhook/" -i "$KUBEADM_SYSTEMD_CONF"
fi
systemctl daemon-reload
systemctl restart kubelet
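- Modify the 10-kubeadm.conf file on all worker nodes
a. Log in to the worker node as root
b. cd /root/ansible/update_kubeadm_conf
c. ansible-playbook -i ../nodes.inventory update_kubeadm_conf_for_nodes_without_master.yaml
There are too many nodes to edit one by one, so ansible does the replacement in bulk. To do it by hand instead, run the following bash script on each node: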
#!/bin/bash
KUBEADM_SYSTEMD_CONF=/etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Keep a dated backup before editing.
cp -p "$KUBEADM_SYSTEMD_CONF" "$KUBEADM_SYSTEMD_CONF.$(date +%Y%m%d)"
# Delete any line that sets cadvisor-port=0.
sed -e "/cadvisor-port=0/d" -i "$KUBEADM_SYSTEMD_CONF"
# Add the token webhook flag only if it is not already present, so the script can be re-run safely.
if ! grep -q "authentication-token-webhook=true" "$KUBEADM_SYSTEMD_CONF"; then
    sed -e "s/--authorization-mode=Webhook/--authentication-token-webhook=true --authorization-mode=Webhook/" -i "$KUBEADM_SYSTEMD_CONF"
fi
# Reload the unit files and restart the kubelet to apply the change.
systemctl daemon-reload
systemctl restart kubelet
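Before restarting many nodes, it may help to check that the drop-in ended up in the expected state. The following is a minimal sketch, not part of the original procedure: `kubelet_conf_ok` and the `CONF` variable are hypothetical names, and the check only inspects the file's text, not the running kubelet.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the drop-in contains both flags
# and no remaining cadvisor-port=0 setting.
kubelet_conf_ok() {
    local conf="$1"
    grep -q -- "--authentication-token-webhook=true" "$conf" || return 1
    grep -q -- "--authorization-mode=Webhook" "$conf" || return 1
    # Succeeds only when cadvisor-port=0 no longer appears in the file.
    ! grep -q "cadvisor-port=0" "$conf"
}

CONF="${CONF:-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf}"
# Only report when the drop-in exists on this machine.
if [ -f "$CONF" ]; then
    if kubelet_conf_ok "$CONF"; then
        echo "OK: $CONF"
    else
        echo "NOT PATCHED: $CONF"
    fi
fi
```

A node that prints NOT PATCHED most likely had no `--authorization-mode=Webhook` string for sed to match, so its drop-in should be reviewed manually.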
The fix is complete.
References:
https://github.com/coreos/prometheus-operator/blob/master/Documentation/troubleshooting.md
https://github.com/coreos/prometheus-operator/issues/976