kubelet keeps restarting and leaves the node NotReady

Problem: kubectl get node shows one node in the NotReady state.

SSH into that node:

1. Check the kubelet status: kubelet is restarting over and over.

2. Check the kubelet logs: kubelet restarts every time this error appears: "Failed to start cAdvisor inotify_add_watch /sys/fs/cgroup/cpuset/kubepods/podxxx: no space left on device".
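The two checks above can be sketched as a small helper. This is a minimal sketch, assuming kubelet runs as a systemd unit named kubelet (an assumption, not stated in the post); count_inotify_failures is a hypothetical helper name that counts matching log lines read from stdin.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that report the inotify watch failure.
# Reads log text on stdin and prints the number of matching lines (0 if none).
count_inotify_failures() {
    grep -c 'inotify_add_watch.*no space left on device' || true
}

# Usage on the affected node (assumes a systemd unit called "kubelet"):
#   systemctl status kubelet --no-pager
#   journalctl -u kubelet --no-pager --since "1 hour ago" | count_inotify_failures
```

A non-zero count together with a restarting unit points at the inotify limit rather than at disk space.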

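Solution:

The "no space left on device" error comes from inotify_add_watch hitting the per-user inotify watch limit, not from a full disk. Raise the maximum number of inotify watches to 40000 (it must be greater than 32678):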

sysctl fs.inotify.max_user_watches=40000
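Before and after changing the value, it can help to confirm what the kernel currently allows. The following is a minimal sketch: check_watch_limit is a hypothetical helper, and the file name 99-inotify.conf in the usage comment is an assumption chosen for illustration. The second argument exists only so the function can be pointed at a test file.

```shell
#!/bin/sh
# Hypothetical helper: compare the current inotify watch limit with a desired minimum.
# $1: desired minimum
# $2: file holding the current limit (defaults to the real /proc path)
check_watch_limit() {
    want=$1
    file=${2:-/proc/sys/fs/inotify/max_user_watches}
    have=$(cat "$file") || return 2
    if [ "$have" -ge "$want" ]; then
        echo "ok: $have >= $want"
    else
        echo "too low: $have < $want"
        return 1
    fi
}

# Usage:
#   check_watch_limit 40000 || sudo sysctl -w fs.inotify.max_user_watches=40000
# To keep the value across reboots (file name is an assumption):
#   echo 'fs.inotify.max_user_watches=40000' | sudo tee /etc/sysctl.d/99-inotify.conf
#   sudo sysctl --system
```

A value set with sysctl on the command line alone is lost on reboot, which is why the usage comment also writes it to a sysctl configuration file.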

Check how many inotify watches are in use:

Script 1

#!/bin/sh

# Get the procs sorted by the number of inotify watchers
# @author Carl-Erik Kopseng
# @latest https://github.com/fatso83/dotfiles/blob/master/utils/scripts/inotify-consumers
# Discussion leading up to answer: https://unix.stackexchange.com/questions/15509/whos-consuming-my-inotify-resources


usage(){
    cat << EOF
Usage: $0 [--help|--limits]
    -l, --limits    Will print the current related limits and how to change them
    -h, --help      Show this help
EOF
}

limits(){
    printf '\nCurrent limits\n-------------\n'
    sysctl fs.inotify.max_user_instances  fs.inotify.max_user_watches

    cat <<- EOF 
Changing settings permanently
-----------------------------
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
sudo sysctl -p # re-read config
EOF
}

if [ "$1" = "--limits" -o "$1" = "-l" ]; then
    limits
    exit 0
fi

if [ "$1" = "--help" -o "$1" = "-h" ]; then
    usage 
    exit 0
fi

if [ -n "$1" ]; then
    printf "\nUnknown parameter '%s'\n\n" "$1" >&2
    usage
    exit 1
fi


generateRawData(){
    # From `man find`: 
    #    %h     Leading directories of file's name (all but the last element).  If the file name contains no slashes  (since  it
    #           is in the current directory) the %h specifier expands to `.'.
    #    %f     File's name with any leading directories removed (only the last element).
    #
    find /proc/*/fd \
    -lname anon_inode:inotify \
    -printf '%hinfo/%f\n' 2>/dev/null \
    \
    | xargs grep -c '^inotify'  \
    | sort -n -t: -k2 -r 
}

printf "\n%10s\n" "INOTIFY"
printf "%10s\n" "WATCHER"
printf "%10s  %5s     %s\n" " COUNT " "PID" "CMD"
printf -- "----------------------------------------\n"

IFS=''; # keep `read` from splitting on whitespace so whole lines are preserved
generateRawData | while read -r line; do
    watcher_count=$(echo "$line" | sed -e 's/.*://')
    pid=$(echo "$line" | sed -e 's/\/proc\/\([0-9]*\)\/.*/\1/')
    cmdline=$(ps --columns 120 -o command -h -p "$pid")
    printf "%8d  %7d  %s\n" "$watcher_count" "$pid" "$cmdline"
done

Script 2

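#!/usr/bin/env bash
# List inotify file descriptors held by root-owned processes and count the watches on each.
set -o errexit
set -o pipefail
lsof +c 0 -n -P -u root \
| awk '/inotify$/ { gsub(/[urw]$/,"",$4); print $1" "$2" "$4 }' \
| while read -r name pid fd; do
    # Resolve the executable; fall back to n/a if the link cannot be read
    exe="$(readlink -f "/proc/$pid/exe" || echo n/a)"
    fdinfo="/proc/$pid/fdinfo/$fd"
    # Each watch shows up as one line containing "inotify" in fdinfo
    count="$(grep -c inotify "$fdinfo" || true)"
    echo "$name $exe $pid $fdinfo $count"
done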
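Both scripts rely on the same mechanism: for an inotify file descriptor, /proc/PID/fdinfo/FD contains one line starting with "inotify" per watch, so counting those lines gives the watch count. A minimal sketch of that single step, where count_watches is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: count the inotify watches recorded in one fdinfo file.
# $1: path to a /proc/<pid>/fdinfo/<fd> file
count_watches() {
    # Unreadable or vanished files (the process may have exited) count as 0
    [ -r "$1" ] || { echo 0; return 0; }
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true"
    grep -c '^inotify' "$1" || true
}

# Usage:
#   count_watches /proc/<pid>/fdinfo/<fd>
```

Summing this value over all inotify descriptors of a user gives the number to compare against fs.inotify.max_user_watches.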

References:

https://github.com/kubernetes/kubernetes/issues/7815

https://github.com/kubernetes/kubernetes/issues/10421#issuecomment-115866727
