Environment
Software | Version |
---|---|
k8s | v1.19.10 |
docker | 20.10.7 |
OS | Ubuntu 20.04.2 |
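Deployment
The deployment is based on the existing lxcfs-admission-webhook project, with some changes to the lxcfs image. The modified Dockerfile and the start.sh script it runs are shown below.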
FROM ubuntu:20.04 as build
ENV LXCFS_VERSION 4.0.12
RUN apt update \
&& DEBIAN_FRONTEND=noninteractive apt install -y build-essential wget meson python3-pip cmake fuse libfuse-dev pkg-config
RUN pip3 install jinja2
RUN wget https://linuxcontainers.org/downloads/lxcfs/lxcfs-$LXCFS_VERSION.tar.gz \
&& mkdir /lxcfs \
&& tar xzvf lxcfs-$LXCFS_VERSION.tar.gz -C /lxcfs --strip-components=1 \
&& cd /lxcfs \
&& ./configure \
&& make
FROM ubuntu:20.04
STOPSIGNAL SIGINT
COPY --from=build /lxcfs/src/lxcfs /usr/local/bin/lxcfs
COPY --from=build /lxcfs/src/.libs/liblxcfs.so /usr/local/lib/lxcfs/liblxcfs.so
COPY --from=build /lxcfs/src/lxcfs /lxcfs/lxcfs
COPY --from=build /lxcfs/src/.libs/liblxcfs.so /lxcfs/liblxcfs.so
COPY --from=build /usr/lib/x86_64-linux-gnu/libfuse.so.2.9.9 /lxcfs/libfuse.so.2.9.9
COPY --from=build /usr/lib/x86_64-linux-gnu/libulockmgr.so.1.0.1 /lxcfs/libulockmgr.so.1.0.1
COPY start.sh /
CMD ["/start.sh"]
#!/bin/bash
# Cleanup
nsenter -m/proc/1/ns/mnt fusermount -u /var/lib/lxcfs 2> /dev/null || true
nsenter -m/proc/1/ns/mnt [ -L /etc/mtab ] || \
sed -i "/^lxcfs \/var\/lib\/lxcfs fuse.lxcfs/d" /etc/mtab
# remove /var/lib/lxcfs
rm -rf /var/lib/lxcfs/*
# Prepare
mkdir -p /usr/local/lib/lxcfs /var/lib/lxcfs
# Update lxcfs
cp -f /lxcfs/lxcfs /usr/local/bin/lxcfs
cp -f /lxcfs/liblxcfs.so /usr/local/lib/lxcfs/liblxcfs.so
cp -f /lxcfs/libfuse.so.2.9.9 /usr/lib64/libfuse.so.2.9.9
cp -f /lxcfs/libulockmgr.so.1.0.1 /usr/lib64/libulockmgr.so.1.0.1
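# -f replaces links left over from a previous start, so a container restart does not fail with "File exists"
ln -sf /usr/lib64/libfuse.so.2.9.9 /usr/lib64/libfuse.so.2
ln -sf /usr/lib64/libulockmgr.so.1.0.1 /usr/lib64/libulockmgr.so.1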
# Mount
exec nsenter -m/proc/1/ns/mnt /usr/local/bin/lxcfs /var/lib/lxcfs/ --enable-cfs -l
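Once the script above has started lxcfs, it can be useful to confirm from the host that the mount point is actually serving files. The following is a minimal sketch of such a check, not part of the original project: the function name `lxcfs_healthy` and the list of files are assumptions, and the default path is the `/var/lib/lxcfs` mount point used by start.sh.

```shell
#!/bin/bash
# Sketch: verify that an lxcfs mount point serves the expected proc files.
# lxcfs_healthy and the file list are hypothetical; adjust to the deployment.

lxcfs_healthy() {
    local root="${1:-/var/lib/lxcfs}"
    local f
    for f in cpuinfo meminfo loadavg stat uptime; do
        # Each entry must be a regular file that can be read.
        # On a dead FUSE mount the test or the read fails (ENOTCONN).
        if [ ! -f "$root/proc/$f" ] || ! head -c 1 "$root/proc/$f" >/dev/null 2>&1; then
            echo "unhealthy: $root/proc/$f" >&2
            return 1
        fi
    done
    return 0
}

# Example: lxcfs_healthy /var/lib/lxcfs && echo "lxcfs looks healthy"
```

A check like this could be run on the node before or after restarting the lxcfs DaemonSet pod.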
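Problems encountered during testing
Roles
Role | Purpose | Notes |
---|---|---|
lxcfs | An open-source FUSE (userspace) filesystem, originally written to support LXC containers | Lets a container see its own CPU and memory limits instead of the host's values; deployed as a DaemonSet |
lxcfs-admission-webhook | Intercepts pod creation on demand and patches in the lxcfs-related volumes | Deployed as a Deployment |
Pods | The programs that need to read correct resource data | nginx, Java, etc. |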
Test pod
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-lxcfs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: testlxcfs
  template:
    metadata:
      labels:
        app: testlxcfs
    spec:
      containers:
      - name: alpine
        #image: ubuntu:20.04
        image: alpine:3.16
        command: ["/bin/sh"]
        args: ["-c", "sleep 1d"]
        imagePullPolicy: Always
        resources:
          requests:
            memory: "256Mi"
            cpu: "0.2"
          limits:
            memory: "1024Mi"
            cpu: "0.5"
Normal case
### 1 cpu
# kubectl exec -it test-lxcfs-5b449777dd-fl7m9 -- cat /proc/cpuinfo | grep processor | wc -l
1
### 1G memory
# kubectl exec -it test-lxcfs-5b449777dd-fl7m9 -- cat /proc/meminfo | grep MemTotal
MemTotal: 1048576 kB
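The values above follow from the limits in the test Deployment: a memory limit of 1024Mi is 1024 × 1024 = 1048576 kB, and a CPU limit of 0.5 (500 millicores) shows up as one processor. The sketch below reproduces that arithmetic. It assumes that lxcfs reports MemTotal as the limit in bytes divided by 1024 and rounds a fractional CPU limit up to whole processors; both function names are hypothetical helpers, not part of lxcfs or kubectl.

```shell
#!/bin/bash
# Sketch: compute the values a pod is expected to see through lxcfs.

# Convert a memory quantity with a binary suffix (Ki, Mi, Gi) to the kB
# unit used by /proc/meminfo, where 1 kB means 1024 bytes.
mem_limit_to_kb() {
    case "$1" in
        *Ki) echo "${1%Ki}" ;;
        *Mi) echo $(( ${1%Mi} * 1024 )) ;;
        *Gi) echo $(( ${1%Gi} * 1024 * 1024 )) ;;
        *)   echo "unsupported quantity: $1" >&2; return 1 ;;
    esac
}

# Round a CPU limit given in millicores up to whole processors
# (assumed rounding behaviour).
cpu_millis_to_processors() {
    echo $(( ($1 + 999) / 1000 ))
}

# Example: mem_limit_to_kb 1024Mi; cpu_millis_to_processors 500
```

Comparing these numbers with the `kubectl exec` output is a quick way to tell whether a pod is reading lxcfs data or the host's values.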
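Problems
The following problems occur with low probability.
lxcfs failure
- Pods that are already running can no longer read CPU or memory information, and this stays broken even after lxcfs recovers
Fix: see "Lxcfs research and testing" (Lxcfs调研测试) and container_remount_lxcfs. In short, enter the container's Linux mount namespace with nsenter and mount the lxcfs files again.
### lxcfs failed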
# kubectl exec -it test-lxcfs-5b449777dd-fl7m9 -- cat /proc/cpuinfo
cat: can't open '/proc/cpuinfo': Socket not connected
command terminated with exit code 1
# kubectl exec -it test-lxcfs-5b449777dd-fl7m9 -- cat /proc/meminfo
cat: can't open '/proc/meminfo': Socket not connected
command terminated with exit code 1
### lxcfs recovered
# kubectl exec -it test-lxcfs-5b449777dd-fl7m9 -- cat /proc/cpuinfo
cat: can't open '/proc/cpuinfo': Socket not connected
command terminated with exit code 1
# kubectl exec -it test-lxcfs-5b449777dd-fl7m9 -- cat /proc/meminfo
cat: can't open '/proc/meminfo': Socket not connected
command terminated with exit code 1
### after recreating the pod
# kubectl exec -it test-lxcfs-5b449777dd-bzgqm -- cat /proc/cpuinfo | grep processor | wc -l
1
# kubectl exec -it test-lxcfs-5b449777dd-bzgqm -- cat /proc/meminfo | grep MemTotal
MemTotal: 1048576 kB
- Newly created pods cannot start until lxcfs recovers
# kubectl get pods -w
NAME READY STATUS RESTARTS AGE
test-lxcfs-5b449777dd-r5g68 0/1 Pending 0 0s
test-lxcfs-5b449777dd-r5g68 0/1 Pending 0 0s
test-lxcfs-5b449777dd-r5g68 0/1 Init:0/1 0 0s
test-lxcfs-5b449777dd-r5g68 0/1 Init:0/1 0 1s
test-lxcfs-5b449777dd-r5g68 0/1 PodInitializing 0 2s
test-lxcfs-5b449777dd-r5g68 0/1 RunContainerError 0 4s
test-lxcfs-5b449777dd-r5g68 0/1 RunContainerError 1 5s
test-lxcfs-5b449777dd-r5g68 0/1 CrashLoopBackOff 1 17s
test-lxcfs-5b449777dd-r5g68 0/1 RunContainerError 2 38s
test-lxcfs-5b449777dd-r5g68 0/1 RunContainerError 3 51s
test-lxcfs-5b449777dd-r5g68 0/1 CrashLoopBackOff 3 52s
test-lxcfs-5b449777dd-r5g68 1/1 Running 4 90s
### kubectl describe output
Error: failed to start container "alpine": Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:545: container init caused: rootfs_linux.go:76: mounting "/var/lib/lxcfs/proc/loadavg" to rootfs at "/proc/loadavg" caused: mount through procfd: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
### once Running, the pod reports correct values
# kubectl exec -it test-lxcfs-5b449777dd-r5g68 -- cat /proc/cpuinfo | grep processor | wc -l
1
# kubectl exec -it test-lxcfs-5b449777dd-r5g68 -- cat /proc/meminfo | grep MemTotal
MemTotal: 1048576 kB
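For the first problem in this list (running pods left with a broken /proc after an lxcfs restart), the nsenter remount can be sketched roughly as below. This is an illustration under stated assumptions, not the referenced script: it assumes the lxcfs directory is visible inside the container's mount namespace at the same path, that the container image provides a `mount` binary, and that any stale mounts have already been unmounted. The function name `lxcfs_remount_cmds` is hypothetical, and it only prints commands so they can be reviewed before anything is run.

```shell
#!/bin/bash
# Sketch: print the nsenter commands that would bind-mount lxcfs files over
# /proc inside the mount namespace of the process with the given PID.
# Nothing is executed here; the output is meant to be reviewed first.

lxcfs_remount_cmds() {
    local pid="$1"
    local root="${2:-/var/lib/lxcfs}"   # assumed to be visible in the container
    local f
    for f in cpuinfo meminfo loadavg stat diskstats swaps uptime; do
        echo "nsenter --target $pid --mount -- mount -B $root/proc/$f /proc/$f"
    done
}

# Example on a Docker node (CONTAINER_ID is a placeholder):
#   pid=$(docker inspect --format '{{.State.Pid}}' CONTAINER_ID)
#   lxcfs_remount_cmds "$pid"          # review the commands
#   lxcfs_remount_cmds "$pid" | sh     # run them as root on the host
```

If those assumptions do not hold for a given pod, recreating the pod as shown above remains the simpler fix.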
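lxcfs-admission-webhook failure
Pods created while the webhook is down are not patched with the lxcfs-related volumes, so they see the same values as the host. After lxcfs-admission-webhook recovers, the only fix is to recreate those pods.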
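To find pods that were created without the patch, one option is to look at the mount table inside the pod. The sketch below reads `/proc/mounts` content on standard input and succeeds only if some file under /proc is served by lxcfs. The function name `has_lxcfs_mounts` is hypothetical, and the `fuse.lxcfs` filesystem type is taken from the mtab entry that start.sh cleans up.

```shell
#!/bin/bash
# Sketch: decide from /proc/mounts content whether a pod has lxcfs mounts.
# Input: mount table on stdin. Exit status 0 if a /proc/* entry uses fuse.lxcfs.

has_lxcfs_mounts() {
    awk '$3 == "fuse.lxcfs" && $2 ~ /^\/proc\// { found = 1 } END { exit !found }'
}

# Example (POD is a placeholder for a real pod name):
#   kubectl exec POD -- cat /proc/mounts | has_lxcfs_mounts \
#       || echo "POD has no lxcfs mounts and needs to be recreated"
```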
Startup priority of lxcfs and lxcfs-admission-webhook
Both components need to start before other pods, for example after a power loss or a server reboot.
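One way to approach this is to give both workloads a high pod priority. The sketch below only builds the patch document. The workload names in the example are assumptions about how the project's manifests are named, and the built-in `system-node-critical` and `system-cluster-critical` classes may be restricted to certain namespaces depending on cluster configuration. Note also that pod priority influences scheduling and eviction rather than guaranteeing a strict start order, so this is a partial measure.

```shell
#!/bin/bash
# Sketch: build a patch that sets priorityClassName on a workload's pod template.
# priority_patch is a hypothetical helper; it prints JSON and changes nothing.

priority_patch() {
    printf '{"spec":{"template":{"spec":{"priorityClassName":"%s"}}}}\n' "$1"
}

# Example (workload names are assumptions, check the deployed manifests):
#   kubectl patch daemonset lxcfs -p "$(priority_patch system-node-critical)"
#   kubectl patch deployment lxcfs-admission-webhook-deployment \
#       -p "$(priority_patch system-cluster-critical)"
```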
Thanks
official
Using the latest LXCFS in Kubernetes (在 Kubernetes 中使用最新的 LXCFS)
How lxcfs works (lxcfs 原理)
lxcfs-admission-webhook