前言
国内网络环境特殊,在线安装比较麻烦,K3S采用离线安装方式进行部署。
安装整体思路是:
- 安装GPU驱动
- 安装CUDA工具
- 安装nvidia容器运行时
- 安装K3S
- 设置K3S使用GPU
基础环境
采用All In One方式(其实只有一张GPU卡)部署。
| 参数 | 内容 |
|---|---|
| 系统 | openEuler 22.03 (LTS-SP3) |
| CPU | 8 |
| 内存 | 64G |
| 系统盘 | 500G |
| GPU | V100-32G |
GPU采用直通的方式,vGPU应该也差不多。
安装GPU驱动
下载驱动
先去官网(https://www.nvidia.cn/geforce/drivers/)下载驱动
选择对应型号操作系统为Linux 64-bit搜索后下载即可

像我V100的下载链接就是https://cn.download.nvidia.com/tesla/560.35.03/NVIDIA-Linux-x86_64-560.35.03.run
wget https://cn.download.nvidia.com/tesla/560.35.03/NVIDIA-Linux-x86_64-560.35.03.run -O NVIDIA-Linux.run
安装构建工具和依赖
yum install gcc make kernel-devel-$(uname -r) vulkan-loader -y
运行安装
根据指示安装完即可。
bash NVIDIA-Linux.run --kernel-source-path=/usr/src/kernels/$(uname -r)
测试是否安装成功
nvidia-smi
运行后应该会输出显卡和驱动信息,输出则表示安装成功
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03 Driver Version: 560.35.03 CUDA Version: 12.6 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla V100-PCIE-32GB Off | 00000000:00:06.0 Off | 0 |
| N/A 41C P0 39W / 250W | 310MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
安装CUDA工具
下载CUDA工具
前往官网(https://developer.nvidia.com/cuda-toolkit-archive),选择要下载的版本进行下载。

这里我选择的是12.4.0版本(PyTorch支持的版本),访问后操作系统选择Linux、架构选择x86_64、发行选择RHEL、版本选择9、安装方式选择runfile(local)。

根据安装指引运行
wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda_12.4.0_550.54.14_linux.run
sudo sh cuda_12.4.0_550.54.14_linux.run
设置系统环境变量
sudo echo "export PATH=\$PATH:/usr/local/cuda-12.4/bin" >> /etc/profile
sudo echo /usr/local/cuda-12.4/lib64 >> /etc/ld.so.conf
sudo ldconfig
source /etc/profile
校验CUDA
nvcc -V
运行后会返回CUDA版本信息
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Tue_Feb_27_16:19:38_PST_2024
Cuda compilation tools, release 12.4, V12.4.99
Build cuda_12.4.r12.4/compiler.33961263_0
安装cuDNN
wget https://developer.download.nvidia.com/compute/cudnn/9.7.1/local_installers/cudnn-local-repo-rhel9-9.7.1-1.0-1.x86_64.rpm
sudo rpm -i cudnn-local-repo-rhel9-9.7.1-1.0-1.x86_64.rpm
sudo dnf clean all
sudo dnf -y install cudnn
sudo dnf -y install cudnn-cuda-12
安装nvidia容器运行时
为了使容器内可以使用显卡,需要安装nvidia容器运行时,这一步网上经常是搜索的是ubuntu的教材,这里按官网步骤安装。
安装YUM源
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
启动YUM源
sudo dnf-config-manager --enable nvidia-container-toolkit-experimental
安装容器工具包
sudo dnf install -y nvidia-container-toolkit
安装完即可,在后续步骤会使用。
安装K3S
因为网络原因在线安装K3S不太稳定,这里使用离线安装。
下载镜像
mkdir -p /var/lib/rancher/k3s/agent/images/ /usr/local/bin/
wget https://github.com/k3s-io/k3s/releases/download/v1.31.5%2Bk3s1/k3s -O /usr/local/bin/k3s
chmod a+x /usr/local/bin/k3s
wget https://github.com/k3s-io/k3s/releases/download/v1.31.5%2Bk3s1/k3s-airgap-images-amd64.tar -O /var/lib/rancher/k3s/agent/images/k3s-airgap-images-amd64.tar
这里下载很慢可以使用类似镜像加速站点(https://gh-proxy.com)进行加速下载。
下载安装脚本
wget https://get.k3s.io -O install.sh
如果下载不了可以使用下面命令直接保存安装脚本
cat > install.sh <<EOF
#!/bin/sh
set -e
set -o noglob
# Usage:
# curl ... | ENV_VAR=... sh -
# or
# ENV_VAR=... ./install.sh
#
# Example:
# Installing a server without traefik:
# curl ... | INSTALL_K3S_EXEC="--disable=traefik" sh -
# Installing an agent to point at a server:
# curl ... | K3S_TOKEN=xxx K3S_URL=https://server-url:6443 sh -
#
# Environment variables:
# - K3S_*
# Environment variables which begin with K3S_ will be preserved for the
# systemd service to use. Setting K3S_URL without explicitly setting
# a systemd exec command will default the command to "agent", and we
# enforce that K3S_TOKEN is also set.
#
# - INSTALL_K3S_SKIP_DOWNLOAD
# If set to true will not download k3s hash or binary.
#
# - INSTALL_K3S_FORCE_RESTART
# If set to true will always restart the K3s service
#
# - INSTALL_K3S_SYMLINK
# If set to 'skip' will not create symlinks, 'force' will overwrite,
# default will symlink if command does not exist in path.
#
# - INSTALL_K3S_SKIP_ENABLE
# If set to true will not enable or start k3s service.
#
# - INSTALL_K3S_SKIP_START
# If set to true will not start k3s service.
#
# - INSTALL_K3S_VERSION
# Version of k3s to download from github. Will attempt to download from the
# stable channel if not specified.
#
# - INSTALL_K3S_COMMIT
# Commit of k3s to download from temporary cloud storage.
# * (for developer & QA use)
#
# - INSTALL_K3S_PR
# PR build of k3s to download from Github Artifacts.
# * (for developer & QA use)
#
# - INSTALL_K3S_BIN_DIR
# Directory to install k3s binary, links, and uninstall script to, or use
# /usr/local/bin as the default
#
# - INSTALL_K3S_BIN_DIR_READ_ONLY
# If set to true will not write files to INSTALL_K3S_BIN_DIR, forces
# setting INSTALL_K3S_SKIP_DOWNLOAD=true
#
# - INSTALL_K3S_SYSTEMD_DIR
# Directory to install systemd service and environment files to, or use
# /etc/systemd/system as the default
#
# - INSTALL_K3S_EXEC or script arguments
# Command with flags to use for launching k3s in the systemd service, if
# the command is not specified will default to "agent" if K3S_URL is set
# or "server" if not. The final systemd command resolves to a combination
# of EXEC and script args ($@).
#
# The following commands result in the same behavior:
# curl ... | INSTALL_K3S_EXEC="--disable=traefik" sh -s -
# curl ... | INSTALL_K3S_EXEC="server --disable=traefik" sh -s -
# curl ... | INSTALL_K3S_EXEC="server" sh -s - --disable=traefik
# curl ... | sh -s - server --disable=traefik
# curl ... | sh -s - --disable=traefik
#
# - INSTALL_K3S_NAME
# Name of systemd service to create, will default from the k3s exec command
# if not specified. If specified the name will be prefixed with 'k3s-'.
#
# - INSTALL_K3S_TYPE
# Type of systemd service to create, will default from the k3s exec command
# if not specified.
#
# - INSTALL_K3S_SELINUX_WARN
# If set to true will continue if k3s-selinux policy is not found.
#
# - INSTALL_K3S_SKIP_SELINUX_RPM
# If set to true will skip automatic installation of the k3s RPM.
#
# - INSTALL_K3S_CHANNEL_URL
# Channel URL for fetching k3s download URL.
# Defaults to 'https://update.k3s.io/v1-release/channels'.
#
# - INSTALL_K3S_CHANNEL
# Channel to use for fetching k3s download URL.
# Defaults to 'stable'.
GITHUB_URL=${GITHUB_URL:-https://github.com/k3s-io/k3s/releases}
GITHUB_PR_URL=""
STORAGE_URL=https://k3s-ci-builds.s3.amazonaws.com
DOWNLOADER=
# --- helper functions for logs ---
info()
{
echo '[INFO] ' "$@"
}
warn()
{
echo '[WARN] ' "$@" >&2
}
fatal()
{
echo '[ERROR] ' "$@" >&2
exit 1
}
# --- fatal if no systemd or openrc ---
verify_system() {
if [ -x /sbin/openrc-run ]; then
HAS_OPENRC=true
return
fi
if [ -x /bin/systemctl ] || type systemctl > /dev/null 2>&1; then
HAS_SYSTEMD=true
return
fi
fatal 'Can not find systemd or openrc to use as a process supervisor for k3s'
}
# --- add quotes to command arguments ---
quote() {
for arg in "$@"; do
printf '%s\n' "$arg" | sed "s/'/'\\\\''/g;1s/^/'/;\$s/\$/'/"
done
}
# --- add indentation and trailing slash to quoted args ---
quote_indent() {
printf ' \\\n'
for arg in "$@"; do
printf '\t%s \\\n' "$(quote "$arg")"
done
}
# --- escape most punctuation characters, except quotes, forward slash, and space ---
escape() {
printf '%s' "$@" | sed -e 's/\([][!#$%&()*;<=>?\_`{
|}]\)/\\\1/g;'
}
# --- escape double quotes ---
escape_dq() {
printf '%s' "$@" | sed -e 's/"/\\"/g'
}
# --- ensures $K3S_URL is empty or begins with https://, exiting fatally otherwise ---
verify_k3s_url() {
case "${K3S_URL}" in
"")
;;
https://*)
;;
*)
fatal "Only https:// URLs are supported for K3S_URL (have ${K3S_URL})"
;;
esac
}
# --- define needed environment variables ---
setup_env() {
# --- use command args if passed or create default ---
case "$1" in
# --- if we only have flags discover if command should be server or agent ---
(-*|"")
if [ -z "${K3S_URL}" ]; then
CMD_K3S=server
else
if [ -z "${K3S_TOKEN}" ] && [ -z "${K3S_TOKEN_FILE}" ]; then
fatal "Defaulted k3s exec command to 'agent' because K3S_URL is defined, but K3S_TOKEN or K3S_TOKEN_FILE is not defined."
fi
CMD_K3S=agent
fi
;;
# --- command is provided ---
(*)
CMD_K3S=$1
shift
;;
esac
verify_k3s_url
CMD_K3S_EXEC="${CMD_K3S}$(quote_indent "$@")"
# --- use systemd name if defined or create default ---
if [ -n "${INSTALL_K3S_NAME}" ]; then
SYSTEM_NAME=k3s-${INSTALL_K3S_NAME}
else
if [ "${CMD_K3S}" = server ]; then
SYSTEM_NAME=k3s
else
SYSTEM_NAME=k3s-${CMD_K3S}
fi
fi
# --- check for invalid characters in system name ---
valid_chars=$(printf '%s' "${SYSTEM_NAME}" | sed -e 's/[][!#$%&()*;<=>?\_`{|}/[:space:]]/^/g;' )
if [ "${SYSTEM_NAME}" != "${valid_chars}" ]; then
invalid_chars=$(printf '%s' "${valid_chars}" | sed -e 's/[^^]/ /g')
fatal "Invalid characters for system name:
${SYSTEM_NAME}
${invalid_chars}"
fi
# --- use sudo if we are not already root ---
SUDO=sudo
if [ $(id -u) -eq 0 ]; then
SUDO=
fi
# --- use systemd type if defined or create default ---
if [ -n "${INSTALL_K3S_TYPE}" ]; then
SYSTEMD_TYPE=${INSTALL_K3S_TYPE}
else
SYSTEMD_TYPE=notify
fi
# --- use binary install directory if defined or create default ---
if [ -n "${INSTALL_K3S_BIN_DIR}" ]; then
BIN_DIR=${INSTALL_K3S_BIN_DIR}
else
# --- use /usr/local/bin if root can write to it, otherwise use /opt/bin if it exists
BIN_DIR=/usr/local/bin
if ! $SUDO sh -c "touch ${BIN_DIR}/k3s-ro-test && rm -rf ${BIN_DIR}/k3s-ro-test"; then
if [ -d /opt/bin ]; then
BIN_DIR=/opt/bin
fi
fi
fi
# --- use systemd directory if defined or create default ---
if [ -n "${INSTALL_K3S_SYSTEMD_DIR}" ]; then
SYSTEMD_DIR="${INSTALL_K3S_SYSTEMD_DIR}"
else
SYSTEMD_DIR=/etc/systemd/system
fi
# --- set related files from system name ---
SERVICE_K3S=${SYSTEM_NAME}.service
UNINSTALL_K3S_SH=${UNINSTALL_K3S_SH:-${BIN_DIR}/${SYSTEM_NAME}-uninstall.sh}
KILLALL_K3S_SH=${KILLALL_K3S_SH:-${BIN_DIR}/k3s-killall.sh}
# --- use service or environment location depending on systemd/openrc ---
if [ "${HAS_SYSTEMD}" = true ]; then
FILE_K3S_SERVICE=${SYSTEMD_DIR}/${SERVICE_K3S}
FILE_K3S_ENV=${SYSTEMD_DIR}/${SERVICE_K3S}.env
elif [ "${HAS_OPENRC}" = true ]; then
$SUDO mkdir -p /etc/rancher/k3s
FILE_K3S_SERVICE=/etc/init.d/${SYSTEM_NAME}
FILE_K3S_ENV=/etc/rancher/k3s/${SYSTEM_NAME}.env
fi
# --- get hash of config & exec for currently installed k3s ---
PRE_INSTALL_HASHES=$(get_installed_hashes)
# --- if bin directory is read only skip download ---
if [ "${INSTALL_K3S_BIN_DIR_READ_ONLY}" = true ]; then
INSTALL_K3S_SKIP_DOWNLOAD=true
fi
# --- setup channel values
INSTALL_K3S_CHANNEL_URL=${INSTALL_K3S_CHANNEL_URL:-'https


最低0.47元/天 解锁文章
762

被折叠的 条评论
为什么被折叠?



