After reinstalling the system, I downloaded kubeadm, enabled kubelet.service, and started it; kubeadm init then failed with the following errors:
[user@localhost kubelet]$ sudo kubeadm init
[sudo] password for user:
I0611 12:44:00.666368 3995 version.go:256] remote version is much newer: v1.30.1; falling back to: stable-1.28
[init] Using Kubernetes version: v1.28.10
[preflight] Running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: E0611 12:44:01.448684 4008 remote_runtime.go:616] "Status from runtime service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
time="2024-06-11T12:44:01+08:00" level=fatal msg="getting status of runtime: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/containerd/containerd.sock: connect: no such file or directory\""
, error: exit status 1
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
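The firewalld warning above can be cleared by opening the two ports the preflight check names (6443 for the API server, 10250 for the kubelet). A minimal sketch, assuming firewalld's default zone is in use:

```shell
# Open the ports kubeadm's preflight check warns about
sudo firewall-cmd --permanent --add-port=6443/tcp   # kube-apiserver
sudo firewall-cmd --permanent --add-port=10250/tcp  # kubelet API
sudo firewall-cmd --reload                          # apply without restarting firewalld
```

The two fatal errors (the missing containerd socket and the missing bridge-nf-call-iptables file) are addressed in the sections below.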
Setting up kubeadm-config.yaml
To make pulling images easier later, we modify the kubeadm configuration. No such yaml file exists yet, but kubeadm init ships with a default configuration; we dump it to a file and edit the image source on that basis:
kubeadm config print init-defaults > kubeadm-config.yaml
vim kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 1.2.3.4
bindPort: 6443
nodeRegistration:
criSocket: unix:///var/run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
name: node
taints: null
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
networking:
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/12
scheduler: {}
Change the image repository here to the Aliyun mirror:
imageRepository: registry.aliyuncs.com/google_containers
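After changing imageRepository, you can confirm which images kubeadm will use with this config, and optionally pre-pull them so the next init does not stall on downloads (both are standard kubeadm subcommands):

```shell
# List the images kubeadm will pull with this config
kubeadm config images list --config kubeadm-config.yaml
# Optionally pre-pull them before running init again
sudo kubeadm config images pull --config kubeadm-config.yaml
```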
Next, install docker with containerd as its container runtime.
Installing docker and containerd
Reinstall a suitable version of docker. First, install the yum utilities and add the repository:
sudo yum install yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
Then install docker 20.10 along with containerd.io:
yum install docker-ce-20.10.8-3.el7 docker-ce-cli-20.10.8-3.el7 containerd.io docker-compose-plugin
Check the docker version, then start containerd and inspect its logs. The log shows a configuration deprecation warning, to be dealt with later.
docker --version
>Docker version 20.10.8, build 3967b7d
sudo systemctl start containerd
sudo systemctl status containerd
● containerd.service - containerd container runtime
Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; vendor preset: disabled)
Active: active (running) since 二 2024-06-11 13:06:32 CST; 3s ago
Docs: https://containerd.io
Process: 6302 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 6309 (containerd)
Tasks: 13
Memory: 16.2M
CGroup: /system.slice/containerd.service
└─6309 /usr/bin/containerd
sudo journalctl --no-pager -u containerd
-- Logs begin at 二 2024-06-11 10:48:32 CST, end at 二 2024-06-11 13:13:23 CST. --
6月 11 13:06:32 localhost.localdomain systemd[1]: Starting containerd container runtime...
6月 11 13:06:32 localhost.localdomain containerd[6309]: time="2024-06-11T13:06:32+08:00" level=warning msg="containerd config version `1` has been deprecated and will be converted on each startup in containerd v2.0, use `containerd config migrate` after upgrading to containerd 2.0 to avoid conversion on startup"
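The deprecation warning comes from an old version-1 /etc/containerd/config.toml. One way to clear it (a sketch; keep a backup of the old file) is to regenerate the default v2-format config. Kubernetes also generally expects the systemd cgroup driver, which containerd's CRI plugin exposes as the SystemdCgroup option:

```shell
# Back up the old config, then generate a fresh v2-format one
sudo mv /etc/containerd/config.toml /etc/containerd/config.toml.bak
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
# Switch the CRI runtime to the systemd cgroup driver
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
```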
Setting a new hostname
Re-running kubeadm init produced the following errors:
[user@localhost kubernetes_cc]$ sudo kubeadm init --config=kubeadm-config.yaml
[init] Using Kubernetes version: v1.28.0
[preflight] Running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[WARNING Hostname]: hostname "node" could not be reached
[WARNING Hostname]: hostname "node": lookup node on [2001:da8:b8::208]:53: no such host
Using localhost.localdomain as the hostname can cause problems in some situations, especially when running network services (such as Kubernetes) that need correct hostname resolution, because localhost conventionally refers to the loopback address and is not suitable as an identifier for a machine on the network.
Set a new hostname:
sudo vim /etc/hostname
Enter:
cdh03
sudo vim /etc/hosts
Change localhost.localdomain to cdh03.
Reboot so the change takes effect:
sudo reboot
Later I found it is enough to map the hostname node to the loopback address in /etc/hosts:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.60.102.55 cdh01
10.60.102.56 cdh02
100.64.163.45 cdh03
127.0.0.1 node
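With that line in place, the preflight hostname lookup for node resolves locally instead of going to DNS. You can verify the mapping with getent:

```shell
# Should print the loopback mapping for "node" from /etc/hosts
getent hosts node
```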
Network bridge configuration on Linux
For the bridge-related preflight error, the br_netfilter kernel module was missing:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
Load the module with:
sudo modprobe br_netfilter
Make it load automatically at boot:
echo 'br_netfilter' | sudo tee -a /etc/modules-load.d/br_netfilter.conf
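To confirm that the module is loaded now and that the modules-load entry is in place for the next boot:

```shell
lsmod | grep br_netfilter                 # module loaded in the running kernel
cat /etc/modules-load.d/br_netfilter.conf # will be loaded automatically on boot
```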
Enable the net.bridge.bridge-nf-call-iptables setting so that all traffic crossing a network bridge is processed by iptables. Kubernetes requires this in order to manage Pod network traffic with iptables.
echo '1' | sudo tee /proc/sys/net/bridge/bridge-nf-call-iptables
echo 'net.bridge.bridge-nf-call-iptables=1' | sudo tee -a /etc/sysctl.conf
Apply all settings in /etc/sysctl.conf immediately, without rebooting:
sudo sysctl -p
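To confirm the setting took effect:

```shell
# Should report net.bridge.bridge-nf-call-iptables = 1
sysctl net.bridge.bridge-nf-call-iptables
```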