Software Environment
- Servers: 2 vCPUs, 2 GiB RAM, CentOS 7.9 64-bit (1 master, 2 nodes)
- Docker: 24.0.7
- Kubernetes: 1.28.2
Installing Kubernetes
1. Initial Setup
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config # permanent
setenforce 0 # temporary
# Disable swap
swapoff -a # temporary
sed -ri 's/.*swap.*/#&/' /etc/fstab # permanent (comments out every swap entry in fstab)
# After disabling swap, be sure to reboot the VM!
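# (Optional sanity check: the Swap row of "free -h" should be all zeros.)
free -h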
# Set the hostname according to your plan
hostnamectl set-hostname <hostname>
# Add hosts entries on the master [adjust to your environment]
# If the nodes are cloud servers from different providers (e.g. one Alibaba Cloud, one Tencent Cloud, one Baidu Cloud, one Huawei Cloud), you must create a virtual NIC on each server and bind its public IP (see the supplement below)
cat >> /etc/hosts << EOF
192.168.113.120 k8s-master
192.168.113.121 k8s-node1
192.168.113.122 k8s-node2
EOF
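# Quick check that the new names resolve:
ping -c 1 k8s-node1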
# Pass bridged IPv4 traffic to iptables chains
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system # apply
# Enable kernel IP forwarding
# In /etc/sysctl.conf, set net.ipv4.ip_forward = 1
# Then run: sysctl -p
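# Note: the net.bridge.* settings above only take effect while the br_netfilter
# kernel module is loaded. A minimal sketch to load it now and on every boot:
cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter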
# Time synchronization
yum install ntpdate -y
ntpdate time.windows.com
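# ntpdate performs only a one-shot sync. To keep clocks synced across reboots you
# can enable chronyd instead (shipped with CentOS 7); a minimal sketch:
yum install -y chrony
systemctl start chronyd
systemctl enable chronyd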
Supplement: create a virtual NIC on a cloud server and bind the public IP [skip this step unless you are in the multi-cloud scenario above]
# Create the virtual NIC
cat > /etc/sysconfig/network-scripts/ifcfg-eth0:1 <<EOF
BOOTPROTO=static
DEVICE=eth0:1
IPADDR=xxx.xxx.xxx.xxx # public IP
PREFIX=32
TYPE=Ethernet
USERCTL=no
ONBOOT=yes
EOF
# Restart the network service
systemctl restart network
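# Verify the alias NIC came up with the public IP (eth0:1 appears as a secondary
# address on eth0):
ip addr show eth0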
2. Install Base Software (all nodes)
2.1 Install Docker
# See: https://help.aliyun.com/zh/ecs/use-cases/deploy-and-use-docker-on-alibaba-cloud-linux-2-instances#aa11e8210adyt
# Download the docker-ce YUM repo file
wget -O /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install Docker
yum -y install docker-ce
# Verify that Docker installed successfully
docker -v
# Edit /etc/docker/daemon.json (create it if it does not exist)
{
  "registry-mirrors": [
    "https://registry.cn-hangzhou.aliyuncs.com"
  ]
}
# Start Docker and enable it at boot
systemctl start docker
systemctl enable docker
# Check that Docker is running
systemctl status docker
# Run "docker info" to check whether the registry mirror is configured:
# the end of the output should contain a "Registry Mirrors" entry.
# If it is present, the mirror is configured; otherwise it is not.
docker info
If the Docker registry mirror address becomes invalid, replace it with a working one.
Otherwise you may hit image-pull errors or login prompts; see:
Docker pull拉取镜像报错“Error response from daemon: Get “https://registry-1.docker.io/v2”解决办法-CSDN博客
2.2 Add the Aliyun Kubernetes YUM Repository
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
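# (Optional) refresh the metadata cache and confirm the new repo is visible:
yum makecache fast
yum repolist | grep -i kubernetes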
2.3 Install kubeadm, kubelet, kubectl
# 1. Set Docker's cgroup driver: edit /etc/docker/daemon.json and add the following
"exec-opts": ["native.cgroupdriver=systemd"]
# 2. Restart Docker
systemctl daemon-reload
systemctl restart docker
# 3. Install kubelet, kubeadm, kubectl
# To pin the version targeted by this guide: yum install -y kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2
yum install -y kubelet kubeadm kubectl
# 4. Configure the kubelet cgroup driver
# Edit /etc/sysconfig/kubelet and add the line below
KUBELET_CGROUP_ARGS="--cgroup-driver=systemd"
# 5. Enable kubelet at boot
systemctl enable kubelet
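# (Optional) confirm the installed versions match the 1.28.x target:
kubeadm version -o short
kubectl version --client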
3. Deploy the Kubernetes Master
# Run on the master node [adjust the kubeadm init parameters to your environment]
# If the nodes are cloud servers from different providers (one Alibaba Cloud, one Tencent Cloud, one Baidu Cloud, one Huawei Cloud, etc.), create a virtual NIC bound to the public IP as described above; in that case --apiserver-advertise-address should be the public IP
kubeadm init \
--apiserver-advertise-address=192.168.113.120 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.28.2 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16
# After init succeeds, copy and run the following
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes
Selected kubeadm init parameters:
--apiserver-advertise-address: the IP address the API server advertises it is listening on. If unset, the default network interface is used.
--apiserver-bind-port: the port the API server binds to; defaults to 6443.
--image-repository: the container registry to pull control-plane images from; defaults to registry.k8s.io.
--pod-network-cidr: the IP address range for the pod network. If set, the control plane automatically allocates a CIDR to each node.
--service-cidr: an alternative IP address range for service virtual IPs; defaults to 10.96.0.0/12.
After running kubeadm init you may see "[kubelet-check] Initial timeout of 40s passed" followed by an error, as below; if init succeeds without it, skip this troubleshooting part.
[init] Using Kubernetes version: v1.28.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W0110 15:25:16.480341 11124 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.aliyuncs.com/google_containers/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local snail] and IPs [10.96.0.1 172.18.237.56]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost snail] and IPs [172.18.237.56 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost snail] and IPs [172.18.237.56 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Run:
journalctl -xeu kubelet
A relevant excerpt from the log:
"Error syncing pod, skipping" err="failed to \"CreatePodSandbox\" for \"kube-apiserver-snail_kube-system(9ef03a9048282986fab7cb22daeb04c1)\" with CreatePodSandboxError: \"Failed to create sandbox for pod \\\"kube-apiserver-snail_kube-system(9ef03a9048282986fab7cb22daeb04c1)\\\": rpc error: code = DeadlineExceeded desc = failed to get sandbox image \\\"registry.k8s.io/pause:3.6\\\": failed to pull image \\\"registry.k8s.io/pause:3.6\\\": failed to pull and unpack image \\\"registry.k8s.io/pause:3.6\\\": failed to resolve reference \\\"registry.k8s.io/pause:3.6\\\": failed to do request: Head \\\"https://us-west2-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6\\\": dial tcp 173.194.174.82:443: i/o timeout\"" pod="kube-system/kube-apiserver-snail" podUID="9ef03a9048282986fab7cb22daeb04c1"
The error shows that the image registry.k8s.io/pause:3.6 cannot be pulled.
Solution 1:
See:
kubeadm init:failed to pull image registry.k8s.io/pause:3.6-CSDN博客
# Generate a default containerd config file
containerd config default > /etc/containerd/config.toml
# Edit the file
vim /etc/containerd/config.toml
# You will find sandbox_image set to "registry.k8s.io/pause:3.6"
# Change it to pull from the Aliyun mirror instead
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
# Restart the service
systemctl daemon-reload && systemctl restart containerd
Solution 2: pre-pull the required images
# List the required images
kubeadm config images list
# Output:
registry.k8s.io/kube-apiserver:v1.28.5
registry.k8s.io/kube-controller-manager:v1.28.5
registry.k8s.io/kube-scheduler:v1.28.5
registry.k8s.io/kube-proxy:v1.28.5
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/coredns/coredns:v1.10.1
# Pull all of the images listed above
kubeadm config images pull
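# Note: by default "kubeadm config images list/pull" targets registry.k8s.io, the
# very registry that just timed out. If the pull hangs, point it at the Aliyun
# mirror instead:
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.28.2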
Then continue:
# Run on the master node [adjust the kubeadm init parameters to your environment; kubeadm reset cleans up the earlier failed attempt]
kubeadm reset
kubeadm init \
--apiserver-advertise-address=192.168.113.120 \
--image-repository=registry.aliyuncs.com/google_containers \
--kubernetes-version=v1.28.2 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16
On success you will see: "Your Kubernetes control-plane has initialized successfully!"
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.113.120:6443 --token {...} \
--discovery-token-ca-cert-hash sha256:{...}
# After init succeeds, copy and run the following
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
kubectl get nodes
4. Join the Kubernetes Nodes
[Adjust to your environment]
Run on k8s-node1 and k8s-node2 respectively
# The join command below is the one printed by kubeadm init on the master
kubeadm join 192.168.113.120:6443 --token w34ha2.66if2c8nwmeat9o7 --discovery-token-ca-cert-hash sha256:20e2227554f8883811c01edd850f0cf2f396589d32b57b9984de3353a7389477
# If you lost the original token, retrieve or recreate it as follows
# If the token has expired, create a new one
kubeadm token create
# If the token has not expired, list it with
kubeadm token list
# Compute the --discovery-token-ca-cert-hash value; prepend "sha256:" to the result
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
openssl dgst -sha256 -hex | sed 's/^.* //'
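# Alternatively, generate a fresh token together with a complete, ready-to-paste
# join command in one step:
kubeadm token create --print-join-command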
5. Deploy the CNI Network Plugin
# Run on the master node
# Download the Calico manifest (the request may time out)
curl https://docs.projectcalico.org/manifests/calico.yaml -O
# In calico.yaml, set CALICO_IPV4POOL_CIDR to the same CIDR passed to kubeadm init (--pod-network-cidr)
# and set the NIC name under IP_AUTODETECTION_METHOD
# Edits to calico.yaml:
# ....
- name: CALICO_IPV4POOL_CIDR
  value: "10.244.0.0/16"
# ....
# Search for "k8s,bgp" (/k8s,bgp in vim); CLUSTER_TYPE is already present, add
# IP_AUTODETECTION_METHOD as a sibling entry:
- name: CLUSTER_TYPE
  value: "k8s,bgp"
- name: IP_AUTODETECTION_METHOD
  value: "interface=eth0" # eth0 is the name of your NIC
# Strip the docker.io/ prefix from image names so slow pulls do not fail
sed -i 's#docker.io/##g' calico.yaml
# Apply Calico
kubectl apply -f calico.yaml
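# Watch the Calico pods until they are all Running before checking node status
# (k8s-app=calico-node is the label used in the stock calico.yaml):
kubectl get pods -n kube-system -l k8s-app=calico-node -w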
Check the nodes
# List the nodes
kubectl get node
# Output:
NAME STATUS ROLES AGE VERSION
master Ready control-plane 95m v1.28.2
node NotReady <none> 13m v1.28.2
# The node named "node" stays NotReady
# Inspect it in detail ("no" is kubectl shorthand for nodes; the second "node" is the node's name)
kubectl describe no node
# Output:
KubeletNotReady
container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
# Solution (see: https://www.jianshu.com/p/3e6231eb7c8a):
# copy the CNI config from the master's /etc/cni/net.d to the node's /etc/cni/net.d
On the master: scp -r /etc/cni/net.d/* root@nodeip:/etc/cni/net.d/
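# After copying, the kubelet picks the CNI config up on its own; within a minute
# or so the node should flip to Ready:
kubectl get nodes -w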
6. Test the Kubernetes Cluster
# Create a deployment
kubectl create deployment nginx --image=nginx
# Expose the port as a NodePort service
kubectl expose deployment nginx --port=80 --type=NodePort
# View pod and service information
kubectl get pod,svc
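# Once the pod is Running, note the NodePort that "kubectl get svc" mapped to
# port 80 (a value in the 30000-32767 range) and hit it from any node.
# <node-ip> and <node-port> are placeholders for your actual values:
curl http://<node-ip>:<node-port>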
See also:
3.1.1_搭建k8s集群-kubeadm搭建:初始化master节点_bilibili_哔哩哔哩_bilibili
不同厂商云服务器公网IP部署k8s集群(腾讯云+阿里云)_阿里云 不同vpc搭建k8s集群-CSDN博客
一台是阿里云,一台是腾讯云,一台是华为云,一台是百度云等多种公有云混合安装K8S集群-CSDN博客
Configuring containerd Registry Mirrors
If you need to configure registry mirrors for containerd, follow the steps below; otherwise skip this section.
If a mirror address becomes invalid, replace it with a working one.
# 1. Configure docker.io, registry.k8s.io, and k8s.gcr.io
# 1.1 Configure docker.io
mkdir -p /etc/containerd/certs.d/docker.io
touch /etc/containerd/certs.d/docker.io/hosts.toml
# Contents of hosts.toml:
server = "https://docker.io"
[host."https://registry.cn-hangzhou.aliyuncs.com"]
capabilities = ["pull", "resolve"]
# 1.2 Configure registry.k8s.io
mkdir -p /etc/containerd/certs.d/registry.k8s.io
touch /etc/containerd/certs.d/registry.k8s.io/hosts.toml
# Contents of hosts.toml:
server = "https://registry.k8s.io"
[host."https://k8s.m.daocloud.io"]
capabilities = ["pull", "resolve"]
# 1.3 Configure k8s.gcr.io
mkdir -p /etc/containerd/certs.d/k8s.gcr.io
touch /etc/containerd/certs.d/k8s.gcr.io/hosts.toml
# Contents of hosts.toml:
server = "https://k8s.gcr.io"
[host."k8s-gcr.m.daocloud.io"]
capabilities = ["pull", "resolve"]
# 2. In /etc/containerd/config.toml, set "config_path" as shown below
......
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"
......
# 3. Restart containerd to load the new configuration
systemctl restart containerd
# 4. Verify the configuration was loaded
containerd config dump | grep config_path
# Expected output:
config_path = "/etc/containerd/certs.d"
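# A quick end-to-end test of the mirror setup is to pull an image through the CRI
# with crictl (which, unlike "docker pull", goes through containerd's config):
crictl pull registry.k8s.io/pause:3.9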
See also: containerd 介绍_containerd镜像加速-CSDN博客