Step 1: Set up the k8s master
Disable all swap on the system:
swapoff -a
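Note that swapoff -a only lasts until the next reboot. A minimal sketch of making it persistent, assuming swap is configured through /etc/fstab (the helper name disable_swap_in_fstab is made up for illustration, and systems that set up swap another way need a different approach):

```shell
# Comment out active swap entries in an fstab-style file so that swap
# stays disabled after a reboot. A backup is kept as <file>.bak.
disable_swap_in_fstab() {
  fstab="${1:-/etc/fstab}"
  # Prefix every non-comment line that has a whitespace-delimited "swap" field with '#'
  sed -E -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Run as root it would be called as disable_swap_in_fstab /etc/fstab. It is worth checking the file by hand afterwards.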
Install Docker:
sudo apt install docker.io
Make sure the package index is up to date, and install the package needed to fetch Kubernetes packages over HTTPS:
apt-get update
apt-get install -y apt-transport-https
Add the Kubernetes APT repository (here via the Aliyun mirror) to the system's package manager:
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
Install the Kubernetes components:
apt-get update
apt-get install -y kubelet kubeadm kubectl
Pull the container images Kubernetes needs so they are ready for use in the cluster (from mainland China this has to go through the Aliyun mirror):
kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers
Initialize the Kubernetes master node:
sudo kubeadm init --apiserver-advertise-address <your-master-ip> --pod-network-cidr 10.244.0.0/16 --image-repository registry.aliyuncs.com/google_containers
Error: [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
....
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
....
root@user:path# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Fri 2024-02-23 15:37:06 CST; 6s ago
       Docs: https://kubernetes.io/docs/home/
    Process: 637003 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
   Main PID: 637003 (code=exited, status=1/FAILURE)
The kubelet failed to start.
Fix:
The kubelet failed to start because of a version incompatibility,
so I removed containerd and reinstalled it.
systemctl stop containerd
apt reinstall containerd
systemctl start containerd
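To check by hand whether the kubelet is answering, the health probe that kubeadm quotes in the error message can be repeated in a small loop. This is only a sketch: the function name kubelet_healthy, the retry count and the delay are assumptions chosen for illustration, and only the URL comes from the log above.

```shell
# Poll the kubelet health endpoint a few times.
# Returns 0 as soon as the endpoint answers successfully, 1 if it never does.
kubelet_healthy() {
  url="${1:-http://localhost:10248/healthz}"
  tries="${2:-5}"    # number of attempts (illustrative default)
  delay="${3:-2}"    # seconds between attempts (illustrative default)
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors; "connection refused" also fails
    if curl -fsS --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example: kubelet_healthy || journalctl -u kubelet --no-pager | tail -n 50, which prints the most recent kubelet log lines when the probe keeps failing.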
Error: [ERROR Port-10250]: Port 10250 is in use
root@ubuntu:~# sudo kubeadm init --apiserver-advertise-address .... --pod-network-cidr 10.244.0.0/16 --image-repository registry.aliyuncs.com/google_containers
I0223 11:26:31.707217 1887668 version.go:256] remote version is much newer: v1.29.2; falling back to: stable-1.28
[init] Using Kubernetes version: v1.28.7
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
Fix:
kubeadm reset
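Before resetting, it can help to see what is actually holding port 10250 (often a kubelet left over from an earlier attempt, though that is an assumption about the typical case). A small sketch that filters ss output; the helper name port_listeners is hypothetical:

```shell
# Read `ss -ltn`-style output on stdin and print only the rows whose
# local address (4th column) ends in ":<port>".
port_listeners() {
  awk -v port="$1" '$4 ~ (":" port "$")'
}
```

Usage: ss -ltnp | port_listeners 10250. With -p and root privileges, ss also shows the owning process in the last column.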
Error: timed out waiting for the condition
.....
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
.....
This time the kubelet itself looks healthy (status shown as active, in green):
systemctl status kubelet
Fix:
Switching to version 1.27.6 made it work. I don't know why; possibly some package could not be downloaded because the network was not going through a mirror or proxy.
# Download the public signing key for the Kubernetes package repository
rm -rf /etc/apt/keyrings/kubernetes-apt-keyring.gpg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.27/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
# Add the Kubernetes apt repository
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.27/deb/ /
EOF
# Update the apt package index and install kubelet, kubeadm and kubectl at a fixed version
apt-get update
apt-get install -y kubelet=1.27.6-1.1 kubectl=1.27.6-1.1 kubeadm=1.27.6-1.1
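The pkgs.k8s.io repository is split per minor version, so the minor version appears both in the key URL and in the sources line. A small sketch that builds the sources line for a given minor version; the function name k8s_apt_source_line is made up for illustration, and the URL pattern is the one used in the commands above:

```shell
# Print the apt sources line for one Kubernetes minor version (e.g. 1.27).
k8s_apt_source_line() {
  minor="$1"
  keyring="${2:-/etc/apt/keyrings/kubernetes-apt-keyring.gpg}"
  printf 'deb [signed-by=%s] https://pkgs.k8s.io/core:/stable:/v%s/deb/ /\n' "$keyring" "$minor"
}
```

For example, k8s_apt_source_line 1.27 > /etc/apt/sources.list.d/kubernetes.list writes the same line as the heredoc. To keep apt from upgrading the packages later, apt-mark hold kubelet kubeadm kubectl can be run after the install.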
Error: ....already exists
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
        [ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
Fix: this happens because the configuration files from the previous failed attempt are still in place:
kubeadm reset
Once kubeadm init succeeds, you can check the status:
kubectl get nodes
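kubectl needs a kubeconfig before it can reach the new cluster, and kubeadm init prints instructions for this at the end of a successful run. A minimal sketch of that step, assuming the admin config is at the default /etc/kubernetes/admin.conf (the function name setup_kubeconfig is hypothetical):

```shell
# Copy the cluster admin kubeconfig to where kubectl looks for it by default.
setup_kubeconfig() {
  src="${1:-/etc/kubernetes/admin.conf}"
  dest_dir="${2:-$HOME/.kube}"
  mkdir -p "$dest_dir"
  cp "$src" "$dest_dir/config"
  # The file holds admin credentials, so restrict it to the owner
  chmod 600 "$dest_dir/config"
}
```

When working as root, exporting KUBECONFIG=/etc/kubernetes/admin.conf for the current shell is an alternative to copying the file.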
Step 2: Set up the k8s node
Set up hostname resolution by adding the node and master to /etc/hosts:
cat << EOF >> /etc/hosts
$IP_NODE_01 $HOSTNAME_NODE01
$IP_MASTER $HOSTNAME_MASTER
EOF
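Because the heredoc delimiter is unquoted, the shell expands $IP_NODE_01 and the other variables, so they must be set before the command runs, and running it twice appends duplicate lines. A sketch of an idempotent alternative; the helper name add_host_entry, the hostname k8s-node-01 and the 192.0.2.x addresses in the usage note are placeholders, not values from this setup:

```shell
# Append "IP NAME" to a hosts file unless that exact mapping is already present.
add_host_entry() {
  ip="$1"
  name="$2"
  hosts_file="${3:-/etc/hosts}"
  # Look for a line whose first field is the IP and which lists the name
  if awk -v ip="$ip" -v name="$name" \
      '$1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 } END { exit !found }' \
      "$hosts_file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}
```

Example, run as root on each machine: add_host_entry 192.0.2.10 k8s-master-01 and add_host_entry 192.0.2.11 k8s-node-01.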
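The remaining steps (disabling swap, installing Docker and the Kubernetes components) are the same as on the master. kubeadm init is not run on a node; the node joins the cluster in Step 3 instead.
Step 3: Build the k8s cluster
Print the join command, including a new token (run on the master node):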
kubeadm token create --print-join-command
Join the cluster (run on the node):
kubeadm join k8s-master-01:6443 --token ????????? \
    --discovery-token-ca-cert-hash sha256:??????????
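The sha256 value is a hash of the cluster CA's public key. If only the token is at hand, the hash can be recomputed on the master. This sketch follows the openssl pipeline described in the kubeadm documentation and assumes an RSA CA key at the default path /etc/kubernetes/pki/ca.crt. The function name ca_cert_hash is made up for illustration.

```shell
# Print the SHA-256 hash (hex) of the public key in a CA certificate,
# in the form expected after "sha256:" by --discovery-token-ca-cert-hash.
ca_cert_hash() {
  cert="${1:-/etc/kubernetes/pki/ca.crt}"
  openssl x509 -pubkey -noout -in "$cert" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}
```

On the master, echo "sha256:$(ca_cert_hash)" prints the value to pass to kubeadm join. Afterwards, kubectl get nodes on the master should list the new node.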