1. Kubernetes Cluster Planning
Host IP | Hostname | Specs | Role |
---|---|---|---|
192.168.100.3 | master1 | 2C/4G | Control-plane node |
192.168.100.4 | node1 | 2C/4G | Worker node |
192.168.100.5 | node2 | 2C/4G | Worker node |
2. Cluster Environment Preparation
(1) Initialization script
#!/bin/bash
echo "——>>> Disabling firewall and SELinux <<<——"
sleep 3
systemctl disable firewalld --now &> /dev/null
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
echo "——>>> Configuring Aliyun repositories <<<——"
sleep 3
mv /etc/yum.repos.d/* /tmp
curl -o /etc/yum.repos.d/centos.repo https://mirrors.aliyun.com/repo/Centos-7.repo &> /dev/null
curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo &> /dev/null
echo "——>>> Setting timezone and synchronizing time <<<——"
sleep 3
timedatectl set-timezone Asia/Shanghai
yum install -y chrony &> /dev/null
systemctl enable chronyd --now &> /dev/null
sed -i '/^server/s/^/# /' /etc/chrony.conf
sed -i '/^# server 3.centos.pool.ntp.org iburst/a\server ntp1.aliyun.com iburst\nserver ntp2.aliyun.com iburst\nserver ntp3.aliyun.com iburst' /etc/chrony.conf
systemctl restart chronyd &> /dev/null
chronyc sources &> /dev/null
echo "——>>> Raising the system open-file limit <<<——"
sleep 3
if ! grep "* soft nofile 65535" /etc/security/limits.conf &>/dev/null; then
cat >> /etc/security/limits.conf << EOF
# soft and hard limits for the number of open files
* soft nofile 65535
* hard nofile 65535
EOF
fi
echo "——>>> Tuning kernel parameters <<<——"
sleep 3
cat >> /etc/sysctl.conf << EOF
# Enable SYN cookies to mitigate SYN flood attacks (0 disables)
net.ipv4.tcp_syncookies = 1
# Cap the number of TIME_WAIT sockets so they cannot exhaust the server
net.ipv4.tcp_max_tw_buckets = 20480
# Length of the SYN backlog queue (default 1024); a larger queue holds more pending connections
net.ipv4.tcp_max_syn_backlog = 20480
# Maximum packets queued per interface when packets arrive faster than the kernel can process them
net.core.netdev_max_backlog = 262144
# Timeout for sockets in the FIN-WAIT-2 state
net.ipv4.tcp_fin_timeout = 20
EOF
sysctl -p &> /dev/null  # apply the new settings immediately
echo "——>>> Reducing swap usage <<<——"
sleep 3
echo "0" > /proc/sys/vm/swappiness
echo "——>>> Installing performance-analysis and other tools <<<——"
sleep 3
yum install -y vim net-tools lsof wget lrzsz &> /dev/null
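After the script finishes, an optional sanity check on each node can confirm the key settings took effect:
getenforce                        # expect Permissive (Disabled after a reboot)
chronyc sources                   # the Aliyun NTP servers should be listed
sysctl net.ipv4.tcp_syncookies    # expect net.ipv4.tcp_syncookies = 1
ulimit -n                         # a new login session should report 65535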
(2) Configure host mappings
cat >> /etc/hosts << EOF
192.168.100.3 k8s-master
192.168.100.4 k8s-node1
192.168.100.5 k8s-node2
EOF
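An optional quick check that the names resolve from every node:
ping -c 2 k8s-node1
ping -c 2 k8s-node2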
3. Docker Environment Setup
(1) Install Docker
[root@master ~]# curl -o /etc/yum.repos.d/docker-ce.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
[root@master ~]# yum list docker-ce --showduplicates | sort -r
* updates: mirrors.aliyun.com
Loading mirror speeds from cached hostfile
Loaded plugins: fastestmirror
Installed Packages
* extras: mirrors.aliyun.com
docker-ce.x86_64 3:26.1.4-1.el7 docker-ce-stable
docker-ce.x86_64 3:26.1.4-1.el7 @docker-ce-stable
docker-ce.x86_64 3:26.1.3-1.el7 docker-ce-stable
docker-ce.x86_64 3:26.1.2-1.el7 docker-ce-stable
docker-ce.x86_64 3:26.1.1-1.el7 docker-ce-stable
docker-ce.x86_64 3:26.1.0-1.el7 docker-ce-stable
...
[root@master ~]# yum -y install docker-ce
[root@master ~]# systemctl enable docker --now
[root@master ~]# docker version
Client: Docker Engine - Community
Version: 26.1.4
API version: 1.45
Go version: go1.21.11
Git commit: 5650f9b
Built: Wed Jun 5 11:32:04 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 26.1.4
API version: 1.45 (minimum version 1.24)
Go version: go1.21.11
Git commit: de5c9cf
Built: Wed Jun 5 11:31:02 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.33
GitCommit: d2d58213f83a351ca8f528a95fbd145f5654e957
runc:
Version: 1.1.12
GitCommit: v1.1.12-0-g51d5e94
docker-init:
Version: 0.19.0
GitCommit: de40ad0
(2) Configure registry mirrors and the cgroup driver
sudo mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json << 'EOF'
{
"registry-mirrors": [
"https://docker.aityp.com",
"https://docker.m.daocloud.io",
"https://reg-mirror.qiniu.com",
"https://k8s.m.daocloud.io",
"https://elastic.m.daocloud.io",
"https://gcr.m.daocloud.io",
"https://ghcr.m.daocloud.io",
"https://k8s-gcr.m.daocloud.io",
"https://mcr.m.daocloud.io",
"https://nvcr.m.daocloud.io",
"https://quay.m.daocloud.io",
"https://jujucharms.m.daocloud.io",
"https://rocks-canonical.m.daocloud.io",
"https://d3p1s1ji.mirror.aliyuncs.com"
],
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
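An optional check that Docker picked up the new settings:
[root@master ~]# docker info --format '{{.CgroupDriver}}'   # should print systemd
[root@master ~]# docker info | grep -A 15 'Registry Mirrors'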
(3) Install cri-dockerd
cri-dockerd is the shim that lets Docker and Kubernetes communicate.
Kubernetes removed dockershim in version 1.24, and Docker Engine does not natively implement the CRI specification, so the two can no longer be integrated directly. To bridge this gap, Mirantis and Docker jointly created the cri-dockerd project, which provides a CRI-compliant shim for Docker Engine so that Kubernetes can drive Docker through the CRI. Therefore, to keep using Docker with Kubernetes 1.24 or later, cri-dockerd must be installed, and the cluster then reaches Docker through it (note: it must be installed on every node).
Project page: https://github.com/Mirantis/cri-dockerd
[root@master ~]# wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.2/cri-dockerd-0.3.2-3.el7.x86_64.rpm
[root@master ~]# rpm -ivh cri-dockerd-0.3.2-3.el7.x86_64.rpm
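An optional check that the package installed cleanly:
[root@master ~]# rpm -q cri-dockerd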
(4) Edit the cri-docker.service file
[root@master ~]# vim /usr/lib/systemd/system/cri-docker.service
...
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.9 --network-plugin=cni --cni-bin-dir=/opt/cni/bin --cni-cache-dir=/var/lib/cni/cache --cni-conf-dir=/etc/cni/net.d
...
Reload systemd and restart the service:
systemctl daemon-reload
systemctl restart cri-docker.service
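Optionally, make sure the service starts on boot and that the CRI socket exists:
systemctl enable cri-docker.service --now
ls -l /var/run/cri-dockerd.sock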
4. Configure the Alibaba Cloud YUM Repository
(1) Add the Kubernetes repo
[root@master ~]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
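An optional check that the repository is reachable and carries the target version:
[root@master ~]# yum list kubeadm --showduplicates | grep 1.28.0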
(2) Install the Kubernetes tools
[root@master ~]# yum install -y kubelet-1.28.0 kubeadm-1.28.0 kubectl-1.28.0
- kubeadm: initializes the cluster, configures the required components, and generates the corresponding certificates and tokens;
- kubelet: communicates with the master node, creates, updates, and deletes Pods according to the master's scheduling decisions, and maintains container state on the node;
- kubectl: the command-line tool used to manage the Kubernetes cluster;
Enable kubelet to start on boot. There is no need to start it manually now; it will be started during cluster initialization.
[root@master ~]# systemctl enable kubelet
(3) Initialize the cluster
Command-line method
kubeadm init \
--apiserver-advertise-address=192.168.100.3 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.28.0 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16 \
--cri-socket=unix:///var/run/cri-dockerd.sock
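Optionally, the control-plane images can be pulled in advance so that initialization is faster and registry problems show up early; this uses the same repository, version, and CRI socket as above:
kubeadm config images pull \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.28.0 \
--cri-socket unix:///var/run/cri-dockerd.sock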
YAML file method
[root@master ~]# kubeadm config print init-defaults > kubeadm-config.yaml
[root@master ~]# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.100.3   ### this node's IP address
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
  imagePullPolicy: IfNotPresent
  name: master   ### set to this node's hostname
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers   ### change to the mirror repository
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16   ### add the Pod network CIDR (matches --pod-network-cidr)
  serviceSubnet: 10.96.0.0/12
scheduler: {}
Initialize the cluster
[root@master1 ~]# kubeadm init --config kubeadm-config.yaml --upload-certs
# Option notes:
--upload-certs // uploads the certificates generated during initialization to etcd so that additional control-plane nodes can join the cluster
If initialization fails, reset with the following command and try again:
[root@master1 ~]# kubeadm reset --cri-socket /var/run/cri-dockerd.sock
(4) Configure the kubeconfig credentials
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
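An optional connectivity check against the API server:
[root@master ~]# kubectl cluster-info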
Check the node status with kubectl:
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master NotReady control-plane 20s v1.28.0
Note: the node will remain "NotReady" until the network plugin is deployed.
(5) Join the worker nodes to the cluster
On each worker node, run the kubeadm join command that was printed at the end of the kubeadm init output, appending the cri-dockerd socket. The token and hash below are placeholders; substitute the values from your own init output.
kubeadm join 192.168.100.3:6443 \
--token <token> \
--discovery-token-ca-cert-hash sha256:<hash> \
--cri-socket=unix:///var/run/cri-dockerd.sock
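If the join command has been lost or the token has expired, a new one can be generated on the control-plane node (remember to append the --cri-socket flag when running it on the workers):
[root@master ~]# kubeadm token create --print-join-command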
5. Configure the Calico Network Plugin
(1) Download the configuration files
wget https://docs.projectcalico.org/manifests/tigera-operator.yaml
wget https://docs.projectcalico.org/manifests/custom-resources.yaml
(2) Edit the configuration file
[root@master ~]# vim custom-resources.yaml
# This section includes base Calico installation configuration.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16   ### change to match the --pod-network-cidr address
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
---
# This section configures the Calico API server.
# For more information, see: https://projectcalico.docs.tigera.io/master/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
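Instead of editing by hand, the pool CIDR can also be changed with a one-line substitution, assuming the downloaded file still contains Calico's default 192.168.0.0/16 pool:
sed -i 's#192.168.0.0/16#10.244.0.0/16#' custom-resources.yaml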
(3) Deploy the Calico network
Note: using kubectl apply will fail with an error like the following:
[root@master ~]# kubectl apply -f tigera-operator.yaml
...
The CustomResourceDefinition "installations.operator.tigera.io" is invalid: metadata.annotations:
Too long: must have at most 262144 bytes
Use the following commands to deploy instead:
[root@master ~]# kubectl create -f tigera-operator.yaml
[root@master ~]# kubectl create -f custom-resources.yaml
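It may take a few minutes for the operator to pull images and start the Calico pods; progress can be watched with:
watch kubectl get pods -n calico-system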
Check the cluster node and Pod status:
[root@master ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane 18m v1.28.0
node Ready <none> 17m v1.28.0
[root@master ~]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-apiserver calico-apiserver-7b9c9fb95d-7w666 1/1 Running 0 12s
calico-apiserver calico-apiserver-7b9c9fb95d-hfjjg 1/1 Running 0 12s
calico-system calico-kube-controllers-685f7c9b88-rpmvl 1/1 Running 0 29s
calico-system calico-node-d52d7 1/1 Running 0 30s
calico-system calico-node-t6qpr 1/1 Running 0 30s
calico-system calico-typha-589b7cd4b4-hvq7q 1/1 Running 0 30s
calico-system csi-node-driver-crmm9 2/2 Running 0 29s
calico-system csi-node-driver-kjnlc 2/2 Running 0 30s
kube-system coredns-66f779496c-6vpq8 1/1 Running 0 17m
kube-system coredns-66f779496c-khqb4 1/1 Running 0 17m
kube-system etcd-master 1/1 Running 0 18m
kube-system kube-apiserver-master 1/1 Running 0 18m
kube-system kube-controller-manager-master 1/1 Running 0 18m
kube-system kube-proxy-9ll4p 1/1 Running 0 16m
kube-system kube-proxy-wpgnh 1/1 Running 0 18m
kube-system kube-scheduler-master 1/1 Running 0 18m
tigera-operator tigera-operator-8547bd6cc6-vmjvq 1/1 Running 0 39s
(4) Test the deployment
[root@master ~]# kubectl create deployment nginx --image=nginx:1.20.2
[root@master ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
nginx-6f974c44c8-xzvwg 1/1 Running 0 65s
[root@master ~]# kubectl describe pod nginx-6f974c44c8-xzvwg
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 64s default-scheduler Successfully assigned default/nginx-6f974c44c8-xzvwg to node
Normal Pulling 63s kubelet Pulling image "nginx:1.20.2"
Normal Pulled 3s kubelet Successfully pulled image "nginx:1.20.2" in 1m0.406s (1m0.406s including waiting)
Normal Created 2s kubelet Created container nginx
Normal Started 2s kubelet Started container nginx
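To confirm the pod actually serves traffic, the deployment can optionally be exposed and queried; the node IP and NodePort below are placeholders, taken from the kubectl get svc output:
[root@master ~]# kubectl expose deployment nginx --port=80 --type=NodePort
[root@master ~]# kubectl get svc nginx
[root@master ~]# curl -I http://<node-IP>:<NodePort>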