Chapter 1: Environment Setup
Windows users start here
VMware Workstation
Download the lab environment Courseware_1C06_v1.22 from Baidu Cloud and extract it to D:\, so that the paths below exist.
Install VMware Workstation with its default configuration (do not change the installation path); the installer is in D:\Courseware\CKA_Wayne\WinApps. After installation, confirm in the VMware Workstation network settings that NAT networking is enabled and that DHCP is enabled (both are on by default).
Importing the VMs
Go to D:\Courseware\CKA_Wayne\VMBat-Container and double-click the clone-all-vms script to create the VMs automatically. If you get a permission error, right-click the script and choose Run as administrator.
When the script finishes, it will have created four VMs and opened a txt file showing their current IP addresses.
Connecting to the VMs
Use the SSH client MobaXterm in D:\Courseware\CKA_Wayne\WinApps to connect to the first IP address you obtained.
VM initialization
Generate the Ansible inventory
[root@localhost ~]# cd /resources/playbooks/initialize
[root@localhost initialize]# ls
assign-ip.sh hosts hosts_template ifcfg-eth0_template ifcfg-eth1_template initialize.yml inventory
[root@localhost initialize]# ./assign-ip.sh
[clientvm]
clientvm.example.com ansible_connection=local ansible_ssh_host=192.168.241.132
[masters]
master.example.com ansible_user=root ansible_ssh_common_args="-o StrictHostKeyChecking=no" ansible_ssh_host=192.168.241.129
[workers]
worker1.example.com ansible_user=root ansible_ssh_common_args="-o StrictHostKeyChecking=no" ansible_ssh_host=192.168.241.130
worker2.example.com ansible_user=root ansible_ssh_common_args="-o StrictHostKeyChecking=no" ansible_ssh_host=192.168.241.131
[lab:children]
masters
workers
Install Ansible
[root@localhost initialize]# yum install -y ansible
Run the playbook
[root@localhost initialize]# ansible-playbook -i inventory initialize.yml
macOS users start here
Download the lab environment Courseware_1C06_v1.22_VMReady from Baidu Cloud.
Configure the VMware NAT network as 192.168.126.0/24.
Import the four VMs manually.
Configure a shared folder on each of the four VMs: share name resources, pointing at the resources directory in the extracted environment. Start the VMs and verify on each one that the /resources directory is accessible:
ls /resources
Docker installation
[root@clientvm k8s]# cd /resources/playbooks/k8s/
[root@clientvm k8s]# ansible-playbook -i hosts docker.yml
Verify
ansible -i hosts lab -m command -a 'docker images'
Official installation steps
# (Install Docker CE)
## Set up the repository
### Install required packages
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
## Add the Docker repository
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# Install Docker CE
yum install -y containerd.io-1.2.13 docker-ce-19.03.11 docker-ce-cli-19.03.11
To install a specific version, first list all the available versions:
yum list docker-ce --showduplicates | sort -r
docker-ce.x86_64 3:18.09.1-3.el7 docker-ce-stable
docker-ce.x86_64 3:18.09.0-3.el7 docker-ce-stable
docker-ce.x86_64 18.06.1.ce-3.el7 docker-ce-stable
docker-ce.x86_64 18.06.0.ce-3.el7 docker-ce-stable
Then install the version you chose:
yum install docker-ce-<VERSION_STRING> docker-ce-cli-<VERSION_STRING> containerd.io
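The <VERSION_STRING> comes from the second column of the listing above, minus the epoch prefix (3:) and the release suffix (-3.el7). A small sketch of that extraction, using a sample line copied from the listing above:

```shell
# Derive an installable VERSION_STRING from one line of `yum list` output.
line='docker-ce.x86_64   3:18.09.1-3.el7   docker-ce-stable'
# Take column 2, strip the "3:" epoch and the "-3.el7" release suffix.
ver=$(echo "$line" | awk '{print $2}' | sed -e 's/^[0-9]*://' -e 's/-.*//')
echo "yum install docker-ce-${ver} docker-ce-cli-${ver} containerd.io"
```

This prints the concrete install command for 18.09.1.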
Continuing…
## Create /etc/docker
sudo mkdir /etc/docker
# Set up the Docker daemon
cat <<EOF | sudo tee /etc/docker/daemon.json
{
"registry-mirrors": ["https://pee6w651.mirror.aliyuncs.com", "https://ustc-edu-cn.mirror.aliyuncs.com"],
"exec-opts": ["native.cgroupdriver=systemd"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2"
}
EOF
# Create /etc/systemd/system/docker.service.d
sudo mkdir -p /etc/systemd/system/docker.service.d
# Restart Docker
sudo systemctl daemon-reload
sudo systemctl restart docker
sudo systemctl enable docker
Installing kubelet, kubeadm, and kubectl
Installation steps
Complete the system configuration:
- Turn off swapping
- Turn off SELinux
- Manage Kernel parameters
[root@clientvm k8s]# ansible-playbook -i hosts tune-os.yml
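For reference, the manual equivalents of the three bullet points above look roughly like the sketch below. The authoritative tasks are in tune-os.yml; the sysctl keys shown are the ones kubeadm commonly requires, staged in a scratch file here for illustration:

```shell
# Kernel parameters Kubernetes networking expects, written to a scratch file.
cat <<'EOF' > /tmp/k8s-sysctl-demo.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF
# On a real node (as root) you would then run:
#   swapoff -a && sed -i '/ swap / s/^/#/' /etc/fstab            # turn off swap
#   setenforce 0 && sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
#   cp /tmp/k8s-sysctl-demo.conf /etc/sysctl.d/k8s.conf && sysctl --system
cat /tmp/k8s-sysctl-demo.conf
```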
Install kubeadm, kubelet, and kubectl
[root@clientvm k8s]# ansible-playbook -i hosts kubeadm-kubelet.yml
Command completion
[root@clientvm k8s]# echo "source <(kubectl completion bash)" >>~/.bashrc
[root@clientvm k8s]# . ~/.bashrc
Official installation steps
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
List all versions
yum list kubeadm --showduplicates | sort -r
Install a specific version
setenforce 0
yum install -y kubelet-<VERSION_STRING> kubeadm-<VERSION_STRING> kubectl-<VERSION_STRING>
systemctl enable kubelet && systemctl start kubelet
Installing the K8S cluster
Install the master
Preload the images in advance to save time. To avoid import failures caused by disk pressure when several VMs read and write simultaneously, add --forks 1 to limit parallelism to one host at a time:
[root@clientvm k8s]# ansible-playbook --forks 1 -i hosts preload-images.yml
If the image import fails, run the following on each node to delete the images, then re-import:
for i in $(docker images | awk '{print $3}' |grep -v IMAGE); do docker rmi $i ; done
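The pipeline inside that loop keeps the IMAGE ID column and drops the header row. Applied to sample `docker images` output (the image names below are illustrative), it yields just the IDs the loop deletes:

```shell
# Simulated `docker images` output, to show what the cleanup loop iterates over.
sample='REPOSITORY   TAG      IMAGE ID       CREATED        SIZE
nginx        latest   605c77e624dd   2 years ago    141MB
busybox      latest   beae173ccac6   2 years ago    1.24MB'
# Column 3 is "IMAGE" on the header line and the image ID on data lines.
echo "$sample" | awk '{print $3}' | grep -v IMAGE
```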
Alternatively, when importing you can replace preload-images.yml with the following yml files and import in batches:
[root@clientvm ~]# ll /resources/playbooks/k8s/
total 25
-rwxrwxrwx. 1 root root 2061 Nov 20 17:40 01-preload-install-Image.yml
-rwxrwxrwx. 1 root root 1424 Nov 20 17:45 02-preload-other.yml
-rwxrwxrwx. 1 root root 1261 Nov 20 17:43 03-preload-ingress-storage-metallb.yml
-rwxrwxrwx. 1 root root 1951 Nov 20 17:46 04-preload-harbor-Exam.yml
-rwxrwxrwx. 1 root root 531 Nov 24 10:21 05-preload-dashboard.yaml
-rwxrwxrwx. 1 root root 1400 Nov 24 15:13 06-preload-Prometheus.yaml
[root@clientvm k8s]# ssh master
Last login: Thu Nov 26 11:53:41 2020 from 192.168.241.132
[root@master ~]#
[root@master ~]# source <(kubeadm completion bash)
Generate the configuration file
Replace the IP below with your own master node's IP:
[root@master ~]# kubeadm config print init-defaults >init.yaml
[root@master ~]# vim init.yaml
## Modify the following lines
advertiseAddress: 192.168.133.129
imageRepository: registry.aliyuncs.com/google_containers
......
networking:
dnsDomain: example.com
serviceSubnet: 10.96.0.0/12
podSubnet: 10.244.0.0/16
Initialize
You can edit the IP address in this file and use it directly: /resources/yaml/cluster-init.yaml
[root@master ~]# kubeadm init --config /resources/yaml/cluster-init.yaml
......
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.133.129:6443 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:3c2a964155d000ac6950f7bc33f765e937fe2f58fdf4c2fe99792f886a4a84a4
Command to pull the images manually
kubeadm config images pull --config cluster-init.yaml
Configure kubectl
Configure kubectl on the master
[root@master ~]# mkdir -p ~/.kube
[root@master ~]# cp -i /etc/kubernetes/admin.conf ~/.kube/config
Configure kubectl on the client VM
[root@clientvm k8s]# mkdir -p ~/.kube
[root@clientvm k8s]# scp master:/root/.kube/config ~/.kube/
[root@clientvm k8s]# kubectl get node
NAME STATUS ROLES AGE VERSION
master.example.com Ready master 46m v1.20.0
(Option 1 of 2) Deploy the Flannel network add-on
Official documentation: https://github.com/coreos/flannel
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
[root@clientvm ~]# kubectl apply -f /resources/yaml/kube-flannel.yml
[root@clientvm ~]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d56c8448f-c52pv 1/1 Running 0 9m52s
kube-system coredns-6d56c8448f-kx9l7 1/1 Running 0 9m52s
kube-system etcd-master.example.com 1/1 Running 0 10m
kube-system kube-apiserver-master.example.com 1/1 Running 0 10m
kube-system kube-controller-manager-master.example.com 1/1 Running 0 10m
kube-system kube-flannel-ds-z2f78 1/1 Running 0 6m36s
kube-system kube-proxy-9dlxj 1/1 Running 0 9m52s
kube-system kube-scheduler-master.example.com 1/1 Running 0 10m
(Option 2 of 2) Deploy the Calico network add-on
See the official documentation:
https://docs.projectcalico.org/getting-started/kubernetes/quickstart
or use one of the following YAML manifests directly:
https://docs.projectcalico.org/v3.14/manifests/calico.yaml
https://docs.projectcalico.org/v3.17/manifests/calico.yaml
For K8S 1.20, use calico-v3.14.yaml.
For K8S 1.22, use calico-v3.21.yaml; its images are already preloaded.
[root@master ~]# cd /resources/yaml/
[root@master yaml]# ls
calico.yaml cluster-init.yaml
[root@master yaml]# kubectl get node
NAME STATUS ROLES AGE VERSION
master.example.com NotReady master 4m57s v1.20.0
[root@master yaml]#
[root@master yaml]# kubectl apply -f calico-v3.21.yaml
The master node is tainted by default to forbid Pod scheduling, so Pods will stay Pending and cannot start there; remove the taint first.
(If you have already added worker nodes, skip this step.)
[root@master yaml]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6dfcd885bf-hzg95 0/1 Pending 0 119s
kube-system calico-node-kdrvs 0/1 Init:0/3 0 119s
kube-system coredns-6d56c8448f-hfxnk 0/1 Pending 0 7m5s
kube-system coredns-6d56c8448f-mtxb7 0/1 Pending 0 7m5s
[root@master yaml]# kubectl taint nodes --all node-role.kubernetes.io/master-
node/master.example.com untainted
[root@master yaml]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6dfcd885bf-hzg95 0/1 ContainerCreating 0 10m
kube-system calico-node-kdrvs 0/1 PodInitializing 0 10m
kube-system coredns-6d56c8448f-hfxnk 0/1 ContainerCreating 0 15m
kube-system coredns-6d56c8448f-mtxb7 0/1 ContainerCreating 0 15m
[root@master yaml]# kubectl get node
NAME STATUS ROLES AGE VERSION
master.example.com Ready master 29m v1.20.0
Adding nodes
[root@clientvm k8s]# ssh worker1
Last login: Thu Nov 26 16:27:47 2020 from 192.168.241.132
[root@worker1 ~]#
[root@worker1 ~]#
[root@worker1 ~]# kubeadm join 192.168.133.129:6443 --token abcdef.0123456789abcdef \
> --discovery-token-ca-cert-hash sha256:00a111079e7d2e367e2b21500c64202a981898cf7e058957cfa5d06e933c2362
[root@clientvm k8s]# ssh worker2
Last login: Thu Nov 26 16:27:44 2020 from 192.168.241.132
[root@worker2 ~]# kubeadm join 192.168.133.129:6443 --token abcdef.0123456789abcdef \
> --discovery-token-ca-cert-hash sha256:00a111079e7d2e367e2b21500c64202a981898cf7e058957cfa5d06e933c2362
[root@clientvm yaml]# kubectl get node
NAME STATUS ROLES AGE VERSION
master.example.com Ready control-plane,master 4m3s v1.20.0
worker1.example.com Ready <none> 2m3s v1.20.0
worker2.example.com Ready <none> 118s v1.20.0
[root@clientvm ~]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-6dfcd885bf-rf9c5 1/1 Running 0 16m
kube-system calico-node-7f6vl 1/1 Running 0 12m
kube-system calico-node-ncbcw 1/1 Running 0 16m
kube-system calico-node-rvddq 1/1 Running 0 12m
kube-system coredns-6d56c8448f-2hqt4 1/1 Running 0 23m
kube-system coredns-6d56c8448f-6rwwd 1/1 Running 0 23m
kube-system etcd-master.example.com 1/1 Running 0 23m
kube-system kube-apiserver-master.example.com 1/1 Running 0 23m
kube-system kube-controller-manager-master.example.com 1/1 Running 0 23m
kube-system kube-proxy-7wsbm 1/1 Running 0 23m
kube-system kube-proxy-vmbgn 1/1 Running 0 12m
kube-system kube-proxy-xhs29 1/1 Running 0 12m
kube-system kube-scheduler-master.example.com 1/1 Running 0 23m
ComponentStatus errors
Symptom:
[root@master yaml]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0 Healthy {"health":"true"}
Fix:
# Edit the following two configuration files and comment out the - --port=0 line
[root@master yaml]# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
[root@master yaml]# vim /etc/kubernetes/manifests/kube-scheduler.yaml
[root@master yaml]# grep 'port=0' /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests/kube-scheduler.yaml
/etc/kubernetes/manifests/kube-controller-manager.yaml:# - --port=0
/etc/kubernetes/manifests/kube-scheduler.yaml:# - --port=0
## Restart kubelet.service
[root@master yaml]# systemctl restart kubelet.service
[root@master yaml]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
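The vim edits above can also be scripted with sed. Here is a sketch against a scratch copy; on the master the real targets are the two manifests under /etc/kubernetes/manifests:

```shell
# Comment out the "- --port=0" line, preserving indentation; demonstrated on a
# scratch file rather than a live static-Pod manifest.
cat <<'EOF' > /tmp/kube-scheduler-demo.yaml
    - --leader-elect=true
    - --port=0
EOF
sed -i 's/^\( *\)- --port=0/\1# - --port=0/' /tmp/kube-scheduler-demo.yaml
grep -- 'port=0' /tmp/kube-scheduler-demo.yaml
```

Remember that kubelet reloads static-Pod manifests on change, so after editing the real files a `systemctl restart kubelet.service` (as above) picks up the fix.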
Removing nodes
Run on the node to be removed:
[root@worker2 ~]# kubeadm reset -f
[root@worker2 ~]# iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
[root@worker2 ~]# ipvsadm -C
Run on the master:
[root@master yaml]# kubectl delete node worker2.example.com
node "worker2.example.com" deleted
[root@master yaml]# kubectl delete node worker1.example.com
node "worker1.example.com" deleted
[root@master yaml]#
[root@master yaml]# kubectl get node
NAME STATUS ROLES AGE VERSION
master.example.com Ready master 32m v1.20.0
Joining nodes after the token expires
List the tokens on the master
[root@master yaml]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
abcdef.0123456789abcdef 23h 2020-11-27T16:29:16+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
Generate a non-expiring token
[root@master yaml]# kubeadm token create --ttl 0
[root@master yaml]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
2kpxk0.3861kgminh7jafrp <forever> <never> authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
abcdef.0123456789abcdef 23h 2020-11-27T16:29:16+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
Get the discovery-token-ca-cert-hash
[root@master yaml]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
00a111079e7d2e367e2b21500c64202a981898cf7e058957cfa5d06e933c2362
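To see what that pipeline computes, here is a self-contained demonstration against a throwaway self-signed certificate (not your cluster's real CA, which lives at /etc/kubernetes/pki/ca.crt): the discovery hash is the SHA-256 digest of the CA's DER-encoded public key.

```shell
# Generate a throwaway CA certificate, then compute its discovery hash the
# same way as above: SHA-256 over the DER-encoded public key.
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo-ca.key \
    -out /tmp/demo-ca.crt -days 1 -subj "/CN=demo-ca" 2>/dev/null
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:${hash}"
```

The printed value is what you would pass to `kubeadm join --discovery-token-ca-cert-hash` if this were the cluster's CA.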
Run the join command on the node to join the cluster
[root@worker1 ~]# kubeadm join 192.168.133.129:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:00a111079e7d2e367e2b21500c64202a981898cf7e058957cfa5d06e933c2362
Containerd reference
Using containerd as the container runtime
For containerd installation and configuration, see:
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd
Note: you must also change the image registry to registry.aliyuncs.com/google_containers to match your kubeadm version, and set the pause container to a version compatible with your K8S release.
......
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
......
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
......
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
Reference configuration file: /resources/playbooks/k8s/config.toml
Deployment steps:
/resources/playbooks/k8s
ansible-playbook -i hosts containerd.yaml
ansible-playbook -i hosts tune-os.yml
ansible-playbook -i hosts kubeadm-kubelet.yml
kubeadm init --config /resources/yaml/cluster-init-containerd.yaml
In cluster-init-containerd.yaml, change criSocket to point at the containerd socket, matching the path in the containerd configuration:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.126.128
bindPort: 6443
nodeRegistration:
criSocket: unix:///run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
name: master.example.com
taints: null
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.23.0
networking:
dnsDomain: example.com
serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd