kubernetes_27: Deploying Kubernetes v1.23.5 on containerd

Introduction

For years, Docker and Kubernetes have been seen as the two pillars of cloud-era development:

  • Docker is an open-source application container engine. Developers package an application and its dependencies into a portable container and run it on any mainstream Linux machine, with virtualization-like isolation.
  • Kubernetes, often described as "born for Docker", is an open-source container orchestration system used to manage containerized applications across multiple hosts on a cloud platform.

However, Kubernetes 1.20 deprecated dockershim and scheduled it for removal in a later release. Kubernetes instead talks to a container runtime, the component responsible for pulling and running container images. Docker has been a popular choice for that runtime (other common options include containerd and CRI-O), so as a user you only need to switch the container runtime from Docker to another supported one.

The following deploys a Kubernetes cluster on containerd instead of Docker.

Environment

OS: CentOS Stream release 8
VIP: 192.168.0.179

IP               hostname             deployed roles
192.168.0.100    www.kevin.com        ops node, including DNS and CA certificate generation
192.168.0.170    k8s-170.kevin.com    master, keepalived
192.168.0.171    k8s-171.kevin.com    master, keepalived
192.168.0.172    k8s-172.kevin.com    worker
192.168.0.173    k8s-173.kevin.com    worker

Pre-installation setup

Disable the firewall on every node

[root@k8s-170 ~]# systemctl stop firewalld
[root@k8s-170 ~]# systemctl disable firewalld

Install common tools on every node

[root@k8s-170 ~]# dnf -y install epel-release vim wget net-tools telnet tree nmap sysstat lrzsz dos2unix bind-utils

Disable SELinux on every node

## Disable immediately (until reboot)
[root@k8s-170 ~]# setenforce 0
## Disable permanently (takes effect after reboot)
[root@k8s-170 ~]# sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
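
To confirm the change, getenforce should now report Permissive (or Disabled after a reboot):

[root@k8s-170 ~]# getenforce
Permissive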

Set the hostname

## Each node must get a different name
[root@k8s-170 ~]# hostnamectl set-hostname k8s-170.kevin.com

Disable swap

[root@k8s-170 ~]# swapoff -a
[root@k8s-170 ~]# sed -ri 's/.*swap.*/#&/' /etc/fstab
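
To verify, free should now report 0B of swap:

[root@k8s-170 ~]# free -h | grep -i swap
Swap:            0B          0B          0B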

Configure time synchronization on every node

[root@k8s-170 ~]# systemctl restart chronyd.service 
[root@k8s-170 ~]# systemctl enable chronyd.service
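
To verify synchronization, chronyc can list the time sources (the server marked ^* is the one currently synced to; your list depends on your environment):

[root@k8s-170 ~]# chronyc sources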

Load the required kernel modules on every node

[root@k8s-170 ~]# cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

[root@k8s-170 ~]# sudo modprobe overlay
[root@k8s-170 ~]# sudo modprobe br_netfilter
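
A quick check that both modules are present:

[root@k8s-170 ~]# lsmod | grep -e overlay -e br_netfilter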

Load the IPVS modules on every node (optional; kube-proxy defaults to iptables mode)

kube-proxy supports two proxy modes, iptables and ipvs. To use ipvs mode, load the IPVS kernel modules and install the ipset tool on all nodes before initializing the cluster. Note that on Linux kernels 4.19 and later, nf_conntrack replaces nf_conntrack_ipv4.

[root@k8s-170 ~]# cat > /etc/modules-load.d/ipvs.conf <<EOF
# Load IPVS at boot
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
EOF
[root@k8s-170 ~]# systemctl enable --now systemd-modules-load.service

## Confirm the kernel modules loaded successfully
[root@k8s-170 ~]# lsmod | grep -e ip_vs -e nf_conntrack_ipv4
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  0
ip_vs                 172032  6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_defrag_ipv6         20480  2 nf_conntrack_ipv6,ip_vs
nf_conntrack_ipv4      16384  1
nf_defrag_ipv4         16384  1 nf_conntrack_ipv4
nf_conntrack          155648  6 nf_conntrack_ipv6,nf_conntrack_ipv4,nf_nat,nf_nat_ipv6,nf_nat_ipv4,ip_vs
libcrc32c              16384  4 nf_conntrack,nf_nat,xfs,ip_vs

# Install ipset and ipvsadm
[root@k8s-170 ~]# dnf install -y ipset ipvsadm

Set sysctl parameters on every node so that iptables can see bridged traffic; these settings persist across reboots

[root@k8s-170 ~]# cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
vm.swappiness                       = 0
net.bridge.bridge-nf-call-ip6tables = 1
EOF

## Apply the sysctl parameters without rebooting
[root@k8s-170 ~]# sudo sysctl --system
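
Optionally confirm the values took effect:

[root@k8s-170 ~]# sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1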

Deploy Keepalived

Run on both 192.168.0.170 and 192.168.0.171

[root@k8s-170 ~]# dnf -y install keepalived

On both nodes, create a script that checks port 6443, which the API server listens on

[root@k8s-170 ~]# vim /etc/keepalived/check_port.sh

#####################  script content  #####################
#!/bin/bash

# keepalived port-check script
# Usage: reference it from keepalived.conf like this:
#
# vrrp_script check_port {
#     script "/etc/keepalived/check_port.sh 6379"  # port to monitor
#     interval 2                                   # check interval in seconds
# }

CHK_PORT=$1
if [ -n "$CHK_PORT" ];then
    PORT_PROCESS=$(ss -lnt | grep "$CHK_PORT" | wc -l)
    if [ "$PORT_PROCESS" -eq 0 ];then
        echo "Port $CHK_PORT is not in use, exiting."
        exit 1
    fi
else
    echo "Check port cannot be empty!"
    exit 1    # also fail the check on a misconfigured call
fi
#####################  end of script  #####################
## Make the script executable
[root@k8s-170 ~]# chmod +x /etc/keepalived/check_port.sh
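
You can exercise the script by hand before wiring it into keepalived; keepalived only acts on the exit code. Since the API server is not running yet, a non-zero exit for 6443 is expected at this point:

[root@k8s-170 ~]# /etc/keepalived/check_port.sh 6443; echo "exit code: $?"
Port 6443 is not in use, exiting.
exit code: 1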

Configure keepalived.conf on 192.168.0.170 as follows

[root@k8s-170 ~]# vim /etc/keepalived/keepalived.conf
global_defs {
    router_id 192.168.0.170
}

vrrp_script chk_nginx {
    script "/etc/keepalived/check_port.sh 6443"
    interval 2
    weight -20
}

vrrp_instance VI_1  {
    state MASTER
    interface ens33
    virtual_router_id 170
    priority 100
    advert_int 1
    mcast_src_ip 192.168.0.170
# nopreempt selects non-preemptive mode (the default is preemptive mode). Always use nopreempt in production:
# a VIP that keeps moving means the failover mechanism keeps firing, which is a serious production incident.
# If you want the VIP back on the original server after a failover, restart the service during low traffic.
# Note: keepalived only honors nopreempt when the instance's initial state is BACKUP.
    nopreempt

    authentication {
        auth_type PASS
        auth_pass 1111
    }

    track_script {
        chk_nginx
    }

    virtual_ipaddress {
        192.168.0.179
    }
}

Configure keepalived.conf on 192.168.0.171 as follows

[root@k8s-171 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id 192.168.0.171
}

vrrp_script chk_nginx {
    script "/etc/keepalived/check_port.sh 6443"
    interval 2
    weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 170
    priority 90
    advert_int 1
    mcast_src_ip 192.168.0.171
# nopreempt selects non-preemptive mode (the default is preemptive mode). Always use nopreempt in production:
# a VIP that keeps moving means the failover mechanism keeps firing, which is a serious production incident.
# If you want the VIP back on the original server after a failover, restart the service during low traffic.
# Note: keepalived only honors nopreempt when the instance's initial state is BACKUP.
    nopreempt

    authentication {
        auth_type PASS
        auth_pass 1111
    }

    track_script {
        chk_nginx
    }

    virtual_ipaddress {
        192.168.0.179
    }
}

Start keepalived on both servers

[root@k8s-170 ~]# systemctl start keepalived.service
[root@k8s-170 ~]# systemctl enable keepalived.service
## On 170, the VIP 192.168.0.179 is also present
[root@k8s-170 ~]# ip addr show ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:94:2d:be brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.170/24 brd 192.168.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.0.179/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 2409:8a55:611:dc40:5942:a3d0:7aef:99a4/64 scope global dynamic noprefixroute 
       valid_lft 86250sec preferred_lft 86250sec
    inet6 fe80::92c3:94e5:46fc:64f2/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

## On 171, the 179 VIP is absent
[root@k8s-171 ~]# ip addr show ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 00:0c:29:9f:63:79 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.171/24 brd 192.168.0.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 2409:8a55:611:dc40:20c:29ff:fe9f:6379/64 scope global dynamic mngtmpaddr 
       valid_lft 86250sec preferred_lft 86250sec
    inet6 fe80::20c:29ff:fe9f:6379/64 scope link 
       valid_lft forever preferred_lft forever
[root@k8s-171 ~]# 
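
Before moving on, an optional failover sanity check: stopping keepalived on 170 should move the VIP to 171 (and with nopreempt it stays there even after 170 comes back):

[root@k8s-170 ~]# systemctl stop keepalived
[root@k8s-171 ~]# ip addr show ens33 | grep 192.168.0.179
    inet 192.168.0.179/32 scope global ens33
[root@k8s-170 ~]# systemctl start keepalived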

Install the kubelet components

Configure the Kubernetes repository on all nodes (using the Alibaba Cloud mirror)

[root@k8s-170 ~]# vim /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

Install kubelet, kubeadm and kubectl on all nodes (the latest version is installed by default)

## List the available versions first
[root@k8s-170 ~]# dnf list kubeadm --showduplicates
## Install the latest version
[root@k8s-170 ~]# dnf install -y kubelet kubeadm kubectl

## To install a specific version instead, this installs 1.23.5:
[root@k8s-170 ~]# dnf install -y kubelet-1.23.5 kubeadm-1.23.5 kubectl-1.23.5

Generate the root CA certificate

By default, kubeadm generates the cluster certificates itself: the root CA is valid for 10 years and client certificates for 1 year. Here we issue them from a custom root CA instead.

Run on the ops node 192.168.0.100 (www.kevin.com)

[root@www ~]# wget https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl_1.6.1_linux_amd64 -O /usr/bin/cfssl
[root@www ~]# wget https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssljson_1.6.1_linux_amd64 -O /usr/bin/cfssl-json
[root@www ~]# wget https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl-certinfo_1.6.1_linux_amd64 -O /usr/bin/cfssl-certinfo
[root@www ~]# chmod +x /usr/bin/cfssl*
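
A quick check that the binaries were downloaded intact and run:

[root@www ~]# cfssl version
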
[root@www ~]# mkdir /opt/certs
[root@www ~]# cd /opt/certs
[root@www certs]# vim ca-csr.json
##################  ca-csr.json content  ########################
{
    "CN": "Kevin",
    "hosts": [],
    "key":{
        "algo": "rsa",
        "size": 2048
    },
    "names":[
        {
            "C": "CN",
            "ST": "beijing",
            "L": "beijing",
            "O": "Kevin",
            "OU": "www"
        }
    ],
    "ca":{
        "expiry": "175200h"
    }
}
##########################################
#CN: Common Name. Browsers use this field to validate a site's legitimacy, so it is very important; usually the domain name.
#C: Country
#ST: State or province
#L: Locality, city
#O: Organization name, company name
#OU: Organizational unit, department
#expiry: certificate lifetime. This matters: certificates generated by kubeadm expire after 1 year and must then be renewed.
[root@www certs]# cfssl genkey -initca ca-csr.json |cfssl-json -bare ca
[root@www certs]# ll
total 16
-rw-r--r-- 1 root root 1037 Apr 16 21:47 ca.csr
-rw-r--r-- 1 root root  319 Apr 16 21:47 ca-csr.json
-rw------- 1 root root 1679 Apr 16 21:47 ca-key.pem
-rw-r--r-- 1 root root 1294 Apr 16 21:47 ca.pem
[root@www certs]#

Copy the generated certificates to the 192.168.0.170 node

## Run on 170
[root@k8s-170 ~]# mkdir /etc/kubernetes/pki -p
[root@k8s-170 ~]# cd /etc/kubernetes/pki
[root@k8s-170 pki]# rsync -av 192.168.0.100:/opt/certs/{ca,ca-key}.pem .
## Rename the certificates. The next two steps are mandatory: Kubernetes uses these two files as its root CA
[root@k8s-170 pki]# mv ca.pem ca.crt
[root@k8s-170 pki]# mv ca-key.pem ca.key
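
Before initializing the cluster, you can inspect the copied CA with openssl to confirm the subject and the 20-year (175200h) lifetime:

[root@k8s-170 pki]# openssl x509 -in ca.crt -noout -subject -dates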

Configure kubelet to use containerd as the container runtime with cgroupDriver set to systemd (two ways to do this)

Method 1

Configure kubelet for containerd (every node must set the cgroup-driver=systemd parameter, otherwise worker nodes cannot pull images and create pods)

[root@k8s-170 ~]# cat > /etc/sysconfig/kubelet <<EOF
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd
EOF
[root@k8s-170 ~]#

Method 2:

If you would rather not modify /etc/sysconfig/kubelet, kubeadm init must be given a YAML file that carries the cgroupDriver setting. Export the default init configuration with:

[root@k8s-170 ~]# kubeadm config print init-defaults > kubeadm-config.yaml

Then adjust it to your needs: for example, change imageRepository and set the kube-proxy mode to ipvs. Because containerd is the runtime, cgroupDriver must be set to systemd when initializing the node. The modified kubeadm-config.yaml:

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.170 ## this node's IP
  bindPort: 6443    ## API server port
nodeRegistration:
  criSocket: /run/containerd/containerd.sock ## use containerd
  name: k8s-170.kevin.com     ## this node's hostname
  taints:                    ## taints: masters default to NoSchedule; can also be set to null
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki   ## directory where certificates are generated
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd  ## etcd data directory; an external etcd can also be configured
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers ## image registry, set here to the Alibaba Cloud mirror
kind: ClusterConfiguration
kubernetesVersion: v1.23.5
controlPlaneEndpoint: 192.168.0.179:6443  ## the VIP address
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16  ## must match the network plugin: Calico defaults to 192.168.0.0/16, flannel to 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
--- 
apiVersion: kubeproxy.config.k8s.io/v1alpha1 
kind: KubeProxyConfiguration 
mode: ipvs ## route service traffic with IPVS
--- 
apiVersion: kubelet.config.k8s.io/v1beta1 
kind: KubeletConfiguration 
cgroupDriver: systemd ## use systemd
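
With this config in hand, you can optionally pre-pull the control-plane images so the later kubeadm init runs noticeably faster; kubeadm config images pull is a stock kubeadm subcommand:

[root@k8s-170 ~]# kubeadm config images pull --config kubeadm-config.yaml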

Enable kubelet at boot

Run on every node. Note that kubelet will restart in a crash loop until kubeadm init (or join) provides its configuration; this is expected.

[root@k8s-170 ~]# systemctl enable --now kubelet

Install containerd

Configure the containerd repository on all nodes

The containerd.io package is shipped in the docker-ce repository

[root@k8s-170 ~]# dnf install -y yum-utils device-mapper-persistent-data lvm2
[root@k8s-170 ~]# yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

Check the available containerd.io versions

[root@k8s-170 ~]# dnf search containerd.io --showduplicates

Install the latest containerd.io on all nodes

[root@k8s-170 ~]# dnf install -y containerd.io

Generate the default containerd configuration file config.toml on all nodes

[root@k8s-170 ~]# mkdir -p /etc/containerd
[root@k8s-170 ~]# containerd config default | sudo tee /etc/containerd/config.toml

Edit config.toml

## plugins."io.containerd.grpc.v1.cri下的 sandbox_image 修改为 registry.aliyuncs.com/google_containers/pause:3.2
## containerd.runtimes.runc.options 添加 SystemdCgroup = true 
## registry.mirrors."docker.io下的 endpoint 修改为 https://registry.cn-hangzhou.aliyuncs.com
[root@k8s-170 ~]# vim /etc/containerd/config.toml
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.5" ## 修改项
    [plugins."io.containerd.grpc.v1.cri".containerd]
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true   ## add this line
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"] ##添加此行
          endpoint = ["https://registry.cn-hangzhou.aliyuncs.com"]   ## 修改项

Restart containerd

[root@k8s-170 ~]# systemctl daemon-reload
[root@k8s-170 ~]# systemctl enable containerd
[root@k8s-170 ~]# systemctl restart containerd
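
Confirm the cgroup setting landed in containerd's active configuration; the dump should contain SystemdCgroup = true:

[root@k8s-170 ~]# containerd config dump | grep SystemdCgroup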

Install the CRI client tool crictl

Download from: https://github.com/kubernetes-sigs/cri-tools/releases/

[root@k8s-170 ~]# wget https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.23.0/crictl-v1.23.0-linux-amd64.tar.gz
[root@k8s-170 ~]# tar zxvf crictl-v1.23.0-linux-amd64.tar.gz -C /usr/local/bin

[root@k8s-170 ~]# cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF

# Or set the endpoint with this command instead
[root@k8s-170 ~]# crictl config runtime-endpoint unix:///run/containerd/containerd.sock

Verify it works

Verifying on any one node is enough

[root@k8s-170 ~]# crictl pull nginx
[root@k8s-170 ~]# crictl images
[root@k8s-170 ~]# crictl rmi nginx

Restart containerd and kubelet on all nodes

[root@k8s-170 ~]# systemctl daemon-reload
[root@k8s-170 ~]# systemctl restart containerd && systemctl restart kubelet

Finally, confirm the kubelet and containerd versions

Running on any one node is enough

[root@k8s-170 ~]# containerd --version
containerd containerd.io 1.5.11 3df54a852345ae127d1fa3092b95168e4a88e2f8
[root@k8s-170 ~]# kubelet --version
Kubernetes v1.23.5

Bootstrap the cluster on the master node

Run only on the 192.168.0.170 node

[root@k8s-170 ~]# kubeadm init --config=kubeadm-config.yaml --upload-certs
####### After a long stream of output, the master is configured successfully when you see the following ######################
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:
#### command for other nodes to join the cluster as masters
  kubeadm join 192.168.0.179:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346 \
    --control-plane --certificate-key aea69a228acb46bffd9faf54a1d29621cb21d8fb1b6dddb8b34417d94dba9132

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:
#### command for other nodes to join the cluster as workers
kubeadm join 192.168.0.179:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346

### Per the output above, run the following in order
[root@k8s-170 ~]# mkdir -p $HOME/.kube
[root@k8s-170 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-170 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

## Check the nodes: this node has joined the cluster but stays NotReady until a network plugin is installed
[root@k8s-170 ~]# kubectl get nodes
NAME                STATUS     ROLES                  AGE     VERSION
k8s-170.kevin.com   NotReady   control-plane,master   2m28s   v1.23.5
[root@k8s-170 ~]# 

Install the network plugin

## kube-flannel.yml source: https://github.com/flannel-io/flannel/blob/master/Documentation/kube-flannel.yml
[root@k8s-170 ~]# vim kube-flannel.yml
########### file begins #################
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
       #image: flannelcni/flannel-cni-plugin:v1.0.1 for ppc64le and mips64le (dockerhub limitations may apply)
        image: rancher/mirrored-flannelcni-flannel-cni-plugin:v1.0.1
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
       #image: flannelcni/flannel:v0.17.0 for ppc64le and mips64le (dockerhub limitations may apply)
        image: rancher/mirrored-flannelcni-flannel:v0.17.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
       #image: flannelcni/flannel:v0.17.0 for ppc64le and mips64le (dockerhub limitations may apply)
        image: rancher/mirrored-flannelcni-flannel:v0.17.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
########### file ends #################
[root@k8s-170 ~]# kubectl apply -f kube-flannel.yml

## Wait patiently for the network plugin above to start; the pods then look like this
[root@k8s-170 ~]# kubectl get pods -A
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE
kube-system   coredns-54d67798b7-968tp                    1/1     Running   0          8m7s
kube-system   coredns-54d67798b7-s5sh6                    1/1     Running   0          8m7s
kube-system   etcd-k8s-170.kevin.com                      1/1     Running   0          8m22s
kube-system   kube-apiserver-k8s-170.kevin.com            1/1     Running   0          8m22s
kube-system   kube-controller-manager-k8s-170.kevin.com   1/1     Running   0          8m22s
kube-system   kube-flannel-ds-amd64-kmcpz                 1/1     Running   0          2m25s
kube-system   kube-proxy-r6g59                            1/1     Running   0          8m7s
kube-system   kube-scheduler-k8s-170.kevin.com            1/1     Running   0          8m22s
[root@k8s-170 ~]# 

### Check the nodes again: the node is now Ready
[root@k8s-170 ~]# kubectl get nodes
NAME                STATUS   ROLES                  AGE     VERSION
k8s-170.kevin.com   Ready    control-plane,master   7m22s   v1.23.5
[root@k8s-170 ~]#

Join the other master nodes

Join the 192.168.0.171 node to the cluster as a master.
There are two ways to join another node as a master; use only one of them. Method 1 is used here.

Method 1:

Run the join command directly on the 171 node

## Be patient: images must be pulled
[root@k8s-171 ~]# kubeadm join 192.168.0.179:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346 \
    --control-plane --certificate-key aea69a228acb46bffd9faf54a1d29621cb21d8fb1b6dddb8b34417d94dba9132
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

	mkdir -p $HOME/.kube
	sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
	sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

## Once the success message above appears, follow its instructions
[root@k8s-171 ~]# mkdir -p $HOME/.kube
[root@k8s-171 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-171 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

## Then check the cluster nodes from any master (170 or 171)
[root@k8s-170 ~]# kubectl get nodes
NAME                STATUS   ROLES                  AGE     VERSION
k8s-170.kevin.com   Ready    control-plane,master   9m56s   v1.23.5
k8s-171.kevin.com   Ready    control-plane,master   106s    v1.23.5
[root@k8s-170 ~]# 

Method 2:

  • Copy the certificates and config
## On 170, copy the certificates to the 171 node
[root@k8s-170 ~]# rsync -av /etc/kubernetes/pki/* 192.168.0.171:/etc/kubernetes/pki/

## Copy kubeadm-config.yaml to the 171 node
[root@k8s-170 ~]# rsync -av kubeadm-config.yaml 192.168.0.171:/root
  • Modify the config

On the 171 node, edit kubeadm-config.yaml: change advertiseAddress and nodeRegistration.name. The modified content:

[root@k8s-171 ~]# cat kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.171
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: k8s-171.kevin.com
  taints: 
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.23.5
controlPlaneEndpoint: 192.168.0.179:6443
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1 
kind: KubeProxyConfiguration 
mode: ipvs ## route service traffic with IPVS
--- 
apiVersion: kubelet.config.k8s.io/v1beta1 
kind: KubeletConfiguration 
cgroupDriver: systemd ## use systemd
  • Apply the config and join the cluster
### Images must be pulled during this step; please be patient
[root@k8s-171 ~]# kubeadm init --config=kubeadm-config.yaml --upload-certs


### If something goes wrong, run reset first, then re-copy the /etc/kubernetes/pki/ directory from the 170 node; skip this step if there was no error
[root@k8s-171 ~]# kubeadm reset

### Per the output above, run the following in order
[root@k8s-171 ~]# mkdir -p $HOME/.kube
[root@k8s-171 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-171 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

## The cluster nodes now look like this
[root@k8s-171 ~]# kubectl get nodes
NAME                STATUS   ROLES                  AGE     VERSION
k8s-170.kevin.com   Ready    control-plane,master   21m     v1.23.5
k8s-171.kevin.com   Ready    control-plane,master   6m32s   v1.23.5
[root@k8s-171 ~]#

Join the worker nodes

Run on both the 172 and 173 nodes

## Images must be pulled during this step; please be patient
[root@k8s-172 ~]# kubeadm join 192.168.0.179:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346
[preflight] Running pre-flight checks
	[WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.


## The output above means the worker joined successfully. Once the worker's pods are up, check all nodes again from any master:
[root@k8s-170 ~]# kubectl get nodes
NAME                STATUS   ROLES                  AGE    VERSION
k8s-170.kevin.com   Ready    control-plane,master   31m    v1.23.5
k8s-171.kevin.com   Ready    control-plane,master   17m    v1.23.5
k8s-172.kevin.com   Ready    <none>                 119s   v1.23.5
k8s-173.kevin.com   Ready    <none>                 115s   v1.23.5
[root@k8s-170 ~]# kubectl get pods -A -o wide
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE     IP              NODE                NOMINATED NODE   READINESS GATES
kube-system   coredns-54d67798b7-5v9z4                    1/1     Running   0          33m     10.244.0.2      k8s-170.kevin.com   <none>           <none>
kube-system   coredns-54d67798b7-r8dg9                    1/1     Running   0          33m     10.244.0.3      k8s-170.kevin.com   <none>           <none>
kube-system   etcd-k8s-170.kevin.com                      1/1     Running   0          32m     192.168.0.170   k8s-170.kevin.com   <none>           <none>
kube-system   etcd-k8s-171.kevin.com                      1/1     Running   0          18m     192.168.0.171   k8s-171.kevin.com   <none>           <none>
kube-system   kube-apiserver-k8s-170.kevin.com            1/1     Running   0          32m     192.168.0.170   k8s-170.kevin.com   <none>           <none>
kube-system   kube-apiserver-k8s-171.kevin.com            1/1     Running   0          18m     192.168.0.171   k8s-171.kevin.com   <none>           <none>
kube-system   kube-controller-manager-k8s-170.kevin.com   1/1     Running   0          32m     192.168.0.170   k8s-170.kevin.com   <none>           <none>
kube-system   kube-controller-manager-k8s-171.kevin.com   1/1     Running   0          18m     192.168.0.171   k8s-171.kevin.com   <none>           <none>
kube-system   kube-flannel-ds-ksc98                       1/1     Running   0          3m12s   192.168.0.173   k8s-173.kevin.com   <none>           <none>
kube-system   kube-flannel-ds-qzxwv                       1/1     Running   0          29m     192.168.0.170   k8s-170.kevin.com   <none>           <none>
kube-system   kube-flannel-ds-sb27w                       1/1     Running   0          3m16s   192.168.0.172   k8s-172.kevin.com   <none>           <none>
kube-system   kube-flannel-ds-zr7x2                       1/1     Running   0          18m     192.168.0.171   k8s-171.kevin.com   <none>           <none>
kube-system   kube-proxy-k84js                            1/1     Running   0          3m16s   192.168.0.172   k8s-172.kevin.com   <none>           <none>
kube-system   kube-proxy-m4qm5                            1/1     Running   0          18m     192.168.0.171   k8s-171.kevin.com   <none>           <none>
kube-system   kube-proxy-p2xbd                            1/1     Running   0          33m     192.168.0.170   k8s-170.kevin.com   <none>           <none>
kube-system   kube-proxy-rgrzj                            1/1     Running   0          3m12s   192.168.0.173   k8s-173.kevin.com   <none>           <none>
kube-system   kube-scheduler-k8s-170.kevin.com            1/1     Running   0          32m     192.168.0.170   k8s-170.kevin.com   <none>           <none>
kube-system   kube-scheduler-k8s-171.kevin.com            1/1     Running   0          18m     192.168.0.171   k8s-171.kevin.com   <none>           <none>
[root@k8s-170 ~]#

If you forget the node join command, run this on any master:

[root@k8s-170 ~]# kubeadm token create --print-join-command
kubeadm join 192.168.0.179:6443 --token 4s4dig.go8mpg3oso530u1z     --discovery-token-ca-cert-hash sha256:dc93ce29cf6b9112aebed16a050abdf030dabbbff92c42700a66d544083ac346 
[root@k8s-170 ~]#
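
If you only need the --discovery-token-ca-cert-hash value, it can be recomputed from the CA public key with the standard openssl pipeline from the Kubernetes docs:

[root@k8s-170 ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | \
    openssl rsa -pubin -outform der 2>/dev/null | \
    openssl dgst -sha256 -hex | sed 's/^.* //'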

kube-proxy with IPVS enabled

Because the kubeadm config set mode: ipvs, kube-proxy runs in IPVS mode, as its log shows

[root@k8s-170 ~]# kubectl -n kube-system logs kube-proxy-p2xbd | grep ipvs
I0415 12:52:02.610427       1 server_others.go:258] Using ipvs Proxier.

## The IPVS forwarding table looks like this
[root@k8s-170 ~]# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.96.0.1:443 rr
  -> 192.168.0.170:6443           Masq    1      6          0         
TCP  10.96.0.10:53 rr
  -> 10.244.0.2:53                Masq    1      0          0         
  -> 10.244.0.3:53                Masq    1      0          0         
TCP  10.96.0.10:9153 rr
  -> 10.244.0.2:9153              Masq    1      0          0         
  -> 10.244.0.3:9153              Masq    1      0          0         
UDP  10.96.0.10:53 rr
  -> 10.244.0.2:53                Masq    1      0          0         
  -> 10.244.0.3:53                Masq    1      0          0         
[root@k8s-170 ~]# 
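
The active mode can also be read back from the kube-proxy ConfigMap; this should print a mode: ipvs line:

[root@k8s-170 ~]# kubectl -n kube-system get cm kube-proxy -o yaml | grep 'mode:'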

Troubleshooting

coredns reports read udp 10.244.0.3:58854->192.168.1.100:53: i/o timeout

At this point the K8s cluster deployment should be complete. However, the coredns logs kept reporting read udp 10.244.0.3:58854->192.168.1.100:53: i/o timeout, and dig could not resolve names either: the iptables rules had not taken effect. Rebooting all the nodes resolved the problem.

[root@k8s-170 traefik]# kubectl logs -f coredns-54d67798b7-bxs95 -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: read udp 10.244.0.3:58854->192.168.1.100:53: i/o timeout
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: dial udp [fe80::1%ens33]:53: connect: invalid argument
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: read udp 10.244.0.3:50381->192.168.1.100:53: i/o timeout
[ERROR] plugin/errors: 2 2855583235070228061.2607985568958209405. HINFO: dial udp [fe80::1%ens33]:53: connect: invalid argument

[root@k8s-170 ~]# kubectl get svc -A
NAMESPACE     NAME                        TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                   AGE
default       kubernetes                  ClusterIP   10.96.0.1       <none>        443/TCP                   3h6m
kube-system   kube-dns                    ClusterIP   10.96.0.10      <none>        53/UDP,53/TCP,9153/TCP    3h6m
## dig against the cluster DNS also fails to resolve
[root@k8s-170 traefik]# dig -t A www.baidu.com @10.96.0.10 +short
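
After rebooting the nodes, the same query should return an answer (the resolved address will vary):

[root@k8s-170 ~]# dig -t A www.baidu.com @10.96.0.10 +short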