Kubernetes High-Availability Cluster (Binary Deployment, v1.18)

System Environment

Software           Version
Operating system   CentOS 7.8 x64 (minimal)
Docker             19 (CE)
Kubernetes         1.18
etcd               3.4.13
cni-plugins        0.8.6

Node Components

Role            IP                                  Components
k8s-master-01   192.168.2.101                       kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy, docker, etcd
k8s-master-02   192.168.2.102                       kube-apiserver, kube-controller-manager, kube-scheduler, kubelet, kube-proxy, docker, etcd
k8s-node-01     192.168.2.201                       kubelet, kube-proxy, docker, etcd
k8s-node-02     192.168.2.202                       kubelet, kube-proxy, docker, etcd
LB-M (Master)   192.168.2.80, 192.168.2.100 (VIP)   Nginx L4, keepalived
LB-S (Backup)   192.168.2.81, 192.168.2.100 (VIP)   Nginx L4, keepalived

Base Environment Configuration (All Nodes)

hosts configuration

cat >> /etc/hosts << EOF
192.168.2.101 k8s-master-01
192.168.2.102 k8s-master-02
192.168.2.201 k8s-node-01
192.168.2.202 k8s-node-02
EOF
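
Because the block above appends with `cat >>`, running it a second time leaves duplicate entries in /etc/hosts. A minimal sketch of an idempotent alternative follows; the helper name `add_host` and its optional third argument are illustrative and not part of the original procedure.

```shell
# Append "IP hostname" to a hosts file only if the hostname is not listed yet.
# $1 = IP, $2 = hostname, $3 = hosts file (defaults to /etc/hosts)
add_host() (
  file=${3:-/etc/hosts}
  if ! grep -qE "[[:space:]]$2([[:space:]]|\$)" "$file"; then
    printf '%s %s\n' "$1" "$2" >> "$file"
  fi
)
```

For example, `add_host 192.168.2.101 k8s-master-01` can be repeated safely on every node.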

## Setting the hostname

For example, on master-01:

cat > /etc/hostname << EOF
k8s-master-01
EOF

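sysctl settings

## Pass bridged IPv4 traffic to the iptables chains
## (the net.bridge.* keys only exist once the br_netfilter module is loaded)
modprobe br_netfilter
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
## Apply the settings
sysctl --system
## Turn off swap now; to keep it off after a reboot, also comment out the swap entry in /etc/fstab
swapoff -a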
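
`swapoff -a` only lasts until the next reboot. The /etc/fstab change mentioned in the comment can be sketched as below, assuming the swap entry has a whitespace-delimited `swap` field; the function name is illustrative, and a `.bak` copy of the file is kept.

```shell
# Comment out every active swap entry in an fstab-style file.
# $1 = fstab path (defaults to /etc/fstab)
disable_swap_in_fstab() {
  # Skip lines that are already comments; prefix lines with a " swap " field with '#'
  sed -E -i.bak '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "${1:-/etc/fstab}"
}
```

Running it twice is harmless, because lines that are already commented are skipped.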

## Time synchronization
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
ntpdate time.windows.com

## Create the directory layout
mkdir -pv /opt/etcd/{bin,cfg,ssl,logs}
mkdir -pv /opt/k8s/{bin,cfg,ssl,logs,yaml}
mkdir -pv /opt/cni/{bin,cfg,yaml}

## Add the binaries to PATH (depending on the node; usually only the master nodes need this)
echo 'export PATH=$PATH:/opt/k8s/bin/' >> /etc/profile
echo 'export PATH=$PATH:/opt/etcd/bin/' >> /etc/profile
source /etc/profile

## For convenience, set up passwordless SSH from master-01 to the other nodes
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@k8s-master-02
ssh-copy-id -i /root/.ssh/id_rsa.pub root@k8s-node-01
ssh-copy-id -i /root/.ssh/id_rsa.pub root@k8s-node-02

Installing the certificate tools

Run these steps on master-01, then sync the results to the other nodes.

## Download the tools
wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
## Make them executable
chmod +x cfssl*
## Move them into /usr/bin for convenience
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo
mv cfssljson_linux-amd64 /usr/bin/cfssljson
mv cfssl_linux-amd64 /usr/bin/cfssl
## Generate configuration templates
cfssl print-defaults config > config.json
cfssl print-defaults csr > csr.json

Create a working directory for the self-signed certificates (for convenience)

mkdir -pv /data/TLS/{etcd,k8s}

Deploying the etcd Cluster

Member name   IP
etcd-1        192.168.2.101
etcd-2        192.168.2.102
etcd-3        192.168.2.201
etcd-4        192.168.2.202
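
etcd relies on Raft, which needs a majority of members (a quorum) to commit writes. The arithmetic below is a small sketch of what that means for the member count chosen here; the function names are illustrative.

```shell
# Quorum size for an etcd cluster of $1 members: floor(n/2) + 1
etcd_quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that may fail while the cluster stays writable
etcd_fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }
```

With four members the quorum is three, so the cluster tolerates one failure, the same as a three-member cluster. This is why odd member counts are usually preferred.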

Self-signed TLS certificates

Run these steps on master-01, then copy the results to the other nodes.

Self-signed certificate authority (CA)

cd /data/TLS/etcd/
cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "www": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF
cat > ca-csr.json << EOF
{
    "CN": "etcd CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing"
        }
    ]
}
EOF

Generate the certificate

cfssl gencert -initca ca-csr.json | cfssljson -bare ca
ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

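Use the self-signed CA to issue the etcd HTTPS certificate

Create the certificate signing request file (hosts must include the IP of every etcd node; a few spare IPs can be added for future expansion)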

cat > server-csr.json << EOF
{
    "CN": "etcd",
    "hosts": [
    "192.168.2.101",
    "192.168.2.102",
    "192.168.2.201",
    "192.168.2.202"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing"
        }
    ]
}
EOF

Generate the certificate

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=www server-csr.json | cfssljson -bare server

ls
ca-config.json  ca-key.pem  server-csr.json
ca.csr          ca.pem      server-key.pem
ca-csr.json     server.csr  server.pem

Copy the certificates to the target directory

cp /data/TLS/etcd/*.pem /opt/etcd/ssl/
ls /opt/etcd/ssl/
ca-key.pem  ca.pem  server-key.pem  server.pem

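Installing etcd

The steps are the same on master and worker nodes; master-01 is used as the example.

wget https://mirrors.huaweicloud.com/etcd/v3.4.13/etcd-v3.4.13-linux-amd64.tar.gz
tar -zxf etcd-v3.4.13-linux-amd64.tar.gz
mv etcd-v3.4.13-linux-amd64/etcd* /opt/etcd/bin/

Create the etcd configuration file

Master and worker nodes are configured the same way.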

cat > /opt/etcd/cfg/etcd.conf << EOF
#[Member]
ETCD_NAME="etcd-1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.2.101:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.2.101:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.2.101:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.2.101:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.2.101:2380,etcd-2=https://192.168.2.102:2380,etcd-3=https://192.168.2.201:2380,etcd-4=https://192.168.2.202:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
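  • ETCD_NAME: member name, unique within the cluster
  • ETCD_DATA_DIR: data directory
  • ETCD_LISTEN_PEER_URLS: listen address for peer (cluster) traffic
  • ETCD_LISTEN_CLIENT_URLS: listen address for client traffic
  • ETCD_INITIAL_ADVERTISE_PEER_URLS: peer address advertised to the cluster
  • ETCD_ADVERTISE_CLIENT_URLS: client address advertised to clients
  • ETCD_INITIAL_CLUSTER: addresses of all cluster members
  • ETCD_INITIAL_CLUSTER_TOKEN: cluster token
  • ETCD_INITIAL_CLUSTER_STATE: state when joining; new for a new cluster, existing to join an existing cluster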
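
Only the member name and the member's own IP differ between nodes, so the file can be rendered from those two values. The function below is a sketch of that idea; `render_etcd_conf` is an illustrative name, and the member list is the one used in this guide.

```shell
# Print an etcd.conf for one member.
# $1 = member name (e.g. etcd-2), $2 = that member's IP
render_etcd_conf() {
  cat << EOF
#[Member]
ETCD_NAME="$1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://$2:2380"
ETCD_LISTEN_CLIENT_URLS="https://$2:2379"

#[Clustering]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://$2:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://$2:2379"
ETCD_INITIAL_CLUSTER="etcd-1=https://192.168.2.101:2380,etcd-2=https://192.168.2.102:2380,etcd-3=https://192.168.2.201:2380,etcd-4=https://192.168.2.202:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
}
```

On master-02, for instance, `render_etcd_conf etcd-2 192.168.2.102 > /opt/etcd/cfg/etcd.conf` would produce the matching file.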

Create the etcd systemd unit

cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/opt/etcd/cfg/etcd.conf
ExecStart=/opt/etcd/bin/etcd \
--cert-file=/opt/etcd/ssl/server.pem \
--key-file=/opt/etcd/ssl/server-key.pem \
--peer-cert-file=/opt/etcd/ssl/server.pem \
--peer-key-file=/opt/etcd/ssl/server-key.pem \
--trusted-ca-file=/opt/etcd/ssl/ca.pem \
--peer-trusted-ca-file=/opt/etcd/ssl/ca.pem \
--logger=zap
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

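Start etcd

## Reload the systemd unit files
systemctl daemon-reload
## Start etcd
systemctl restart etcd
## Enable start at boot
systemctl enable etcd

## Configuring the remaining etcd nodes

Copy the /opt/etcd directory (together with /usr/lib/systemd/system/etcd.service) to every node, adjust the configuration file on each one (mainly ETCD_NAME and the URLs, set to that node's values), start etcd on all nodes, and check the status.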

/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.2.101:2379,https://192.168.2.102:2379,https://192.168.2.201:2379,https://192.168.2.202:2379" endpoint health
https://192.168.2.201:2379 is healthy: successfully committed proposal: took = 8.805127ms
https://192.168.2.102:2379 is healthy: successfully committed proposal: took = 9.372327ms
https://192.168.2.202:2379 is healthy: successfully committed proposal: took = 9.53701ms
https://192.168.2.101:2379 is healthy: successfully committed proposal: took = 10.441432ms

/opt/etcd/bin/etcdctl --cacert=/opt/etcd/ssl/ca.pem --cert=/opt/etcd/ssl/server.pem --key=/opt/etcd/ssl/server-key.pem --endpoints="https://192.168.2.101:2379,https://192.168.2.102:2379,https://192.168.2.201:2379,https://192.168.2.202:2379" member list
e076e03709a081d, started, etcd-3, https://192.168.2.201:2380, https://192.168.2.201:2379, false
49c96169b3daf66f, started, etcd-4, https://192.168.2.202:2380, https://192.168.2.202:2379, false
ad9328796634e0d0, started, etcd-1, https://192.168.2.101:2380, https://192.168.2.101:2379, false
fddf9c47e41c5ec2, started, etcd-2, https://192.168.2.102:2380, https://192.168.2.102:2379, false

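Installing Docker on the Nodes

Remove any existing Docker package

yum remove docker

Install the dependencies

yum install yum-utils device-mapper-persistent-data lvm2 -y

Add the Docker yum repository

## Aliyun mirror
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
## Alternatively, use the official Docker repository; pick one of the two (the author found the Aliyun mirror faster)
# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Install Docker

yum list docker-ce --showduplicates | sort -r
yum install docker-ce docker-ce-cli containerd.io -y

Configure the Aliyun registry mirror

## /etc/docker may not exist before Docker has been started for the first time
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
"registry-mirrors": ["https://gsm39obv.mirror.aliyuncs.com"]
}
EOF
systemctl restart docker
## Run docker info to confirm the mirror is in effect

You can request your own mirror address from Aliyun; each Aliyun account is assigned a different one.

Start Docker

## Check the Docker version
docker -v
## Start Docker
systemctl start docker
## Enable Docker at boot
systemctl enable docker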

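Deploying the Master Components

To avoid mistakes, deploy a single master first. Once it runs correctly, copy it to the other master nodes, and only after the cluster works with a single master configure master high availability.

Downloading and installing the components

https://github.com/kubernetes/kubernetes

Download the kubernetes-server-linux-amd64.tar.gz package and extract it.

tar -zxf kubernetes-server-linux-amd64.tar.gz
cd kubernetes/server/bin/
mv kubectl kube-proxy kubeadm kubelet kube-controller-manager kube-scheduler kube-apiserver mounter /opt/k8s/bin/

Deploying kube-apiserver

Self-signed TLS certificates

Self-signed certificate authority (CA)
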
cd /data/TLS/k8s/
cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
         "expiry": "87600h",
         "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ]
      }
    }
  }
}
EOF
cat > ca-csr.json << EOF
{
    "CN": "kubernetes",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "Beijing",
            "ST": "Beijing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF

Generate the certificate

cfssl gencert -initca ca-csr.json | cfssljson -bare ca
ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

Use the self-signed CA to issue the kube-apiserver HTTPS certificate

cat > server-csr.json << EOF
{
    "CN": "kubernetes",
    "hosts": [
      "10.0.0.1",
      "127.0.0.1",
      "192.168.2.101",
      "192.168.2.102",
      "192.168.2.201",
      "192.168.2.202",
      "192.168.2.80",
      "192.168.2.81",
      "192.168.2.100",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "L": "BeiJing",
            "ST": "BeiJing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF

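Note: the hosts field above must list the IPs of every master, load balancer, and the VIP, with none missing. To make later expansion easier, a few spare IPs can be added.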
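
Since a missing address only shows up later as a TLS error, it can help to check the request file before signing. The helper below is a rough sketch under the assumption that a plain text match is good enough (it does not parse JSON); `missing_csr_hosts` is an illustrative name.

```shell
# Print every required host that does not appear, quoted, in a CSR JSON file.
# $1 = CSR JSON file; remaining arguments = IPs or DNS names that must be present
missing_csr_hosts() {
  csr=$1
  shift
  for h in "$@"; do
    # Match the value including its quotes so 192.168.2.10 does not match 192.168.2.101
    grep -qF "\"$h\"" "$csr" || echo "$h"
  done
}
```

For example, `missing_csr_hosts server-csr.json 192.168.2.101 192.168.2.102 192.168.2.80 192.168.2.81 192.168.2.100` prints nothing when every address is listed.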

Generate the certificate

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes server-csr.json | cfssljson -bare server
ls
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem  server.csr  server-csr.json  server-key.pem  server.pem

Copy the certificates to the target directory

cp *.pem /opt/k8s/ssl/

Enable the TLS bootstrapping mechanism

Generate the token file

cat > /data/TLS/k8s/token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF

Generate the bootstrap.kubeconfig file

vim /data/TLS/k8s/kuberconfig.sh
KUBE_APISERVER="https://192.168.2.101:6443"
TOKEN="6536f63728fd225f2df9120355685de7"

# Generate the kubelet bootstrap kubeconfig file

# Set the cluster parameters: embed the CA certificate in bootstrap.kubeconfig
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/k8s/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig

# Set the client credentials: the bootstrap token
kubectl config set-credentials "kubelet-bootstrap" \
  --token=${TOKEN} \
  --kubeconfig=bootstrap.kubeconfig

# Set the context parameters
kubectl config set-context default \
  --cluster=kubernetes \
  --user="kubelet-bootstrap" \
  --kubeconfig=bootstrap.kubeconfig

# Set the default context
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
  • KUBE_APISERVER: apiserver address as IP:PORT
  • TOKEN: must match the token in token.csv (the value shown comes from the author's run; use your own)
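
Because TOKEN has to match token.csv exactly, reading it from the file avoids copy-and-paste mistakes. A minimal sketch, assuming the token is the first comma-separated field of the first line as generated above; `token_from_csv` is an illustrative name.

```shell
# Print the token stored in a token.csv file.
# $1 = path to token.csv
token_from_csv() {
  head -n 1 "$1" | cut -d, -f1
}
```

In the script, the hard-coded value could then be replaced with `TOKEN=$(token_from_csv /data/TLS/k8s/token.csv)`.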

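Note: if master high availability is deployed first, set KUBE_APISERVER to the keepalived VIP. If it is deployed later, use the IP of one master for now and update bootstrap.kubeconfig on every node afterwards.

Run the script

sh kuberconfig.sh
ls bootstrap.kubeconfig

Copy token.csv and bootstrap.kubeconfig to the configuration directory (they are distributed to the other nodes later)

cp token.csv bootstrap.kubeconfig /opt/k8s/cfg/

Generating kube-proxy.kubeconfig

# Switch to the TLS/k8s directory
cd /data/TLS/k8s

# Create the certificate signing request file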
cat > kube-proxy-csr.json << EOF
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "L": "BeiJing",
      "ST": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF

# Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

ls kube-proxy*pem
kube-proxy-key.pem  kube-proxy.pem

Generate the kube-proxy.kubeconfig file

vim kuberconfig-proxy.sh
KUBE_APISERVER="https://192.168.2.101:6443"

kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/k8s/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
  --client-certificate=./kube-proxy.pem \
  --client-key=./kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

Note: if master high availability is deployed first, set KUBE_APISERVER to the keepalived VIP. If it is deployed later, use the IP of one master for now and update kube-proxy.kubeconfig on every node afterwards.
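
The later switch to the VIP amounts to rewriting the `server:` line of each kubeconfig. The sketch below shows one way to do that with sed; the function name is illustrative, and the port the load balancer listens on is an assumption that must match your own LB setup.

```shell
# Point a kubeconfig file at a different apiserver URL (a .bak copy is kept).
# $1 = kubeconfig path, $2 = new URL, e.g. https://192.168.2.100:6443 (port assumed)
set_kubeconfig_server() {
  sed -E -i.bak "s#^([[:space:]]*server:[[:space:]]*).*#\\1$2#" "$1"
}
```

It would be run once per file on every node, for both bootstrap.kubeconfig and kube-proxy.kubeconfig, followed by a restart of kubelet and kube-proxy.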

Run the script

sh kuberconfig-proxy.sh
cp kube-proxy.kubeconfig /opt/k8s/cfg/

Create the configuration file

cat > /opt/k8s/cfg/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/k8s/logs \\
--etcd-servers=https://192.168.2.101:2379,https://192.168.2.102:2379,https://192.168.2.201:2379,https://192.168.2.202:2379 \\
--bind-address=192.168.2.101 \\
--secure-port=6443 \\
--advertise-address=192.168.2.101 \\
--allow-privileged=true \\
--service-cluster-ip-range=10.0.0.0/16 \\
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota,NodeRestriction \\
--authorization-mode=RBAC,Node \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=/opt/k8s/cfg/token.csv \\
--service-node-port-range=30000-32767 \\
--kubelet-client-certificate=/opt/k8s/ssl/server.pem \\
--kubelet-client-key=/opt/k8s/ssl/server-key.pem \\
--tls-cert-file=/opt/k8s/ssl/server.pem  \\
--tls-private-key-file=/opt/k8s/ssl/server-key.pem \\
--client-ca-file=/opt/k8s/ssl/ca.pem \\
--service-account-key-file=/opt/k8s/ssl/ca-key.pem \\
--etcd-cafile=/opt/etcd/ssl/ca.pem \\
--etcd-certfile=/opt/etcd/ssl/server.pem \\
--etcd-keyfile=/opt/etcd/ssl/server-key.pem \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/opt/k8s/logs/k8s-audit.log"
EOF
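  • --logtostderr: log to stderr; set to false here so logs are written to files under --log-dir
  • --v: log verbosity level
  • --log-dir: log directory
  • --etcd-servers: etcd cluster addresses
  • --bind-address: listen address
  • --secure-port: HTTPS port
  • --advertise-address: address advertised to the cluster
  • --allow-privileged: allow privileged containers
  • --service-cluster-ip-range: virtual IP range for Services
  • --enable-admission-plugins: admission control plugins
  • --authorization-mode: authorization modes; enables RBAC and Node authorization
  • --enable-bootstrap-token-auth: enable bootstrap token authentication, used by TLS bootstrapping
  • --token-auth-file: bootstrap token file
  • --service-node-port-range: default port range for NodePort Services
  • --kubelet-client-xxx: client certificate and key the apiserver uses to access kubelets
  • --tls-xxx-file: apiserver HTTPS certificate and key
  • --etcd-xxxfile: certificates for connecting to the etcd cluster
  • --audit-log-xxx: audit log settings
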
Create the systemd unit

cat > /usr/lib/systemd/system/kube-apiserver.service << EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/k8s/cfg/kube-apiserver.conf
ExecStart=/opt/k8s/bin/kube-apiserver \$KUBE_APISERVER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

Start kube-apiserver

systemctl daemon-reload
systemctl restart kube-apiserver
systemctl enable kube-apiserver

Authorize the kubelet-bootstrap user to request certificates

kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--user=kubelet-bootstrap

Deploying kube-controller-manager

Create the configuration file

cat > /opt/k8s/cfg/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/k8s/logs \\
--leader-elect=true \\
--master=127.0.0.1:8080 \\
--bind-address=127.0.0.1 \\
--allocate-node-cidrs=true \\
--cluster-cidr=10.244.0.0/16 \\
--service-cluster-ip-range=10.0.0.0/16 \\
--cluster-signing-cert-file=/opt/k8s/ssl/ca.pem \\
--cluster-signing-key-file=/opt/k8s/ssl/ca-key.pem  \\
--root-ca-file=/opt/k8s/ssl/ca.pem \\
--service-account-private-key-file=/opt/k8s/ssl/ca-key.pem \\
--experimental-cluster-signing-duration=87600h0m0s"
EOF
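  • --master: connect to the apiserver over the local insecure port 8080
  • --leader-elect: when several instances of the component run, elect a leader automatically (HA)
  • --cluster-signing-cert-file / --cluster-signing-key-file: the CA that automatically issues kubelet certificates; it must be the same CA the apiserver uses

Create the systemd unit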

cat > /usr/lib/systemd/system/kube-controller-manager.service << EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/k8s/cfg/kube-controller-manager.conf
ExecStart=/opt/k8s/bin/kube-controller-manager \$KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

Start kube-controller-manager

systemctl daemon-reload
systemctl restart kube-controller-manager
systemctl enable kube-controller-manager

Deploying kube-scheduler

Create the configuration file

cat > /opt/k8s/cfg/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/k8s/logs \\
--leader-elect \\
--master=127.0.0.1:8080 \\
--bind-address=127.0.0.1"
EOF
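  • --master: connect to the apiserver over the local insecure port 8080
  • --leader-elect: when several instances of the component run, elect a leader automatically (HA)

Create the systemd unit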

cat > /usr/lib/systemd/system/kube-scheduler.service << EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/opt/k8s/cfg/kube-scheduler.conf
ExecStart=/opt/k8s/bin/kube-scheduler \$KUBE_SCHEDULER_OPTS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

Start kube-scheduler

systemctl daemon-reload
systemctl restart kube-scheduler
systemctl enable kube-scheduler

Check the cluster status (master)

kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-3               Healthy   {"health":"true"}   
etcd-1               Healthy   {"health":"true"}   
etcd-0               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}

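Output like the above means the master components are running normally.

Configuring the remaining master nodes

Copy the kube-apiserver, kube-controller-manager, and kube-scheduler files from master-01 to the other master nodes and start them. Note that at this stage the copies are only brought up; high availability is not configured yet, which keeps any problems easier to troubleshoot.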

cd /opt/k8s/ssl/
scp * root@k8s-master-02:/opt/k8s/ssl/
cd /opt/k8s/bin/
scp kubectl kube-apiserver kube-controller-manager kube-scheduler root@k8s-master-02:/opt/k8s/bin/
cd /opt/k8s/cfg/
scp bootstrap.kubeconfig kube-apiserver.conf kube-controller-manager.conf kube-scheduler.conf token.csv root@k8s-master-02:/opt/k8s/cfg/
cd /usr/lib/systemd/system
scp kube-apiserver.service kube-controller-manager.service kube-scheduler.service root@k8s-master-02:/usr/lib/systemd/system/

Adjust the configuration

Change the IP addresses in bootstrap.kubeconfig and kube-apiserver.conf to those of k8s-master-02, then start kube-apiserver, kube-controller-manager, and kube-scheduler on k8s-master-02.
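
In kube-apiserver.conf only --bind-address and --advertise-address are specific to a master; the etcd endpoints stay the same. A small sed sketch of that edit follows (the function name is illustrative, and a `.bak` copy is kept).

```shell
# Set the bind and advertise addresses in a kube-apiserver.conf.
# $1 = path to kube-apiserver.conf, $2 = IP of the master being configured
set_apiserver_ip() {
  sed -E -i.bak "s/(--(bind|advertise)-address=)[0-9.]+/\\1$2/" "$1"
}
```

On k8s-master-02 this would be `set_apiserver_ip /opt/k8s/cfg/kube-apiserver.conf 192.168.2.102`.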

systemctl daemon-reload
systemctl restart kube-apiserver
systemctl restart kube-controller-manager
systemctl restart kube-scheduler
systemctl enable kube-apiserver
systemctl enable kube-controller-manager
systemctl enable kube-scheduler

## Check the cluster status on master-02
kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok                  
controller-manager   Healthy   ok                  
etcd-1               Healthy   {"health":"true"}   
etcd-0               Healthy   {"health":"true"}   
etcd-3               Healthy   {"health":"true"}   
etcd-2               Healthy   {"health":"true"}

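Output like the above means master-02 is running normally.

Deploying the Node Components

Syncing files

Syncing the binaries

Copy the kubelet and kube-proxy binaries from master-01 to the other nodes.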

cd /opt/k8s/bin/
scp kubelet kube-proxy root@k8s-master-02:/opt/k8s/bin/
scp kubelet kube-proxy root@k8s-node-01:/opt/k8s/bin/
scp kubelet kube-proxy root@k8s-node-02:/opt/k8s/bin/

Syncing the configuration files

Copy bootstrap.kubeconfig and kube-proxy.kubeconfig from master-01 to the matching directory on each node.

cd /opt/k8s/cfg/
scp bootstrap.kubeconfig kube-proxy.kubeconfig root@k8s-master-02:/opt/k8s/cfg/
scp bootstrap.kubeconfig kube-proxy.kubeconfig root@k8s-node-01:/opt/k8s/cfg/
scp bootstrap.kubeconfig kube-proxy.kubeconfig root@k8s-node-02:/opt/k8s/cfg/

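Syncing the certificates

Copy ca.pem from the master to the matching location on each worker node (the masters already have it, so only the worker nodes need it).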

cd /opt/k8s/ssl/
scp ca.pem root@k8s-node-01:/opt/k8s/ssl/
scp ca.pem root@k8s-node-02:/opt/k8s/ssl/
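
The repeated scp lines in this section can be folded into a loop. The sketch below assumes the passwordless SSH set up earlier; `NODES`, `sync_file`, and the `DRY_RUN` switch are illustrative names, not part of the original steps.

```shell
# Hosts to copy to; adjust per step (ca.pem, for example, only goes to the worker nodes)
NODES="k8s-master-02 k8s-node-01 k8s-node-02"

# Copy one local file to the same directory on every host in NODES.
# $1 = local file, $2 = remote directory; with DRY_RUN=1 the commands are only printed
sync_file() {
  for node in $NODES; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "scp $1 root@$node:$2"
    else
      scp "$1" "root@$node:$2"
    fi
  done
}
```

Setting `DRY_RUN=1` first makes it easy to review the target list before anything is copied.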

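Deploying kubelet

Deploy it on every node (masters and workers).

Create the configuration file

Note: change hostname-override to the name of the node being configured.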

cat > /opt/k8s/cfg/kubelet.conf << EOF
KUBELET_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/k8s/logs \\
--hostname-override=k8s-node-01 \\
--network-plugin=cni \\
--kubeconfig=/opt/k8s/cfg/kubelet.kubeconfig \\
--bootstrap-kubeconfig=/opt/k8s/cfg/bootstrap.kubeconfig \\
--config=/opt/k8s/cfg/kubelet-config.yml \\
--cert-dir=/opt/k8s/ssl \\
--pod-infra-container-image=mirrorgooglecontainers/pause-amd64:3.1"
EOF
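  • --hostname-override: node name shown in the cluster; must be unique
  • --network-plugin: enable CNI
  • --kubeconfig: a path that does not exist yet; the file is generated automatically and later used to connect to the apiserver
  • --bootstrap-kubeconfig: used on first start to request a certificate from the apiserver
  • --config: configuration parameters file
  • --cert-dir: directory where kubelet certificates are generated
  • --pod-infra-container-image: image of the infrastructure (pause) container that holds each pod's network

Configuration parameters file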

cat > /opt/k8s/cfg/kubelet-config.yml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
cgroupDriver: cgroupfs
clusterDNS:
- 10.0.0.2
clusterDomain: cluster.local 
failSwapOn: false
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /opt/k8s/ssl/ca.pem 
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF

Create the systemd unit

cat > /usr/lib/systemd/system/kubelet.service << EOF
[Unit]
Description=Kubernetes Kubelet
After=docker.service

[Service]
EnvironmentFile=/opt/k8s/cfg/kubelet.conf
ExecStart=/opt/k8s/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Start kubelet

systemctl daemon-reload
systemctl restart kubelet
systemctl enable kubelet

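Troubleshooting and authorization

If the kubelet-bootstrap user has not yet been authorized to request certificates (the clusterrolebinding created after starting kube-apiserver), kubelet fails at startup with this error: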

kubelet: F1021 21:53:15.722893   12015 server.go:274] failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "kubelet-bootstrap" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope

This happens because kubelet-bootstrap lacks permission to request certificates; the certificate request list on the master is empty as well.

kubectl get csr
No resources found in default namespace.

In that case, run the following on the master to authorize the kubelet-bootstrap user to request certificates.

kubectl create clusterrolebinding kubelet-bootstrap \
--clusterrole=system:node-bootstrapper \
--user=kubelet-bootstrap

Restart kubelet, then check the certificate requests on the master.

kubectl get csr
NAME                                                   AGE   SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-GOt-4QjBYgU9iN0V05ZCOxmK8wfwve50u_n0erxEeCc   20s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Pending

Approve the kubelet certificate request so the node joins the cluster

kubectl certificate approve node-csr-GOt-4QjBYgU9iN0V05ZCOxmK8wfwve50u_n0erxEeCc
kubectl get csr
NAME                                                   AGE     SIGNERNAME                                    REQUESTOR           CONDITION
node-csr-GOt-4QjBYgU9iN0V05ZCOxmK8wfwve50u_n0erxEeCc   2m28s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   Approved,Issued
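
With several nodes joining at once, copying each request name by hand gets tedious. The filter below is a sketch that picks the pending names out of the `kubectl get csr` table shown above; `pending_csrs` is an illustrative name, and it relies on CONDITION being the last column.

```shell
# Read `kubectl get csr` output on stdin and print the names of pending requests.
pending_csrs() {
  # Skip the header row; keep rows whose last column is exactly "Pending"
  awk 'NR > 1 && $NF == "Pending" { print $1 }'
}
```

A possible use is `kubectl get csr | pending_csrs | xargs -r kubectl certificate approve` (`-r` is the GNU xargs option that skips the command when there is no input). Review the list first, since this approves every pending request.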

Note: because the CNI network plugin has not been deployed yet, the node stays NotReady.

Deploying kube-proxy

Create the configuration file

cat > /opt/k8s/cfg/kube-proxy.conf << EOF
KUBE_PROXY_OPTS="--logtostderr=false \\
--v=2 \\
--log-dir=/opt/k8s/logs \\
--config=/opt/k8s/cfg/kube-proxy-config.yml"
EOF

Configuration parameters file

cat > /opt/k8s/cfg/kube-proxy-config.yml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
metricsBindAddress: 0.0.0.0:10249
clientConnection:
  kubeconfig: /opt/k8s/cfg/kube-proxy.kubeconfig
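hostnameOverride: k8s-node-01
clusterCIDR: 10.244.0.0/16
EOF
  1. Change hostnameOverride to the name of the node being configured.
  2. clusterCIDR: kube-proxy uses --cluster-cidr to tell traffic from inside the cluster apart from outside traffic. Only when --cluster-cidr or --masquerade-all is set does kube-proxy apply SNAT to requests for Service IPs. The value should be the pod network range (10.244.0.0/16, the same as --cluster-cidr of kube-controller-manager), not the Service range.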
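
To illustrate the inside/outside decision, the sketch below tests whether an IPv4 address falls within a CIDR range using shell arithmetic. It is not kube-proxy's code, only the underlying membership check; the function names are illustrative.

```shell
# Convert a dotted IPv4 address to an integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed (exit status 0) when address $1 lies inside CIDR range $2.
in_cidr() (
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Network mask with the top $bits bits set (4294967295 is 2^32 - 1)
  mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
)
```

A pod address such as 10.244.1.5 is inside the pod range 10.244.0.0/16 configured for kube-controller-manager, while the Service address 10.0.0.1 is not.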

Create the systemd unit

cat > /usr/lib/systemd/system/kube-proxy.service << EOF
[Unit]
Description=Kubernetes Proxy
After=network.target

[Service]
EnvironmentFile=/opt/k8s/cfg/kube-proxy.conf
ExecStart=/opt/k8s/bin/kube-proxy \$KUBE_PROXY_OPTS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

Start kube-proxy

systemctl daemon-reload
systemctl restart kube-proxy
systemctl enable kube-proxy

PS: when adding a new node, it only needs the following files in place and configured correctly.

tree /opt/k8s/
/opt/k8s/
├── bin
│   ├── kubelet
│   └── kube-proxy
├── cfg
│   ├── bootstrap.kubeconfig
│   ├── kubelet.conf
│   ├── kubelet-config.yml
│   ├── kube-proxy.conf
│   ├── kube-proxy-config.yml
│   └── kube-proxy.kubeconfig
├── logs
└── ssl
    └── ca.pem
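
The file list above can double as a quick pre-flight check on a new node. A minimal sketch, with `missing_node_files` as an illustrative name:

```shell
# Print each required node file that is absent under the given root.
# $1 = root directory (defaults to /opt/k8s)
missing_node_files() {
  root=${1:-/opt/k8s}
  for f in bin/kubelet bin/kube-proxy \
           cfg/bootstrap.kubeconfig cfg/kubelet.conf cfg/kubelet-config.yml \
           cfg/kube-proxy.conf cfg/kube-proxy-config.yml cfg/kube-proxy.kubeconfig \
           ssl/ca.pem; do
    [ -e "$root/$f" ] || echo "$f"
  done
}
```

An empty result means every file from the tree is present. The files' contents, such as the hostname overrides, still need to be checked by hand.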

Deploying the CNI Network

Download and install

Download page

https://github.com/containernetworking/plugins

Download the binary package

wget https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz

mkdir -pv /opt/cni/bin
mkdir -pv /etc/cni/net.d
tar -zxf cni-plugins-linux-amd64-v0.8.6.tgz -C /opt/cni/bin/

## Copy the files to each node
cd /opt/cni/bin/
scp * root@k8s-master-02:/opt/cni/bin/
scp * root@k8s-node-01:/opt/cni/bin/
scp * root@k8s-node-02:/opt/cni/bin/

The flannel YAML manifest can be hard to track down, so it is reproduced here in full.

vim kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
    - configMap
    - secret
    - emptyDir
    - hostPath
  allowedHostPaths:
    - pathPrefix: "/etc/cni/net.d"
    - pathPrefix: "/etc/kube-flannel"
    - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
rules:
  - apiGroups: ['extensions']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['psp.flannel.unprivileged']
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes/status
    verbs:
      - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "cniVersion": "0.2.0",
      "name": "cbr0",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-amd64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - amd64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm64
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm64
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-arm64
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-arm64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-arm
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - arm
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-arm
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-arm
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-ppc64le
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - ppc64le
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-ppc64le
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-ppc64le
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds-s390x
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: beta.kubernetes.io/os
                    operator: In
                    values:
                      - linux
                  - key: beta.kubernetes.io/arch
                    operator: In
                    values:
                      - s390x
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.11.0-s390x
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.11.0-s390x
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
             add: ["NET_ADMIN"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run/flannel
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg

Change the image download address

sed -i -r "s#quay.io/coreos/flannel:.*-amd64#lizhenliang/flannel:v0.12.0-amd64#g" kube-flannel.yml
## Create the flannel network
kubectl apply -f kube-flannel.yml
## Check what is running
kubectl get all -n kube-system
kubectl get all -n kube-system -o wide
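
The sed pattern above only matches tags ending in -amd64, so the other DaemonSets keep their original images. To see which images a manifest will actually pull, the image lines can be listed. A minimal sketch; list_images is a made-up helper name, not part of kubectl or flannel:

```shell
# list_images FILE: print each distinct image referenced in a manifest.
# Hypothetical helper; it only matches lines of the form "image: NAME",
# which is how the flannel manifest above writes them.
list_images() {
    grep -E '^[[:space:]]*image:' "$1" | awk '{print $2}' | sort -u
}

# Usage:
#   list_images kube-flannel.yml
```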

Once the network plugin is deployed, restart kubelet on every node; the nodes then become Ready

kubectl get node
NAME            STATUS   ROLES    AGE   VERSION
k8s-master-01   Ready    <none>   34m   v1.18.10
k8s-master-02   Ready    <none>   33m   v1.18.10
k8s-node-01     Ready    <none>   33m   v1.18.10
k8s-node-02     Ready    <none>   33m   v1.18.10

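Authorize apiserver to access kubelet

Without this authorization, containers cannot be managed through the apiserver (for example, kubectl logs and kubectl exec fail)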

cat > apiserver-to-kubelet-rbac.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
      - pods/log
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

kubectl apply -f apiserver-to-kubelet-rbac.yaml

Deploy Dashboard

Download

wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta8/aio/deploy/recommended.yaml

File contents

# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard

---

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-certs
  namespace: kubernetes-dashboard
type: Opaque

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-csrf
  namespace: kubernetes-dashboard
type: Opaque
data:
  csrf: ""

---

apiVersion: v1
kind: Secret
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-key-holder
  namespace: kubernetes-dashboard
type: Opaque

---

kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard-settings
  namespace: kubernetes-dashboard

---

kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
rules:
  # Allow Dashboard to get, update and delete Dashboard exclusive secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs", "kubernetes-dashboard-csrf"]
    verbs: ["get", "update", "delete"]
    # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map.
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["kubernetes-dashboard-settings"]
    verbs: ["get", "update"]
    # Allow Dashboard to get metrics.
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["heapster", "dashboard-metrics-scraper"]
    verbs: ["proxy"]
  - apiGroups: [""]
    resources: ["services/proxy"]
    resourceNames: ["heapster", "http:heapster:", "https:heapster:", "dashboard-metrics-scraper", "http:dashboard-metrics-scraper"]
    verbs: ["get"]

---

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
rules:
  # Allow Metrics Scraper to get metrics from the Metrics server
  - apiGroups: ["metrics.k8s.io"]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "watch"]

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kubernetes-dashboard

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.0.0-beta8
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --auto-generate-certificates
            - --namespace=kubernetes-dashboard
            # Uncomment the following line to manually specify Kubernetes API server Host
            # If not specified, Dashboard will attempt to auto discover the API server and connect
            # to it. Uncomment only if the default does not work.
            # - --apiserver-host=http://my-address:port
          volumeMounts:
            - name: kubernetes-dashboard-certs
              mountPath: /certs
              # Create on-disk volume to store exec logs
            - mountPath: /tmp
              name: tmp-volume
          livenessProbe:
            httpGet:
              scheme: HTTPS
              path: /
              port: 8443
            initialDelaySeconds: 30
            timeoutSeconds: 30
      volumes:
        - name: kubernetes-dashboard-certs
          secret:
            secretName: kubernetes-dashboard-certs
        - name: tmp-volume
          emptyDir: {}
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

---

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-metrics-scraper
  name: dashboard-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 8000
      targetPort: 8000
  selector:
    k8s-app: kubernetes-metrics-scraper

---

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-metrics-scraper
  name: kubernetes-metrics-scraper
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-metrics-scraper
  template:
    metadata:
      labels:
        k8s-app: kubernetes-metrics-scraper
    spec:
      containers:
        - name: kubernetes-metrics-scraper
          image: kubernetesui/metrics-scraper:v1.0.0
          ports:
            - containerPort: 8000
              protocol: TCP
          livenessProbe:
            httpGet:
              scheme: HTTP
              path: /
              port: 8000
            initialDelaySeconds: 30
            timeoutSeconds: 30
      serviceAccountName: kubernetes-dashboard
      # Comment the following tolerations if Dashboard must not be deployed on master
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule

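Replace the image registry

sed -i 's#kubernetesui#registry.cn-hangzhou.aliyuncs.com\/google_containers#g' recommended.yaml

By default Dashboard is reachable only from inside the cluster. Change the kubernetes-dashboard Service to type NodePort to expose it externally, as follows: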

vi recommended.yaml
kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 30001
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard

Create the dashboard-admin account

cat >> recommended.yaml << EOF
---
# ------------------- dashboard-admin ------------------- #
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-admin
  namespace: kubernetes-dashboard

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-admin
subjects:
- kind: ServiceAccount
  name: dashboard-admin
  namespace: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
EOF

Deploy kubernetes-dashboard

## Deploy
kubectl apply -f recommended.yaml

## Check which nodes the pods were scheduled on
kubectl get all -n kubernetes-dashboard -o wide

## Check the Service
kubectl get -n kubernetes-dashboard svc
NAME                        TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
dashboard-metrics-scraper   ClusterIP   10.0.0.96    <none>        8000/TCP        2m16s
kubernetes-dashboard        NodePort    10.0.0.142   <none>        443:30001/TCP   2m16s

Get the token

kubectl describe secrets -n kubernetes-dashboard dashboard-admin
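
kubectl describe prints the token together with the secret's other fields. If only the token is wanted for pasting into the login form, the output can be filtered. A minimal sketch, assuming the "token:" line layout that kubectl describe secrets prints; extract_token is a hypothetical helper name:

```shell
# extract_token: read `kubectl describe secrets` output on stdin and print
# only the value of the "token:" field. Hypothetical helper for illustration.
extract_token() {
    awk '$1 == "token:" {print $2}'
}

# Usage (on a machine with kubectl access to the cluster):
#   kubectl describe secrets -n kubernetes-dashboard dashboard-admin | extract_token
```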

Open https://IP:30001 (the IP of any node) in a browser and log in to kubernetes-dashboard with the token obtained above

Deploy CoreDNS

Download the YAML manifest

https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/dns/coredns/coredns.yaml.base

Download coredns.yaml.base, edit it, and save it as coredns.yaml

Edit the YAML manifest

Four places need changing:

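Around line 70    kubernetes cluster.local {      --> replace the upper-case placeholder with your own cluster domain, usually cluster.local
Around line 135   image: coredns/coredns:1.7.0    --> the upstream image is hosted outside the firewall and must be replaced with a reachable one, e.g. coredns/coredns:1.3.1
Around line 140   memory: 170Mi                   --> set a limit that suits your environment; I used 170Mi
Around line 200   clusterIP: 10.0.0.2             --> set to the clusterDNS IP configured in kubelet.config

PS: Cross-check your edits (memory, image address, version) against the official template: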
https://github.com/coredns/deployment/blob/master/kubernetes/coredns.yaml.sed
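
Three of these edits are plain placeholder substitutions, so they can be scripted instead of done by hand. A minimal sketch, assuming your copy of coredns.yaml.base names its placeholders __DNS__DOMAIN__, __DNS__MEMORY__LIMIT__ and __DNS__SERVER__ (some versions of the file use a __PILLAR__ prefix instead, so check with grep __ coredns.yaml.base first). The function name render_coredns is made up for illustration, and the image line still has to be changed separately:

```shell
# render_coredns BASE DOMAIN MEMORY DNS_IP: fill the placeholders in a copy
# of coredns.yaml.base and write the result to stdout.
# The placeholder names are an assumption; verify them against your file.
render_coredns() {
    sed -e "s/__DNS__DOMAIN__/$2/g" \
        -e "s/__DNS__MEMORY__LIMIT__/$3/g" \
        -e "s/__DNS__SERVER__/$4/g" "$1"
}

# Usage with the values from this guide:
#   render_coredns coredns.yaml.base cluster.local 170Mi 10.0.0.2 > coredns.yaml
```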

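For example, kubectl create -f coredns.yaml failed for me with:
error: error validating "coredns.yaml": error validating data: ValidationError(Deployment.spec.template.spec.securityContext): unknown field "seccompProfile" in io.k8s.api.core.v1.PodSecurityContext; if you choose to ignore these errors, turn validation off with --validate=false
The error points at the seccompProfile field, which was only added to the Pod API in Kubernetes 1.19, so a v1.18 apiserver does not recognize it. The official template does not contain the field either, so it can simply be removed: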
#      securityContext:
#        seccompProfile:
#          type: RuntimeDefault

Create CoreDNS

kubectl create -f coredns.yaml

kubectl get all -n kube-system -l k8s-app=kube-dns
NAME                              READY   STATUS    RESTARTS   AGE
pod/coredns-85b4878f78-vwpm9      1/1     Running   0          28m

NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
service/kube-dns   ClusterIP   10.0.0.2     <none>        53/UDP,53/TCP,9153/TCP   28m

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/coredns   1/1     1            1           28m

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/coredns-85b4878f78   1         1         1       28m

Verify that CoreDNS works

Create a busybox pod and open a shell inside the container

kubectl run busybox --image=busybox --command -- ping www.baidu.com
kubectl exec pod/busybox -c busybox -it -- sh -il
nslookup kubernetes
Server:         10.0.0.2
Address:        10.0.0.2:53

ping www.baidu.com
PING www.baidu.com (220.181.38.149): 56 data bytes
64 bytes from 220.181.38.149: seq=0 ttl=51 time=20.448 ms
64 bytes from 220.181.38.149: seq=1 ttl=51 time=22.957 ms

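At this point the single-Master k8s cluster is deployed and running normally

Scale out to multi-Master high availability

master-01 and master-02 are both running normally, but the cluster connects to master-01's address by default, so master-02 is not really in use yet. Next, nginx + keepalived provide load balancing and HA in front of the two masters, and the address the cluster uses to reach the masters is switched to the keepalived VIP

Configure NGINX load balancing

Perform the same steps on LB-M and LB-S

Install NGINX

For production, building from source is preferable; here it is simply installed with yum from the EPEL repository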

yum install epel-release -y
yum install nginx -y

Configure NGINX

cat >> /etc/nginx/nginx.conf << "EOF"

# Layer-4 load balancing for the kube-apiserver on the two Masters

stream {

    log_format  main  '$remote_addr $upstream_addr - [$time_local] $status $upstream_bytes_sent';
    access_log  /var/log/nginx/k8s-access.log  main;
    
    upstream k8s-apiserver {
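       # ip_hash is only valid in http upstreams; the stream module uses hash for client-IP affinity
       hash $remote_addr consistent;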
       server 192.168.2.101:6443;   # Master1 APISERVER IP:PORT
       server 192.168.2.102:6443;   # Master2 APISERVER IP:PORT
    }
    
    server {
       listen 6443;
       proxy_pass k8s-apiserver;
    }
}
EOF

Start NGINX

systemctl restart nginx
systemctl enable nginx

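Verify NGINX load balancing

Take master-01 and master-02 off the network one at a time, then query the cluster version through LB-M and LB-S. If both return output like the following each time, nginx load balancing is working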

curl -k https://192.168.2.80:6443/version
curl -k https://192.168.2.81:6443/version
{
  "major": "1",
  "minor": "18",
  "gitVersion": "v1.18.10",
  "gitCommit": "62876fc6d93e891aa7fbe19771e6a6c03773b0f7",
  "gitTreeState": "clean",
  "buildDate": "2020-10-15T01:43:56Z",
  "goVersion": "go1.13.15",
  "compiler": "gc",
  "platform": "linux/amd64"
}

Configure Keepalived high availability

Install Keepalived

## Option 1: install with yum
yum -y install keepalived

## Option 2: build from source
tar -zxvf keepalived-1.2.19.tar.gz
cd keepalived-1.2.19
./configure --prefix=/usr/local/keepalived
make && make install
## Create symlinks
mkdir /etc/keepalived/
ln -s /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
ln -s /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d/
ln -s /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/

Configure Keepalived

LB-M

cat > /etc/keepalived/keepalived.conf << "EOF"
global_defs { 
   notification_email { 
     k8s.localhost.com
   } 
   notification_email_from k8s.localhost.com
   router_id LB-M
} 

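vrrp_script check_http {
    script "</dev/tcp/127.0.0.1/6443"      # Port to check; change as needed (in principle a remote port can be checked too)
    interval 2                             # How often the check script runs, in seconds
    weight -30                             # If the port check fails, priority drops by 30; |weight| must exceed the priority gap between the two servers
}

vrrp_instance VI_1 { 
    state MASTER 
    interface ens33                         # Change to the actual NIC name
    virtual_router_id 51                    # VRRP router ID; unique per VRRP instance, identical on master and backup
    priority 100                            # Priority; the backup server uses 90
    advert_int 1                            # VRRP advertisement (heartbeat) interval; default 1 second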
    authentication { 
        auth_type PASS      
        auth_pass 123456 
    }  
    # VIP
    virtual_ipaddress { 
        192.168.2.100/24
    } 
    track_script {
        check_http
    } 
}
EOF

LB-S

cat > /etc/keepalived/keepalived.conf << EOF
global_defs { 
   notification_email { 
     k8s.localhost.com 
   } 
   notification_email_from k8s.localhost.com  
   router_id LB-S
}

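vrrp_script check_http {
    script "</dev/tcp/127.0.0.1/6443"      # Port to check; change as needed (in principle a remote port can be checked too)
    interval 2                             # How often the check script runs, in seconds
    weight -30                             # If the port check fails, priority drops by 30; |weight| must exceed the priority gap between the two servers
}

vrrp_instance VI_1 { 
    state BACKUP 
    interface ens33                         # Change to the actual NIC name
    virtual_router_id 51                    # VRRP router ID; unique per VRRP instance, identical on master and backup
    priority 90                             # Priority; the backup server uses 90
    advert_int 1                            # VRRP advertisement (heartbeat) interval; default 1 second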
    authentication { 
        auth_type PASS      
        auth_pass 123456 
    }  
    # VIP
    virtual_ipaddress { 
        192.168.2.100/24
    } 
    track_script {
        check_http
    } 
}
EOF
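
The weight rule in the comments can be checked with a little arithmetic: when the check script fails, keepalived adds the (negative) weight to that node's priority, and the node advertising the higher priority holds the VIP. A minimal sketch with the values used above; effective_priority is a hypothetical helper, not a keepalived command:

```shell
# effective_priority BASE WEIGHT CHECK_OK: print the priority a node
# advertises. With a negative weight, a failed check (CHECK_OK=0) lowers it.
effective_priority() {
    if [ "$3" -eq 1 ]; then
        echo "$1"
    else
        echo $(( $1 + $2 ))
    fi
}

master=$(effective_priority 100 -30 0)   # LB-M with its port check failing
backup=$(effective_priority 90 -30 1)    # LB-S healthy
if [ "$master" -lt "$backup" ]; then
    echo "failover: LB-S takes the VIP ($backup > $master)"
else
    echo "no failover: LB-M keeps the VIP"
fi
```

With a weight of -5 instead, LB-M would still advertise 95 and keep the VIP even though its port is down, which is why the absolute value of weight has to exceed the 10-point priority gap.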

Start keepalived

systemctl restart keepalived
systemctl enable keepalived

Verify keepalived

Run on LB-M

ip a
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:d2:68:ff brd ff:ff:ff:ff:ff:ff
    inet 192.168.2.80/24 brd 192.168.2.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.2.100/24 scope global secondary ens33
       valid_lft forever preferred_lft forever

Output like the above means the keepalived configuration has taken effect and the VIP is on the LB-M node

Query the cluster version through the VIP; output like the following means the LB is working

curl -k https://192.168.2.100:6443/version
{
  "major": "1",
  "minor": "18",
  "gitVersion": "v1.18.10",
  "gitCommit": "62876fc6d93e891aa7fbe19771e6a6c03773b0f7",
  "gitTreeState": "clean",
  "buildDate": "2020-10-15T01:43:56Z",
  "goVersion": "go1.13.15",
  "compiler": "gc",
  "platform": "linux/amd64"
}

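Stop nginx on LB-M and check the IP addresses on LB-M and LB-S. If the VIP has moved to LB-S and the cluster version can still be queried through the VIP, HA is working

If the firewall is enabled, VRRP traffic must be allowed as well:
## Allow VRRP (IP protocol 112); the two rules below are equivalent, so one is enough.
## If VRRP is blocked, the backup cannot tell whether the master is healthy and brings up its own services automatically.
iptables -A INPUT -p 112 -j ACCEPT
iptables -A INPUT -p vrrp -j ACCEPT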

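Change the kube-apiserver address

When bootstrap.kubeconfig and kube-proxy.kubeconfig were created earlier, the KUBE_APISERVER variable in the generation script pointed at master-01. Those files, and the kubelet.kubeconfig that kubelet generates automatically on each node, therefore all contain master-01's apiserver address. It must be replaced with the load-balanced, highly available VIP set up above; only then can the cluster keep running when one Master goes down

## Run on all nodes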
sed -i 's#192.168.2.101:6443#192.168.2.100:6443#' /opt/k8s/cfg/bootstrap.kubeconfig
sed -i 's#192.168.2.101:6443#192.168.2.100:6443#' /opt/k8s/cfg/kube-proxy.kubeconfig
sed -i 's#192.168.2.101:6443#192.168.2.100:6443#' /opt/k8s/cfg/kubelet.kubeconfig

## Restart kubelet and kube-proxy
systemctl restart kubelet
systemctl restart kube-proxy
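
To confirm that the substitution took effect, the server entry of each kubeconfig can be printed and compared with the VIP. A minimal sketch; kubeconfig_server is a hypothetical helper that simply prints the value of the server: key:

```shell
# kubeconfig_server FILE: print the API server URL(s) recorded in a kubeconfig.
# Hypothetical helper for illustration.
kubeconfig_server() {
    awk '$1 == "server:" {print $2}' "$1"
}

# Usage on a node; each line printed should be https://192.168.2.100:6443
#   for f in bootstrap kube-proxy kubelet; do
#       kubeconfig_server /opt/k8s/cfg/$f.kubeconfig
#   done
```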

Verify

kubectl get node
NAME            STATUS   ROLES    AGE   VERSION
k8s-master-01   Ready    <none>   8h    v1.18.10
k8s-master-02   Ready    <none>   8h    v1.18.10
k8s-node-01     Ready    <none>   8h    v1.18.10
k8s-node-02     Ready    <none>   8h    v1.18.10

This shows that kubelet and kube-proxy on every node connect through the VIP normally; the highly available k8s cluster is complete
