![07c87c00fe2f37d3d4a908605a9bb1ee.png](https://i-blog.csdnimg.cn/blog_migrate/127e8e3095a4e1393440c98bf36028d2.jpeg)
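Continued from "Kubernetes Container Cluster Management Environment - Complete Deployment (Part 1)".

5. Deploying the etcd cluster

etcd is a Raft-based distributed key-value store developed by CoreOS. It is commonly used for service discovery, shared configuration, and concurrency control (leader election, distributed locks, and so on). Kubernetes stores all of its runtime data in etcd. Because etcd holds that state, a single-node deployment is not recommended. As with ZooKeeper, leader election is decided by majority, so an odd number of members (3, 5, or 7) is generally recommended. The cluster keeps working as long as more than half of its members are alive; once the majority is lost, it may stop serving requests. All deployment commands below are run on the k8s-master01 node, which then distributes files and runs commands remotely.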
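The majority rule can be made concrete with a little arithmetic. The sketch below is only an illustration: `quorum` and `fault_tolerance` are hypothetical helper names, not part of etcd.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating majority arithmetic; not part of etcd.

# quorum N: smallest number of members that forms a majority of N.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# fault_tolerance N: how many members may fail while a majority survives.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5 7; do
  echo "members=${n} quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A four-member cluster tolerates one failure, exactly like a three-member cluster, which is why adding an even member buys no extra resilience.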
Step 1: Download and distribute the etcd binaries

    [root@k8s-master01 ~]# cd /opt/k8s/work
    [root@k8s-master01 work]# wget https://github.com/coreos/etcd/releases/download/v3.3.13/etcd-v3.3.13-linux-amd64.tar.gz
    [root@k8s-master01 work]# tar -xvf etcd-v3.3.13-linux-amd64.tar.gz

Distribute the binaries to every node in the etcd cluster:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        scp etcd-v3.3.13-linux-amd64/etcd* root@${node_etcd_ip}:/opt/k8s/bin
        ssh root@${node_etcd_ip} "chmod +x /opt/k8s/bin/*"
      done
Step 2: Create the etcd certificate and private key

Create the certificate signing request:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cat > etcd-csr.json <<EOF
    {
      "CN": "etcd",
      "hosts": [
        "127.0.0.1",
        "172.16.60.241",
        "172.16.60.242",
        "172.16.60.243"
      ],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

Notes:
The hosts field lists the etcd node IPs or domain names that are authorized to use this certificate; all three etcd node IPs must be included.
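Since a node IP missing from the hosts list only shows up later as a TLS failure, it can be worth checking the CSR file before signing it. The following is a minimal sketch under stated assumptions: `missing_hosts` is a hypothetical helper, and the temporary file is a trimmed stand-in for etcd-csr.json.

```shell
#!/usr/bin/env bash
# Sketch: confirm that every etcd node IP appears in the "hosts" list of a CSR file.
# The temp file below is a trimmed stand-in for etcd-csr.json.
set -eu

csr=$(mktemp)
trap 'rm -f "$csr"' EXIT

cat > "$csr" <<'EOF'
{
  "CN": "etcd",
  "hosts": ["127.0.0.1", "172.16.60.241", "172.16.60.242", "172.16.60.243"]
}
EOF

# missing_hosts FILE IP...: prints each IP that is not present as a quoted string in FILE.
missing_hosts() {
  file=$1
  shift
  for ip in "$@"; do
    grep -qF "\"${ip}\"" "$file" || echo "$ip"
  done
}

missing=$(missing_hosts "$csr" 172.16.60.241 172.16.60.242 172.16.60.243)
if [ -n "$missing" ]; then
  echo "not listed in hosts: $missing" >&2
  exit 1
fi
echo "all etcd node IPs are listed"
```

Matching on the quoted string keeps a prefix such as 172.16.60.24 from being mistaken for 172.16.60.241.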
Generate the certificate and private key:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
    [root@k8s-master01 work]# ls etcd*pem
    etcd-key.pem  etcd.pem

Distribute the generated certificate and private key to each etcd node:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "mkdir -p /etc/etcd/cert"
        scp etcd*.pem root@${node_etcd_ip}:/etc/etcd/cert/
      done
Step 3: Create the etcd systemd unit template file

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# cat > etcd.service.template <<EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos

    [Service]
    Type=notify
    WorkingDirectory=${ETCD_DATA_DIR}
    ExecStart=/opt/k8s/bin/etcd \\
      --data-dir=${ETCD_DATA_DIR} \\
      --wal-dir=${ETCD_WAL_DIR} \\
      --name=##NODE_ETCD_NAME## \\
      --cert-file=/etc/etcd/cert/etcd.pem \\
      --key-file=/etc/etcd/cert/etcd-key.pem \\
      --trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
      --peer-cert-file=/etc/etcd/cert/etcd.pem \\
      --peer-key-file=/etc/etcd/cert/etcd-key.pem \\
      --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
      --peer-client-cert-auth \\
      --client-cert-auth \\
      --listen-peer-urls=https://##NODE_ETCD_IP##:2380 \\
      --initial-advertise-peer-urls=https://##NODE_ETCD_IP##:2380 \\
      --listen-client-urls=https://##NODE_ETCD_IP##:2379,http://127.0.0.1:2379 \\
      --advertise-client-urls=https://##NODE_ETCD_IP##:2379 \\
      --initial-cluster-token=etcd-cluster-0 \\
      --initial-cluster=${ETCD_NODES} \\
      --initial-cluster-state=new \\
      --auto-compaction-mode=periodic \\
      --auto-compaction-retention=1 \\
      --max-request-bytes=33554432 \\
      --quota-backend-bytes=6442450944 \\
      --heartbeat-interval=250 \\
      --election-timeout=2000
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target
    EOF

Notes:
WorkingDirectory, --data-dir: set the working directory and the data directory to ${ETCD_DATA_DIR}; this directory must exist before the service is started;
--wal-dir: the WAL directory; for better performance it is usually placed on an SSD or on a different disk from --data-dir;
--name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
--cert-file, --key-file: the certificate and private key the etcd server uses when talking to clients;
--trusted-ca-file: the CA certificate that signed the client certificates, used to verify them;
--peer-cert-file, --peer-key-file: the certificate and private key etcd uses when talking to its peers;
--peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify them.
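Two flags in the template, --max-request-bytes and --quota-backend-bytes, take raw byte counts that are hard to read at a glance. As a quick sanity check, the sketch below converts the two values to binary units; `to_mib` and `to_gib` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: integer conversion of byte counts to MiB and GiB.
to_mib() { echo $(( $1 / 1024 / 1024 )); }
to_gib() { echo $(( $1 / 1024 / 1024 / 1024 )); }

# Values copied from the unit template above.
echo "--max-request-bytes=33554432 is $(to_mib 33554432) MiB"
echo "--quota-backend-bytes=6442450944 is $(to_gib 6442450944) GiB"
```

So the template allows client requests of up to 32 MiB and sets a backend storage quota of 6 GiB.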
Step 4: Create and distribute the etcd systemd unit file for each etcd node

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for (( i=0; i < 3; i++ ))
      do
        sed -e "s/##NODE_ETCD_NAME##/${NODE_ETCD_NAMES[i]}/" -e "s/##NODE_ETCD_IP##/${NODE_ETCD_IPS[i]}/" etcd.service.template > etcd-${NODE_ETCD_IPS[i]}.service
      done
    [root@k8s-master01 work]# ls *.service
    etcd-172.16.60.241.service  etcd-172.16.60.242.service  etcd-172.16.60.243.service

It is worth opening one of the generated unit files by hand to confirm that the node name and IP were actually substituted:

    [root@k8s-master01 work]# cat etcd-172.16.60.241.service
    .......
      --name=k8s-etcd01 \
    .......
      --listen-peer-urls=https://172.16.60.241:2380 \
      --initial-advertise-peer-urls=https://172.16.60.241:2380 \
      --listen-client-urls=https://172.16.60.241:2379,http://127.0.0.1:2379 \
      --advertise-client-urls=https://172.16.60.241:2379 \
      --initial-cluster-token=etcd-cluster-0 \
      --initial-cluster=k8s-etcd01=https://172.16.60.241:2380,k8s-etcd02=https://172.16.60.242:2380,k8s-etcd03=https://172.16.60.243:2380 \
    .......

Notes:
NODE_ETCD_NAMES and NODE_ETCD_IPS are bash arrays of the same length, holding the etcd node names and the corresponding IPs.

Distribute the generated systemd unit files:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        scp etcd-${node_etcd_ip}.service root@${node_etcd_ip}:/etc/systemd/system/etcd.service
      done

Notes: the file is renamed to etcd.service on the target node.
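The manual inspection of a rendered unit file can also be automated by searching for leftover `##` markers. The sketch below is a self-contained demonstration, not the article's script: it uses a two-line stand-in template in a temporary directory, and it keeps names and IPs as `name=IP` pairs in one list rather than in the two parallel arrays from environment.sh.

```shell
#!/usr/bin/env bash
# Demo of the ##PLACEHOLDER## substitution plus a leftover-placeholder check.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Two-line stand-in for etcd.service.template.
cat > "$workdir/demo.template" <<'EOF'
--name=##NODE_ETCD_NAME##
--listen-peer-urls=https://##NODE_ETCD_IP##:2380
EOF

# Sample name=IP pairs; keeping each name next to its IP means the two cannot drift apart.
NODES="k8s-etcd01=172.16.60.241 k8s-etcd02=172.16.60.242 k8s-etcd03=172.16.60.243"

for pair in $NODES; do
  name=${pair%%=*}
  ip=${pair#*=}
  # The g flag replaces every occurrence on a line, not only the first.
  sed -e "s/##NODE_ETCD_NAME##/${name}/g" \
      -e "s/##NODE_ETCD_IP##/${ip}/g" \
      "$workdir/demo.template" > "$workdir/etcd-${ip}.service"
done

# Any surviving "##" marker means a placeholder was missed.
if grep -l '##' "$workdir"/etcd-*.service; then
  echo "unreplaced placeholders found" >&2
  exit 1
fi
echo "all placeholders replaced"
```

Running the same `grep -l '##'` against the real etcd-*.service files before the scp step would catch a template placeholder that the sed expressions do not cover.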
Step 5: Start the etcd service

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "mkdir -p ${ETCD_DATA_DIR} ${ETCD_WAL_DIR}"
        ssh root@${node_etcd_ip} "systemctl daemon-reload && systemctl enable etcd && systemctl restart etcd " &
      done

Notes:
The etcd data directory (which is also the working directory) and the WAL directory must be created first;
On first start, each etcd process waits for the other members to join the cluster, so the systemctl start (or restart) command may hang for a while. This is normal.
Step 6: Check that the etcd service started

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "systemctl status etcd|grep Active"
      done

Expected output:

    >>> 172.16.60.241
       Active: active (running) since Tue 2019-06-04 19:55:32 CST; 7min ago
    >>> 172.16.60.242
       Active: active (running) since Tue 2019-06-04 19:55:32 CST; 7min ago
    >>> 172.16.60.243
       Active: active (running) since Tue 2019-06-04 19:55:32 CST; 7min ago

Make sure every node reports active (running). If one does not, check the logs to find out why (run "journalctl -u etcd" to see the startup failure).
Step 7: Verify the service status

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_etcd_ip in ${NODE_ETCD_IPS[@]}
      do
        echo ">>> ${node_etcd_ip}"
        ssh root@${node_etcd_ip} "
          ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
            --endpoints=https://${node_etcd_ip}:2379 \
            --cacert=/etc/kubernetes/cert/ca.pem \
            --cert=/etc/etcd/cert/etcd.pem \
            --key=/etc/etcd/cert/etcd-key.pem endpoint health "
      done

Expected output:

    >>> 172.16.60.241
    https://172.16.60.241:2379 is healthy: successfully committed proposal: took = 2.44394ms
    >>> 172.16.60.242
    https://172.16.60.242:2379 is healthy: successfully committed proposal: took = 7.044349ms
    >>> 172.16.60.243
    https://172.16.60.243:2379 is healthy: successfully committed proposal: took = 1.865713ms

When every endpoint reports healthy, the cluster is working normally.
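When this check is run from a script rather than read by eye, the healthy lines can simply be counted. The sketch below assumes the output format shown in the expected output; `count_healthy` is a hypothetical helper, and the sample text is copied from that output rather than produced by a live cluster.

```shell
#!/usr/bin/env bash
# count_healthy: reads `etcdctl endpoint health` output on stdin and prints
# how many lines report a healthy endpoint (hypothetical helper).
count_healthy() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c 'is healthy' || true
}

# Sample lines copied from the expected output of the health check.
sample='https://172.16.60.241:2379 is healthy: successfully committed proposal: took = 2.44394ms
https://172.16.60.242:2379 is healthy: successfully committed proposal: took = 7.044349ms
https://172.16.60.243:2379 is healthy: successfully committed proposal: took = 1.865713ms'

healthy=$(printf '%s\n' "$sample" | count_healthy)
echo "healthy endpoints: ${healthy} of 3"
```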
Step 8: Find the current leader of the etcd cluster

Run the following command on any one of the three etcd nodes:

    [root@k8s-etcd03 ~]# source /opt/k8s/bin/environment.sh
    [root@k8s-etcd03 ~]# ETCDCTL_API=3 /opt/k8s/bin/etcdctl \
      -w table --cacert=/etc/kubernetes/cert/ca.pem \
      --cert=/etc/etcd/cert/etcd.pem \
      --key=/etc/etcd/cert/etcd-key.pem \
      --endpoints=${ETCD_ENDPOINTS} endpoint status

The command prints one table row per endpoint; the IS LEADER column identifies the leader. In the author's run, the leader was 172.16.60.243.
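6. Deploying the Flannel container network

Kubernetes requires that all nodes in the cluster (master and worker nodes alike) can reach each other over the Pod network. flannel uses VXLAN to build a Pod network that connects the nodes, on UDP port 8472 (this port must be opened, for example on public clouds such as AWS). When flanneld starts for the first time, it reads the configured Pod network from etcd, allocates an unused subnet for the local node, and creates the flannel.1 network interface (the name may differ, for example flannel1). flannel writes the Pod subnet assigned to the node into the /run/flannel/docker file. docker later uses the environment variables in that file to configure the docker0 bridge, so that every Pod container on the node gets its IP from this subnet. All deployment commands below are run on the k8s-master01 node, which then distributes files and runs commands remotely.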
Step 1: Download and distribute the flanneld binaries

Download the release package from the flannel release page (https://github.com/coreos/flannel/releases); v0.11.0 is used here:

    [root@k8s-master01 ~]# cd /opt/k8s/work
    [root@k8s-master01 work]# mkdir flannel
    [root@k8s-master01 work]# wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz
    [root@k8s-master01 work]# tar -zvxf flannel-v0.11.0-linux-amd64.tar.gz -C flannel

Distribute the binaries to all nodes in the cluster:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        scp flannel/{flanneld,mk-docker-opts.sh} root@${node_all_ip}:/opt/k8s/bin/
        ssh root@${node_all_ip} "chmod +x /opt/k8s/bin/*"
      done
Step 2: Create the flannel certificate and private key

flanneld reads and writes subnet allocation data in the etcd cluster, and the etcd cluster has mutual x509 certificate authentication enabled, so flanneld needs its own certificate and private key.

Create the certificate signing request:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# cat > flanneld-csr.json <<EOF
    {
      "CN": "flanneld",
      "hosts": [],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

This certificate is only used by flanneld as a client certificate, so the hosts field is empty.

Generate the certificate and private key:

    [root@k8s-master01 work]# cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld

Distribute the generated certificate and private key to all nodes (masters and workers):

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "mkdir -p /etc/flanneld/cert"
        scp flanneld*.pem root@${node_all_ip}:/etc/flanneld/cert
      done
Step 3: Write the cluster Pod network configuration into etcd (note: this step only needs to be run once)

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/opt/k8s/work/ca.pem \
      --cert-file=/opt/k8s/work/flanneld.pem \
      --key-file=/opt/k8s/work/flanneld-key.pem \
      mk ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 21, "Backend": {"Type": "vxlan"}}'

Notes:
The current flanneld version (v0.11.0) does not support etcd v3, so the etcd v2 API is used to write the configuration key and subnet data;
The prefix length of the Pod network ${CLUSTER_CIDR} (for example /16) must be smaller than SubnetLen, and the network must match the --cluster-cidr value of kube-controller-manager.
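The relationship between the cluster prefix length and SubnetLen determines how many nodes the Pod network can serve. The arithmetic can be sketched as follows; `subnet_count` and `addresses_per_subnet` are hypothetical helper names, and the values 16 and 21 are the ones used in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the /16 and 21 values mirror the example in the text.

# subnet_count CLUSTER_PREFIX SUBNET_LEN: number of per-node subnets that fit in the cluster network.
subnet_count() {
  echo $(( 1 << ($2 - $1) ))
}

# addresses_per_subnet SUBNET_LEN: IPv4 addresses inside one per-node subnet.
addresses_per_subnet() {
  echo $(( 1 << (32 - $1) ))
}

echo "node subnets available: $(subnet_count 16 21)"
echo "addresses per node subnet: $(addresses_per_subnet 21)"
```

With a /16 network and SubnetLen 21 there is room for at most 32 per-node subnets of 2048 addresses each, so a cluster expected to grow beyond that needs a wider network or a larger SubnetLen.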
Step 4: Create the flanneld systemd unit file

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# cat > flanneld.service << EOF
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service

    [Service]
    Type=notify
    ExecStart=/opt/k8s/bin/flanneld \\
      -etcd-cafile=/etc/kubernetes/cert/ca.pem \\
      -etcd-certfile=/etc/flanneld/cert/flanneld.pem \\
      -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \\
      -etcd-endpoints=${ETCD_ENDPOINTS} \\
      -etcd-prefix=${FLANNEL_ETCD_PREFIX} \\
      -iface=${IFACE} \\
      -ip-masq
    ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    Restart=always
    RestartSec=5
    StartLimitInterval=0

    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service
    EOF

Notes:
The mk-docker-opts.sh script writes the Pod subnet assigned to flanneld into the /run/flannel/docker file; when docker starts later, it uses the environment variables in this file to configure the docker0 bridge;
By default flanneld talks to other nodes over the interface that carries the system default route; on nodes with several network interfaces (for example an internal and a public one), the -iface option selects the interface to use;
flanneld needs root privileges to run;
-ip-masq: flanneld sets up SNAT rules for traffic leaving the Pod network and sets the --ip-masq variable passed to Docker (in the /run/flannel/docker file) to false, so Docker no longer creates its own SNAT rule. The SNAT rule Docker creates when its --ip-masq is true is rather heavy-handed.
Step 5: Distribute the flanneld systemd unit file to all nodes

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        scp flanneld.service root@${node_all_ip}:/etc/systemd/system/
      done

Step 6: Start the flanneld service

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
      done

Step 7: Check the startup result

Make sure the status is active (running). If it is not, check the logs with "journalctl -u flanneld" to find out why.

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh root@${node_all_ip} "systemctl status flanneld|grep Active"
      done
Step 8: Check the Pod subnets assigned to each flanneld

View the cluster Pod network (/16):

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      get ${FLANNEL_ETCD_PREFIX}/config

Expected output:

    {"Network":"172.30.0.0/16", "SubnetLen": 21, "Backend": {"Type": "vxlan"}}

View the list of allocated Pod subnets (/21):

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      ls ${FLANNEL_ETCD_PREFIX}/subnets

Expected output:

    /kubernetes/network/subnets/172.30.40.0-21
    /kubernetes/network/subnets/172.30.88.0-21
    /kubernetes/network/subnets/172.30.56.0-21
    /kubernetes/network/subnets/172.30.72.0-21
    /kubernetes/network/subnets/172.30.232.0-21
    /kubernetes/network/subnets/172.30.152.0-21

View the node IP and flannel interface address recorded for one Pod subnet:

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.40.0-21

Notes on the returned record:
172.30.40.0/21 is assigned to node k8s-master03 (172.16.60.243);
VtepMAC is the MAC address of the flannel.1 interface on k8s-master03.
Step 9: Check the flannel network information on a node (k8s-master01 in this example)

    [root@k8s-master01 work]# ip addr show
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: ens192: mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:ac:7c:81 brd ff:ff:ff:ff:ff:ff
        inet 172.16.60.241/24 brd 172.16.60.255 scope global ens192
           valid_lft forever preferred_lft forever
    3: flannel.1: mtu 1450 qdisc noqueue state UNKNOWN group default
        link/ether 7a:2a:36:99:75:5f brd ff:ff:ff:ff:ff:ff
        inet 172.30.232.0/32 scope global flannel.1
           valid_lft forever preferred_lft forever

Note: the address of the flannel.1 interface is the first IP (.0) of the assigned Pod subnet, and it is a /32 address.

    [root@k8s-master01 work]# ip route show |grep flannel.1
    172.30.40.0/21 via 172.30.40.0 dev flannel.1 onlink
    172.30.56.0/21 via 172.30.56.0 dev flannel.1 onlink
    172.30.72.0/21 via 172.30.72.0 dev flannel.1 onlink
    172.30.88.0/21 via 172.30.88.0 dev flannel.1 onlink
    172.30.152.0/21 via 172.30.152.0 dev flannel.1 onlink

Requests to Pod subnets on other nodes are all routed to the flannel.1 interface;
flanneld uses the subnet records in etcd, such as ${FLANNEL_ETCD_PREFIX}/subnets/172.30.232.0-21, to decide which node's interconnect IP a request should be sent to.
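To see which of the routes a given Pod IP would match, the IP can be masked down to its /21 network. The sketch below shows that calculation for IPv4; `subnet_of` is a hypothetical helper and the two Pod IPs are made-up examples inside the subnets listed in this article.

```shell
#!/usr/bin/env bash
# subnet_of IP PREFIX_LEN: prints the network address that contains IP (IPv4 only).
# Hypothetical helper for illustration; it does not query flannel or etcd.
subnet_of() {
  len=$2
  # Split the dotted quad into $1..$4.
  set -- $(echo "$1" | tr '.' ' ')
  addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  # Keep the top PREFIX_LEN bits of a 32-bit mask.
  mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
  net=$(( addr & mask ))
  echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
}

# Example Pod IPs (made up) and the /21 subnets they fall into.
echo "172.30.233.17 -> $(subnet_of 172.30.233.17 21)/21"
echo "172.30.45.9   -> $(subnet_of 172.30.45.9 21)/21"
```

The first address falls in 172.30.232.0/21, which is k8s-master01's own subnet, while the second falls in 172.30.40.0/21 and would therefore leave through flannel.1 toward the node that owns that subnet.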
Step 10: Verify that the nodes can reach each other over the Pod network

After flannel is deployed on every node, check that the flannel interface was created (its name may be flannel0, flannel.0, flannel.1, and so on):

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh ${node_all_ip} "/usr/sbin/ip addr show flannel.1|grep -w inet"
      done

Expected output:

    >>> 172.16.60.241
        inet 172.30.232.0/32 scope global flannel.1
    >>> 172.16.60.242
        inet 172.30.152.0/32 scope global flannel.1
    >>> 172.16.60.243
        inet 172.30.40.0/32 scope global flannel.1
    >>> 172.16.60.244
        inet 172.30.88.0/32 scope global flannel.1
    >>> 172.16.60.245
        inet 172.30.56.0/32 scope global flannel.1
    >>> 172.16.60.246
        inet 172.30.72.0/32 scope global flannel.1

From every node, ping all of the flannel interface IPs and make sure they respond:

    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_all_ip in ${NODE_ALL_IPS[@]}
      do
        echo ">>> ${node_all_ip}"
        ssh ${node_all_ip} "ping -c 1 172.30.232.0"
        ssh ${node_all_ip} "ping -c 1 172.30.152.0"
        ssh ${node_all_ip} "ping -c 1 172.30.40.0"
        ssh ${node_all_ip} "ping -c 1 172.30.88.0"
        ssh ${node_all_ip} "ping -c 1 172.30.56.0"
        ssh ${node_all_ip} "ping -c 1 172.30.72.0"
      done
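7. Setting up the nginx layer-4 proxy

nginx's layer-4 transparent proxy feature is used here to give the K8S nodes (masters and workers) highly available access to kube-apiserver. On the control-plane nodes, kube-controller-manager and kube-scheduler are deployed as multiple instances (three each), so they stay available as long as one instance is healthy. An nginx+keepalived pair exposes a single VIP to clients and forwards to the apiserver instances behind it, with nginx doing health checks and load balancing. kubelet, kube-proxy, controller-manager, and scheduler all reach kube-apiserver through the VIP, which is what makes kube-apiserver highly available.

Install and configure nginx. Run the following on both nodes, 172.16.60.247 and 172.16.60.248.

Step 1: Download and compile nginx

    [root@k8s-ha01 ~]# yum -y install gcc pcre-devel zlib-devel openssl-devel wget lsof
    [root@k8s-ha01 ~]# cd /opt/k8s/work
    [root@k8s-ha01 work]# wget http://nginx.org/download/nginx-1.15.3.tar.gz
    [root@k8s-ha01 work]# tar -xzvf nginx-1.15.3.tar.gz
    [root@k8s-ha01 work]# cd nginx-1.15.3
    [root@k8s-ha01 nginx-1.15.3]# mkdir nginx-prefix
    [root@k8s-ha01 nginx-1.15.3]# ./configure --with-stream --without-http --prefix=$(pwd)/nginx-prefix --without-http_uwsgi_module --without-http_scgi_module --without-http_fastcgi_module

Notes:
--with-stream: enables layer-4 transparent forwarding (TCP proxy);
--without-xxx: disables all other features, so that the resulting dynamically linked binary has as few dependencies as possible.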
Expected output (the paths reflect the directory in which the build was run; this sample was captured under /root/tmp):

    Configuration summary
      + PCRE library is not used
      + OpenSSL library is not used
      + zlib library is not used

      nginx path prefix: "/root/tmp/nginx-1.15.3/nginx-prefix"
      nginx binary file: "/root/tmp/nginx-1.15.3/nginx-prefix/sbin/nginx"
      nginx modules path: "/root/tmp/nginx-1.15.3/nginx-prefix/modules"
      nginx configuration prefix: "/root/tmp/nginx-1.15.3/nginx-prefix/conf"
      nginx configuration file: "/root/tmp/nginx-1.15.3/nginx-prefix/conf/nginx.conf"
      nginx pid file: "/root/tmp/nginx-1.15.3/nginx-prefix/logs/nginx.pid"
      nginx error log file: "/root/tmp/nginx-1.15.3/nginx-prefix/logs/error.log"
      nginx http access log file: "/root/tmp/nginx-1.15.3/nginx-prefix/logs/access.log"
      nginx http client request body temporary files: "client_body_temp"
      nginx http proxy temporary files: "proxy_temp"

Then compile and install:

    [root@k8s-ha01 nginx-1.15.3]# make && make install
Step 2: Verify the compiled nginx

    [root@k8s-ha01 nginx-1.15.3]# ./nginx-prefix/sbin/nginx -v
    nginx version: nginx/1.15.3

Check which libraries nginx links against dynamically:

    [root@k8s-ha01 nginx-1.15.3]# ldd ./nginx-prefix/sbin/nginx
            linux-vdso.so.1 =>  (0x00007ffc7e0ef000)
            libdl.so.2 => /lib64/libdl.so.2 (0x00007f00b5c2d000)
            libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f00b5a11000)
            libc.so.6 => /lib64/libc.so.6 (0x00007f00b5644000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f00b5e31000)

Because only layer-4 transparent forwarding is enabled, the binary depends on nothing beyond libc and other core operating system libraries (no libz, libssl, and so on), which makes it easy to deploy on different OS versions.
Step 3: Install and deploy nginx

Create the target directories first, then copy the binary into place:

    [root@k8s-ha01 ~]# mkdir -p /opt/k8s/kube-nginx/{conf,logs,sbin}
    [root@k8s-ha01 ~]# cp /opt/k8s/work/nginx-1.15.3/nginx-prefix/sbin/nginx /opt/k8s/kube-nginx/sbin/kube-nginx
    [root@k8s-ha01 ~]# chmod a+x /opt/k8s/kube-nginx/sbin/*

Configure nginx with layer-4 transparent forwarding enabled:

    [root@k8s-ha01 ~]# vim /opt/k8s/kube-nginx/conf/kube-nginx.conf
    worker_processes 2;

    events {
        worker_connections  65525;
    }

    stream {
        upstream backend {
            hash $remote_addr consistent;
            server 172.16.60.241:6443        max_fails=3 fail_timeout=30s;
            server 172.16.60.242:6443        max_fails=3 fail_timeout=30s;
            server 172.16.60.243:6443        max_fails=3 fail_timeout=30s;
        }

        server {
            listen 8443;
            proxy_connect_timeout 1s;
            proxy_pass backend;
        }
    }

    [root@k8s-ha01 ~]# ulimit -n 65525
    [root@k8s-ha01 ~]# vim /etc/security/limits.conf     # append the following four lines at the end of the file
    * soft nofile 65525
    * hard nofile 65525
    * soft nproc 65525
    * hard nproc 65525
Step 4: Configure the systemd unit file and start the service

    [root@k8s-ha01 ~]# vim /etc/systemd/system/kube-nginx.service
    [Unit]
    Description=kube-apiserver nginx proxy
    After=network.target
    After=network-online.target
    Wants=network-online.target

    [Service]
    Type=forking
    ExecStartPre=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -t
    ExecStart=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx
    ExecReload=/opt/k8s/kube-nginx/sbin/kube-nginx -c /opt/k8s/kube-nginx/conf/kube-nginx.conf -p /opt/k8s/kube-nginx -s reload
    PrivateTmp=true
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    LimitNOFILE=65536

    [Install]
    WantedBy=multi-user.target

    [root@k8s-ha01 ~]# systemctl daemon-reload && systemctl enable kube-nginx && systemctl restart kube-nginx
    [root@k8s-ha01 ~]# lsof -i:8443
    COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    kube-ngin 31980   root    5u  IPv4 145789      0t0  TCP localhost:pcsync-https (LISTEN)
    kube-ngin 31981 nobody    5u  IPv4 145789      0t0  TCP localhost:pcsync-https (LISTEN)
    kube-ngin 31982 nobody    5u  IPv4 145789      0t0  TCP localhost:pcsync-https (LISTEN)

Test connectivity to the proxy port 8443:

    [root@k8s-ha01 ~]# telnet 172.16.60.250 8443
    Trying 172.16.60.250...
    Connected to 172.16.60.250.
    Escape character is '^]'.
    Connection closed by foreign host.

The connection is closed immediately because the three kube-apiserver instances have not been deployed yet, so port 6443 on the three backends is not listening.
Install and configure keepalived

Step 1: Compile and install keepalived (same procedure on both nodes)

    [root@k8s-ha01 ~]# cd /opt/k8s/work/
    [root@k8s-ha01 work]# wget https://www.keepalived.org/software/keepalived-2.0.16.tar.gz
    [root@k8s-ha01 work]# tar -zvxf keepalived-2.0.16.tar.gz
    [root@k8s-ha01 work]# cd keepalived-2.0.16
    [root@k8s-ha01 keepalived-2.0.16]# ./configure
    [root@k8s-ha01 keepalived-2.0.16]# make && make install
    [root@k8s-ha01 keepalived-2.0.16]# cp keepalived/etc/init.d/keepalived /etc/rc.d/init.d/
    [root@k8s-ha01 keepalived-2.0.16]# cp /usr/local/etc/sysconfig/keepalived /etc/sysconfig/
    [root@k8s-ha01 keepalived-2.0.16]# mkdir /etc/keepalived
    [root@k8s-ha01 keepalived-2.0.16]# cp /usr/local/etc/keepalived/keepalived.conf /etc/keepalived/
    [root@k8s-ha01 keepalived-2.0.16]# cp /usr/local/sbin/keepalived /usr/sbin/
    [root@k8s-ha01 keepalived-2.0.16]# echo "/etc/init.d/keepalived start" >> /etc/rc.local

Step 2: Configure keepalived

keepalived configuration on node 172.16.60.247:

    [root@k8s-ha01 ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
    [root@k8s-ha01 ~]# >/etc/keepalived/keepalived.conf
    [root@k8s-ha01 ~]# vim /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived

    global_defs {
        notification_email {
            ops@wangshibo.cn
            tech@wangshibo.cn
        }
        notification_email_from ops@wangshibo.cn
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id master-node
    }

    vrrp_script chk_http_port {
        script "/opt/chk_nginx.sh"
        interval 2
        weight -5
        fall 2
        rise 1
    }

    vrrp_instance VI_1 {
        state MASTER
        interface ens192
        mcast_src_ip 172.16.60.247
        virtual_router_id 51
        priority 101
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            172.16.60.250
        }
        track_script {
            chk_http_port
        }
    }

keepalived configuration on the other node, 172.16.60.248:

    [root@k8s-ha02 ~]# cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
    [root@k8s-ha02 ~]# >/etc/keepalived/keepalived.conf
    [root@k8s-ha02 ~]# vim /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived

    global_defs {
        notification_email {
            ops@wangshibo.cn
            tech@wangshibo.cn
        }
        notification_email_from ops@wangshibo.cn
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id slave-node
    }

    vrrp_script chk_http_port {
        script "/opt/chk_nginx.sh"
        interval 2
        weight -5
        fall 2
        rise 1
    }

    vrrp_instance VI_1 {
        state MASTER
        interface ens192
        mcast_src_ip 172.16.60.248
        virtual_router_id 51
        priority 99
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            172.16.60.250
        }
        track_script {
            chk_http_port
        }
    }
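Step 3: Create the nginx monitoring script on both nodes (this is the script referenced in keepalived.conf)

    [root@k8s-ha01 ~]# vim /opt/chk_nginx.sh
    #!/bin/bash
    # If no kube-nginx process is found, try to start it once.
    counter=$(ps -ef|grep -w kube-nginx|grep -v grep|wc -l)
    if [ "${counter}" = "0" ]; then
        systemctl start kube-nginx
        sleep 2
        # Still not running: stop keepalived so that the VIP moves to the other node.
        counter=$(ps -ef|grep kube-nginx|grep -v grep|wc -l)
        if [ "${counter}" = "0" ]; then
            /etc/init.d/keepalived stop
        fi
    fi

    [root@k8s-ha01 ~]# chmod 755 /opt/chk_nginx.sh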
Step 4: Start the keepalived service on both nodes

    [root@k8s-ha01 ~]# /etc/init.d/keepalived start
    Starting keepalived (via systemctl):                       [  OK  ]
    [root@k8s-ha01 ~]# ps -ef|grep keepalived
    root      5358     1  0 00:32 ?        00:00:00 /usr/local/sbin/keepalived -D
    root      5359  5358  0 00:32 ?        00:00:00 /usr/local/sbin/keepalived -D
    root      5391 29606  0 00:32 pts/0    00:00:00 grep --color=auto keepalived

Check where the VIP is. Initially it sits on the master node:

    [root@k8s-ha01 ~]# ip addr
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: ens192: mtu 1500 qdisc mq state UP group default qlen 1000
        link/ether 00:50:56:ac:3a:a6 brd ff:ff:ff:ff:ff:ff
        inet 172.16.60.247/24 brd 172.16.60.255 scope global ens192
           valid_lft forever preferred_lft forever
        inet 172.16.60.250/32 scope global ens192
           valid_lft forever preferred_lft forever
        inet6 fe80::250:56ff:feac:3aa6/64 scope link
           valid_lft forever preferred_lft forever
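Step 5: Test VIP failover

Reference: https://www.cnblogs.com/kevingrace/p/6138185.html
When the keepalived service on the master node goes down, the VIP automatically moves to the slave node.
When the keepalived service on the master node recovers, it takes the VIP back from the slave node (this is decided by the priority values in the keepalived configuration files).
When nginx goes down on either node, keepalived runs the nginx monitoring script to restart it; if the restart fails, the script stops keepalived, which causes the VIP to move.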
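The configurations also set `weight -5` on the check script. In keepalived, a negative weight is subtracted from a node's priority while its tracked script is failing (that is, exiting non-zero). The sketch below works through that arithmetic with the priorities from this article; `effective_priority` is a hypothetical helper, and note that the monitoring script above stops keepalived directly instead of reporting failure, so this path only applies if the script is changed to exit non-zero.

```shell
#!/usr/bin/env bash
# effective_priority BASE WEIGHT CHECK_FAILED
# Hypothetical helper: models a negative script weight being applied
# to the base priority while the tracked check is failing (CHECK_FAILED=1).
effective_priority() {
  if [ "$3" -eq 1 ] && [ "$2" -lt 0 ]; then
    echo $(( $1 + $2 ))
  else
    echo "$1"
  fi
}

master=$(effective_priority 101 -5 1)   # check failing on 172.16.60.247
backup=$(effective_priority 99 -5 0)    # check passing on 172.16.60.248

echo "master effective priority: ${master}"
echo "backup effective priority: ${backup}"
if [ "$master" -lt "$backup" ]; then
  echo "VIP would move to the backup node"
fi
```

The gap between the two priorities (101 and 99) is smaller than the weight (5), which is what allows a failing check alone to hand the VIP over.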
8. Deploying the master nodes

On the master nodes, kube-apiserver, kube-scheduler, and kube-controller-manager all run as multiple instances. kube-scheduler and kube-controller-manager elect a leader automatically while the other instances stay blocked; when the leader fails, a new one is elected, which keeps the service available. kube-apiserver is stateless and is accessed through the kube-nginx proxy, which keeps it available. All deployment commands below are run on the k8s-master01 node, which then distributes files and runs commands remotely.

Download the binaries (v1.14.2 here):

    [root@k8s-master01 ~]# cd /opt/k8s/work
    [root@k8s-master01 work]# wget https://dl.k8s.io/v1.14.2/kubernetes-server-linux-amd64.tar.gz
    [root@k8s-master01 work]# tar -xzvf kubernetes-server-linux-amd64.tar.gz
    [root@k8s-master01 work]# cd kubernetes
    [root@k8s-master01 kubernetes]# tar -xzvf kubernetes-src.tar.gz

Copy the binaries to all master nodes:

    [root@k8s-master01 ~]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
      do
        echo ">>> ${node_master_ip}"
        scp kubernetes/server/bin/{apiextensions-apiserver,cloud-controller-manager,kube-apiserver,kube-controller-manager,kube-proxy,kube-scheduler,kubeadm,kubectl,kubelet,mounter} root@${node_master_ip}:/opt/k8s/bin/
        ssh root@${node_master_ip} "chmod +x /opt/k8s/bin/*"
      done
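8.1 Deploying a highly available kube-apiserver cluster

This section deploys a three-instance kube-apiserver cluster. The instances are reached through the nginx layer-4 proxy, which exposes a single VIP and keeps the service available. All deployment commands below are run on the k8s-master01 node, which then distributes files and runs commands remotely.

Step 1: Create the kubernetes certificate and private key

Create the certificate signing request:

    [root@k8s-master01 ~]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# cat > kubernetes-csr.json <<EOF
    {
      "CN": "kubernetes",
      "hosts": [
        "127.0.0.1",
        "172.16.60.250",
        "172.16.60.241",
        "172.16.60.242",
        "172.16.60.243",
        "${CLUSTER_KUBERNETES_SVC_IP}",
        "kubernetes",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.cluster",
        "kubernetes.default.svc.cluster.local"
      ],
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "BeiJing",
          "L": "BeiJing",
          "O": "k8s",
          "OU": "4Paradigm"
        }
      ]
    }
    EOF

Notes:
The hosts field lists the IPs and domain names authorized to use this certificate; here it contains the VIP, the apiserver node IPs, and the kubernetes service IP and domain names;
A domain name must not end with a dot (for example, kubernetes.default.svc.cluster.local. is not allowed); otherwise parsing fails with: x509: cannot parse dnsName "kubernetes.default.svc.cluster.local.";
If a domain other than cluster.local is used, such as opsnull.com, the last two names in the list must be changed to kubernetes.default.svc.opsnull and kubernetes.default.svc.opsnull.com;
The kubernetes service IP is created automatically by the apiserver and is normally the first IP of the range given by --service-cluster-ip-range. It can be looked up later with the following command:

    [root@k8s-master01 work]# kubectl get svc kubernetes
    The connection to the server 172.16.60.250:8443 was refused - did you specify the right host or port?

The error above appears because kube-apiserver is not running yet. Once the apiserver has been started, the command returns the service IP.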
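The trailing-dot rule from the notes is easy to violate when the domain names are assembled by a script. A small pre-check can flag such names before the CSR is written; the sketch below is only an illustration, and `trailing_dot_names` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# trailing_dot_names NAME...: prints every name that ends with "." (hypothetical helper).
trailing_dot_names() {
  for name in "$@"; do
    case $name in
      *.) echo "$name" ;;
    esac
  done
}

# The last entry deliberately carries the trailing dot that cfssl rejects.
bad=$(trailing_dot_names \
  kubernetes \
  kubernetes.default.svc.cluster.local \
  kubernetes.default.svc.cluster.local.)
echo "names to fix: ${bad:-none}"
```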
Generate the certificate and private key:

    [root@k8s-master01 work]# cfssl gencert -ca=/opt/k8s/work/ca.pem \
      -ca-key=/opt/k8s/work/ca-key.pem \
      -config=/opt/k8s/work/ca-config.json \
      -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
    [root@k8s-master01 work]# ls kubernetes*pem
    kubernetes-key.pem  kubernetes.pem

Copy the generated certificate and private key to all master nodes:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
      do
        echo ">>> ${node_master_ip}"
        ssh root@${node_master_ip} "mkdir -p /etc/kubernetes/cert"
        scp kubernetes*.pem root@${node_master_ip}:/etc/kubernetes/cert/
      done
Step 2: Create the encryption configuration file

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# cat > encryption-config.yaml <<EOF
    kind: EncryptionConfig
    apiVersion: v1
    resources:
      - resources:
          - secrets
        providers:
          - aescbc:
              keys:
                - name: key1
                  secret: ${ENCRYPTION_KEY}
          - identity: {}
    EOF

Copy the encryption configuration file to the /etc/kubernetes directory on the master nodes:

    [root@k8s-master01 work]# cd /opt/k8s/work
    [root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
    [root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
      do
        echo ">>> ${node_master_ip}"
        scp encryption-config.yaml root@${node_master_ip}:/etc/kubernetes/
      done
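${ENCRYPTION_KEY} comes from environment.sh, which was set up in Part 1 and is not shown in this section. The aescbc provider expects a base64-encoded key, and a 32-byte random key is a common choice. Assuming the key is produced along those lines, the sketch below shows one way to generate such a key and confirm its decoded length. It relies on the `-d` flag of GNU coreutils `base64`.

```shell
#!/usr/bin/env bash
# Sketch: generate a 32-byte random key, base64-encode it, and verify the length.
# Assumes GNU coreutils (head -c, base64 -d); the article's own key comes from environment.sh.
set -eu

# 32 random bytes, base64-encoded.
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)

# Decode again and count bytes to confirm the key length.
decoded_len=$(printf '%s' "$ENCRYPTION_KEY" | base64 -d | wc -c)
decoded_len=$(( decoded_len ))   # normalise any padding whitespace from wc

echo "key decodes to ${decoded_len} bytes"
```

The same decode-and-count check can be applied to an existing ${ENCRYPTION_KEY} before the configuration file is distributed.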
3
创建审计策略文件
[root@k8s-master01 work]# cd /opt/k8s/work[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh[root@k8s-master01 work]# cat > audit-policy.yaml <apiVersion: audit.k8s.io/v1beta1kind: Policyrules: # The following requests were manually identified as high-volume and low-risk, so drop them. - level: None resources: - group: "" resources: - endpoints - services - services/status users: - 'system:kube-proxy' verbs: - watch - level: None resources: - group: "" resources: - nodes - nodes/status userGroups: - 'system:nodes' verbs: - get - level: None namespaces: - kube-system resources: - group: "" resources: - endpoints users: - 'system:kube-controller-manager' - 'system:kube-scheduler' - 'system:serviceaccount:kube-system:endpoint-controller' verbs: - get - update - level: None resources: - group: "" resources: - namespaces - namespaces/status - namespaces/finalize users: - 'system:apiserver' verbs: - get # Don't log HPA fetching metrics. - level: None resources: - group: metrics.k8s.io users: - 'system:kube-controller-manager' verbs: - get - list # Don't log these read-only URLs. - level: None nonResourceURLs: - '/healthz*' - /version - '/swagger*' # Don't log events requests. 
- level: None resources: - group: "" resources: - events # node and pod status calls from nodes are high-volume and can be large, don't log responses for expected updates from nodes - level: Request omitStages: - RequestReceived resources: - group: "" resources: - nodes/status - pods/status users: - kubelet - 'system:node-problem-detector' - 'system:serviceaccount:kube-system:node-problem-detector' verbs: - update - patch - level: Request omitStages: - RequestReceived resources: - group: "" resources: - nodes/status - pods/status userGroups: - 'system:nodes' verbs: - update - patch # deletecollection calls can be large, don't log responses for expected namespace deletions - level: Request omitStages: - RequestReceived users: - 'system:serviceaccount:kube-system:namespace-controller' verbs: - deletecollection # Secrets, ConfigMaps, and TokenReviews can contain sensitive & binary data, # so only log at the Metadata level. - level: Metadata omitStages: - RequestReceived resources: - group: "" resources: - secrets - configmaps - group: authentication.k8s.io resources: - tokenreviews # Get repsonses can be large; skip them. 
- level: Request omitStages: - RequestReceived resources: - group: "" - group: admissionregistration.k8s.io - group: apiextensions.k8s.io - group: apiregistration.k8s.io - group: apps - group: authentication.k8s.io - group: authorization.k8s.io - group: autoscaling - group: batch - group: certificates.k8s.io - group: extensions - group: metrics.k8s.io - group: networking.k8s.io - group: policy - group: rbac.authorization.k8s.io - group: scheduling.k8s.io - group: settings.k8s.io - group: storage.k8s.io verbs: - get - list - watch # Default level for known APIs - level: RequestResponse omitStages: - RequestReceived resources: - group: "" - group: admissionregistration.k8s.io - group: apiextensions.k8s.io - group: apiregistration.k8s.io - group: apps - group: authentication.k8s.io - group: authorization.k8s.io - group: autoscaling - group: batch - group: certificates.k8s.io - group: extensions - group: metrics.k8s.io - group: networking.k8s.io - group: policy - group: rbac.authorization.k8s.io - group: scheduling.k8s.io - group: settings.k8s.io - group: storage.k8s.io # Default level for all other requests. - level: Metadata omitStages: - RequestReceivedEOF
Distribute the audit policy file:
[root@k8s-master01 work]# cd /opt/k8s/work
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
  do
    echo ">>> ${node_master_ip}"
    scp audit-policy.yaml root@${node_master_ip}:/etc/kubernetes/audit-policy.yaml
  done
4
Create the certificate used later to access metrics-server
Create the certificate signing request:
[root@k8s-master01 work]# cat > proxy-client-csr.json <<EOF
{
  "CN": "aggregator",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "4Paradigm"
    }
  ]
}
EOF
The CN is aggregator. It must match the --requestheader-allowed-names setting of metrics-server, otherwise metrics-server rejects the request;
Generate the certificate and private key:
[root@k8s-master01 work]# cfssl gencert -ca=/etc/kubernetes/cert/ca.pem \
  -ca-key=/etc/kubernetes/cert/ca-key.pem \
  -config=/etc/kubernetes/cert/ca-config.json \
  -profile=kubernetes proxy-client-csr.json | cfssljson -bare proxy-client

[root@k8s-master01 work]# ls proxy-client*.pem
proxy-client-key.pem  proxy-client.pem
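Since the CN of this certificate has to line up with --requestheader-allowed-names, it can be worth reading the CN back out of the generated file. The sketch below is an illustration only: extract_cn is a hypothetical helper, and it assumes openssl is installed and is asked for RFC 2253 output so that the subject arrives as one comma-separated line.

```shell
# extract_cn (hypothetical helper): read the output of
#   openssl x509 -noout -subject -nameopt RFC2253
# on stdin and print the value of the CN attribute.
extract_cn() {
  # Drop the "subject=" prefix (some openssl versions add a space after it),
  # then pick the comma-separated attribute that starts with "CN=".
  sed 's/^subject= *//' | awk -F, '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^CN=/) { sub(/^CN=/, "", $i); print $i; exit }
  }'
}

# Usage against the file generated above:
#   openssl x509 -in proxy-client.pem -noout -subject -nameopt RFC2253 | extract_cn
```

Note that the country attribute `C=CN` is not confused with the common name, because the match is anchored on the attribute name `CN=`.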
Copy the generated certificate and private key to all master nodes:
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
  do
    echo ">>> ${node_master_ip}"
    scp proxy-client*.pem root@${node_master_ip}:/etc/kubernetes/cert/
  done
5
Create the kube-apiserver systemd unit template file
[root@k8s-master01 work]# cd /opt/k8s/work
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# cat > kube-apiserver.service.template <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=${K8S_DIR}/kube-apiserver
ExecStart=/opt/k8s/bin/kube-apiserver \\
  --advertise-address=##NODE_MASTER_IP## \\
  --default-not-ready-toleration-seconds=360 \\
  --default-unreachable-toleration-seconds=360 \\
  --feature-gates=DynamicAuditing=true \\
  --max-mutating-requests-inflight=2000 \\
  --max-requests-inflight=4000 \\
  --default-watch-cache-size=200 \\
  --delete-collection-workers=2 \\
  --encryption-provider-config=/etc/kubernetes/encryption-config.yaml \\
  --etcd-cafile=/etc/kubernetes/cert/ca.pem \\
  --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \\
  --etcd-servers=${ETCD_ENDPOINTS} \\
  --bind-address=##NODE_MASTER_IP## \\
  --secure-port=6443 \\
  --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \\
  --insecure-port=0 \\
  --audit-dynamic-configuration \\
  --audit-log-maxage=15 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-mode=batch \\
  --audit-log-truncate-enabled \\
  --audit-log-batch-buffer-size=20000 \\
  --audit-log-batch-max-size=2 \\
  --audit-log-path=${K8S_DIR}/kube-apiserver/audit.log \\
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \\
  --profiling \\
  --anonymous-auth=false \\
  --client-ca-file=/etc/kubernetes/cert/ca.pem \\
  --enable-bootstrap-token-auth \\
  --requestheader-allowed-names="" \\
  --requestheader-client-ca-file=/etc/kubernetes/cert/ca.pem \\
  --requestheader-extra-headers-prefix="X-Remote-Extra-" \\
  --requestheader-group-headers=X-Remote-Group \\
  --requestheader-username-headers=X-Remote-User \\
  --service-account-key-file=/etc/kubernetes/cert/ca.pem \\
  --authorization-mode=Node,RBAC \\
  --runtime-config=api/all=true \\
  --enable-admission-plugins=NodeRestriction \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --event-ttl=168h \\
  --kubelet-certificate-authority=/etc/kubernetes/cert/ca.pem \\
  --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \\
  --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \\
  --kubelet-https=true \\
  --kubelet-timeout=10s \\
  --proxy-client-cert-file=/etc/kubernetes/cert/proxy-client.pem \\
  --proxy-client-key-file=/etc/kubernetes/cert/proxy-client-key.pem \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --service-node-port-range=${NODE_PORT_RANGE} \\
  --logtostderr=true \\
  --enable-aggregator-routing=true \\
  --v=2
Restart=on-failure
RestartSec=10
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF
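Explanation:
advertise-address: the IP that the apiserver advertises to the outside (the backend node IP of the kubernetes service);
default-*-toleration-seconds: thresholds related to node failures;
max-*-requests-inflight: upper limits on in-flight requests;
etcd-*: the certificates used to access etcd and the etcd server addresses;
encryption-provider-config: the configuration used to encrypt secrets stored in etcd;
bind-address: the IP that https listens on; it must not be 127.0.0.1, otherwise the secure port 6443 cannot be reached from outside;
secure-port: the https listening port;
insecure-port=0: disables the insecure http port (8080);
tls-*-file: the certificate, private key, and CA file used by the apiserver;
audit-*: parameters for the audit policy and the audit log file;
client-ca-file: verifies the certificates presented by clients (kube-controller-manager, kube-scheduler, kubelet, kube-proxy, and so on);
enable-bootstrap-token-auth: enables token authentication for kubelet bootstrap;
requestheader-*: parameters for the aggregator layer of kube-apiserver; proxy-client and HPA need them;
requestheader-client-ca-file: the CA that signed the certificate specified by --proxy-client-cert-file and --proxy-client-key-file; used when the metrics aggregator is enabled;
If --requestheader-allowed-names is not empty, the CN of the --proxy-client-cert-file certificate must be listed in allowed-names; the default is aggregator;
service-account-key-file: the public key file used to verify ServiceAccount tokens; the --service-account-private-key-file flag of kube-controller-manager specifies the private key that signs them, and the two are used as a pair;
runtime-config=api/all=true: enables all API versions, such as autoscaling/v2alpha1;
authorization-mode=Node,RBAC and --anonymous-auth=false: enable the Node and RBAC authorization modes and reject unauthorized requests;
enable-admission-plugins: enables some plugins that are disabled by default;
allow-privileged: allows containers to run with privileged permissions;
apiserver-count=3: the number of apiserver instances;
event-ttl: how long events are retained;
kubelet-*: if specified, the kubelet APIs are accessed over https; RBAC rules must be defined for the user of the certificate (the user of the kubernetes*.pem certificate above is kubernetes), otherwise access to the kubelet API is reported as unauthorized;
proxy-client-*: the certificate the apiserver uses to access metrics-server;
service-cluster-ip-range: the Service Cluster IP address range;
service-node-port-range: the NodePort port range;
Note:
If the kube-apiserver machines do not run kube-proxy, the --enable-aggregator-routing=true flag must be added (here the master nodes are not used as worker nodes and therefore do not run kube-proxy, so the flag is required);
The CA certificate specified by requestheader-client-ca-file must have both client auth and server auth enabled!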
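Because the explanation above refers to flags by name, a quick cross-check against the template can catch a flag that is described under a slightly different name than the one actually passed. The following is a rough sketch; list_flags is a hypothetical helper, and it relies on `grep -o`, which GNU, BSD, and BusyBox grep provide.

```shell
# list_flags (hypothetical helper): read a systemd unit file on stdin and print
# the distinct long option names (without the leading "--"), sorted.
list_flags() {
  grep -o -e '--[a-z][a-z0-9-]*' | sed 's/^--//' | sort -u
}

# Usage with the template created above:
#   list_flags < kube-apiserver.service.template
```

The output can then be compared by eye with the names in the list, for example to confirm that the template passes secure-port and encryption-provider-config.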
Create and distribute the kube-apiserver systemd unit file for each node. Substitute the variables in the template to generate one systemd unit file per node:
[root@k8s-master01 work]# cd /opt/k8s/work
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# for (( i=0; i < 3; i++ ))
  do
    sed -e "s/##NODE_MASTER_NAME##/${NODE_MASTER_NAMES[i]}/" -e "s/##NODE_MASTER_IP##/${NODE_MASTER_IPS[i]}/" kube-apiserver.service.template > kube-apiserver-${NODE_MASTER_IPS[i]}.service
  done
Here NODE_MASTER_NAMES and NODE_MASTER_IPS are bash arrays of the same length, holding the node names and the corresponding IPs;
[root@k8s-master01 work]# ll kube-apiserver*.service
-rw-r--r-- 1 root root 2718 Jun 18 10:38 kube-apiserver-172.16.60.241.service
-rw-r--r-- 1 root root 2718 Jun 18 10:38 kube-apiserver-172.16.60.242.service
-rw-r--r-- 1 root root 2718 Jun 18 10:38 kube-apiserver-172.16.60.243.service
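The substitution step can be sketched as two small functions, mainly to show how a leftover `##...##` marker could be detected before a unit file is shipped. Both function names are hypothetical; unlike the loop above, the sed expression here uses the g flag so that several placeholders on one line would also be replaced.

```shell
# render_unit IP (hypothetical helper): read a unit template on stdin and
# replace every ##NODE_MASTER_IP## placeholder with the given IP.
render_unit() {
  sed -e "s/##NODE_MASTER_IP##/$1/g"
}

# leftover_placeholders (hypothetical helper): print any ##NAME## markers that
# are still present on stdin; succeeds only when none are found.
leftover_placeholders() {
  ! grep -o '##[A-Z_]*##'
}

# Usage:
#   render_unit 172.16.60.241 < kube-apiserver.service.template > unit.tmp
#   leftover_placeholders < unit.tmp && echo "all placeholders replaced"
```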
Distribute the generated systemd unit files, renaming each one to kube-apiserver.service:
[root@k8s-master01 work]# cd /opt/k8s/work
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
  do
    echo ">>> ${node_master_ip}"
    scp kube-apiserver-${node_master_ip}.service root@${node_master_ip}:/etc/systemd/system/kube-apiserver.service
  done
6
Start the kube-apiserver service
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
  do
    echo ">>> ${node_master_ip}"
    ssh root@${node_master_ip} "mkdir -p ${K8S_DIR}/kube-apiserver"
    ssh root@${node_master_ip} "systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver"
  done
Note: the working directory must be created before the service is started;
Check that kube-apiserver is running
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# for node_master_ip in ${NODE_MASTER_IPS[@]}
  do
    echo ">>> ${node_master_ip}"
    ssh root@${node_master_ip} "systemctl status kube-apiserver |grep 'Active:'"
  done
Expected output:
>>> 172.16.60.241
   Active: active (running) since Tue 2019-06-18 10:42:42 CST; 1min 6s ago
>>> 172.16.60.242
   Active: active (running) since Tue 2019-06-18 10:42:47 CST; 1min 2s ago
>>> 172.16.60.243
   Active: active (running) since Tue 2019-06-18 10:42:51 CST; 58s ago
Make sure the state is active (running); otherwise inspect the logs to find the cause (journalctl -u kube-apiserver)
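With more than a handful of masters, scanning this output by eye gets tedious. As an illustration, the sketch below filters the loop's output down to the nodes that are not running; failed_nodes is a hypothetical helper and assumes the `>>> <ip>` / `Active: ...` layout shown in the expected output.

```shell
# failed_nodes (hypothetical helper): read ">>> <ip>" lines followed by
# "Active: ..." lines on stdin and print the IP of every node whose unit is
# not reported as "active (running)".
failed_nodes() {
  awk '
    $1 == ">>>" { node = $2; next }
    $1 == "Active:" && $0 !~ /active \(running\)/ { print node }
  '
}

# Usage: pipe the output of the status loop above into the function, e.g.
#   ( for ip in ${NODE_MASTER_IPS[@]}; do echo ">>> ${ip}"; \
#       ssh root@${ip} "systemctl status kube-apiserver | grep 'Active:'"; done ) | failed_nodes
```

An empty result means every node reported active (running).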
7
Print the data that kube-apiserver has written to etcd
[root@k8s-master01 work]# source /opt/k8s/bin/environment.sh
[root@k8s-master01 work]# ETCDCTL_API=3 etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --cacert=/opt/k8s/work/ca.pem \
  --cert=/opt/k8s/work/etcd.pem \
  --key=/opt/k8s/work/etcd-key.pem \
  get /registry/ --prefix --keys-only
This is expected to print a large number of keys that have been written to etcd
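Since the key listing is long, a per-resource count is often easier to read. The sketch below groups keys by the path segment that follows /registry/; count_registry_keys is a hypothetical helper, and it assumes keys of the form /registry/<resource>/..., with the blank lines that --keys-only prints between keys simply ignored.

```shell
# count_registry_keys (hypothetical helper): read etcd keys on stdin and print
# "<count> <resource>" for each first-level segment under /registry/.
count_registry_keys() {
  awk -F/ '$2 == "registry" && NF >= 3 { count[$3]++ }
           END { for (k in count) print count[k], k }' | sort -k2
}

# Usage: append "| count_registry_keys" to the etcdctl command above.
```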
8
Check the cluster information
[root@k8s-master01 work]# kubectl cluster-info
Kubernetes master is running at https://172.16.60.250:8443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

[root@k8s-master01 work]# kubectl get all --all-namespaces
NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     service/kubernetes   ClusterIP   10.254.0.1   <none>        443/TCP   8m25s
Check the cluster component status
[root@k8s-master01 work]# kubectl get componentstatuses          # or run "kubectl get cs"
NAME                 STATUS      MESSAGE                                                                                     ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}
etcd-2               Healthy     {"health":"true"}
etcd-1               Healthy     {"health":"true"}
controller-manager and scheduler are Unhealthy because these two components have not been deployed yet; check again after they are deployed.
Note:
If a kubectl command prints the following error, the ~/.kube/config file in use is wrong; switch to the correct account and run the command again:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
When kubectl get componentstatuses runs, the apiserver sends its health requests to 127.0.0.1 by default. When controller-manager and scheduler run in cluster mode, they may not be on the same machine as kube-apiserver; in that case they are reported as Unhealthy even though they are working normally.
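For scripted checks it can help to reduce the componentstatuses table to just the names that are not Healthy, keeping in mind the caveat above that Unhealthy does not always mean broken. A minimal sketch, with unhealthy_components as a hypothetical helper that assumes the column layout shown above:

```shell
# unhealthy_components (hypothetical helper): read "kubectl get cs" output on
# stdin, skip the header row, and print the NAME of every row whose STATUS
# column is not "Healthy".
unhealthy_components() {
  awk 'NR > 1 && NF >= 2 && $2 != "Healthy" { print $1 }'
}

# Usage:
#   kubectl get componentstatuses | unhealthy_components
```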
9
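Check the ports kube-apiserver listens on
[root@k8s-master01 work]# netstat -lnpt|grep kube
tcp        0      0 172.16.60.241:6443      0.0.0.0:*               LISTEN      15516/kube-apiserve
Note:
6443: the secure port that receives https requests; every request is authenticated and authorized;
Because the insecure port is disabled, nothing listens on 8080;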
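The two expectations above (6443 is listening, 8080 is not) can be turned into a small check. This is only a sketch: both function names are hypothetical, and the parsing assumes the `netstat -lnpt` column layout shown above, with the local address in the fourth column and the state in the sixth.

```shell
# listening_ports (hypothetical helper): read "netstat -lnpt" lines on stdin
# and print the local port of every socket in LISTEN state.
listening_ports() {
  awk '$6 == "LISTEN" { n = split($4, a, ":"); print a[n] }'
}

# apiserver_ports_ok (hypothetical helper): succeed only if 6443 is among the
# listening ports and 8080 is not.
apiserver_ports_ok() {
  ports=$(listening_ports)
  printf '%s\n' "$ports" | grep -qx 6443 && ! printf '%s\n' "$ports" | grep -qx 8080
}

# Usage:
#   netstat -lnpt | grep kube | apiserver_ports_ok && echo "ports OK"
```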
10
Grant kube-apiserver access to the kubelet API
When commands such as kubectl exec, run, and logs are executed, the apiserver forwards the request to the https port of the kubelet.
Define an RBAC rule here that grants the user of the certificate used by the apiserver (kubernetes.pem, CN: kubernetes) access to the kubelet API:
[root@k8s-master01 work]# kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes
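For readers who keep RBAC objects in version control, the binding created above can also be written declaratively. The sketch below prints a manifest that should be equivalent to the imperative command; the function name kubelet_api_binding is hypothetical, and the binding name, role, and user are taken from that command.

```shell
# kubelet_api_binding (hypothetical helper): print a ClusterRoleBinding
# manifest that binds the user "kubernetes" to system:kubelet-api-admin.
kubelet_api_binding() {
  cat <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-apiserver:kubelet-apis
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kubelet-api-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: kubernetes
EOF
}

# Usage:
#   kubelet_api_binding | kubectl apply -f -
```

The subject name has to equal the CN of kubernetes.pem, because that CN is the user name the kubelet sees when the apiserver connects.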
11
View the metrics exported by kube-apiserver
The root certificate is required
Fetch the metrics through the nginx proxy port
[root@k8s-master01 work]# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://172.16.60.250:8443/metrics|head
# HELP APIServiceOpenAPIAggregationControllerQueue1_adds (Deprecated) Total number of adds handled by workqueue: APIServiceOpenAPIAggregationControllerQueue1
# TYPE APIServiceOpenAPIAggregationControllerQueue1_adds counter
APIServiceOpenAPIAggregationControllerQueue1_adds 12194
# HELP APIServiceOpenAPIAggregationControllerQueue1_depth (Deprecated) Current depth of workqueue: APIServiceOpenAPIAggregationControllerQueue1
# TYPE APIServiceOpenAPIAggregationControllerQueue1_depth gauge
APIServiceOpenAPIAggregationControllerQueue1_depth 0
# HELP APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds (Deprecated) How many microseconds has the longest running processor for APIServiceOpenAPIAggregationControllerQueue1 been running.
# TYPE APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds gauge
APIServiceOpenAPIAggregationControllerQueue1_longest_running_processor_microseconds 0
# HELP APIServiceOpenAPIAggregationControllerQueue1_queue_latency (Deprecated) How long an item stays in workqueueAPIServiceOpenAPIAggregationControllerQueue1 before being requested.
Fetch the metrics directly from the kube-apiserver node ports
[root@k8s-master01 work]# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://172.16.60.241:6443/metrics|head
[root@k8s-master01 work]# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://172.16.60.242:6443/metrics|head
[root@k8s-master01 work]# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://172.16.60.243:6443/metrics|head
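The /metrics endpoint returns the Prometheus text format: `# HELP` and `# TYPE` comment lines followed by `name value` samples. To pull a single value out of that stream, something like the following sketch could be used; metric_value is a hypothetical helper and only handles samples without labels, such as the ones shown above.

```shell
# metric_value NAME (hypothetical helper): read Prometheus text-format metrics
# on stdin and print the value of the first unlabelled sample called NAME.
# Comment lines start with "#", so their first field never equals NAME.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Usage: replace "| head" in one of the curl commands above with
#   | metric_value APIServiceOpenAPIAggregationControllerQueue1_depth
```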
Source: http://1t.click/aRcN
To be continued...