Introduction to Kubespray
Kubespray is an open-source project for deploying production-grade Kubernetes clusters; it uses Ansible as its deployment tool.
- Can be deployed on AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (experimental), or bare metal
- Highly available clusters
- Composable components (for example, choose your network plugin)
- Supports the most popular Linux distributions
- Continuous-integration tests
Project home: https://github.com/kubernetes-sigs/kubespray
Online Deployment
China's network environment makes using kubespray particularly difficult: some images must be pulled from gcr.io and some binaries downloaded from GitHub. This walkthrough therefore creates three 2C4G spot-instance ECS nodes in Alibaba Cloud's Hong Kong region for the deployment test.
Note: a highly available etcd deployment requires 3 members, so an HA cluster needs at least 3 nodes.
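The three-node minimum follows from etcd's quorum rule: a cluster of n members stays writable only while a majority, floor(n/2) + 1, is reachable. A minimal sketch (the `quorum` helper is illustrative, not part of kubespray):

```shell
# etcd needs a majority (quorum) of members to accept writes.
# quorum(n) = floor(n/2) + 1; the cluster tolerates n - quorum(n) failures.
quorum() { echo $(( $1 / 2 + 1 )); }

for n in 1 2 3 5; do
  q=$(quorum "$n")
  echo "members=$n quorum=$q tolerated_failures=$(( n - q ))"
done
```

With 1 or 2 members the cluster survives no failures; 3 is the smallest size that survives losing one node, which is why the HA layout starts at three.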
Kubespray needs a deploy node, which can also be any one of the cluster nodes. Here kubespray is installed on the first master node (192.168.0.137), where all subsequent commands are run.
Download kubespray
# Download an official release tarball
wget https://github.com/kubernetes-sigs/kubespray/archive/v2.13.1.tar.gz
tar -zxvf v2.13.1.tar.gz
Or clone the repository directly:
git clone https://github.com/kubernetes-sigs/kubespray.git -b v2.13.1 --depth=1
Install dependencies
cd kubespray-2.13.1/
yum install -y epel-release python3-pip
pip3 install -r requirements.txt
Update the Ansible inventory file; the IPS addresses are the internal IPs of the three ECS instances:
cp -rfp inventory/sample inventory/mycluster
declare -a IPS=( 192.168.0.137 192.168.0.138 192.168.0.139)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
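inventory_builder assigns roles positionally. A rough sketch of the defaults it applies to a three-host list (this mimics the behavior, it is not the tool's actual code):

```shell
# Positional role defaults applied by inventory_builder (illustrative sketch):
IPS=(192.168.0.137 192.168.0.138 192.168.0.139)
masters=("${IPS[@]:0:2}")   # the first two hosts become kube-master
etcd=("${IPS[@]:0:3}")      # the first three hosts become etcd members
nodes=("${IPS[@]}")         # every host becomes a kube-node
echo "kube-master: ${masters[*]}"
echo "etcd:        ${etcd[*]}"
echo "kube-node:   ${nodes[*]}"
```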
Inspect the generated hosts.yaml: kubespray automatically plans node roles based on the number of hosts provided. With three hosts it deploys 2 master nodes, uses all 3 hosts as worker nodes, and places etcd on all 3.
[root@node1 kubespray-2.13.1]# cat inventory/mycluster/hosts.yaml
all:
  hosts:
    node1:
      ansible_host: 192.168.0.137
      ip: 192.168.0.137
      access_ip: 192.168.0.137
    node2:
      ansible_host: 192.168.0.138
      ip: 192.168.0.138
      access_ip: 192.168.0.138
    node3:
      ansible_host: 192.168.0.139
      ip: 192.168.0.139
      access_ip: 192.168.0.139
  children:
    kube-master:
      hosts:
        node1:
        node2:
    kube-node:
      hosts:
        node1:
        node2:
        node3:
    etcd:
      hosts:
        node1:
        node2:
        node3:
    k8s-cluster:
      children:
        kube-master:
        kube-node:
    calico-rr:
      hosts: {}
Review the global variables (the defaults are fine):
cat inventory/mycluster/group_vars/all/all.yml
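A couple of variables commonly adjusted in all.yml, as a hedged example (variable names as found in kubespray v2.13; verify against your copy, and only set what you need):

```yaml
# Keep the local nginx proxy that load-balances apiserver traffic on workers (default):
loadbalancer_apiserver_localhost: true
# Forward unresolved DNS queries to these upstreams (example values):
upstream_dns_servers:
  - 223.5.5.5
  - 8.8.8.8
```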
The default Kubernetes version is rather old; pin the version explicitly:
# vim inventory/mycluster/group_vars/k8s-cluster/k8s-cluster.yml
kube_version: v1.18.3
Set up passwordless SSH from the kubespray ansible node to all nodes:
ssh-keygen
ssh-copy-id 192.168.0.137
ssh-copy-id 192.168.0.138
ssh-copy-id 192.168.0.139
Run the kubespray playbook to install the cluster
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Inspect the resulting cluster
[root@node1 kubespray-2.13.1]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node1 Ready master 3m30s v1.18.3 192.168.0.137 <none> CentOS Linux 7 (Core) 3.10.0-1062.18.1.el7.x86_64 docker://18.9.9
node2 Ready master 2m53s v1.18.3 192.168.0.138 <none> CentOS Linux 7 (Core) 3.10.0-1062.18.1.el7.x86_64 docker://18.9.9
node3 Ready <none> 109s v1.18.3 192.168.0.139 <none> CentOS Linux 7 (Core) 3.10.0-1062.18.1.el7.x86_64 docker://18.9.9
[root@node1 kubespray-2.13.1]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-796b886f7c-ftm7c 1/1 Running 0 75s
calico-node-bvx8m 1/1 Running 1 97s
calico-node-c88d7 1/1 Running 1 97s
calico-node-gdccq 1/1 Running 1 97s
coredns-6489c7bb8b-k7gpd 1/1 Running 0 61s
coredns-6489c7bb8b-wgmjz 1/1 Running 0 56s
dns-autoscaler-7594b8c675-zqhv6 1/1 Running 0 58s
kube-apiserver-node1 1/1 Running 0 3m24s
kube-apiserver-node2 1/1 Running 0 2m47s
kube-controller-manager-node1 1/1 Running 0 3m24s
kube-controller-manager-node2 1/1 Running 0 2m47s
kube-proxy-d8qf8 1/1 Running 0 111s
kube-proxy-g5f95 1/1 Running 0 111s
kube-proxy-g5vvw 1/1 Running 0 111s
kube-scheduler-node1 1/1 Running 0 3m24s
kube-scheduler-node2 1/1 Running 0 2m47s
kubernetes-dashboard-7dbcd59666-rt78s 1/1 Running 0 55s
kubernetes-metrics-scraper-6858b8c44d-9ttnd 1/1 Running 0 55s
nginx-proxy-node3 1/1 Running 0 112s
nodelocaldns-b4thm 1/1 Running 0 56s
nodelocaldns-rlq4v 1/1 Running 0 56s
nodelocaldns-vx9cc 1/1 Running 0 56s
Tear down the cluster
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root reset.yml
Offline Deployment
The offline bundle contains only specific versions, so this method supports only deploying a Kubernetes v1.23.7 cluster on Ubuntu 22.04 Server LTS.
Offline bundle build process:
Prepare a dedicated deploy node, 192.168.93.23, which must sit outside the cluster. Install the containerd container runtime:
wget https://github.com/containerd/nerdctl/releases/download/v0.20.0/nerdctl-full-0.20.0-linux-amd64.tar.gz
tar -zxvf nerdctl-full-0.20.0-linux-amd64.tar.gz -C /usr/local
systemctl enable --now containerd buildkit
Generate an SSH key
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Set up passwordless SSH to the cluster nodes; run on the deploy node:
ssh-copy-id 192.168.93.20
ssh-copy-id 192.168.93.21
ssh-copy-id 192.168.93.22
Configure a local apt repository for the cluster nodes; run on each cluster node to be deployed:
mv /etc/apt/sources.list /etc/apt/sources.list.bak
echo "deb [trusted=yes] http://192.168.93.23:8088/debs/ubuntu2204 ./" | sudo tee /etc/apt/sources.list.d/debs.list
Create the compose file
cat >docker-compose.yaml<<EOF
version: '3.1'
services:
kubespray:
container_name: kubespray
image: registry.cn-shenzhen.aliyuncs.com/cnmirror/kubespray:v2.19.0
restart: always
command: sleep infinity
volumes:
- kubespray-playbook:/kubespray
- ${HOME}/.ssh/id_rsa:/root/.ssh/id_rsa
kubespray-nginx:
container_name: kubespray-nginx
image: registry.cn-shenzhen.aliyuncs.com/cnmirror/kubespray-nginx:v2.19.0
restart: always
volumes:
- kubespray-date:/usr/share/nginx/html/files
ports:
- 8088:8088
kubespray-registry:
image: registry.cn-shenzhen.aliyuncs.com/cnmirror/kubespray-registry:v2.19.0
container_name: kubespray-registry
restart: always
volumes:
- kubespray-images:/var/lib/registry
ports:
- 5000:5000
volumes:
kubespray-playbook:
kubespray-date:
kubespray-images:
EOF
Start the containers
nerdctl compose up -d
Check the deployed containers
root@ubuntu:~# nerdctl ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b3d7495736c4 registry.cn-shenzhen.aliyuncs.com/cnmirror/kubespray-nginx:v2.19.0 "/docker-entrypoint.…" 4 minutes ago Up 0.0.0.0:8088->8088/tcp kubespray-nginx
c3a395485dbb registry.cn-shenzhen.aliyuncs.com/cnmirror/kubespray-registry:v2.19.0 "/entrypoint.sh /etc…" 4 minutes ago Up 0.0.0.0:5000->5000/tcp kubespray-registry
e61525ee8bc5 registry.cn-shenzhen.aliyuncs.com/cnmirror/kubespray:v2.19.0 "sleep infinity" 4 minutes ago Up kubespray
Enter the kubespray container and create the inventory
nerdctl exec -it kubespray bash
cp -rfp inventory/sample inventory/mycluster
declare -a IPS=(192.168.93.20 192.168.93.21 192.168.93.22)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
Define offline installation variables
cd inventory/mycluster/group_vars/all/
cat >offline.yml<<'EOF'
---
registry_host: "192.168.93.23:5000"
files_repo: "http://192.168.93.23:8088/files"
ubuntu_repo: "http://192.168.93.23:8088/debs/ubuntu2204"
## Container Registry overrides
kube_image_repo: "{{ registry_host }}"
gcr_image_repo: "{{ registry_host }}"
github_image_repo: "{{ registry_host }}"
docker_image_repo: "{{ registry_host }}"
quay_image_repo: "{{ registry_host }}"
## Kubernetes components
kubeadm_download_url: "{{ files_repo }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubeadm"
kubectl_download_url: "{{ files_repo }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubectl"
kubelet_download_url: "{{ files_repo }}/storage.googleapis.com/kubernetes-release/release/{{ kube_version }}/bin/linux/{{ image_arch }}/kubelet"
## CNI Plugins
cni_download_url: "{{ files_repo }}/github.com/containernetworking/plugins/releases/download/{{ cni_version }}/cni-plugins-linux-{{ image_arch }}-{{ cni_version }}.tgz"
## cri-tools
crictl_download_url: "{{ files_repo }}/github.com/kubernetes-sigs/cri-tools/releases/download/{{ crictl_version }}/crictl-{{ crictl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
## [Optional] etcd: only if you **DON'T** use etcd_deployment=host
etcd_download_url: "{{ files_repo }}/github.com/etcd-io/etcd/releases/download/{{ etcd_version }}/etcd-{{ etcd_version }}-linux-{{ image_arch }}.tar.gz"
# [Optional] Calico: If using Calico network plugin
calicoctl_download_url: "{{ files_repo }}/github.com/projectcalico/calico/releases/download/{{ calico_ctl_version }}/calicoctl-linux-{{ image_arch }}"
calico_crds_download_url: "{{ files_repo }}/github.com/projectcalico/calico/archive/{{ calico_version }}.tar.gz"
helm_download_url: "{{ files_repo }}/get.helm.sh/helm-{{ helm_version }}-linux-{{ image_arch }}.tar.gz"
crun_download_url: "{{ files_repo }}/github.com/containers/crun/releases/download/{{ crun_version }}/crun-{{ crun_version }}-linux-{{ image_arch }}"
kata_containers_download_url: "{{ files_repo }}/github.com/kata-containers/runtime/releases/download/{{ kata_containers_version }}/kata-static-{{ kata_containers_version }}-{{ ansible_architecture }}.tar.xz"
# [Optional] runc,containerd: only if you set container_runtime: containerd
runc_download_url: "{{ files_repo }}/github.com/opencontainers/runc/releases/download/{{ runc_version }}/runc.{{ image_arch }}"
containerd_download_url: "{{ files_repo }}/github.com/containerd/containerd/releases/download/v{{ containerd_version }}/containerd-{{ containerd_version }}-linux-{{ image_arch }}.tar.gz"
nerdctl_download_url: "{{ files_repo }}/github.com/containerd/nerdctl/releases/download/v{{ nerdctl_version }}/nerdctl-{{ nerdctl_version }}-{{ ansible_system | lower }}-{{ image_arch }}.tar.gz"
EOF
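To sanity-check the overrides above, it helps to expand one template by hand. With the values this walkthrough uses (files_repo and kube_version from this guide; image_arch=amd64 is an assumption for x86_64 hosts), kubeadm_download_url resolves as follows:

```shell
# Manual expansion of the kubeadm_download_url Jinja template from offline.yml.
files_repo="http://192.168.93.23:8088/files"
kube_version="v1.23.7"
image_arch="amd64"
kubeadm_url="${files_repo}/storage.googleapis.com/kubernetes-release/release/${kube_version}/bin/linux/${image_arch}/kubeadm"
echo "$kubeadm_url"
```

The nginx file server therefore has to mirror the upstream path layout under /files; a `curl -I` against the expanded URL should return 200 before you run the playbook.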
Configure the insecure_registries parameter
cp /kubespray/inventory/mycluster/group_vars/all/containerd.yml{,.bak}
cat <<EOF>/kubespray/inventory/mycluster/group_vars/all/containerd.yml
containerd_insecure_registries:
"192.168.93.23:5000": "http://192.168.93.23:5000"
EOF
Deploy the cluster
cd /kubespray
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Check the cluster's running state
root@ubuntu:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
node1 Ready control-plane,master 15h v1.23.7 192.168.93.20 <none> Ubuntu 22.04 LTS 5.15.0-27-generic containerd://1.6.4
node2 Ready control-plane,master 15h v1.23.7 192.168.93.21 <none> Ubuntu 22.04 LTS 5.15.0-27-generic containerd://1.6.4
node3 Ready <none> 15h v1.23.7 192.168.93.22 <none> Ubuntu 22.04 LTS 5.15.0-27-generic containerd://1.6.4
root@ubuntu:~# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-55c778f487-vx9nc 1/1 Running 0 16h
kube-system calico-node-6cw2j 1/1 Running 0 16h
kube-system calico-node-h8x5j 1/1 Running 0 16h
kube-system calico-node-vlrh4 1/1 Running 0 16h
kube-system coredns-56fc47f88f-wfx4q 1/1 Running 0 16h
kube-system coredns-56fc47f88f-zb92t 1/1 Running 0 16h
kube-system dns-autoscaler-f6897fd5b-b854l 1/1 Running 0 16h
kube-system kube-apiserver-node1 1/1 Running 1 16h
kube-system kube-apiserver-node2 1/1 Running 1 16h
kube-system kube-controller-manager-node1 1/1 Running 1 16h
kube-system kube-controller-manager-node2 1/1 Running 1 16h
kube-system kube-proxy-sm52q 1/1 Running 0 16h
kube-system kube-proxy-wtrqz 1/1 Running 0 16h
kube-system kube-proxy-xf8k7 1/1 Running 0 16h
kube-system kube-scheduler-node1 1/1 Running 1 16h
kube-system kube-scheduler-node2 1/1 Running 1 16h
kube-system nginx-proxy-node3 1/1 Running 0 16h
kube-system nodelocaldns-5lld5 1/1 Running 0 16h
kube-system nodelocaldns-hhlch 1/1 Running 0 16h
kube-system nodelocaldns-pgfdh 1/1 Running 0 16h
References:
https://github.com/willzhang/kubespray-offline
https://github.com/k8sli/kubeplay