WeChat official account: 运维开发故事. Author: 刘大仙
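RKE overview:
Rancher Kubernetes Engine (RKE) is a lightweight Kubernetes installer that supports installing Kubernetes on bare-metal and virtualized servers. RKE addresses a common pain point in the Kubernetes community: installation complexity. RKE itself runs on multiple platforms, including macOS, Linux, and Windows.
For details, see: https://docs.rancher.cn/rke/
Rancher overview:
Rancher is a container management platform built for companies that use containers. Rancher simplifies working with Kubernetes: developers can run Kubernetes everywhere (Run Kubernetes Everywhere), meet IT requirements, and empower DevOps teams.
For details, see: https://rancher2.docs.rancher.cn/docs/overview/_index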
Environment:
Operating system | Hostname | IP address | Node | Role |
---|---|---|---|---|
CentOS 7 1810 | nginx-master | 192.168.111.21 | Nginx primary server | Load balancing |
CentOS 7 1810 | nginx-backup | 192.168.111.22 | Nginx standby server | Load balancing |
ubuntu-18.04.3-live-server | rke-node1 | 192.168.111.50 | RKE node 1 | RKE cluster |
ubuntu-18.04.3-live-server | rke-node2 | 192.168.111.51 | RKE node 2 | RKE cluster |
ubuntu-18.04.3-live-server | rke-node3 | 192.168.111.52 | RKE node 3 | RKE cluster |
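System preparation before deployment:
Disable the firewall and SELinux
To keep blocked ports from causing the cluster setup to fail, we disable the firewall and SELinux up front.
- CentOS:
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
- Ubuntu:
sudo ufw disable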
Configure the hosts file:
192.168.111.21 nginx-master
192.168.111.22 nginx-backup
192.168.111.50 rke-node1
192.168.111.51 rke-node2
192.168.111.52 rke-node3
- Add these entries to /etc/hosts on every machine, and make sure all machines can reach each other by hostname.
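Before moving on, it can help to confirm that no entry was forgotten on a given machine. The following is only a sketch: `missing_hosts` and `EXPECTED_HOSTS` are hypothetical names introduced for illustration, and the check only looks for each hostname as a whole word in the file (it does not test connectivity).

```shell
#!/bin/sh
# Hostnames taken from the environment table in this article.
EXPECTED_HOSTS="nginx-master nginx-backup rke-node1 rke-node2 rke-node3"

# missing_hosts FILE
# Prints each expected hostname that does not appear in FILE
# (hypothetical helper, not part of any tool used in this article).
missing_hosts() {
    hosts_file=$1
    for name in $EXPECTED_HOSTS; do
        # -w matches the hostname as a whole word, -F treats it literally
        grep -qwF -- "$name" "$hosts_file" || echo "$name"
    done
}
```

Running `missing_hosts /etc/hosts` on each machine should print nothing once the file is complete; a follow-up `ping -c 1 rke-node1` style check then confirms the hosts can actually reach each other.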
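Required tools:
This installation needs the following CLI tools. Make sure they are installed and available in $PATH.
Install the CLI tools on the RKE nodes, and make sure they are installed correctly on all three nodes.
- kubectl - the Kubernetes command-line tool.
- rke - Rancher Kubernetes Engine, the CLI for building Kubernetes clusters.
- helm - the package manager for Kubernetes. See the Helm version requirements to choose a Helm version for installing Rancher.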
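Install kubectl:
- Follow the official Kubernetes documentation. For certain reasons, we use snap here.
sudo apt-get install snapd
sudo snap install kubectl --classic
# This step is slow; please be patient
# Verify the installation
kubectl help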
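Install RKE:
- Follow the official Rancher documentation. The binary is downloaded from GitHub and is fairly large, so you will need to work around any network problems yourself.
wget https://github.com/rancher/rke/releases/download/v1.0.8/rke_linux-amd64
# Move the binary to /usr/local/bin/, rename it to rke, and make it executable
sudo mv rke_linux-amd64 /usr/local/bin/rke
sudo chmod +x /usr/local/bin/rke
# Verify the installation
rke --version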
Install Helm:
- Follow the official Helm documentation. Helm is the package manager for Kubernetes; Helm v3 or later is required.
# Download the package
wget https://get.helm.sh/helm-v3.2.1-linux-amd64.tar.gz
# Extract it
tar zxvf helm-v3.2.1-linux-amd64.tar.gz
# Move the binary to /usr/local/bin/
sudo mv linux-amd64/helm /usr/local/bin/helm
# Verify the installation
helm help
Create the Nginx + Keepalived cluster:
Perform these steps on the CentOS nodes.
- Install Nginx
# Download the Nginx source package
wget http://nginx.org/download/nginx-1.17.10.tar.gz
# Extract it
tar zxvf nginx-1.17.10.tar.gz
# Install the packages needed for compilation
yum install -y gcc gcc-c++ pcre pcre-devel zlib zlib-devel openssl openssl-devel libnl3-devel
# Enter the nginx directory. We need HTTPS, so enable the --with-http_ssl_module module at configure time;
# --with-stream enables the stream module used for the layer-4 load balancing configured later
cd nginx-1.17.10
mkdir -p /usr/local/nginx
./configure --prefix=/usr/local/nginx --with-http_ssl_module --with-stream
# Build and install nginx
make && make install
# Create a symlink for the nginx command
ln -s /usr/local/nginx/sbin/nginx /usr/local/bin/nginx
# Verify the installation
nginx -V
# Start nginx
nginx
- Install Keepalived
# Download the source package
wget https://www.keepalived.org/software/keepalived-2.0.20.tar.gz
# Extract it
tar zxvf keepalived-2.0.20.tar.gz
# Build and install keepalived
cd keepalived-2.0.20
mkdir /usr/local/keepalived
./configure --prefix=/usr/local/keepalived/
make && make install
# Register keepalived as a system service
cp /usr/local/keepalived/sbin/keepalived /usr/sbin/keepalived
cp /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig/keepalived
touch /etc/init.d/keepalived
chmod +x /etc/init.d/keepalived
# The contents of this file are listed below
vim /etc/init.d/keepalived
# Configure keepalived
mkdir /etc/keepalived/
cp /usr/local/keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
# The contents of keepalived.conf are listed below
vim /etc/keepalived/keepalived.conf
# Start keepalived
systemctl start keepalived
systemctl enable keepalived
# Verify
systemctl status keepalived
# keepalived should now be running on both nodes, one as master and one as backup.
# Running ip addr on the master should show the virtual IP address; the backup should not have it.
# Visit https://192.168.111.20 to verify the configuration
Contents of /etc/init.d/keepalived
#!/bin/sh
#
# Startup script for the Keepalived daemon
#
# processname: keepalived
# pidfile: /var/run/keepalived.pid
# config: /etc/keepalived/keepalived.conf
# chkconfig: - 21 79
# description: Start and stop Keepalived

# Source function library
. /etc/rc.d/init.d/functions

# Source configuration file (we set KEEPALIVED_OPTIONS there)
. /etc/sysconfig/keepalived

RETVAL=0
prog="keepalived"

start() {
    echo -n $"Starting $prog: "
    daemon keepalived ${KEEPALIVED_OPTIONS}
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
}

stop() {
    echo -n $"Stopping $prog: "
    killproc keepalived
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
}

reload() {
    echo -n $"Reloading $prog: "
    killproc keepalived -1
    RETVAL=$?
    echo
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    reload)
        reload
        ;;
    restart)
        stop
        start
        ;;
    condrestart)
        if [ -f /var/lock/subsys/$prog ]; then
            stop
            start
        fi
        ;;
    status)
        status keepalived
        RETVAL=$?
        ;;
    *)
        echo "Usage: $0 {start|stop|reload|restart|condrestart|status}"
        RETVAL=1
esac

exit $RETVAL
# Contents of /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id 192.168.111.21    # must be unique on the network; do not reuse an ID
}
vrrp_script chk_nginx {         # check script that monitors the nginx service
    script "/usr/local/keepalived/check_ng.sh"
    interval 3
}
vrrp_instance VI_1 {
    state MASTER                # this node is the master; set BACKUP on the standby
    interface ens33             # network interface to bind to
    virtual_router_id 51        # VRRP group; must be the same on master and backup
    priority 120                # the node with the higher priority becomes master
    advert_int 1                # VRRP advertisement interval, in seconds
    authentication {            # authentication
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {         # virtual IP
        192.168.111.20
    }
    track_script {              # scripts to run
        chk_nginx
    }
}
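The article only lists the master's configuration and notes that the standby uses `state BACKUP`. As a rough sketch, the standby's file could be derived from the master's with `sed`. The assumptions here are mine, not the author's: that the standby differs only in `router_id`, `state`, and `priority`, that its `router_id` is its own IP (192.168.111.22), and that 100 is an acceptable lower priority. `make_backup_conf` is a hypothetical helper name.

```shell
#!/bin/sh
# make_backup_conf: read the MASTER keepalived.conf on stdin and write a
# BACKUP variant to stdout (hypothetical helper, not part of keepalived).
make_backup_conf() {
    sed -e 's/router_id 192\.168\.111\.21/router_id 192.168.111.22/' \
        -e 's/state MASTER/state BACKUP/' \
        -e 's/priority 120/priority 100/'   # 100 is an assumed lower priority
}
```

For example, `make_backup_conf < /etc/keepalived/keepalived.conf > keepalived.conf.backup` on nginx-master, then copy the result to /etc/keepalived/keepalived.conf on nginx-backup and check the `interface` name there by hand.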
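Contents of /usr/local/keepalived/check_ng.sh (the script must be executable: chmod +x /usr/local/keepalived/check_ng.sh)
#!/bin/bash
# Timestamp for the log entry
d=$(date --date today +%Y%m%d_%H:%M:%S)
# Number of running nginx processes
n=$(ps -C nginx --no-heading | wc -l)
if [ "$n" -eq 0 ]; then
    # nginx was built from source above, so no systemd unit exists for it;
    # start the binary directly instead of using systemctl
    /usr/local/nginx/sbin/nginx
    n2=$(ps -C nginx --no-heading | wc -l)
    if [ "$n2" -eq 0 ]; then
        # nginx could not be restarted: stop keepalived so the VIP moves to the other node
        echo "$d nginx down,keepalived will stop" >> /var/log/check_ng.log
        systemctl stop keepalived
    fi
fi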
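The verification step for keepalived says that the virtual IP should appear in `ip addr` on the master and be absent on the backup. A small sketch of that check follows; `holds_vip` is a hypothetical helper, and it assumes the usual `inet ADDRESS/PREFIX` layout of `ip addr` output.

```shell
#!/bin/sh
VIP=192.168.111.20   # virtual IP from keepalived.conf in this article

# holds_vip: succeeds if the `ip addr` output on stdin lists the VIP.
# The trailing "/" keeps 192.168.111.200 and similar addresses from matching.
holds_vip() {
    grep -qF "inet ${VIP}/"
}
```

On each Nginx node, `ip -4 addr show | holds_vip && echo "this node holds the VIP"` should print the message on exactly one of the two machines.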
Install docker-ce:
Perform these steps on the RKE nodes.
# Remove old Docker versions
sudo apt-get remove docker docker-engine docker.io containerd runc
# Install prerequisite packages
sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
gnupg-agent \
software-properties-common
# Add Docker's official GPG key
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
# Add the stable apt repository
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
# Install docker-ce
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
# Verify the installation
docker info
# Add the current user to the "docker" group. This account is used later in the installation:
# the SSH user that RKE uses to reach a node must be a member of the docker group on that node.
# Log out and back in for the new group membership to take effect.
sudo usermod -aG docker $USER
Configure layer-4 load balancing
Perform this on the Nginx cluster.
# Update the nginx configuration file
# vim /usr/local/nginx/conf/nginx.conf
#user nobody;
worker_processes 4;
worker_rlimit_nofile 40000;
events {
    worker_connections 8192;
}
stream {
    upstream rancher_servers_http {
        least_conn;
        server 192.168.111.50:80 max_fails=3 fail_timeout=5s;
        server 192.168.111.51:80 max_fails=3 fail_timeout=5s;
        server 192.168.111.52:80 max_fails=3 fail_timeout=5s;
    }
    server {
        listen 80;
        proxy_pass rancher_servers_http;
    }
    upstream rancher_servers_https {
        least_conn;
        server 192.168.111.50:443 max_fails=3 fail_timeout=5s;
        server 192.168.111.51:443 max_fails=3 fail_timeout=5s;
        server 192.168.111.52:443 max_fails=3 fail_timeout=5s;
    }
    server {
        listen 443;
        proxy_pass rancher_servers_https;
    }
}
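After editing nginx.conf, the running nginx has to pick up the new configuration. A cautious way to do that, sketched here, is to test the file first and reload only when the test passes. `reload_nginx` and the `NGINX_BIN` override are hypothetical names introduced for this example; `nginx -t` and `nginx -s reload` are the standard nginx command-line options.

```shell
#!/bin/sh
# reload_nginx: validate the configuration, then reload only if it is valid.
# NGINX_BIN is a hypothetical override so the binary path can be changed.
reload_nginx() {
    nginx_bin=${NGINX_BIN:-nginx}
    if "$nginx_bin" -t; then
        "$nginx_bin" -s reload && echo "nginx reloaded"
    else
        echo "nginx configuration test failed; not reloading" >&2
        return 1
    fi
}
```

Run `reload_nginx` on both nginx-master and nginx-backup so that the two load balancers serve the same configuration.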
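Deployment:
Install Kubernetes with RKE
- Set up passwordless SSH between the RKE nodes
# Generate an RSA key pair
ssh-keygen
# Copy this host's public key to the nodes to enable passwordless login
ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.50
ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.51
ssh-copy-id -i ~/.ssh/id_rsa.pub docker@192.168.111.52
# Note: each node must also register its key with itself; run these commands on all three nodes
# Verify: connect from node3 to node1; ssh should no longer ask for a password
docker@rke-node3:~$ ssh docker@192.168.111.50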
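RKE connects to every node over SSH, so it is worth checking all three logins at once rather than one by hand. The following is a minimal sketch: `check_ssh` and the `SSH_BIN` override are hypothetical, while `BatchMode` and `ConnectTimeout` are standard OpenSSH client options.

```shell
#!/bin/sh
# check_ssh IP...: try a non-interactive login to each node as user "docker".
# SSH_BIN is a hypothetical override, useful for dry runs.
check_ssh() {
    ssh_bin=${SSH_BIN:-ssh}
    for ip in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password
        if "$ssh_bin" -o BatchMode=yes -o ConnectTimeout=5 "docker@$ip" true; then
            echo "$ip ok"
        else
            echo "$ip FAILED"
        fi
    done
}
```

Calling `check_ssh 192.168.111.50 192.168.111.51 192.168.111.52` on the node where `rke up` will run should report `ok` for all three addresses.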
- Write the rancher-cluster.yml file
# vim rancher-cluster.yml
nodes:
  - address: 192.168.111.50            # host IP
    user: docker                        # a user that can run docker commands
    role: [controlplane,worker,etcd]    # node roles
  - address: 192.168.111.51
    user: docker
    role: [controlplane,worker,etcd]
  - address: 192.168.111.52
    user: docker
    role: [controlplane,worker,etcd]
services:
  etcd:
    snapshot: true      # take recurring etcd snapshots
    creation: 6h        # snapshot interval
    retention: 24h      # how long snapshots are kept
- Run RKE to build the Kubernetes cluster
rke up --config ./rancher-cluster.yml
# Verify: the following message indicates success.
# Finished building Kubernetes cluster successfully.
- After a successful run, a kube_config_rancher-cluster.yml file is generated in the current directory. Copy it to .kube/kube_config_rancher-cluster.yml
# Run in the user's home directory
mkdir -p .kube
cp kube_config_rancher-cluster.yml .kube/
export KUBECONFIG=$HOME/.kube/kube_config_rancher-cluster.yml
# Verify
kubectl get nodes
NAME             STATUS   ROLES                      AGE     VERSION
192.168.111.50   Ready    controlplane,etcd,worker   5m47s   v1.17.5
192.168.111.51   Ready    controlplane,etcd,worker   5m46s   v1.17.5
192.168.111.52   Ready    controlplane,etcd,worker   5m47s   v1.17.5
- Keep the generated files safe
rancher-cluster.yml: the RKE cluster configuration file.
kube_config_rancher-cluster.yml: the kubeconfig file for the cluster; it contains credentials with full access to the cluster.
rancher-cluster.rkestate: the Kubernetes cluster state file; it contains credentials with full access to the cluster.
- Check the health of the cluster Pods
- Pods are in the Running or Completed state.
- For Pods with STATUS Running, READY should show all containers running (for example, 3/3).
- Pods with STATUS Completed are run-once jobs. For these Pods, READY should be 0/1.
kubectl get pods --all-namespaces
NAMESPACE       NAME                                      READY   STATUS      RESTARTS   AGE
ingress-nginx   nginx-ingress-controller-tnsn4            1/1     Running     0          30s
ingress-nginx   nginx-ingress-controller-tw2ht            1/1     Running     0          30s
ingress-nginx   nginx-ingress-controller-v874b            1/1     Running     0          30s
kube-system     canal-jp4hz                               3/3     Running     0          30s
kube-system     canal-z2hg8                               3/3     Running     0          30s
kube-system     canal-z6kpw                               3/3     Running     0          30s
kube-system     kube-dns-7588d5b5f5-sf4vh                 3/3     Running     0          30s
kube-system     kube-dns-autoscaler-5db9bbb766-jz2k6      1/1     Running     0          30s
kube-system     metrics-server-97bc649d5-4rl2q            1/1     Running     0          30s
kube-system     rke-ingress-controller-deploy-job-bhzgm   0/1     Completed   0          30s
kube-system     rke-kubedns-addon-deploy-job-gl7t4        0/1     Completed   0          30s
kube-system     rke-metrics-addon-deploy-job-7ljkc        0/1     Completed   0          30s
kube-system     rke-network-plugin-deploy-job-6pbgj       0/1     Completed   0          30s
Check that all required Pods and containers are healthy before you continue.
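The Pod health rules (Running with every container ready, or Completed) can be applied mechanically to the `kubectl get pods` listing. This is only a sketch under the assumption that the columns are NAMESPACE, NAME, READY, STATUS in that order, as in the `--all-namespaces` output; `unhealthy_pods` is a hypothetical helper name.

```shell
#!/bin/sh
# unhealthy_pods: read `kubectl get pods --all-namespaces --no-headers` output
# on stdin and print pods that are neither Completed nor fully-ready Running.
unhealthy_pods() {
    awk '{
        split($3, ready, "/")            # READY column, e.g. "3/3"
        if ($4 == "Completed") next
        if ($4 == "Running" && ready[1] == ready[2]) next
        print $1 "/" $2 " " $3 " " $4
    }'
}
```

`kubectl get pods --all-namespaces --no-headers | unhealthy_pods` prints nothing when every Pod meets the criteria; any line it does print names a Pod to investigate before installing Rancher.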
Install Rancher
- Add the Helm chart repository
helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
- Create a namespace for Rancher
kubectl create namespace cattle-system
- Use Rancher-generated self-signed certificates (this requires cert-manager)
# Install the CustomResourceDefinition resources
kubectl apply -f https://raw.githubusercontent.com/jetstack/cert-manager/release-0.12/deploy/manifests/00-crds.yaml
# **Important:**
# If you are running Kubernetes v1.15 or lower,
# add the `--validate=false` flag to the kubectl apply command above.
# Otherwise you will get a validation error about the
# x-kubernetes-preserve-unknown-fields field in cert-manager's CustomResourceDefinition resources.
# This is a benign error caused by the way kubectl performs resource validation.
# Create a namespace for cert-manager
kubectl create namespace cert-manager
# Add the Jetstack Helm repository
helm repo add jetstack https://charts.jetstack.io
# Update the local Helm chart repository cache
helm repo update
# Install the cert-manager Helm chart
helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --version v0.12.0
# Verify
kubectl get pods --namespace cert-manager
NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-754d9b75d9-6xbk4             1/1     Running   0          94s
cert-manager-cainjector-85fbdf788-hthfn   1/1     Running   0          94s
cert-manager-webhook-76f9b64b45-bmt5z     1/1     Running   0          94s
- Deploy the Rancher cluster
helm install rancher rancher-stable/rancher \
  --namespace cattle-system \
  --set hostname=rancher.hzqx.com
- Wait for the Rancher cluster to come up
kubectl -n cattle-system rollout status deploy/rancher
Waiting for deployment "rancher" rollout to finish: 0 of 3 updated replicas are available...
deployment "rancher" successfully rolled out
- Setup is complete. Visit https://rancher.hzqx.com; the default username and password are both admin.
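For that URL to work, rancher.hzqx.com has to resolve to the load balancer, a step the article does not show. One possible approach, sketched here under the assumptions that no DNS record exists and that the keepalived virtual IP 192.168.111.20 is the intended target, is a hosts-file entry on the client machine. `add_host_entry` is a hypothetical helper.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless NAME is
# already listed there (hypothetical helper).
add_host_entry() {
    hosts_file=$1 ip=$2 name=$3
    if ! grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

Run as root, `add_host_entry /etc/hosts 192.168.111.20 rancher.hzqx.com` adds the entry once and leaves the file unchanged on later runs. In a shared environment a proper DNS record is the more maintainable choice.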