DOCKER Learning (2) -- Kubernetes (v1.12.1) Cluster Installation

 

This article installs Kubernetes with kubeadm.

kubeadm is the tool officially provided by Kubernetes for quickly installing a Kubernetes cluster. It is updated in step with every Kubernetes release and adjusts some of the practices around cluster configuration, so experimenting with kubeadm is a good way to learn the official, up-to-date best practices for cluster setup.

1. Preparation

1.1 System requirements

Supported operating systems:

  • Ubuntu 16.04+
  • Debian 9
  • CentOS 7
  • RHEL 7
  • Fedora 25/26 (best-effort)
  • HypriotOS v1.0.1+
  • Container Linux (tested with 1800.6.0)

Memory and CPU:

  • 2 GB or more of RAM per machine (any less will leave little room for your apps)
  • 2 CPUs or more

1.2 System configuration

If the firewall is enabled on any of the hosts, the ports required by the Kubernetes components must be opened:

Master node(s)

Protocol   Direction   Port Range   Purpose                   Used By
TCP        Inbound     6443*        Kubernetes API server     All
TCP        Inbound     2379-2380    etcd server client API    kube-apiserver, etcd
TCP        Inbound     10250        Kubelet API               Self, Control plane
TCP        Inbound     10251        kube-scheduler            Self
TCP        Inbound     10252        kube-controller-manager   Self

Worker node(s)

Protocol   Direction   Port Range    Purpose               Used By
TCP        Inbound     10250         Kubelet API           Self, Control plane
TCP        Inbound     30000-32767   NodePort Services**   All

Alternatively, disable the firewall:

systemctl stop firewalld
systemctl disable firewalld

Disable SELinux:

setenforce 0
vi /etc/selinux/config
SELINUX=disabled

Create the file /etc/sysctl.d/k8s.conf with the following content:

cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

Run the following commands to apply the changes:

modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
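
Optionally, a quick check that the module is loaded and the settings are active:

lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward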

1.3 Configure the hostname

vim /etc/hosts

# add the corresponding host entry
192.168.0.10 app-tahjv1fe-1
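
If the machine's hostname itself also needs to match this entry, it can be set with hostnamectl (app-tahjv1fe-1 is just the example name used in this cluster):

hostnamectl set-hostname app-tahjv1fe-1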

1.4 Install Docker

For installing Docker, see the previous article, DOCKER Learning (1) -- Installing Docker.

After installing Docker, confirm that the default policy of the FORWARD chain in the iptables filter table is ACCEPT:

iptables -nvL
Chain INPUT (policy ACCEPT 45403 packets, 59M bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 DOCKER-USER  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 DOCKER-ISOLATION-STAGE-1  all  --  *      *       0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 !docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker0 docker0  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    0     0 DOCKER     all  --  *      docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
    0     0 ACCEPT     all  --  docker_gwbridge !docker_gwbridge  0.0.0.0/0            0.0.0.0/0           
    0     0 DROP       all  --  docker_gwbridge docker_gwbridge  0.0.0.0/0            0.0.0.0/0 

Starting with version 1.13, Docker changed its default firewall rules and set the FORWARD chain of the iptables filter table to DROP, which breaks communication between Pods on different nodes in a Kubernetes cluster. With Docker 18.06 installed here, however, the default policy turns out to be ACCEPT again; it is unclear in which version this was changed back.

If the policy is DROP, use the following command to switch it to ACCEPT:

sed -i "13i ExecStartPost=/usr/sbin/iptables -P FORWARD ACCEPT" /usr/lib/systemd/system/docker.service

Then restart Docker:

systemctl daemon-reload
systemctl restart docker
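
After the restart, the policy of the FORWARD chain should read ACCEPT again:

iptables -nvL FORWARD | head -1
# expect: Chain FORWARD (policy ACCEPT ...)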

2. Installing Kubernetes with kubeadm

2.1 Install kubeadm and kubelet

Install kubeadm and kubelet on every node. The Aliyun yum mirror is used here:

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum makecache fast
yum install -y kubelet kubeadm kubectl

Running kubelet --help shows that most of kubelet's original command-line flags have been DEPRECATED, for example:

--address 0.0.0.0   The IP address for the Kubelet to serve on (set to 0.0.0.0 for all IPv4 interfaces and `::` for all IPv6 interfaces) (default 0.0.0.0) (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)

The official recommendation is instead to use --config to point at a configuration file and to move what these flags used to configure into that file. Kubernetes does this in order to support Dynamic Kubelet Configuration. The kubelet configuration file must be in JSON or YAML format.
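
For illustration only, a minimal kubelet configuration file in the kubelet.config.k8s.io/v1beta1 format might look like the sketch below. kubeadm normally generates /var/lib/kubelet/config.yaml itself, and the rest of this guide keeps using KUBELET_EXTRA_ARGS, so the file path here is just an example:

cat <<EOF > /tmp/kubelet-config-example.yaml
# example only -- not used by the rest of this guide
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
address: 0.0.0.0      # replaces the deprecated --address flag
failSwapOn: false     # replaces the deprecated --fail-swap-on flag
EOF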

Starting with Kubernetes 1.8, system swap must be turned off; otherwise kubelet will fail to start with the default configuration. Alternatively, the restriction can be removed through configuration.

Disable system swap:

swapoff -a
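
swapoff -a only lasts until the next reboot; to make it permanent, the swap entry in /etc/fstab can also be commented out (a common approach, shown here as a sketch -- review /etc/fstab afterwards):

sed -i '/ swap / s/^/#/' /etc/fstab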

The servers used here may also be running other services, and turning off swap could affect them, so instead the kubelet configuration is modified to drop this restriction. Edit the configuration file:

vi /etc/sysconfig/kubelet

#add args to KUBELET_EXTRA_ARGS
KUBELET_EXTRA_ARGS=--fail-swap-on=false

2.2 Initialize the cluster with kubeadm init

Enable the kubelet service to start on boot on every node:

systemctl enable kubelet

Next, initialize the cluster with kubeadm. app-tahjv1fe-1 is chosen as the master node; run the following command on app-tahjv1fe-1:

kubeadm init \
  --kubernetes-version=v1.12.1 \
  --pod-network-cidr=10.244.0.0/16 \
  --apiserver-advertise-address=192.168.0.10 \
  --apiserver-cert-extra-sans=$(hostname) \
  --ignore-preflight-errors=Swap

The API server listening port can be changed with --insecure-port=<port>.

Because flannel is chosen as the Pod network add-on, the command above specifies --pod-network-cidr=10.244.0.0/16.

Running the init command as-is fails with errors like these:

[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.12.1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.12.1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-scheduler:v1.12.1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-proxy:v1.12.1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/pause:3.1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/etcd:3.2.24
[ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns:1.2.2

k8s.gcr.io is a Google domain that cannot be reached directly from mainland China, so the images cannot be pulled from Google directly.

Workaround: create a script that pulls the corresponding images from Docker Hub and then retags them to the names kubeadm expects:

vim ./kubernetes.sh

images=(kube-apiserver:v1.12.1 kube-controller-manager:v1.12.1 kube-scheduler:v1.12.1 kube-proxy:v1.12.1 pause:3.1 etcd:3.2.24 coredns:1.2.2)
for imageName in ${images[*]}
do
  docker pull fengzos/$imageName
  docker tag fengzos/$imageName k8s.gcr.io/$imageName
  docker rmi fengzos/$imageName
done

chmod +x ./kubernetes.sh

./kubernetes.sh
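
After the script finishes, the retagged images should show up locally under the k8s.gcr.io names:

docker images | grep k8s.gcr.io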

Run the initialization again after the script has finished. If the initialization fails and has to be repeated, run a reset first:

kubeadm reset

Then run the kubeadm init command again. A successful run ends with output like this:

[bootstraptoken] using token: 62u3or.7rvaxpgra3r9c1u9
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 192.168.0.10:6443 --token 62u3or.7rvaxpgra3r9c1u9 --discovery-token-ca-cert-hash sha256:0fe59a4d41817cff8d3190a0e3c541219957abd938c4f9243d03782523c663dc

Save the generated join command; it will be needed later when other nodes join the cluster.
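
If the join command is lost, or the bootstrap token (valid for 24 hours by default) has expired, a fresh one can be printed on the master at any time:

kubeadm token create --print-join-command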

The following commands configure kubectl access for a regular user by setting up the kubeconfig file for the current user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Use kubectl version to check the cluster:

# kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:36:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

Check the cluster component status:

kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}

Check the node information:

# kubectl get node -o wide
NAME            STATUS     ROLES    AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
app-tahjv1fe-1   NotReady   master   176m   v1.12.1   192.168.0.10   <none>        CentOS Linux 7 (Core)   3.10.0-862.el7.x86_64   docker://18.6.1

Here you can see that kubeadm 1.12 also puts a taint on the app-tahjv1fe-1 node: node.kubernetes.io/not-ready:NoSchedule. This is easy to understand: a node that is not yet Ready does not accept scheduling. But the node will not become Ready until a network add-on has been deployed to the cluster.

2.3 Install the Pod network

Next, install the flannel network add-on:

mkdir -p ~/k8s/
cd ~/k8s
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f  kube-flannel.yml

clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created

Note that the flannel image referenced in kube-flannel.yml is version 0.10.0: quay.io/coreos/flannel:v0.10.0-amd64.

If a node has multiple network interfaces, you currently need to use the --iface argument in kube-flannel.yml to specify the name of the host's internal interface; otherwise DNS resolution may fail. Add --iface=<iface-name> to the flanneld startup arguments:

......
containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.10.0-amd64
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        - --iface=eth1
......

After deploying flannel with the steps above, it did not seem to take effect at first, so check the DaemonSets in the cluster:

# kubectl get ds -l app=flannel -n kube-system
NAME                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
kube-flannel-ds-amd64     1         1         1       1            1           beta.kubernetes.io/arch=amd64     38m
kube-flannel-ds-arm       0         0         0       0            0           beta.kubernetes.io/arch=arm       38m
kube-flannel-ds-arm64     0         0         0       0            0           beta.kubernetes.io/arch=arm64     38m
kube-flannel-ds-ppc64le   0         0         0       0            0           beta.kubernetes.io/arch=ppc64le   38m
kube-flannel-ds-s390x     0         0         0       0            0           beta.kubernetes.io/arch=s390x     38m

Looking at kube-flannel.yml, the official flannel deployment manifest creates five DaemonSets in the cluster, one per platform, and uses the node label beta.kubernetes.io/arch to start the flannel container on nodes of the matching platform. The current app-tahjv1fe-1 node has beta.kubernetes.io/arch=amd64, so for the kube-flannel-ds-amd64 DaemonSet the DESIRED count should indeed be 1.
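
The label can be checked directly; the -L flag of kubectl get adds the label value as an extra column:

kubectl get node app-tahjv1fe-1 -L beta.kubernetes.io/arch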

Use kubectl get pod --all-namespaces -o wide to make sure all Pods are in the Running state.

# kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                                     READY   STATUS    RESTARTS   AGE   IP             NODE             NOMINATED NODE
kube-system   coredns-576cbf47c7-f4w7l                 1/1     Running   0          64m   10.244.0.3     app-tahjv1fe-1   <none>
kube-system   coredns-576cbf47c7-xcgpf                 1/1     Running   0          64m   10.244.0.2     app-tahjv1fe-1   <none>
kube-system   etcd-app-tahjv1fe-1                      1/1     Running   1          63m   192.168.0.10   app-tahjv1fe-1   <none>
kube-system   kube-apiserver-app-tahjv1fe-1            1/1     Running   0          63m   192.168.0.10   app-tahjv1fe-1   <none>
kube-system   kube-controller-manager-app-tahjv1fe-1   1/1     Running   7          64m   192.168.0.10   app-tahjv1fe-1   <none>
kube-system   kube-flannel-ds-amd64-8qp72              1/1     Running   0          42m   192.168.0.10   app-tahjv1fe-1   <none>
kube-system   kube-proxy-jr42n                         1/1     Running   0          64m   192.168.0.10   app-tahjv1fe-1   <none>
kube-system   kube-scheduler-app-tahjv1fe-1            1/1     Running   7          63m   192.168.0.10   app-tahjv1fe-1   <none>

2.4 Let the master node run workloads

In a cluster initialized with kubeadm, Pods are not scheduled onto the master node for security reasons; in other words, the master node does not take part in the workload. This is because the current master node app-tahjv1fe-1 carries the taint node-role.kubernetes.io/master:NoSchedule. This can be changed so that the master participates in scheduling; remove the taint to let app-tahjv1fe-1 run workloads:

# kubectl taint nodes app-tahjv1fe-1 node-role.kubernetes.io/master-
node "app-tahjv1fe-1" untainted

2.5 Add a worker node to the Kubernetes cluster

Next, add the host app-tahjv1fe-2 to the Kubernetes cluster. Because the swap restriction was also removed from the kubelet startup arguments on app-tahjv1fe-2, the --ignore-preflight-errors=Swap flag is needed here as well. Run on app-tahjv1fe-2:

kubeadm join 192.168.0.10:6443 --token 62u3or.7rvaxpgra3r9c1u9 --discovery-token-ca-cert-hash sha256:0fe59a4d41817cff8d3190a0e3c541219957abd938c4f9243d03782523c663dc --ignore-preflight-errors=Swap

......
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

app-tahjv1fe-2 joined the cluster without any problems. Now run the following on the master node to list the cluster's nodes:

# kubectl get node
NAME             STATUS   ROLES    AGE    VERSION
app-tahjv1fe-1   Ready    master   149m   v1.12.1
app-tahjv1fe-2   Ready    <none>   52m    v1.12.1

How to remove a node from the cluster

To remove the node app-tahjv1fe-2 from the cluster, run the following commands.

On the master node:

kubectl drain app-tahjv1fe-2 --delete-local-data --force --ignore-daemonsets
kubectl delete node app-tahjv1fe-2

On app-tahjv1fe-2:

kubeadm reset
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm -rf /var/lib/cni/

3. Deploy common Kubernetes components

More and more companies and teams are adopting Helm, the package manager for Kubernetes, and Helm will also be used here to install common Kubernetes components.

3.1 Install Helm

Helm consists of the helm client and the tiller server, and installing it is straightforward. Download the helm command-line tool to /usr/local/bin on the master node app-tahjv1fe-1; version 2.11.0 is used here:

wget https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-amd64.tar.gz
tar -zxvf helm-v2.11.0-linux-amd64.tar.gz
cd linux-amd64/
cp helm /usr/local/bin/
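
At this point only the client is installed, which can be confirmed with (the server part will not respond until tiller is deployed):

helm version -c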

To install the server-side tiller, kubectl and a kubeconfig file must also be set up on this machine so that kubectl can reach the apiserver and work properly. The app-tahjv1fe-1 node here already has kubectl configured.

Because the Kubernetes API server has RBAC access control enabled, a service account named tiller must be created for tiller and given an appropriate role. For simplicity, the built-in cluster-admin ClusterRole is bound to it directly. Create the rbac-config.yaml file:

cd ~/k8s/
vim rbac-config.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
# kubectl create -f rbac-config.yaml
serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created

Next, deploy tiller with helm:

helm init --service-account tiller --skip-refresh

If the command above is run as-is, it needs to reach https://kubernetes-charts.storage.googleapis.com and pull the tiller image, which is not possible from mainland China, so a different image is used instead. The image pulled by docker pull fengzos/tiller:v2.11.0 is retagged directly from gcr.io/kubernetes-helm/tiller:v2.11.0 and can be used as a drop-in replacement:

docker pull fengzos/tiller
helm init --service-account tiller --upgrade -i fengzos/tiller:latest  --skip-refresh

By default, tiller is deployed into the kube-system namespace of the cluster:

# kubectl get pod -n kube-system -l app=helm
NAME                            READY   STATUS    RESTARTS   AGE
tiller-deploy-8999b76d9-nx57p   1/1     Running   0          2m6s
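
Once the tiller Pod is Running, helm version should report both the client and the server; the server version depends on the tiller image that was actually deployed:

helm version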

3.2 Deploy Nginx Ingress with Helm

To make it easier to expose services in the cluster and reach them from outside, Nginx Ingress is deployed onto Kubernetes with Helm next. The Nginx Ingress Controller runs on the cluster's edge nodes; for simplicity there is only one edge node here.

app-tahjv1fe-1 (192.168.0.10) will also act as the edge node, so give it the corresponding label:

# kubectl label node app-tahjv1fe-1 node-role.kubernetes.io/edge=
node/app-tahjv1fe-1 labeled
# kubectl get node
NAME             STATUS   ROLES         AGE     VERSION
app-tahjv1fe-1   Ready    edge,master   4h59m   v1.12.1
app-tahjv1fe-2   Ready    <none>        3h22m   v1.12.1
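
The ingress controller itself can then be installed from the stable/nginx-ingress chart. The values below are only a sketch: they pin the controller and the default backend to the edge node via the label created above and tolerate the master taint in case it was not removed; the file name ingress-nginx.yaml is an assumption, and the available keys should be checked with helm inspect values stable/nginx-ingress before use:

cd ~/k8s/
vim ingress-nginx.yaml

controller:
  hostNetwork: true
  nodeSelector:
    node-role.kubernetes.io/edge: ''
  tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
defaultBackend:
  nodeSelector:
    node-role.kubernetes.io/edge: ''

helm install stable/nginx-ingress \
-n nginx-ingress \
--namespace ingress-nginx \
-f ingress-nginx.yaml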

3.3 Deploy the dashboard with Helm

Create the values file kubernetes-dashboard.yaml that is passed to helm install below:

ingress:
  enabled: true
  image: 
    repository: fengzos/kubernetes-dashboard-amd64
    tag: "v1.10.0"
    pullPolicy: IfNotPresent
  annotations:
    nginx.ingress.kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/secure-backends: "true"
rbac:
  clusterAdminRole: true
helm install stable/kubernetes-dashboard \
-n kubernetes-dashboard \
--namespace kube-system  \
-f kubernetes-dashboard.yaml
#kubectl -n kube-system get secret | grep kubernetes-dashboard-token
kubernetes-dashboard-token-tjj25                 kubernetes.io/service-account-token   3         37s

#kubectl describe -n kube-system secret/kubernetes-dashboard-token-tjj25
Name:         kubernetes-dashboard-token-tjj25
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name=kubernetes-dashboard
              kubernetes.io/service-account.uid=d19029f0-9cac-11e8-8d94-080027db403a

Type:  kubernetes.io/service-account-token

Data
====
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi10amoyNSIsImt1YmVy

Open https://192.168.0.10 in a browser and log in with the token above.

 
