Installing a Kubernetes 1.14.1 cluster with kubeadm on Alibaba Cloud CentOS 7 (including workarounds for the firewall and various pitfalls)

Latest recommended article published on 2024-07-30 15:32:42

huyongchao98


(I) Commands to run on all nodes (master and worker nodes)

1. Disable swap; otherwise Kubernetes will not start properly

swapoff -a

Run free -h; swap is disabled when the Swap row shows 0.
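Note that swapoff -a only lasts until the next reboot. A minimal sketch of making the change persistent, assuming swap is declared in an fstab-style file: the helper name disable_swap_in_fstab is hypothetical, and it takes the file path as a parameter so it can be tried on a copy first.

```shell
# Comment out every active swap entry in an fstab-style file.
# Hypothetical helper; defaults to /etc/fstab when no path is given.
disable_swap_in_fstab() {
    fstab="${1:-/etc/fstab}"
    tmp=$(mktemp) || return 1
    # Prefix "# " to non-comment lines that contain a whitespace-delimited "swap" field.
    if sed '/^[[:space:]]*#/!s/^\(.*[[:space:]]swap[[:space:]].*\)$/# \1/' "$fstab" > "$tmp"; then
        # Write back in place so the file keeps its owner and permissions.
        cat "$tmp" > "$fstab"
    fi
    rm -f "$tmp"
}

# Usage (as root), after checking the result on a copy:
#   swapoff -a && disable_swap_in_fstab /etc/fstab
```

Running it on a copy of /etc/fstab and diffing the result is a cheap way to confirm that only the swap line changes.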

2. Update the system:

sudo yum update -y

3. Install Docker

sudo yum install -y docker

Check the Docker version

sudo docker version

Enable Docker at boot and start it

sudo systemctl enable docker && sudo systemctl start docker

I installed Docker 1.13.1. To stay consistent with this guide, install the same version.

4. Install the Kubernetes packages

(1) First add the Alibaba Cloud yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

(2) Put SELinux into permissive mode

sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

(3) Install Kubernetes

sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
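The command above installs whatever version the repository currently offers. To stay on the 1.14.1 release named in the title, the packages can be pinned with yum's name-version syntax. A minimal sketch; the helper name k8s_packages is hypothetical, and it assumes the repository still carries that version:

```shell
# Print the three package names pinned to one version (hypothetical helper).
k8s_packages() {
    version="$1"
    printf 'kubelet-%s kubeadm-%s kubectl-%s\n' "$version" "$version" "$version"
}

# Usage:
#   sudo yum install -y $(k8s_packages 1.14.1) --disableexcludes=kubernetes
```

Keeping kubelet, kubeadm and kubectl on the same version avoids version-skew problems during kubeadm init.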

(4) Enable and start kubelet

sudo systemctl enable kubelet && sudo systemctl start kubelet

(II) Commands to run on the master node

(1) Open ports 6443 and 10250 in the firewall

sudo firewall-cmd --permanent --add-port=6443/tcp && sudo firewall-cmd --permanent --add-port=10250/tcp && sudo firewall-cmd --reload

(2) iptables settings

sudo bash -c 'cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF'

Apply the settings above:

sudo sysctl --system

Check that the br_netfilter module is loaded

sudo lsmod | grep br_netfilter

If the command prints a br_netfilter line, the setup succeeded.
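The lsmod check only shows that the module is loaded. kubeadm's preflight check reads the values under /proc/sys/net/bridge, so it can help to verify those directly. A minimal sketch; the function name check_bridge_sysctls is hypothetical, and the root directory is a parameter only so the function can be exercised against a test directory:

```shell
# Verify that both bridge netfilter sysctls are set to 1 (hypothetical helper).
check_bridge_sysctls() {
    root="${1:-/proc/sys}"
    for key in bridge-nf-call-iptables bridge-nf-call-ip6tables; do
        f="$root/net/bridge/$key"
        if [ ! -r "$f" ]; then
            # The files only exist while br_netfilter is loaded.
            echo "$key: missing (try: sudo modprobe br_netfilter)"
            return 1
        fi
        if [ "$(cat "$f")" != "1" ]; then
            echo "$key: not set to 1 (re-run: sudo sysctl --system)"
            return 1
        fi
    done
    echo "bridge sysctls OK"
}

# Usage:
#   check_bridge_sysctls
```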

(3) Kubernetes configuration

1. List the images kubeadm depends on

sudo kubeadm config images list

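The list shows images hosted on k8s.gcr.io, which is blocked by the firewall. These images are required to complete the installation, and the command that actually pulls them is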

sudo kubeadm config images pull

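Running it fails, again because of the firewall, so we take a different approach and run the following three commands:

Download the required images: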

kubeadm config images list |sed -e 's/^/docker pull /g' -e 's#k8s.gcr.io#docker.io/mirrorgooglecontainers#g' |sh -x

Retag the images

docker images |grep mirrorgooglecontainers |awk '{print "docker tag ",$1":"$2,$1":"$2}' |sed -e 's#docker.io/mirrorgooglecontainers#k8s.gcr.io#2' |sh -x

Remove the mirrorgooglecontainers images

docker images |grep mirrorgooglecontainers |awk '{print "docker rmi ", $1":"$2}' |sh -x

With this done, there is no need to run sudo kubeadm config images pull.
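The three pipelines above can also be written as one loop that pulls, retags and cleans up each image in turn. A minimal sketch under the same assumptions as the article (the docker.io/mirrorgooglecontainers mirror carries the images, except coredns, which this guide pulls from coredns/coredns); the function names to_mirror and pull_via_mirror are hypothetical:

```shell
MIRROR="docker.io/mirrorgooglecontainers"

# Map a k8s.gcr.io image name to the name it has on the mirror.
to_mirror() {
    case "$1" in
        # Assumption taken from this guide: coredns is not on the mirror.
        k8s.gcr.io/coredns:*) printf 'coredns/coredns:%s\n' "${1##*:}" ;;
        *) printf '%s\n' "$1" | sed "s#^k8s\.gcr\.io#$MIRROR#" ;;
    esac
}

# Read image names on stdin; pull each from the mirror, retag it, drop the mirror tag.
pull_via_mirror() {
    while read -r image; do
        src=$(to_mirror "$image")
        docker pull "$src" && docker tag "$src" "$image" && docker rmi "$src"
    done
}

# Usage:
#   kubeadm config images list | pull_via_mirror
```

Working on the names printed by kubeadm, rather than grepping docker images output, keeps the retagging independent of how the local Docker version prints repository names.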

2. Initialize Kubernetes

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --image-repository index.docker.io/mirrorgooglecontainers

The --image-repository flag works around the firewall when other images are pulled during installation.

With this flag added, the following error appears:

failed to pull image index.docker.io/mirrorgooglecontainers/coredns:1.3.1: output: Trying to pull repository docker.io/mirrorgooglecontainers/coredns ...

Pull the image manually first:

docker pull coredns/coredns:1.3.1
docker tag coredns/coredns:1.3.1 index.docker.io/mirrorgooglecontainers/coredns:1.3.1

If the following appears while running the commands above:

[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...

swap has not been disabled; disable it.

If the following appears:

[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with --ignore-preflight-errors=...

the iptables settings are wrong; repeat the iptables setup above.

After a successful run, the output looks like this:

[init] Using Kubernetes version: v1.14.0
[preflight] Running pre-flight checks
        [WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master-node kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.120]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master-node localhost] and IPs [192.168.0.120 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master-node localhost] and IPs [192.168.0.120 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.501860 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8s-master-node as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master-node as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 3j2pkk.xk7tnltycyz2xh5n
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.120:6443 --token khm95w.mo0wwenu2o9hglls \
    --discovery-token-ca-cert-hash sha256:aeb0ca593b63c8d674719858fd2397825825cebc552e3c165f00edb9671d6e32

Following the instructions in the output, run on the master:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Watch the status of the pods:

 watch kubectl get pods --all-namespaces

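If every pod is in the Running state, the cluster is healthy. During this process coredns stayed in the Pending state, which is expected until a pod network add-on is installed.

In that case, run the following commands, which install the Weave Net add-on: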

export KUBECONFIG=/etc/kubernetes/admin.conf

kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

After a short wait, coredns changes to the Running state.

Check the status of the kubelet service:

systemctl status -l kubelet

Two errors show up:

（1） kubelet[29703]: E0429 15:57:26.321596 29703 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service"

（2）kubelet[29703]: E0429 15:57:26.321633 29703 summary_sys_containers.go:47] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service"

Add the KUBECONFIG path to the shell profile

cat <<EOF >> ~/.bash_profile
export KUBECONFIG=/etc/kubernetes/admin.conf
EOF
source ~/.bash_profile

Edit /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf

Add this parameter:

Environment="KUBELET_MY_ARGS=--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice"

Append it to the end of the ExecStart command:

ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS $KUBELET_MY_ARGS

This resolves both errors once systemd has been reloaded and kubelet restarted.
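Edits to a systemd drop-in only take effect after systemd reloads its unit files and the service restarts. A minimal sketch that first checks the ExecStart line really references the new variable; the function name dropin_uses_my_args is hypothetical:

```shell
# Return 0 if the ExecStart line that launches kubelet references KUBELET_MY_ARGS.
dropin_uses_my_args() {
    grep -Eq '^ExecStart=/usr/bin/kubelet .*KUBELET_MY_ARGS' "$1"
}

# Usage:
#   conf=/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
#   dropin_uses_my_args "$conf" \
#     && sudo systemctl daemon-reload \
#     && sudo systemctl restart kubelet
```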

3. Install the dashboard

# Pull the image
docker pull registry.cn-qingdao.aliyuncs.com/wangxiaoke/kubernetes-dashboard-amd64:v1.10.0

# Retag it
docker tag registry.cn-qingdao.aliyuncs.com/wangxiaoke/kubernetes-dashboard-amd64:v1.10.0 k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0

# Remove the image that is no longer needed
docker image rm registry.cn-qingdao.aliyuncs.com/wangxiaoke/kubernetes-dashboard-amd64:v1.10.0

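# Deploy (the image tag referenced in this manifest must match the tag created above; adjust one of them if they differ)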
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml

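4. Access the dashboard

The Dashboard can be accessed in several ways:

1. kubectl proxy: only accepts requests from 127.0.0.1 and localhost, so remote access needs an SSH tunnel. Cumbersome and not recommended.
2. NodePort: easy to configure, but recommended only for development environments. This article uses this method.
3. Ingress: exposes the application through an Ingress Controller. Flexible and the most recommended method, but more complex. See: http://www.ebanban.com/?p=603
4. API Server: the API server is publicly exposed and reachable from outside, so this is a fairly recommended method. See: http://www.ebanban.com/?p=603

Here we use the second method:

(1) Edit the service configuration, changing type: ClusterIP to NodePort

Run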

kubectl edit service  kubernetes-dashboard --namespace=kube-system

(2) Check the externally exposed port

kubectl get service --namespace=kube-system
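In the output, the PORT(S) column of a NodePort service has the form port:nodePort/protocol, for example 443:30502/TCP. A minimal sketch for extracting the node port from such a value; the function name node_port is hypothetical:

```shell
# Extract the node port from a PORT(S) value such as "443:30502/TCP".
# Prints nothing when the value has no node port (e.g. "443/TCP" for ClusterIP).
node_port() {
    printf '%s\n' "$1" | sed -n 's#^[0-9]*:\([0-9]*\)/.*#\1#p'
}

# Usage (PORT(S) is the fifth column of the default output):
#   ports=$(kubectl get service kubernetes-dashboard --namespace=kube-system --no-headers | awk '{print $5}')
#   node_port "$ports"
```

kubectl can also print the field directly with -o jsonpath='{.spec.ports[0].nodePort}', which avoids parsing the table.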

(3) Create a dashboard user

Create a file named admin-token.yaml with the following content:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

kubectl create -f admin-token.yaml

(4) Get the token

kubectl describe secret/$(kubectl get secret -nkube-system |grep admin|awk '{print $1}') -nkube-system
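The describe output contains several fields, and only the value on the token: line is needed for login. A minimal sketch that filters it out; the function name extract_token is hypothetical, and it assumes the describe output has the usual "token:" label at the start of a line:

```shell
# Read `kubectl describe secret` output on stdin and print only the token value.
extract_token() {
    awk '$1 == "token:" { print $2 }'
}

# Usage:
#   kubectl describe secret/$(kubectl get secret -nkube-system | grep admin | awk '{print $1}') -nkube-system | extract_token
```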

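Log in to the dashboard from a browser. Enter https://192.168.80.132:30502/, that is, the external IP followed by the node port. The browser blocks access by default, so add the site to the trusted list, choose token login, and enter the token.

That completes the login.

(5) Add flannel support; otherwise a CNI initialization error is reported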

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml

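(III) Adding worker nodes

Before continuing, run part (I) on the worker node machine, then follow the steps below.

(1) Run the following command on the worker machine:

Use the join command printed at the end of kubeadm init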

kubeadm join 192.168.0.120:6443 --token khm95w.mo0wwenu2o9hglls \
    --discovery-token-ca-cert-hash sha256:aeb0ca593b63c8d674719858fd2397825825cebc552e3c165f00edb9671d6e32

The token is valid for only 24 hours. To obtain the hash value (run on the master):

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'

To generate a new token (run on the master):

kubeadm token create
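With a fresh token and the hash from the openssl pipeline, the join command can be reassembled. A minimal sketch; the function name build_join_cmd is hypothetical, and the endpoint is whatever address the master's API server listens on (192.168.0.120:6443 in this article):

```shell
# Assemble a kubeadm join command from its three parts (hypothetical helper).
#   $1 = API server endpoint, $2 = bootstrap token, $3 = CA cert hash (hex, without the "sha256:" prefix)
build_join_cmd() {
    printf 'kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' "$1" "$2" "$3"
}

# Usage (on the master):
#   token=$(kubeadm token create)
#   hash=$(openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //')
#   build_join_cmd 192.168.0.120:6443 "$token" "$hash"
```

kubeadm can also do this in one step: kubeadm token create --print-join-command creates a token and prints the complete join command.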

If systemctl status kubelet on a worker node reports warnings such as:

cni.go:171] Unable to update cni config: No networks found in /etc/cni/net.d
May 29 06:30:28 fnode kubelet[4136]: E0529 06:30:28.935309 4136 kubelet.go:2130] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Fix:

vim /var/lib/kubelet/kubeadm-flags.env

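Remove --network-plugin=cni and restart the kubelet service. Running kubectl get nodes on the master then shows the worker node changing from NotReady to Ready. CNI stands for Container Network Interface, a specification for networking between containers that is now a CNCF project.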