I. Kubernetes Installation
1. Kubernetes Architecture
Components running on the master node:
(1) etcd
Highly available key/value store
Only the apiserver has read/write access to it
An etcd cluster is used to ensure data reliability
(2) apiserver
The entry point to the Kubernetes system; exposes a REST API, and the objects it maintains are persisted to etcd
Authentication: certificates, tokens, basic auth
Authorization
Admission control
Service accounts
Resource limits
(3) kube-scheduler
Resource requirements
Service requirements
Hardware/software/policy constraints
Affinity and anti-affinity
Data locality
(4)kube-controller-manager
Replication controller
Endpoint controller
Namespace controller
ServiceAccount controller
Modules running on the worker nodes:
(5) kubelet
The node agent
Ensures that pods scheduled to this node are running and healthy
(6) kube-proxy
Network proxy for pods
Forwards TCP/UDP requests
Load balancing (round-robin)
Mechanisms:
(7) Service discovery
Every service is bound to a virtual IP address
Originally, service discovery was implemented by injecting environment variables into pods;
DNS: kube2sky, etcd, skydns
kube2sky periodically fetches the names and IP addresses of all current services from the apiserver and persists them to etcd; skydns periodically reads etcd and updates the DNS records accordingly
Note that this etcd instance is separate from the one used by the master
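For the environment-variable flavor of service discovery, Kubernetes derives variable names from the service name (uppercased, dashes replaced by underscores). A sketch of the naming convention, using a made-up service "my-db" with invented IP and port:

```shell
# A pod created after service "my-db" exists would see variables like these
# (10.96.0.12 and 5432 are stand-in values for this demo).
svc=my-db
prefix=$(echo "$svc" | tr 'a-z-' 'A-Z_')
echo "${prefix}_SERVICE_HOST=10.96.0.12"
echo "${prefix}_SERVICE_PORT=5432"
```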
(8) Networking
Pods communicate with each other without NAT; Open vSwitch, Flannel, or Weave can provide this
Containers can reach each other
Nodes and containers can reach each other
Each pod gets a globally unique IP
(9) High availability
kubelet keeps the services on each master node running
A system monitor keeps kubelet itself running
An etcd cluster
Multiple apiservers behind load balancing
Leader election keeps kube-scheduler and kube-controller-manager highly available
2. Installation
- Don't rely entirely on the official docs, but don't ignore them either; they are rather disorganized
- Manage the three machines with VM snapshots, so everything can be torn down and redone repeatedly
- Sync the time:
ntpdate 182.92.12.11
- Install common utilities:
yum install -y telnet psmisc net-tools bash-completion
(1) Installing kubeadm
See the official docs: https://kubernetes.io/docs/setup/independent/install-kubeadm/
a. Preparation
- Use firewalld and open the relevant ports
- Configure hostnames
- Disable swap
for i in iptables ip6tables ebtables
do
systemctl mask $i
done
yum install firewalld
systemctl unmask firewalld && systemctl enable firewalld && systemctl start firewalld
cat > /etc/firewalld/services/k8s.xml <<EOF
<?xml version="1.0" encoding="utf-8"?>
<service>
<short>k8s</short>
<description>for k8s api server/schedule server/control server......</description>
<port protocol="tcp" port="6443"/>
<port protocol="tcp" port="10250"/>
<port protocol="tcp" port="10251"/>
<port protocol="tcp" port="10252"/>
<port protocol="tcp" port="2379-2380"/>
<port protocol="tcp" port="30000-32767"/>
</service>
EOF
firewall-cmd --permanent --zone=trusted --add-service=k8s
systemctl reload firewalld
# Configure hosts
cat >> /etc/hosts <<EOF
10.40.2.228 client01
10.40.2.229 client02
10.40.2.230 server
EOF
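A quick way to confirm the entries resolve as intended is to look each name up (shown here against a throwaway copy of the file so it runs anywhere; on the real machines `getent hosts <name>` does the same job):

```shell
# Write the same entries to a temporary file and look each hostname up.
cat > /tmp/hosts-demo <<'EOF'
10.40.2.228 client01
10.40.2.229 client02
10.40.2.230 server
EOF

for h in client01 client02 server; do
  awk -v h="$h" '$2 == h { print h, "->", $1 }' /tmp/hosts-demo
done
```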
cat >> /etc/sysctl.conf <<EOF
vm.swappiness = 0
EOF
sysctl -p
swapoff -a
sed -i 's/\/dev\/mapper\/cl-swap/#\/dev\/mapper\/cl-swap/g' /etc/fstab
or, alternatively, via the kubelet systemd drop-in:
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"
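The fstab edit above is tied to one specific device path; a slightly more general sketch comments out any uncommented swap entry. Demonstrated on a temporary copy rather than the real /etc/fstab:

```shell
# Comment out every uncommented fstab entry whose mount point is "swap".
disable_swap_entries() {
  sed -i -E 's|^([^#].*[[:space:]]swap[[:space:]])|#\1|' "$1"
}

# Demo on a throwaway fstab copy.
tmpfstab=$(mktemp)
cat > "$tmpfstab" <<'EOF'
/dev/mapper/cl-root /    xfs  defaults 0 0
/dev/mapper/cl-swap swap swap defaults 0 0
EOF
disable_swap_entries "$tmpfstab"
cat "$tmpfstab"
```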
b. Installing Docker
See the Docker section earlier; below is the reference given in the official docs:
## Install prerequisites.
yum install yum-utils device-mapper-persistent-data lvm2
## Add docker repository.
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
## Install docker.
yum install docker-ce-17.06.2.ce # 17.03.3 is listed as compatible with Kubernetes 1.11.5 below, but the 17.03 series seems to have installation problems, so the 17.06 series is used here
## Create /etc/docker directory.
mkdir /etc/docker
# Note: use cgroupfs as the cgroup driver here, otherwise errors show up later; see the appendix
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=cgroupfs"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
]
}
EOF
# Restart docker.
systemctl daemon-reload && systemctl enable docker && systemctl restart docker
cat > /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl -p /etc/sysctl.d/99-kubernetes-cri.conf
c. Installing kubeadm, kubelet, and kubectl
If Google's repositories are unreachable, use the Aliyun mirror:
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
yum list kubeadm --showduplicates
# 1.11.5 is chosen here, a release that had just fixed a security vulnerability
yum install -y kubelet-1.11.5 kubeadm-1.11.5 kubectl-1.11.5 --disableexcludes=kubernetes
# Check the cgroup driver
$ docker info |grep -i cgroup
Cgroup Driver: systemd
# Adjust the kubelet startup parameters:
sed -i '/\[Service\]/aEnvironment="KUBELET_EXTRA_ARGS=--cgroup-driver=cgroupfs"' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
The official docs instead suggest:
cat > /etc/default/kubelet <<EOF
KUBELET_EXTRA_ARGS=--cgroup-driver=cgroupfs
EOF
Do not run the following yet (k8s 1.11 and later):
systemctl enable kubelet && systemctl start kubelet
Running it now fails with:
failed to load Kubelet config file /var/lib/kubelet/config.yaml, \
error failed to read kubelet config file "/var/lib/kubelet/config.yaml", \
error: open /var/lib/kubelet/config.yaml: no such file or directory
Appendix: building from source or installing from binaries:
- The various Kubernetes distribution packages are built from the GitHub project https://github.com/kubernetes/kubernetes/releases. Clone the project; as a CentOS user, enter the rpm directory and run the docker-build.sh script on a machine with Docker installed to build the RPM packages, which end up in the output directory.
- The binary packages downloadable from https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG.md#downloads-for-v174 contain the kubeadm, kubectl, and kubelet commands, some other common tools, and several images, though the image set is not complete
d. Pulling the images and initializing
Since version 1.11, systemctl start kubelet does not work right away, because kubeadm init has not yet generated the config kubelet needs
# List the images this version of k8s depends on
$ kubeadm config images list
k8s.gcr.io/kube-apiserver-amd64:v1.11.5
k8s.gcr.io/kube-controller-manager-amd64:v1.11.5
k8s.gcr.io/kube-scheduler-amd64:v1.11.5
k8s.gcr.io/kube-proxy-amd64:v1.11.5
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd-amd64:3.2.18
k8s.gcr.io/coredns:1.1.3
# The following is usually unusable because of the GFW; if the network does reach gcr.io, the docker service must be running
kubeadm config images pull
# If gcr.io is unreachable, the images can be obtained as follows:
Obtaining the images:
- Reference: https://mritd.me/2016/10/29/set-up-kubernetes-cluster-by-kubeadm
- Option 1: on a server outside the firewall, pull the images, save them to tar files, scp them over, and docker load them locally;
- Option 2: use Docker Hub as a relay: set up a Docker Hub automated build from a GitHub repo whose Dockerfiles simply FROM the gcr.io images, then pull the results locally and re-tag them.
Below is an early walkthrough of pulling the 1.7.4 images via GitHub:
a.
First create a GitHub project:
https://github.com/jiangmf-china/kubernetes-images
b.
Each Dockerfile only needs a FROM line; for example, the apiserver one:
FROM gcr.io/google_containers/kube-apiserver-amd64:v1.7.4
MAINTAINER jmf <jmf240625394@163.com>
c.
Finally, create an automated build project on Docker Hub:
jiangmfchina258
jmf240625394@163.com
The GitHub account needs to be linked
build settings:
Type Name Dockerfile Docker Tag Name
Branch master /etcd-adm64/ etcd-3.0.17
Branch master /k8s-dns-dnsmasq-nanny-adm64/ k8s-nanny-1.14.4
Branch master /k8s-dns-kube-dns-adm64/ k8s-dns-1.14.4
Branch master /k8s-dns-sidecar-adm64/ k8s-sidecar-1.14.4
Branch master /kube-apiserver-adm64/ api-v1.7.4
Branch master /kube-controller-manager-adm64/ controller-v1.7.4
Branch master /kube-proxy-adm64/ proxy-v1.7.4
Branch master /kube-scheduler-adm64/ scheduler-v1.7.4
Branch master /pause-adm64/ pause-v1.7.4
Then trigger the build manually!
And pull the built images down:
docker pull jiangmfchina258/kubernetes-images:pause-v1.7.4
docker pull jiangmfchina258/kubernetes-images:etcd-3.0.17
docker pull jiangmfchina258/kubernetes-images:k8s-nanny-1.14.4
docker pull jiangmfchina258/kubernetes-images:k8s-dns-1.14.4
docker pull jiangmfchina258/kubernetes-images:k8s-sidecar-1.14.4
docker pull jiangmfchina258/kubernetes-images:api-v1.7.4
docker pull jiangmfchina258/kubernetes-images:controller-v1.7.4
docker pull jiangmfchina258/kubernetes-images:proxy-v1.7.4
docker pull jiangmfchina258/kubernetes-images:scheduler-v1.7.4
[root@server01 system]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/jiangmfchina258/kubernetes-images pause-v1.7.4 504909098456 15 minutes ago 746.9 kB
docker.io/jiangmfchina258/kubernetes-images scheduler-v1.7.4 b1b5991da708 16 minutes ago 77.2 MB
docker.io/jiangmfchina258/kubernetes-images proxy-v1.7.4 7ae5f741a8a8 17 minutes ago 114.7 MB
docker.io/jiangmfchina258/kubernetes-images controller-v1.7.4 c0b348f818f1 18 minutes ago 138 MB
docker.io/jiangmfchina258/kubernetes-images api-v1.7.4 d58dddd089ca 19 minutes ago 186.1 MB
docker.io/jiangmfchina258/kubernetes-images k8s-sidecar-1.14.4 0cff65382abf 20 minutes ago 41.81 MB
docker.io/jiangmfchina258/kubernetes-images k8s-dns-1.14.4 21826c23ab4f 21 minutes ago 49.38 MB
docker.io/jiangmfchina258/kubernetes-images k8s-nanny-1.14.4 706c2cb993cd 22 minutes ago 41.41 MB
docker.io/jiangmfchina258/kubernetes-images etcd-3.0.17 a4274784e44e 26 minutes ago 168.9 MB
docker tag a4274784e44e gcr.io/google_containers/etcd-amd64:3.0.17
docker tag 706c2cb993cd gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4
docker tag 21826c23ab4f gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.4
docker tag 0cff65382abf gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.4
docker tag d58dddd089ca gcr.io/google_containers/kube-apiserver-amd64:v1.7.4
docker tag c0b348f818f1 gcr.io/google_containers/kube-controller-manager-amd64:v1.7.4
docker tag 7ae5f741a8a8 gcr.io/google_containers/kube-proxy-amd64:v1.7.4
docker tag b1b5991da708 gcr.io/google_containers/kube-scheduler-amd64:v1.7.4
docker tag 504909098456 gcr.io/google_containers/pause-amd64:3.0
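The re-tagging above can be driven from a single map instead of nine hand-typed commands. A bash sketch (only three of the images are shown; extend the map as needed), done as a dry run so the generated commands can be inspected before executing:

```shell
#!/usr/bin/env bash
# Map Docker Hub tags to the gcr.io names kubeadm expects.
declare -A map=(
  [etcd-3.0.17]="gcr.io/google_containers/etcd-amd64:3.0.17"
  [api-v1.7.4]="gcr.io/google_containers/kube-apiserver-amd64:v1.7.4"
  [pause-v1.7.4]="gcr.io/google_containers/pause-amd64:3.0"
)

# Build the commands first (dry run); pipe to sh once they look right.
cmds=()
for tag in "${!map[@]}"; do
  cmds+=("docker tag jiangmfchina258/kubernetes-images:${tag} ${map[$tag]}")
done
printf '%s\n' "${cmds[@]}"
```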
docker rmi docker.io/jiangmfchina258/kubernetes-images:pause-v1.7.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:scheduler-v1.7.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:proxy-v1.7.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:controller-v1.7.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:api-v1.7.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:k8s-sidecar-1.14.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:k8s-dns-1.14.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:k8s-nanny-1.14.4
docker rmi docker.io/jiangmfchina258/kubernetes-images:etcd-3.0.17
[root@server01 system]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
gcr.io/google_containers/pause-amd64 3.0 504909098456 42 minutes ago 746.9 kB
gcr.io/google_containers/kube-scheduler-amd64 v1.7.4 b1b5991da708 43 minutes ago 77.2 MB
gcr.io/google_containers/kube-proxy-amd64 v1.7.4 7ae5f741a8a8 44 minutes ago 114.7 MB
gcr.io/google_containers/kube-controller-manager-amd64 v1.7.4 c0b348f818f1 45 minutes ago 138 MB
gcr.io/google_containers/kube-apiserver-amd64 v1.7.4 d58dddd089ca 46 minutes ago 186.1 MB
gcr.io/google_containers/k8s-dns-sidecar-amd64 1.14.4 0cff65382abf 47 minutes ago 41.81 MB
gcr.io/google_containers/k8s-dns-kube-dns-amd64 1.14.4 21826c23ab4f 48 minutes ago 49.38 MB
gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64 1.14.4 706c2cb993cd 49 minutes ago 41.41 MB
gcr.io/google_containers/etcd-amd64 3.0.17 a4274784e44e 53 minutes ago 168.9 MB
docker save gcr.io/google_containers/pause-amd64:3.0 >pause-3.0.tar
docker save gcr.io/google_containers/kube-scheduler-amd64:v1.7.4 >kube-scheduler-amd64-v1.7.4.tar
docker save gcr.io/google_containers/kube-proxy-amd64:v1.7.4 >kube-proxy-amd64-v1.7.4.tar
docker save gcr.io/google_containers/kube-controller-manager-amd64:v1.7.4 >kube-controller-manager-amd64-v1.7.4.tar
docker save gcr.io/google_containers/kube-apiserver-amd64:v1.7.4 >kube-apiserver-amd64-v1.7.4.tar
docker save gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.4 >k8s-dns-sidecar-amd64-1.14.4.tar
docker save gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.4 >k8s-dns-kube-dns-amd64-1.14.4.tar
docker save gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.4 >k8s-dns-dnsmasq-nanny-amd64-1.14.4.tar
docker save gcr.io/google_containers/etcd-amd64:3.0.17 >etcd-amd64-3.0.17.tar
(2) Master deployment:
Notes:
Load the images on every node:
for i in *.tar; do docker load -i "$i"; done
If flannel will be used, kubeadm init must be given --pod-network-cidr 172.17.0.0/16
[root@server k8s-v1.11.5]# kubeadm init --kubernetes-version v1.11.5 --pod-network-cidr 172.17.0.0/16 --apiserver-advertise-address 10.40.2.230
[init] using Kubernetes version: v1.11.5
[preflight] running pre-flight checks
[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
I1207 17:01:26.872059 2198 kernel_validator.go:81] Validating kernel version
I1207 17:01:26.872586 2198 kernel_validator.go:96] Validating kernel config
[WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 17.06.2-ce. Max validated version: 17.03
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [server kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.40.2.230]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [server localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [server localhost] and IPs [10.40.2.230 127.0.0.1 ::1]
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 44.503683 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.11" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node server as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node server as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "server" as an annotation
[bootstraptoken] using token: klofob.su0y8q6wb0n0ec0a
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join 10.40.2.230:6443 --token klofob.su0y8q6wb0n0ec0a --discovery-token-ca-cert-hash sha256:515b6f79e29fe60bf7e15c7b60626b81d8a0cc8c77ef5204b283a72156da7c3d
Set up the environment on the server as prompted above.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
[root@server ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
server NotReady master 1m v1.11.5
The node is NotReady because no network add-on has been configured yet; flannel is set up below.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
# This may take a while; images still have to be pulled, etc.
Note:
At first the stock yaml above was applied directly, and then the network suddenly broke; after a long, fruitless debugging session the only fix was to delete the test instances (fortunately only a few) and then delete the flannel pods.
Then edit the kube-flannel.yml downloaded above:
net-conf.json: |
{
"Network": "172.17.0.0/16",
"Backend": {
"Type": "host-gw"
}
}
Change the backend Type to host-gw and make Network match the value passed to kubeadm init on the command line.
Once the pods come up, both minions can ping the DNS server, and the relevant routes appear on the minions.
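The two edits (Network and Backend type) can also be applied with sed, assuming the upstream defaults of 10.244.0.0/16 and vxlan. Demonstrated here on a local copy of just the net-conf.json fragment so it runs anywhere:

```shell
# Stand-in for the net-conf.json block inside kube-flannel.yml.
demo=/tmp/net-conf-demo.json
cat > "$demo" <<'EOF'
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}
EOF

# Match the --pod-network-cidr given to kubeadm init, and switch to host-gw.
sed -i -e 's|10\.244\.0\.0/16|172.17.0.0/16|' -e 's|"vxlan"|"host-gw"|' "$demo"
cat "$demo"
```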
[root@server ~]# kubectl get pods -n kube-system -o=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
coredns-78fcdf6894-8rklb 1/1 Running 0 5m 172.17.0.3 server <none>
coredns-78fcdf6894-fxpm6 1/1 Running 0 5m 172.17.0.2 server <none>
etcd-server 1/1 Running 0 4m 10.40.2.230 server <none>
kube-apiserver-server 1/1 Running 0 4m 10.40.2.230 server <none>
kube-controller-manager-server 1/1 Running 0 4m 10.40.2.230 server <none>
kube-flannel-ds-amd64-kpnq9 1/1 Running 0 2m 10.40.2.230 server <none>
kube-proxy-vwx7k 1/1 Running 0 5m 10.40.2.230 server <none>
kube-scheduler-server 1/1 Running 0 4m 10.40.2.230 server <none>
[root@server ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health": "true"}
[root@server ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
server Ready master 6m v1.11.5
(3) Minion deployment
- Load the images
- systemctl reload firewalld # run on the master
- Join the master:
kubeadm join 10.40.2.230:6443 --token klofob.su0y8q6wb0n0ec0a --discovery-token-ca-cert-hash sha256:515b6f79e29fe60bf7e15c7b60626b81d8a0cc8c77ef5204b283a72156da7c3d
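If the CA cert hash for the join command is lost, it can be recomputed: per the kubeadm docs it is the SHA-256 digest of the DER-encoded public key of /etc/kubernetes/pki/ca.crt. Demonstrated on a throwaway self-signed certificate so the pipeline runs anywhere:

```shell
# Generate a throwaway CA cert (stands in for /etc/kubernetes/pki/ca.crt).
openssl req -x509 -newkey rsa:2048 -nodes -days 1 -subj "/CN=demo-ca" \
  -keyout /tmp/demo-ca.key -out /tmp/demo-ca.crt 2>/dev/null

# Extract the public key, DER-encode it, and hash it.
hash=$(openssl x509 -pubkey -in /tmp/demo-ca.crt |
  openssl rsa -pubin -outform der 2>/dev/null |
  openssl dgst -sha256 -hex | sed 's/^.* //')
echo "sha256:$hash"
```

On the real master, point the first openssl command at /etc/kubernetes/pki/ca.crt instead.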
Back on the master, run:
[root@server ~]# kubectl get pods -n kube-system -o=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
coredns-78fcdf6894-8rklb 1/1 Running 2 13m 172.17.0.3 server <none>
coredns-78fcdf6894-fxpm6 1/1 Running 2 13m 172.17.0.2 server <none>
etcd-server 1/1 Running 0 13m 10.40.2.230 server <none>
kube-apiserver-server 1/1 Running 0 12m 10.40.2.230 server <none>
kube-controller-manager-server 1/1 Running 0 13m 10.40.2.230 server <none>
kube-flannel-ds-amd64-kpnq9 1/1 Running 0 11m 10.40.2.230 server <none>
kube-flannel-ds-amd64-qcnbq 1/1 Running 0 2m 10.40.2.228 client01 <none>
kube-flannel-ds-amd64-zhd76 1/1 Running 0 4m 10.40.2.229 client02 <none>
kube-proxy-h47q9 1/1 Running 0 2m 10.40.2.228 client01 <none>
kube-proxy-qw6m2 1/1 Running 0 4m 10.40.2.229 client02 <none>
kube-proxy-vwx7k 1/1 Running 0 13m 10.40.2.230 server <none>
kube-scheduler-server 1/1 Running 0 12m 10.40.2.230 server <none>
[root@server ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
client01 Ready <none> 2m v1.11.5
client02 Ready <none> 5m v1.11.5
server Ready master 14m v1.11.5
[root@server ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health": "true"}
After several days of fiddling, a basic environment for practice is now complete
Test results will be posted in later sections
3. Appendix:
(1) Resetting the cluster
Fix the configuration, then start again; if the config files already exist, a reset clears them
In a test environment, run the following command on all nodes to redeploy the cluster from scratch:
kubeadm reset
(2) The master node does not schedule pods
By default pods are not scheduled onto the master node; remove the taint with:
# kubectl taint nodes --all node-role.kubernetes.io/master-
node "server01.jmf.com" untainted
(3) Troubleshooting 1:
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The kubeadm init output on the master already gave the fix:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Or set an environment variable:
vi ~/.bash_profile
export KUBECONFIG=/etc/kubernetes/admin.conf
source ~/.bash_profile
(4) Troubleshooting 2:
[root@server run]# kubectl get pods --all-namespaces -o=wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
kube-system coredns-78fcdf6894-8bx45 0/1 ContainerCreating 0 1h <none> server <none>
kube-system coredns-78fcdf6894-gl9k9 0/1 ContainerCreating 0 1h <none> server <none>
kube-system etcd-server 1/1 Running 0 1h 10.40.2.230 server <none>
kube-system kube-apiserver-server 1/1 Running 0 1h 10.40.2.230 server <none>
kube-system kube-controller-manager-server 1/1 Running 0 1h 10.40.2.230 server <none>
kube-system kube-flannel-ds-amd64-hskkl 0/1 CrashLoopBackOff 12 42m 10.40.2.229 client02 <none>
kube-system kube-flannel-ds-amd64-j29cz 0/1 CrashLoopBackOff 14 49m 10.40.2.230 server <none>
kube-system kube-flannel-ds-amd64-qfksq 0/1 CrashLoopBackOff 12 39m 10.40.2.228 client01 <none>
kube-system kube-proxy-77ffs 1/1 Running 0 39m 10.40.2.228 client01 <none>
kube-system kube-proxy-mbmcf 1/1 Running 0 1h 10.40.2.230 server <none>
kube-system kube-proxy-v6mlt 1/1 Running 0 42m 10.40.2.229 client02 <none>
kube-system kube-scheduler-server 1/1 Running 0 1h 10.40.2.230 server <none>
kube-system kubernetes-dashboard-767dc7d4d-f8ztl 0/1 ContainerCreating 0 31m <none> client02 <none>
Several pods are stuck in non-Running states???
The logs show: open /run/flannel/subnet.env: no such file or directory
Reference: https://github.com/kubernetes/kubernetes/issues/36575
(1)If you want to use flannel CNI, make sure you use --pod-network-cidr to kubeadm init on the master node as below:
kubeadm init --pod-network-cidr=10.244.0.0/16
(2)Install a pod network
When using CNI, you must install a pod network add-on for pods to communicate with each other. This should be done before joining minions to the cluster. The kube-dns will wait in ContainerCreating status until a pod network is installed. You can choose any one of the add-ons that meets your needs.
For example, if using flannel download the flannel yaml file and run as follows. You just need to run it only on the master node.
kubectl apply -f flannel.yaml
(3)Make sure your network pod (s) (in this example flannel) are running. Once flannel network pod is up running, kube-dns pod will move to running state as well.
kubectl get pods -n kube-system -o=wide
Detailed troubleshooting:
[root@server log]# kubectl describe pod -n kube-system kube-flannel-ds-amd64-5r6pz
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 4m kubelet, server Container image "quay.io/coreos/flannel:v0.10.0-amd64" already present on machine
Normal Created 4m kubelet, server Created container
Normal Started 4m kubelet, server Started container
Normal Pulled 2m (x5 over 4m) kubelet, server Container image "quay.io/coreos/flannel:v0.10.0-amd64" already present on machine
Normal Created 2m (x5 over 4m) kubelet, server Created container
Warning Failed 2m (x5 over 4m) kubelet, server Error: failed to start container "kube-flannel": Error response from daemon: oci runtime error: container_linux.go:262: starting container process caused "process_linux.go:261: applying cgroup configuration for process caused \"No such device or address\""
Warning BackOff 2m (x7 over 4m) kubelet, server Back-off restarting failed container
kubectl logs --namespace kube-system kube-flannel-ds-amd64-46tn5
View the logs
The fix, following the gist below:
https://gist.github.com/MOZGIII/22bf4eb811ff5d4e0bbe36444422b6d3
Set the cgroup driver to cgroupfs; it was initially set to systemd
[root@server ~]# kubectl delete node client01
node "client01" deleted
[root@server ~]# kubectl delete node client02
node "client02" deleted
Redeploy; everything is OK
(5) Installing other add-ons (ported directly from older notes):
See the official docs:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/
dashboard:
kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
# kubectl describe svc kubernetes-dashboard --namespace=kube-system
Name: kubernetes-dashboard
Namespace: kube-system
Labels: k8s-app=kubernetes-dashboard
Annotations: <none>
Selector: k8s-app=kubernetes-dashboard
Type: ClusterIP
IP: 10.98.76.131
Port: <unset> 80/TCP
Endpoints:
Session Affinity: None
Events: <none>
It does not seem to be reachable???
# kubectl get pod --namespace=kube-system
NAME READY STATUS RESTARTS AGE
etcd-server01.jmf.com 1/1 Running 0 1h
kube-apiserver-server01.jmf.com 1/1 Running 0 1h
kube-controller-manager-server01.jmf.com 1/1 Running 0 1h
kube-dns-2425271678-x56wm 3/3 Running 0 1h
kube-flannel-ds-lndll 2/2 Running 0 1h
kube-proxy-59f0m 1/1 Running 0 1h
kube-scheduler-server01.jmf.com 1/1 Running 0 1h
kubernetes-dashboard-3313488171-9z61w 0/1 ImagePullBackOff 0 10m
In short: the image pull is stuck!
kubectl create -f https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
Without the image locally it hangs, so do it by hand:
wget https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml -O kubernetes-dashboard.yaml
The file contains:
image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.3
So the image has to be loaded onto the node beforehand!
# kubectl describe svc kubernetes-dashboard --namespace=kube-system
Name: kubernetes-dashboard
Namespace: kube-system
Labels: k8s-app=kubernetes-dashboard
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kubernetes-dashboard"},"name":"kubernetes-dashboard","namespace":...
Selector: k8s-app=kubernetes-dashboard
Type: ClusterIP
IP: 10.98.76.131
Port: <unset> 80/TCP
Endpoints: 10.96.0.4:9090
Session Affinity: None
Events: <none>
# kubectl proxy
Starting to serve on 127.0.0.1:8001
This way it is only reachable locally
# kubectl proxy --address='192.168.9.201' --accept-hosts='^*$'
http://192.168.9.201:8001/ui errors out:
I0829 13:44:08.580052 24530 logs.go:41] http: proxy error: dial tcp 127.0.0.1:8080: getsockopt: connection refused
Running it again from another terminal then worked!
journalctl -f -u kube-apiserver # for troubleshooting
A username and password need to be configured:
https://kubernetes.io/docs/admin/authentication/
To recover the credentials: kubectl config view
Reference: http://dockone.io/article/2514
(6) About the flannel network
Save its yaml file with curl and study it
The default backend type should be vxlan
curl https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml -o kube-flannel.yml
(7) Deployment summary:
- Many production deployments are still based on 1.6 or earlier and installed from binaries, which makes high availability and load balancing easier to arrange!
- Deploying with kubeadm is the trend, but highly available cluster deployment remains its biggest pain point with many open questions; unless you know the kubeadm source well, this approach is not recommended for production!