I. Machine Overview

Host   | IP            | Spec                           | OS
master | 192.168.0.160 | 2 vCPU / 4 GB RAM / 50 GB disk | CentOS 7.8
node01 | 192.168.0.6   | 2 vCPU / 4 GB RAM / 50 GB disk | CentOS 7.8
node02 | 192.168.0.167 | 2 vCPU / 4 GB RAM / 50 GB disk | CentOS 7.8
II. Machine Setup
The following steps must be performed on every node.
1. Set the hostname
Set the appropriate hostname on each node.
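A minimal sketch using hostnamectl, with the hostnames from the table in section I; run the matching command on each node:

hostnamectl set-hostname master   # on 192.168.0.160
hostnamectl set-hostname node01   # on 192.168.0.6
hostnamectl set-hostname node02   # on 192.168.0.167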
2. Configure /etc/hosts
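A sketch that appends the cluster entries (IPs from section I) to /etc/hosts on every node:

cat >> /etc/hosts << EOF
192.168.0.160 master
192.168.0.6 node01
192.168.0.167 node02
EOF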
3. Disable the firewall and SELinux
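A typical sketch: setenforce takes effect immediately, the sed edit makes the change survive reboots.

# Stop firewalld and keep it off after reboot
systemctl stop firewalld
systemctl disable firewalld
# Put SELinux into permissive mode now, and disable it permanently
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config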
4. Disable the swap partition
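A sketch; the kubelet refuses to start while swap is enabled, so turn it off now and comment it out of /etc/fstab so it stays off after reboot:

swapoff -a
sed -i '/ swap / s/^/#/' /etc/fstab
# Verify: the Swap line should read 0
free -m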
5. Create /etc/sysctl.d/k8s.conf with the following content
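The exact contents of the original file were not preserved; the following values are the commonly used set for kubeadm clusters. The net.bridge.* keys require the br_netfilter module:

cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf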
6. Prerequisites for enabling IPVS in kube-proxy
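A sketch of the usual prerequisite: load the IPVS kernel modules (nf_conntrack_ipv4 applies to the 3.10 kernel that CentOS 7.8 ships) and install the management tools:

cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF
chmod 755 /etc/sysconfig/modules/ipvs.modules
bash /etc/sysconfig/modules/ipvs.modules
# Confirm the modules are loaded
lsmod | grep -e ip_vs -e nf_conntrack_ipv4
# ipset/ipvsadm make the IPVS rules easier to inspect
yum install -y ipset ipvsadm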
III. Install Docker
The following steps must be performed on every node.
1. Configure the Aliyun Docker yum repository
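A sketch that adds Docker's repo file from the Aliyun mirror:

yum install -y yum-utils
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache fast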
2. Install Docker
yum installs the latest version by default, but for compatibility a specific version is pinned here.
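The exact version the original pinned is not recorded; 19.03 was the latest release validated by Kubernetes at the time (see the version warning in section VI), so for example:

# List the available versions, then pin one (19.03.9 chosen for illustration)
yum list docker-ce --showduplicates | sort -r
yum install -y docker-ce-19.03.9-3.el7 docker-ce-cli-19.03.9-3.el7 containerd.io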
3. Set Docker's cgroup driver
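kubeadm recommends the systemd cgroup driver (the preflight warning in section VI flags cgroupfs); a sketch via /etc/docker/daemon.json:

mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF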
4. Start Docker and enable it at boot
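For example:

systemctl daemon-reload
systemctl start docker
systemctl enable docker
# Confirm the cgroup driver change took effect
docker info | grep -i cgroup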
IV. Install Kubernetes with kubeadm
Steps 1-4 must be run on all nodes; steps 5-6 only on the master node.
1. Configure the yum repository
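A sketch using the Aliyun Kubernetes mirror (GPG checks disabled here for simplicity):

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF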
2. Install kubelet, kubeadm, and kubectl
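For example (the logs in section VI show v1.20.2; append a version suffix such as kubelet-1.20.2 to pin a release):

yum install -y kubelet kubeadm kubectl
# Enable kubelet; it will crash-loop until kubeadm init/join runs, which is expected (see section VI)
systemctl enable kubelet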
3. Change the kubelet cgroup driver
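How the original did this is not recorded; one common approach with the kubeadm RPM packages is the /etc/sysconfig/kubelet drop-in, matching the systemd driver configured for Docker above:

cat > /etc/sysconfig/kubelet << EOF
KUBELET_EXTRA_ARGS=--cgroup-driver=systemd
EOF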
4. Pull the required images
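The original pull script is not reproduced here; a common pattern is to pull from a domestic mirror and retag to the names kubeadm expects (the registry.aliyuncs.com/google_containers mirror is an assumption):

# Show the images this kubeadm release needs
kubeadm config images list
# Pull each from the mirror, retag to k8s.gcr.io, then drop the mirror tag
for img in $(kubeadm config images list | awk -F'/' '{print $NF}'); do
    docker pull registry.aliyuncs.com/google_containers/$img
    docker tag registry.aliyuncs.com/google_containers/$img k8s.gcr.io/$img
    docker rmi registry.aliyuncs.com/google_containers/$img
done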
5. Initialize the cluster
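The exact init flags used are not recorded; a sketch on the master, with the advertise address from section I, the version from the logs in section VI, and the 10.244.0.0/16 pod CIDR that Flannel's default manifest expects:

kubeadm init \
  --apiserver-advertise-address=192.168.0.160 \
  --kubernetes-version=v1.20.2 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16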
When output like the following appears, the initialization succeeded.
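This excerpt is the tail of the full kubeadm init log reproduced in section VI; the IP, token, and hash will differ for your cluster:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.0.160:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>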
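As the message instructs, configure kubectl access on the master before proceeding:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config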
6. Deploy a pod network plugin (choose one of the following)
(1) Flannel
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply -f kube-flannel.yml
# To reinstall, first delete the network resources that were created
kubectl delete -f kube-flannel.yml
(2) Calico
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
7. Join the nodes to the cluster
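Run the join command that kubeadm init printed, as root on node01 and node02; <token> and <hash> below are placeholders for your cluster's values (see section VII if the token has expired):

kubeadm join 192.168.0.160:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>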
8. Check the cluster status
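For example, on the master; all three nodes should report Ready once the network plugin pods come up:

kubectl get nodes
kubectl get pods -n kube-system -o wide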
V. Image downloads
I have packaged the relevant images and uploaded them to a cloud drive; help yourself if you need them.
Link: https://pan.baidu.com/s/1XKN32WXiXmp6XKlsgw-xGw  Extraction code: q8hs
VI. Troubleshooting: the kubelet service fails to start
1. Reproducing the problem
[root@leoheng-k8s ~]# systemctl enable kubelet && systemctl start kubelet
[root@leoheng-k8s ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: activating (auto-restart) (Result: exit-code) since Thu 2021-01-28 09:56:54 CST; 7s ago
     Docs: https://kubernetes.io/docs/
  Process: 2717 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
 Main PID: 2717 (code=exited, status=255)

Jan 28 09:56:54 leoheng-k8s systemd[1]: Unit kubelet.service entered failed state.
Jan 28 09:56:54 leoheng-k8s systemd[1]: kubelet.service failed.
2. Resolution
1. Disable SELinux and firewalld, and make the kubelet cgroup driver match Docker's.
2. Check the official documentation:
The kubelet is now restarting every few seconds, as it waits in a crashloop for kubeadm to tell it what to do. This crashloop is expected and normal, please proceed with the next step and the kubelet will start running normally.
3. Run the Kubernetes initialization:
[root@leoheng-k8s ~]# kubeadm init
[init] Using Kubernetes version: v1.20.2
[preflight] Running pre-flight checks
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.2. Latest validated version: 19.03
	[WARNING Hostname]: hostname "leoheng-k8s" could not be reached
	[WARNING Hostname]: hostname "leoheng-k8s": lookup leoheng-k8s on 100.100.2.138:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local leoheng-k8s] and IPs [10.96.0.1 172.18.192.80]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [leoheng-k8s localhost] and IPs [172.18.192.80 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [leoheng-k8s localhost] and IPs [172.18.192.80 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 17.002755 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node leoheng-k8s as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node leoheng-k8s as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 1mxqc0.zwbjh8m8g5d35foe
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.18.192.80:6443 --token 1mxqc0.zwbjh8m8g5d35foe \
    --discovery-token-ca-cert-hash sha256:09949a2800da77bd71d046080fe5a75662472a32d0070305ce1fc7457a642b2d
4. Check the kubelet service again; it is now running:
[root@leoheng-k8s ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Thu 2021-01-28 09:58:49 CST; 2min 8s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 4088 (kubelet)
    Tasks: 13
   Memory: 37.8M
   CGroup: /system.slice/kubelet.service
           └─4088 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/...

Jan 28 10:00:34 leoheng-k8s kubelet[4088]: W0128 10:00:34.351824    4088 cni.go:239] Unable to update cni ...net.d
Jan 28 10:00:36 leoheng-k8s kubelet[4088]: E0128 10:00:36.029903    4088 kubelet.go:2163] Container runtim...lized
Jan 28 10:00:39 leoheng-k8s kubelet[4088]: W0128 10:00:39.352002    4088 cni.go:239] Unable to update cni ...net.d
Jan 28 10:00:41 leoheng-k8s kubelet[4088]: E0128 10:00:41.044753    4088 kubelet.go:2163] Container runtim...lized
Jan 28 10:00:44 leoheng-k8s kubelet[4088]: W0128 10:00:44.352167    4088 cni.go:239] Unable to update cni ...net.d
Jan 28 10:00:46 leoheng-k8s kubelet[4088]: E0128 10:00:46.060000    4088 kubelet.go:2163] Container runtim...lized
Jan 28 10:00:49 leoheng-k8s kubelet[4088]: W0128 10:00:49.352345    4088 cni.go:239] Unable to update cni ...net.d
Jan 28 10:00:51 leoheng-k8s kubelet[4088]: E0128 10:00:51.075055    4088 kubelet.go:2163] Container runtim...lized
Jan 28 10:00:54 leoheng-k8s kubelet[4088]: W0128 10:00:54.352524    4088 cni.go:239] Unable to update cni ...net.d
Jan 28 10:00:56 leoheng-k8s kubelet[4088]: E0128 10:00:56.090169    4088 kubelet.go:2163] Container runtim...lized
Hint: Some lines were ellipsized, use -l to show in full.
[root@leoheng-k8s ~]#
VII. Troubleshooting: node join fails with "couldn't validate the identity of the API Server"
The error message is as follows:
[root@k8s-node2 k8s]# kubeadm join 192.168.1.200:6443 --token ov6qse.lvw984yn30c96p9o --discovery-token-ca-cert-hash sha256:ed7ea5ae0c06f4ace9013e663b223e8da72e4e94e4dc657cfb1db68d777f3984
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.3. Latest validated version: 18.09
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
Check the existing tokens on the master:
[root@k8s-master ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
Sure enough, there are none. Create one and list again; a token is valid for one day:
[root@k8s-master ~]# kubeadm token create
wxvdun.vec7m9cu4ru3hngg
[root@k8s-master ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
wxvdun.vec7m9cu4ru3hngg 23h 2019-10-18T10:43:34+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
[root@k8s-master ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
ed7ea5ae0c06f4ace9013e663b223e8da72e4e94e4dc657cfb1db68d777f3984
Re-run the join on the node:
[root@k8s-node2 ~]# kubeadm join 192.168.1.200:6443 --token wxvdun.vec7m9cu4ru3hngg --discovery-token-ca-cert-hash sha256:ed7ea5ae0c06f4ace9013e663b223e8da72e4e94e4dc657cfb1db68d777f3984
Note the two values that changed: the token and the sha256 hash.
VIII. Troubleshooting: "conflicts with file from package"
1. Because a different version of Docker had been installed previously, the installation failed with the following error:
Transaction check error:
file /usr/bin/docker from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/bash-completion/completions/docker from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/fish/vendor_completions.d/docker.fish from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/man/man1/docker-attach.1.gz from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/man/man1/docker-checkpoint-create.1.gz from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/man/man1/docker-checkpoint-ls.1.gz from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/man/man1/docker-checkpoint-rm.1.gz from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
file /usr/share/man/man1/docker-checkpoint.1.gz from install of docker-ce-18.06.3.ce-3.el7.x86_64 conflicts with file from package docker-ce-cli-1:19.03.5-3.el7.x86_64
2. Uninstall the conflicting package. As the output above shows, docker-ce-cli-1:19.03.5-3.el7.x86_64 needs to be removed:
sudo yum erase docker-ce-cli-1:19.03.5-3.el7.x86_64