Troubleshooting errors when installing K8s with kubeasz

Problem 1: kubectl reports "couldn't get resource list for metrics.k8s.io/v1beta1"

#  /opt/kube/bin/kubectl get nodes
E0221 11:11:16.575740   23693 memcache.go:255] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0221 11:11:16.582180   23693 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0221 11:11:16.584653   23693 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0221 11:11:16.587073   23693 memcache.go:106] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME       STATUS                     ROLES    AGE   VERSION
master01   Ready,SchedulingDisabled   master   49m   v1.26.1
master02   Ready,SchedulingDisabled   master   49m   v1.26.1
master03   Ready,SchedulingDisabled   master   49m   v1.26.1
worker01   Ready                      node     46m   v1.26.1
worker02   Ready                      node     46m   v1.26.1
worker03   Ready                      node     46m   v1.26.1

Solution:

Edit /etc/kubeasz/clusters/k8s-01/yml/metrics-server.yaml and add hostNetwork: true to the Deployment's pod spec:

spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      hostNetwork: true      # add this line
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443

Then reapply the manifest:

/opt/kube/bin/kubectl apply -f /etc/kubeasz/clusters/k8s-01/yml/metrics-server.yaml
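
A quick check that the fix worked (standard kubectl commands; the APIService below is the one named in the errors above, and it should report Available=True once metrics-server is reachable):

/opt/kube/bin/kubectl get apiservice v1beta1.metrics.k8s.io
/opt/kube/bin/kubectl top nodes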

[root@master01 yml]#  /opt/kube/bin/kubectl get nodes
NAME       STATUS                     ROLES    AGE   VERSION
master01   Ready,SchedulingDisabled   master   49m   v1.26.1
master02   Ready,SchedulingDisabled   master   49m   v1.26.1
master03   Ready,SchedulingDisabled   master   49m   v1.26.1
worker01   Ready                      node     46m   v1.26.1
worker02   Ready                      node     46m   v1.26.1
worker03   Ready                      node     46m   v1.26.1

Problem 2: calico-node never becomes ready (Ansible retries time out)

FAILED - RETRYING: 轮询等待calico-node 运行 (15 retries left).   (task name: "poll and wait for calico-node to run")
helm pre-upgrade hooks failed: timed out waiting for the condition

Error log:
Kubelet Unable to attach or mount volumes – timed out waiting for the condition

Root cause: many features of k8s, Docker, Cilium, etc. depend on a reasonably new Linux kernel, so the kernel should be upgraded before deploying the cluster.
Fix: upgrade the kernel. CentOS 7 example:
# Import the ELRepo GPG key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# Install the ELRepo repository
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# Load the elrepo-kernel metadata
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
# List the available kernel packages
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
# Install the long-term-support kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
# Remove the old kernel tools packages
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y
# Install the matching new tools package
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64

#查看默认启动顺序
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg  
CentOS Linux (4.4.183-1.el7.elrepo.x86_64) 7 (Core)  
CentOS Linux (3.10.0-327.10.1.el7.x86_64) 7 (Core)  
CentOS Linux (0-rescue-c52097a1078c403da03b8eddeac5080b) 7 (Core)
#默认启动的顺序是从0开始,新内核是从头插入(目前位置在0,而4.4.4的是在1),所以需要选择0。
grub2-set-default 0  
#重启并检查
reboot
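
After the reboot, the running kernel should be the elrepo one installed above:

# uname -r
4.4.183-1.el7.elrepo.x86_64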

Problem 3: kubectl (and other related commands) not found

Add the directory containing the binaries to the PATH:
# vim /etc/profile
PATH=$PATH:/opt/kube/bin
export PATH
# source /etc/profile
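
An equivalent approach (a common Linux convention, not kubeasz-specific; the file name kube.sh is arbitrary) is a drop-in file, which leaves /etc/profile untouched:

# echo 'export PATH=$PATH:/opt/kube/bin' > /etc/profile.d/kube.sh
# source /etc/profile.d/kube.sh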

Problem 4: issues when installing the Calico network

# Install the Calico network
To install Calico, set the variable CLUSTER_NETWORK="calico" in the clusters/xxxx/hosts file.

In clusters/xxxx/config.yml:
# [calico] whether Calico uses route reflectors;
# recommended when the cluster has more than 50 nodes
CALICO_RR_ENABLED: true

# CALICO_RR_NODES lists the route reflector nodes; if unset, the cluster master nodes are used
# CALICO_RR_NODES: ["192.168.1.1", "192.168.1.2"]
CALICO_RR_NODES: ["10.2.1.190", "10.2.1.191", "10.2.1.192", "10.2.1.193", "10.2.1.194", "10.2.1.195"]

Error:
TASK [calico : node label] *******************************************************************************************************************************
fatal: [10.2.1.190]: FAILED! => {"changed": true, "cmd": "for ip in 10.2.1.190 10.2.1.191 10.2.1.192 10.2.1.193 10.2.1.194 10.2.1.195 ;do /etc/kubeasz/bin/kubectl label node \"$ip\" route-reflector=true --overwrite; done", "delta": "0:00:00.479707", "end": "2023-02-21 03:47:09.604375", "msg": "non-zero return code", "rc": 1, "start": "2023-02-21 03:47:09.124668", "stderr": "Error from server (NotFound): nodes \"10.2.1.190\" not found\nError from server (NotFound): nodes \"10.2.1.191\" not found\nError from server (NotFound): nodes \"10.2.1.192\" not found\nError from server (NotFound): nodes \"10.2.1.193\" not found\nError from server (NotFound): nodes \"10.2.1.194\" not found\nError from server (NotFound): nodes \"10.2.1.195\" not found", "stderr_lines": ["Error from server (NotFound): nodes \"10.2.1.190\" not found", "Error from server (NotFound): nodes \"10.2.1.191\" not found", "Error from server (NotFound): nodes \"10.2.1.192\" not found", "Error from server (NotFound): nodes \"10.2.1.193\" not found", "Error from server (NotFound): nodes \"10.2.1.194\" not found", "Error from server (NotFound): nodes \"10.2.1.195\" not found"], "stdout": "", "stdout_lines": []}
Solution: check the node names in k8s; here the nodes are registered by hostname, so CALICO_RR_NODES must list hostnames instead of IPs:

# CALICO_RR_NODES lists the route reflector nodes; if unset, the cluster master nodes are used
# CALICO_RR_NODES: ["192.168.1.1", "192.168.1.2"]
CALICO_RR_NODES: ["master01", "master02", "master03", "worker01", "worker02", "worker03"]
# Rerun the network installation step
#dk ezctl setup k8s-01 06

[root@master01 kubeasz]# kubectl get node
NAME       STATUS                     ROLES    AGE     VERSION
master01   Ready,SchedulingDisabled   master   3h48m   v1.26.1
master02   Ready,SchedulingDisabled   master   3h48m   v1.26.1
master03   Ready,SchedulingDisabled   master   3h48m   v1.26.1
worker01   Ready                      node     3h45m   v1.26.1
worker02   Ready                      node     3h45m   v1.26.1
worker03   Ready                      node     3h45m   v1.26.1
[root@master01 kubeasz]# kubectl get pod -n kube-system -o wide | grep calico
calico-kube-controllers-7bbb6b796b-6cx6m     1/1     Running   1 (3h23m ago)   3h43m   10.2.1.195     worker03   <none>           <none>
calico-node-2h65h                            1/1     Running   0               3h43m   10.2.1.193     worker01   <none>           <none>
calico-node-bxgdz                            1/1     Running   0               3h43m   10.2.1.190     master01   <none>           <none>
calico-node-d5mgg                            1/1     Running   0               3h43m   10.2.1.194     worker02   <none>           <none>
calico-node-hh2s4                            1/1     Running   0               3h43m   10.2.1.195     worker03   <none>           <none>
calico-node-l55rc                            1/1     Running   0               3h43m   10.2.1.192     master03   <none>           <none>
calico-node-vw9cr                            1/1     Running   0               3h43m   10.2.1.191     master02   <none>           <none>
[root@master01 kubeasz]# calicoctl get node -o wide
NAME       ASN       IPV4            IPV6   
master01   (64512)   10.2.1.190/23          
master02   (64512)   10.2.1.191/23          
master03   (64512)   10.2.1.192/23          
worker01   (64512)   10.2.1.193/23          
worker02   (64512)   10.2.1.194/23          
worker03   (64512)   10.2.1.195/23   

Pick one or more cluster nodes as RR nodes; start with master01:
calicoctl patch node master01 -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
# Set the node label
calicoctl patch node master01 -p '{"metadata": {"labels": {"route-reflector": "true"}}}'

[root@master01 BGP]# vim bgp.yml

kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
  name: peer-with-route-reflectors
spec:
  nodeSelector: all()
  peerSelector: route-reflector == 'true'

[root@master01 BGP]# calicoctl create -f bgp.yml 
Successfully created 1 'BGPPeer' resource(s)

Verify the BGP peering after adding the RR:
dk ansible -i /etc/kubeasz/clusters/k8s-01/hosts all -m shell -a '/opt/kube/bin/calicoctl node status'

Then add the remaining master and worker nodes as RRs in the same way, as sketched below.
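
A minimal loop sketch for that, reusing the two patch commands above (this assumes all six nodes become RRs, matching CALICO_RR_NODES earlier):

for n in master01 master02 master03 worker01 worker02 worker03; do
  # advertise this node as a route reflector with the shared cluster ID
  calicoctl patch node "$n" -p '{"spec": {"bgp": {"routeReflectorClusterID": "244.0.0.1"}}}'
  # label it so the BGPPeer's peerSelector matches
  calicoctl patch node "$n" -p '{"metadata": {"labels": {"route-reflector": "true"}}}'
done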

Problem 5: kubectl get nodes fails with connection refused on localhost:8080

[root@master02 ~]# kubectl get nodes
E0221 14:51:52.209187   17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.210167   17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.212104   17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.213813   17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:51:52.215677   17967 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Possible causes and fixes:

1) Expired token (the commands below are for kubeadm-built clusters)

# kubeadm token generate   # generate a token string
7r3l16.5yzfksso5ty2zzie    # this value is used in the next command
# kubeadm token create 7r3l16.5yzfksso5ty2zzie --print-join-command --ttl=0   # create the token and print the join command
W0604 10:35:00.523781   14568 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0604 10:35:00.523827   14568 validation.go:28] Cannot validate kubelet config - no validator is available
kubeadm join 192.168.254.100:6443 --token 7r3l16.5yzfksso5ty2zzie     --discovery-token-ca-cert-hash sha256:56281a8be264fa334bb98cac5206aa190527a03180c9f397c253ece41d997e8a 

2) The k8s API server is unreachable (SELinux/firewall):

[root@master01 ~]#setenforce 0
[root@master01 ~]#sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
[root@master01 ~]#systemctl disable firewalld --now

3) Time not synchronized between the node and the master

ntpdate time.ntp.org
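
If ntpdate is unavailable, a hedged alternative is a one-shot sync with chrony (chronyd -q sets the clock once and exits; same server as above):

chronyd -q 'server time.ntp.org iburst'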

4) The kubeconfig needed by kubectl was never created after cluster initialization

[root@worker01 ~]# kubectl get nodes
E0221 14:55:32.706287    6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.706869    6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.708772    6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.710750    6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0221 14:55:32.712341    6388 memcache.go:238] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@worker01 ~]# cd .kube/
[root@worker01 .kube]# ll -a
total 0
drwxr-xr-x  2 root root   6 Feb 21 10:15 .
dr-xr-x---. 6 root root 222 Feb 21 11:28 ..

#### Copy the .kube directory from a healthy node (dev) to the other nodes, for example:
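
A minimal sketch of the copy, run from the healthy node (this assumes root SSH access between nodes; worker01 is the failing node in this example):

scp -r /root/.kube root@worker01:/root/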

[root@worker01 .kube]# ll -a
total 8
drwxr-xr-x  3 root root   33 Feb 21 14:55 .
dr-xr-x---. 6 root root  222 Feb 21 11:28 ..
drwxr-x---  4 root root   35 Feb 21 14:55 cache
-r--------  1 root root 6198 Feb 21 14:55 config
[root@worker01 .kube]# kubectl get nodes
NAME       STATUS                     ROLES    AGE     VERSION
master01   Ready,SchedulingDisabled   master   4h34m   v1.26.1
master02   Ready,SchedulingDisabled   master   4h34m   v1.26.1
master03   Ready,SchedulingDisabled   master   4h34m   v1.26.1
worker01   Ready                      node     4h30m   v1.26.1
worker02   Ready                      node     4h30m   v1.26.1
worker03   Ready                      node     4h30m   v1.26.1

Problem 6: after installing the Calico network, only one node is listed

[root@master-02 ~]#  calicoctl get node -o wide
NAME                    ASN       IPV4            IPV6   
localhost.localdomain   (64512)   10.2.1.195/23          

[root@master-02 ~]# calicoctl node status
Calico process is running.

IPv4 BGP status
No IPv4 peers found.

IPv6 BGP status
No IPv6 peers found.
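
A likely cause, inferred from the output rather than stated in the original notes: the host kept the default hostname, so calico-node registered as localhost.localdomain and all nodes collide on that single name. A hedged sketch of the usual fix:

# assumption: this host is meant to be master02
hostnamectl set-hostname master02
# remove the stale Calico node object so the agent re-registers under the new name
calicoctl delete node localhost.localdomain
# finally, restart the calico-node pod on this host (find it with: kubectl get pod -n kube-system -o wide)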

Problem 7: Non-critical error occurred during resource retrieval: pods is forbidden

Non-critical error occurred during resource retrieval: namespaces is forbidden: User "system:serviceaccount:kubernetes-dashboard:kubernetes-dashboard" cannot list resource "namespaces" in API group "" at the cluster scope

Solution: delete and recreate the dashboard and its RBAC resources:

kubectl delete -f admin-user-sa-rbac.yaml 
kubectl delete -f read-user-sa-rbac.yaml 
kubectl delete -f kubernetes-dashboard.yaml 

kubectl apply -f admin-user-sa-rbac.yaml 
kubectl apply -f kubernetes-dashboard.yaml 
kubectl apply -f read-user-sa-rbac.yaml 

# Get the login token
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
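
Note: on clusters that no longer auto-create ServiceAccount token secrets (v1.24+, which includes the v1.26.1 cluster here), the grep above finds nothing; kubectl can mint a token directly (assuming the ServiceAccount is named admin-user in kube-system, as in that command):

kubectl -n kube-system create token admin-user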
