Pitfall 1: kubeadm init fails because it cannot reach the k8s.gcr.io registry
Run the kubeadm init command:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
The following errors appear:
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-scheduler:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-proxy:v1.23.1: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/pause:3.6: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/etcd:3.5.1-0: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
[ERROR ImagePull]: failed to pull image k8s.gcr.io/coredns/coredns:v1.8.6: output: Error response from daemon: Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
Solution:
Add the parameter --image-repository registry.aliyuncs.com/google_containers to the kubeadm init command, so the images are pulled from that registry instead of k8s.gcr.io:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --image-repository registry.aliyuncs.com/google_containers
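Before re-running kubeadm init, it can help to confirm that the mirror is reachable by listing and pulling the images ahead of time. The following is only a minimal sketch and not part of the original fix: the helper name prepull_images and the two variables are made up for illustration, and the version v1.23.1 is taken from the error output above.

```shell
# Mirror registry from the fix above; version taken from the error output.
IMAGE_REPO="registry.aliyuncs.com/google_containers"
K8S_VERSION="v1.23.1"

# prepull_images (hypothetical helper): list the images kubeadm would use
# from the mirror, then pull them so that kubeadm init finds them locally.
prepull_images() {
    kubeadm config images list \
        --image-repository "$IMAGE_REPO" \
        --kubernetes-version "$K8S_VERSION" &&
    sudo kubeadm config images pull \
        --image-repository "$IMAGE_REPO" \
        --kubernetes-version "$K8S_VERSION"
}
```

If prepull_images succeeds, kubeadm init with the same --image-repository value should no longer stop at the ImagePull preflight errors.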
Pitfall 2: kubeadm init cannot connect to the control plane
Run the command:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --image-repository registry.aliyuncs.com/google_containers
The following errors appear:
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all Kubernetes containers running in docker:
- 'docker ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
Solution:
1. Edit daemon.json so that Docker uses the systemd cgroup driver, matching the kubelet:
sudo vim /etc/docker/daemon.json
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
2. Restart Docker:
sudo systemctl daemon-reload
sudo systemctl restart docker
Done.
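To verify that the change took effect, the cgroup driver reported by Docker can be compared with the one in the kubelet configuration. This is a rough sketch under some assumptions: the helper names are hypothetical, and the kubelet configuration is assumed to be the file kubeadm normally writes to /var/lib/kubelet/config.yaml.

```shell
# kubelet_cgroup_driver (hypothetical helper): read a KubeletConfiguration
# YAML document on stdin and print its cgroupDriver value.
kubelet_cgroup_driver() {
    awk '$1 == "cgroupDriver:" { print $2 }'
}

# check_cgroup_drivers (hypothetical helper): compare Docker's cgroup driver
# with the kubelet's. Succeeds only if both report the same driver.
# The default path is an assumption about where kubeadm wrote the config.
check_cgroup_drivers() {
    config="${1:-/var/lib/kubelet/config.yaml}"
    docker_driver=$(sudo docker info --format '{{.CgroupDriver}}')
    kubelet_driver=$(sudo cat "$config" | kubelet_cgroup_driver)
    echo "docker=$docker_driver kubelet=$kubelet_driver"
    [ "$docker_driver" = "$kubelet_driver" ]
}
```

After the Docker restart, check_cgroup_drivers would be expected to print systemd for both. A previous failed attempt usually has to be cleaned up with sudo kubeadm reset before kubeadm init is run again.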
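Pitfall 3: after kubeadm init, creating new pods fails with failed to set bridge addr: "cni0" already has an IP address different from XXXX
After kubeadm init, I tried to set up the flannel network plugin, but the pods never started successfully. Inspecting a pod with kubectl describe pod showed: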
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 12m default-scheduler 0/1 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.
Normal Scheduled 12m default-scheduler Successfully assigned kube-system/coredns-6d8c4cb4d-6pd2b to debian-1
Warning FailedCreatePodSandBox 12m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f9b5c6a4da4db3b43e00ae71baa2388f8f45c2001e5615aae861c6208ad85de1" network for pod "coredns-6d8c4cb4d-6pd2b": networkPlugin cni failed to set up pod "coredns-6d8c4cb4d-6pd2b_kube-system" network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.0.0.1/24
Warning FailedCreatePodSandBox 12m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "014675b219aa5c6abaa173ccfd5edb23258b2af3ce43ac8a186cac3db040046d" network for pod "coredns-6d8c4cb4d-6pd2b": networkPlugin cni failed to set up pod "coredns-6d8c4cb4d-6pd2b_kube-system" network: failed to delegate add: failed to set bridge addr: "cni0" already has an IP address different from 10.0.0.1/24
Solution: delete the cni0 interface
sudo ifconfig cni0 down
sudo ip link delete cni0
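The error indicates that the cni0 bridge still carries an address that does not match the subnet the CNI plugin now wants to assign. Before deleting the interface, the mismatch can be confirmed with something like the sketch below. It assumes flannel wrote its subnet lease to /run/flannel/subnet.env (its usual location) and that the bridge address normally equals FLANNEL_SUBNET; the helper names are hypothetical.

```shell
# flannel_subnet (hypothetical helper): print FLANNEL_SUBNET from a flannel
# subnet.env file (defaults to the usual location).
flannel_subnet() {
    sed -n 's/^FLANNEL_SUBNET=//p' "${1:-/run/flannel/subnet.env}"
}

# cni0_addr: print the IPv4 address (with prefix length) currently on cni0.
cni0_addr() {
    ip -o -4 addr show dev cni0 | awk '{ print $4 }'
}

# cni0_matches_flannel: succeed only if the two addresses are identical.
cni0_matches_flannel() {
    want=$(flannel_subnet "$1")
    have=$(cni0_addr)
    echo "flannel=$want cni0=$have"
    [ -n "$want" ] && [ "$want" = "$have" ]
}
```

If cni0_matches_flannel fails, deleting cni0 as shown above lets the bridge be recreated with the expected address the next time a pod sandbox is set up.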
Pitfall 4: kubeadm join pre-flight check warning: [WARNING FileExisting-ebtables]: ebtables not found in system path
[WARNING FileExisting-ebtables]: ebtables not found in system path
Yet trying to install it reports that the package is already installed:
Reading package lists... Done
Building dependency tree
Reading state information... Done
ebtables is already the newest version (2.0.10.4+snapshot20181205-3).
ebtables set to manually installed.
ethtool is already the newest version (1:4.19-1).
ethtool set to manually installed.
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
Solution: prefix the command with sudo
sudo kubeadm join
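A likely explanation, which is an assumption and not something the original notes confirm, is that ebtables is installed under /usr/sbin or /sbin and those directories are not in an ordinary user's PATH on Debian, so kubeadm cannot find the binary unless it runs through sudo. A small sketch to check this, with hypothetical helper names:

```shell
# dir_in_path (hypothetical helper): succeed if directory $1 is an entry of
# the colon-separated search path $2 (defaults to the current $PATH).
dir_in_path() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_sbin_dirs: show which sbin directories are on the search path.
# An optional first argument overrides the search path to inspect.
report_sbin_dirs() {
    for dir in /usr/sbin /sbin; do
        if dir_in_path "$dir" "$1"; then
            echo "$dir: in PATH"
        else
            echo "$dir: not in PATH"
        fi
    done
}
```

Running report_sbin_dirs as the ordinary user shows whether those directories are missing; sudo typically runs commands with a PATH that includes them.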
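Pitfall 5: after installing the flannel network plugin, coredns keeps restarting and never becomes ready. Pods also cannot communicate with each other over the network.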
linmao@debian-1:~/kubernetes$ sudo kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6d8c4cb4d-fjnzt 0/1 Running 2 (56s ago) 4m46s
kube-system coredns-6d8c4cb4d-vl54c 0/1 Running 2 (58s ago) 4m46s
kube-system etcd-debian-1 1/1 Running 6 4m57s
kube-system kube-apiserver-debian-1 1/1 Running 6 4m57s
kube-system kube-controller-manager-debian-1 1/1 Running 2 4m57s
kube-system kube-flannel-ds-btn8q 1/1 Running 0 15s
kube-system kube-proxy-9cz5g 1/1 Running 0 4m46s
kube-system kube-scheduler-debian-1 1/1 Running 6 4m57s
linmao@debian-1:~/kubernetes$ sudo kubectl logs -f coredns-6d8c4cb4d-fjnzt -n kube-system
[WARNING] plugin/kubernetes: starting server with unsynced Kubernetes API
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.8.6
linux/amd64, go1.17.1, 13a9191
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:36120->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:44455->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:47435->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:38028->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:47276->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:54208->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:58839->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:35322->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:52081->192.168.1.1:53: i/o timeout
[ERROR] plugin/errors: 2 2333562686986131856.1223269804745438122. HINFO: read udp 10.0.0.2:33026->192.168.1.1:53: i/o timeout
[INFO] SIGTERM: Shutting down servers then terminating
[INFO] plugin/health: Going into lameduck mode for 5s
Solution:
Simply delete the two coredns pods with kubectl delete pod and let them be recreated.
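Because the pod names carry random suffixes, selecting the pods by label avoids copying the names by hand. The sketch below assumes the pods carry the k8s-app=kube-dns label that a kubeadm-installed CoreDNS deployment normally uses; the helper names are hypothetical.

```shell
# restart_coredns (hypothetical helper): delete the CoreDNS pods by label;
# the coredns Deployment then creates replacements.
restart_coredns() {
    sudo kubectl -n kube-system delete pod -l k8s-app=kube-dns
}

# wait_coredns_ready (hypothetical helper): block until the replacement pods
# report Ready, or give up after 120 seconds.
wait_coredns_ready() {
    sudo kubectl -n kube-system wait --for=condition=Ready pod \
        -l k8s-app=kube-dns --timeout=120s
}
```

Calling restart_coredns followed by wait_coredns_ready gives a quick indication of whether the recreated pods come up cleanly.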
Links to other pitfalls:
Fixing kubectl describe pod showing no events (marlinlm's blog, CSDN)
Fixing Unable to connect to the server: x509: certificate is valid for xxx, not xxx when accessing a k8s cluster
Fixing couldn't validate the identity of the API Server when adding a node to a k8s cluster with kubeadm join
Fixing incompatible CNI versions; config is "1.0.0", plugin supports ["0.1.0" "0.2.0"...] when starting a container