K8s Primer - Fixing K8s Nodes in NotReady Status

Why the cluster broke: I wanted to expose a Service through an externalIP and, at the same time, delete the old Pod. Right after the delete command ran, the worker nodes became unavailable.

Reproducing the mistake

  1. Create a Service of the externalIP type (a sketch of the manifest follows right after this list)
  2. Scale the existing deployments/demo down to 0 replicas (this step turned out to be the big problem)
  3. Delete the existing Pod (the command hung outright, and afterwards all worker nodes dropped off)
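
The contents of services-externalip-demo.yaml are not shown in this post; below is a minimal sketch of what such an externalIP Service would look like, with the name, labels, selector, port, and external IP filled in from the kubectl describe output later in this section (treat it as an approximation, not the original file):

# Sketch only: the real services-externalip-demo.yaml is not reproduced in this post.
# Field values are taken from the "kubectl describe -f services-externalip-demo.yaml" output below.
cat <<'EOF' > services-externalip-demo.yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-externalip-service
  labels:
    app: demo-service
spec:
  selector:
    app: demo
  ports:
  - name: http
    port: 80
    targetPort: 80
  externalIPs:
  - 192.168.15.154
EOF
kubectl apply -f services-externalip-demo.yaml
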
[root@k8s-master01 ~]# kubectl apply -f services-externalip-demo.yaml 
service/demo-externalip-service created
[root@k8s-master01 ~]# kubectl get svc
NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP      PORT(S)   AGE
demo-externalip-service   ClusterIP   10.97.55.241   192.168.15.154   80/TCP    4s
kubernetes                ClusterIP   10.96.0.1      <none>           443/TCP   128d
[root@k8s-master01 ~]# kubectl describe po demo-6666947f9f-ggd4h
Name:           demo-6666947f9f-ggd4h
Namespace:      default
Priority:       0
Node:           k8s-node01/192.168.15.153
Start Time:     Fri, 02 Apr 2021 21:44:42 +0800
Labels:         app=demo
                pod-template-hash=6666947f9f
Annotations:    <none>
Status:         Running
IP:             10.244.3.9
Controlled By:  ReplicaSet/demo-6666947f9f
Containers:
  nginx:
    Container ID:   docker://b18e31dcb300803ce7e611eaecfa45a70b857b403e2dd3ff92db5a341e3306bb
    Image:          nginx:latest
    Image ID:       docker-pullable://nginx@sha256:10b8cc432d56da8b61b070f4c7d2543a9ed17c2b23010b43af434fd40e2ca4aa
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Fri, 02 Apr 2021 21:45:22 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-srskw (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  default-token-srskw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-srskw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                  From                 Message
  ----     ------            ----                 ----                 -------
  Warning  FailedScheduling  5h (x13 over 5h15m)  default-scheduler    0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  19m                  default-scheduler    0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
  Normal   Scheduled         19m                  default-scheduler    Successfully assigned default/demo-6666947f9f-ggd4h to k8s-node01
  Normal   Pulling           <invalid>            kubelet, k8s-node01  Pulling image "nginx:latest"
  Normal   Pulled            <invalid>            kubelet, k8s-node01  Successfully pulled image "nginx:latest"
  Normal   Created           <invalid>            kubelet, k8s-node01  Created container nginx
  Normal   Started           <invalid>            kubelet, k8s-node01  Started container nginx
[root@k8s-master01 ~]# kubectl describe -f services-externalip-demo.yaml 
Name:              demo-externalip-service
Namespace:         default
Labels:            app=demo-service
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"demo-service"},"name":"demo-externalip-service","namespa...
Selector:          app=demo
Type:              ClusterIP
IP:                10.97.55.241
External IPs:      192.168.15.154
Port:              http  80/TCP
TargetPort:        80/TCP
Endpoints:         
Session Affinity:  None
Events:            <none>
[root@k8s-master01 ~]# kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   0/1     1            0           5h16m
[root@k8s-master01 ~]# kubectl scale deployments/demo --replicas=0
deployment.extensions/demo scaled
[root@k8s-master01 ~]# kubectl get po
NAME                    READY   STATUS        RESTARTS   AGE
demo-6666947f9f-ggd4h   1/1     Terminating   0          5h16m
[root@k8s-master01 ~]# kubectl delete po demo-6666947f9f-ggd4h
pod "demo-6666947f9f-ggd4h" deleted
^C
[root@k8s-master01 ~]# kubectl get po
NAME                    READY   STATUS        RESTARTS   AGE
demo-6666947f9f-ggd4h   1/1     Terminating   0          5h19m
[root@k8s-master01 ~]# kubectl get deployments
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
demo   0/0     0            0           5h19m
[root@k8s-master01 ~]# kubectl create deploy demo --image=nginx:latest
Error from server (AlreadyExists): deployments.apps "demo" already exists
[root@k8s-master01 ~]# kubectl scale deployments/demo --replicas=3
deployment.extensions/demo scaled
[root@k8s-master01 ~]# kubectl get po
NAME                    READY   STATUS        RESTARTS   AGE
demo-6666947f9f-ggd4h   1/1     Terminating   0          5h20m
demo-6666947f9f-m42q2   0/1     Pending       0          3s
demo-6666947f9f-t58r7   0/1     Pending       0          3s
demo-6666947f9f-xcjzs   0/1     Pending       0          3s
[root@k8s-master01 ~]# kubectl describe po demo-6666947f9f-m42q2
Name:           demo-6666947f9f-m42q2
Namespace:      default
Priority:       0
Node:           <none>
Labels:         app=demo
                pod-template-hash=6666947f9f
Annotations:    <none>
Status:         Pending
IP:             
Controlled By:  ReplicaSet/demo-6666947f9f
Containers:
  nginx:
    Image:        nginx:latest
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-srskw (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  default-token-srskw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-srskw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  16s (x2 over 16s)  default-scheduler  0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
[root@k8s-master01 ~]# kubectl get node
NAME           STATUS     ROLES    AGE    VERSION
k8s-master01   Ready      master   128d   v1.15.1
k8s-node01     NotReady   <none>   123d   v1.15.1
k8s-node02     NotReady   <none>   123d   v1.15.1
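
The FailedScheduling warning above ("3 node(s) had taints that the pod didn't tolerate") comes from the node.kubernetes.io/unreachable taints that the node controller adds once a node stops reporting status; they show up later in the kubectl describe nodes output. A quick sketch of how to check them (node names as in this cluster):

# taints on a single node
kubectl describe node k8s-node01 | grep -A 1 Taints
# or list the taints of every node at once
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
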

Attempt 1 - reboot all the servers (failed)

[root@k8s-master01 ~]# poweroff

Connection closed
Connection re-established
Last login: Thu Apr  1 21:32:56 2021 from 192.168.15.1
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE    VERSION
k8s-master01   Ready      master   128d   v1.15.1
k8s-node01     NotReady   <none>   123d   v1.15.1
k8s-node02     NotReady   <none>   123d   v1.15.1
[root@k8s-master01 ~]# kubectl describe nodes k8s-node01
Name:               k8s-node01
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8s-node01
                    kubernetes.io/os=linux
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"96:6b:71:dc:33:3d"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.15.153
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 29 Nov 2020 00:31:58 +0800
Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Fri, 02 Apr 2021 22:12:36 +0800   Thu, 01 Apr 2021 22:13:25 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Fri, 02 Apr 2021 22:12:36 +0800   Thu, 01 Apr 2021 22:13:25 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Fri, 02 Apr 2021 22:12:36 +0800   Thu, 01 Apr 2021 22:13:25 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Fri, 02 Apr 2021 22:12:36 +0800   Thu, 01 Apr 2021 22:13:25 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
Addresses:
  InternalIP:  192.168.15.153
  Hostname:    k8s-node01
Capacity:
 cpu:                2
 ephemeral-storage:  100610052Ki
 hugepages-2Mi:      0
 memory:             4028688Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  92722223770
 hugepages-2Mi:      0
 memory:             3926288Ki
 pods:               110
System Info:
 Machine ID:                 87ff2ee4182e421680e90c865344076c
 System UUID:                48D64D56-CD82-7CD9-7265-00C117529BB5
 Boot ID:                    9059ab9f-2607-4fe1-880d-f5ccb9f8e784
 Kernel Version:             4.4.244-1.el7.elrepo.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://19.3.13
 Kubelet Version:            v1.15.1
 Kube-Proxy Version:         v1.15.1
PodCIDR:                     10.244.3.0/24
Non-terminated Pods:         (5 in total)
  Namespace                  Name                     CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                     ------------  ----------  ---------------  -------------  ---
  default                    demo-6666947f9f-m42q2    0 (0%)        0 (0%)      0 (0%)           0 (0%)         9m56s
  default                    demo-6666947f9f-t58r7    0 (0%)        0 (0%)      0 (0%)           0 (0%)         9m56s
  default                    demo-6666947f9f-xcjzs    0 (0%)        0 (0%)      0 (0%)           0 (0%)         9m56s
  kube-system                kube-flannel-ds-rk4t5    100m (5%)     100m (5%)   50Mi (1%)        50Mi (1%)      123d
  kube-system                kube-proxy-fbtln         0 (0%)        0 (0%)      0 (0%)           0 (0%)         123d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (5%)  100m (5%)
  memory             50Mi (1%)  50Mi (1%)
  ephemeral-storage  0 (0%)     0 (0%)
Events:
  Type     Reason                   Age                            From                    Message
  ----     ------                   ----                           ----                    -------
  Normal   Starting                 <invalid>                      kubelet, k8s-node01     Starting kubelet.
  Normal   NodeAllocatableEnforced  <invalid>                      kubelet, k8s-node01     Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  <invalid> (x2 over <invalid>)  kubelet, k8s-node01     Node k8s-node01 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    <invalid> (x2 over <invalid>)  kubelet, k8s-node01     Node k8s-node01 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     <invalid> (x2 over <invalid>)  kubelet, k8s-node01     Node k8s-node01 status is now: NodeHasSufficientPID
  Warning  Rebooted                 <invalid>                      kubelet, k8s-node01     Node k8s-node01 has been rebooted, boot id: 36b21c39-94a9-437f-bf8a-191eb5181dbe
  Normal   NodeReady                <invalid>                      kubelet, k8s-node01     Node k8s-node01 status is now: NodeReady
  Normal   Starting                 <invalid>                      kube-proxy, k8s-node01  Starting kube-proxy.
  Normal   Starting                 <invalid>                      kubelet, k8s-node01     Starting kubelet.
  Normal   NodeAllocatableEnforced  <invalid>                      kubelet, k8s-node01     Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  <invalid> (x2 over <invalid>)  kubelet, k8s-node01     Node k8s-node01 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    <invalid> (x2 over <invalid>)  kubelet, k8s-node01     Node k8s-node01 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     <invalid> (x2 over <invalid>)  kubelet, k8s-node01     Node k8s-node01 status is now: NodeHasSufficientPID
  Warning  Rebooted                 <invalid>                      kubelet, k8s-node01     Node k8s-node01 has been rebooted, boot id: 9059ab9f-2607-4fe1-880d-f5ccb9f8e784
  Normal   NodeReady                <invalid>                      kubelet, k8s-node01     Node k8s-node01 status is now: NodeReady
  Normal   Starting                 <invalid>                      kube-proxy, k8s-node01  Starting kube-proxy.
[root@k8s-master01 ~]# kubectl describe nodes k8s-node02
Name:               k8s-node02
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8s-node02
                    kubernetes.io/os=linux
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"fe:73:ec:96:1c:45"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.15.152
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 28 Nov 2020 23:23:32 +0800
Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Fri, 02 Apr 2021 22:13:53 +0800   Thu, 01 Apr 2021 22:14:35 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Fri, 02 Apr 2021 22:13:53 +0800   Thu, 01 Apr 2021 22:14:35 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Fri, 02 Apr 2021 22:13:53 +0800   Thu, 01 Apr 2021 22:14:35 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Fri, 02 Apr 2021 22:13:53 +0800   Thu, 01 Apr 2021 22:14:35 +0800   NodeStatusUnknown   Kubelet stopped posting node status.
Addresses:
  InternalIP:  192.168.15.152
  Hostname:    k8s-node02
Capacity:
 cpu:                2
 ephemeral-storage:  100610052Ki
 hugepages-2Mi:      0
 memory:             4028688Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  92722223770
 hugepages-2Mi:      0
 memory:             3926288Ki
 pods:               110
System Info:
 Machine ID:                 cab995456bd34aab927d7b5cb22daf5c
 System UUID:                29144D56-00D2-A845-03B6-DBC78819D1F1
 Boot ID:                    0c9f8d85-10ad-4d7f-8c80-e752a420a7e5
 Kernel Version:             4.4.244-1.el7.elrepo.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://19.3.13
 Kubelet Version:            v1.15.1
 Kube-Proxy Version:         v1.15.1
PodCIDR:                     10.244.2.0/24
Non-terminated Pods:         (2 in total)
  Namespace                  Name                     CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                     ------------  ----------  ---------------  -------------  ---
  kube-system                kube-flannel-ds-lpdbl    100m (5%)     100m (5%)   50Mi (1%)        50Mi (1%)      123d
  kube-system                kube-proxy-tkwvb         0 (0%)        0 (0%)      0 (0%)           0 (0%)         123d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (5%)  100m (5%)
  memory             50Mi (1%)  50Mi (1%)
  ephemeral-storage  0 (0%)     0 (0%)
Events:
  Type     Reason                   Age                            From                    Message
  ----     ------                   ----                           ----                    -------
  Normal   NodeAllocatableEnforced  <invalid>                      kubelet, k8s-node02     Updated Node Allocatable limit across pods
  Normal   Starting                 <invalid>                      kubelet, k8s-node02     Starting kubelet.
  Warning  Rebooted                 <invalid>                      kubelet, k8s-node02     Node k8s-node02 has been rebooted, boot id: 0768fc77-bb23-4fe9-ae2e-ee987dc16b54
  Normal   NodeReady                <invalid>                      kubelet, k8s-node02     Node k8s-node02 status is now: NodeReady
  Normal   NodeHasNoDiskPressure    <invalid> (x2 over <invalid>)  kubelet, k8s-node02     Node k8s-node02 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     <invalid> (x2 over <invalid>)  kubelet, k8s-node02     Node k8s-node02 status is now: NodeHasSufficientPID
  Normal   NodeHasSufficientMemory  <invalid> (x2 over <invalid>)  kubelet, k8s-node02     Node k8s-node02 status is now: NodeHasSufficientMemory
  Normal   Starting                 <invalid>                      kube-proxy, k8s-node02  Starting kube-proxy.
  Normal   NodeAllocatableEnforced  <invalid>                      kubelet, k8s-node02     Updated Node Allocatable limit across pods
  Normal   Starting                 <invalid>                      kubelet, k8s-node02     Starting kubelet.
  Normal   NodeHasSufficientMemory  <invalid> (x2 over <invalid>)  kubelet, k8s-node02     Node k8s-node02 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    <invalid> (x2 over <invalid>)  kubelet, k8s-node02     Node k8s-node02 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     <invalid> (x2 over <invalid>)  kubelet, k8s-node02     Node k8s-node02 status is now: NodeHasSufficientPID
  Warning  Rebooted                 <invalid>                      kubelet, k8s-node02     Node k8s-node02 has been rebooted, boot id: 0c9f8d85-10ad-4d7f-8c80-e752a420a7e5
  Normal   NodeReady                <invalid>                      kubelet, k8s-node02     Node k8s-node02 status is now: NodeReady
  Normal   Starting                 <invalid>                      kube-proxy, k8s-node02  Starting kube-proxy.

Attempt 2 - delete the externalIP Service and restart kubelet

Next I suspected that the Service I had just created was the problem.

[root@k8s-master01 ~]# kubectl get svc
NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP      PORT(S)   AGE
demo-externalip-service   ClusterIP   10.97.55.241   192.168.15.154   80/TCP    50m
kubernetes                ClusterIP   10.96.0.1      <none>           443/TCP   129d
[root@k8s-master01 ~]# kubectl delete -f services-externalip-demo.yaml 
service "demo-externalip-service" deleted
[root@k8s-master01 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   129d
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE    VERSION
k8s-master01   Ready      master   129d   v1.15.1
k8s-node01     NotReady   <none>   123d   v1.15.1
k8s-node02     NotReady   <none>   123d   v1.15.1
[root@k8s-master01 ~]# systemctl restart kubelet
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE    VERSION
k8s-master01   Ready      master   129d   v1.15.1
k8s-node01     NotReady   <none>   123d   v1.15.1
k8s-node02     NotReady   <none>   123d   v1.15.1

Then I looked at the kubelet log on the worker nodes and found they could not reach port 6443 on the master:

4月 02 22:59:06 k8s-node02 kubelet[46204]: E0402 22:59:06.527365   46204 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168.15.154:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.15.154:6443: connect: connection refused
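
The exact command used to read the kubelet log on the worker isn't shown; on systemd-managed nodes like these CentOS 7 machines it would typically be something along these lines:

journalctl -u kubelet -f                    # follow the kubelet unit log
journalctl -u kubelet --since "1 hour ago"  # or just the recent entries
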

Port 6443 on the master was fine; kube-apiserver was listening:

[root@k8s-master01 ~]# netstat -lntup|grep 6443
tcp6       0      0 :::6443                 :::*                    LISTEN      2017/kube-apiserver 

Next I turned my attention to the Pods in the kube-system namespace:

[root@k8s-master01 ~]# kubectl get po -n kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-5c98db65d4-gskbg               1/1     Running   14         129d
coredns-5c98db65d4-kgrls               1/1     Running   14         129d
etcd-k8s-master01                      1/1     Running   14         129d
kube-apiserver-k8s-master01            1/1     Running   18         129d
kube-controller-manager-k8s-master01   1/1     Running   22         129d
kube-flannel-ds-25ch5                  1/1     Running   8          128d
kube-flannel-ds-lpdbl                  0/1     Error     2          123d
kube-flannel-ds-rk4t5                  0/1     Error     2          123d
kube-proxy-6ksnl                       1/1     Running   14         129d
kube-proxy-fbtln                       0/1     Error     2          123d
kube-proxy-tkwvb                       0/1     Error     2          123d
kube-scheduler-k8s-master01            1/1     Running   26         129d

Describing the flannel Pods stuck in the Error state did not reveal an obvious cause:
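
The command that produced the events below isn't shown in the post; presumably it was a kubectl describe of one of the Error Pods, roughly:

kubectl describe po kube-flannel-ds-lpdbl -n kube-system
kubectl describe po kube-flannel-ds-rk4t5 -n kube-system
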

 Normal  SandboxChanged  <invalid>  kubelet, k8s-node02  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          <invalid>  kubelet, k8s-node02  Container image "jmgao1983/flannel:latest" already present on machine
  Normal  Created         <invalid>  kubelet, k8s-node02  Created container install-cni
  Normal  Started         <invalid>  kubelet, k8s-node02  Started container install-cni
  Normal  Pulled          <invalid>  kubelet, k8s-node02  Container image "jmgao1983/flannel:latest" already present on machine
  Normal  Created         <invalid>  kubelet, k8s-node02  Created container kube-flannel
  Normal  Started         <invalid>  kubelet, k8s-node02  Started container kube-flannel
  Normal  SandboxChanged  <invalid>  kubelet, k8s-node02  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          <invalid>  kubelet, k8s-node02  Container image "jmgao1983/flannel:latest" already present on machine
  Normal  Created         <invalid>  kubelet, k8s-node02  Created container install-cni

Looking at the Pod logs, the error said port 10250 on the worker nodes could not be reached, yet kubelet was in fact listening on 10250 on the workers:

[root@k8s-master01 ~]# kubectl logs -f kube-flannel-ds-lpdbl -n kube-system
Error from server: Get https://192.168.15.152:10250/containerLogs/kube-system/kube-flannel-ds-lpdbl/kube-flannel?follow=true: dial tcp 192.168.15.152:10250: connect: no route to host
[root@k8s-master01 ~]# kubectl logs -f kube-proxy-fbtln -n kube-system
Error from server: Get https://192.168.15.153:10250/containerLogs/kube-system/kube-proxy-fbtln/kube-proxy?follow=true: dial tcp 192.168.15.153:10250: connect: no route to host
[root@k8s-master01 ~]# netstat -lntup|grep 10250
tcp6       0      0 :::10250                :::*                    LISTEN      70516/kubelet

The 10250 service on the worker nodes:

[root@k8s-node01 ~]# netstat -lntup|grep 10250
tcp6       0      0 :::10250                :::*                    LISTEN      46643/kubelet 
[root@k8s-node02 ~]# netstat -lntup|grep 10250
tcp6       0      0 :::10250                :::*                    LISTEN      46204/kubelet 

Everything finally pointed to the error: connect: no route to host.

Ping the master from the worker nodes:

[root@k8s-node01 ~]# ping 192.168.15.154
PING 192.168.15.154 (192.168.15.154) 56(84) bytes of data.
64 bytes from 192.168.15.154: icmp_seq=1 ttl=64 time=0.080 ms
64 bytes from 192.168.15.154: icmp_seq=2 ttl=64 time=0.041 ms
64 bytes from 192.168.15.154: icmp_seq=3 ttl=64 time=0.064 ms
[root@k8s-node02 ~]# ping 192.168.15.154
PING 192.168.15.154 (192.168.15.154) 56(84) bytes of data.
64 bytes from 192.168.15.154: icmp_seq=1 ttl=64 time=0.047 ms
64 bytes from 192.168.15.154: icmp_seq=2 ttl=64 time=0.040 ms

Access the worker nodes from the master:

[root@k8s-master01 ~]# telnet 192.168.15.153 10250
Trying 192.168.15.153...
telnet: connect to address 192.168.15.153: No route to host
[root@k8s-master01 ~]# ping 192.168.15.153
PING 192.168.15.153 (192.168.15.153) 56(84) bytes of data.
From 192.168.15.154 icmp_seq=1 Destination Host Unreachable
From 192.168.15.154 icmp_seq=2 Destination Host Unreachable
From 192.168.15.154 icmp_seq=3 Destination Host Unreachable

I then went over the network configuration on all three machines and found nothing wrong (it had been working all along); the firewall was disabled and the routing table showed nothing unusual:

[root@k8s-master01 ~]# vim /etc/sysconfig/network-scripts/ifcfg-ens33
[root@k8s-master01 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)
[root@k8s-master01 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.15.2    0.0.0.0         UG    100    0        0 ens33
10.244.0.0      0.0.0.0         255.255.255.0   U     0      0        0 cni0
10.244.2.0      10.244.2.0      255.255.255.0   UG    0      0        0 flannel.1
10.244.3.0      10.244.3.0      255.255.255.0   UG    0      0        0 flannel.1
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
192.168.15.0    0.0.0.0         255.255.255.0   U     100    0        0 ens33

Final finding: the master could not ping the worker nodes, while the workers could ping the master.

After rebooting the master and one of the worker nodes, the ping succeeded and the worker automatically re-registered with the cluster:

[root@k8s-master01 ~]# ping www.baidu.com
PING www.a.shifen.com (14.215.177.38) 56(84) bytes of data.
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=1 ttl=128 time=33.2 ms
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=2 ttl=128 time=34.1 ms
^C
--- www.a.shifen.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 33.253/33.701/34.149/0.448 ms
[root@k8s-master01 ~]# ping 192.168.15.153
PING 192.168.15.153 (192.168.15.153) 56(84) bytes of data.
64 bytes from 192.168.15.153: icmp_seq=1 ttl=64 time=0.351 ms
64 bytes from 192.168.15.153: icmp_seq=2 ttl=64 time=0.309 ms
^C
--- 192.168.15.153 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.309/0.330/0.351/0.021 ms
[root@k8s-master01 ~]# kubectl get nodes
NAME           STATUS     ROLES    AGE    VERSION
k8s-master01   Ready      master   129d   v1.15.1
k8s-node01     Ready      <none>   123d   v1.15.1

Summary: the first reboot above probably failed to fix anything because the machines never fully restarted. Check whether every component under kube-system has come up; if some have not, use kubectl logs -f to read their logs and work out where the problem lies. When kube-proxy is the one misbehaving, the cause is usually network connectivity.
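
As a wrap-up, here is a rough checklist of the commands used while chasing this down (a sketch; adjust names and IPs to your own cluster):

kubectl get nodes                                      # which nodes are NotReady
kubectl describe node k8s-node01                       # conditions, taints, last heartbeat
kubectl get po -n kube-system -o wide                  # are flannel / kube-proxy / coredns healthy
kubectl logs -f kube-flannel-ds-lpdbl -n kube-system   # logs of a failing system pod
journalctl -u kubelet -f                               # kubelet log on the affected node
ping 192.168.15.153                                    # basic connectivity from master to node
telnet 192.168.15.153 10250                            # can the master reach the node's kubelet port
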
