Troubleshooting a calico-node Startup Failure on Worker Nodes

Problem Symptoms

The calico-node pod deployed on a worker node cannot come up and keeps restarting. Its logs look like this:

kubectl logs -f calico-node-hv4sf -n kube-system
2020-12-02 13:20:13.067 [INFO][8] startup.go 259: Early log level set to info
2020-12-02 13:20:13.067 [INFO][8] startup.go 275: Using NODENAME environment for node name
2020-12-02 13:20:13.067 [INFO][8] startup.go 287: Determined node name: xxx-work-1
2020-12-02 13:20:13.068 [INFO][8] k8s.go 228: Using Calico IPAM
2020-12-02 13:20:13.069 [INFO][8] startup.go 319: Checking datastore connection
2020-12-02 13:20:16.075 [INFO][8] startup.go 334: Hit error connecting to datastore - retry error=Get https://10.96.0.1:443/api/v1/nodes/foo: dial tcp 10.96.0.1:443: connect: no route to host
2020-12-02 13:20:19.081 [INFO][8] startup.go 334: Hit error connecting to datastore - retry error=Get https://10.96.0.1:443/api/v1/nodes/foo: dial tcp 10.96.0.1:443: connect: no route to host
2020-12-02 13:20:23.087 [INFO][8] startup.go 334: Hit error connecting to datastore - retry error=Get https://10.96.0.1:443/api/v1/nodes/foo: dial tcp 10.96.0.1:443: connect: no route to host
2020-12-02 13:20:27.095 [INFO][8] startup.go 334: Hit error connecting to datastore - retry error=Get https://10.96.0.1:443/api/v1/nodes/foo: dial tcp 10.96.0.1:443: connect: no route to host

Root Cause Analysis

The masters are dual-homed on two networks, 172.31.0.0/16 (private) and 10.0.0.0/24 (management); the workers sit only on 172.31.0.0/16:

Node       Private network   Management network
master-1   172.31.0.26       10.0.0.77
master-2   172.31.0.26       10.0.0.128
master-3   172.31.0.26       10.0.0.154
worker-1   172.31.0.23       -
worker-2   172.31.0.8        -

From a worker node, curl to 10.96.0.1 fails.
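
A quick reproduction from the worker (a diagnostic sketch; the timeout value is arbitrary):

# The Service VIP is unreachable from the worker; curl fails with
# "No route to host", matching the calico-node log above
curl -k --connect-timeout 3 https://10.96.0.1:443/version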

Service info for the apiserver:

kubectl get svc -owide
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE   SELECTOR
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   29h   <none>

The endpoints are on the 10.0.0.0/24 network, because kube-apiserver by default advertises the address of the NIC that holds the default gateway.
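
This is easy to confirm on a master (a sketch; the interface name and example output are illustrative):

# kube-apiserver picks the address of the interface holding the default
# route when --advertise-address is not set explicitly:
ip route show default
# e.g. default via 10.0.0.1 dev eth1    (eth1 = management NIC here)

# What the kubeadm-generated static-pod manifest actually contains:
grep advertise-address /etc/kubernetes/manifests/kube-apiserver.yaml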

Cluster info:

kubectl cluster-info
Kubernetes master is running at https://k8s-cluster-ins-0029-master-vip.service.consul:6443
KubeDNS is running at https://k8s-cluster-ins-0029-master-vip.service.consul:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

kube-apiserver Service and Endpoints:

# kubectl get svc kubernetes
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   25h
# kubectl get ep kubernetes
NAME         ENDPOINTS                                       AGE
kubernetes   10.0.0.133:6443,10.0.0.32:6443,10.0.0.50:6443   25h

Inspect the NAT rules on the worker:

# iptables -t nat -nL
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         
cali-PREROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:6gwbT8clXdHdC1b1 */
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */
DOCKER     all  --  0.0.0.0/0            0.0.0.0/0            ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
cali-OUTPUT  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:tVnHkvAo15HuiPy0 */
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */
DOCKER     all  --  0.0.0.0/0           !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
cali-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:O3lYWMrLQYEMJtB5 */
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0           

Chain DOCKER (2 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0           

Chain KUBE-KUBELET-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-MARK-DROP (0 references)
target     prot opt source               destination         
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (15 references)
target     prot opt source               destination         
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-NODEPORTS (1 references)
target     prot opt source               destination         

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination         
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            mark match ! 0x4000/0x4000
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK xor 0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */

Chain KUBE-PROXY-CANARY (0 references)
target     prot opt source               destination         

Chain KUBE-SEP-6AVEXVWMTAUJHVS6 (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.149.131       0.0.0.0/0           
DNAT       udp  --  0.0.0.0/0            0.0.0.0/0            udp to:10.244.149.131:53



Chain KUBE-SEP-AJAH3OWF36MHDVF7 (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.149.131       0.0.0.0/0           
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.244.149.131:9153

Chain KUBE-SEP-F4NUFHPP6MV3U2FB (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.149.130       0.0.0.0/0           
DNAT       udp  --  0.0.0.0/0            0.0.0.0/0            udp to:10.244.149.130:53

Chain KUBE-SEP-ILEHVTEL5AKI6EAE (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.149.130       0.0.0.0/0           
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.244.149.130:9153

Chain KUBE-SEP-N4P2JU5RW7IWUD2Z (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.229.131       0.0.0.0/0           
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.244.229.131:3443

Chain KUBE-SEP-ORF6FH7KUHVWJER7 (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.149.130       0.0.0.0/0           
DNAT       tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.244.149.130:53

Chain KUBE-SEP-UMPZ2SD2APVNR4IN (1 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.244.149.131       0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.244.149.131:53

Chain KUBE-SEP-6JBY7EOKHF37VPAE (1 references)
target          prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.0.0.50            0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.0.0.50:6443

Chain KUBE-SEP-VW6RD437TCEB4BL4 (1 references)
target          prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.0.0.133           0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.0.0.133:6443

Chain KUBE-SEP-ZCIWMPUBNREXOPRW (1 references)
target          prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.0.0.32            0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.0.0.32:6443

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination         
KUBE-MARK-MASQ  udp  -- !10.244.0.0/16        10.96.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
KUBE-SVC-TCOU7JCQXEZGVUNU  udp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:dns cluster IP */ udp dpt:53
KUBE-MARK-MASQ  tcp  -- !10.244.0.0/16        10.96.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
KUBE-SVC-ERIFXISQEP7F7OF4  tcp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:dns-tcp cluster IP */ tcp dpt:53
KUBE-MARK-MASQ  tcp  -- !10.244.0.0/16        10.96.0.10           /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
KUBE-SVC-JD5MR3NA4I4DYORP  tcp  --  0.0.0.0/0            10.96.0.10           /* kube-system/kube-dns:metrics cluster IP */ tcp dpt:9153
KUBE-MARK-MASQ  tcp  -- !10.244.0.0/16        10.107.150.63        /* orch-operator-system/orch-operator-webhook-service: cluster IP */ tcp dpt:3443
KUBE-SVC-6HOYT5WSPFV75AOP  tcp  --  0.0.0.0/0            10.107.150.63        /* orch-operator-system/orch-operator-webhook-service: cluster IP */ tcp dpt:3443
KUBE-MARK-MASQ             tcp  -- !10.244.0.0/16        10.96.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:443
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  0.0.0.0/0            10.96.0.1            /* default/kubernetes:https cluster IP */ tcp dpt:443
KUBE-NODEPORTS             all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service nodeports; NOTE: this must be the last rule in this chain */ ADDRTYPE match dst-type LOCAL

Chain KUBE-SVC-6HOYT5WSPFV75AOP (1 references)
target     prot opt source               destination         
KUBE-SEP-N4P2JU5RW7IWUD2Z  all  --  0.0.0.0/0            0.0.0.0/0           

Chain KUBE-SVC-ERIFXISQEP7F7OF4 (1 references)
target     prot opt source               destination         
KUBE-SEP-ORF6FH7KUHVWJER7  all  --  0.0.0.0/0            0.0.0.0/0            statistic mode random probability 0.50000000000
KUBE-SEP-UMPZ2SD2APVNR4IN  all  --  0.0.0.0/0            0.0.0.0/0           

Chain KUBE-SVC-JD5MR3NA4I4DYORP (1 references)
target     prot opt source               destination         
KUBE-SEP-ILEHVTEL5AKI6EAE  all  --  0.0.0.0/0            0.0.0.0/0            statistic mode random probability 0.50000000000
KUBE-SEP-AJAH3OWF36MHDVF7  all  --  0.0.0.0/0            0.0.0.0/0           

Chain KUBE-SVC-NPX46M4PTMTKRN6Y (1 references)
target                     prot opt source               destination         
KUBE-SEP-VW6RD437TCEB4BL4  all  --  0.0.0.0/0            0.0.0.0/0            statistic mode random probability 0.33333333349
KUBE-SEP-ZCIWMPUBNREXOPRW  all  --  0.0.0.0/0            0.0.0.0/0            statistic mode random probability 0.50000000000
KUBE-SEP-6JBY7EOKHF37VPAE  all  --  0.0.0.0/0            0.0.0.0/0           

Chain KUBE-SVC-TCOU7JCQXEZGVUNU (1 references)
target     prot opt source               destination         
KUBE-SEP-F4NUFHPP6MV3U2FB  all  --  0.0.0.0/0            0.0.0.0/0            statistic mode random probability 0.50000000000
KUBE-SEP-6AVEXVWMTAUJHVS6  all  --  0.0.0.0/0            0.0.0.0/0           

Chain cali-OUTPUT (1 references)
target     prot opt source               destination         
cali-fip-dnat  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:GBTAv2p5CwevEyJm */

Chain cali-POSTROUTING (1 references)
target     prot opt source               destination         
cali-fip-snat  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:Z-c7XtVd2Bq7s_hA */
cali-nat-outgoing  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:nYKhEzDlr11Jccal */
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:SXWvdsbh4Mw7wOln */ ADDRTYPE match src-type !LOCAL limit-out ADDRTYPE match src-type LOCAL

Chain cali-PREROUTING (1 references)
target     prot opt source               destination         
cali-fip-dnat  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:r6XmIziWUJsdOK6Z */

Chain cali-fip-dnat (2 references)
target     prot opt source               destination         

Chain cali-fip-snat (1 references)
target     prot opt source               destination         

Chain cali-nat-outgoing (1 references)
target     prot opt source               destination         
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* cali:flqWnvo8yq4ULQLa */ match-set cali40masq-ipam-pools src ! match-set cali40all-ipam-pools dst
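
A side note on the statistic rules in KUBE-SVC-NPX46M4PTMTKRN6Y above: iptables evaluates them in order, so the effective per-endpoint probabilities are 1/3, then (1 - 1/3) × 1/2 = 1/3, and the remaining 1/3 falls through to the last rule, i.e. kube-proxy spreads new connections evenly across the three apiserver endpoints.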

From the rules above, traffic to 10.96.0.1 is DNAT'd to 10.0.0.50:6443, 10.0.0.133:6443, or 10.0.0.32:6443. The worker only has an address on 172.31.0.0/16, so these 10.0.0.0/24 targets are unreachable and 10.96.0.1 cannot be accessed. kubectl get ep kubernetes (above) confirms that the apiserver Service forwards to endpoints on the 10.0.0.0/24 network. The relevant KUBE-SEP chains again:


Chain KUBE-SEP-6JBY7EOKHF37VPAE (1 references)
target          prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.0.0.50            0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.0.0.50:6443

Chain KUBE-SEP-VW6RD437TCEB4BL4 (1 references)
target          prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.0.0.133           0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.0.0.133:6443

Chain KUBE-SEP-ZCIWMPUBNREXOPRW (1 references)
target          prot opt source               destination         
KUBE-MARK-MASQ  all  --  10.0.0.32            0.0.0.0/0           
DNAT            tcp  --  0.0.0.0/0            0.0.0.0/0            tcp to:10.0.0.32:6443
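
This can be verified from the worker directly against one of the DNAT targets (a sketch; 10.0.0.50 is one of the endpoint addresses above):

# The worker has no interface on 10.0.0.0/24, so the packet leaves via
# the default gateway on 172.31.0.0/16, which cannot reach the
# management network:
ip route get 10.0.0.50
curl -k --connect-timeout 3 https://10.0.0.50:6443/healthz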


Solution

Specify the advertise NIC/address when creating the cluster.

Configuration on master-1:
cat kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.17.14
imageRepository: harbor.xxx.com/library/k8s.gcr.io
apiServer:
  timeoutForControlPlane: 4m0s
  certSANs:
    - k8s-cluster-ins-0029-master-1.service.consul
    - k8s-cluster-ins-0029-master-2.service.consul
    - k8s-cluster-ins-0029-master-3.service.consul

controlPlaneEndpoint: k8s-cluster-ins-0029-master-vip.service.consul:6443  # SLB VIP and port
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16  # Pod IP range
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /data/etcd

---

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.31.0.26  # the IP this master binds/advertises (private network)
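
With this config in place, master-1 can be initialized as follows (a sketch; --upload-certs uploads the control-plane certificates so the other masters can later join with a certificate key):

kubeadm init --config kubeadm-config.yaml --upload-certs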

Join command for master-2 and master-3 (each passes its own private-network IP):
join_cmd=$(kubeadm token create --print-join-command)
$join_cmd --control-plane --apiserver-advertise-address=${NODE_BIND_IP}

kubeadm token create --print-join-command

kubeadm join k8s-cluster-ins-0029-master-vip.service.consul:6443 --token g9stix.vvinbvdt83ndeyoc     --discovery-token-ca-cert-hash sha256:e966b388406a6a04b78c04d1d2a62b4a6a50799c37c708e5fadf6fabb7481231 
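
For master-2 and master-3 the printed command is extended with the control-plane flags (a sketch; <certificate-key> comes from the kubeadm init --upload-certs output, and 172.31.0.x stands for that master's own private-network IP):

kubeadm join k8s-cluster-ins-0029-master-vip.service.consul:6443 \
    --token g9stix.vvinbvdt83ndeyoc \
    --discovery-token-ca-cert-hash sha256:e966b388406a6a04b78c04d1d2a62b4a6a50799c37c708e5fadf6fabb7481231 \
    --control-plane \
    --certificate-key <certificate-key> \
    --apiserver-advertise-address=172.31.0.x

# After the fix, the kubernetes endpoints should land on the private network:
kubectl get ep kubernetes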

References

Generate a reference configuration:

kubeadm config print init-defaults

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-cluster-ins-0029-master-1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
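
The printed defaults can serve as a starting point for the config used above (a sketch):

kubeadm config print init-defaults > kubeadm-config.yaml
# then edit advertiseAddress, controlPlaneEndpoint, podSubnet, etc.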


Get the cluster's actual kubeadm configuration:

kubectl get cm kubeadm-config -n kube-system -oyaml
apiVersion: v1
data:
  ClusterConfiguration: |
    apiServer:
      certSANs:
      - k8s-cluster-ins-0029-master-1.service.consul
      - k8s-cluster-ins-0029-master-2.service.consul
      - k8s-cluster-ins-0029-master-3.service.consul
      extraArgs:
        advertise-address: 172.31.0.26
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: k8s-cluster-ins-0029-master-vip.service.consul:6443
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /data/etcd
    imageRepository: harbor.xxx.com/library/k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.17.14
    networking:
      dnsDomain: cluster.local
      podSubnet: 10.244.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
  ClusterStatus: |
    apiEndpoints:
      k8s-cluster-ins-0029-master-1:
        advertiseAddress: 10.0.0.77
        bindPort: 6443
      k8s-cluster-ins-0029-master-2:
        advertiseAddress: 10.0.0.128
        bindPort: 6443
      k8s-cluster-ins-0029-master-3:
        advertiseAddress: 10.0.0.154
        bindPort: 6443
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterStatus
kind: ConfigMap
metadata:
  creationTimestamp: "2020-12-03T09:29:09Z"
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "665"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kubeadm-config
  uid: 78961de1-682a-49d0-8c8f-dc6d4e47ca04



https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/control-plane-flags/

https://github.com/kubernetes/kubernetes/issues/33618

https://github.com/kubernetes/kubeadm/blob/master/docs/design/design_v1.9.md#optional-self-hosting

https://idig8.com/2019/08/08/zoujink8skubeadmdajian-kubernetes1-15-1jiqunhuanjing14/

https://feisky.gitbooks.io/kubernetes/content/troubleshooting/network.html

https://github.com/projectcalico/calico/issues/3092
https://github.com/projectcalico/calico/issues/2720
