16. Node Communication: flannel

Configuration: https://github.com/flannel-io/flannel/blob/master/Documentation/configuration.md
Running flannel: https://github.com/flannel-io/flannel/blob/master/Documentation/running.md
Backends: https://github.com/flannel-io/flannel/blob/master/Documentation/backends.md
Setting up a single-node Kubernetes cluster: installing the flannel CNI network

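This section installs flannel from the binary release, which helps in understanding how flannel works in detail.
To install it inside a k8s cluster instead, the following command can be used: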
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

I. Installing flannel

(demo node: node21)

Get flannel

https://github.com/flannel-io/flannel/releases
Version installed and tested in this article: v0.13.1-rc2

Extract
mkdir /opt/flannel-v0.13.1-rc2
tar xzvf flannel-v0.13.1-rc2-linux-amd64.tar.gz -C /opt/flannel-v0.13.1-rc2/
ln -s /opt/flannel-v0.13.1-rc2  /opt/flannel
Configure

vi /opt/flannel/subnet.env

FLANNEL_NETWORK=172.20.0.0/16
FLANNEL_SUBNET=172.20.21.1/24
FLANNEL_MTU=1500
FLANNEL_IPMASQ=false

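vi /opt/flannel/net-conf.json
In practice this JSON configuration is usually stored in etcd, and flannel is pointed at the etcd servers at startup to fetch it. This walkthrough also uses the etcd approach; the file is shown here only to explain what the configuration items do.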

{
    "Network": "172.20.0.0/16",
    "SubnetLen": 24,
    "SubnetMin": "172.20.21.0",
    "SubnetMax": "172.20.23.0",
    "Backend": {
        "Type": "vxlan",
        "DirectRouting": true
    }
}

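Network: the address pool used by flannel
SubnetLen: the prefix length of the subnet allocated to docker0 on each host
SubnetMin: the lowest subnet that can be allocated
SubnetMax: the highest subnet that can be allocated. In the example above each host gets one /24 subnet, and the allocatable subnets range from 172.20.21.0/24 to 172.20.23.0/24, so this range can serve at most 3 hosts
Backend: how packets are forwarded. The default is udp; host-gw gives the best performance but requires direct layer 2 connectivity between the hosts. Each type has its own options, which are described further below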
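The "at most 3 hosts" figure follows directly from SubnetMin, SubnetMax and SubnetLen. The sketch below redoes that arithmetic; the helper names ip_to_int and count_subnets are made up for this illustration and are not part of flannel.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() (
    IFS=.
    set -- $1   # split the address into its four octets
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Number of subnets of prefix length $3 from SubnetMin ($1) to SubnetMax ($2), inclusive.
count_subnets() (
    min=$(ip_to_int "$1")
    max=$(ip_to_int "$2")
    step=$(( 1 << (32 - $3) ))   # addresses per subnet
    echo $(( (max - min) / step + 1 ))
)

count_subnets 172.20.21.0 172.20.23.0 24   # prints 3
```

Widening SubnetMax to 172.20.30.0 with the same helper would give 10 allocatable subnets.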

Startup script

vi /opt/flannel/flanneld.sh

#!/bin/bash
# --public-ip:      IP reachable by the other nodes, used for inter-host traffic.
#                   Defaults to the IP of the interface used for communication.
# --etcd-endpoints: comma-separated list of etcd endpoints.
# --etcd-prefix:    custom etcd directory to read the configuration from (default: /coreos.com/network);
#                   flannel uses the "config" key under it as the network configuration.
./flanneld \
--public-ip=172.10.10.21 \
--etcd-endpoints=https://172.10.10.12:2379,https://172.10.10.21:2379,https://172.10.10.22:2379 \
--etcd-prefix=/hzw.com/network \
--etcd-keyfile=./cert/client-key.pem \
--etcd-certfile=./cert/client.pem \
--etcd-cafile=./cert/ca.pem \
--kube-subnet-mgr=false \
--iface=ens33 \
--subnet-file=/opt/flannel/subnet.env \
--healthz-ip="0.0.0.0" \
--healthz-port=2401

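--etcd-prefix=/hzw.com/network means that flannel fetches /hzw.com/network/config from etcd at startup, i.e. the final key is etcd-prefix + / + config. The etcd-prefix is effectively a directory in etcd, and the configuration is the config entry in that directory.
chmod +x /opt/flannel/flanneld.sh
Managing the process with supervisor: omitted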

Command-line options
--public-ip="":   IP accessible by other nodes for inter-host communication. Defaults to the IP of the interface being used for communication.
--etcd-endpoints=http://127.0.0.1:4001:   a comma-delimited list of etcd endpoints.
--etcd-prefix=/coreos.com/network:   etcd prefix.
--etcd-keyfile="":   SSL key file used to secure etcd communication.
--etcd-certfile="":   SSL certification file used to secure etcd communication.
--etcd-cafile="":   SSL Certificate Authority file used to secure etcd communication.
--kube-subnet-mgr:   Contact the Kubernetes API for subnet assignment instead of etcd.
--iface="":   interface to use (IP or name) for inter-host communication. Defaults to the interface for the default route on the machine. This can be specified multiple times to check each option in order. Returns the first match found.
--iface-regex="":   regex expression to match the first interface to use (IP or name) for inter-host communication. If unspecified, will default to the interface for the default route on the machine. This can be specified multiple times to check each regex in order. Returns the first match found. This option is superseded by the iface option and will only be used if nothing matches any option specified in the iface options.
--iptables-resync=5:   resync period for iptables rules, in seconds. Defaults to 5 seconds, if you see a large amount of contention for the iptables lock increasing this will probably help.
--subnet-file=/run/flannel/subnet.env:   filename where env variables (subnet and MTU values) will be written to.
--net-config-path=/etc/kube-flannel/net-conf.json:   path to the network configuration file to use
--subnet-lease-renew-margin=60:   subnet lease renewal margin, in minutes.
--ip-masq=false:   setup IP masquerade for traffic destined for outside the flannel network. Flannel assumes that the default policy is ACCEPT in the NAT POSTROUTING chain.
-v=0:   log level for V logs. Set to 1 to see messages related to data path.
--healthz-ip="0.0.0.0":   The IP address for healthz server to listen (default "0.0.0.0")
--healthz-port=0:   The port for healthz server to listen(0 to disable)
--version:   print version and exit

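Any command-line option can also be supplied as an environment variable of the flanneld process: prefix the option name with FLANNELD_, strip the leading dashes, convert the letters to upper case and replace all remaining dashes with underscores. For example, --etcd-endpoints can be replaced by FLANNELD_ETCD_ENDPOINTS. Note that subnet.env is a file flannel writes to (see --subnet-file above), not the place where these variables are read from.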
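The naming rule is mechanical, so it can be sketched as a small helper. flag_to_env is a hypothetical name used only for this illustration; the FLANNELD_ prefix is the one given in flannel's configuration documentation.

```shell
# Map a flanneld command-line flag to its environment variable name.
flag_to_env() (
    name=${1%%=*}                        # drop any "=value" suffix
    name=$(printf '%s' "$name" \
        | sed -e 's/^-*//' -e 's/-/_/g' \
        | tr '[:lower:]' '[:upper:]')    # strip leading dashes, "-" to "_", upper-case
    printf 'FLANNELD_%s\n' "$name"
)

flag_to_env --etcd-endpoints                   # prints FLANNELD_ETCD_ENDPOINTS
flag_to_env --etcd-prefix=/hzw.com/network     # prints FLANNELD_ETCD_PREFIX
```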

Writing the flannel network configuration to etcd

etcdctl put /hzw.com/network/config '{"Network": "172.20.0.0/16","SubnetLen": 24,"SubnetMin": "172.20.21.0","SubnetMax": "172.20.23.0","Backend": {"Type": "host-gw"}}'

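The default key is /coreos.com/network/config

We verify with host-gw mode first.
Pitfall: this flannel version has a problem with the v3 API of etcd 3.4 (with the v3 API, the content returned by an etcd get carries the key as well, and flannel's JSON parsing fails when it fetches the config). etcd must be started with v2 support enabled (--enable-v2), and the key must then be set through the v2 API as follows: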
ETCDCTL_API=2 etcdctl set /hzw.com/network/config '{"Network": "172.20.0.0/16","SubnetLen": 24,"SubnetMin": "172.20.21.0","SubnetMax": "172.20.23.0","Backend": {"Type": "host-gw"}}'
Related issue: https://github.com/flannel-io/flannel/issues/1191

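vxlan: "Backend":{"Type": "vxlan", "DirectRouting": true}
In vxlan mode with DirectRouting set to true, containers whose hosts can reach each other directly at layer 2 communicate host-gw style (plain routes); the vxlan tunnel is used only between containers whose hosts are on different L2 segments.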

Start flannel and verify
  • Start
    Configure flannel on both node21 and node22 and start it
-- Startup log (node21 as an example)
I0330 11:32:25.724206   75822 main.go:532] Using interface with name ens33 and address 172.10.10.21
I0330 11:32:25.724308   75822 main.go:545] Using 172.10.10.21 as external address
2021-03-30 11:32:25.724894 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
I0330 11:32:25.724969   75822 main.go:253] Created subnet manager: Etcd Local Manager with Previous Subnet: 172.20.21.0/24
I0330 11:32:25.724977   75822 main.go:256] Installing signal handlers
I0330 11:32:25.725820   75822 main.go:592] Start healthz server on 0.0.0.0:2401
I0330 11:32:25.743936   75822 main.go:391] Found network config - Backend type: host-gw     --- fetched from /hzw.com/network/config in etcd
I0330 11:32:25.750353   75822 local_manager.go:147] Found lease (172.20.21.0/24) for current IP (172.10.10.21), reusing
I0330 11:32:25.752940   75822 main.go:314] Changing default FORWARD chain policy to ACCEPT      --- changes the iptables rules so that the FORWARD default policy is ACCEPT
I0330 11:32:25.753194   75822 main.go:322] Wrote subnet file to /opt/flannel/subnet.env           --- flannel writes the configuration to the specified subnet.env file
I0330 11:32:25.753203   75822 main.go:326] Running backend.
I0330 11:32:25.753890   75822 route_network.go:53] Watching for new subnet leases
I0330 11:32:25.762672   75822 main.go:434] Waiting for 22h59m59.988408092s to renew lease
  • Inspect the contents in etcd
alias etcdctl2='ETCDCTL_API=2 etcdctl'     # alias that accesses etcd through the v2 API
[root@node21 ~]# etcdctl2 ls /hzw.com/network
/hzw.com/network/config
/hzw.com/network/subnets
[root@node21 ~]# etcdctl2 ls /hzw.com/network/subnets
/hzw.com/network/subnets/172.20.21.0-24
/hzw.com/network/subnets/172.20.22.0-24
[root@node21 ~]# etcdctl2 get /hzw.com/network/config
{
    "Network": "172.20.0.0/16",
    "SubnetLen": 24,
    "SubnetMin": "172.20.21.0",
    "SubnetMax": "172.20.23.0",
    "Backend":{
        "Type": "vxlan", 
        "DirectRouting": true
    }
}
[root@node21 ~]# etcdctl2 get /hzw.com/network/subnets/172.20.21.0-24
{"PublicIP":"172.10.10.21","BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"6a:ee:94:5b:6f:b7"}}
[root@node21 ~]# etcdctl2 get /hzw.com/network/subnets/172.20.22.0-24
{"PublicIP":"172.10.10.22","BackendType":"vxlan","BackendData":{"VNI":1,"VtepMAC":"46:bc:db:69:df:47"}}
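  • Verify container communication
    Container 1 on node21 (172.20.21.2) ----accesses----> container 2 on node22 (172.20.22.2)
    The test shows that containers on different nodes can reach each other, but container 2 sees the IP of container 1's host rather than the IP of container 1.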
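  • Problem 1
    When a container receives a request from a container on another host, the source IP it sees is the other host's IP. This is a problem: the iptables SNAT policy needs to be adjusted so that traffic within 172.20.0.0/16 is not source-NATed.
    Inspect the relevant iptables rules: iptables-save | grep -i postrouting
    The culprit for problem 1 is the following rule, which needs to be adjusted: -A POSTROUTING -s 172.20.21.0/24 ! -o docker0 -j MASQUERADE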
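  • Problem 2
    In the experiment, with the configuration above, flannel picks a subnet at random at startup, so node 172.10.10.21 may end up with subnet 172.20.22.1/24.
    What I want is for the containers on 172.10.10.21 to use 172.20.21.1/24. With that convention the node can be read off the container IP, e.g. container 172.20.21.2 is known to be on node 172.10.10.21. With random subnets this becomes painful.
    Is there a way to pin the subnet a node obtains?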
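  • Problem 3
    The container IP range is controlled by Docker. Above, things only worked because the flannel configuration and the subnet in Docker's daemon.json were manually kept in sync. Docker's range and flannel's subnet allocation are in fact two separate things, so Docker's subnet configuration needs to be tied to flannel. How can this be done?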
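II. Tuning the iptables rules so that container-to-container traffic is not source-NATed

This addresses problem 1 above.

The following rule needs to be adjusted: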
[root@node21 ~]# iptables-save | grep POSTROUTING
...
-A POSTROUTING -s 172.20.21.0/24 ! -o docker0 -j MASQUERADE
...

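Explanation of the rule:
POSTROUTING: the POSTROUTING chain of the nat table
-s 172.20.21.0/24 ! -o docker0: match packets whose source address is in 172.20.21.0/24 and whose outgoing interface is not docker0
-j MASQUERADE: apply MASQUERADE. This differs slightly from SNAT: with IP masquerading there is no need to specify which IP to masquerade as, because the IP is read from the outgoing network interface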

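According to this rule, every packet whose source address is in 172.20.21.0/24 (the container subnet on node21) and which does not leave through docker0 (containers on the same node communicate through docker0) has its source address replaced with the IP of the host's network interface. That is why other nodes never see the container IP as the source address.
So if packets destined for the overall container network are excluded from this rule and are not masqueraded, the problem should go away.
1. Delete the original rule
iptables -t nat -D POSTROUTING -s 172.20.21.0/24 ! -o docker0 -j MASQUERADE
2. Re-add the rule with one extra match, ! -d 172.20.0.0/16 (the destination is not in the overall network from which flannel assigns the per-node container subnets)
iptables -t nat -I POSTROUTING -s 172.20.21.0/24 ! -o docker0 ! -d 172.20.0.0/16 -j MASQUERADE
Check the new rule
-A POSTROUTING -s 172.20.21.0/24 ! -d 172.20.0.0/16 ! -o docker0 -j MASQUERADE
The same change has to be made on every other node, using that node's own subnet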
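Since the same two commands have to be repeated on every node with a different source subnet, a small generator can help avoid typos. The sketch below only prints the commands and does not touch iptables; masq_fix_cmds is a hypothetical helper name, and the bridge name defaults to docker0 as in the rules above.

```shell
# Print the iptables commands that swap the MASQUERADE rule for one node.
# $1: the node's container subnet, $2: the overall flannel network, $3: bridge (default docker0)
masq_fix_cmds() (
    node_subnet=$1
    cluster_net=$2
    bridge=${3:-docker0}
    # Remove the rule that masquerades everything leaving the node subnet.
    echo "iptables -t nat -D POSTROUTING -s $node_subnet ! -o $bridge -j MASQUERADE"
    # Re-add it, skipping packets whose destination is inside the flannel network.
    echo "iptables -t nat -I POSTROUTING -s $node_subnet ! -d $cluster_net ! -o $bridge -j MASQUERADE"
)

# Example for node22; review the output before running it as root.
masq_fix_cmds 172.20.22.0/24 172.20.0.0/16
```

Printing instead of executing keeps the step reviewable; once the output looks right it can be pasted into a root shell on that node.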
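Result: with the new rule in place, traffic between containers on different nodes is no longer masqueraded, so the receiving container sees the IP of the sending container itself.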

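III. Using the flannel-assigned subnet as the Docker container subnet
1. Generate the Docker configuration from the flannel subnet information
  • mk-docker-opts.sh
    The flannel installation directory contains a script, mk-docker-opts.sh, that translates flannel's subnet.env into Docker option parameters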
[root@node21 flannel]# ll
-rwxr-xr-x 1 etcd etcd 45687144 2月   9 04:42 flanneld
-rwxr-xr-x 1 root root      414 3月  30 16:22 flanneld.sh
-rwxr-xr-x 1 etcd etcd     2139 5月  29 2019 mk-docker-opts.sh
-rw-rw-r-- 1 etcd etcd     4642 2月   9 04:29 README.md
-rw-r--r-- 1 root root       98 4月   7 04:21 subnet.env
  • Run it
    ./mk-docker-opts.sh -f subnet.env -d /run/docker_opts.env
    -f: path to flannel's subnet.env file
    -d: path of the file to generate (defaults to /run/docker_opts.env if omitted)
  • Check the file contents
[root@node21 flannel]# cat subnet.env 
FLANNEL_NETWORK=172.20.0.0/16
FLANNEL_SUBNET=172.20.21.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
[root@node21 flannel]# cat /run/docker_opts.env 
DOCKER_OPT_BIP="--bip=172.20.21.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=true"
DOCKER_OPT_MTU="--mtu=1450"
DOCKER_OPTS=" --bip=172.20.21.1/24 --ip-masq=true --mtu=1450"
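To make the mapping between the two files explicit, here is a much simplified sketch of the translation. It is not the real mk-docker-opts.sh: docker_opts_from_subnet_env is a hypothetical function, and the inversion of the masquerade flag is inferred from the output above (FLANNEL_IPMASQ=false became --ip-masq=true).

```shell
# Derive Docker daemon options from a flannel subnet.env file (simplified sketch).
docker_opts_from_subnet_env() (
    # The "." builtin searches PATH for names without a slash, so anchor bare names to ./
    case $1 in
        */*) env_file=$1 ;;
        *)   env_file=./$1 ;;
    esac
    # Load FLANNEL_SUBNET, FLANNEL_MTU and FLANNEL_IPMASQ.
    . "$env_file"
    # Only one side should masquerade: Docker does it when flannel does not.
    if [ "$FLANNEL_IPMASQ" = "true" ]; then
        ipmasq=false
    else
        ipmasq=true
    fi
    echo "--bip=$FLANNEL_SUBNET --ip-masq=$ipmasq --mtu=$FLANNEL_MTU"
)
```

Calling docker_opts_from_subnet_env /opt/flannel/subnet.env with the file shown above would yield the same three options as the DOCKER_OPTS line.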
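2. Make Docker use docker_opts.env
  • Remove the subnet settings from daemon.json so that the subnet is determined by docker_opts.env, referenced from docker.service (skip this step if daemon.json has no such settings)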
{
    "insecure-registries": ["harbor.hzwod.com"],
    "bip": "172.20.22.1/24",
    "fixed-cidr": "172.20.22.1/25",
    "exec-opts": ["native.cgroupdriver=systemd"],
    "live-restore": true
}
Change it to:
{
    "insecure-registries": ["harbor.hzwod.com"],
    "exec-opts": ["native.cgroupdriver=systemd"],
    "live-restore": true
}
  • Edit docker.service
    vi /lib/systemd/system/docker.service
......
[Service]
Type=notify
EnvironmentFile=/run/docker_opts.env
ExecStart=/usr/bin/dockerd ${DOCKER_OPTS} -H fd:// --containerd=/run/containerd/containerd.sock
......
  • Restart docker
    systemctl daemon-reload
    systemctl restart docker.service
Recommended backends
VXLAN

Use in-kernel VXLAN to encapsulate the packets.
Type and options:

  • Type (string): vxlan
  • VNI (number): VXLAN Identifier (VNI) to be used. On Linux, defaults to 1. On Windows should be greater than or equal to 4096.
  • Port (number): UDP port to use for sending encapsulated packets. On Linux, defaults to kernel default, currently 8472, but on Windows, must be 4789.
  • GBP (Boolean): Enable VXLAN Group Based Policy. Defaults to false. GBP is not supported on Windows
  • DirectRouting (Boolean): Enable direct routes (like host-gw) when the hosts are on the same subnet. VXLAN will only be used to encapsulate packets to hosts on different subnets. Defaults to false. DirectRouting is not supported on Windows.
  • MacPrefix (String): Only use on Windows, set to the MAC prefix. Defaults to 0E-2A.
host-gw

Use host-gw to create IP routes to subnets via remote machine IPs. Requires direct layer2 connectivity between hosts running flannel.
host-gw provides good performance, with few dependencies, and easy set up.
Type:

  • Type (string): host-gw
UDP

Use UDP only for debugging if your network and kernel prevent you from using VXLAN or host-gw.
Type and options:

  • Type (string): udp
  • Port (number): UDP port to use for sending encapsulated packets. Defaults to 8285.