Notes on the docker network command and the macvlan technique.

To build a Docker network that spans multiple hosts — for example with Docker Swarm — Docker uses the overlay driver by default, which gives relatively poor network performance. With macvlan, performance is much better.

macvlan is a Linux kernel module that lets you configure multiple MAC addresses — that is, multiple virtual NICs — on a single physical NIC, with each interface getting its own IP. macvlan is essentially a NIC virtualization technique, and its biggest advantage is excellent performance: unlike other approaches, macvlan does not create a Linux bridge but connects to the physical network directly over Ethernet. A quick lsmod confirms that it really is a module loaded by the Linux kernel:

[root@linux3 ~]# lsmod | grep mac
macvlan                19239  0 

From modinfo you can see that macvlan was written by Patrick McHardy:

[root@linux3 ~]# modinfo macvlan
filename:       /lib/modules/3.10.0-1160.el7.x86_64/kernel/drivers/net/macvlan.ko.xz
alias:          rtnl-link-macvlan
description:    Driver for MAC address based VLANs
author:         Patrick McHardy <kaber@trash.net>
license:        GPL
retpoline:      Y
rhelversion:    7.9
srcversion:     140D211EA232257B4320276
depends:        
intree:         Y
vermagic:       3.10.0-1160.el7.x86_64 SMP mod_unload modversions 
signer:         CentOS Linux kernel signing key
sig_key:        E1:FD:B0:E2:A7:E8:61:A1:D1:CA:80:A2:3D:CF:0D:BA:3A:A4:AD:F5
sig_hashalgo:   sha256

Normally one NIC carries exactly one MAC address, fixed by the manufacturer at production time. With macvlan, a single NIC can present multiple virtual MAC addresses, so even if your machine has only one physical NIC, it behaves as if it had n of them — no extra hardware required. Picture, for example, two servers, each with only one physical NIC, each virtualizing four NICs with their own MAC addresses via macvlan. What does that buy us?

Note: for the four virtual NICs (each with its own MAC address) to send traffic through the physical NIC, the physical NIC must be put into promiscuous mode with ip link set ens33 promisc on, where ens33 is the name of the physical NIC (ifconfig shows the interface names).
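To see what macvlan does at the kernel level, independent of Docker, here is a minimal sketch using the ip command (the interface name ens33 and the address are examples):

# put the physical NIC into promiscuous mode
ip link set ens33 promisc on

# create a macvlan sub-interface on top of ens33, in bridge mode
ip link add link ens33 name macvlan0 type macvlan mode bridge

# give the new interface its own IP and bring it up
ip addr add 172.16.86.10/24 dev macvlan0
ip link set macvlan0 up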

[root@linux3 ~]# docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
987befb5a3b1   bridge    bridge    local
581ad0a5ce75   host      host      local
0a451b66cef2   none      null      local

[root@linux3 ~]# docker network --help

Usage:  docker network COMMAND

Manage networks

Commands:
  connect     Connect a container to a network
  create      Create a network
  disconnect  Disconnect a container from a network
  inspect     Display detailed information on one or more networks
  ls          List networks
  prune       Remove all unused networks
  rm          Remove one or more networks

Run 'docker network COMMAND --help' for more information on a command.

[root@linux3 ~]# docker network create --help

Usage:  docker network create [OPTIONS] NETWORK

Create a network

Options:
      --attachable           Enable manual container attachment
      --aux-address map      Auxiliary IPv4 or IPv6 addresses used by Network driver (default map[])
      --config-from string   The network from which to copy the configuration
      --config-only          Create a configuration only network
  -d, --driver string        Driver to manage the Network (default "bridge")
      --gateway strings      IPv4 or IPv6 Gateway for the master subnet
      --ingress              Create swarm routing-mesh network
      --internal             Restrict external access to the network
      --ip-range strings     Allocate container ip from a sub-range
      --ipam-driver string   IP Address Management Driver (default "default")
      --ipam-opt map         Set IPAM driver specific options (default map[])
      --ipv6                 Enable IPv6 networking
      --label list           Set metadata on a network
  -o, --opt map              Set driver specific options (default map[])
      --scope string         Control the network's scope
      --subnet strings       Subnet in CIDR format that represents a network segment

The command-line help shows how to create a macvlan network:

[root@linux3 ~]# docker network create --driver macvlan \
>     --subnet=172.16.86.0/24 \
>     --gateway=172.16.86.1 \
>     -o parent=ens33 macvlan_net1
5ab5dd6287bf3177518b573f1006f5500499102ce887618ee8b9a3c94f3fa105

As you can see, a Docker network of type macvlan has been created on this machine:

[root@linux3 ~]# docker network ls
NETWORK ID     NAME           DRIVER    SCOPE
987befb5a3b1   bridge         bridge    local
581ad0a5ce75   host           host      local
5ab5dd6287bf   macvlan_net1   macvlan   local
0a451b66cef2   none           null      local

Check the details with inspect:

[root@linux3 ~]# docker network inspect macvlan_net1
[
    {
        "Name": "macvlan_net1",
        "Id": "5ab5dd6287bf3177518b573f1006f5500499102ce887618ee8b9a3c94f3fa105",
        "Created": "2021-09-13T13:45:15.941104269+08:00",
        "Scope": "local",
        "Driver": "macvlan",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.16.86.0/24",
                    "Gateway": "172.16.86.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "parent": "ens33"
        },
        "Labels": {}
    }
]


Because a macvlan network is meant to span hosts, the command above has to be executed on each participating host. Now start a container on this network with a fixed IP address:

docker run -itd --name box1 --ip=172.16.86.20 --network macvlan_net1 busybox
5c8be65f8680df44296c84df6343f029517619b3777179a34180880bce0a3c08

Start a second container, box2, with IP address .21:

docker run -itd --name box2 --ip=172.16.86.21 --network macvlan_net1 busybox
f2353faf15d47acdeaaac51937a7cc352a59976a70b1b8b11ad7ab1ebad41636


Now ping one from the other — nice, the IPs are mutually reachable:

[root@linux3 ~]# docker exec box2 ping -c 3 172.16.86.20
PING 172.16.86.20 (172.16.86.20): 56 data bytes
64 bytes from 172.16.86.20: seq=0 ttl=64 time=0.138 ms
64 bytes from 172.16.86.20: seq=1 ttl=64 time=0.083 ms
64 bytes from 172.16.86.20: seq=2 ttl=64 time=0.058 ms

--- 172.16.86.20 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.058/0.093/0.138 ms
[root@linux3 ~]# docker exec box1 ping -c 3 172.16.86.21
PING 172.16.86.21 (172.16.86.21): 56 data bytes
64 bytes from 172.16.86.21: seq=0 ttl=64 time=0.052 ms
64 bytes from 172.16.86.21: seq=1 ttl=64 time=0.082 ms
64 bytes from 172.16.86.21: seq=2 ttl=64 time=0.051 ms

--- 172.16.86.21 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.051/0.061/0.082 ms
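Since the macvlan network rides directly on the physical NIC, the same recipe extends across hosts. A sketch, assuming a second host on the same layer-2 segment whose physical NIC is also named ens33:

# on host 2: create the identical macvlan network
docker network create --driver macvlan \
    --subnet=172.16.86.0/24 \
    --gateway=172.16.86.1 \
    -o parent=ens33 macvlan_net1

# start a container with another IP in the same subnet
docker run -itd --name box3 --ip=172.16.86.22 --network macvlan_net1 busybox

# back on host 1: box1 reaches box3 straight over the physical network
docker exec box1 ping -c 3 172.16.86.22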

There is still only one docker0: brctl show confirms that this server has a single bridge. Creating a macvlan-type Docker network does not create a new bridge — no bridge is involved at all, which is exactly why macvlan mode outperforms bridge mode.

[root@linux3 ~]# brctl show
bridge name	bridge id		STP enabled	interfaces
docker0		8000.0242ce22dc79	no		veth03b63f8

[root@linux3 ~]# docker network ls
NETWORK ID     NAME           DRIVER    SCOPE
987befb5a3b1   bridge         bridge    local
581ad0a5ce75   host           host      local
5ab5dd6287bf   macvlan_net1   macvlan   local
0a451b66cef2   none           null      local

[root@linux3 ~]# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        inet6 fe80::42:ceff:fe22:dc79  prefixlen 64  scopeid 0x20<link>
        ether 02:42:ce:22:dc:79  txqueuelen 0  (Ethernet)
        RX packets 177695  bytes 16034206 (15.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 217980  bytes 584716285 (557.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Inspecting the container's network devices with ip link shows that, apart from lo, there is only eth0@if2. The if2 suffix matters: 2 is the interface index of the parent device, and running ip link back on the host makes it clear that the container's eth0 is nothing more than a virtual NIC carved out of the host's ens33 via macvlan.
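A quick way to verify this pairing — a sketch, assuming the containers from above are still running:

# inside the container: list interfaces (expect lo plus eth0@ifN)
docker exec box1 ip link

# on the host: the index N should match the parent NIC, here ens33
ip link show ens33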

Now look at the default bridge network that Docker creates at installation time:

docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
6dbe26dc7d0c   bridge    bridge    local
581ad0a5ce75   host      host      local
0a451b66cef2   none      null      local

[root@linux4 andycui]# docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "6dbe26dc7d0cfeb290ebd653e3db12e4be5918d9f934b26c1964d7ede83c30ba",
        "Created": "2021-08-25T15:17:57.124884782+08:00",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]

bridge networks are isolated networks on a single Engine installation. If you want to create a network that spans multiple Docker hosts each running an Engine, you must create an overlay network.

Docker's default bridge mode is isolated per machine: the Subnet 172.17.0.0/16 above only hands out IPs to containers on this host, it cannot span hosts, and containers on different hosts may well end up with identical IPs. To go cross-host, you must create an overlay network.

By default, though, Docker's overlay networking performs rather poorly: if you create a cross-host Docker network after docker swarm init, it uses the overlay driver, and the performance is not great.

[root@linux3 ~]# docker swarm init --help

Usage:  docker swarm init [OPTIONS]

Initialize a swarm

Options:
      --advertise-addr string                  Advertised address (format: <ip|interface>[:port])
      --autolock                               Enable manager autolocking (requiring an unlock key to start a stopped manager)
      --availability string                    Availability of the node ("active"|"pause"|"drain") (default "active")
      --cert-expiry duration                   Validity period for node certificates (ns|us|ms|s|m|h) (default 2160h0m0s)
      --data-path-addr string                  Address or interface to use for data path traffic (format: <ip|interface>)
      --data-path-port uint32                  Port number to use for data path traffic (1024 - 49151). If no value is set or is set to 0, the default port (4789) is used.
      --default-addr-pool ipNetSlice           default address pool in CIDR format (default [])
      --default-addr-pool-mask-length uint32   default address pool subnet mask length (default 24)
      --dispatcher-heartbeat duration          Dispatcher heartbeat period (ns|us|ms|s|m|h) (default 5s)
      --external-ca external-ca                Specifications of one or more certificate signing endpoints
      --force-new-cluster                      Force create a new cluster from current state
      --listen-addr node-addr                  Listen address (format: <ip|interface>[:port]) (default 0.0.0.0:2377)
      --max-snapshots uint                     Number of additional Raft snapshots to retain
      --snapshot-interval uint                 Number of log entries between Raft snapshots (default 10000)
      --task-history-limit int                 Task history retention limit (default 5)
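For completeness, a minimal sketch of setting up such an overlay network (the advertise address is an example):

# on the manager node: initialize the swarm
docker swarm init --advertise-addr 192.168.1.10

# create an attachable overlay network that spans the swarm
docker network create -d overlay --attachable overlay_net1

# containers started with --network overlay_net1 on any swarm node
# can then reach each other, at the cost of VXLAN encapsulation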

With Kubernetes, there is the flannel CNI plugin, whose performance is also nothing special. Roughly, it works like this (see the sketch after the list):

  1. First there is one large, global IP range, stored in etcd — say a range of 10,000 addresses.
  2. Each host is then leased a small subnet out of it, say 200 addresses.
  3. Containers on each host are assigned IPs only from that host's small range.
  4. flannel has to record which range each host was leased and which IP each container got; all of this lives in etcd.
  5. Every host runs a flanneld process that encapsulates container packets before sending them out through the host NIC, and decapsulates incoming ones before handing them to containers. This encap/decap step is what hurts performance — though it is still usable. Supported backends include udp encapsulation, vxlan, and others; in other words, the container's network packet is wrapped in another packet (e.g. a UDP datagram) before being sent out.
  6. This also means that when container A on host 1 wants to talk to container 2 on host B, the flanneld on host 1 must already know which physical host container 2 lives on — to send the encapsulating UDP packet, it has to know host B's IP address.
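What that bookkeeping looks like in etcd — a sketch using flannel's default key prefix with its etcd v2 backend; the addresses shown are made up:

# the global network config that flannel reads at startup
etcdctl get /coreos.com/network/config
# -> {"Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}

# the per-host subnet leases flanneld has handed out
etcdctl ls /coreos.com/network/subnets
# -> /coreos.com/network/subnets/10.244.1.0-24   (host 1's range)
# -> /coreos.com/network/subnets/10.244.2.0-24   (host B's range)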

Finally, you can also talk to the dockerd daemon through its API — the docker CLI is not the only way to interact with the server, and the API lets you operate on containers on a remote machine just the same. See: Develop with Docker Engine SDKs | Docker Documentation.
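A minimal sketch of hitting the Engine API directly with curl (the API version and the remote address are examples; the TCP variant assumes dockerd was configured to listen on that port):

# list running containers over the local unix socket
curl --unix-socket /var/run/docker.sock http://localhost/v1.41/containers/json

# the same call against a remote dockerd exposed over TCP
curl http://192.168.1.10:2375/v1.41/containers/json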
