Docker Network Configuration: Creating Bridges

Table of Contents

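Environment

Introduction

About Namespaces

About Veth Pairs

Building My Own Docker Network

Milestone 1: Ping Within a Host

Milestone 2: External Access via Host NAT

Milestone 3: SSH from a Host to a Remote Container

Milestone 4: Flat Bridged Private Network for Containers

Milestone 5: Router Between the Host and Container Networks (FAILED)

Milestone 6: Just Use Host 1 and Container test1.1 as the Router (FAILED)

Milestone 7: Finally Got the Router Working

Future

Recommended Reading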


 

Environment

3 node VMs (also called hosts), each with docker installed.

  • VM1 / Host 1: 10.32.171.202 centos7
  • VM2 / Host 2: 10.32.171.203 centos7
  • VM3 / Host 3: 10.32.171.204 centos7

First, let's set up the lab environment. On each node:

docker pull centos:7
docker run -d --name test1 centos:7 /bin/bash -c "while true; do sleep 3600; done"  # name test2 on VM2, test3 on VM3

# to connect to the container
docker exec -it test1 bash

Make sure the kernel is willing to forward IP packets.

$ sysctl net.ipv4.conf.all.forwarding=1
$ sysctl net.ipv4.conf.all.forwarding
net.ipv4.conf.all.forwarding = 1
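The sysctl call above only changes the running kernel. As a small sketch, the same value can be read directly from procfs to confirm it on each host. The `FWD_FILE` variable and the `forwarding_state` helper are names made up here for illustration.

```shell
# procfs file behind the net.ipv4.conf.all.forwarding sysctl key.
FWD_FILE=${FWD_FILE:-/proc/sys/net/ipv4/conf/all/forwarding}

# Print "enabled" if the file contains 1, otherwise "disabled".
forwarding_state() {
    if [ "$(cat "$FWD_FILE" 2>/dev/null)" = "1" ]; then
        echo enabled
    else
        echo disabled
    fi
}

forwarding_state
```

To keep the setting across reboots, the line `net.ipv4.conf.all.forwarding = 1` can be placed in a file under /etc/sysctl.d/ (the file name is up to you).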

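Introduction

After installation, docker creates the bridge docker0. When a container is created, a veth pair (veth2a6e52a) links it to docker0. See the official documentation. Note that ens32 is my host's "eth0".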

# VM1
$ ip li
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
    link/ether 00:50:56:98:61:4c brd ff:ff:ff:ff:ff:ff
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT
    link/ether 56:84:7a:fe:97:99 brd ff:ff:ff:ff:ff:ff
25: veth2a6e52a: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT
    link/ether 56:7f:46:26:0d:cb brd ff:ff:ff:ff:ff:ff

$ brctl show
bridge name     bridge id               STP enabled     interfaces
docker0         8000.56847afe9799       no              veth2a6e52a

$ bridge li
25: veth2a6e52a state UP : <BROADCAST,UP,LOWER_UP> mtu 1500 master docker0 state forwarding priority 32 cost 2

Docker uses routes and iptables (note the MASQUERADE rule) to create a NAT network for containers, so that they can reach the outside.

$ ip route
default via 10.32.171.1 dev ens32
10.32.171.0/24 dev ens32  proto kernel  scope link  src 10.32.171.202
169.254.0.0/16 dev ens32  scope link  metric 1002
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.42.1

$ iptables --list -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
PREROUTING_direct  all  --  anywhere             anywhere
PREROUTING_ZONES_SOURCE  all  --  anywhere             anywhere
PREROUTING_ZONES  all  --  anywhere             anywhere
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
OUTPUT_direct  all  --  anywhere             anywhere
DOCKER     all  --  anywhere            !loopback/8           ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.17.0.0/16        anywhere
POSTROUTING_direct  all  --  anywhere             anywhere
POSTROUTING_ZONES_SOURCE  all  --  anywhere             anywhere
POSTROUTING_ZONES  all  --  anywhere             anywhere
MASQUERADE  tcp  --  172.17.0.4           172.17.0.4           tcp dpt:https
MASQUERADE  tcp  --  172.17.0.4           172.17.0.4           tcp dpt:6611
MASQUERADE  tcp  --  172.17.0.4           172.17.0.4           tcp dpt:7072
MASQUERADE  tcp  --  172.17.0.4           172.17.0.4           tcp dpt:http
MASQUERADE  tcp  --  172.17.0.4           172.17.0.4           tcp dpt:9011

Chain DOCKER (2 references)
target     prot opt source               destination

Chain OUTPUT_direct (1 references)
target     prot opt source               destination

Chain POSTROUTING_ZONES (1 references)
target     prot opt source               destination
POST_public  all  --  anywhere             anywhere            [goto]
POST_public  all  --  anywhere             anywhere            [goto]

Chain POSTROUTING_ZONES_SOURCE (1 references)
target     prot opt source               destination

Chain POSTROUTING_direct (1 references)
target     prot opt source               destination

Chain POST_public (2 references)
target     prot opt source               destination
POST_public_log  all  --  anywhere             anywhere
POST_public_deny  all  --  anywhere             anywhere
POST_public_allow  all  --  anywhere             anywhere

Chain POST_public_allow (1 references)
target     prot opt source               destination

Chain POST_public_deny (1 references)
target     prot opt source               destination

Chain POST_public_log (1 references)
target     prot opt source               destination

Chain PREROUTING_ZONES (1 references)
target     prot opt source               destination
PRE_public  all  --  anywhere             anywhere            [goto]
PRE_public  all  --  anywhere             anywhere            [goto]

Chain PREROUTING_ZONES_SOURCE (1 references)
target     prot opt source               destination

Chain PREROUTING_direct (1 references)
target     prot opt source               destination

Chain PRE_public (2 references)
target     prot opt source               destination
PRE_public_log  all  --  anywhere             anywhere
PRE_public_deny  all  --  anywhere             anywhere
PRE_public_allow  all  --  anywhere             anywhere

Chain PRE_public_allow (1 references)
target     prot opt source               destination

Chain PRE_public_deny (1 references)
target     prot opt source               destination

Chain PRE_public_log (1 references)
target     prot opt source               destination
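The full listing is long, and the only rules that matter for container NAT are the MASQUERADE entries. A minimal sketch for pulling them out of `iptables -t nat -S POSTROUTING` output follows. The `masq_rules` helper is a hypothetical name, and it simply filters whatever text it receives on stdin.

```shell
# Keep only the MASQUERADE rules from `iptables -t nat -S POSTROUTING`
# output read on stdin.
masq_rules() {
    grep -- '-j MASQUERADE'
}

# Typical use on a host (needs root):
#   iptables -t nat -S POSTROUTING | masq_rules
```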

About Namespaces

Containers use different namespaces from other processes. To see this (see the namespace guides):

# VM1
$ ll /proc/3378/ns  # docker process
total 0
lrwxrwxrwx 1 root root 0 May 11 14:32 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 May 11 14:32 mnt -> mnt:[4026532442]
lrwxrwxrwx 1 root root 0 May 11 14:32 net -> net:[4026531956]
lrwxrwxrwx 1 root root 0 May 11 14:32 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 May 11 14:32 uts -> uts:[4026531838]
$ ll /proc/1/ns     # systemd process
total 0
lrwxrwxrwx 1 root root 0 May 11 14:33 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 May 11 14:33 mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 May 11 14:33 net -> net:[4026531956]
lrwxrwxrwx 1 root root 0 May 11 14:33 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 May 11 14:33 uts -> uts:[4026531838]
$ ll /proc/4718/ns/   # container test1, the ns is different
total 0
lrwxrwxrwx 1 root root 0 May  8 16:44 ipc -> ipc:[4026532453]
lrwxrwxrwx 1 root root 0 May  8 16:44 mnt -> mnt:[4026532451]
lrwxrwxrwx 1 root root 0 May  8 16:44 net -> net:[4026532456]
lrwxrwxrwx 1 root root 0 May  8 16:44 pid -> pid:[4026532454]
lrwxrwxrwx 1 root root 0 May  8 16:44 uts -> uts:[4026532452]

However, ip netns shows nothing. Why? Docker deletes the network namespace info by default.

# However, ip netns shows nothing (root). Why?
$ ip netns list
$

# New network namespaces should be visible in /var/run/netns. But docker deletes them by default
$ ip netns add blue
$ ls /var/run/netns/
blue
$ ip netns delete blue

# Let's restore this netns info
$ docker inspect --format='{{ .State.Pid }}' test1    # show pid of test1
4718
$ ln -s /proc/4718/ns/net /var/run/netns/4718

# View network info inside my test1 container
$ ip netns
4718
$ ip netns exec 4718 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
32: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:ac:11:00:01 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:1/64 scope link
       valid_lft forever preferred_lft forever

# Show the peer index of the host-side end of the veth pair
$ ethtool -S veth523170f
NIC statistics:
     peer_ifindex: 32    # Note the 32 shown above

# Restore everything to default
$ rm -f /var/run/netns/4718

Question: how do I create my own namespace from the shell and launch a process in it, or move a process into it?
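One possible answer, sketched under assumptions: `ip netns add` creates a named network namespace, `ip netns exec` starts a new process inside it, and `nsenter -t <pid> -n` starts a new process inside the network namespace of an already running process. As far as I know, a running process cannot be pushed into another namespace from outside; it has to join by itself. The namespace name `demo` and both helper names are made up for illustration, and the helpers only print the commands because running them needs root.

```shell
# Print (not run) the commands for a named network namespace.
# Pipe the output to `sh` as root to actually execute them.
show_netns_demo() {
    ns=${1:-demo}
    cat <<EOF
ip netns add ${ns}
ip netns exec ${ns} ip link set lo up
ip netns exec ${ns} ip addr show
ip netns delete ${ns}
EOF
}

# Print the command that runs `ip addr show` inside the network namespace
# of an already running process, given its PID.
show_nsenter() {
    echo "nsenter -t $1 -n ip addr show"
}

show_netns_demo demo
show_nsenter 4718    # 4718 is the PID of test1 in the listing above
```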

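About Veth Pairs

Veth pairs are commonly used for communication between different network namespaces. Veth pairs are reported to have performance issues. Some suggest using OVS patch ports instead, which are said to outperform veth pairs.

Building My Own Docker Network

This is a series of experiments in building networks between hosts and containers. In the end, I will build a private network 192.168.7.0/24 for the containers, separate from the host network 10.32.171.0/24. A virtual router (implemented as a network namespace) connects the two networks, so that hosts and containers can SSH to each other.

Milestone 1: Ping Within a Host

First, on each host, remove the original docker bridge and iptables rules. Some of this section is adapted from other sources.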

service docker stop
ip link set dev docker0 down
brctl delbr docker0
iptables -t nat -F POSTROUTING

Add my own bridges

brctl addbr bridge0
ip addr add 192.168.5.1/24 dev bridge0
ip link set dev bridge0 up

brctl addbr bridge1
ip addr add 192.168.6.1/24 dev bridge1
ip link set dev bridge1 up

Start the docker service with bridge0 and without letting it modify iptables. See Fabien on how to configure docker startup options on CentOS.

# Append the following to OPTIONS in /etc/sysconfig/docker
-b=bridge0 --iptables=false    # actually, bridge0 can be whatever
# Start docker service
service docker start

On each host, start a centos test container with no network configuration.

$ docker run -d --name test1.1 --net=none centos:7 /bin/bash -c "while true; do sleep 3600; done"  # name test2.1 on VM2, test3.1 on VM3
$ docker exec -it test1.1 bash

# inside test1.1, no network at all
$ ip li
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
$ ip ro
$

On each host, create 2 NICs for my test container

# Restore the network namespace info
export TEST_PID=$(docker inspect --format='{{ .State.Pid }}' test1.1)    # change to test2.1 on VM2, test3.1 on VM3
ip netns add blue && ip netns delete blue    # to ensure /var/run/netns folder exists
ln -s /proc/${TEST_PID}/ns/net /var/run/netns/${TEST_PID}

# Create veth pairs for container
ip link add ${TEST_PID}.eth0 type veth peer name veth0
ip link add ${TEST_PID}.eth1 type veth peer name veth1

# Assign veth pairs to container
ip li set veth0 netns ${TEST_PID}
ip li set veth1 netns ${TEST_PID}

# Add NIC to bridges
brctl addif bridge0 ${TEST_PID}.eth0
brctl addif bridge1 ${TEST_PID}.eth1
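The same three steps (create the veth pair, move one end into the container, plug the other end into a bridge) repeat for every NIC. As a sketch, they can be generated by one helper. The name `attach_nic_cmds` is hypothetical, the helper only prints the commands so they can be reviewed first, and it assumes the /var/run/netns/<pid> symlink has already been created as shown above.

```shell
# Print the commands that give a container one extra NIC.
#   $1 = container PID (also the netns name under /var/run/netns)
#   $2 = NIC index (0, 1, 2, ...)
#   $3 = host bridge to plug the host-side end into
attach_nic_cmds() {
    pid=$1 idx=$2 bridge=$3
    echo "ip link add ${pid}.eth${idx} type veth peer name veth${idx}"
    echo "ip link set veth${idx} netns ${pid}"
    echo "brctl addif ${bridge} ${pid}.eth${idx}"
}

attach_nic_cmds 4718 0 bridge0    # 4718 stands in for ${TEST_PID}
attach_nic_cmds 4718 1 bridge1
```

Piping the output to `sh` as root would apply it. Note that Linux interface names are limited to 15 characters, so a long PID plus the `.ethN` suffix still has to fit.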

Inside the container, you can see these new NICs

$ docker exec -it test1.1 bash
$ ip li
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
47: veth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 1000
    link/ether 6e:21:24:67:71:fd brd ff:ff:ff:ff:ff:ff
49: veth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT qlen 1000
    link/ether 6e:14:8b:d3:de:a5 brd ff:ff:ff:ff:ff:ff

On each host, give each of my container NICs a new IP in the corresponding bridge network. (I am just faking a DHCP service.)

# Host 1
ip netns exec ${TEST_PID} ip addr add 192.168.5.2 dev veth0
ip netns exec ${TEST_PID} ip addr add 192.168.6.2 dev veth1

# Host 2
ip netns exec ${TEST_PID} ip addr add 192.168.5.3 dev veth0
ip netns exec ${TEST_PID} ip addr add 192.168.6.3 dev veth1

# Host 3
ip netns exec ${TEST_PID} ip addr add 192.168.5.4 dev veth0
ip netns exec ${TEST_PID} ip addr add 192.168.6.4 dev veth1
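The addresses follow a simple pattern: the container on host N gets .(N+1) in each bridge network. A small sketch of that "fake DHCP" rule follows. The helper names `container_ip` and `addr_cmds` are made up, and the commands are only printed.

```shell
# Container address in network 192.168.<third octet>.0/24 for host number N.
#   $1 = third octet (5 for bridge0, 6 for bridge1)
#   $2 = host number (1, 2 or 3)
container_ip() {
    echo "192.168.$1.$(( $2 + 1 ))"
}

# Print the two `ip addr add` commands for one host.
#   $1 = container PID, $2 = host number
addr_cmds() {
    pid=$1 host=$2
    echo "ip netns exec ${pid} ip addr add $(container_ip 5 "$host") dev veth0"
    echo "ip netns exec ${pid} ip addr add $(container_ip 6 "$host") dev veth1"
}

addr_cmds 4718 1    # 4718 stands in for ${TEST_PID} on host 1
```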

On each host, bring up all interfaces

ip netns exec ${TEST_PID} ip li set veth0 up
ip netns exec ${TEST_PID} ip li set veth1 up
ip li set ${TEST_PID}.eth0 up
ip li set ${TEST_PID}.eth1 up

On each host, configure the routes inside the container

# Host 1
ip netns exec ${TEST_PID} ip route add 192.168.5.0/24 dev veth0
ip netns exec ${TEST_PID} ip route add 192.168.6.0/24 dev veth1
ip netns exec ${TEST_PID} ip route add default via 192.168.5.1

# Host 2
ip netns exec ${TEST_PID} ip route add 192.168.5.0/24 dev veth0
ip netns exec ${TEST_PID} ip route add 192.168.6.0/24 dev veth1
ip netns exec ${TEST_PID} ip route add default via 192.168.5.1

# Host 3
ip netns exec ${TEST_PID} ip route add 192.168.5.0/24 dev veth0
ip netns exec ${TEST_PID} ip route add 192.168.6.0/24 dev veth1
ip netns exec ${TEST_PID} ip route add default via 192.168.5.1

At this point, you should be able to ping the container from its own host. But you cannot ping it from the other hosts.

# Host 1
$ ping 192.168.5.2
PING 192.168.5.2 (192.168.5.2) 56(84) bytes of data.
64 bytes from 192.168.5.2: icmp_seq=1 ttl=64 time=0.188 ms
...
$ ping 192.168.6.2
PING 192.168.6.2 (192.168.6.2) 56(84) bytes of data.
64 bytes from 192.168.6.2: icmp_seq=1 ttl=64 time=0.160 ms
...

# Host 3
$ ping 192.168.5.4
PING 192.168.5.4 (192.168.5.4) 56(84) bytes of data.
64 bytes from 192.168.5.4: icmp_seq=1 ttl=64 time=0.275 ms
...
$ ping 192.168.6.4
PING 192.168.6.4 (192.168.6.4) 56(84) bytes of data.
64 bytes from 192.168.6.4: icmp_seq=1 ttl=64 time=0.120 ms
...

Milestone 2: External Access via Host NAT

On each host, add host NAT so that the test container can reach the outside through veth0.

# Modify the nat table
iptables -t nat -A POSTROUTING -j MASQUERADE -s 192.168.5.0/24 -o ens32    # ens32 is my host's eth0

On each host, add proper forwarding rules so that my container's packets are accepted.

# Modify the filter table. Note that if you want to use "!" or "any", read the manual, don't just append them to the device name
iptables -I FORWARD 1 -i bridge0 ! -o bridge0 -j ACCEPT
iptables -I FORWARD 2 ! -i bridge0 -o bridge0 -m state --state RELATED,ESTABLISHED -j ACCEPT

iptables -I FORWARD 3 -i bridge1 ! -o bridge1 -j ACCEPT
iptables -I FORWARD 4 ! -i bridge1 -o bridge1 -m state --state RELATED,ESTABLISHED -j ACCEPT

Install the necessary tools in each of my centos containers.

docker exec -it test1.1 bash    # test2.1 on VM2, test3.1 on VM3
# inside container
yum install -y bind-utils traceroute telnet openssh openssh-server openssh-clients net-tools tcpdump

Let's enable sshd in each container.

docker exec -it test1.1 bash    # test2.1 on VM2, test3.1 on VM3
# inside container
vi /etc/ssh/sshd_config
    # Change below config options
    PasswordAuthentication yes
    PermitRootLogin yes
/usr/bin/ssh-keygen -A
echo 'root:123work' | chpasswd
nohup /usr/sbin/sshd -D > /var/log/sshd.log 2>&1 &
exit

On each host, let's test sshd. So far, we can ssh from a host to the container that resides on it, but not to the containers on other hosts.

# on host 1
$ ssh root@192.168.5.2 'cat /etc/hostname'
6e5898a7d4e4
$ ssh root@192.168.6.2 'cat /etc/hostname'
6e5898a7d4e4

# on host 3
$ ssh root@192.168.5.4 'cat /etc/hostname'
89c83ee77559
$ ssh root@192.168.6.4 'cat /etc/hostname'
89c83ee77559

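Milestone 3: SSH from a Host to a Remote Container

How do we let any host connect to any container? Let's do it. On each host, delete the original route to bridge1, because it is too broad.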

ip route delete 192.168.6.0/24 dev bridge1

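We only route the local container through the local bridge. The other containers are reached through their respective hosts as gateways.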

# On host 1
ip route add 192.168.6.2 dev bridge1 src 192.168.6.1
ip route add 192.168.6.3 via 10.32.171.203 dev ens32
ip route add 192.168.6.4 via 10.32.171.204 dev ens32

# On host 2
ip route add 192.168.6.2 via 10.32.171.202 dev ens32
ip route add 192.168.6.3 dev bridge1 src 192.168.6.1
ip route add 192.168.6.4 via 10.32.171.204 dev ens32

# On host 3
ip route add 192.168.6.2 via 10.32.171.202 dev ens32
ip route add 192.168.6.3 via 10.32.171.203 dev ens32
ip route add 192.168.6.4 dev bridge1 src 192.168.6.1

# You should still be able to ssh to local container. For example on host 1
ssh root@192.168.5.2 'cat /etc/hostname'
ssh root@192.168.6.2 'cat /etc/hostname'
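The three per-host route tables above follow one rule: the local container goes through bridge1, every other container goes through the host that carries it. As a sketch, the table for any host can be generated. `m3_routes` is a hypothetical helper name, and it assumes the numbering used in this post (host N is 10.32.171.(201+N), its container is 192.168.6.(N+1)).

```shell
# Print the Milestone 3 routes for one host.
#   $1 = number of the host the commands are meant for (1, 2 or 3)
m3_routes() {
    self=$1
    for n in 1 2 3; do
        container="192.168.6.$(( n + 1 ))"
        host="10.32.171.$(( 201 + n ))"
        if [ "$n" -eq "$self" ]; then
            # Local container: reach it directly over the local bridge.
            echo "ip route add ${container} dev bridge1 src 192.168.6.1"
        else
            # Remote container: use the host that carries it as gateway.
            echo "ip route add ${container} via ${host} dev ens32"
        fi
    done
}

m3_routes 2
```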

Next, we need to make sure that the reply packets can be sent out of the container

# On host 1
ip netns exec ${TEST_PID} ip route add 10.32.171.203 via 192.168.6.1 dev veth1
ip netns exec ${TEST_PID} ip route add 10.32.171.204 via 192.168.6.1 dev veth1

# On host 2
ip netns exec ${TEST_PID} ip route add 10.32.171.202 via 192.168.6.1 dev veth1
ip netns exec ${TEST_PID} ip route add 10.32.171.204 via 192.168.6.1 dev veth1

# On host 3
ip netns exec ${TEST_PID} ip route add 10.32.171.202 via 192.168.6.1 dev veth1
ip netns exec ${TEST_PID} ip route add 10.32.171.203 via 192.168.6.1 dev veth1

Make iptables allow ssh traffic on each host

iptables -I FORWARD 1 -o bridge1 -p tcp --dport 22 -j ACCEPT

Now you should be able to ssh from any host to any container via 192.168.6.0/24, but not from one container to another.

# On host 1
ssh root@192.168.6.4 'cat /etc/hostname'
ssh root@192.168.6.2 'cat /etc/hostname'

# On host 3
ssh root@192.168.6.2 'cat /etc/hostname'
ssh root@192.168.6.4 'cat /etc/hostname'

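Milestone 4: Flat Bridged Private Network for Containers

This setup needs 2 NICs on each host. I already have ens32 from the sections above. Thanks to the operator who gave me a brand-new ens34 on each host, with promiscuous mode enabled at the underlying level (vCenter). Note that although they are logically in separate networks, ens32 and ens34 are actually connected to the same vswitch. First, bring up every ens34. They don't need an IP.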

# On each host
ip li set ens34 up

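In this section I will use a new bridge, bridge2, so that the previous configuration is not messed up. The bridge has no IP. There will also be a new NIC inside each container. On each host, do the following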

# Create new bridge. The bridge doesn't have ip
brctl addbr bridge2
ip link set dev bridge2 up

# Create new veth pair as the new NIC
ip link add ${TEST_PID}.eth2 type veth peer name veth2
ip li set veth2 netns ${TEST_PID}
brctl addif bridge2 ${TEST_PID}.eth2
ip netns exec ${TEST_PID} ip li set veth2 up
ip li set ${TEST_PID}.eth2 up

# Delete the now useless routes added in Milestone 3; they may interfere with what we do next.
# Each container only has two of these three routes, so one delete fails harmlessly.
ip netns exec ${TEST_PID} ip route del 10.32.171.202
ip netns exec ${TEST_PID} ip route del 10.32.171.203
ip netns exec ${TEST_PID} ip route del 10.32.171.204

# Setup the route for each container
ip netns exec ${TEST_PID} ip route add 192.168.7.0/24 dev veth2

# Add new NIC of the host to bridge
brctl addif bridge2 ens34    # my second NIC on the host is ens34

# Enable promisc mode for NIC
ip li set ens34 promisc on

Set the IP for each container

# On host 1
ip netns exec ${TEST_PID} ip addr add 192.168.7.2 dev veth2

# On host 2
ip netns exec ${TEST_PID} ip addr add 192.168.7.3 dev veth2

# On host 3
ip netns exec ${TEST_PID} ip addr add 192.168.7.4 dev veth2

Now you should be able to ssh to every container.

# On host 1
ip netns exec ${TEST_PID} ssh root@192.168.7.4 'cat /etc/hostname'
ip netns exec ${TEST_PID} ssh root@192.168.7.2 'cat /etc/hostname'

# On host 3
ip netns exec ${TEST_PID} ssh root@192.168.7.2 'cat /etc/hostname'
ip netns exec ${TEST_PID} ssh root@192.168.7.4 'cat /etc/hostname'

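Milestone 5: Router Between the Host and Container Networks (FAILED)

I am trying to build a router between the host network 10.32.171.0/24 and the container network 192.168.7.0/24. Since I have no spare IP on 10.32.171.0/24, the router is placed inside host 1. The router is implemented as a network namespace; it does not even need a container. First, on host 1, set up the router.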

# On host 1
ip netns add testrt
ip li add testrt.eth0 type veth peer name veth0    # NIC for host network 10.32.171.0/24
ip li add testrt.eth1 type veth peer name veth1    # NIC for container network 192.168.7.0/24
ip li set veth0 netns testrt
ip li set veth1 netns testrt

brctl addif bridge2 testrt.eth1                    # connect veth1 to container network
ip netns exec testrt ip addr add 192.168.7.100 dev veth1

ip li set testrt.eth0 up
ip netns exec testrt ip li set veth0 up
ip li set testrt.eth1 up
ip netns exec testrt ip li set veth1 up

ip li set testrt.eth0 promisc on
ip netns exec testrt ip li set veth0 promisc on
ip li set testrt.eth1 promisc on
ip netns exec testrt ip li set veth1 promisc on

The router testrt is now connected to ens34 inside host 1. Let's install the routes for testrt.

# On host 1
# make sure packets going to the container network are routed into testrt
ip route add 192.168.7.0/24 dev testrt.eth0

# Install testrt router table
ip netns exec testrt ip route add 192.168.7.0/24 dev veth1
ip netns exec testrt ip route add 10.32.171.0/24 dev veth0

# You should now be able to ping testrt, and to ping from testrt
ping 192.168.7.100
ip netns exec testrt ping 192.168.7.2
ip netns exec testrt ping 192.168.7.4
ip netns exec testrt ping 10.32.171.202

Let's set each host and container to use testrt as the gateway.

# On each host
ip netns exec ${TEST_PID} ip route add 10.32.171.0/24 via 192.168.7.100 dev veth2

# On each host except host 1 (host 1 already has route to testrt)
ip route add 192.168.7.0/24 via 10.32.171.202 dev ens32

The expected result: you can ssh from any host (network 10.32.171.0/24) to any container (network 192.168.7.0/24), and vice versa.

# On host 1
ssh root@192.168.7.2 'cat /etc/hostname'
ssh root@192.168.7.4 'cat /etc/hostname'
docker exec -it test1.1 ssh root@10.32.171.202 'cat /etc/hostname'
docker exec -it test1.1 ssh root@10.32.171.204 'cat /etc/hostname'

# On host 3
ssh root@192.168.7.2 'cat /etc/hostname'
ssh root@192.168.7.4 'cat /etc/hostname'
docker exec -it test3.1 ssh root@10.32.171.202 'cat /etc/hostname'
docker exec -it test3.1 ssh root@10.32.171.204 'cat /etc/hostname'

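This is where it fails: when host 1 pings 192.168.7.2, the ICMP request can be seen (with tcpdump) on veth0 of test1.1, but it never reaches veth1. I also tried building testrt with a container (testrt2) instead of a network namespace, and got stuck at the same place. When host 3 pings 192.168.7.101 (container testrt2), the ARP request for 10.32.171.204 is heard on testrt2.eth0 but not on ens32. Another problem: testrt2 cannot even ping 10.32.171.202.

Milestone 6: Just Use Host 1 and Container test1.1 as the Router (FAILED)

Since I found it hard to create a virtual router inside host 1 with a network namespace or a container, I will use host 1 itself as the router by tweaking ip route. First, remove the old gateway settings from Milestone 5.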

# On each host
ip route del 192.168.7.0/24
ip netns exec ${TEST_PID} ip route del 10.32.171.0/24

# On host 1
brctl delif bridge2 testrt.eth1

My plan is to use host 1 as the router/gateway for 10.32.171.0/24, use test1.1 as the router/gateway for 192.168.7.0/24, and connect host 1 and test1.1 with a new veth pair.

# On host 1
ip li add ${TEST_PID}.eth3 type veth peer name veth3
ip li set veth3 netns ${TEST_PID}

ip li set ${TEST_PID}.eth3 promisc on
ip netns exec ${TEST_PID} ip li set veth3 promisc on

ip li set ${TEST_PID}.eth3 up
ip netns exec ${TEST_PID} ip li set veth3 up

ip route add 192.168.7.0/24 dev ${TEST_PID}.eth3
ip netns exec ${TEST_PID} ip route add 10.32.171.0/24 dev veth3

# Now you should be able to ping test1.1
ping 192.168.7.2

Set the gateway on the other hosts and containers

# On host 2 and 3
ip route add 192.168.7.0/24 via 10.32.171.202 dev ens32
ip netns exec ${TEST_PID} ip route add 10.32.171.0/24 via 192.168.7.2 dev veth2

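Failed: when host 1 pings 192.168.7.4, veth3 inside test1.1 sees the ICMP request, but it is never forwarded to veth2. This is the same problem as in Milestone 5. Another problem: test1.1 cannot even ping host 1. At least I can derive some rules here

  • A NIC will not send packets if it has no IP set, nor if its IP is not in the same network range as the packet, even when ip route says the packet should go out through that NIC.
  • Creating a container, then putting one end of a veth pair inside it and the other end in the host, does not by itself connect the container and the host.
  • Always test network connectivity in both directions, between every pair of hops.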
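The last rule is easy to automate. A minimal sketch of a reachability check that can be run from each side of a hop follows. `check_targets` and the `PROBE` variable are hypothetical names, and the default probe assumes the iputils `ping` with its `-c` and `-W` options.

```shell
# Command used to probe a target; override PROBE to change how probing is done.
PROBE=${PROBE:-ping -c 1 -W 1}

# Report whether each target answers the probe.
check_targets() {
    for target in "$@"; do
        # $PROBE is left unquoted on purpose so that its options are split.
        if $PROBE "$target" >/dev/null 2>&1; then
            echo "ok   $target"
        else
            echo "FAIL $target"
        fi
    done
}

# From the host towards the containers:
#   check_targets 192.168.7.2 192.168.7.4
# From inside the container namespace towards the hosts (needs root):
#   PROBE="ip netns exec ${TEST_PID} ping -c 1 -W 1" check_targets 10.32.171.202
```

Running it once from the host side and once from inside the namespace covers both directions of the same hop.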

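Milestone 7: Finally Got the Router Working

In this section, I will build a virtual router out of a network namespace, located inside host 1. The container network 192.168.7.0/24 and the host network 10.32.171.0/24 will be able to talk to each other through this router. First, let's clean up the mess left by the two failed sections above.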

# On host 1
ip netns exec ${TEST_PID} ip li del veth3
brctl delif bridge2 testrt.eth0
brctl delif bridge2 testrt.eth1
brctl delif bridge2 testrt2.eth0
brctl delif bridge2 testrt2.eth1

# On each host
ip route del 192.168.7.0/24
ip netns exec ${TEST_PID} ip route del 10.32.171.0/24

I still use a network namespace to build the router, testrt3

# On host 1
ip netns add testrt3
ip li add testrt3.eth0 type veth peer name veth0    # NIC for host network 10.32.171.0/24
ip li add testrt3.eth1 type veth peer name veth1    # NIC for container network 192.168.7.0/24
ip li set veth0 netns testrt3
ip li set veth1 netns testrt3

brctl addif bridge2 testrt3.eth1                    # connect veth1 to container network
ip netns exec testrt3 ip addr add 192.168.7.100 dev veth1

ip li set testrt3.eth0 up
ip netns exec testrt3 ip li set veth0 up
ip li set testrt3.eth1 up
ip netns exec testrt3 ip li set veth1 up

ip netns exec testrt3 ip route add 192.168.7.0/24 dev veth1
ip netns exec testrt3 ip route add 10.32.171.0/24 dev veth0

Let's set the gateway of each container to testrt3.

# On each host
ip netns exec ${TEST_PID} ip route add 10.32.171.0/24 via 192.168.7.100 dev veth2

# You should be able to ping router from container
ip netns exec ${TEST_PID} ping 192.168.7.100

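At this point, when I ping from a container to a host, I can see the ICMP or ARP requests on veth1 of testrt3. But they cannot be forwarded to veth0 of testrt3. Next, to connect veth0 of testrt3 to host 1, we create a bridge that attaches to 10.32.171.0/24 in place of ens32.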

# On host 1
brctl addbr br-ens32
ip li set br-ens32 promisc on

# Add ens32 to br-ens32, host 1 will be temporarily disconnected
# The default gateway in my case is 10.32.171.1. The bridge is brought up
# before the default route is added, because the route needs the link to be up.
ip addr del 10.32.171.202/24 dev ens32 && \
ip addr add 10.32.171.202/24 dev br-ens32 && \
brctl addif br-ens32 ens32 && \
ip link set dev br-ens32 up && \
ip route add default via 10.32.171.1 dev br-ens32

# Reconnect to host 1, restore the original routes
ip route add 169.254.0.0/16 dev br-ens32 metric 1002    # the cloud init address
ip route add 192.168.6.3 via 10.32.171.203 dev br-ens32
ip route add 192.168.6.4 via 10.32.171.204 dev br-ens32

# Restore the host NAT for 192.168.5.0/24
LINENO_MASQ=$(iptables -L POSTROUTING -v --line-numbers -t nat | grep 'MASQUERADE .* ens32 .* 192.168.5.0' | awk '{print $1}')
iptables -D POSTROUTING ${LINENO_MASQ} -t nat
iptables -t nat -A POSTROUTING -j MASQUERADE -s 192.168.5.0/24 -o br-ens32

Now we can connect veth0 of testrt3 to 10.32.171.0/24.

# On host 1
brctl addif br-ens32 testrt3.eth0

Next, add an IP address to veth0 of testrt3, which faces 10.32.171.0/24. I borrowed 10.32.171.205 from the network operator. The router NICs on both sides should have an IP.

# On host 1
ip netns exec testrt3 ip addr add 10.32.171.205 dev veth0

# You should be able to ping router
ping 10.32.171.205
ip netns exec testrt3 ping 10.32.171.202

Set the gateway of each host to testrt3

# On host 1
ip route add 192.168.7.0/24 via 10.32.171.205 dev br-ens32

# On host 2 and 3
ip route add 192.168.7.0/24 via 10.32.171.205 dev ens32

By now, I can ssh from any host to any container, and vice versa.

# On host 1
ssh root@192.168.7.2 'cat /etc/hostname' && \
ssh root@192.168.7.4 'cat /etc/hostname' && \
docker exec -it test1.1 ssh root@10.32.171.202 'hostname' && \
docker exec -it test1.1 ssh root@10.32.171.204 'hostname' && \
docker exec -it test1.1 ssh root@192.168.7.2 'cat /etc/hostname' && \
docker exec -it test1.1 ssh root@192.168.7.4 'cat /etc/hostname'

# On host 3
ssh root@192.168.7.2 'cat /etc/hostname' && \
ssh root@192.168.7.4 'cat /etc/hostname' && \
docker exec -it test3.1 ssh root@10.32.171.202 'hostname' && \
docker exec -it test3.1 ssh root@10.32.171.204 'hostname' && \
docker exec -it test3.1 ssh root@192.168.7.2 'cat /etc/hostname' && \
docker exec -it test3.1 ssh root@192.168.7.4 'cat /etc/hostname'

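Finally, I got the router working! The key point is that the router's NIC on each side must have an IP within the range of the network it faces. The other hosts or containers must set their gateway to the router's IP on their own side. In the previous sections I kept trying to avoid using an extra IP for the router, i.e. 10.32.171.205, and that never worked.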
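The rule "each router NIC needs an IP inside the network it faces" can be checked with a little integer arithmetic. The following is a sketch only; `ip_to_int` and `same_subnet` are helper names invented here, and the addresses are the ones used in this post.

```shell
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if two addresses fall in the same network.
#   $1, $2 = addresses, $3 = prefix length
same_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

same_subnet 10.32.171.205 10.32.171.202 24 && echo "veth0 faces the host network"
same_subnet 192.168.7.100 192.168.7.2 24 && echo "veth1 faces the container network"
same_subnet 192.168.7.100 10.32.171.202 24 || echo "192.168.7.100 is not on the host network"
```

The third check mirrors the failed milestones: a router that only owns 192.168.7.100 has no address on 10.32.171.0/24, so the hosts have nothing on their own network to use as a gateway.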

Future

Use gre / vxlan to set up a private tunnel network for the containers. Add a router to it.

 

Recommended Reading

iptables in Detail (1): iptables Concepts

iptables in Detail (2): Routing Tables

 
