flannel is a container networking solution developed by CoreOS. flannel assigns each host a subnet, containers on that host obtain their IP addresses from it, and those IPs are routable between hosts, so containers can communicate with each other without NAT or port mapping.
Each subnet is carved out of a larger address pool. flannel runs an agent called flanneld on every host, whose job is to allocate a subnet from that pool. To share information between hosts, flannel stores the network configuration, the allocated subnets, the hosts' IPs, and related state in etcd.
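As a preview of what that shared state looks like: once the lab below is running, the keys flanneld keeps in etcd can be listed with the v2 etcdctl (the /atomic.io/network prefix is the one configured later in this article; the subnet key shown is illustrative):

[root@localhost ~]# etcdctl ls --recursive /atomic.io/network
/atomic.io/network/config
/atomic.io/network/subnets
/atomic.io/network/subnets/192.168.3.0-24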
Now let's put flannel into practice.
1. Lab Environment
IP | Hostname | Deployment
---|---|---
10.1.1.17 | master | etcd
10.1.1.13 | host1 | etcd
10.1.1.14 | host2 | etcd
2. Configuring the etcd Cluster
Perform the following steps on all three hosts.
Download etcd-v3.3.10-linux-amd64.tar.gz and install the binaries:
Download link: https://pan.baidu.com/s/1VAE3CrtmDD8E_5K-fFmFqQ (extraction code: 5qbq)
[root@localhost ~]# tar xf etcd-v3.3.10-linux-amd64.tar.gz
[root@localhost ~]# cd etcd-v3.3.10-linux-amd64
[root@localhost etcd-v3.3.10-linux-amd64]# cp etcd etcdctl /usr/local/bin/
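Optionally, verify that the binaries are on the PATH and report the expected version:

[root@localhost ~]# etcd --version
etcd Version: 3.3.10
Git SHA: 27fc7e2
Go Version: go1.10.4
Go OS/Arch: linux/amd64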
Create the etcd data directory:
[root@localhost ~]# mkdir -p /var/lib/etcd
On the master host, create the etcd systemd unit file /usr/lib/systemd/system/etcd.service:
[Unit]
Description=etcd server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
--name master \
--initial-advertise-peer-urls http://10.1.1.17:2380 \
--listen-peer-urls http://10.1.1.17:2380 \
--listen-client-urls http://10.1.1.17:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.1.1.17:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380,node2=http://10.1.1.14:2380 \
--initial-cluster-state new \
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
On host1, create the etcd systemd unit file /usr/lib/systemd/system/etcd.service:
[Unit]
Description=etcd server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
--name node1 \
--initial-advertise-peer-urls http://10.1.1.13:2380 \
--listen-peer-urls http://10.1.1.13:2380 \
--listen-client-urls http://10.1.1.13:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.1.1.13:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380,node2=http://10.1.1.14:2380 \
--initial-cluster-state new \
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
On host2, create the etcd systemd unit file /usr/lib/systemd/system/etcd.service:
[Unit]
Description=etcd server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
--name node2 \
--initial-advertise-peer-urls http://10.1.1.14:2380 \
--listen-peer-urls http://10.1.1.14:2380 \
--listen-client-urls http://10.1.1.14:2379,http://127.0.0.1:2379 \
--advertise-client-urls http://10.1.1.14:2379 \
--initial-cluster-token etcd-cluster-1 \
--initial-cluster master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380,node2=http://10.1.1.14:2380 \
--initial-cluster-state new \
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Since TLS authentication is not configured, all the URLs use http rather than https. etcd serves client requests on port 2379 and peer (server-to-server) traffic on port 2380.
With the unit file in place on all three nodes, reload systemd, then enable and start the service:
[root@localhost ~]# systemctl daemon-reload
[root@localhost ~]# systemctl enable etcd
[root@localhost ~]# systemctl start etcd
[root@localhost ~]# systemctl status etcd.service
(On master, host1, and host2 alike, the status output should show the etcd service as active (running).)
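If a node fails to start, checking the listening sockets is a quick first step (a sketch, assuming ss from the iproute2 package is available):

[root@localhost ~]# ss -tlnp | grep etcd
# expect LISTEN sockets on 2379 (client traffic) and 2380 (peer traffic)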
3. Basic etcd Usage
List the cluster members (this can be run from any node):
[root@localhost ~]# etcdctl member list
710f9ef1aa65701d: name=node2 peerURLs=http://10.1.1.14:2380 clientURLs=http://10.1.1.14:2379 isLeader=false
7ba5fed3ecb69a7a: name=master peerURLs=http://10.1.1.17:2380 clientURLs=http://10.1.1.17:2379 isLeader=true
ec10529c611c4e55: name=node1 peerURLs=http://10.1.1.13:2380 clientURLs=http://10.1.1.13:2379 isLeader=false
You can see that the cluster automatically elected one node as the leader. Next, check the cluster's health:
[root@localhost ~]# etcdctl cluster-health
member 710f9ef1aa65701d is healthy: got healthy result from http://10.1.1.14:2379
member 7ba5fed3ecb69a7a is healthy: got healthy result from http://10.1.1.17:2379
member ec10529c611c4e55 is healthy: got healthy result from http://10.1.1.13:2379
cluster is healthy
[root@localhost ~]#
Now write some data through etcd:
[root@localhost ~]# etcdctl set name zll
zll
Read the key back on host1 and host2:
host1:
[root@localhost ~]# etcdctl get name
zll
host2:
[root@localhost ~]# etcdctl get name
zll
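A few more v2-API etcdctl operations that come in handy when exploring the store (the /demo keys below are made up purely for illustration):

[root@localhost ~]# etcdctl mkdir /demo                 # create a directory node
[root@localhost ~]# etcdctl set /demo/key1 hello        # write a key under it
[root@localhost ~]# etcdctl ls --recursive /demo        # list keys recursively
[root@localhost ~]# etcdctl rm /demo/key1               # delete the key
[root@localhost ~]# etcdctl rmdir /demo                 # remove the now-empty directory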
Operating on the members of the etcd cluster:
Removing a member:
First, list the members:
[root@localhost ~]# etcdctl member list
710f9ef1aa65701d: name=node2 peerURLs=http://10.1.1.14:2380 clientURLs=http://10.1.1.14:2379 isLeader=false
7ba5fed3ecb69a7a: name=master peerURLs=http://10.1.1.17:2380 clientURLs=http://10.1.1.17:2379 isLeader=true
ec10529c611c4e55: name=node1 peerURLs=http://10.1.1.13:2380 clientURLs=http://10.1.1.13:2379 isLeader=false
Now remove node2, i.e. the http://10.1.1.14:2379 node, from the cluster:
[root@localhost ~]# etcdctl member remove 710f9ef1aa65701d
Removed member 710f9ef1aa65701d from cluster
List the members again:
[root@localhost ~]# etcdctl member list
7ba5fed3ecb69a7a: name=master peerURLs=http://10.1.1.17:2380 clientURLs=http://10.1.1.17:2379 isLeader=true
ec10529c611c4e55: name=node1 peerURLs=http://10.1.1.13:2380 clientURLs=http://10.1.1.13:2379 isLeader=false
Add a member to the cluster: rejoin 10.1.1.14:
[root@localhost ~]# etcdctl member add node2 http://10.1.1.14:2380
Added member named node2 with ID 2887571d3752e0b2 to cluster
ETCD_NAME="node2"
ETCD_INITIAL_CLUSTER="node2=http://10.1.1.14:2380,master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
[root@localhost ~]# etcdctl member list
2887571d3752e0b2[unstarted]: peerURLs=http://10.1.1.14:2380
7ba5fed3ecb69a7a: name=master peerURLs=http://10.1.1.17:2380 clientURLs=http://10.1.1.17:2379 isLeader=true
ec10529c611c4e55: name=node1 peerURLs=http://10.1.1.13:2380 clientURLs=http://10.1.1.13:2379 isLeader=false
As shown above, node2 has been added to the cluster but is still in the unstarted state. Next, follow the hints printed by the add command:
[root@localhost ~]# export ETCD_NAME="node2"
[root@localhost ~]# export ETCD_INITIAL_CLUSTER="node2=http://10.1.1.14:2380,master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380"
[root@localhost ~]# export ETCD_INITIAL_CLUSTER_STATE="existing"
[root@localhost ~]# etcd --listen-client-urls http://10.1.1.14:2379 --advertise-client-urls http://10.1.1.14:2379 --listen-peer-urls http://10.1.1.14:2380 --initial-advertise-peer-urls http://10.1.1.14:2380 --data-dir=/var/lib/etcd/
2019-08-30 23:25:07.436845 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER=node2=http://10.1.1.14:2380,master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380
2019-08-30 23:25:07.436977 I | pkg/flags: recognized and used environment variable ETCD_INITIAL_CLUSTER_STATE=existing
2019-08-30 23:25:07.436993 I | pkg/flags: recognized and used environment variable ETCD_NAME=node2
2019-08-30 23:25:07.437035 I | etcdmain: etcd Version: 3.3.10
2019-08-30 23:25:07.437040 I | etcdmain: Git SHA: 27fc7e2
2019-08-30 23:25:07.437043 I | etcdmain: Go Version: go1.10.4
2019-08-30 23:25:07.437063 I | etcdmain: Go OS/Arch: linux/amd64
2019-08-30 23:25:07.437067 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2019-08-30 23:25:07.437189 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2019-08-30 23:25:07.437453 C | etcdmain: listen tcp 10.1.1.14:2380: bind: cannot assign requested address
The member registration succeeded, but the restart failed: etcd could not bind 10.1.1.14:2380 ("cannot assign requested address"), which usually means the command was run on a host that does not own that IP. The "already initialized as member before" notice also shows that the old data directory still carries the removed member's identity. On top of that, once a node is removed from the cluster, the etcd service on it shuts down automatically. A clean recovery procedure is still being investigated.
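For reference, the usual recovery sequence is roughly the following (a hedged sketch, not verified in this lab): run everything on the 10.1.1.14 host itself, wipe the stale data directory so the old member identity is discarded, and restart with the existing-cluster settings printed by member add:

# on the 10.1.1.14 host (node2), not on another machine
systemctl stop etcd
rm -rf /var/lib/etcd/*    # discard the removed member's stale identity
export ETCD_NAME="node2"
export ETCD_INITIAL_CLUSTER="node2=http://10.1.1.14:2380,master=http://10.1.1.17:2380,node1=http://10.1.1.13:2380"
export ETCD_INITIAL_CLUSTER_STATE="existing"
etcd --listen-client-urls http://10.1.1.14:2379,http://127.0.0.1:2379 \
  --advertise-client-urls http://10.1.1.14:2379 \
  --listen-peer-urls http://10.1.1.14:2380 \
  --initial-advertise-peer-urls http://10.1.1.14:2380 \
  --data-dir=/var/lib/etcd/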
4. Deploying etcd and flannel
First, deploy etcd on the master node (10.1.1.17), this time installed from yum as a single-node instance:
[root@localhost ~]# yum install -y etcd
Edit the configuration file:
[root@localhost ~]# vim /etc/etcd/etcd.conf
#[Member]
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_CLIENT_URLS="http://10.1.1.17:2379,http://127.0.0.1:2379"
ETCD_NAME="default"
ETCD_ADVERTISE_CLIENT_URLS="http://10.1.1.17:2379"
Start etcd:
[root@localhost ~]# systemctl start etcd
Test etcd:
[root@localhost ~]# etcdctl set testdir/testkey0 0
0
[root@localhost ~]# etcdctl get testdir/testkey0
0
The test is done: getting the value 0 back means etcd is working normally.
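Before pointing flanneld at this etcd, it is worth confirming that host1 and host2 can reach it over the network (run from either host; the key is the one written above):

[root@localhost ~]# etcdctl --endpoints http://10.1.1.17:2379 get testdir/testkey0
0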
Install flannel on all three servers (10.1.1.17, 10.1.1.13, and 10.1.1.14):
[root@localhost ~]# yum install -y flannel
Edit the configuration file on each of the three nodes:
[root@localhost ~]# vim /etc/sysconfig/flanneld
#Flanneld configuration options
#etcd url location. Point this to the server where etcd runs
#FLANNEL_ETCD_ENDPOINTS must point at the etcd server's address; it must be identical on all three nodes
FLANNEL_ETCD_ENDPOINTS="http://10.1.1.17:2379"
#etcd config key. This is the configuration key that flannel queries
#For address range assignment
FLANNEL_ETCD_PREFIX="/atomic.io/network"
#Any additional options that you want to pass
#FLANNEL_OPTIONS=""
5. Configuring the Network
On 10.1.1.17, write the flannel network configuration into etcd:
[root@localhost ~]# etcdctl mk /atomic.io/network/config '{ "Network": "192.168.0.0/16" }'
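With no Backend specified, flannel defaults to the udp backend, which is why a flannel0 TUN device appears below. For reference, a vxlan variant of the same configuration would look like this (not used in this lab):

[root@localhost ~]# etcdctl mk /atomic.io/network/config '{ "Network": "192.168.0.0/16", "Backend": { "Type": "vxlan" } }'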
Then enable and start flanneld on each node:
[root@localhost ~]# systemctl enable flanneld.service
[root@localhost ~]# systemctl start flanneld.service
Docker must then be restarted:
[root@localhost ~]# systemctl restart docker
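The restart is needed because flanneld records its allocation in the environment file /run/flannel/subnet.env, which the flannel RPM wires into Docker's startup through a systemd drop-in. You can inspect the file directly; the values below are illustrative for the master node:

[root@localhost ~]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=192.168.0.0/16
FLANNEL_SUBNET=192.168.3.1/24
FLANNEL_MTU=1472
FLANNEL_IPMASQ=false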
Check the interfaces on the three hosts: the docker0 bridge now sits inside the /24 subnet that flannel allocated to the host, matching flannel0's address range. Taking node 10.1.1.17 as an example:
[root@localhost ~]# ifconfig
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1472
inet 192.168.3.1 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::42:32ff:fe50:9752 prefixlen 64 scopeid 0x20<link>
ether 02:42:32:50:97:52 txqueuelen 0 (Ethernet)
RX packets 6582 bytes 268851 (262.5 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 11027 bytes 9304023 (8.8 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST> mtu 1472
inet 192.168.3.0 netmask 255.255.0.0 destination 192.168.3.0
inet6 fe80::b1ea:266b:96b3:a6d5 prefixlen 64 scopeid 0x20<link>
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 56 bytes 4596 (4.4 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
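The host routing table makes the forwarding path visible: the local /24 stays on docker0, while everything else in 192.168.0.0/16 is sent into the flannel0 tunnel (illustrative output for the master node):

[root@localhost ~]# ip route | grep 192.168
192.168.0.0/16 dev flannel0 proto kernel scope link src 192.168.3.0
192.168.3.0/24 dev docker0 proto kernel scope link src 192.168.3.1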
Start an nginx container on each of the three hosts, named master, node-1, and node-2 respectively:
[root@localhost ~]# docker run -ti -d --name=master docker.io/nginx /bin/bash
[root@localhost ~]# docker run -ti -d --name=node-1 docker.io/nginx /bin/bash
[root@localhost ~]# docker run -ti -d --name=node-2 docker.io/nginx /bin/bash
Attach to each of the containers (master, node-1, node-2) and check its IP:
master:
[root@localhost ~]# docker attach master
root@c0ec62d06a89:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1472
inet 192.168.3.2 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::42:c0ff:fea8:302 prefixlen 64 scopeid 0x20<link>
ether 02:42:c0:a8:03:02 txqueuelen 0 (Ethernet)
RX packets 11036 bytes 9304721 (8.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6583 bytes 361041 (352.5 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
node-1:
[root@localhost ~]# docker attach node-1
root@dc07f4251c87:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1472
inet 192.168.81.2 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::42:c0ff:fea8:5102 prefixlen 64 scopeid 0x20<link>
ether 02:42:c0:a8:51:02 txqueuelen 0 (Ethernet)
RX packets 10784 bytes 9311035 (8.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6525 bytes 355837 (347.4 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
node-2:
[root@localhost ~]# docker attach node-2
root@4a01d549e745:/# ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1472
inet 192.168.39.2 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::42:c0ff:fea8:2702 prefixlen 64 scopeid 0x20<link>
ether 02:42:c0:a8:27:02 txqueuelen 0 (Ethernet)
RX packets 12237 bytes 9436762 (8.9 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 7916 bytes 489085 (477.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Now run ping tests:
From the master container, ping node-1 and node-2:
root@c0ec62d06a89:/# ping -c 2 192.168.81.2
PING 192.168.81.2 (192.168.81.2): 56 data bytes
64 bytes from 192.168.81.2: icmp_seq=0 ttl=60 time=1.572 ms
64 bytes from 192.168.81.2: icmp_seq=1 ttl=60 time=0.641 ms
--- 192.168.81.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.641/1.107/1.572/0.466 ms
root@c0ec62d06a89:/# ping -c 2 192.168.39.2
PING 192.168.39.2 (192.168.39.2): 56 data bytes
64 bytes from 192.168.39.2: icmp_seq=0 ttl=60 time=1.385 ms
64 bytes from 192.168.39.2: icmp_seq=1 ttl=60 time=0.603 ms
--- 192.168.39.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.603/0.994/1.385/0.391 ms
From node-1, ping master and node-2:
root@dc07f4251c87:/# ping -c 2 192.168.3.2
PING 192.168.3.2 (192.168.3.2): 56 data bytes
64 bytes from 192.168.3.2: icmp_seq=0 ttl=60 time=1.544 ms
64 bytes from 192.168.3.2: icmp_seq=1 ttl=60 time=0.678 ms
--- 192.168.3.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.678/1.111/1.544/0.433 ms
root@dc07f4251c87:/# ping -c 2 192.168.39.2
PING 192.168.39.2 (192.168.39.2): 56 data bytes
64 bytes from 192.168.39.2: icmp_seq=0 ttl=60 time=2.601 ms
64 bytes from 192.168.39.2: icmp_seq=1 ttl=60 time=2.026 ms
--- 192.168.39.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.026/2.313/2.601/0.288 ms
From node-2, ping master and node-1:
root@4a01d549e745:/# ping -c 2 192.168.3.2
PING 192.168.3.2 (192.168.3.2): 56 data bytes
64 bytes from 192.168.3.2: icmp_seq=0 ttl=60 time=1.851 ms
64 bytes from 192.168.3.2: icmp_seq=1 ttl=60 time=1.544 ms
--- 192.168.3.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 1.544/1.698/1.851/0.154 ms
root@4a01d549e745:/# ping -c 2 192.168.81.2
PING 192.168.81.2 (192.168.81.2): 56 data bytes
64 bytes from 192.168.81.2: icmp_seq=0 ttl=60 time=1.557 ms
64 bytes from 192.168.81.2: icmp_seq=1 ttl=60 time=0.713 ms
--- 192.168.81.2 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.713/1.135/1.557/0.422 ms
With that, all the cross-host containers can communicate normally.
If, after the steps above, the container IPs cannot ping each other, the cause is almost always the firewall.
But the firewall was already shut down earlier with "systemctl stop firewalld.service", so why is there still a firewall problem?
Because Linux still has iptables underneath (Docker, for one, installs its own rules there). The fix is to run the following on every node:
iptables -P INPUT ACCEPT      # default-accept incoming traffic
iptables -P FORWARD ACCEPT    # default-accept forwarded traffic (Docker may set this chain's policy to DROP)
iptables -F                   # flush all existing rules
iptables -L -n                # list the resulting rule set to verify
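Flushing every rule is heavy-handed outside a lab. A narrower alternative (a sketch, not part of the original experiment) is to allow just the flannel network through the FORWARD chain on each node:

iptables -I FORWARD -s 192.168.0.0/16 -j ACCEPT   # allow traffic sourced from the flannel network
iptables -I FORWARD -d 192.168.0.0/16 -j ACCEPT   # allow traffic destined for the flannel network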
With that, the flannel configuration lab is essentially complete.