Docker (11: Docker Swarm, Part 2): 5. How Swarm Implements Failover; 6. How to Access a Service; 7. Swarm's Routing Mesh

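5. How Swarm Implements Failover

Failures are inevitable: containers can crash and Docker hosts can go down. Fortunately, Swarm has a built-in failover strategy.
When we created the service, we did not tell swarm what to do when a failure occurs. We only declared the desired state (for example, 3 replicas), and swarm does its best to reach and maintain that state, whatever happens.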
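
The desired-state idea can be illustrated with a toy reconciliation function. This is only a sketch under stated assumptions: the function name `reconcile`, its parameters, and the "least-loaded healthy node" rule are hypothetical simplifications, and the real Swarm scheduler also weighs constraints and resources.

```python
from collections import Counter


def reconcile(service, desired_replicas, tasks, healthy_nodes):
    """Return a placement that satisfies the desired replica count.

    tasks maps a slot name (e.g. "web_server.2") to the node it runs on.
    Slots whose node is no longer healthy are recreated on the healthy
    node that currently runs the fewest tasks (hypothetical rule).
    """
    if not healthy_nodes:
        raise RuntimeError("no healthy node available to schedule tasks")

    # Keep only the tasks whose node is still healthy.
    placement = {slot: node for slot, node in tasks.items() if node in healthy_nodes}
    load = Counter(placement.values())

    # Recreate every missing slot until the desired count is met again.
    for index in range(1, desired_replicas + 1):
        slot = f"{service}.{index}"
        if slot not in placement:
            target = min(healthy_nodes, key=lambda node: load[node])
            placement[slot] = target
            load[target] += 1
    return placement


# Placement from the transcript below, before swarm-worker1 is shut down.
before = {
    "web_server.1": "swarm-worker2",
    "web_server.2": "swarm-worker1",
    "web_server.3": "swarm-worker1",
}
after = reconcile("web_server", 3, before, healthy_nodes=["swarm-worker2"])
print(after)
```

With swarm-worker2 as the only healthy worker, all three slots end up there, which matches the result of the failover test in this section.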

[root@swarm-manager ~]# docker service ps web_server 
ID                  NAME                IMAGE               NODE                DESIRED S
waibnto5emfg        web_server.1        httpd:latest        swarm-worker2       Running  
wt8p6ogq9ufa        web_server.2        httpd:latest        swarm-worker1       Running  
saisz4jtz9sl        web_server.3        httpd:latest        swarm-worker1       Running  
ep0b1ao3wob8         \_ web_server.3    httpd:latest        swarm-manager       Shutdown 
[root@swarm-manager ~]# 

The three replicas currently run on swarm-worker1 (two replicas) and swarm-worker2 (one replica). To test swarm's failover, we now shut down swarm-worker1.

Shut down swarm-worker1.

Swarm detects that swarm-worker1 is down and starts new containers on swarm-worker2:

[root@swarm-manager ~]# docker service ps web_server 
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE            ERROR               PORTS
waibnto5emfg        web_server.1        httpd:latest        swarm-worker2       Running             Running 59 minutes ago                       
a01unr3dlz11        web_server.2        httpd:latest        swarm-worker2       Ready               Ready 2 seconds ago                          
wt8p6ogq9ufa         \_ web_server.2    httpd:latest        swarm-worker1       Shutdown            Running 13 seconds ago                       
xvk85uad8m9g        web_server.3        httpd:latest        swarm-worker2       Ready               Ready 2 seconds ago                          
saisz4jtz9sl         \_ web_server.3    httpd:latest        swarm-worker1       Shutdown            Running 13 seconds ago                                          
[root@swarm-manager ~]#

Failover is complete: all three replicas now run on swarm-worker2.

[root@swarm-manager ~]# docker service ps web_server 
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE                ERROR               PORTS
waibnto5emfg        web_server.1        httpd:latest        swarm-worker2       Running             Running about an hour ago                        
a01unr3dlz11        web_server.2        httpd:latest        swarm-worker2       Running             Running 47 seconds ago                           
wt8p6ogq9ufa         \_ web_server.2    httpd:latest        swarm-worker1       Shutdown            Running about a minute ago                       
xvk85uad8m9g        web_server.3        httpd:latest        swarm-worker2       Running             Running 47 seconds ago                           
saisz4jtz9sl         \_ web_server.3    httpd:latest        swarm-worker1       Shutdown            Running about a minute ago                                             
[root@swarm-manager ~]# 

swarm-worker1 has been marked as Down:

[root@swarm-manager ~]# docker node ls
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
9ggtulpdf9mr4x66o9defr56k *   swarm-manager       Ready               Drain               Leader              19.03.8
lw2jxibv485xttqjeyxjxjckw     swarm-worker1       Down                Active                                  19.03.8
rdbwovw93wk3gb2yqjpipjgyd     swarm-worker2       Ready               Active                                  19.03.8
[root@swarm-manager ~]# 
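
A check like the one above can also be scripted. The following is a minimal sketch, assuming the script runs on a manager node with the docker CLI installed; the helper names (`parse_nodes`, `list_nodes`, `down_nodes`) and the `|` separator are choices made for this example. Only the parsing step is exercised here, using sample text that mirrors the table above.

```python
import subprocess

# Go-template fields supported by `docker node ls --format`.
NODE_FORMAT = "{{.Hostname}}|{{.Status}}|{{.Availability}}"


def parse_nodes(text):
    """Parse 'hostname|status|availability' lines into a dict keyed by hostname."""
    nodes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        hostname, status, availability = line.strip().split("|")
        nodes[hostname] = {"status": status, "availability": availability}
    return nodes


def list_nodes():
    """Query the local swarm manager (not called in this example)."""
    result = subprocess.run(
        ["docker", "node", "ls", "--format", NODE_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return parse_nodes(result.stdout)


def down_nodes(nodes):
    """Return the hostnames whose status is Down, sorted by name."""
    return sorted(name for name, info in nodes.items() if info["status"] == "Down")


# Sample text mirroring the `docker node ls` table above.
sample = (
    "swarm-manager|Ready|Drain\n"
    "swarm-worker1|Down|Active\n"
    "swarm-worker2|Ready|Active\n"
)
print(down_nodes(parse_nodes(sample)))
```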

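6. How to Access a Service

To make the analysis easier, we redeploy web_server.

① Run docker service rm to delete web_server; all replicas (containers) of the service are removed.
② Recreate the service, this time using --replicas=2 to create two replicas directly.
③ One replica runs on each worker node.

Check how the containers of the new service are distributed: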

[root@swarm-manager ~]# docker service ps web_server
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE           ERROR               PORTS
n6aup7g9qe3w        web_server.1        httpd:latest        swarm-worker1       Running             Running 5 minutes ago                       
k1k7zmy3nbiw        web_server.2        httpd:latest        swarm-worker2       Running             Running 5 minutes ago                       
[root@swarm-manager ~]# 

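Next, look at a container's internal IP address, taking web_server.1 on swarm-worker1 as the example. swarm-worker1 runs one container, a replica of web_server. The container listens on port 80, but the port is not mapped to the Docker host, so the service can only be reached through the container's IP. Let's look up that IP.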

Method 1

[root@swarm-worker1 ~]# docker inspect web_server.1.n6aup7g9qe3wuh6fi4qjqltm1 | jq '.[0].NetworkSettings'
{
  "Bridge": "",
  "SandboxID": "5acfc48ce5cf469975c9e089395e1982e97d717e283e29cfdd8e4785e0a2d9d9",
  "HairpinMode": false,
  "LinkLocalIPv6Address": "",
  "LinkLocalIPv6PrefixLen": 0,
  "Ports": {
    "80/tcp": null
  },
  "SandboxKey": "/var/run/docker/netns/5acfc48ce5cf",
  "SecondaryIPAddresses": null,
  "SecondaryIPv6Addresses": null,
  "EndpointID": "0019aec7e0088b9f25f3117073a8855ad8347547aa979b1d4263f5a7fa8025f1",
  "Gateway": "172.17.0.1",
  "GlobalIPv6Address": "",
  "GlobalIPv6PrefixLen": 0,
  "IPAddress": "172.17.0.2",
  "IPPrefixLen": 16,
  "IPv6Gateway": "",
  "MacAddress": "02:42:ac:11:00:02",
  "Networks": {
    "bridge": {
      "IPAMConfig": null,
      "Links": null,
      "Aliases": null,
      "NetworkID": "4861ecf0308533c9b8098628afb6fdad00104304227bcda2bad1524b89bbe408",
      "EndpointID": "0019aec7e0088b9f25f3117073a8855ad8347547aa979b1d4263f5a7fa8025f1",
      "Gateway": "172.17.0.1",
      "IPAddress": "172.17.0.2",
      "IPPrefixLen": 16,
      "IPv6Gateway": "",
      "GlobalIPv6Address": "",
      "GlobalIPv6PrefixLen": 0,
      "MacAddress": "02:42:ac:11:00:02",
      "DriverOpts": null
    }
  }
}
[root@swarm-worker1 ~]# 

Method 2

 docker exec web_server.1.n6aup7g9qe3wuh6fi4qjqltm1  ip r

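The container's IP is 172.17.0.2, which means it is attached to Docker's default bridge network.

We can access the container's HTTP service directly from swarm-worker1.

This is access at the container level only. The service is not exposed to the external network and can be reached only from the Docker host itself. In other words, with the current configuration, the web_server service cannot be accessed from outside.

Accessing the Service Externally

To expose the service externally, run the following command: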

[root@swarm-manager ~]# docker service update --publish-add 8080:80 web_server
web_server
overall progress: 0 out of 2 tasks 
1/2: preparing 
2/2:   
service update paused: update paused due to failure or early termination of task l3xw1g63lc7iwtcd1gkhka0p9
[root@swarm-manager ~]# 

When creating a new service, you can use the --publish option directly, for example:

docker service create --name web_server --publish 8080:80 --replicas=2 httpd

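The container listens for HTTP requests on port 80. --publish-add 8080:80 maps the container's port 80 to port 8080 on the host, so the service becomes reachable from the external network.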
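
To make the direction of the mapping explicit, here is a small sketch that splits the short `published:target` form used above. The function name `parse_publish` is made up for this example, and the sketch handles only the short form with an optional `/tcp` or `/udp` suffix, not every variant the CLI accepts.

```python
def parse_publish(spec):
    """Split a short publish spec such as '8080:80' or '8080:80/udp'.

    The left number is the port published on the swarm nodes; the right
    number is the target port the container listens on.
    """
    ports, _, protocol = spec.partition("/")
    published, target = ports.split(":")
    return {
        "published": int(published),   # port opened on the hosts
        "target": int(target),         # port inside the container
        "protocol": protocol or "tcp", # tcp when no suffix is given
    }


mapping = parse_publish("8080:80")
print(mapping)
```

The key names follow the long form of the same option, `--publish published=8080,target=80`.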

[root@swarm-manager ~]# curl http://192.168.1.109:8080   # swarm-worker1 IP: verify external access
<html><body><h1>It works!</h1></body></html>
[root@swarm-manager ~]# curl http://192.168.1.108:8080   # swarm-worker2 IP: verify external access
<html><body><h1>It works!</h1></body></html>
[root@swarm-manager ~]# curl http://192.168.1.107:8080   # swarm-manager IP: verify external access (swarm-manager runs no container, yet the request succeeds; this is swarm's routing mesh feature)
<html><body><h1>It works!</h1></body></html>

Running curl against port 8080 on any node in the cluster reaches web_server.

This is one of the benefits of using swarm. The feature is called the routing mesh.

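7. Swarm's Routing Mesh

This is swarm's routing mesh: when an external client accesses port 8080 on any node, swarm's internal load balancer forwards the request to one of the web_server replicas, roughly as shown in the figure below:

No matter which node is accessed, even if that node runs no replica of the service, the request still reaches the service.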
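
The behavior can be modeled with a toy class. This is a sketch under stated assumptions, not how swarm is implemented: the class name `RoutingMesh` is hypothetical, and round-robin selection is assumed here purely for illustration.

```python
from itertools import cycle


class RoutingMesh:
    """Toy model: every node accepts requests, replicas take turns serving them."""

    def __init__(self, nodes, replicas):
        self.nodes = set(nodes)
        # Assumed policy: plain round-robin over all replicas.
        self._backends = cycle(replicas)

    def request(self, node):
        """Send a request to the published port on `node`; return the serving replica."""
        if node not in self.nodes:
            raise ValueError(f"{node} is not part of the swarm")
        # The entry node does not matter: the next replica in turn answers.
        return next(self._backends)


NODES = ["swarm-manager", "swarm-worker1", "swarm-worker2"]
REPLICAS = [("web_server.1", "swarm-worker1"), ("web_server.2", "swarm-worker2")]

mesh = RoutingMesh(NODES, REPLICAS)
for entry in ["swarm-manager", "swarm-manager", "swarm-worker1"]:
    task, host = mesh.request(entry)
    print(f"request via {entry} -> {task} on {host}")
```

In this model, requests entering through swarm-manager are still served, even though swarm-manager runs no replica.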
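We can also put an external load balancer in front of the swarm to route requests to the service, for example HAProxy configured to distribute requests to port 8080 on each node.
As shown in the figure below, this gives two layers of load balancing.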
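
As a rough illustration of the outer layer, the sketch below generates an HAProxy backend section that lists port 8080 on each node. The helper name `haproxy_backend` and the backend name are assumptions for this example, and the node IPs are the ones used in the curl tests earlier. A complete HAProxy configuration would also need global, defaults, and frontend sections, which are not shown.

```python
def haproxy_backend(name, nodes, port):
    """Build an HAProxy 'backend' section with one server line per swarm node."""
    lines = [f"backend {name}", "    balance roundrobin"]
    for hostname, ip in nodes.items():
        # 'check' enables HAProxy health checks for this server.
        lines.append(f"    server {hostname} {ip}:{port} check")
    return "\n".join(lines)


# Node addresses taken from the curl examples in this article.
SWARM_NODES = {
    "swarm-manager": "192.168.1.107",
    "swarm-worker2": "192.168.1.108",
    "swarm-worker1": "192.168.1.109",
}

config = haproxy_backend("web_server", SWARM_NODES, 8080)
print(config)
```

Because the routing mesh answers on every node, the outer load balancer does not need to know which nodes actually run a replica.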

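The ingress Network

When we apply --publish-add, swarm reconfigures the service, and the containers change as follows. The walkthrough below uses a tomcat service, which exposes port 8080, so the mapping used there is 8080:8080.

 1. Create a new service; at this point it has no published port.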

[root@swarm-manager ~]# docker service create --name web_server --replicas 2 tomcat
rpnazgtqu5besyf9nu37k0mrp
overall progress: 2 out of 2 tasks 
1/2: running   
2/2: running   
verify: Service converged 

Check the web_server service. No replica runs on swarm-manager because that node works as a manager only (its availability is Drain).

[root@swarm-manager ~]# docker service ps web_server
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE                ERROR               PORTS
vo9qpszyizvg        web_server.1        tomcat:latest       swarm-worker1       Running             Running 43 seconds ago                           
tb07x0zd80m5        web_server.2        tomcat:latest       swarm-worker2       Running             Running about a minute ago                       
[root@swarm-manager ~]# 

Check the containers on each worker:

[root@swarm-worker1 ~]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
e0ebf913c320        tomcat:latest       "catalina.sh run"   9 minutes ago       Up 9 minutes        8080/tcp            web_server.1.vo9qpszyizvgcvhrat8o6ftx4
[root@swarm-worker1 ~]# 
[root@swarm-worker2 ~]# docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
ee290ffb8548        tomcat:latest       "catalina.sh run"   23 minutes ago      Up 23 minutes       8080/tcp            web_server.2.tb07x0zd80m5wbuk3md4qephf
[root@swarm-worker2 ~]# 

When --publish-add 8080:8080 is applied, swarm reconfigures the service:

[root@swarm-manager ~]# docker service update --publish-add  8080:8080 web_server
web_server
overall progress: 2 out of 2 tasks 
1/2: running   
2/2: running   
verify: Service converged 
[root@swarm-manager ~]#
[root@swarm-manager ~]# docker service ps web_server
ID                  NAME                IMAGE               NODE                DESIRED STATE       CURRENT STATE                     ERROR               PORTS
lyt5rqclwmb7        web_server.1        tomcat:latest       swarm-worker1       Running             Running less than a second ago                        
vo9qpszyizvg         \_ web_server.1    tomcat:latest       swarm-worker1       Shutdown            Shutdown less than a second ago                       
85rzce2kmove        web_server.2        tomcat:latest       swarm-worker2       Running             Running less than a second ago                        
tb07x0zd80m5         \_ web_server.2    tomcat:latest       swarm-worker2       Shutdown            Shutdown less than a second ago                       
[root@swarm-manager ~]#                     

All of the previous replicas were shut down, and new replicas were started. Let's look at the network configuration of the new containers.

[root@swarm-worker1 ~]# docker exec  web_server.1.lyt5rqclwmb7uy4pei2v1t89q ip r
default via 172.18.0.1 dev eth1 
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.7 
172.18.0.0/16 dev eth1 proto kernel scope link src 172.18.0.3 
[root@swarm-worker1 ~]# 
 
[root@swarm-worker2 ~]# docker exec web_server.2.85rzce2kmovex0z4qczvf08bo ip r
default via 172.18.0.1 dev eth1 
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.6 
172.18.0.0/16 dev eth1 proto kernel scope link src 172.18.0.3 
[root@swarm-worker2 ~]# 

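The container's network is now very different from what it was before --publish-add. The container has two network interfaces, each connected to a different Docker network.

In fact:

  1. eth0 is connected to an overlay network named ingress, which lets containers running on different hosts communicate with each other.

  2. eth1 is connected to a bridge network named docker_gwbridge, which lets the container reach external networks.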
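
The interface-to-network mapping can be read straight from the `ip r` output shown above. The following sketch parses that output with the standard `ipaddress` module; the function name `parse_routes` is made up for this example, and the sample text is the output captured on swarm-worker1.

```python
import ipaddress


def parse_routes(output):
    """Map each interface to (subnet, source address) from `ip r` output."""
    routes = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the default route, and lines without dev/src.
        if not fields or fields[0] == "default":
            continue
        if "dev" not in fields or "src" not in fields:
            continue
        subnet = ipaddress.ip_network(fields[0])
        device = fields[fields.index("dev") + 1]
        source = ipaddress.ip_address(fields[fields.index("src") + 1])
        routes[device] = (subnet, source)
    return routes


# `ip r` output of web_server.1 on swarm-worker1, copied from above.
IP_ROUTE_OUTPUT = """\
default via 172.18.0.1 dev eth1
10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.7
172.18.0.0/16 dev eth1 proto kernel scope link src 172.18.0.3
"""

for device, (subnet, source) in sorted(parse_routes(IP_ROUTE_OUTPUT).items()):
    print(f"{device}: {source} in {subnet}")
```

Here eth0 (10.0.0.7) sits in the 10.0.0.0/24 subnet of the ingress overlay, and eth1 (172.18.0.3) sits in the 172.18.0.0/16 subnet of docker_gwbridge.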

[root@swarm-worker1 ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
1053874d4c05        bridge              bridge              local
e1f61fad2c8e        docker_gwbridge     bridge              local
ed70c23b9dc0        host                host                local
ps22zeddizrp        ingress             overlay             swarm
db704745faea        none                null                local
[root@swarm-worker1 ~]# 
[root@swarm-worker2 ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
629e0e29cb41        bridge              bridge              local
5f4edd2f9569        docker_gwbridge     bridge              local
ed70c23b9dc0        host                host                local
ps22zeddizrp        ingress             overlay             swarm
db704745faea        none                null                local
[root@swarm-worker2 ~]# 

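The ingress network is created automatically by Docker when the swarm is created, and every node in the swarm can use it.

Through the overlay network, hosts and containers, as well as containers and other containers, can reach each other. In addition, the routing mesh routes external requests to containers on different hosts, which is how the external network gains access to the service.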
