I. Introduction to Docker Swarm
The link above points to the official Docker Swarm documentation.
II. Creating a Docker Swarm
1. Environment
OS version | Docker version | Hostname | IP address |
Ubuntu 20.04.02 | 20.10.12 | docker-swarm-manager | 192.168.100.10 |
Ubuntu 20.04.02 | 20.10.12 | docker-swarm-node1 | 192.168.100.11 |
Ubuntu 20.04.02 | 20.10.12 | docker-swarm-node2 | 192.168.100.12 |
Ubuntu 20.04.02 | 20.10.12 | docker-swarm-node3 | 192.168.100.13 |
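2. Host settings for the swarm (ports that must be open between the hosts)
- TCP port 2377 for cluster management communication
- TCP and UDP port 7946 for communication among nodes
- UDP port 4789 for overlay network traffic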
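The port list above could be turned into firewall rules. The following is a minimal sketch that assumes the hosts use ufw; the original text does not name a firewall, and the function name `swarm_ufw_rules` is made up for illustration. By default it only prints the commands.

```shell
# Print the ufw commands that would open the swarm ports listed above.
# Assumption: the hosts use ufw; the original text does not name a firewall.
swarm_ufw_rules() {
  for rule in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
    echo "ufw allow ${rule}"
  done
}

# Dry run: show the commands instead of executing them.
swarm_ufw_rules
# To apply them as root on each host: swarm_ufw_rules | sh
```

Printing first and piping to `sh` only when satisfied keeps the sketch safe to run on a machine where the firewall should not be touched.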
3. Initializing the swarm
1. Create a new swarm
docker swarm init --advertise-addr 192.168.100.10:2377
Swarm initialized: current node (pi68on0hdlfghbt0aqi7hssfr) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-5ntm061z0zqeti3d7leo1470kti1397haq7dbwdwzysxz8r70q-8itm8rifkjh0j8we0dasf8cqv 192.168.100.10:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
2. Check the swarm status
docker info
...snip...
Swarm: active
NodeID: pi68on0hdlfghbt0aqi7hssfr
Is Manager: true
ClusterID: iqd9k1j8pm32vvmrh1yai5xv2
Managers: 1
Nodes: 1
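For scripts, the same status can be read without scanning the full `docker info` output. This is a small sketch, not part of the original walkthrough: it uses the `--format` option of `docker info` with the `.Swarm.LocalNodeState` field, and the helper name `describe_swarm_state` is hypothetical.

```shell
# Map a swarm state string (as printed by
# `docker info --format '{{.Swarm.LocalNodeState}}'`) to a short message.
describe_swarm_state() {
  case "$1" in
    active)   echo "node is part of a swarm" ;;
    inactive) echo "node is not part of a swarm" ;;
    *)        echo "swarm state: $1" ;;
  esac
}

# Only query the daemon when the docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  describe_swarm_state "$(docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null)"
fi
```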
3. View the node status
docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
pi68on0hdlfghbt0aqi7hssfr * docker-swarm-manager Ready Active Leader 20.10.12
4. Joining nodes
1. If you have forgotten the join token, run the following command on the manager
docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-5ntm061z0zqeti3d7leo1470kti1397haq7dbwdwzysxz8r70q-8itm8rifkjh0j8we0dasf8cqv 192.168.100.10:2377
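When the join command has to be assembled by a script, the `-q` flag of `docker swarm join-token` prints only the token. The sketch below assumes the manager address used in this article; the helper name `build_join_cmd` is hypothetical.

```shell
# Build the worker join command from a token and a manager address.
build_join_cmd() {
  echo "docker swarm join --token $1 $2"
}

# On the manager, `docker swarm join-token -q worker` prints only the token.
# The lookup is skipped when docker is missing or this host is not a manager.
if command -v docker >/dev/null 2>&1 \
   && token="$(docker swarm join-token -q worker 2>/dev/null)"; then
  build_join_cmd "$token" "192.168.100.10:2377"
fi
```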
2. Open a terminal on node1 (or connect over SSH) and join the swarm
docker swarm join \
--token SWMTKN-1-5ntm061z0zqeti3d7leo1470kti1397haq7dbwdwzysxz8r70q-8itm8rifkjh0j8we0dasf8cqv \
192.168.100.10:2377
This node joined a swarm as a worker.
3. Check the node status on the manager
docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
pi68on0hdlfghbt0aqi7hssfr * docker-swarm-manager Ready Active Leader 20.10.12
9wl85wofncens0op1fht44onh docker-swarm-node1 Ready Active 20.10.12
4. Repeat the same steps on node2 and node3 until the swarm reaches the following state
docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
pi68on0hdlfghbt0aqi7hssfr * docker-swarm-manager Ready Active Leader 20.10.12
9wl85wofncens0op1fht44onh docker-swarm-node1 Ready Active 20.10.12
wx8o91l3iphjpp6sifwa3t67n docker-swarm-node2 Ready Active 20.10.12
vt61077ko09lvpfgg80wa6sa5 docker-swarm-node3 Ready Active 20.10.12
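The listing above can also be checked automatically. A rough sketch, assuming the input is one "hostname status" pair per line, which is what `docker node ls --format '{{.Hostname}} {{.Status}}'` produces on a manager; the helper name `nodes_not_ready` is hypothetical, and the sample input simply repeats the hostnames from the listing.

```shell
# Read "hostname status" pairs on stdin and print the hosts that are not Ready.
nodes_not_ready() {
  awk '$2 != "Ready" { print $1 }'
}

# Sample input mirroring the listing above. On a manager, feed it from:
#   docker node ls --format '{{.Hostname}} {{.Status}}'
printf '%s\n' \
  'docker-swarm-manager Ready' \
  'docker-swarm-node1 Ready' \
  'docker-swarm-node2 Ready' \
  'docker-swarm-node3 Ready' | nodes_not_ready
```

With all four nodes Ready the sketch prints nothing, so an empty result means the swarm is complete.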
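III. Creating a graphical UI with Visualizer
1. Deploy the Visualizer container directly. It publishes port 8080; if that port is already in use, choose a different one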
docker run -d -p 8080:8080 -e HOST=172.16.0.10 -e PORT=8080 \
-v /var/run/docker.sock:/var/run/docker.sock \
--name visualizer dockersamples/visualizer
// -e HOST sets the HOST environment variable inside the container
2. Open the Visualizer page
http://192.168.100.10:8080
Confirm that 1 manager and 3 workers are displayed correctly
IV. Deploying services
1. Create a service
docker service create --replicas 8 --name web -p 80:80 nginx
mvtevaaz4gns1to94xcmcg44s
overall progress: 8 out of 8 tasks
1/8: running [==================================================>]
2/8: running [==================================================>]
3/8: running [==================================================>]
4/8: running [==================================================>]
5/8: running [==================================================>]
6/8: running [==================================================>]
7/8: running [==================================================>]
8/8: running [==================================================>]
verify: Service converged
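docker service create: the command that creates a service
--name web: names the service web
--replicas 8: sets the desired state to 8 running instances; each instance is one container
-p 80:80 nginx: runs the nginx image and publishes the container's port 80 on port 80 of the swarm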
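The same service can be written with the long form of the publish flag, which names the published and target ports explicitly. This is only a sketch that prints the command; the helper name `service_create_cmd` and its argument order are made up for illustration.

```shell
# Print a `docker service create` call that uses the long --publish syntax.
# Arguments: service name, replica count, port (published and target), image.
service_create_cmd() {
  echo "docker service create --name $1 --replicas $2" \
       "--publish published=$3,target=$3 $4"
}

# Equivalent of the command above: 8 nginx replicas published on port 80.
service_create_cmd web 8 80 nginx
```

The long form is easier to read once published and target ports differ, for example `published=8080,target=80`.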
2. Inspect the service
Manager node
docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
mvtevaaz4gns web replicated 8/8 nginx:latest *:80->80/tcp
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b01461d7d65f nginx:latest "/docker-entrypoint.…" 3 minutes ago Up 3 minutes 80/tcp web.3.jastpsuzc79xxvucx17rfsb9q
410b4a32b2f7 nginx:latest "/docker-entrypoint.…" 3 minutes ago Up 3 minutes 80/tcp web.7.pr506t1svbnfsp56gnn4p4g7d
node1 (node2 and node3 look analogous)
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
4c645bf8f22e nginx:latest "/docker-entrypoint.…" 3 minutes ago Up 3 minutes 80/tcp web.2.lqqr8o8qc74vtxodc2xiyklme
aa5298c9bdab nginx:latest "/docker-entrypoint.…" 3 minutes ago Up 3 minutes 80/tcp web.6.x7k5rpxnd8lrt3iqgbsnurcwd
Graphical interface
Confirm that the CONTAINER IDs shown in the graphical interface match the command output
docker service ps web
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
bswlg5zgveyo web.1 nginx:latest docker-swarm-node2 Running Running 11 minutes ago
lqqr8o8qc74v web.2 nginx:latest docker-swarm-node1 Running Running 11 minutes ago
jastpsuzc79x web.3 nginx:latest docker-swarm-manager Running Running 11 minutes ago
yi4zi4dwcwbu web.4 nginx:latest docker-swarm-node3 Running Running 11 minutes ago
sdd1qk04o26i web.5 nginx:latest docker-swarm-node2 Running Running 11 minutes ago
x7k5rpxnd8lr web.6 nginx:latest docker-swarm-node1 Running Running 11 minutes ago
pr506t1svbnf web.7 nginx:latest docker-swarm-manager Running Running 11 minutes ago
l5t467cpy3j5 web.8 nginx:latest docker-swarm-node3 Running Running 11 minutes ago
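To see how the tasks are spread without reading the whole table, the NODE column can be counted. A minimal sketch: the helper name `tasks_per_node` is hypothetical, and the sample input is the NODE column copied from the listing above. On a manager the same input would come from `docker service ps web --filter desired-state=running --format '{{.Node}}'`.

```shell
# Count tasks per node from a list of node names on stdin (one per line).
# Output format: "<node> <count>".
tasks_per_node() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# NODE column copied from the `docker service ps web` listing above.
printf '%s\n' \
  docker-swarm-node2 docker-swarm-node1 docker-swarm-manager docker-swarm-node3 \
  docker-swarm-node2 docker-swarm-node1 docker-swarm-manager docker-swarm-node3 \
  | tasks_per_node
```

For the listing above every node ends up with two tasks, which matches the even spread that the scheduler aims for.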
3. Scale the number of replicas up or down
docker service scale web=5
web scaled to 5
overall progress: 5 out of 5 tasks
1/5: running [==================================================>]
2/5: running [==================================================>]
3/5: running [==================================================>]
4/5: running [==================================================>]
5/5: running [==================================================>]
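If the scale value is greater than the current replica count, replicas are added; if it is smaller, replicas are removed. This example scales down.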
docker service ps web
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
bswlg5zgveyo web.1 nginx:latest docker-swarm-node2 Running Running 16 minutes ago
lqqr8o8qc74v web.2 nginx:latest docker-swarm-node1 Running Running 16 minutes ago
jastpsuzc79x web.3 nginx:latest docker-swarm-manager Running Running 16 minutes ago
yi4zi4dwcwbu web.4 nginx:latest docker-swarm-node3 Running Running 16 minutes ago
sdd1qk04o26i web.5 nginx:latest docker-swarm-node2 Running Running 16 minutes ago
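The rule for scaling up or down can be sketched as a small comparison. The helper name `scale_direction` is hypothetical; on a manager the current count could be read with `docker service inspect --format '{{.Spec.Mode.Replicated.Replicas}}' web`.

```shell
# Compare the current replica count with the target and report the direction.
# Arguments: current replicas, target replicas.
scale_direction() {
  if [ "$2" -gt "$1" ]; then
    echo up
  elif [ "$2" -lt "$1" ]; then
    echo down
  else
    echo unchanged
  fi
}

# The example above: 8 replicas scaled to 5.
scale_direction 8 5
```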
Graphical interface view
4. Stop the manager node from running containers. To drain a different node, replace the manager's name with that node's name
docker node update --availability drain docker-swarm-manager
docker-swarm-manager
docker service ps web
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
bswlg5zgveyo web.1 nginx:latest docker-swarm-node2 Running Running 23 minutes ago
lqqr8o8qc74v web.2 nginx:latest docker-swarm-node1 Running Running 23 minutes ago
3ccpb8wnkkpj web.3 nginx:latest docker-swarm-node1 Running Running 34 seconds ago
jastpsuzc79x \_ web.3 nginx:latest docker-swarm-manager Shutdown Shutdown 35 seconds ago
yi4zi4dwcwbu web.4 nginx:latest docker-swarm-node3 Running Running 22 minutes ago
sdd1qk04o26i web.5 nginx:latest docker-swarm-node2 Running Running 23 minutes ago
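// Once drained, the manager no longer receives new tasks, and the tasks already running on it are shut down and rescheduled onto other nodes
// --availability accepts three values:
active: the node accepts new tasks; pause: the node accepts no new tasks, but existing tasks keep running; drain: the node accepts no new tasks, and existing tasks are shut down and rescheduled elsewhere
Here the task on the manager was shut down and moved to node1; the graphical interface looks as follows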
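The three availability values can be summarized in a small lookup. This is a sketch with a hypothetical helper name, `describe_availability`; the current value of a node can be read on a manager with `docker node inspect --format '{{.Spec.Availability}}' docker-swarm-manager`.

```shell
# Describe what each --availability value means for task scheduling.
describe_availability() {
  case "$1" in
    active) echo "accepts new tasks" ;;
    pause)  echo "no new tasks, existing tasks keep running" ;;
    drain)  echo "no new tasks, existing tasks are shut down and rescheduled" ;;
    *)      echo "unknown availability: $1"; return 1 ;;
  esac
}

# The manager in the example above was set to drain.
describe_availability drain
```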
5. Allow the manager node to run containers again
docker node update --availability active docker-swarm-manager
docker-swarm-manager
docker service ps web
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
bswlg5zgveyo web.1 nginx:latest docker-swarm-node2 Running Running 23 minutes ago
lqqr8o8qc74v web.2 nginx:latest docker-swarm-node1 Running Running 23 minutes ago
3ccpb8wnkkpj web.3 nginx:latest docker-swarm-node1 Running Running 34 seconds ago
jastpsuzc79x \_ web.3 nginx:latest docker-swarm-manager Shutdown Shutdown 35 seconds ago
yi4zi4dwcwbu web.4 nginx:latest docker-swarm-node3 Running Running 22 minutes ago
sdd1qk04o26i web.5 nginx:latest docker-swarm-node2 Running Running 23 minutes ago
As shown, after the manager is restored, web.3 on node1 does not move back to the manager; the service has to be scaled again
docker service scale web=8
web scaled to 8
overall progress: 8 out of 8 tasks
1/8: running [==================================================>]
2/8: running [==================================================>]
3/8: running [==================================================>]
4/8: running [==================================================>]
5/8: running [==================================================>]
6/8: running [==================================================>]
7/8: running [==================================================>]
8/8: running [==================================================>]
verify: Service converged
Inspecting the service shows that the manager runs containers again, but the earlier web.3 task on it is not restored
docker service ps web
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
bswlg5zgveyo web.1 nginx:latest docker-swarm-node2 Running Running 32 minutes ago
lqqr8o8qc74v web.2 nginx:latest docker-swarm-node1 Running Running 32 minutes ago
3ccpb8wnkkpj web.3 nginx:latest docker-swarm-node1 Running Running 9 minutes ago
jastpsuzc79x \_ web.3 nginx:latest docker-swarm-manager Shutdown Shutdown 9 minutes ago
yi4zi4dwcwbu web.4 nginx:latest docker-swarm-node3 Running Running 32 minutes ago
sdd1qk04o26i web.5 nginx:latest docker-swarm-node2 Running Running 32 minutes ago
83t42vs4ok1m web.6 nginx:latest docker-swarm-manager Running Running about a minute ago
5bam3kkyqfst web.7 nginx:latest docker-swarm-node3 Running Running about a minute ago
xam0e6o28kaq web.8 nginx:latest docker-swarm-manager Running Running about a minute ago
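Rescaling is the approach used in this article. As a possible alternative, `docker service update --force` makes the scheduler redistribute a service's tasks without changing the replica count; note that it restarts the running tasks. The sketch below only prints the command, and the helper name `rebalance_cmd` is hypothetical.

```shell
# Print the command that forces a redistribution of a service's tasks.
# Note: running it restarts the service's tasks.
rebalance_cmd() {
  echo "docker service update --force $1"
}

rebalance_cmd web
# To run it on a manager: rebalance_cmd web | sh
```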
The graphical interface shows the following
6. Stop the Docker service on node1
systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by:
docker.socket
Open the graphical interface and check node1: it turns red, and its tasks move to the manager and node3
Restore the Docker service on node1
systemctl start docker
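Check the graphical interface: the node has recovered, but no tasks are restarted on node1. To place tasks on it again, scale the service once more, as after the drain
7. Deploy two services at the same time (the previous service was removed beforehand)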
docker service create --replicas 4 --name web -p 80:80 nginx
krnna254c80y9qnzpalgegqcf
overall progress: 4 out of 4 tasks
1/4: running [==================================================>]
2/4: running [==================================================>]
3/4: running [==================================================>]
4/4: running [==================================================>]
verify: Service converged
docker service create --replicas 4 --name helloworld alpine ping docker.com
pyu39oz1zvtg8hjvtuhvtc1lv
overall progress: 4 out of 4 tasks
1/4: running [==================================================>]
2/4: running [==================================================>]
3/4: running [==================================================>]
4/4: running [==================================================>]
verify: Service converged
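Graphical interface view
The individual nodes are not inspected here; check them yourself if you are interested
V. Removing services
The containers currently running on node2 are as follows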
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ad971edaf5b5 nginx:latest "/docker-entrypoint.…" 43 minutes ago Up 43 minutes 80/tcp web.1.bswlg5zgveyotosf344xt8jul
2f03bc1777d6 nginx:latest "/docker-entrypoint.…" 43 minutes ago Up 43 minutes 80/tcp web.5.sdd1qk04o26if40msv77yxbgd
Remove the services on the manager node
docker service rm web
web
docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
docker service ps web
no such service: web
docker service rm helloworld
helloworld
docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
docker service ps helloworld
no such service: helloworld
Check the containers on node2 again: they are gone
docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
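The graphical interface looks as follows, the same as on the first visit, with no services running
VI. Removing nodes and leaving the swarm
node1 leaves the swarm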
docker swarm leave
Node left the swarm.
On the manager, check the node status and confirm that node1 is Down
docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
pi68on0hdlfghbt0aqi7hssfr * docker-swarm-manager Ready Active Leader 20.10.12
9wl85wofncens0op1fht44onh docker-swarm-node1 Down Active 20.10.12
wx8o91l3iphjpp6sifwa3t67n docker-swarm-node2 Ready Active 20.10.12
vt61077ko09lvpfgg80wa6sa5 docker-swarm-node3 Ready Active 20.10.12
Remove node1
docker node rm docker-swarm-node1
docker-swarm-node1
docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
pi68on0hdlfghbt0aqi7hssfr * docker-swarm-manager Ready Active Leader 20.10.12
wx8o91l3iphjpp6sifwa3t67n docker-swarm-node2 Ready Active 20.10.12
vt61077ko09lvpfgg80wa6sa5 docker-swarm-node3 Ready Active 20.10.12
Attempting to remove a node that is still running fails
docker node rm docker-swarm-node2
Error response from daemon: rpc error: code = FailedPrecondition desc = node wx8o91l3iphjpp6sifwa3t67n is not down and can't be removed
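The error above follows from the order of operations: a worker has to leave the swarm (and show as Down) before the manager can remove it. The sketch below prints one possible sequence; draining first is an added precaution not shown in the original walkthrough, and the helper name `node_removal_steps` is hypothetical. `docker node rm --force` also exists for nodes that cannot leave cleanly.

```shell
# Print a removal sequence for a worker node, with the host each step runs on.
node_removal_steps() {
  echo "manager: docker node update --availability drain $1"
  echo "$1: docker swarm leave"
  echo "manager: docker node rm $1"
}

node_removal_steps docker-swarm-node2
```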
After all nodes have been removed in the same way, the manager leaves the swarm
docker swarm leave --force
Node left the swarm.
docker node ls
Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.
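Open the graphical interface again: the whole swarm is gone
PS: The official site covers much more, such as docker swarm init --autolock; it is worth studying in detail if you are interested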