docker swarm
Suitable for small-scale services
Swarm cluster management
Initializing the swarm cluster
- Overview
# Command to initialize a swarm cluster
docker swarm init [OPTIONS]
# Options
  --advertise-addr string                  Advertised address (format: <ip|interface>[:port])
  --autolock                               Enable manager autolocking (requiring an unlock key to start a stopped manager)
  --availability string                    Availability of the node ("active"|"pause"|"drain") (default "active")
  --cert-expiry duration                   Validity period for node certificates (ns|us|ms|s|m|h) (default 2160h0m0s)
  --data-path-addr string                  Address or interface to use for data path traffic (format: <ip|interface>)
  --data-path-port uint32                  Port number to use for data path traffic (1024 - 49151). If no value is set or is set to 0, the default port (4789) is used.
  --default-addr-pool ipNetSlice           default address pool in CIDR format (default [])
  --default-addr-pool-mask-length uint32   default address pool subnet mask length (default 24)
  --dispatcher-heartbeat duration          Dispatcher heartbeat period (ns|us|ms|s|m|h) (default 5s)
  --external-ca external-ca                Specifications of one or more certificate signing endpoints
  --force-new-cluster                      Force create a new cluster from current state
  --listen-addr node-addr                  Listen address (format: <ip|interface>[:port]) (default 0.0.0.0:2377)
  --max-snapshots uint                     Number of additional Raft snapshots to retain
  --snapshot-interval uint                 Number of log entries between Raft snapshots (default 10000)
  --task-history-limit int                 Task history retention limit (default 5)
# Key option: --advertise-addr nodeIP (required when the host has more than one IP address)
# For production, adding --autolock is recommended. It generates an unlock key, and a manager
# that has restarted cannot manage services on the cluster until it is unlocked with that key.
# Node-local operations are still allowed on a locked manager, including dangerous ones such as
# docker swarm leave -f
- Demo
[root@master01 ~]# docker swarm init --advertise-addr 192.168.100.100 --autolock
Swarm initialized: current node (yqxoycfgndwmiyjynr54fuhsq) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-0m16mv1kjlk4ul04kfjmp0wrgeoyvw1xdhijt9akpemc8ul2mv-dx9rs1ihclocpvj0xkwbjkf92 192.168.100.100:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

To unlock a swarm manager after it restarts, run the `docker swarm unlock` command and provide the following key:

    SWMKEY-1-AqQyfkwjrMxW/6DPRnwM3BMXYQ1MOmE962WPxH4kpqg

Please remember to store this key in a password manager, since without it you
will not be able to restart the manager.
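With --autolock enabled, the unlock key from the init output is needed again whenever a manager's Docker daemon restarts. The sketch below wraps the related subcommands in small helpers. The function names are hypothetical; docker swarm unlock-key and docker swarm update --autolock are stock CLI subcommands, and the helpers assume they run on an unlocked manager.

```shell
#!/bin/sh
# Hypothetical helpers around the autolock-related swarm subcommands.

# Print only the current unlock key.
show_unlock_key() {
    docker swarm unlock-key -q
}

# Replace the unlock key with a new one, e.g. if the old key may have leaked.
rotate_unlock_key() {
    docker swarm unlock-key --rotate
}

# Turn autolock on for a swarm that was initialized without it.
enable_autolock() {
    docker swarm update --autolock=true
}

# After a manager restart, `docker swarm unlock` prompts for the key interactively.
```

Storing the output of show_unlock_key in a password manager right after init avoids being locked out after a restart.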
Cluster node management
- Joining the cluster
# Run the command printed by the init step to join a node to the swarm cluster as a worker
docker swarm join --token SWMTKN-1-0m16mv1kjlk4ul04kfjmp0wrgeoyvw1xdhijt9akpemc8ul2mv-dx9rs1ihclocpvj0xkwbjkf92 192.168.100.100:2377
# To join a node as a manager, run this on an existing manager; it prints the join command to use
docker swarm join-token manager
# If you lose the join command, docker swarm join-token (worker|manager) prints it again;
# add --rotate to generate a new token
[root@master01 ~]# docker swarm join-token --help

Usage:  docker swarm join-token [OPTIONS] (worker|manager)

Manage join tokens

Options:
  -q, --quiet    Only display token
      --rotate   Rotate join token
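As a minimal sketch of how the -q flag can be used, the helper below rebuilds the full join command for either role. The function name join_command is hypothetical, and port 2377 assumes the manager kept the default --listen-addr port.

```shell
#!/bin/sh
# Hypothetical helper: print a ready-to-run join command for a given role.
# Usage (on a manager): join_command worker 192.168.100.100
join_command() {
    role="$1"          # worker or manager
    manager_ip="$2"    # address the manager advertises
    token=$(docker swarm join-token -q "$role") || return 1
    # 2377 is the default listen port; adjust if --listen-addr was changed.
    echo "docker swarm join --token $token $manager_ip:2377"
}
```

The printed line can then be pasted on the node that should join.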
- Listing cluster nodes
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
106yy9rf4seympuk1zb8evn77     master02   Ready    Active                          20.10.17
- A node leaving the cluster voluntarily
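Before a node leaves, check whether it is a manager. If the cluster currently has only 2 manager nodes, one of them leaving costs the remaining manager its Raft majority and makes the cluster unavailable!
# First drain the containers off the node; forcing it to leave without draining takes services offline temporarily
[root@master01 ~]# docker node update --availability drain nodeName/nodeID
# Log in to the node that is leaving and run
[root@master02 /]# docker swarm leave -f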
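The leave procedure can be sketched as one helper that runs on a manager that stays in the cluster. The function name prepare_node_leave is hypothetical. The demote step is an addition to the notes: demoting a manager to worker before it leaves is the safer order, because the manager set shrinks cleanly.

```shell
#!/bin/sh
# Hypothetical helper, run on a manager that stays in the cluster:
# prepare NODE so that it can leave without surprises.
prepare_node_leave() {
    node="$1"
    role=$(docker node inspect --format '{{.Spec.Role}}' "$node") || return 1
    if [ "$role" = "manager" ]; then
        # Demote first so the manager set shrinks cleanly.
        docker node demote "$node" || return 1
    fi
    # Move running tasks elsewhere before the node goes away.
    docker node update --availability drain "$node"
}

# Afterwards, on the leaving node itself: docker swarm leave
```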
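- Removing a node
A node that has left voluntarily, or a host that has been replaced, must be removed explicitly from a manager.
Note: the node's status must be Down, otherwise it cannot be removed (unless docker node rm --force is used)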
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
106yy9rf4seympuk1zb8evn77     master02   Down     Active                          20.10.17
[root@master01 ~]# docker node rm master02
master02
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
[root@master01 ~]#
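When several nodes have left, finding the Down entries by eye gets tedious. A minimal sketch, assuming hostnames without spaces; the helper names down_nodes and remove_down_nodes are hypothetical, while --format with .Hostname and .Status is part of docker node ls.

```shell
#!/bin/sh
# Hypothetical helper: print hostnames of nodes whose STATUS is Down.
down_nodes() {
    docker node ls --format '{{.Hostname}} {{.Status}}' |
        awk '$2 == "Down" { print $1 }'
}

# Hypothetical helper: remove every Down node from the node list.
remove_down_nodes() {
    for node in $(down_nodes); do
        docker node rm "$node"
    done
}
```

Review the output of down_nodes before calling remove_down_nodes, since a node that is only temporarily unreachable also shows as Down.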
- Nodes in the Down state
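| Status | Cause | Recoverable? |
| --- | --- | --- |
| Down | The node left the cluster voluntarily | No; remove it on a manager, then join it again |
| Down | Docker service not running / host network unreachable | Yes; returns to Ready automatically once it is back up |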
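Node role management
There are two roles:
- manager: administers the cluster
- worker: runs the workload
The smallest highly available swarm cluster needs three hosts running three managers; it tolerates one host going down or one Docker service going offline.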
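The three-manager minimum follows from majority arithmetic: with N managers, a majority of floor(N/2) + 1 must stay reachable, so floor((N-1)/2) managers may fail. A small sketch of that calculation (the function names are illustrative):

```shell
#!/bin/sh
# Managers that must stay reachable for the swarm to accept changes.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Managers that may fail while the majority still holds.
fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Two managers tolerate zero failures, the same as one, which is why a second manager alone adds no availability.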
# Promote a worker node to manager
# Method 1: docker node promote
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
0m9ns2j13qnri3c7ajegbfyca     master02   Ready    Active                          20.10.17
0jyazzuwa848731ej7tcfbruo     master03   Ready    Active                          20.10.17
[root@master01 ~]# docker node promote master02
Node master02 promoted to a manager in the swarm.
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
0m9ns2j13qnri3c7ajegbfyca     master02   Ready    Active         Reachable        20.10.17
0jyazzuwa848731ej7tcfbruo     master03   Ready    Active                          20.10.17
[root@master01 ~]#
# Method 2: docker node update --role manager master03
# The node name is required; omitting it fails
[root@master01 ~]# docker node update --role manager
"docker node update" requires exactly 1 argument.
See 'docker node update --help'.

Usage:  docker node update [OPTIONS] NODE

Update a node
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
0m9ns2j13qnri3c7ajegbfyca     master02   Ready    Active         Reachable        20.10.17
0jyazzuwa848731ej7tcfbruo     master03   Ready    Active                          20.10.17
[root@master01 ~]# docker node update --role manager master03
master03
[root@master01 ~]# docker node ls
ID                            HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
ounkvqlzlzkmb0p4kqgmr4goh *   master01   Ready    Active         Leader           20.10.17
0m9ns2j13qnri3c7ajegbfyca     master02   Ready    Active         Reachable        20.10.17
0jyazzuwa848731ej7tcfbruo     master03   Ready    Active         Reachable        20.10.17
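Service management
- Creating a service
There are too many options to list here; see the help output of docker service create --help.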
# docker service create [OPTIONS] IMAGE [COMMAND] [ARG...]
# Simple example: create an nginx service with 2 replicas
[root@master01 ~]# docker service create --name my-nginx --replicas=2 nginx:latest
- Viewing services
- ls
- ps
- inspect
# List services
[root@master01 ~]# docker service ls
ID             NAME       MODE         REPLICAS   IMAGE          PORTS
c92sztnj0kwf   my-nginx   replicated   2/2        nginx:latest
# List the tasks of a service and the nodes they run on
[root@master01 ~]# docker service ps my-nginx
ID             NAME         IMAGE          NODE       DESIRED STATE   CURRENT STATE            ERROR   PORTS
ix9if5ckpix6   my-nginx.1   nginx:latest   master01   Running         Running 55 seconds ago
q3b0u82ohktr   my-nginx.2   nginx:latest   master02   Running         Running 55 seconds ago
# Show the service configuration in human-readable form
[root@master01 ~]# docker service inspect my-nginx --pretty
ID:             c92sztnj0kwfrwzgfdmkrs4za
Name:           my-nginx
Service Mode:   Replicated
 Replicas:      2
Placement:
UpdateConfig:
 Parallelism:   1
 On failure:    pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Update order:      stop-first
RollbackConfig:
 Parallelism:   1
 On failure:    pause
 Monitoring Period: 5s
 Max failure ratio: 0
 Rollback order:    stop-first
ContainerSpec:
 Image:         nginx:latest@sha256:0d17b565c37bcbd895e9d92315a05c1c3c9a29f762b011a10c54a66cd53c9b31
 Init:          false
Resources:
Endpoint Mode:  vip
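For scripts, single fields are easier to read with --format than from the --pretty output. A minimal sketch, assuming a replicated service; the helper names are hypothetical, and the template paths follow the JSON that docker service inspect prints.

```shell
#!/bin/sh
# Hypothetical helper: print the desired replica count of a replicated service.
desired_replicas() {
    docker service inspect --format '{{.Spec.Mode.Replicated.Replicas}}' "$1"
}

# Hypothetical helper: print the image reference the service is configured with.
service_image() {
    docker service inspect --format '{{.Spec.TaskTemplate.ContainerSpec.Image}}' "$1"
}
```

For example, desired_replicas my-nginx prints only the replica count.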
- Scaling (up/down)
# Scale up
[root@master01 ~]# docker service scale my-nginx=6
my-nginx scaled to 6
overall progress: 6 out of 6 tasks
1/6: running [==================================================>]
2/6: running [==================================================>]
3/6: running [==================================================>]
4/6: running [==================================================>]
5/6: running [==================================================>]
6/6: running [==================================================>]
verify: Service converged
[root@master01 ~]# docker service ps my-nginx
ID             NAME         IMAGE          NODE       DESIRED STATE   CURRENT STATE            ERROR   PORTS
ix9if5ckpix6   my-nginx.1   nginx:latest   master01   Running         Running 6 minutes ago
q3b0u82ohktr   my-nginx.2   nginx:latest   master02   Running         Running 6 minutes ago
yhsame8ip3f5   my-nginx.3   nginx:latest   master02   Running         Running 27 seconds ago
gx3kif1hatel   my-nginx.4   nginx:latest   master01   Running         Running 26 seconds ago
xdwaaiwczs0s   my-nginx.5   nginx:latest   master03   Running         Running 20 seconds ago
61buc8iz8yqd   my-nginx.6   nginx:latest   master03   Running         Running 20 seconds ago
# Scale down
[root@master01 ~]# docker service scale my-nginx=3
my-nginx scaled to 3
overall progress: 3 out of 3 tasks
1/3: running [==================================================>]
2/3: running [==================================================>]
3/3: running [==================================================>]
verify: Service converged
[root@master01 ~]# docker service ps my-nginx
ID             NAME             IMAGE          NODE       DESIRED STATE   CURRENT STATE             ERROR   PORTS
ix9if5ckpix6   my-nginx.1       nginx:latest   master01   Running         Running 10 minutes ago
599evfsdau6d   my-nginx.2       nginx:latest   master01   Running         Running 3 minutes ago
q3b0u82ohktr    \_ my-nginx.2   nginx:latest   master02   Shutdown        Shutdown 3 minutes ago
hwuer2ldfzsm   my-nginx.3       nginx:latest   master03   Running         Running 3 minutes ago
yhsame8ip3f5    \_ my-nginx.3   nginx:latest   master02   Shutdown        Shutdown 3 minutes ago
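In a script there is no progress bar to watch, so convergence can instead be checked from the REPLICAS column. The sketch below parses a value such as 3/3 and polls until the running and desired counts match. Both function names are hypothetical, and the one-second polling interval is an arbitrary choice.

```shell
#!/bin/sh
# Return success when a REPLICAS value such as "3/3" shows running == desired.
replicas_converged() {
    counts=${1%% *}          # drop any trailing text after the first space
    running=${counts%%/*}
    desired=${counts##*/}
    [ "$running" = "$desired" ]
}

# Hypothetical helper: poll until the service converges or attempts run out.
# Note: the name filter matches by prefix, so only the first hit is used.
wait_for_service() {
    service="$1"
    attempts="${2:-30}"
    while [ "$attempts" -gt 0 ]; do
        replicas=$(docker service ls --filter "name=$service" --format '{{.Replicas}}' | head -n 1)
        if [ -n "$replicas" ] && replicas_converged "$replicas"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep 1
    done
    return 1
}
```

A typical use would be docker service scale -d my-nginx=3 followed by wait_for_service my-nginx 60.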
- Removing a service
docker service rm serviceName/serviceID
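Draining and activating nodes
- drain: evicts the node; every container that a service scheduled onto it is stopped, and the same number of containers is created on other available nodes
- active: enables the node, putting it back into the pool of schedulable nodes
- pause: stops new tasks from being scheduled onto the node; containers already running on it are unaffected
docker node update --availability active|drain|pause nodeName/nodeID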
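A typical maintenance cycle is drain, do the work, then set the node back to active. The sketch below validates the requested state before calling the CLI and reads back the current value. The helper names are hypothetical; .Spec.Availability is the field shown by docker node inspect.

```shell
#!/bin/sh
# Hypothetical helper: print a node's current availability (active, pause or drain).
node_availability() {
    docker node inspect --format '{{.Spec.Availability}}' "$1"
}

# Hypothetical helper: set availability after validating the requested value.
set_availability() {
    node="$1"
    state="$2"
    case "$state" in
        active|pause|drain)
            docker node update --availability "$state" "$node"
            ;;
        *)
            echo "unknown availability: $state" >&2
            return 1
            ;;
    esac
}
```

For example: set_availability master02 drain, patch the host, then set_availability master02 active.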