Node Planning
IP Address | Hostname | Role |
---|---|---|
192.168.200.40 | master | Swarm cluster master (manager) node |
192.168.200.41 | node | Swarm cluster worker node |
Basic preparation: configure the hostname and network on each node, and install docker-ce.
Deploy the Swarm Cluster
Configure host mappings
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.40 master
192.168.200.41 node
Configure time synchronization
Install the chrony service on all nodes.
# yum install -y chrony
On the Master node, edit /etc/chrony.conf: comment out the default NTP servers, point to an upstream public NTP server, and allow other nodes to synchronize time from this host.
[root@master ~]# sed -i 's/^server/#&/' /etc/chrony.conf
[root@master ~]# cat >> /etc/chrony.conf << EOF
local stratum 10
server ntp1.aliyun.com iburst
allow all
EOF
On the Master node, restart the chronyd service, enable it at boot, and turn on network time synchronization.
[root@master ~]# systemctl enable chronyd && systemctl restart chronyd
[root@master ~]# timedatectl set-ntp true
On the Node, edit /etc/chrony.conf to use the internal Master node as the upstream NTP server, then restart the service and enable it at boot.
[root@node ~]# sed -i 's/^server/#&/' /etc/chrony.conf
[root@node ~]# echo "server 192.168.200.40 iburst" >> /etc/chrony.conf    # 192.168.200.40 is the Master node's address
[root@node ~]# systemctl enable chronyd && systemctl restart chronyd
Run chronyc sources on every node; if the output contains a line beginning with "^*", synchronization has succeeded.
# chronyc sources
210 Number of sources = 1
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^* 120.25.115.20 2 6 31 2 -9427us[+28798s] +/- 50ms
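When bootstrapping many nodes this check can be scripted. Below is a minimal sketch (the helper name is made up) that tests captured `chronyc sources` output for a selected, synchronized source, i.e. a line beginning with "^*":

```shell
# Hypothetical helper: succeeds if the given `chronyc sources` output
# contains a currently selected, synchronized source ("^*" prefix).
is_synced() {
  printf '%s\n' "$1" | grep -q '^\^\*'
}

# Sample output like the one captured above
sample='^* 120.25.115.20    2   6   31    2   -9427us[+28798s] +/- 50ms'

if is_synced "$sample"; then
  echo "time synchronized"
else
  echo "not yet synchronized"
fi
```

On a live node the same test is simply `chronyc sources | grep -q '^\^\*'`.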
Configure the Docker API
Enable the Docker remote API on all nodes. (Note: exposing the API on tcp://0.0.0.0:2375 without TLS is insecure; only do this on a trusted network.)
# vi /lib/systemd/system/docker.service
Change the line
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
to
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
# systemctl daemon-reload
# systemctl restart docker
# ./image.sh
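Editing the packaged unit file works, but a package upgrade can silently overwrite it. An alternative sketch using a systemd drop-in (standard drop-in path assumed) achieves the same result:

```ini
# /etc/systemd/system/docker.service.d/override.conf
[Service]
# An empty ExecStart= clears the packaged value before setting the new one
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
```

Apply it with `systemctl daemon-reload && systemctl restart docker`, exactly as above.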
Initialize the cluster
Create the Swarm cluster on the master node.
[root@master ~]# docker swarm init --advertise-addr 192.168.200.40
Swarm initialized: current node (5f8ng0rphfv5z5zm8ef675ts3) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-10s8ucnirvldo0ezz0vury7kgdb60c0sxagtbvjhsiw7waz9yn-13bdkhm47mueul5l7vv3ujwrf 192.168.200.40:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
The "--advertise-addr" option tells the manager node which IP address to advertise; the other nodes must be able to reach the manager at this address.
The output contains three parts:
① The Swarm was created successfully, and the current node became a manager node.
② The command to run to add a worker node.
③ The command to run to add a manager node.
Join the node to the cluster
Copy the docker swarm join command printed above and run it on the Node to join the Swarm cluster.
[root@node ~]# docker swarm join --token SWMTKN-1-10s8ucnirvldo0ezz0vury7kgdb60c0sxagtbvjhsiw7waz9yn-13bdkhm47mueul5l7vv3ujwrf 192.168.200.40:2377
This node joined a swarm as a worker.
If you did not record the full worker join command printed by docker swarm init, you can retrieve it at any time with docker swarm join-token worker.
[root@master ~]# docker swarm join-token worker
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-10s8ucnirvldo0ezz0vury7kgdb60c0sxagtbvjhsiw7waz9yn-13bdkhm47mueul5l7vv3ujwrf 192.168.200.40:2377
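When node provisioning is scripted, the join command can be pulled straight out of this output. A small sketch (the function name and the shortened token are illustrative only):

```shell
# Hypothetical helper: extract the ready-to-run join command from
# `docker swarm join-token worker` output.
extract_join_cmd() {
  printf '%s\n' "$1" | grep -o 'docker swarm join --token .*'
}

# Sample output with a shortened, made-up token
output='To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-example 192.168.200.40:2377'

extract_join_cmd "$output"
```

On a live manager the same idea reads `docker swarm join-token worker | grep -o 'docker swarm join --token .*'`.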
Verify the cluster
Log in to the Master node and check the status of each node.
[root@master ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
5f8ng0rphfv5z5zm8ef675ts3 * master Ready Active Leader 18.09.6
yafko9vsiqqvbjt0q0chto4ng node Ready Active 18.09.6
Install Portainer
Portainer is a graphical management tool for Docker. It provides a status dashboard, quick deployment from application templates, basic operations on containers, images, networks, and data volumes (including pushing and pulling images and creating containers), event logs, a container console, centralized management of Swarm clusters and services, and user and access management. Its feature set is comprehensive enough to cover most container-management needs of small and medium-sized enterprises.
Log in to the Master node and install Portainer.
[root@master ~]# docker service create --name portainer --publish 9000:9000 --replicas=1 --constraint 'node.role == manager' --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock --mount type=volume,src=portainer_data,dst=/data portainer/portainer -H unix:///var/run/docker.sock
cupdp88lqqywaecazjz71vfcm
overall progress: 1 out of 1 tasks
1/1: running
verify: Waiting 5 seconds to verify that tasks are stable...
verify: Service converged
Log in to Portainer
Open a browser and go to http://master_IP:9000 to reach the Portainer home page, as shown in the figure below:
On first login, set a username and password, then sign in with those credentials to enter the Swarm console, as shown in the figure below:
Run a Service
Now that the Swarm cluster is ready, run the following command to deploy a Service based on the httpd image.
[root@master ~]# docker service create --name web_server httpd
qy82v4g9ux9jchz6f7s68juto
overall progress: 1 out of 1 tasks
1/1: running [==================================================>]
verify: Service converged
The command for deploying a Service is very similar to docker run: --name names the Service, and httpd is the name of the image.
Use docker service ls to list the Services currently running in the Swarm.
[root@master ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
cupdp88lqqyw portainer replicated 1/1 portainer/portainer:latest *:9000->9000/tcp
qy82v4g9ux9j web_server replicated 1/1 httpd:latest
The REPLICAS column shows the replica status: 1/1 means the web_server Service wants 1 container replica and 1 replica is currently running, so the Service is fully deployed.
Use docker service ps to see the state of each replica of a Service.
[root@master ~]# docker service ps web_server
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ho5as4iv7l7a web_server.1 httpd:latest node Running Running 4 minutes ago
The Service's only replica has been scheduled onto node, and its current state is Running.
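For scripted health checks, the REPLICAS column can be parsed from `docker service ls` output. A sketch, assuming the whitespace-separated column layout shown above (the helper name is hypothetical):

```shell
# Hypothetical helper: succeeds when the named service's REPLICAS
# column reads N/N, i.e. running count equals desired count.
converged() {
  printf '%s\n' "$2" | awk -v svc="$1" '
    $2 == svc { found = 1; split($4, r, "/"); ok = (r[1] == r[2]) }
    END { exit !(found && ok) }'
}

# Sample `docker service ls` output
sample='ID            NAME        MODE        REPLICAS  IMAGE
qy82v4g9ux9j  web_server  replicated  1/1       httpd:latest'

converged web_server "$sample" && echo "web_server converged"
```

On a live manager you would feed it `"$(docker service ls)"` instead of the captured sample.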
Scale the Service
The Service deployed above has only one replica, but a web service is usually run with multiple instances to provide load balancing and high availability.
Achieving this in Swarm is simple: just increase the Service's replica count. Run the following command on the Master node.
[root@master ~]# docker service scale web_server=5
web_server scaled to 5
overall progress: 5 out of 5 tasks
1/5: running [==================================================>]
2/5: running [==================================================>]
3/5: running [==================================================>]
4/5: running [==================================================>]
5/5: running [==================================================>]
verify: Service converged
The replica count is now 5; inspect the details with docker service ls and docker service ps.
[root@master ~]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
cupdp88lqqyw portainer replicated 1/1 portainer/portainer:latest *:9000->9000/tcp
qy82v4g9ux9j web_server replicated 5/5 httpd:latest
[root@master ~]# docker service ps web_server
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ho5as4iv7l7a web_server.1 httpd:latest node Running Running 44 hours ago
psrlfzcxpnbz web_server.2 httpd:latest master Running Running 38 seconds ago
p0qw5bcfm3zk web_server.3 httpd:latest node Running Running about a minute ago
d1flqovagz24 web_server.4 httpd:latest master Running Running 38 seconds ago
f2bm75f6gb5l web_server.5 httpd:latest node Running Running about a minute ago
The 5 replicas are spread across all nodes of the Swarm.
Just as scale up increases the replica count, scale down reduces it; run the following commands (here web_server is scaled back down to 2 replicas).
[root@master ~]# docker service scale web_server=2
[root@master ~]# docker service ps web_server
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
ho5as4iv7l7a web_server.1 httpd:latest node Running Running 44 hours ago
psrlfzcxpnbz web_server.2 httpd:latest master Running Running 8 minutes ago
The output shows that the three replicas web_server.3, web_server.4, and web_server.5 have been removed.
Access the Service
To access the httpd service, the network must be reachable and you need to know the service's IP address. Check the container's network configuration.
[root@master ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
872cfb574304 httpd:latest "httpd-foreground" 12 minutes ago Up 12 minutes 80/tcp web_server.2.psrlfzcxpnbzya60n0hqcdkyt
3f7caf458c91 portainer/portainer:latest "/portainer -H unix:…" 44 hours ago Up 44 hours 8000/tcp, 9000/tcp portainer.1.xlaqvh2qz8oweznozs2pcz33l
5e70c25de4a2 registry:latest "/entrypoint.sh /etc…" 5 days ago Up 45 hours 0.0.0.0:5000->5000/tcp registry
The container running on the Master is one of web_server's replicas. It listens on port 80, but that port is not mapped to the Docker host, so the replica can only be reached through the container's IP from within the host; the Service is not exposed to the external network. Exposing it is simple: run the following command.
[root@master ~]# docker service update --publish-add 8080:80 web_server
web_server
overall progress: 2 out of 2 tasks
1/2: running [==================================================>]
2/2: running [==================================================>]
verify: Service converged
--publish-add 8080:80 maps the container's port 80 to port 8080 on the host, so external clients can reach the Service. Because Swarm publishes the port through its routing mesh, the Service can be accessed at http://<any_node_IP>:8080, as shown in the figure below:
Store Service data
Service replicas can be scaled up and down, can fail, and are created and destroyed on different hosts. This raises a question: if a Service has data to manage, where should that data live? Packing the data into the container clearly does not work, because unless the data never changes there is no way to keep multiple replicas in sync. A volume maps a host-level directory into the container to persist data, and it can be used in two ways:
- Default volume mode: each worker host's directory is synchronized into the containers on that host.
- NFS shared-storage volume mode: the manager host's directory is shared to the worker hosts, and the worker hosts' copies are mapped into the containers.
In production, the NFS shared-storage mode is generally recommended.
Log in to the Master node, then install the NFS server, edit the main NFS configuration file, grant access, and start the services.
[root@master ~]# yum install nfs-utils -y
Export the directory to the relevant subnet with read and write permissions.
vi /etc/exports
/root/share 192.168.200.0/24(rw,async,insecure,anonuid=1000,anongid=1000,no_root_squash)
Create the shared directory and set its permissions.
[root@master ~]# mkdir -p /root/share
[root@master ~]# chmod 777 /root/share
/root/share is the shared directory; reload the exports to apply the configuration.
[root@master ~]# exportfs -rv
exporting 192.168.200.0/24:/root/share
Start the RPC service and enable it at boot.
[root@master ~]# systemctl start rpcbind
[root@master ~]# systemctl enable rpcbind
Start the NFS service and enable it at boot.
[root@master ~]# systemctl start nfs
[root@master ~]# systemctl enable nfs
Verify that the NFS export took effect.
[root@master ~]# cat /var/lib/nfs/etab
/root/share 192.168.200.0/24(rw,async,wdelay,hide,nocrossmnt,insecure,no_root_squash,no_all_squash,no_subtree_check,secure_locks,acl,no_pnfs,anonuid=1000,anongid=1000,sec=sys,rw,insecure,no_root_squash,no_all_squash)
Log in to the Node, install the NFS client, and start the services.
[root@node ~]# yum install nfs-utils -y
[root@node ~]# systemctl start rpcbind
[root@node ~]# systemctl enable rpcbind
[root@node ~]# systemctl start nfs
[root@node ~]# systemctl enable nfs
Because the service's tasks may be scheduled onto any node, create the docker volume on all nodes.
# docker volume create --driver local --opt type=nfs --opt o=addr=192.168.200.40,rw --opt device=:/root/share foo33
--opt device=:/root/share points at the shared directory; it may also be a subdirectory of the share.
List the volumes:
# docker volume ls
DRIVER VOLUME NAME
local foo33
local portainer_data
foo33 now appears in the volume list; inspect its details.
[root@master ~]# docker volume inspect foo33
[
{
"CreatedAt": "2022-06-30T18:59:33+08:00",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/foo33/_data",
"Name": "foo33",
"Options": {
"device": ":/root/share",
"o": "addr=192.168.200.40,rw",
"type": "nfs"
},
"Scope": "local"
}
]
As shown, the NFS export /root/share is mounted at /var/lib/docker/volumes/foo33/_data.
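Instead of pre-creating the volume on every node by hand, the same NFS-backed volume can be declared in a stack file, so that `docker stack deploy` creates it wherever a task lands. A sketch using the addresses and paths from this tutorial (the service and file names are made up):

```yaml
# nfs-stack.yml (hypothetical file name)
version: "3.7"
services:
  web:
    image: nginx
    ports:
      - "80:80"
    volumes:
      - foo33:/app/share
volumes:
  foo33:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.200.40,rw"
      device: ":/root/share"
```

It would be deployed with `docker stack deploy -c nfs-stack.yml <stack_name>`.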
Create and publish the service:
[root@master ~]# docker service create --name test-nginx-nfs --publish 80:80 --mount type=volume,source=foo33,destination=/app/share --replicas 3 nginx
8oum0hbdue82g2dllioy36w0a
overall progress: 3 out of 3 tasks
1/3: running [==================================================>]
2/3: running [==================================================>]
3/3: running [==================================================>]
verify: Service converged
Check which nodes the service's tasks were scheduled on:
[root@master ~]# docker service ps test-nginx-nfs
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
u7moidu94mlb test-nginx-nfs.1 nginx:latest node Running Running 2 minutes ago
ob5bxjva6y5l test-nginx-nfs.2 nginx:latest master Running Running 2 minutes ago
nfm3gsyt7vqc test-nginx-nfs.3 nginx:latest node Running Running 2 minutes ago
Create an index.html file in /root/share on the Master node.
[root@master ~]# cd /root/share/
[root@master share]# touch index.html
[root@master share]# ll
total 0
-rw-r--r-- 1 root root 0 Jun 30 19:11 index.html
Check the mount on the host side.
[root@master share]# docker volume inspect foo33
[
{
"CreatedAt": "2022-06-30T19:11:54+08:00",
"Driver": "local",
"Labels": {},
"Mountpoint": "/var/lib/docker/volumes/foo33/_data",
"Name": "foo33",
"Options": {
"device": ":/root/share",
"o": "addr=192.168.200.40,rw",
"type": "nfs"
},
"Scope": "local"
}
]
[root@master share]# ls /var/lib/docker/volumes/foo33/_data/
index.html
Check the directory inside the container.
[root@master share]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e9f553ff6955 nginx:latest "/docker-entrypoint.…" 7 minutes ago Up 7 minutes 80/tcp test-nginx-nfs.2.ob5bxjva6y5lh9nfvsonn9mwv
d3882f1d65d7 httpd:latest "httpd-foreground" About an hour ago Up About an hour 80/tcp web_server.2.xoscum3l3up24ijifk7t05zur
3f7caf458c91 portainer/portainer:latest "/portainer -H unix:…" 45 hours ago Up 45 hours 8000/tcp, 9000/tcp portainer.1.xlaqvh2qz8oweznozs2pcz33l
5e70c25de4a2 registry:latest "/entrypoint.sh /etc…" 5 days ago Up 46 hours 0.0.0.0:5000->5000/tcp registry
[root@master share]# docker exec -it e9f553ff6955 bash
root@e9f553ff6955:/# ls app/share/
index.html
The NFS volume works: the file created on the host is visible inside the container.
Adjust scheduling
By default the Master is also a worker node, so replicas run on it as well. If you do not want the Master to run Service tasks, run the following command.
[root@master ~]# docker node update --availability drain master
master
Check the current state of each node with docker node ls.
[root@master ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
5f8ng0rphfv5z5zm8ef675ts3 * master Ready Drain Leader 18.09.6
yafko9vsiqqvbjt0q0chto4ng node Ready Active 18.09.6
Drain means the Master no longer runs Service tasks. What happened to the replica that was running on the Master? Check with docker service ps.
[root@master ~]# docker service ps test-nginx-nfs
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
zldb50hko5u1 test-nginx-nfs.1 nginx:latest node Running Running about a minute ago
evfeswhul6ig \_ test-nginx-nfs.1 nginx:latest master Shutdown Shutdown about a minute ago
vhm8tkgbauui test-nginx-nfs.2 nginx:latest node Running Running 9 minutes ago
yyaouxk58z5g test-nginx-nfs.3 nginx:latest node Running Running 9 minutes ago
The replica test-nginx-nfs.1 that was running on the Master has been shut down, and to maintain the desired count of 3 replicas a new test-nginx-nfs.1 was started on node. To let the Master accept tasks again, set its availability back with docker node update --availability active master; note that existing tasks are not rebalanced automatically, only newly scheduled tasks will land on it.