Contents:
Section 1: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 1 – OpenStack Charms deployment guide
Section 2: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 2 – Installing MAAS
Section 3: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 3 – Installing Juju
Section 4: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 4 – Installing OpenStack
Section 5: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 5 – Installing OpenStack from a bundle
Section 6: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 6 – Configuring Vault and certificate lifetimes
Section 7: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 7 – Deploying a bundle offline with Juju
Section 8: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 8 – Configuring OpenStack
Appendix T: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – Appendix T – OpenStack high availability
Section 9: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 9 – Network topology
Section 10: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 10 – OpenStack high-availability infrastructure in practice
Section 11: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 11 – Accessing the Juju dashboard
Section 12: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 12 – Recovering after a failed OpenStack configuration
Section 13: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 13 – Unable to log in to the OpenStack dashboard after configuring high availability
Section 14: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 14 – SSH port forwarding to work around poor international IDC links
Section 15: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev299 – 15 – OpenStack instance high availability
Section 16: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev299 – 16 – OpenStack infrastructure HA: "The easyrsa resource is missing."
Section 17: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev303 – 17 – Raising quota limits such as the instance count
Section 18: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev303 – 18 – Backups
Section 19: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev303 – 19 – juju log
Section 20: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev303 – 20 – Controller high availability
Section 21: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev303 – 21 – Controller backup and restore
Section 22: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 22 – Resource: res_masakari_haproxy not running
Section 23: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 23 – Logging in to openstack-dashboard fails with SSLError(SSLCertVerificationError)
Section 24: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 24 – Resource: res_masakari_f8b6bde_vip not running
Section 25: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 25 – Building an rsyslog log server in practice
Section 26: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 26 – Building an rsyslog log server with cross-model relations
Section 27: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 27 – Charm hooks
Section 28: Multi-node OpenStack Charms Deployment Guide 0.0.1.dev223 – 28 – Command run
Section 30: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 30 – Reference architecture: Canonical Charmed OpenStack (Ussuri) on Dell EMC hardware
Section 31: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 31 – VM hosting, part 1
Section 32: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 32 – VM hosting, part 2: VM host networking (snap/2.9/UI)
Section 33: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 33 – VM hosting, part 3: Adding a VM host (snap/2.9/UI)
Section 34: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 34 – VM hosting, part 4: VM host storage pools; creating and deleting VMs (snap/2.9/UI)
Section 35: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 35 – Command export-bundle: backing up OpenStack and redeploying it
Section 36: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 36 – graylog in practice, part 1
Section 37: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 37 – graylog in practice, part 2
Section 38: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 38 – graylog in practice, part 3
Section 39: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 39 – graylog, part 4: filebeat
Section 40: Multi-node OpenStack Charms Deployment Guide 0.0.1 – 40 – prometheus2
References:
Pacemaker
Removing unit from hacluster doesn't properly remove node from corosync
With a highly available OpenStack already deployed, I wanted to set up rsyslog next to collect logs, which meant freeing up a server for it. So I removed the masakari unit on machine 3 and recreated it on machine 0:
juju remove-unit masakari/3 --force --no-wait
juju add-unit masakari --to lxd:0
Unexpectedly, after the rebuild finished, juju status masakari showed:
Model Controller Cloud/Region Version SLA Timestamp
openstack maas-controller mymaas/default 2.8.10 unsupported 18:52:52+08:00
App Version Status Scale Charm Store Rev OS Notes
hacluster blocked 3 hacluster jujucharms 74 ubuntu
masakari 10.0.0 active 3 masakari local 0 ubuntu
masakari-mysql-router 8.0.23 active 3 mysql-router local 0 ubuntu
Unit Workload Agent Machine Public address Ports Message
masakari/1* active idle 2/lxd/2 10.0.2.234 15868/tcp Unit is ready
hacluster/0* blocked idle 10.0.2.234 Resource: res_masakari_f8b6bde_vip not running
masakari-mysql-router/0* active idle 10.0.2.234 Unit is ready
masakari/2 active idle 0/lxd/3 10.0.2.250 15868/tcp Unit is ready
hacluster/1 blocked idle 10.0.2.250 Resource: res_masakari_f8b6bde_vip not running
masakari-mysql-router/1 active idle 10.0.2.250 Unit is ready
masakari/3 active idle 1/lxd/18 10.0.3.28 15868/tcp Unit is ready
hacluster/3 blocked idle 10.0.3.28 Resource: res_masakari_f8b6bde_vip not running
masakari-mysql-router/3 active idle 10.0.3.28 Unit is ready
Machine State DNS Inst id Series AZ Message
0 started 10.0.0.156 node2 focal default Deployed
0/lxd/3 started 10.0.2.250 juju-db6013-0-lxd-3 focal default Container started
1 started 10.0.0.159 node4 focal default Deployed
1/lxd/18 started 10.0.3.28 juju-db6013-1-lxd-18 focal default Container started
2 started 10.0.0.158 node3 focal default Deployed
2/lxd/2 started 10.0.2.234 juju-db6013-2-lxd-2 focal default Container started
At first I assumed the failure counters just needed resetting again, so, as described in an earlier post in this series, I ran the following commands:
juju run --unit masakari/0 sudo crm resource refresh
juju run --unit masakari/1 sudo crm resource refresh
juju run --unit masakari/3 sudo crm resource refresh
juju run --unit hacluster/0 sudo crm resource refresh
juju run --unit hacluster/2 sudo crm resource refresh
juju run --unit hacluster/3 sudo crm resource refresh
juju run-action masakari/0 pause --wait
juju run-action masakari/1 pause --wait
juju run-action masakari/3 pause --wait
juju run-action masakari/0 resume --wait
juju run-action masakari/1 resume --wait
juju run-action masakari/3 resume --wait
juju run-action --wait vault/0 reissue-certificates
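Rather than typing one refresh command per unit, the per-unit invocations above can be generated in a loop. A minimal sketch: the juju call is only echoed here so it is safe to run without a controller (drop the echo to execute for real; the unit names are taken from this deployment and will differ elsewhere).

```shell
# Units carrying the pacemaker resource in this deployment; adjust
# the list to whatever your own `juju status` reports.
units="hacluster/0 hacluster/1 hacluster/3"

# Only echo the juju invocation so the sketch runs anywhere; remove
# the echo on a real controller to execute the refresh.
for u in $units; do
    echo juju run --unit "$u" sudo crm resource refresh
done
```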
But none of that helped.
I then read through the crm command documentation and ran crm status, which produced output like the following:
juju run --unit masakari/1 sudo crm status
Cluster Summary:
* Stack: corosync
* Current DC: juju-db6013-1-lxd-18 (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Sat Apr 10 04:09:54 2021
* Last change: Sat Apr 10 03:08:14 2021 by root via crm_resource on juju-db6013-2-lxd-2
* 8 nodes configured
* 11 resource instances configured
Node List:
* Node node1: UNCLEAN (offline)
* Online: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 ]
* OFFLINE: [ juju-db6013-1-lxd-2 ]
* RemoteOFFLINE: [ node2.maas node3.maas node4.maas ]
Full List of Resources:
* Resource Group: grp_masakari_vips:
* res_masakari_f8b6bde_vip (ocf::heartbeat:IPaddr2): Stopped
* Clone Set: cl_res_masakari_haproxy [res_masakari_haproxy]:
* Stopped: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 node2.maas node3.maas node4.maas ]
* node3.maas (ocf::pacemaker:remote): Stopped
* node2.maas (ocf::pacemaker:remote): Stopped
* node4.maas (ocf::pacemaker:remote): Stopped
* st-maas (stonith:external/maas): Stopped
* st-null (stonith:null): Stopped
Failed Fencing Actions:
* reboot of node1 failed: delegate=, client=pacemaker-controld.39721, origin=juju-db6013-1-lxd-18, last-failed='2021-04-10 04:09:22Z'
juju run --unit masakari/1 sudo crm configure show
node 1: node1
node 1000: juju-db6013-2-lxd-2
node 1001: juju-db6013-0-lxd-3
node 1002: juju-db6013-1-lxd-2
node 1003: juju-db6013-1-lxd-18
primitive node2.maas ocf:pacemaker:remote \
params server=10.0.0.156 reconnect_interval=60 \
op monitor interval=30s
primitive node3.maas ocf:pacemaker:remote \
params server=10.0.0.158 reconnect_interval=60 \
op monitor interval=30s
primitive node4.maas ocf:pacemaker:remote \
params server=10.0.0.159 reconnect_interval=60 \
op monitor interval=30s
primitive res_masakari_f8b6bde_vip IPaddr2 \
params ip=10.0.7.72 \
meta migration-threshold=INFINITY failure-timeout=5s \
op monitor timeout=20s interval=10s \
op_params depth=0
primitive res_masakari_haproxy lsb:haproxy \
meta migration-threshold=INFINITY failure-timeout=5s \
op monitor interval=5s
primitive st-maas stonith:external/maas \
params url="http://10.0.0.3:5240/MAAS" apikey="HrNTLvEaW2Z4hUaGCr:rXuELuKrB2q3wAne2r:xmTKFCDheeNXunddCdBkuHZbGVgFv9sU" hostnames="node2 node2.maas node3 node3.maas node4 node4.maas" \
op monitor interval=25 start-delay=25 timeout=25
primitive st-null stonith:null \
params hostlist="juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2" \
op monitor interval=25 start-delay=25 timeout=25
group grp_masakari_vips res_masakari_f8b6bde_vip
clone cl_res_masakari_haproxy res_masakari_haproxy \
meta clone-max=5
location loc-cl_res_masakari_haproxy-juju-db6013-0-lxd-3 cl_res_masakari_haproxy 0: juju-db6013-0-lxd-3
location loc-cl_res_masakari_haproxy-juju-db6013-1-lxd-18 cl_res_masakari_haproxy 0: juju-db6013-1-lxd-18
location loc-cl_res_masakari_haproxy-juju-db6013-1-lxd-2 cl_res_masakari_haproxy 0: juju-db6013-1-lxd-2
location loc-cl_res_masakari_haproxy-juju-db6013-2-lxd-2 cl_res_masakari_haproxy 0: juju-db6013-2-lxd-2
location loc-cl_res_masakari_haproxy-node1 cl_res_masakari_haproxy 0: node1
location loc-grp_masakari_vips-juju-db6013-0-lxd-3 grp_masakari_vips 0: juju-db6013-0-lxd-3
location loc-grp_masakari_vips-juju-db6013-1-lxd-18 grp_masakari_vips 0: juju-db6013-1-lxd-18
location loc-grp_masakari_vips-juju-db6013-1-lxd-2 grp_masakari_vips 0: juju-db6013-1-lxd-2
location loc-grp_masakari_vips-juju-db6013-2-lxd-2 grp_masakari_vips 0: juju-db6013-2-lxd-2
location loc-grp_masakari_vips-node1 grp_masakari_vips 0: node1
location loc-node2.maas-juju-db6013-0-lxd-3 node2.maas 200: juju-db6013-0-lxd-3
location loc-node2.maas-juju-db6013-1-lxd-2 node2.maas 0: juju-db6013-1-lxd-2
location loc-node2.maas-juju-db6013-2-lxd-2 node2.maas 0: juju-db6013-2-lxd-2
location loc-node2.maas-node1 node2.maas 0: node1
location loc-node3.maas-juju-db6013-0-lxd-3 node3.maas 0: juju-db6013-0-lxd-3
location loc-node3.maas-juju-db6013-1-lxd-2 node3.maas 200: juju-db6013-1-lxd-2
location loc-node3.maas-juju-db6013-2-lxd-2 node3.maas 0: juju-db6013-2-lxd-2
location loc-node3.maas-node1 node3.maas 0: node1
location loc-node4.maas-juju-db6013-0-lxd-3 node4.maas 0: juju-db6013-0-lxd-3
location loc-node4.maas-juju-db6013-1-lxd-2 node4.maas 0: juju-db6013-1-lxd-2
location loc-node4.maas-juju-db6013-2-lxd-2 node4.maas 200: juju-db6013-2-lxd-2
location loc-node4.maas-node1 node4.maas 0: node1
location loc-st-maas-juju-db6013-0-lxd-3 st-maas 0: juju-db6013-0-lxd-3
location loc-st-maas-juju-db6013-1-lxd-18 st-maas 0: juju-db6013-1-lxd-18
location loc-st-maas-juju-db6013-1-lxd-2 st-maas 0: juju-db6013-1-lxd-2
location loc-st-maas-juju-db6013-2-lxd-2 st-maas 0: juju-db6013-2-lxd-2
location loc-st-maas-node1 st-maas 0: node1
location loc-st-null-juju-db6013-0-lxd-3 st-null 0: juju-db6013-0-lxd-3
location loc-st-null-juju-db6013-1-lxd-18 st-null 0: juju-db6013-1-lxd-18
location loc-st-null-juju-db6013-1-lxd-2 st-null 0: juju-db6013-1-lxd-2
location loc-st-null-juju-db6013-2-lxd-2 st-null 0: juju-db6013-2-lxd-2
location loc-st-null-node1 st-null 0: node1
property cib-bootstrap-options: \
have-watchdog=false \
dc-version=2.0.3-4b1f869f0f \
cluster-infrastructure=corosync \
cluster-name=debian \
no-quorum-policy=stop \
cluster-recheck-interval=60 \
stonith-enabled=true \
symmetric-cluster=false
rsc_defaults rsc-options: \
resource-stickiness=100 \
failure-timeout=180
It looked like fallout from the deleted node never being cleaned out of the cluster.
A reply to my forum question also came back; the gist was:
There is a bug tracking this scale-down issue, with a fix scheduled for release this month, but for now you need to delete the dead machine manually with crm:
crm node delete <hostname of dead node in 'crm status'>
and then run the hook across the cluster so the configuration change takes effect:
juju run --application hacluster 'hooks/update-status'
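The awkward part is working out which node names to feed to crm node delete. A sketch of how the dead nodes can be picked out of the crm status text; it parses a captured sample of the output shown above, and on a live unit you would feed it the output of sudo crm status instead.

```shell
# Sample of the `crm status` node list from earlier in this post.
status='* Node node1: UNCLEAN (offline)
* Online: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 ]
* OFFLINE: [ juju-db6013-1-lxd-2 ]'

# Nodes flagged UNCLEAN ("Node <name>: UNCLEAN ...").
unclean=$(printf '%s\n' "$status" | sed -n 's/.*Node \([^:]*\): UNCLEAN.*/\1/p')

# Members of the OFFLINE list (the names between the brackets).
offline=$(printf '%s\n' "$status" | sed -n 's/.*\* OFFLINE: \[ \(.*\) \].*/\1/p')

# Each of these is a candidate for `crm node delete <name>`.
echo "delete candidates: $unclean $offline"
```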
Going by the crm status output above, I first ran:
juju run --unit masakari/1 sudo crm node delete juju-db6013-1-lxd-2
Then I displayed the crm status again:
juju run --unit masakari/1 sudo crm status
Cluster Summary:
* Stack: corosync
* Current DC: juju-db6013-1-lxd-18 (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Sat Apr 10 09:13:49 2021
* Last change: Sat Apr 10 09:13:41 2021 by root via crm_node on juju-db6013-2-lxd-2
* 7 nodes configured
* 11 resource instances configured
Node List:
* Node node1: UNCLEAN (offline)
* Online: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 ]
* RemoteOFFLINE: [ node2.maas node3.maas node4.maas ]
Full List of Resources:
* Resource Group: grp_masakari_vips:
* res_masakari_f8b6bde_vip (ocf::heartbeat:IPaddr2): Stopped
* Clone Set: cl_res_masakari_haproxy [res_masakari_haproxy]:
* Stopped: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 node2.maas node3.maas node4.maas ]
* node3.maas (ocf::pacemaker:remote): Stopped
* node2.maas (ocf::pacemaker:remote): Stopped
* node4.maas (ocf::pacemaker:remote): Stopped
* st-maas (stonith:external/maas): Stopped
* st-null (stonith:null): Stopped
Failed Fencing Actions:
* reboot of node1 failed: delegate=, client=pacemaker-controld.39721, origin=juju-db6013-1-lxd-18, last-failed='2021-04-10 09:13:42Z'
Next I ran:
juju run --application hacluster 'hooks/update-status'
- Stdout: ""
UnitId: hacluster/0
- Stdout: ""
UnitId: hacluster/3
- Stdout: ""
UnitId: hacluster/1
and checked crm status once more:
juju run --unit masakari/2 sudo crm status
Cluster Summary:
* Stack: corosync
* Current DC: juju-db6013-1-lxd-18 (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Sat Apr 10 09:16:40 2021
* Last change: Sat Apr 10 09:13:41 2021 by root via crm_node on juju-db6013-2-lxd-2
* 7 nodes configured
* 11 resource instances configured
Node List:
* Node node1: UNCLEAN (offline)
* Online: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 ]
* RemoteOFFLINE: [ node2.maas node3.maas node4.maas ]
Full List of Resources:
* Resource Group: grp_masakari_vips:
* res_masakari_f8b6bde_vip (ocf::heartbeat:IPaddr2): Stopped
* Clone Set: cl_res_masakari_haproxy [res_masakari_haproxy]:
* Stopped: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 node2.maas node3.maas node4.maas ]
* node3.maas (ocf::pacemaker:remote): Stopped
* node2.maas (ocf::pacemaker:remote): Stopped
* node4.maas (ocf::pacemaker:remote): Stopped
* st-maas (stonith:external/maas): Stopped
* st-null (stonith:null): Stopped
But masakari still was not running; my guess was that another dead node remained. This time I decided to delete node1:
juju run --unit masakari/1 sudo crm node delete node1
juju run --application hacluster 'hooks/update-status'
juju run --unit masakari/1 sudo crm status
Cluster Summary:
* Stack: corosync
* Current DC: juju-db6013-1-lxd-18 (version 2.0.3-4b1f869f0f) - partition with quorum
* Last updated: Sat Apr 10 10:52:01 2021
* Last change: Sat Apr 10 10:49:59 2021 by root via crm_node on juju-db6013-2-lxd-2
* 6 nodes configured
* 11 resource instances configured
Node List:
* Online: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 ]
* RemoteOnline: [ node2.maas node3.maas node4.maas ]
Full List of Resources:
* Resource Group: grp_masakari_vips:
* res_masakari_f8b6bde_vip (ocf::heartbeat:IPaddr2): Started juju-db6013-1-lxd-18
* Clone Set: cl_res_masakari_haproxy [res_masakari_haproxy]:
* Started: [ juju-db6013-0-lxd-3 juju-db6013-1-lxd-18 juju-db6013-2-lxd-2 ]
* Stopped: [ node2.maas node3.maas node4.maas ]
* node3.maas (ocf::pacemaker:remote): Started juju-db6013-0-lxd-3
* node2.maas (ocf::pacemaker:remote): Started juju-db6013-0-lxd-3
* node4.maas (ocf::pacemaker:remote): Started juju-db6013-2-lxd-2
* st-maas (stonith:external/maas): Started juju-db6013-1-lxd-18
* st-null (stonith:null): Started juju-db6013-2-lxd-2
Failed Fencing Actions:
* reboot of node1 failed: delegate=, client=pacemaker-controld.39721, origin=juju-db6013-1-lxd-18, last-failed='2021-04-10 10:49:13Z'
juju run --application hacluster 'hooks/update-status'
- Stdout: ""
UnitId: hacluster/0
- Stdout: ""
UnitId: hacluster/1
- Stdout: ""
UnitId: hacluster/3
Sure enough, all the services came back up.
juju status masakari
Model Controller Cloud/Region Version SLA Timestamp
openstack maas-controller mymaas/default 2.8.10 unsupported 18:54:05+08:00
App Version Status Scale Charm Store Rev OS Notes
hacluster active 3 hacluster jujucharms 74 ubuntu
masakari 10.0.0 active 3 masakari local 0 ubuntu
masakari-mysql-router 8.0.23 active 3 mysql-router local 0 ubuntu
Unit Workload Agent Machine Public address Ports Message
masakari/1* active idle 2/lxd/2 10.0.2.234 15868/tcp Unit is ready
hacluster/0* active idle 10.0.2.234 Unit is ready and clustered
masakari-mysql-router/0* active idle 10.0.2.234 Unit is ready
masakari/2 active idle 0/lxd/3 10.0.2.250 15868/tcp Unit is ready
hacluster/1 active idle 10.0.2.250 Unit is ready and clustered
masakari-mysql-router/1 active idle 10.0.2.250 Unit is ready
masakari/3 active idle 1/lxd/18 10.0.3.28 15868/tcp Unit is ready
hacluster/3 active idle 10.0.3.28 Unit is ready and clustered
masakari-mysql-router/3 active idle 10.0.3.28 Unit is ready
Machine State DNS Inst id Series AZ Message
0 started 10.0.0.156 node2 focal default Deployed
0/lxd/3 started 10.0.2.250 juju-db6013-0-lxd-3 focal default Container started
1 started 10.0.0.159 node4 focal default Deployed
1/lxd/18 started 10.0.3.28 juju-db6013-1-lxd-18 focal default Container started
2 started 10.0.0.158 node3 focal default Deployed
2/lxd/2 started 10.0.2.234 juju-db6013-2-lxd-2 focal default Container started
A good result.
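One last mechanical check: a blocked workload is easy to detect by scanning the status text. A sketch, run here against a captured line from the earlier broken state; on a live model you would pipe in juju status instead.

```shell
# A unit line captured from the earlier, broken `juju status` run.
line='hacluster/0*  blocked  idle  10.0.2.234  Resource: res_masakari_f8b6bde_vip not running'

# Flag any unit whose workload status column reads "blocked".
if printf '%s\n' "$line" | grep -q 'blocked'; then
    echo "still blocked"
else
    echo "all clear"
fi
```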