Author: Zhang Hua  Published: 2016-07-21
Copyright: this article may be reproduced freely, provided the reproduction includes a hyperlink to the original source, the author information, and this copyright notice
( http://blog.csdn.net/quqi99 )
Problem
This article describes how to add HA to the ceilometer service with Pacemaker.
Quickly deploying a test environment with Juju/MAAS
ceilometer-hacluster:
charm: cs:~openstack-charmers-next/hacluster
options:
debug: True
ceilometer:
# leave the branch line below commented out to use a local charm
#branch: https://github.com/openstack/charm-ceilometer
constraints: mem=1G
num_units: 3
options:
vip: 10.5.100.20
This is equivalent to the following commands:
git clone https://github.com/openstack/charm-hacluster ceilometer-cluster
juju deploy --repository=/home/ubuntu/openstack-charm-testing local:trusty/ceilometer-cluster
juju deploy -n3 --repository=/home/ubuntu/openstack-charm-testing local:trusty/ceilometer
juju deploy ceilometer-agent
juju set ceilometer vip=10.5.100.20
juju add-relation ceilometer ceilometer-cluster
juju add-relation ceilometer keystone:identity-service
juju add-relation ceilometer keystone:identity-notifications
juju add-relation ceilometer rabbitmq-server
juju add-relation ceilometer mongodb
juju add-relation ceilometer-agent nova-compute
juju add-relation ceilometer-agent ceilometer
3, Edit dev.yaml to add the relation:
- [ ceilometer, ceilometer-hacluster ]
4, Deploy MAAS (omitted here), then run juju-deployer to bring up the whole OpenStack environment in one step:
juju-deployer -c ./next.yaml -d trusty-liberty
What happens behind the scenes
1, /etc/haproxy/haproxy.cfg
global
log 127.0.0.1 local0
log 127.0.0.1 local1 notice
maxconn 20000
user haproxy
group haproxy
spread-checks 0
defaults
log global
mode tcp
option tcplog
option dontlognull
retries 3
timeout queue 5000
timeout connect 5000
timeout client 30000
timeout server 30000
listen stats
bind 127.0.0.1:8888
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /
stats auth admin:sqn99Cdznn2hYbSJz9nfnJ43fhWwVjpk
frontend tcp-in_ceilometer_api
bind *:8777
acl net_10.5.4.61 dst 10.5.4.61/255.255.0.0
use_backend ceilometer_api_10.5.4.61 if net_10.5.4.61
default_backend ceilometer_api_10.5.4.61
backend ceilometer_api_10.5.4.61
balance leastconn
server ceilometer-2 10.5.4.64:8767 check
server ceilometer-0 10.5.4.63:8767 check
server ceilometer-1 10.5.4.61:8767 check
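The "balance leastconn" line above means haproxy forwards each new connection to the backend server currently holding the fewest active connections. A minimal illustrative sketch of that choice (the connection counts are hypothetical):

```python
def pick_leastconn(servers):
    """Return the server with the fewest active connections,
    mimicking haproxy's 'balance leastconn' choice."""
    return min(servers, key=servers.get)

# Hypothetical active-connection counts for the three backends above.
active = {"ceilometer-0": 7, "ceilometer-1": 3, "ceilometer-2": 5}
print(pick_leastconn(active))  # -> ceilometer-1
```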
2, /etc/corosync/corosync.conf
totem {
version: 2
# How long before declaring a token lost (ms)
token: 3000
# How many token retransmits before forming a new configuration
token_retransmits_before_loss_const: 10
# How long to wait for join messages in the membership protocol (ms)
join: 60
# How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
consensus: 3600
# Turn off the virtual synchrony filter
vsftype: none
# Number of messages that may be sent by one processor on receipt of the token
max_messages: 20
# Limit generated nodeids to 31-bits (positive signed integers)
clear_node_high_bit: yes
# Disable encryption
secauth: off
# How many threads to use for encryption/decryption
threads: 0
ip_version: ipv4
# This specifies the mode of redundant ring, which may be none, active, or passive.
rrp_mode: none
interface {
# The following values need to be set based on your environment
ringnumber: 0
bindnetaddr: 10.5.0.0
mcastaddr: 226.94.1.1
mcastport: 5403
}
transport: udp
}
quorum {
# Enable and configure quorum subsystem (default: off)
# see also corosync.conf.5 and votequorum.5
provider: corosync_votequorum
expected_votes: 3
}
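With corosync_votequorum and expected_votes: 3, the cluster keeps quorum as long as a strict majority of the expected votes is present. A quick sketch of that arithmetic:

```python
def votequorum_majority(expected_votes):
    """Votes required for quorum under corosync votequorum:
    strictly more than half of expected_votes."""
    return expected_votes // 2 + 1

print(votequorum_majority(3))  # -> 2, so a 3-node cluster survives one node failure
```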
logging {
fileline: off
to_stderr: yes
to_logfile: no
to_syslog: yes
syslog_facility: daemon
debug: on
logger_subsys {
subsys: QUORUM
debug: on
}
}
3, sudo crm configure show
$ sudo crm configure show
node $id="168100925" juju-zhhuabj-machine-2 \
attributes standby="on"
node $id="168100927" juju-zhhuabj-machine-1 \
attributes standby="off"
node $id="168100928" juju-zhhuabj-machine-3 \
attributes standby="off"
primitive res_ceilometer_agent_central ocf:openstack:ceilometer-agent-central \
op monitor interval="30s" \
meta target-role="Started"
primitive res_ceilometer_eth0_vip ocf:heartbeat:IPaddr2 \
params ip="10.5.100.20" cidr_netmask="255.255.0.0" nic="eth0"
primitive res_ceilometer_haproxy lsb:haproxy \
op monitor interval="5s"
group grp_ceilometer_vips res_ceilometer_eth0_vip
clone cl_ceilometer_haproxy res_ceilometer_haproxy
property $id="cib-bootstrap-options" \
dc-version="1.1.10-42f2063" \
cluster-infrastructure="corosync" \
no-quorum-policy="ignore" \
stonith-enabled="false" \
last-lrm-refresh="1468984707"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
Adding HA to an existing environment
juju add-unit nova-cloud-controller -n 2
juju deploy hacluster ncc-hacluster --series trusty
juju add-relation nova-cloud-controller ncc-hacluster
juju config nova-cloud-controller vip=10.5.104.1
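Before setting a VIP it is worth checking that it falls inside the cluster's subnet (here assumed to be 10.5.0.0/16, matching the bindnetaddr shown earlier) and is neither the network nor the broadcast address. A small sketch with the stdlib ipaddress module:

```python
import ipaddress

def vip_ok(vip, subnet="10.5.0.0/16"):
    """Check that a candidate VIP is a usable host address inside the subnet."""
    net = ipaddress.ip_network(subnet)
    addr = ipaddress.ip_address(vip)
    return addr in net and addr not in (net.network_address, net.broadcast_address)

print(vip_ok("10.5.104.1"))  # -> True
print(vip_ok("10.6.104.1"))  # -> False: outside 10.5.0.0/16
```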
Update 2020-09-16
Ceilometer's functionality has been split into three parts: collection (ceilometer), storage (Gnocchi), and alarming (Aodh); in addition, Panko provides event storage.
The architecture is described here: https://www.cnblogs.com/gaozhengwei/p/7097605.html
- ceilometer-agent-compute collects status data about the VMs themselves (nova-compute), while ceilometer-agent-central collects data from the other components (volumes, networks, images, etc.) through their REST APIs. The collected data is then sent to the MQ. These categories can also be found in setup.cfg in the source tree. For an analysis of how ceilometer-polling (entry point ceilometer.cmd.polling.main) starts, see http://mengalong.github.io/2018/07/27/ceilometer-agent-compute-ana/
- ceilometer-agent-notification (entry point ceilometer/notification.py) receives from the MQ not only the data collected by ceilometer-agent-compute and ceilometer-agent-central, but also the notifications published by the other OpenStack components; see https://www.cnblogs.com/sammyliu/archive/2004/01/13/4384470.html
- After fetching data from the MQ, ceilometer-agent-notification stores it in the Gnocchi service
- ceilometer-alarm-evaluator compares the alarm rules in the database with the metering statistics, and sends any triggered alarms through the message queue to ceilometer-alarm-notifier, which then fires the alarm
- ceilometer-agent-collector (deprecated in Ocata)
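The flow above (agents publish samples to the MQ; the notification agent consumes them and writes to Gnocchi) can be sketched, purely illustratively, with an in-process queue standing in for RabbitMQ and a dict standing in for Gnocchi:

```python
import queue

mq = queue.Queue()  # stands in for RabbitMQ
gnocchi = {}        # stands in for the Gnocchi store

# Pollster side: ceilometer-agent-compute / ceilometer-agent-central publish samples.
mq.put({"resource_id": "vm-1", "meter": "memory.usage", "value": 14.89})

# Notification-agent side: consume from the MQ and store per resource.
while not mq.empty():
    sample = mq.get()
    gnocchi.setdefault(sample["resource_id"], []).append(sample)

print(gnocchi["vm-1"][0]["meter"])  # -> memory.usage
```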
Quick environment setup:
./generate-bundle.sh -s bionic -r stein --create-model --name ceilometer:stsstack --num-compute 1 --telemetry
./generate-bundle.sh --name ceilometer:stsstack --replay --run
juju run-action ceilometer/0 ceilometer-upgrade
./configure
./tools/instance_launch.sh 1 cirros2
sudo apt install python3-gnocchiclient
gnocchi metric list |grep <vm-id>
gnocchi resource show <resource-id>
ceilometer/cmd/polling.py -> create_polling_service -> ceilometer/polling/manager.py#AgentManager
./ceilometer/compute/pollsters/__init__.py ->
ceilometer.compute.virt =
libvirt = ceilometer.compute.virt.libvirt.inspector:LibvirtInspector
memory.usage = ceilometer.compute.pollsters.instance_stats:MemoryUsagePollster
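The setup.cfg entries above map pollster/inspector names to "module:Class" paths, which ceilometer resolves at runtime (via stevedore). A simplified, hypothetical sketch of that resolution, using a stdlib class as a stand-in target:

```python
import importlib

# Hypothetical entry-point table; the real memory.usage target is
# ceilometer.compute.pollsters.instance_stats:MemoryUsagePollster.
entry_points = {"memory.usage": "collections:OrderedDict"}

def load(name):
    """Resolve a 'module:Class' entry point to the class object."""
    module, cls = entry_points[name].split(":")
    return getattr(importlib.import_module(module), cls)

print(load("memory.usage").__name__)  # -> OrderedDict
```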
# find metric id first
$ gnocchi metric list |grep 64cb10fd-246f-4864-b06d-687d59c47c2c
| 192c38bf-86b4-416f-ba4f-0dd0de806b1c | ceilometer-low | vcpus | vcpu | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| 27d940e8-0201-4078-b9b0-0109f03d1727 | ceilometer-low-rate | cpu | ns | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| 78ffe32c-da9b-41ef-bff0-a2e960712ba7 | ceilometer-low | memory | MB | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| 9daad480-1eb2-4c4c-a878-004d522774dd | ceilometer-low | disk.root.size | GB | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| bbd47291-2feb-47e1-bdae-8dc8f8c41e72 | ceilometer-low | disk.ephemeral.size | GB | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| c49f013e-37b6-4039-95d4-a16b85da611d | ceilometer-low | memory.usage | MB | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| e9bd7d21-d3ad-4ba6-b40d-ae7ff92c6019 | ceilometer-low | compute.instance.booting.time | sec | 64cb10fd-246f-4864-b06d-687d59c47c2c |
# another way to find metric id
$ gnocchi resource show 64cb10fd-246f-4864-b06d-687d59c47c2c
+-----------------------+---------------------------------------------------------------------+
| Field | Value |
+-----------------------+---------------------------------------------------------------------+
| created_by_project_id | e0cf8c083ee34856addfb7453b9ab02c |
| created_by_user_id | 824b47c739e44006bb8b8f13a44c1257 |
| creator | 824b47c739e44006bb8b8f13a44c1257:e0cf8c083ee34856addfb7453b9ab02c |
| ended_at | None |
| id | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| metrics | compute.instance.booting.time: e9bd7d21-d3ad-4ba6-b40d-ae7ff92c6019 |
| | cpu: 27d940e8-0201-4078-b9b0-0109f03d1727 |
| | disk.ephemeral.size: bbd47291-2feb-47e1-bdae-8dc8f8c41e72 |
| | disk.root.size: 9daad480-1eb2-4c4c-a878-004d522774dd |
| | memory.usage: c49f013e-37b6-4039-95d4-a16b85da611d |
| | memory: 78ffe32c-da9b-41ef-bff0-a2e960712ba7 |
| | vcpus: 192c38bf-86b4-416f-ba4f-0dd0de806b1c |
| original_resource_id | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| project_id | 04bde26469464ae99f0e5d16259dfe76 |
| revision_end | None |
| revision_start | 2020-09-16T08:46:09.338454+00:00 |
| started_at | 2020-09-16T08:45:50.944660+00:00 |
| type | instance |
| user_id | d75e2e16b87a4b4da5fab29d1bb1b222 |
+-----------------------+---------------------------------------------------------------------+
$ gnocchi measures show c49f013e-37b6-4039-95d4-a16b85da611d
+---------------------------+-------------+------------+
| timestamp | granularity | value |
+---------------------------+-------------+------------+
| 2020-09-16T08:45:00+00:00 | 300.0 | 14.890625 |
| 2020-09-16T08:50:00+00:00 | 300.0 | 14.4453125 |
+---------------------------+-------------+------------+
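Gnocchi returns aggregated measures per granularity; simple arithmetic on the two 300-second memory.usage samples shown above gives their average:

```python
# (timestamp, granularity, value) rows from the table above.
measures = [
    ("2020-09-16T08:45:00+00:00", 300.0, 14.890625),
    ("2020-09-16T08:50:00+00:00", 300.0, 14.4453125),
]
avg = sum(v for _, _, v in measures) / len(measures)
print(avg)  # -> 14.66796875 (MB)
```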
How to debug
Before Newton, ceilometer launched its processes with oslo_service, so pdb could be used directly. Later releases use the cotyledon framework (built on multiprocessing) to launch processes; once the framework forks, the child processes' stdin/stdout/stderr are closed, so pdb can no longer run inside them. The ForkedPdb class below works around this by redirecting stdin, enabling debugging inside a child process (see https://zhuanlan.zhihu.com/p/63898351):
import sys
import pdb

class ForkedPdb(pdb.Pdb):
    """A pdb subclass that can be used from a forked child process."""
    def interaction(self, *args, **kwargs):
        _stdin = sys.stdin
        try:
            # Reopen the controlling terminal's stdin for the child process.
            sys.stdin = open('/dev/stdin')
            pdb.Pdb.interaction(self, *args, **kwargs)
        finally:
            sys.stdin = _stdin

# import pdb; pdb.set_trace()
ForkedPdb().set_trace()
The simplest approach, however, is rpdb: add "import rpdb; rpdb.set_trace()" (e.g. in /usr/lib/python3/dist-packages/ceilometer/pipeline/event.py), set workers=1, restart the service with "systemctl restart ceilometer-agent-notification", then attach with "nc 127.0.0.1 4444":
apt install python3-pip -y
pip3 install rpdb
import rpdb;rpdb.set_trace() #add this line in /usr/lib/python3/dist-packages/ceilometer/pipeline/event.py
change workers=1 in /etc/ceilometer/ceilometer.conf
systemctl restart ceilometer-agent-notification
tail -f /var/log/ceilometer/ceilometer-agent-notification.log
./tools/instance_launch.sh 1 cirros #use it to trigger breakpoint
netstat -anp |grep 4444
nc 127.0.0.1 4444
Update 2021-05-12 - keystone HA
juju deploy cs:keystone -n3
juju deploy cs:mysql
juju deploy cs:hacluster keystone-hacluster
juju add-relation keystone mysql
juju add-relation keystone keystone-hacluster
juju config keystone vip=10.5.100.0