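Adding HA to OpenStack components with pacemaker (by quqi99)

Author: Zhang Hua  Published: 2016-07-21
Copyright notice: This article may be freely reposted, provided that the repost includes a hyperlink to the original source, the author information, and this copyright notice.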

( http://blog.csdn.net/quqi99 )

Problem

This article describes how to use pacemaker to add HA to ceilometer.

Quickly deploying a test environment with Juju/MAAS

    ceilometer-hacluster:
      charm: cs:~openstack-charmers-next/hacluster
      options:
        debug: True
    ceilometer:
      #comment out the branch line to use the local charm
      #branch: https://github.com/openstack/charm-ceilometer
      constraints: mem=1G
      num_units: 3
      options:
        vip: 10.5.100.20
This is roughly equivalent to the following commands:
git clone https://github.com/openstack/charm-hacluster ceilometer-cluster
juju deploy --repository=/home/ubuntu/openstack-charm-testing local:trusty/ceilometer-cluster
juju deploy -n3 --repository=/home/ubuntu/openstack-charm-testing local:trusty/ceilometer
juju deploy ceilometer-agent
juju set ceilometer vip=10.5.100.20
juju add-relation ceilometer ceilometer-cluster
juju add-relation ceilometer keystone:identity-service
juju add-relation ceilometer keystone:identity-notifications
juju add-relation ceilometer rabbitmq-server
juju add-relation ceilometer mongodb
juju add-relation ceilometer-agent nova-compute
juju add-relation ceilometer-agent ceilometer
3, Edit dev.yaml to add the relation
   - [ ceilometer, ceilometer-hacluster ]
 
4, Deploy MAAS (omitted here), then run the juju-deployer command to deploy the OpenStack environment in one step
    juju-deployer -c ./next.yaml -d trusty-liberty

What happens behind the scenes

1, /etc/haproxy/haproxy.cfg
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    maxconn 20000
    user haproxy
    group haproxy
    spread-checks 0

defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    retries 3
    timeout queue 5000
    timeout connect 5000
    timeout client 30000
    timeout server 30000

listen stats
    bind 127.0.0.1:8888
    mode http
    stats enable
    stats hide-version
    stats realm Haproxy\ Statistics
    stats uri /
    stats auth admin:sqn99Cdznn2hYbSJz9nfnJ43fhWwVjpk

frontend tcp-in_ceilometer_api
    bind *:8777
    acl net_10.5.4.61 dst 10.5.4.61/255.255.0.0
    use_backend ceilometer_api_10.5.4.61 if net_10.5.4.61
    default_backend ceilometer_api_10.5.4.61

backend ceilometer_api_10.5.4.61
    balance leastconn
    server ceilometer-2 10.5.4.64:8767 check
    server ceilometer-0 10.5.4.63:8767 check
    server ceilometer-1 10.5.4.61:8767 check
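
The backend above uses `balance leastconn`, which sends each new connection to the server that currently has the fewest active connections. The following is only a toy illustration of that selection rule, not HAProxy's implementation: the `Backend` class, the `pick_leastconn` helper and the connection counts are all made up for the example, and ties are simply broken by list order.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    address: str
    active_connections: int = 0  # hypothetical counter, invented for this sketch


def pick_leastconn(backends):
    """Return the backend with the fewest active connections.

    Ties are broken by list order here; HAProxy has its own tie-breaking.
    """
    return min(backends, key=lambda b: b.active_connections)


# Server names and addresses come from the haproxy.cfg above;
# the connection counts are invented.
backends = [
    Backend("ceilometer-2", "10.5.4.64:8767", active_connections=4),
    Backend("ceilometer-0", "10.5.4.63:8767", active_connections=1),
    Backend("ceilometer-1", "10.5.4.61:8767", active_connections=3),
]

chosen = pick_leastconn(backends)
chosen.active_connections += 1  # the new connection is now counted
print(chosen.name)
```

With these invented counts the new connection goes to ceilometer-0, the server with the smallest number of active connections.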
 
2, /etc/corosync/corosync.conf
totem {
        version: 2
        # How long before declaring a token lost (ms)
        token: 3000
        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 10
        # How long to wait for join messages in the membership protocol (ms)
        join: 60
        # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
        consensus: 3600
        # Turn off the virtual synchrony filter
        vsftype: none
        # Number of messages that may be sent by one processor on receipt of the token
        max_messages: 20
        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes
        # Disable encryption
        secauth: off
        # How many threads to use for encryption/decryption
        threads: 0
        ip_version: ipv4
        # This specifies the mode of redundant ring, which may be none, active, or passive.
        rrp_mode: none
        interface {
                # The following values need to be set based on your environment
                ringnumber: 0
                bindnetaddr: 10.5.0.0
                mcastaddr: 226.94.1.1
                mcastport: 5403
        }
        transport: udp
}
quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 3
        }
logging {
        fileline: off
        to_stderr: yes
        to_logfile: no
        to_syslog: yes
        syslog_facility: daemon
        debug: on
        logger_subsys {
                subsys: QUORUM
                debug: on
        }
}
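
With `provider: corosync_votequorum` and `expected_votes: 3`, the cluster is quorate as long as a strict majority of the expected votes is present. A minimal sketch of that arithmetic, assuming one vote per node and no special votequorum options such as two_node (the function names are hypothetical):

```python
def quorum_threshold(expected_votes):
    """Smallest number of votes that forms a strict majority."""
    return expected_votes // 2 + 1


def is_quorate(votes_present, expected_votes):
    """True if the votes present reach the majority threshold."""
    return votes_present >= quorum_threshold(expected_votes)


expected = 3  # expected_votes from corosync.conf above
print(quorum_threshold(expected))
for present in (3, 2, 1):
    print(present, is_quorate(present, expected))
```

So a three-node cluster stays quorate after losing one node but not two. Note that the crm configuration in this article sets no-quorum-policy="ignore", so pacemaker keeps managing resources even when quorum is lost.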

3, sudo crm configure show
$ sudo crm configure show
node $id="168100925" juju-zhhuabj-machine-2 \
	attributes standby="on"
node $id="168100927" juju-zhhuabj-machine-1 \
	attributes standby="off"
node $id="168100928" juju-zhhuabj-machine-3 \
	attributes standby="off"
primitive res_ceilometer_agent_central ocf:openstack:ceilometer-agent-central \
	op monitor interval="30s" \
	meta target-role="Started"
primitive res_ceilometer_eth0_vip ocf:heartbeat:IPaddr2 \
	params ip="10.5.100.20" cidr_netmask="255.255.0.0" nic="eth0"
primitive res_ceilometer_haproxy lsb:haproxy \
	op monitor interval="5s"
group grp_ceilometer_vips res_ceilometer_eth0_vip
clone cl_ceilometer_haproxy res_ceilometer_haproxy
property $id="cib-bootstrap-options" \
	dc-version="1.1.10-42f2063" \
	cluster-infrastructure="corosync" \
	no-quorum-policy="ignore" \
	stonith-enabled="false" \
	last-lrm-refresh="1468984707"
rsc_defaults $id="rsc-options" \
	resource-stickiness="100"
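
The VIP resource uses cidr_netmask="255.255.0.0" and corosync binds to 10.5.0.0. A quick sanity check with Python's standard ipaddress module can confirm that the VIP lies inside the network the cluster is bound to. This is only an illustrative check using the addresses from the listings above, not something the charm does itself:

```python
import ipaddress

# Values copied from the listings above.
vip = ipaddress.ip_address("10.5.100.20")
cluster_net = ipaddress.ip_network("10.5.0.0/255.255.0.0")

# The haproxy ACL is written as host/netmask; strict=False masks the host bits.
acl_net = ipaddress.ip_network("10.5.4.61/255.255.0.0", strict=False)

print(cluster_net.prefixlen)   # netmask expressed as a prefix length
print(vip in cluster_net)      # is the VIP inside the corosync network?
print(acl_net)                 # network matched by the haproxy ACL
```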

Adding HA to an existing environment

juju add-unit nova-cloud-controller -n 2
juju deploy hacluster ncc-hacluster --series trusty
juju add-relation nova-cloud-controller ncc-hacluster
juju config nova-cloud-controller vip=10.5.104.1

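Update 2020-09-16

Ceilometer's functionality has been split into three parts: collection (ceilometer), storage (gnocchi), and alarming (Aodh). There is also Panko, which mainly provides an event storage service.
For the architecture, see: https://www.cnblogs.com/gaozhengwei/p/7097605.html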

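  • ceilometer-agent-compute collects status information about the instances themselves (nova-compute), while ceilometer-agent-central collects information from the components other than nova-compute (volumes, network, images, etc.) through their REST APIs. The collected data is then sent to the MQ. These categories can also be seen in setup.cfg in the source tree. For an analysis of how ceilometer-polling starts up (entry point ceilometer.cmd.polling.main), see http://mengalong.github.io/2018/07/27/ceilometer-agent-compute-ana/
  • ceilometer-agent-notification (entry point ceilometer/notification.py) receives from the MQ not only the data collected by ceilometer-agent-compute and ceilometer-agent-central, but also the notifications emitted directly by the OpenStack components, see https://www.cnblogs.com/sammyliu/archive/2004/01/13/4384470.html
  • After fetching data from the MQ, ceilometer-agent-notification stores it in the Gnocchi service
  • ceilometer-alarm-evaluator compares the alarm rules stored in the database against the aggregated measurements and sends any triggered alarm over the message queue to ceilometer-alarm-notifier, which then fires the alarm
  • ceilometer-agent-collector (deprecated in Ocata)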
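
The data flow described above (polling agent -> MQ -> notification agent -> Gnocchi) can be pictured with a toy model. Everything in this sketch is a stand-in: `queue.Queue` plays the message queue, a dict plays Gnocchi, and the function names, the "vm-1" resource id and the sample value are hypothetical. It does not use any ceilometer API.

```python
import queue


def poll_compute(instance_id):
    """Stand-in for ceilometer-agent-compute: produce one sample."""
    return {"resource_id": instance_id, "metric": "memory.usage", "value": 14.89}


def notification_agent(mq, storage):
    """Stand-in for ceilometer-agent-notification: drain the queue into storage."""
    while True:
        try:
            sample = mq.get_nowait()
        except queue.Empty:
            break
        key = (sample["resource_id"], sample["metric"])
        storage.setdefault(key, []).append(sample["value"])


mq = queue.Queue()      # stands in for the message queue (MQ)
gnocchi_store = {}      # stands in for Gnocchi

# The polling agent publishes samples; it never writes to storage directly.
mq.put(poll_compute("vm-1"))
mq.put(poll_compute("vm-1"))

# The notification agent is the only component that writes to storage.
notification_agent(mq, gnocchi_store)
print(gnocchi_store)
```

The point of the model is the decoupling: pollsters only publish to the queue, and only the notification agent talks to the storage backend.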

Quick environment setup:

./generate-bundle.sh -s bionic -r stein --create-model --name ceilometer:stsstack --num-compute 1 --telemetry
./generate-bundle.sh --name ceilometer:stsstack --replay --run
juju run-action ceilometer/0 ceilometer-upgrade
./configure
./tools/instance_launch.sh 1 cirros2
sudo apt install python3-gnocchiclient
gnocchi metric list |grep <vm-id>
gnocchi resource show <resource-id>


ceilometer/cmd/polling.py -> create_polling_service -> ceilometer/polling/manager.py#AgentManager

./ceilometer/compute/pollsters/__init__.py -> 
ceilometer.compute.virt =
    libvirt = ceilometer.compute.virt.libvirt.inspector:LibvirtInspector

memory.usage = ceilometer.compute.pollsters.instance_stats:MemoryUsagePollster
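
The entries above use the usual setup.cfg entry point format, `name = module:attribute`. A minimal sketch of how such a line maps to an importable object is shown below. The helpers `parse_entry_point` and `load_entry_point` are hypothetical names for this example; ceilometer loads its plugins through its own plugin machinery, which this sketch only imitates. A standard library target is used for the loading step so the example runs without ceilometer installed.

```python
import importlib


def parse_entry_point(line):
    """Split 'name = module:attr' into (name, module, attr)."""
    name, _, target = line.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module, attr


def load_entry_point(line):
    """Import the module and return the attribute the entry point names."""
    _, module, attr = parse_entry_point(line)
    return getattr(importlib.import_module(module), attr)


parsed = parse_entry_point(
    "libvirt = ceilometer.compute.virt.libvirt.inspector:LibvirtInspector"
)
print(parsed)

# Stdlib example so the loading step works without ceilometer installed.
loaded = load_entry_point("join = os.path:join")
print(loaded("a", "b"))
```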


# find metric id first
$ gnocchi metric list |grep 64cb10fd-246f-4864-b06d-687d59c47c2c
| 192c38bf-86b4-416f-ba4f-0dd0de806b1c | ceilometer-low      | vcpus                         | vcpu    | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| 27d940e8-0201-4078-b9b0-0109f03d1727 | ceilometer-low-rate | cpu                           | ns      | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| 78ffe32c-da9b-41ef-bff0-a2e960712ba7 | ceilometer-low      | memory                        | MB      | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| 9daad480-1eb2-4c4c-a878-004d522774dd | ceilometer-low      | disk.root.size                | GB      | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| bbd47291-2feb-47e1-bdae-8dc8f8c41e72 | ceilometer-low      | disk.ephemeral.size           | GB      | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| c49f013e-37b6-4039-95d4-a16b85da611d | ceilometer-low      | memory.usage                  | MB      | 64cb10fd-246f-4864-b06d-687d59c47c2c |
| e9bd7d21-d3ad-4ba6-b40d-ae7ff92c6019 | ceilometer-low      | compute.instance.booting.time | sec     | 64cb10fd-246f-4864-b06d-687d59c47c2c |

# another way to find metric id
$ gnocchi resource show 64cb10fd-246f-4864-b06d-687d59c47c2c
+-----------------------+---------------------------------------------------------------------+
| Field                 | Value                                                               |
+-----------------------+---------------------------------------------------------------------+
| created_by_project_id | e0cf8c083ee34856addfb7453b9ab02c                                    |
| created_by_user_id    | 824b47c739e44006bb8b8f13a44c1257                                    |
| creator               | 824b47c739e44006bb8b8f13a44c1257:e0cf8c083ee34856addfb7453b9ab02c   |
| ended_at              | None                                                                |
| id                    | 64cb10fd-246f-4864-b06d-687d59c47c2c                                |
| metrics               | compute.instance.booting.time: e9bd7d21-d3ad-4ba6-b40d-ae7ff92c6019 |
|                       | cpu: 27d940e8-0201-4078-b9b0-0109f03d1727                           |
|                       | disk.ephemeral.size: bbd47291-2feb-47e1-bdae-8dc8f8c41e72           |
|                       | disk.root.size: 9daad480-1eb2-4c4c-a878-004d522774dd                |
|                       | memory.usage: c49f013e-37b6-4039-95d4-a16b85da611d                  |
|                       | memory: 78ffe32c-da9b-41ef-bff0-a2e960712ba7                        |
|                       | vcpus: 192c38bf-86b4-416f-ba4f-0dd0de806b1c                         |
| original_resource_id  | 64cb10fd-246f-4864-b06d-687d59c47c2c                                |
| project_id            | 04bde26469464ae99f0e5d16259dfe76                                    |
| revision_end          | None                                                                |
| revision_start        | 2020-09-16T08:46:09.338454+00:00                                    |
| started_at            | 2020-09-16T08:45:50.944660+00:00                                    |
| type                  | instance                                                            |
| user_id               | d75e2e16b87a4b4da5fab29d1bb1b222                                    |
+-----------------------+---------------------------------------------------------------------+

$ gnocchi measures show c49f013e-37b6-4039-95d4-a16b85da611d
+---------------------------+-------------+------------+
| timestamp                 | granularity |      value |
+---------------------------+-------------+------------+
| 2020-09-16T08:45:00+00:00 |       300.0 |  14.890625 |
| 2020-09-16T08:50:00+00:00 |       300.0 | 14.4453125 |
+---------------------------+-------------+------------+
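
The measures above have a granularity of 300 seconds, and their timestamps fall on 5-minute boundaries (08:45:00, 08:50:00) even though the resource started at 08:45:50. The sketch below shows how a raw timestamp maps to the start of its bucket and how samples could be averaged per bucket. The helper names and the raw sample values are invented, and mean aggregation is an assumption about the archive policy rather than something the output above states.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def bucket_start(ts, granularity):
    """Round a timezone-aware timestamp down to the start of its bucket."""
    epoch = ts.timestamp()
    return datetime.fromtimestamp(epoch - epoch % granularity, tz=timezone.utc)


def aggregate_mean(samples, granularity):
    """Group (timestamp, value) samples by bucket and average each bucket."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[bucket_start(ts, granularity)].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# started_at from the resource shown earlier, truncated to whole seconds.
started = datetime(2020, 9, 16, 8, 45, 50, tzinfo=timezone.utc)
print(bucket_start(started, 300).isoformat())

# Invented raw samples, only to show the grouping.
samples = [
    (datetime(2020, 9, 16, 8, 46, 0, tzinfo=timezone.utc), 14.0),
    (datetime(2020, 9, 16, 8, 48, 0, tzinfo=timezone.utc), 16.0),
    (datetime(2020, 9, 16, 8, 51, 0, tzinfo=timezone.utc), 15.0),
]
aggregated = aggregate_mean(samples, 300)
for start, value in aggregated.items():
    print(start.isoformat(), value)
```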

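How to debug

Before Newton, ceilometer started its processes with oslo_service, so pdb could be used directly. Since then it has used the cotyledon module as its process framework, which is built on multiprocess. Once the framework has started, execution continues in child processes whose stdin/stdout/stderr have already been closed, so pdb cannot run there. The ForkedPdb class below makes debugging in a child process possible by redirecting stdin (see https://zhuanlan.zhihu.com/p/63898351)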

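import pdb
import sys


class ForkedPdb(pdb.Pdb):
    """A Pdb subclass that can be used from a forked child process."""

    def interaction(self, *args, **kwargs):
        # The child's sys.stdin is closed, so temporarily reopen /dev/stdin
        # for the duration of the debugger session.
        _stdin = sys.stdin
        try:
            sys.stdin = open('/dev/stdin')
            pdb.Pdb.interaction(self, *args, **kwargs)
        finally:
            # Restore whatever stdin the process had before.
            sys.stdin = _stdin


# import pdb; pdb.set_trace()  # the usual call, which does not work in the child
ForkedPdb().set_trace()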

The most convenient approach is to debug with rpdb: add "import rpdb;rpdb.set_trace()" to the code (for example /usr/lib/python3/dist-packages/ceilometer/pipeline/event.py), set workers=1, run "systemctl restart ceilometer-agent-notification", and finally run "nc 127.0.0.1 4444"

apt install python3-pip -y
pip3 install rpdb
# add this line to /usr/lib/python3/dist-packages/ceilometer/pipeline/event.py:
#   import rpdb;rpdb.set_trace()
# set workers=1 in /etc/ceilometer/ceilometer.conf
systemctl restart ceilometer-agent-notification
tail -f /var/log/ceilometer/ceilometer-agent-notification.log
./tools/instance_launch.sh 1 cirros #use it to trigger breakpoint
netstat -anp |grep 4444
nc 127.0.0.1 4444

Update 2021-05-12 - keystone HA

juju deploy cs:keystone -n3
juju deploy cs:mysql
juju deploy cs:hacluster keystone-hacluster
juju add-relation keystone mysql
juju add-relation keystone keystone-hacluster
juju config keystone vip=10.5.100.0

References

[1] https://wiki.ubuntu.com/OpenStack/OpenStackCharms/ReleaseNotes1501
[2] https://blog.csdn.net/qingyuanluofeng/article/details/83536546
[3] https://www.cnblogs.com/luohaixian/p/11145939.html