RabbitMQ has its own built-in cluster management system; here we don't need Pacemaker, everything is managed by RabbitMQ itself.
RabbitMQ, or more generally the message queue layer, is a critical component of OpenStack because every request and query uses this layer to communicate.
CLUSTERING SETUP
$ sudo apt-get install rabbitmq-server
RabbitMQ generates a cookie for each server instance. This cookie must be the same on each member of the cluster:
rabbitmq-01:~$ sudo cat /var/lib/rabbitmq/.erlang.cookie
ITCWRVSIDPHRSLGXBHCFc
rabbitmq-02:~$ rabbitmqctl stop_app
rabbitmq-02:~$ rabbitmqctl reset
rabbitmq-01:~$ ssh localadmin@rabbitmq-02 'echo -n "ITCWRVSIDPHRSLGXBHCFc" > /var/lib/rabbitmq/.erlang.cookie'
rabbitmq-02:~$ rabbitmqctl cluster rabbit@rabbitmq-01
Clustering node 'rabbit@rabbitmq-02' with ['rabbit@rabbitmq-01'] ...
...done.
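Before joining, it is worth checking that the cookie really is identical on both machines. A minimal sketch using local sample files (the paths under /tmp are stand-ins for /var/lib/rabbitmq/.erlang.cookie fetched from each node):

```shell
# Stand-in cookie files; in practice these would be copies of
# /var/lib/rabbitmq/.erlang.cookie retrieved from each node.
echo -n "ITCWRVSIDPHRSLGXBHCFc" > /tmp/cookie-01
echo -n "ITCWRVSIDPHRSLGXBHCFc" > /tmp/cookie-02

# cmp -s is silent and returns 0 only if the files are byte-identical
if cmp -s /tmp/cookie-01 /tmp/cookie-02; then
    echo "cookies match, safe to cluster"
else
    echo "cookies differ, fix before clustering" >&2
fi
```

A byte-level comparison matters here: even a trailing newline in the cookie file is enough to break Erlang authentication between the nodes.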
Check your cluster status from either node, 01 or 02:
rabbitmq-02:~$ sudo rabbitmqctl cluster_status
Cluster status of node 'rabbit@rabbitmq-02' ...
[{nodes,[{disc,['rabbit@rabbitmq-01']},{ram,['rabbit@rabbitmq-02']}]},
 {running_nodes,['rabbit@rabbitmq-01','rabbit@rabbitmq-02']}]
...done.
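For monitoring, the number of running nodes can be extracted from this output. A sketch that parses a captured status line (the sample string is the output shown above; in practice it would come from `rabbitmqctl cluster_status`):

```shell
# Sample cluster_status line captured from the output above
status="[{nodes,[{disc,['rabbit@rabbitmq-01']},{ram,['rabbit@rabbitmq-02']}]},{running_nodes,['rabbit@rabbitmq-01','rabbit@rabbitmq-02']}]"

# Isolate the running_nodes list, then count its unique members
running=$(echo "$status" \
    | sed "s/.*running_nodes,\[\(.*\)\].*/\1/" \
    | grep -o "rabbit@[a-z0-9-]*" \
    | sort -u \
    | wc -l)
echo "running nodes: $running"
```

Alerting when the count drops below the expected cluster size is a cheap early warning that a member has fallen out of the cluster.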
Finally, check your queues on both nodes; they should be identical:
$ sudo rabbitmqctl list_queues name slave_pids synchronised_slave_pids
Listing queues ...
network_fanout_3a9877aa044c47b7846bbe01bf7faa4a
volume.server-03
amq.gen-gXDebEl-MNze8Nv83Ox3wD
volume_fanout_2df3d830f3ac40a98fe924bb032f6512
cert_fanout_4b1d7951b629470db0d13880b5814a00
amq.gen-waleDspYQ1IP5lKkpmkc9P
volume
scheduler.server-05
consoleauth_fanout_a0c57c1bca3645eda5b25788de9dd484
amq.gen-gxnO_iZvkdB9BaLHX6g9LT
amq.gen-AsSmN5K6zqJ3-xDlJeP16t
amq.gen-A9B2HzA2dZmHjP-bNv2rwb
compute
scheduler
network
cert.server-05
consoleauth.server-05
network.server2
compute.server2
amq.gen-wN5T2Ylyqbu41-_RzJ4vf8
amq.gen-wHYGWuERrgzEHbBPrmTfwS
scheduler_fanout_d57ca4552c9542ed860a0ef8a9d48534
cert
compute_fanout_b9c1dd93cb1a4282bb32c066e7d621d2
amq.gen-QtDR90yqOLXuaNEPHozmt_
consoleauth
...done.
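A quick way to compare the queue lists of two nodes is to sort and diff them. A minimal sketch with inlined sample lists (in practice each file would be populated by running `sudo rabbitmqctl list_queues name` on the corresponding node):

```shell
# Inlined sample queue names; in practice, fill each file with the
# output of `sudo rabbitmqctl list_queues name` on one node.
printf 'compute\nscheduler\nnetwork\ncert\n' | sort > /tmp/queues-01
printf 'network\ncert\ncompute\nscheduler\n' | sort > /tmp/queues-02

# The sorted lists should be identical on every cluster member
if diff -q /tmp/queues-01 /tmp/queues-02 >/dev/null; then
    echo "queues identical"
else
    echo "queues differ" >&2
    diff /tmp/queues-01 /tmp/queues-02
fi
```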
Cluster nodes can be of two types: disc or RAM. Disc nodes replicate data both in RAM and on disk, providing redundancy in the event of node failure and recovery from global events such as a power failure across all nodes. RAM nodes replicate data in RAM only and are mainly used for scalability. A cluster must always contain at least one disc node.
You can also verify that the connection is well established between the nodes:
$ sudo netstat -plantu | grep 10.0.
tcp 0 0 10.0.0.1:39958 10.0.0.2:46117 ESTABLISHED 5294/beam.smp
TIPS
CHANGE THE IP OR THE HOSTNAME OF A NODE
If you have changed the IP address or the hostname of a node, this method is pretty nasty and harsh, but it works:
$ sudo rabbitmqctl stop_app
$ sudo dpkg-reconfigure rabbitmq-server
Stopping rabbitmq-server: RabbitMQ is not running
rabbitmq-server.
Starting rabbitmq-server: SUCCESS
rabbitmq-server.
The IP address and/or hostname will be refreshed in the RabbitMQ database.
CONVERT RAM NODE TO DISK NODE
rabbitmq-02:~$ sudo rabbitmqctl cluster_status
Cluster status of node 'rabbit@server-02' ...
[{nodes,[{disc,['rabbit@server-01']},{ram,['rabbit@server-02']}]},
{running_nodes,['rabbit@server-01','rabbit@server-02']}]
...done.
rabbitmq-02:~$ sudo rabbitmqctl stop_app
Stopping node 'rabbit@server-02' ...
...done.
rabbitmq-02:~$ sudo rabbitmqctl cluster rabbit@server-01 rabbit@server-02
Clustering node 'rabbit@server-02' with ['rabbit@server-01',
'rabbit@server-02'] ...
...done.
rabbitmq-02:~$ sudo rabbitmqctl start_app
Starting node 'rabbit@server-02' ...
...done.
rabbitmq-02:~$ sudo rabbitmqctl cluster_status
Cluster status of node 'rabbit@server-02' ...
[{nodes,[{disc,['rabbit@server-02','rabbit@server-01']}]},
{running_nodes,['rabbit@server-01','rabbit@server-02']}]
...done.
HAPROXY CONFIGURATION
Clustering doesn't mean high availability, which is why I put a load balancer on top. Here, HAProxy balances requests to a single node; if this node fails, requests are routed to the other node. It's as simple as that. Both the RabbitMQ native (Erlang) port and the AMQP port used by the OpenStack queues are configured.
global
    log 127.0.0.1 local0
    #log loghost local0 info
    maxconn 1024
    #chroot /usr/share/haproxy
    user haproxy
    group haproxy
    daemon
    #debug
    #quiet

defaults
    log global
    #log 127.0.0.1:514 local0 debug
    log 127.0.0.1 local1 debug
    mode tcp
    option tcplog
    option dontlognull
    retries 3
    option redispatch
    maxconn 1024
    # Default!
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

listen rabbitmq_cluster 0.0.0.0:4369
    mode tcp
    balance roundrobin
    server server-07_active 172.17.1.8:4369 check inter 5000 rise 2 fall 3
    server server-08_backup 172.17.1.9:4369 backup check inter 5000 rise 2 fall 3

listen rabbitmq_cluster_openstack 0.0.0.0:5672
    mode tcp
    balance roundrobin
    server server-07_active 172.17.1.8:5672 check inter 5000 rise 2 fall 3
    server server-08_backup 172.17.1.9:5672 backup check inter 5000 rise 2 fall 3
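With the load balancer in place, the OpenStack services can point at the HAProxy address instead of a single broker. A hypothetical nova.conf fragment, assuming 172.17.1.10 is the address HAProxy listens on:

```ini
# /etc/nova/nova.conf (fragment); 172.17.1.10 is an assumed HAProxy VIP
rabbit_host=172.17.1.10
rabbit_port=5672
```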
Et voilà!