

Starting independent nodes

Clusters are set up by re-configuring existing RabbitMQ nodes into a cluster configuration. Hence the first step is to start RabbitMQ on all nodes in the normal way:

rabbit1$ rabbitmq-server -detached
rabbit2$ rabbitmq-server -detached
rabbit3$ rabbitmq-server -detached

This creates three independent RabbitMQ brokers, one on each node, as confirmed by the cluster_status command:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...

The node name of a RabbitMQ broker started from the rabbitmq-server shell script is rabbit@shorthostname, where the short node name is lower-case (as in rabbit@rabbit1, above). If you use the rabbitmq-server.bat batch file on Windows, the short node name is upper-case (as in rabbit@RABBIT1). When you type node names, case matters, and these strings must match exactly.
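The naming rule can be sketched as a small shell helper. This is an illustration only: node_name_for and same_node are hypothetical names, and the sketch assumes the default "rabbit" prefix and that the short host name is everything before the first dot.

```shell
#!/bin/sh
# Hypothetical helper: derive the default node name from a host name.
# Assumes the default "rabbit" prefix; the short host name is taken to be
# everything before the first dot.
node_name_for() {
    short=${1%%.*}
    printf 'rabbit@%s\n' "$short"
}

# Node names are compared as exact, case-sensitive strings.
same_node() {
    [ "$1" = "$2" ]
}
```

For example, node_name_for rabbit1.example.com prints rabbit@rabbit1, while same_node rabbit@rabbit1 rabbit@RABBIT1 fails because the case differs.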

Creating the cluster

In order to link up our three nodes in a cluster, we tell two of the nodes, say rabbit@rabbit2 and rabbit@rabbit3, to join the cluster of the third, say rabbit@rabbit1.

We first join rabbit@rabbit2 as a ram node in a cluster with rabbit@rabbit1. To do that, on rabbit@rabbit2 we stop the RabbitMQ application, join the rabbit@rabbit1 cluster with the --ram flag, and restart the RabbitMQ application. Note that joining a cluster implicitly resets the node, removing all resources and data that were previously present on that node.

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl join_cluster --ram rabbit@rabbit1
Clustering node rabbit@rabbit2 with [rabbit@rabbit1] ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.
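The stop, join, start sequence can be wrapped in one function so that no step is forgotten. This is a minimal sketch: join_cluster_as and the RABBITMQCTL variable are names invented for this example, not part of RabbitMQ. The function must run on the node that is joining, and it stops at the first failing step.

```shell
#!/bin/sh
# Sketch only: RABBITMQCTL and join_cluster_as are hypothetical names.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Usage: join_cluster_as ram|disc <node-already-in-cluster>
# Run on the joining node. Joining resets the local node and removes its data.
join_cluster_as() {
    node_type=$1
    target=$2
    case "$node_type" in
        ram)  set -- --ram "$target" ;;   # ram node: pass the --ram flag
        disc) set -- "$target" ;;         # disk node: no flag needed
        *)    echo "usage: join_cluster_as ram|disc <node>" >&2; return 2 ;;
    esac
    "$RABBITMQCTL" stop_app &&
    "$RABBITMQCTL" join_cluster "$@" &&
    "$RABBITMQCTL" start_app
}
```

On rabbit2, join_cluster_as ram rabbit@rabbit1 would issue the same three commands as the transcript above.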

We can see that the two nodes are joined in a cluster by running the cluster_status command on either of the nodes:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...

Now we join rabbit@rabbit3 as a disk node to the same cluster. The steps are identical to the ones above, except that we omit the --ram flag in order to turn it into a disk rather than ram node. This time we'll cluster to rabbit2 to demonstrate that the node chosen to cluster to does not matter - it is enough to provide one online node and the node will be clustered to the cluster that the specified node belongs to.

rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl join_cluster rabbit@rabbit2
Clustering node rabbit@rabbit3 with rabbit@rabbit2 ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.

We can see that the three nodes are joined in a cluster by running the cluster_status command on any of the nodes:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...

By following the above steps we can add new nodes to the cluster at any time, while the cluster is running.
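Checking every node by hand becomes tedious as the cluster grows. The sketch below queries several nodes from one shell using the -n option of rabbitmqctl. It assumes the machine running it can reach each node and shares the cluster's Erlang cookie; cluster_status_all and RABBITMQCTL are hypothetical names.

```shell
#!/bin/sh
# Sketch only: RABBITMQCTL and cluster_status_all are hypothetical names.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Usage: cluster_status_all rabbit@rabbit1 rabbit@rabbit2 ...
# Asks each listed node for its view of the cluster and keeps going
# if one of them cannot be reached.
cluster_status_all() {
    for node in "$@"; do
        "$RABBITMQCTL" -n "$node" cluster_status ||
            echo "could not query $node" >&2
    done
}
```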

Changing node types

We can change the type of a node from ram to disk and vice versa. Say we wanted to reverse the types of rabbit@rabbit2 and rabbit@rabbit3, turning the former from a ram node into a disk node and the latter from a disk node into a ram node. To do that we can use the change_cluster_node_type command. The node must be stopped first.

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl change_cluster_node_type disc
Turning rabbit@rabbit2 into a disc node ...
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.
rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl change_cluster_node_type ram
Turning rabbit@rabbit3 into a ram node ...
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.
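Because the node must be stopped before its type changes and started again afterwards, the three commands fit naturally into one helper. This is a sketch under the assumption that it runs on the node being converted; change_node_type and RABBITMQCTL are hypothetical names.

```shell
#!/bin/sh
# Sketch only: RABBITMQCTL and change_node_type are hypothetical names.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Usage: change_node_type disc|ram
# Run on the node whose type should change.
change_node_type() {
    case "$1" in
        disc|ram) ;;
        *) echo "usage: change_node_type disc|ram" >&2; return 2 ;;
    esac
    "$RABBITMQCTL" stop_app &&
    "$RABBITMQCTL" change_cluster_node_type "$1" &&
    "$RABBITMQCTL" start_app
}
```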

Restarting cluster nodes

Nodes that have been joined to a cluster can be stopped at any time. It is also OK for them to crash. In both cases the rest of the cluster continues operating unaffected, and the nodes automatically "catch up" with the other cluster nodes when they start up again.

We shut down the nodes rabbit@rabbit1 and rabbit@rabbit3 and check on the cluster status at each step:

rabbit1$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit1 ...done.
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...
rabbit3$ rabbitmqctl stop
Stopping and halting node rabbit@rabbit3 ...done.
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...

Now we start the nodes again, checking on the cluster status as we go along:

rabbit1$ rabbitmq-server -detached
rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmq-server -detached
rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...

There are some important caveats:

  • At least one disk node should be running at all times to prevent data loss. RabbitMQ will prevent the creation of a RAM-only cluster in many situations, but it still won't stop you from stopping and forcefully resetting all the disk nodes, which will lead to a RAM-only cluster. Doing this is not advisable and makes losing data very easy.
  • When the entire cluster is brought down, the last node to go down must be the first node to be brought online. If this doesn't happen, the nodes will wait 30 seconds for the last disk node to come back online, and fail afterwards. If the last node to go offline cannot be brought back up, it can be removed from the cluster using the forget_cluster_node command - consult the rabbitmqctl manpage for more information.
  • If all cluster nodes stop in a simultaneous and uncontrolled manner (for example with a power cut) you can be left with a situation in which all nodes think that some other node stopped after them. In this case you can use the force_boot command on one node to make it bootable again - consult the rabbitmqctl manpage for more information.
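The ordering rule for a full shutdown can be made concrete by recording the order in which nodes were stopped and reversing it. The sketch below only computes the order and starts nothing. The rule above requires only that the last node stopped comes first; reversing the remaining nodes as well is an assumption of this example, and startup_order is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print nodes in a safe start-up order.
# Usage: startup_order <nodes in the order they were stopped>
startup_order() {
    order=""
    for node in "$@"; do
        # Prepend each node so the last one stopped ends up first.
        order="$node${order:+ $order}"
    done
    printf '%s\n' "$order"
}
```

If the nodes were stopped in the order rabbit@rabbit1, rabbit@rabbit3, rabbit@rabbit2, the function lists rabbit@rabbit2 first.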

Breaking up a cluster

Nodes need to be removed explicitly from a cluster when they are no longer meant to be part of it. We first remove rabbit@rabbit3 from the cluster, returning it to independent operation. To do that, on rabbit@rabbit3 we stop the RabbitMQ application, reset the node, and restart the RabbitMQ application.

rabbit3$ rabbitmqctl stop_app
Stopping node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl reset
Resetting node rabbit@rabbit3 ...done.
rabbit3$ rabbitmqctl start_app
Starting node rabbit@rabbit3 ...done.

Note that it would have been equally valid to list rabbit@rabbit3 as a node.

Running the cluster_status command on the nodes confirms that rabbit@rabbit3 now is no longer part of the cluster and operates independently:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...

We can also remove nodes remotely. This is useful, for example, when dealing with an unresponsive node. Here we remove rabbit@rabbit1 from rabbit@rabbit2.

rabbit1$ rabbitmqctl stop_app
Stopping node rabbit@rabbit1 ...done.
rabbit2$ rabbitmqctl forget_cluster_node rabbit@rabbit1
Removing node rabbit@rabbit1 from cluster ...

Note that rabbit1 still thinks it's clustered with rabbit2, and trying to start it will result in an error. We will need to reset it to be able to start it again.

rabbit1$ rabbitmqctl start_app
Starting node rabbit@rabbit1 ...
Error: inconsistent_cluster: Node rabbit@rabbit1 thinks it's clustered with node rabbit@rabbit2, but rabbit@rabbit2 disagrees
rabbit1$ rabbitmqctl reset
Resetting node rabbit@rabbit1 ...done.
rabbit1$ rabbitmqctl start_app
Starting node rabbit@rabbit1 ...
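Remote removal involves two machines, which the following sketch separates into two helpers. forget_node is meant to run on a node that stays in the cluster, and reset_and_start on the removed node, whose application is assumed to be stopped already. Both function names and the RABBITMQCTL variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: RABBITMQCTL, forget_node and reset_and_start are hypothetical names.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Run on a node that remains in the cluster (rabbit@rabbit2 in the text).
# Usage: forget_node <node-to-remove>
forget_node() {
    "$RABBITMQCTL" forget_cluster_node "$1"
}

# Run on the removed node (rabbit@rabbit1 in the text) while its application
# is stopped: clear the stale cluster state, then start it as a standalone node.
reset_and_start() {
    "$RABBITMQCTL" reset &&
    "$RABBITMQCTL" start_app
}
```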

The cluster_status command now shows all three nodes operating as independent RabbitMQ brokers:

rabbit1$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit1 ...
rabbit2$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit2 ...
rabbit3$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit3 ...

Note that rabbit@rabbit2 retains the residual state of the cluster, whereas rabbit@rabbit1 and rabbit@rabbit3 are freshly initialised RabbitMQ brokers. If we want to re-initialise rabbit@rabbit2 we follow the same steps as for the other nodes:

rabbit2$ rabbitmqctl stop_app
Stopping node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl reset
Resetting node rabbit@rabbit2 ...done.
rabbit2$ rabbitmqctl start_app
Starting node rabbit@rabbit2 ...done.




