High availability with RabbitMQ
Wednesday, July 17, 2019

Let’s see how to use RabbitMQ in high availability contexts.

When dealing with complex systems, it is often necessary to provide mechanisms that keep the offered services always available: in other words, the system must be highly reliable.

In the previous article (Decoupling the communication with RabbitMQ), we explored the basic features of RabbitMQ and saw how it represents an excellent solution for communication between different applications.

But RabbitMQ also provides features that make it a valid tool even for systems that require a certain level of QoS (Quality of Service).

A RabbitMQ broker can be defined as the logical set of one or more nodes that run the RabbitMQ application and share the same entities (queues, exchanges, bindings, etc.).

This set of nodes is also called a cluster. Nodes are identified within the cluster by their name, consisting of a prefix and a hostname, which must therefore be unique.

Since the hostname is used to identify a node within the cluster, it must be resolvable, for example through a DNS system (Domain Name System).

For this reason, and to best simulate the composition of a cluster, I defined a Docker bridge network, which provides automatic DNS resolution between containers:

docker network create cluster-network
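As a quick check, we can verify that the network exists and uses the bridge driver (a sketch; it assumes Docker is available on the host):

```shell
# Inspect the network created above and print its driver
docker network inspect cluster-network --format '{{ .Driver }}'
```

If the network was created correctly, this prints bridge.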

We can create the cluster nodes with two Docker containers running RabbitMQ, using the following commands:

# assuming the official rabbitmq:3-management image (management UI enabled);
# on a user-defined bridge network, --network-alias makes each node's
# hostname resolvable by the other container
docker run -d -h node1.rabbit \
           --net cluster-network \
           --network-alias node1.rabbit \
           --name rabbitNode1 \
           -p "4369:4369" \
           -p "5672:5672" \
           -p "15672:15672" \
           -p "25672:25672" \
           -p "35672:35672" \
           -e "RABBITMQ_USE_LONGNAME=true" \
           -e RABBITMQ_ERLANG_COOKIE="cookie" \
           rabbitmq:3-management

docker run -d -h node2.rabbit \
           --net cluster-network \
           --network-alias node2.rabbit \
           --name rabbitNode2 \
           -p "4370:4369" \
           -p "5673:5672" \
           -p "15673:15672" \
           -p "25673:25672" \
           -p "35673:35672" \
           -e "RABBITMQ_USE_LONGNAME=true" \
           -e RABBITMQ_ERLANG_COOKIE="cookie" \
           rabbitmq:3-management
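Before joining the nodes, it can be useful to check that both containers are actually running; for example (assuming the container names used above):

```shell
# List the two RabbitMQ containers and their status
docker ps --filter "name=rabbitNode" --format '{{.Names}}\t{{.Status}}'
```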

Currently, the two nodes are separate entities:



In order to join the two nodes in the same cluster, specific ports must be accessible, in particular:

  • 4369: epmd (Erlang Port Mapper Daemon), a peer discovery service used by nodes and RabbitMQ CLI tools;
  • 5672: port used by the AMQP protocol;
  • 25672: used for communication between nodes and with CLI tools;
  • 35672-35682: used by CLI tools for communication with nodes;
  • 15672: used, for instance, by the management UI.
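A quick way to verify from the host that the published ports actually answer is a small loop with nc, where available (a sketch; the port numbers are those mapped for rabbitNode1):

```shell
# Probe the ports published by rabbitNode1 on the host
for port in 4369 5672 15672 25672; do
  nc -z localhost "$port" && echo "port $port is open"
done
```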

Furthermore, note that among the configured environment variables we defined RABBITMQ_ERLANG_COOKIE. This variable is a shared secret that allows the nodes of a cluster to communicate with each other: all nodes must use the same cookie.

Now stop the execution of RabbitMQ on the rabbitNode2 node:

docker exec rabbitNode2 rabbitmqctl stop_app

and then:

docker exec rabbitNode2 rabbitmqctl join_cluster rabbit@node1.rabbit

Restarting the application:

docker exec rabbitNode2 rabbitmqctl start_app

We get the following result:

The management UI is certainly easy and practical, but we can also obtain information about the cluster by running the following command:

docker exec rabbitNode1 rabbitmqctl cluster_status

Let’s try to send a message to our cluster with the Sender application shown in the previous article. Once the application has run, we can open the management UI of both nodes and see that the queue has been created and the message correctly enqueued, and that this information is visible from both nodes.



What happens if one of the nodes stops working?

If we stop rabbitNode1 while the message is queued, we find this situation on the management UI of rabbitNode2:

By creating a cluster we certainly replicate the data and state necessary for the broker to operate, but this is not true for queues, which by default reside on a single node. For this reason, by terminating the rabbitNode1 node we have lost the message sent.

To remedy this unpleasant situation, which results in loss of information, RabbitMQ allows you to create highly available queues, also called mirrored queues. A queue located on one node (the master) can be replicated, together with the operations performed on it, to the other nodes of the cluster (the mirrors).

To configure the cluster queues to be mirrored, you need to define a policy: a rule applied to all queues whose name matches a given regular expression.

Let’s run the following command:

docker exec rabbitNode1 rabbitmqctl set_policy ha "." '{"ha-mode":"all"}'

where ha is the name of the policy, “.” is the pattern (a regular expression matched against queue names) and ha-mode, set to “all”, mirrors the matching queues across all nodes of the cluster.
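We can confirm that the policy has been registered with rabbitmqctl, for example:

```shell
# List the policies defined on the default virtual host
docker exec rabbitNode1 rabbitmqctl list_policies
```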

Let’s create the queue again by starting the Sender application; then, in the queue section of the management UI, in the details of our queue, we can verify what we have just described.

If we stop the rabbitNode1 container again, it will no longer be running but, unlike before, the queue is still available and we have not lost the message.
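The failure scenario can also be reproduced from the command line (a sketch, assuming the container names used above):

```shell
# Simulate the failure of the first node...
docker stop rabbitNode1
# ...and check that the queue and its message survive on the second one
docker exec rabbitNode2 rabbitmqctl list_queues name messages
```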

To consume it correctly, however, we need to make a small change to our consumer application, defined in the previous article as Receiver.

var endPointList = new List<AmqpTcpEndpoint>
{
    new AmqpTcpEndpoint("localhost", 5672),
    new AmqpTcpEndpoint("localhost", 5673)
};

var factory = new ConnectionFactory();

using (var connection = factory.CreateConnection(endPointList))
{
    // ...
}

We define a list of endpoints (the nodes of our cluster), specifying the ports used by the AMQP protocol. We pass this list as a parameter to the CreateConnection method, made available by the RabbitMQ client for .NET, which selects an endpoint that is available for the connection to the broker.

Running the Receiver application we notice how the message is consumed correctly:

See you in the next article.