Last year at IT-Clouds, a group responsible for the development of Swisscom’s cloud products, we enabled RabbitMQ’s quorum queue support in our internal PaaS solution. We then migrated one of our software systems to take advantage of quorum queues in production. This post describes our system, our motivation, and the steps we followed to migrate a number of Java- and Spring-based applications using code only.
IBiS stands for 'Integration of Business Services'. It is a key backend system of Swisscom's cloud offerings (Enterprise Service Cloud, Dynamic Computing Services), powering the billing and reporting capabilities of our products. The system:
To achieve its goals, IBiS listens for all sorts of events coming from cloud services and acts on them accordingly. It consists of a number of components architected in a simple, event-driven manner:
IBiS is developed at Swisscom, written in modern Java, and powered by recent releases of the Spring Framework.
At the heart of IBiS lies RabbitMQ, an open-source messaging broker supporting the AMQP messaging protocol. We use RabbitMQ as a service offered within our internal Application Cloud (iAPC), a PaaS solution based on Cloud Foundry. RabbitMQ is responsible for the reliable transport of events within the system, from the components that ingest them (e.g. from Apache Kafka) to the components that process them by applying the relevant business logic. One of the most important business requirements of IBiS is to accurately track what is happening in the Swisscom clouds. To achieve that, we have to ensure that events are delivered reliably. Losing an event can have bad consequences and might easily result in billing or reporting errors (e.g. a VM was deleted in the cloud but IBiS never learnt about it because the event was lost).
Originally, IBiS leveraged durable mirrored queues to achieve the desired reliability. While this has been a fine solution for a long time, recent RabbitMQ versions (3.8.0+) ship with quorum queue support. Quorum queues are considered to be the modern alternative to mirrored queues. They focus on data safety and have been designed specifically to address the needs of systems like IBiS, where reliability is key. Since our workload is exactly what the quorum queues are built for, we have decided to migrate away from mirrored queues. The rest of this article describes how we did it.
Before we talk about the migration itself, it is important to briefly describe the RabbitMQ setup that IBiS leverages. Our internal messaging architecture relies on several topic exchanges to which messages (events) are sent. Each message is accompanied by a routing key which is, in our case, the event type. The exchange takes care of routing messages to a number of queues. The queues are created (declared) by our backend components, with each queue belonging to exactly one component and listening on a subset of routing keys. This essentially represents a publish/subscribe system where each backend component can subscribe to certain routing keys (event types) and listen only for the messages of interest. The architecture of one such exchange is depicted in the figure below:
When handling messages, we also have to deal with failures (e.g. when an event is malformed). For this purpose we rely on dead letter queues. When a message cannot be processed by a backend component, it is sent to a dead letter exchange, which pushes it to a dead letter queue owned by the component in which the problem was detected. The message stays there until we can understand the problem and process it manually.
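The subscription model described above rests on how a topic exchange matches a message's routing key against the binding patterns of each queue: words are separated by dots, `*` stands for exactly one word, and `#` stands for zero or more words. The sketch below reimplements that matching rule in plain Java purely for illustration (the broker performs it itself); the class name `TopicMatcher` and event types such as `vm.created` are made up for the example and are not taken from IBiS.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative re-implementation of AMQP topic-exchange pattern matching. */
public final class TopicMatcher {

    private TopicMatcher() {
    }

    /** Returns true if the routing key matches the binding pattern. */
    public static boolean matches(String pattern, String routingKey) {
        List<String> patternWords = Arrays.asList(pattern.split("\\."));
        List<String> keyWords = routingKey.isEmpty()
                ? List.of()
                : Arrays.asList(routingKey.split("\\."));
        return match(patternWords, 0, keyWords, 0);
    }

    private static boolean match(List<String> p, int i, List<String> k, int j) {
        if (i == p.size()) {
            // Pattern exhausted: it matches only if the key is exhausted too.
            return j == k.size();
        }
        String word = p.get(i);
        if (word.equals("#")) {
            // '#' may absorb zero or more words of the routing key.
            for (int next = j; next <= k.size(); next++) {
                if (match(p, i + 1, k, next)) {
                    return true;
                }
            }
            return false;
        }
        if (j == k.size()) {
            return false;
        }
        // '*' matches exactly one word; anything else must match literally.
        return (word.equals("*") || word.equals(k.get(j))) && match(p, i + 1, k, j + 1);
    }
}
```

With such patterns, a component bound with `vm.*` would receive `vm.created` but not `vm.disk.attached`, whereas a binding of `vm.#` would receive both.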
Overall, our architecture can be summarised as follows.
Our research into the migration from mirrored queues to quorum queues quickly revealed that it is not something RabbitMQ could do for us on its own. A RabbitMQ queue cannot change its type once it has been declared. In other words, to switch to quorum queues we have to declare new queues, move the messages from the old queues, and remove the old queues. One option to solve this problem would be to use the Shovel plugin. However, this would require the RabbitMQ team at Swisscom to take action (plan maintenance, install the plugin, test it, etc.) with no guarantee that the approach would actually work correctly for our use case. We therefore decided to take another route and handle everything as code, with Spring AMQP.
Spring AMQP is a Spring project that supports the development of AMQP-based solutions. It provides handy abstractions that are simple to use, integrates well with the Spring Framework, and lets us leverage the full potential of AMQP. While discussing Spring AMQP in depth is out of the scope of this article, it is important to note that it provides classes that allow us to declare and manage various RabbitMQ objects. For example, we can declare a new topic exchange, routing rules (bindings), as well as a new queue as simply as:
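A minimal sketch of such declarations, assuming a Spring configuration class, could look like the following. The exchange names (`events`, `events.dead-letter`), the binding pattern `vm.*`, and the bean names are illustrative assumptions rather than the actual IBiS setup; `QueueBuilder.quorum()` requires a Spring AMQP version that supports quorum queues.

```java
import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.QueueBuilder;
import org.springframework.amqp.core.TopicExchange;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MessagingConfiguration {

    // Topic exchange that receives all events (name is an assumption).
    @Bean
    public TopicExchange eventsExchange() {
        return new TopicExchange("events");
    }

    // Durable quorum queue owned by one backend component. Rejected messages
    // go to a dead letter exchange that is assumed to be declared elsewhere.
    @Bean
    public Queue componentQueue() {
        return QueueBuilder.durable("my-quorum-queue")
                .quorum()
                .deadLetterExchange("events.dead-letter")
                .build();
    }

    // Routing rule: deliver events whose routing key matches "vm.*" to the queue.
    @Bean
    public Binding vmEventsBinding(Queue componentQueue, TopicExchange eventsExchange) {
        return BindingBuilder.bind(componentQueue).to(eventsExchange).with("vm.*");
    }
}
```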
Once we declare our setup (which is then automatically provisioned at application startup, in an idempotent way), we can access RabbitMQ using the RabbitTemplate class. Let us try to put these pieces together to migrate mirrored queues to quorum queues.
We can note down what needs to happen in order to transparently and safely migrate from mirrored queues to quorum queues. We need to:
By leveraging our messaging architecture and Spring AMQP, we can translate these steps into the ones below. For each backend component we want to:
The steps described above are depicted in the sequence diagram below:
One might wonder why we decided to drain the old queues by sending the messages to dead letter queues instead of consuming them directly. We have chosen this approach for maintenance reasons. Since we heavily rely on dead letter queues to handle erroneous messages, we already have robust, battle-tested code in place that we can reuse. Consuming from the mirrored queues directly would be possible, but comes with a risk of not handling some edge cases and hence losing messages.
We can now go through the code that we used to migrate our queues.
First, let us declare the names of the queues. Assuming we declared the new quorum queue under the name of `my-quorum-queue` and we have the previous classic mirrored queue `my-classic-mirrored-queue` in place:
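As a small sketch, the two names could be kept as constants in one place; the holder class `QueueNames` and the constant names are assumptions made for illustration.

```java
/** Queue names used during the migration (holder class is illustrative). */
public final class QueueNames {

    /** The new quorum queue that replaces the classic mirrored queue. */
    public static final String QUORUM_QUEUE = "my-quorum-queue";

    /** The existing classic mirrored queue that is to be drained and removed. */
    public static final String CLASSIC_QUEUE = "my-classic-mirrored-queue";

    private QueueNames() {
    }
}
```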
Then, we can use the autowired RabbitTemplate object (see Spring AMQP docs for more details) to obtain an administration client:
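One way to do this, sketched here under the assumption that the migration lives in its own Spring component (the class name `QueueMigration` is hypothetical), is to build a `RabbitAdmin` on top of the connection factory used by the injected template:

```java
import org.springframework.amqp.rabbit.core.RabbitAdmin;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Component;

@Component
public class QueueMigration {

    private final RabbitTemplate rabbitTemplate;
    private final RabbitAdmin rabbitAdmin;

    // Spring injects the RabbitTemplate; the admin client reuses its connection factory.
    public QueueMigration(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
        this.rabbitAdmin = new RabbitAdmin(rabbitTemplate.getConnectionFactory());
    }
}
```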
We first check whether the classic queue exists. If it does not, we have nothing to do.
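A possible sketch of this check relies on `RabbitAdmin.getQueueProperties`, which returns `null` when the queue does not exist. The helper class and method names are assumptions.

```java
import java.util.Properties;

import org.springframework.amqp.rabbit.core.RabbitAdmin;

public final class ClassicQueueCheck {

    private ClassicQueueCheck() {
    }

    /** Returns true if the broker knows a queue with the given name. */
    public static boolean queueExists(RabbitAdmin rabbitAdmin, String queueName) {
        // getQueueProperties returns null for a queue that does not exist.
        Properties properties = rabbitAdmin.getQueueProperties(queueName);
        return properties != null;
    }
}
```

Returning early when the check yields `false` is also what lets the migration be triggered repeatedly without side effects.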
Next, we leverage the bindings that we declare in our application’s configuration. If we create them similarly to the example outlined in the Spring AMQP section above, we can autowire a Binding list and use it to remove the bindings from the old queue. The key assumption here is that the bindings are the same, i.e., the new queue has the same set of bindings as the old one.
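Under that assumption, a sketch of this step could mirror every binding declared for the new queue onto the old queue name and ask the broker to remove it. The helper name `unbindClassicQueue` and its parameters are illustrative, not taken from the IBiS codebase.

```java
import java.util.List;

import org.springframework.amqp.core.Binding;
import org.springframework.amqp.rabbit.core.RabbitAdmin;

public final class BindingRemoval {

    private BindingRemoval() {
    }

    /**
     * Removes from the classic queue every binding that the application
     * declares for the quorum queue (same exchange, same routing key).
     */
    public static void unbindClassicQueue(RabbitAdmin rabbitAdmin,
                                          List<Binding> declaredBindings,
                                          String quorumQueue,
                                          String classicQueue) {
        for (Binding binding : declaredBindings) {
            // Skip bindings that belong to other queues (e.g. dead letter queues).
            if (!binding.isDestinationQueue() || !quorumQueue.equals(binding.getDestination())) {
                continue;
            }
            Binding oldBinding = new Binding(
                    classicQueue,
                    Binding.DestinationType.QUEUE,
                    binding.getExchange(),
                    binding.getRoutingKey(),
                    binding.getArguments());
            rabbitAdmin.removeBinding(oldBinding);
        }
    }
}
```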
We then assume that removing the bindings takes some time and that some messages might still arrive in the meantime. Therefore, we wait a bit.
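The wait itself can be as simple as a sleep. In the sketch below the helper name is hypothetical and the length of the pause is left to the caller, since a suitable value depends on the system and is an assumption either way.

```java
import java.time.Duration;

public final class MigrationPause {

    private MigrationPause() {
    }

    /** Blocks the calling thread so that in-flight messages can settle. */
    public static void pause(Duration settleTime) {
        try {
            Thread.sleep(settleTime.toMillis());
        } catch (InterruptedException e) {
            // Restore the interrupt flag and abort the migration attempt.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for in-flight messages", e);
        }
    }
}
```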
At this point, the old queue should have reached a consistent state. All the messages are assumed to be in, and no new messages should arrive as there are no bindings. We can therefore reject all of them, hence transferring them to the dead letter queue:
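One way to sketch this with Spring AMQP is to fetch messages one by one on a channel and reject each without requeueing, which makes the broker dead-letter it, provided the old queue was declared with a dead letter exchange. The class and method names are assumptions.

```java
import com.rabbitmq.client.GetResponse;

import org.springframework.amqp.rabbit.core.RabbitTemplate;

public final class ClassicQueueDrain {

    private ClassicQueueDrain() {
    }

    /** Rejects every message in the queue and returns how many were rejected. */
    public static int rejectAll(RabbitTemplate rabbitTemplate, String queueName) {
        Integer rejected = rabbitTemplate.execute(channel -> {
            int count = 0;
            GetResponse response;
            // basicGet returns null once the queue is empty (autoAck is off).
            while ((response = channel.basicGet(queueName, false)) != null) {
                // requeue = false: the message is dead-lettered, not put back.
                channel.basicReject(response.getEnvelope().getDeliveryTag(), false);
                count++;
            }
            return count;
        });
        return rejected == null ? 0 : rejected;
    }
}
```

The returned count can be compared with the number of messages that later show up in the dead letter queue.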
At this moment, the old queue should be drained, with all outstanding messages moved to the dead letter queue. As explained previously, our IBiS codebase features logic to handle the dead letter queue. This is exposed through the `deadLetterQueueService` object (which, under the hood, simply provides a couple of utility methods to read from the dead letter queue or get some statistics). We use it to ensure the migration has been successful and to consume the messages:
Finally, we remove the old queue:
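A cautious sketch of the removal uses the `deleteQueue` overload that only deletes an empty queue, so that a queue which unexpectedly still holds messages is not dropped; in that case the broker refuses the deletion and the call fails. The helper class and method names are assumptions.

```java
import org.springframework.amqp.rabbit.core.RabbitAdmin;

public final class ClassicQueueRemoval {

    private ClassicQueueRemoval() {
    }

    /** Deletes the queue, but only if it no longer contains any messages. */
    public static void deleteIfEmpty(RabbitAdmin rabbitAdmin, String queueName) {
        // unused = false: delete even if consumers are still attached;
        // empty = true: refuse to delete while messages remain in the queue.
        rabbitAdmin.deleteQueue(queueName, false, true);
    }
}
```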
We have exposed this migration as a REST endpoint that is triggered automatically (after application startup) or manually (in case we need to repeat it after an error). The migration is idempotent, i.e., it can be triggered multiple times and repeatedly yields the same result.
The full code can be found here.
In this article we described how we moved IBiS, a software platform that we develop to support Swisscom's cloud offerings in terms of billing and reporting, to RabbitMQ's quorum queues. We discussed how we use RabbitMQ internally, what our requirements are, and how we decided to proceed with the migration. Finally, we went through real-world code samples to give a glimpse of how such a migration can be achieved with Java and the Spring AMQP project.