A client's adoption of the cloud is often viewed as binary: they are either on the cloud or not. In reality, adoption is gradual and fragmented because of differing priorities within the business. For instance, the central IT team may want to use the cloud as an IaaS provider, gaining agility from the cloud provider's ability to provision and scale infrastructure, while lines of business within the organization may want to purchase SaaS-based solutions to complete their work more effectively. These different drivers often lead to a situation where an organization has components on premise and also in multiple cloud providers, such as IBM, AWS and Azure. To prevent these from becoming silos, we need to provide secure, reliable communication between all of these deployments.
Figure 1: A typical organization's multi-cloud deployment
Logically, each of the cloud providers can be considered a separate network zone with its own IP address ranges, which may or may not overlap with those of the other zones. These network zones need to be connected; this raw connectivity could be via the public internet or, in certain situations, via dedicated connectivity. Regardless of the mechanism used, raw IP connectivity is required. The natural next step is to declare success and run HTTPS over the raw IP connectivity.
Figure 2: Logical communication between network zones
Initially this may seem like a suitable approach; however, you will soon identify a number of issues:
- Network Isolation: individual cloud providers have their own network setup and addressable IPs. When connecting clouds, you need to decide whether you can have a single logical network or whether network address translation is required. Commonly, a single logical network is impractical because of existing address conflicts.
- Firewalls: IP connectivity allows communication in theory; however, organizations have spent years building complex security to protect their systems. Systems such as firewalls deliberately prevent traffic from the outside world (such as cloud providers) from entering the trusted on premise network.
- Reliability: within an organization's data center, modern networks are relatively reliable; however, as you reach out to cloud providers, especially over the public internet, reliability is often an issue. For certain applications, such as a product pricing system, this may not be a problem, but you may be more concerned about an order management system. Raw HTTP across the public internet may therefore not be suitable, and assured delivery may be beneficial.
- Security: while communication stays within the organization's premises, where you own the machines and network equipment, point-to-point TLS security may be adequate for sensitive traffic. As you adopt the cloud, point-to-point TLS offers weaker guarantees, because the traffic may be decrypted on load balancers and firewalls outside of your control. You are then relying on the cloud providers to re-secure the traffic and to handle it sensitively. While this may be adequate, some use cases may require a higher level of security, such as end-to-end security of the message.
- Store and forward: cloud applications have a reputation for lower availability than traditional on premise applications. Regardless of whether that reputation is deserved, any application can be unavailable for a period due to failures, and in this case we do not want requests to be blocked, because blocking can cascade the issue across applications. We therefore need to decouple applications and provide a store and forward capability for when the target application is unavailable.
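The store and forward idea in the last point can be illustrated with a minimal Python sketch. It is not how any particular product implements it: the class name `StoreAndForwardGateway`, the `send` callable and the use of `ConnectionError` to signal an unreachable target are all assumptions made for illustration, and the store is held in memory only to keep the example short.

```python
from collections import deque
from typing import Callable, Deque


class StoreAndForwardGateway:
    """Toy gateway: accepts messages immediately, forwards them when the target is reachable."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send                     # hypothetical transport to the remote zone
        self._pending: Deque[str] = deque()   # in memory here; a real store would be persistent

    def put(self, message: str) -> None:
        # The producer is never blocked by the availability of the target application.
        self._pending.append(message)

    def flush(self) -> int:
        """Forward stored messages in order; stop at the first failure and return the count sent."""
        forwarded = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                         # keep the message for a later retry
            self._pending.popleft()           # remove only after a successful send
            forwarded += 1
        return forwarded

    def depth(self) -> int:
        """Number of messages still waiting to be forwarded."""
        return len(self._pending)
```

The producing application only ever calls `put`, so an outage of the target shows up as a growing `depth()` rather than as blocked requests that cascade back through the calling applications.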
These challenges lead us to consider other transports and technologies. Logically, we want to introduce messaging gateways in each of the network zones to address them:
Figure 3: Messaging Gateways providing connectivity between clouds
The messaging gateways have to provide the following capabilities:
- Message Routing across network zones: regardless of the number of network zones, the messaging gateway must be able to handle any differences in network addressing and hide them from the consuming and producing applications.
- End to end security: messages sent through the messaging infrastructure should be able to have different levels of security, depending on the requirements, for example secure transport, message encryption, or both.
- Assured Reliable Exactly Once Delivery: depending on the application requirements, consumers must be able to retrieve messages exactly once. To support this, the solution should act in a transactional manner and, depending on the use case, be able to support global transaction coordination across multiple resources.
- Message Store: to support the store and forward requirement, a message store is required to handle a build-up of messages. Messages are held only temporarily, while the application is unavailable or while there are IP network connectivity issues; however, to provide assured reliable delivery the store needs to be persistent.
- Flexible communication choices: the application communicating with the messaging gateway should be able to choose its preferred communication mechanism. This could be a client library in a programming language such as Java, or a standard protocol such as HTTP, MQTT or AMQP.
- Highly Available and Scalable: the messaging gateway needs to be highly available so that applications can offload their messages and continue with their work without delay. As the messaging gateway is central to all communication across the multiple clouds, it is important to ensure that it can be scaled to meet the requirements of a growing business.
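To make the transactional retrieval and persistent store capabilities more concrete, the following Python sketch uses the standard `sqlite3` module as a stand-in message store. The class `PersistentQueue`, its table layout and the `handler` callback are assumptions for illustration only, not part of any messaging product. A message is removed from the store only if the consumer's handler completes; if the handler raises, the removal is rolled back and the message stays available.

```python
import sqlite3
from typing import Callable


class PersistentQueue:
    """Toy persistent queue backed by SQLite (illustrative only)."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to survive a process restart.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
        )
        self._db.commit()

    def put(self, body: str) -> None:
        with self._db:  # commits on success, rolls back on an exception
            self._db.execute("INSERT INTO messages (body) VALUES (?)", (body,))

    def get(self, handler: Callable[[str], None]) -> bool:
        """Hand the oldest message to handler inside one transaction.

        Returns False when the queue is empty.
        """
        with self._db:
            row = self._db.execute(
                "SELECT id, body FROM messages ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return False
            msg_id, body = row
            self._db.execute("DELETE FROM messages WHERE id = ?", (msg_id,))
            handler(body)  # an exception here rolls back the DELETE
            return True
```

Note that this gives exactly once processing only if the handler's own side effects are part of the same transaction. When the handler updates a separate resource, such as another database, the two must be coordinated, which is where the global transaction coordination mentioned above comes in.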
Many readers familiar with messaging technology will realize that this is neither a new problem nor a new set of capabilities. It is an evolution of the existing enterprise messaging problem, applied to cloud computing. We therefore want to apply our years of experience and proven technology to this problem, while ensuring that the solution remains flexible and agile.
IBM MQ has a proven record of solving this use case across data centers spread around the globe. The same approach can be taken for a multi-cloud deployment. In an IBM MQ deployment, the messaging gateways are queue managers, and these are clustered together into an MQ Cluster to allow efficient administration, high availability and scalability, as shown below:
Figure 4: IBM MQ Cluster providing the Multi-Cloud communication
This document does not attempt to explain MQ Clusters in full; however, it is important to understand the key fundamentals.
- Queue Managers (shown as Gateway Queue Managers): these queue managers store and route messages through the cluster. For applications hosted in the same network zone (e.g. the same cloud, or on premise), clustered queues and topics are created on these Gateway Queue Managers so that messages destined for those applications can be stored. When an application wants to communicate with an application in another network zone (cloud), it connects to its local Gateway Queue Manager, puts the message, and can be assured that the message will be delivered to the target application securely. In some situations a single highly available queue manager will be adequate as the gateway for a network zone; however, this can be scaled out as availability and load requirements demand.
- Full Repository: holds all the information about the queue managers in the cluster and their clustered objects. Normally there are two full repositories within a cluster, so that a level of redundancy is provided in the case of a failure. A full repository is a standard IBM MQ queue manager installation; however, we recommend that these queue managers are dedicated to administering the cluster and do not carry any messaging traffic. The placement of the full repositories is something to consider in a detailed design; clients often decide to locate them in the on premise data center(s). All queue managers within a cluster cache their cluster knowledge locally, so fast, reliable access to the full repositories is not typically essential for runtime performance.
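The reason runtime performance does not depend on the full repositories can be shown with a deliberately simplified Python model. It is not IBM MQ's actual cluster protocol: the classes `FullRepository` and `GatewayQueueManager`, their methods and the `lookups` counter are hypothetical names used only to illustrate local caching of cluster knowledge.

```python
from typing import Dict, FrozenSet, Set


class FullRepository:
    """Toy full repository: knows which queue managers host each clustered queue."""

    def __init__(self) -> None:
        self._queues: Dict[str, Set[str]] = {}
        self.lookups = 0  # counts how often the repository is consulted

    def advertise(self, queue: str, qmgr: str) -> None:
        self._queues.setdefault(queue, set()).add(qmgr)

    def resolve(self, queue: str) -> FrozenSet[str]:
        self.lookups += 1
        return frozenset(self._queues.get(queue, ()))


class GatewayQueueManager:
    """Toy cluster member that caches what it learns from the full repository."""

    def __init__(self, name: str, repo: FullRepository) -> None:
        self.name = name
        self._repo = repo
        self._cache: Dict[str, FrozenSet[str]] = {}

    def locate(self, queue: str) -> FrozenSet[str]:
        # Only the first request for a queue reaches the full repository;
        # later requests are answered from the local cache.
        if queue not in self._cache:
            self._cache[queue] = self._repo.resolve(queue)
        return self._cache[queue]
```

In this model, repeated calls to `locate` for the same queue leave `lookups` at 1, which mirrors the point above: once a queue manager has learned where a clustered queue lives, routing a message no longer requires a round trip to the full repository. The sketch ignores how cached knowledge is refreshed when the cluster changes.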
The diagram below illustrates the configuration of a Multi-Cloud IBM MQ Cluster with full repositories located on premise:
Figure 5: Multi-Cloud IBM MQ Cluster with Full Repositories located on premise
Where there is an existing IBM MQ estate, an MQ Cluster may already exist, and clients often ask whether they should extend their existing cluster or establish a separate one. As the network configuration within the Multi-Cloud Cluster needs careful maintenance, we suggest using overlapping clusters. Two clusters overlap when at least one queue manager is a member of both and acts as a bridge between them. The diagram below illustrates this, with the on premise gateway queue managers acting as the bridge between the clusters:
Figure 6: Overlapping MQ Cluster with on premise solution
The diagram above reuses the Full Repository associated with the Multi-Cloud Cluster; individual clients may or may not decide to take this approach.
The overlapping cluster then allows an application in either cloud to communicate with its local queue manager and have its messages delivered to any application queue manager on premise. This traffic is funneled through the Gateway Queue Managers on premise, which act as the bridge between the two clusters. This has the advantage that direct communication between the Gateway Queue Managers in the cloud and the on premise application queue managers is NOT required.
Taking our scenario one step further, as the number of applications within the cloud increases, it may become necessary to establish separate MQ clusters in the individual cloud providers. The final topology may look similar to the following:
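The funneling effect can be sketched as a small reachability model in Python. The cluster and queue manager names below (`MULTICLOUD`, `ONPREM`, `GW.AWS`, `APP.QM1` and so on) are hypothetical, and the model captures only which queue managers share a cluster; it does not represent the MQ object definitions that a real bridge queue manager needs.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def find_route(
    clusters: Dict[str, Set[str]], source: str, target: str
) -> Optional[List[str]]:
    """Breadth-first search where two queue managers are linked if they share a cluster."""
    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            # Walk back through the predecessors to rebuild the path.
            path: List[str] = []
            node: Optional[str] = current
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for members in clusters.values():
            if current in members:
                for neighbour in sorted(members):
                    if neighbour not in previous:
                        previous[neighbour] = current
                        queue.append(neighbour)
    return None  # no chain of overlapping clusters connects the two


# Hypothetical topology: the on premise gateway is a member of both clusters.
clusters = {
    "MULTICLOUD": {"GW.IBM", "GW.AWS", "GW.ONPREM"},
    "ONPREM": {"GW.ONPREM", "APP.QM1", "APP.QM2"},
}
route = find_route(clusters, "GW.AWS", "APP.QM1")
```

Here `route` is `["GW.AWS", "GW.ONPREM", "APP.QM1"]`: the only path from the cloud gateway to the application queue manager passes through the on premise gateway, because the two never share a cluster directly.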
Figure 7: Overlapping MQ Cluster with all network zones
You can see that separate clusters have been established in the IBM and AWS clouds and overlapped with the Multi-Cloud cluster. This allows applications to communicate with a local queue manager within their cloud (this no longer needs to be the Gateway Queue Manager), and messages can be delivered across the entire messaging network. Within the new clusters you will notice that new Full Repositories have been added. Different options exist here, but for simplicity they are shown as two separate full repositories in each cloud.