Set up a reliable, high-performance distributed messaging infrastructure with Kafka
Implement a resource adapter to integrate Kafka with enterprise Java solutions
Because the world has gone mobile, apps must now make data available in real time. It is not only the final result stored in a database table that matters, but also every action a user takes while using an application. Whatever information is available, such as user clicks, log data, or sensor data, is used to enhance the user experience, generate reports, feed machine learning systems, and so on. Today, developers must focus on systems that are based on a real-time flow of events.
The following image shows an example of an architecture based on event stream processing.
Apache Kafka has established itself as the go-to solution for building highly scalable event-based systems. Kafka provides rapidly evolving capabilities for an event streaming platform that developers can use in modern business solutions. However, developers often need to integrate existing Java EE business solutions that are based on technologies like IBM MQ or IBM WebSphere Application Server into these new event streaming architectures.
Consider this example. An online store has a mobile application that uses Kafka to send payment request data to a distributed payment system implemented in Enterprise Java. The solution must absolutely guarantee complete processing of a payment request exactly once (to avoid multiple charges to the buyer). However, failures are inevitable within a distributed system, so the solution needs to handle the failures gracefully.
Implementing messaging using Apache Kafka
Apache Kafka is a distributed system used for event stream processing and is extensively used in microservices architectures and cloud-based environments. It provides messaging, storing, and processing of events, all inside the same platform.
The image below shows a basic topology of Apache Kafka components, consisting of producers and consumers exchanging messages via the Kafka cluster infrastructure.
Even though Kafka has many advantages, it struggles with issues such as:
- Manual compensation logic in case of message processing failure, which might lead to messages not being processed
- No support for XA transaction processing
- Ensuring exactly-once processing in consumer applications
- Extra development and maintainability effort for integrating it into enterprise solutions
To solve Kafka integration issues, you can apply traditional messaging topology concepts, like transaction logs, recovery logs, and XA transactions. You can implement a resource adapter that is based on the Java EE Connector Architecture (JCA). With this JCA resource adapter, you can provide the application server with ACID features for Kafka message processing. This JCA resource adapter then provides seamless Kafka integration with Enterprise Java applications.
Implementing a JCA resource adapter
The Java EE Connector Architecture defines a set of scalable, secure, and transactional mechanisms. You can install your JCA resource adapter into any Java EE compliant application server, such as IBM WebSphere Application Server, IBM Business Process Manager, JBoss, WebSphere Liberty, GlassFish, or WebLogic.
The Java EE Connector Architecture specification also provides a set of standard contracts that enable communication between enterprise applications and enterprise information systems like Kafka. The JCA resource adapter can plug into an application server and enable Kafka integration by taking care of all system-level mechanisms (transactions, connection management, crash recovery, error tracing, and logging). The JCA resource adapter hides all Kafka communication logic from the enterprise applications that need to integrate with it. By implementing the JCA resource adapter, the enterprise application provider can focus on implementing business and presentation logic rather than low-level logic related to Kafka integration. Therefore, the JCA resource adapter is developed once and reused by different applications.
Let’s relate this to our online store payment scenario by taking a look at the following diagram that shows the designed solution system context.
The mobile application sends payment request data to Kafka, which is integrated with the payment enterprise application through the Kafka resource adapter. In addition, payment notifications are also pushed to Kafka using the adapter. The adapter starts an XA transaction, which is propagated to the payment enterprise application and also to the notification system. Therefore, all the tasks that are related to payment request processing will run inside the same global transaction and complete or fail together. The design imposes setting up retry, dead letter, and transaction log topics on Kafka in addition to the topics that the data is read from or written to.
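As a concrete illustration of the topics this design requires, the adapter's supporting topics could be derived from each primary data topic with a simple naming convention. The suffixes below are an assumption for illustration only; the design does not mandate any particular names.

```java
import java.util.Arrays;
import java.util.List;

// Derives the names of the supporting topics the adapter needs alongside
// a primary data topic. The suffixes are illustrative assumptions; any
// consistent convention works.
public class AdapterTopics {
    public static List<String> forTopic(String baseTopic) {
        return Arrays.asList(
            baseTopic,              // primary data topic (read from / written to)
            baseTopic + ".retry",   // failed messages awaiting reprocessing
            baseTopic + ".dlq",     // dead letter topic after retries are exhausted
            baseTopic + ".tx-log"); // transaction log used for crash recovery
    }
}
```

A deployment script or administrator would create these topics on the Kafka cluster before activating the adapter.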
Let’s now explore the processing of messages sent from and to the mobile application in more detail.
The inbound flow
In our payment scenario, the inbound flow refers to the communication that is initiated by the online store mobile application, which sends the payment request data to Kafka. The resource adapter provides Kafka connectivity and asynchronously delivers messages to the message endpoints that exist on an application server. This can be achieved using the message inflow contract as defined by the JCA Specification.
The Kafka JCA resource adapter implements an Activation Specification JavaBean with a set of configuration properties that are used for endpoint activation configuration. These configuration details are defined as part of an application server configuration.
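A minimal sketch of such an Activation Specification JavaBean is shown below. A real implementation would also implement `javax.resource.spi.ActivationSpec`; that interface is omitted here to keep the example dependency-free, and the property names are illustrative assumptions rather than the adapter's actual configuration keys.

```java
// Sketch of an activation configuration JavaBean for the Kafka adapter.
// A real adapter class would implement javax.resource.spi.ActivationSpec;
// property names here are illustrative assumptions.
public class KafkaActivationSpec {
    private String bootstrapServers;
    private String topic;
    private String groupId;
    private int pollIntervalMs = 1000; // how often the adapter polls Kafka

    public String getBootstrapServers() { return bootstrapServers; }
    public void setBootstrapServers(String v) { bootstrapServers = v; }
    public String getTopic() { return topic; }
    public void setTopic(String v) { topic = v; }
    public String getGroupId() { return groupId; }
    public void setGroupId(String v) { groupId = v; }
    public int getPollIntervalMs() { return pollIntervalMs; }
    public void setPollIntervalMs(int v) { pollIntervalMs = v; }

    // Mirrors the intent of ActivationSpec#validate(): fail fast on
    // incomplete endpoint activation configuration.
    public void validate() {
        if (bootstrapServers == null || topic == null || groupId == null)
            throw new IllegalStateException(
                "bootstrapServers, topic, and groupId are required");
    }
}
```

The application server instantiates this bean from the deployment configuration and passes it to the adapter during endpoint activation.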
The resource adapter periodically polls a batch of payment requests from the inbound Kafka topic. After successfully polling the data, it iterates through the batch and asynchronously delivers messages to the endpoint instances. Multiple endpoint instances might exist for each message endpoint, enabling concurrent message consumption and high throughput.
The Kafka consumer offset is committed right after scheduling message delivery to avoid the problem of blocked batches. This design is viable because the resource adapter implements failover procedures through retry, dead letter, and transaction log topics that need to be set up on Kafka. In our case, the endpoint needs to support XA transactions, and a transaction context needs to be created before sending data to the endpoint, thus providing atomic message consumption.
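The commit-after-scheduling pattern can be sketched as follows. The record and endpoint types are simplified stand-ins for the real Kafka consumer records and JCA message endpoints; the point is that the offset to commit is computed as soon as delivery is scheduled, so one slow message cannot block subsequent polls.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Illustrates the "commit right after scheduling" pattern: the polled
// batch is handed to endpoint workers asynchronously, and the offset to
// commit is returned immediately. Types are simplified stand-ins for
// Kafka and JCA message endpoint types.
public class BatchDispatcher {
    private final ExecutorService endpointPool = Executors.newFixedThreadPool(4);

    /** Schedules each message for asynchronous endpoint delivery and
     *  returns the consumer offset that can be committed right away. */
    public long dispatch(List<String> batch, long firstOffset, Consumer<String> endpoint) {
        for (String message : batch) {
            endpointPool.submit(() -> endpoint.accept(message)); // concurrent delivery
        }
        // Commit position = offset just past the last record in the batch.
        return firstOffset + batch.size();
    }
}
```

Failures inside the workers are not lost: as described next, a failed delivery is forwarded to the retry topic rather than blocking the committed offset.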
If the transaction were to be aborted by the application server, all work that was done by the endpoint instance should be rolled back and the message should be forwarded to the Kafka retry topic.
The adapter consumes messages from the Kafka retry topic and reprocesses them. After the configured number of retries to process a message is exceeded, the adapter transfers that message to a Kafka dead letter topic. Because messages sent to the dead letter topic contain valuable business data, it is important to monitor the topic.
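The retry/dead-letter routing decision is simple enough to show directly. This is a hedged sketch: the topic suffixes and the idea of tracking an attempt count per message (for example, in a Kafka record header) are illustrative assumptions.

```java
// Decides where a failed message goes next: back to the retry topic, or,
// once the configured maximum number of retries is exceeded, to the dead
// letter topic. Topic suffixes and the attempt counter are illustrative
// assumptions.
public class FailoverRouter {
    private final int maxRetries;

    public FailoverRouter(int maxRetries) { this.maxRetries = maxRetries; }

    /** @param attempts how many times delivery has already been tried */
    public String routeOnFailure(String baseTopic, int attempts) {
        return attempts < maxRetries
            ? baseTopic + ".retry" // reprocess later
            : baseTopic + ".dlq";  // exhausted: park for manual inspection
    }
}
```

Because dead-lettered messages carry real business data, an operations team would typically alert on any message landing in the `.dlq` topic.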
The outbound flow
The outbound flow refers to Kafka communication initiated by an enterprise application. In our case, this is the notification system that is used for sending payment confirmations to the mobile application. The JCA Specification defines a connection management contract that enables an application server to pool Kafka connections, which provides a scalable environment that can support a large number of clients.
Kafka outbound connection configuration details are defined using a Managed Connection Factory JavaBean. Using these configuration details, the adapter enables administrators and developers to configure the Kafka producer and decide on features such as reliability, availability, throughput, latency, and transaction support. These configuration details are defined as part of the application server configuration.
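The kind of producer configuration such a Managed Connection Factory might assemble is sketched below. The keys are standard Kafka producer settings; the helper class, method, and the `transactional.id` value are assumptions for illustration.

```java
import java.util.Properties;

// Sketch of the producer configuration a ManagedConnectionFactory might
// build for reliable, transactional delivery. The keys are standard Kafka
// producer settings; this helper class itself is an illustrative assumption.
public class OutboundProducerConfig {
    public static Properties forTransactionalProducer(String bootstrapServers, String txId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("acks", "all");                // reliability: wait for all in-sync replicas
        props.put("enable.idempotence", "true"); // no duplicates on producer retries
        props.put("transactional.id", txId);     // enables Kafka transactions (XA support)
        props.put("key.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
            "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }
}
```

Tuning these keys is how an administrator trades off throughput and latency against reliability and transaction support without touching application code.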
The Kafka JCA resource adapter exposes a Kafka Connection Factory and Kafka Connection that implement CCI (Common Client Interface) as well as JMS (Java Message Service) interfaces. An application component looks up a connection factory using the JNDI (Java Naming and Directory Interface) name. After successfully obtaining the factory, the application uses it for getting a connection for accessing Kafka. As a result, you can add a Kafka integration seamlessly for the notification system application, which currently sends data to JMS messaging providers like IBM MQ or Active MQ.
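The client-side usage pattern (look up a factory, obtain a connection, send, close) can be sketched as follows. The interfaces here are hypothetical stand-ins for the adapter's CCI/JMS-style API, and the topic name is illustrative; in a real application, the factory would be obtained with a JNDI lookup such as `new InitialContext().lookup("jca/KafkaConnectionFactory")` rather than passed in.

```java
// Illustrates the client-side pattern: obtain a connection factory
// (normally via a JNDI lookup), get a connection, send, and close. The
// interfaces are hypothetical stand-ins for the adapter's CCI/JMS API.
interface KafkaConnectionFactory {
    KafkaConnection getConnection();
}

interface KafkaConnection extends AutoCloseable {
    void send(String topic, String payload);
    void close(); // returns connection to the server's pool
}

public class NotificationSender {
    private final KafkaConnectionFactory factory; // normally looked up via JNDI

    public NotificationSender(KafkaConnectionFactory factory) {
        this.factory = factory;
    }

    public void sendConfirmation(String payload) {
        // try-with-resources returns the pooled connection on close
        try (KafkaConnection conn = factory.getConnection()) {
            conn.send("payment-notifications", payload); // topic name is illustrative
        }
    }
}
```

Because the application only sees the factory and connection interfaces, swapping a JMS provider such as IBM MQ for Kafka is a configuration change, not a code change.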
The resource adapter outbound flow encapsulates low-level Kafka communication logic and provides the following:
- Connection pooling
- Kafka transactional mechanisms that guarantee exactly-once delivery
- Identification, logging, and graceful handling of Kafka failures
- XA transactions, which provide reliable message processing with Kafka in distributed systems
To manage transactions inside the outbound flow, the Kafka resource adapter uses the transaction management contract defined by the JCA Specification.
In our case, the connection factory needs to be set to support XA transactions, and the adapter needs to start a Kafka transaction when the client obtains a connection. The Kafka transaction is to be aborted if the application server rolls back the transaction at any point. In the case when an XA transaction commit takes place, the transaction manager performs a two-phase commit protocol across all the resources that are taking part in the running transaction. This guarantees that all read/write access to the managed resources is entirely committed or rolled back.
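The way the adapter could map XA callbacks onto a Kafka transaction can be sketched as a small state machine. This is an illustrative model only: a real adapter implements `javax.transaction.xa.XAResource` and delegates to a transactional Kafka producer (`beginTransaction`, `commitTransaction`, `abortTransaction`), which the comments indicate.

```java
// Minimal sketch of mapping XA two-phase-commit callbacks onto a Kafka
// transaction. A real adapter implements javax.transaction.xa.XAResource
// and delegates to a transactional KafkaProducer; this state machine is
// an illustrative model of that lifecycle.
public class KafkaXaBridge {
    public enum State { IDLE, ACTIVE, PREPARED, COMMITTED, ROLLED_BACK }

    private State state = State.IDLE;

    public void start() {                 // XAResource#start
        state = State.ACTIVE;             // producer.beginTransaction()
    }

    public void prepare() {               // XAResource#prepare (phase 1)
        if (state != State.ACTIVE) throw new IllegalStateException("not active");
        state = State.PREPARED;           // flush and record intent in the tx-log topic
    }

    public void commit() {                // XAResource#commit (phase 2)
        if (state != State.PREPARED) throw new IllegalStateException("not prepared");
        state = State.COMMITTED;          // producer.commitTransaction()
    }

    public void rollback() {              // XAResource#rollback
        state = State.ROLLED_BACK;        // producer.abortTransaction()
    }

    public State state() { return state; }
}
```

The prepare step is where the transaction log record described below would be written, so that crash recovery can decide whether an in-doubt Kafka transaction should be committed or aborted.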
Lastly, the resource adapter keeps track of running transactions by writing transaction data into a Kafka transaction log topic. The data that is written to the transaction log topic is used for crash recovery processing, which provides reliable message processing in a distributed system.
This approach to designing a Kafka JCA adapter provides "plug and play" JMS integration with a Kafka event processing platform for standard Enterprise Java solutions. The design enables you to seamlessly integrate Kafka with existing enterprise applications with no need to implement compensation logic. The adapter also enables an application server to provide the infrastructure and runtime environment for Kafka connectivity and transaction management that enterprise applications rely on.