Bridging the Gap: Direct COBOL Integration with Kafka on z/OS via the IBM SDK
IBM recently announced the availability of Kafka for native z/OS applications, called IBM Open Enterprise SDK for Apache Kafka. I have personally been looking forward to this announcement from IBM for some time, and I’m happy to see it available now. In this blog, I will discuss a high-level overview of this offering and how it will benefit mainframe shops.
Chapter 1: What is Kafka?
Apache Kafka is an open-source software platform used to handle real-time data streams. Imagine you have a lot of messages or data coming in from different sources, like websites, apps, or sensors. Kafka efficiently collects, stores, and moves this data. Organizations use Kafka to process and analyze data in real time, enabling them to make faster decisions and improve their services. According to usage data from Enlyft (see references), around 50,000 organizations use Kafka, including big names like Uber, Netflix, and Tesla. The image below shows where Apache Kafka fits in.
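To make the "collects, stores, and moves" idea concrete, here is a toy Python sketch of Kafka's core abstraction: a topic is an append-only log, and each consumer reads from it at its own offset. This is purely illustrative and uses none of the real Kafka API.

```python
# Toy illustration of Kafka's core abstraction (NOT the real Kafka API):
# a topic is an append-only log, and consumers track their own offsets.
class ToyTopic:
    def __init__(self):
        self.log = []  # append-only list of records

    def produce(self, record):
        """Append a record and return its offset in the log."""
        self.log.append(record)
        return len(self.log) - 1

    def consume(self, offset):
        """Return all records from the given offset onward."""
        return self.log[offset:]

topic = ToyTopic()
topic.produce("order-created")
topic.produce("order-shipped")
print(topic.consume(0))  # ['order-created', 'order-shipped']
print(topic.consume(1))  # ['order-shipped']
```

Because the log is durable and consumers remember only an offset, many independent consumers can read the same stream at their own pace, which is what makes Kafka suitable for fan-out to multiple downstream systems.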
Chapter 2: Kafka integration with z/OS apps — before the SDK
The question is, was there a way to integrate z/OS applications with Kafka before this SDK was released? The answer is yes, but it required intermediaries to connect to Kafka. Below are some of the options that existed earlier.
I. MQ Connector to Kafka: For z/OS applications using IBM MQ, the Kafka Connect source connector for MQ copies data from IBM MQ into Apache Kafka.
II. Java on CICS: Data could be written from Java applications running in CICS Liberty using the Kafka client API. The client would send messages to the Kafka server through these APIs.
III. Change Data Capture: Replicate data from databases or VSAM files to Kafka using the IBM InfoSphere CDC solution. I have personally used this approach to replicate Db2 for z/OS data to Kafka.
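For option I above, the integration is typically configured through Kafka Connect. As a hedged sketch, a minimal properties file for IBM's kafka-connect-mq-source connector might look like the following; the queue manager name, host, channel, queue, and topic below are placeholders, and the property names should be verified against the documentation of the connector version you deploy.

```properties
name=mq-source
connector.class=com.ibm.eventstreams.connect.mqsource.MQSourceConnector
tasks.max=1
# Placeholder connection details for the z/OS queue manager
mq.queue.manager=QM1
mq.connection.name.list=mainframe.example.com(1414)
mq.channel.name=KAFKA.SVRCONN
mq.queue=TO.KAFKA.QUEUE
mq.record.builder=com.ibm.eventstreams.connect.mqsource.builders.DefaultRecordBuilder
# Kafka topic that receives the copied messages
topic=mainframe-events
```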
Chapter 3: What changes with the IBM SDK for Kafka
Although over 70% of z/OS applications are written in COBOL (with an estimated 800 billion lines of COBOL code in production today), COBOL developers have been unable to interface with Kafka without relying on the intermediaries mentioned above. That changes now! The IBM Open Enterprise SDK for Apache Kafka provides a direct connection between the mainframe and Kafka clusters, enabling native z/OS applications to publish and consume events without any intermediaries.
This offering enables z/OS application developers to read and publish Kafka events directly from native COBOL code. Developers can easily convert z/OS COBOL copybook layouts into JSON event formats, making the events consumable regardless of the consumer's programming language or runtime environment. The package lets clients seamlessly integrate Kafka event processing into their existing z/OS applications.
Kafka Producer APIs
These APIs allow an application to publish a stream of records to a Kafka topic. They can be called from COBOL or C/C++ source code.
Kafka Consumer APIs
These APIs enable an application to subscribe to one or more topics and to ingest and process the streams of records stored in them. The APIs can handle records in real time or replay past records, and they can be called from COBOL or C/C++ source code.
Data Transformation Utility
This utility facilitates the transformation between native COBOL copybooks and JSON event formats, simplifying the use of Kafka APIs.
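To illustrate the kind of mapping such a utility performs, here is a minimal Python sketch that converts a hypothetical fixed-layout copybook record into a JSON event. The copybook layout, field names, and function below are invented for illustration only; the actual utility's interface and capabilities are documented by IBM.

```python
import json

# Hypothetical copybook layout (invented for illustration):
#   01 ORDER-REC.
#      05 ORDER-ID   PIC X(8).
#      05 CUSTOMER   PIC X(10).
#      05 AMOUNT     PIC 9(5)V99.
def copybook_to_json(record: str) -> str:
    """Map one fixed-layout record to a JSON event (illustrative only)."""
    fields = {
        "orderId": record[0:8].strip(),
        "customer": record[8:18].strip(),
        # PIC 9(5)V99 has an implied decimal point: last 2 digits are cents
        "amount": int(record[18:25]) / 100,
    }
    return json.dumps(fields)

print(copybook_to_json("ORD00001JOHN DOE  0012345"))
# {"orderId": "ORD00001", "customer": "JOHN DOE", "amount": 123.45}
```

The real utility works from the copybook definition itself rather than hand-coded offsets, but the essence is the same: fixed-position, implied-decimal mainframe fields become self-describing JSON that any Kafka consumer can parse.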
The image below depicts the components of the SDK and the flow from producer to consumer.
Below is a small snippet of COBOL code that publishes a message to a Kafka topic. Please note that some configurations are required to reach this point, which are not covered here.
* Create PRODUCE
  MOVE 'TEST MESSAGE FROM zOS TO KAFKA' TO MSG
  MOVE FUNCTION E2ACONV(MSG) TO MESSAGE-A
  SET MSG-PTR TO ADDRESS OF MESSAGE-A
  MOVE FUNCTION KAFPROD(KAF-TOPIC-REF PARTITION MSGFLAGS
       MSG-PTR MSG-LEN PROD-KEY KEY-LEN MSG-OPAQUE)
       TO PRODUCE-RES
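Note the E2ACONV call in the snippet above: it converts the EBCDIC message text before the producer API is invoked, since z/OS stores text in EBCDIC while Kafka consumers generally expect ASCII/UTF-8. The difference between the two encodings can be illustrated in Python with the cp037 codec (one common EBCDIC code page; the code page in use on a given system may differ):

```python
# EBCDIC vs ASCII: the same characters map to different byte values.
# cp037 is the EBCDIC US/Canada code page; other systems may use
# a different EBCDIC variant.
msg = "KAFKA"
ebcdic_bytes = msg.encode("cp037")
ascii_bytes = msg.encode("ascii")

print(ebcdic_bytes.hex())  # d2c1c6d2c1 (EBCDIC byte values)
print(ascii_bytes.hex())   # 4b41464b41 (ASCII byte values)

# Decoding with the matching code page recovers the original text,
# which is what a conversion step like E2ACONV makes possible.
assert ebcdic_bytes.decode("cp037") == msg
```

Without such a conversion, a downstream consumer would receive bytes that decode to gibberish under ASCII/UTF-8, which is why the conversion happens before the message is handed to the producer API.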
I recommend checking out Madhu's blog for a deep dive into the technical details of this offering: https://medium.com/@madhuba07/modernizing-mainframe-applications-with-kafka-and-the-open-enterprise-sdk-d96af5d63a54
Chapter 4: IBM MQ and Kafka
IBM MQ, formerly known as WebSphere MQ, is an enterprise messaging solution that provides a secure and reliable messaging infrastructure. It enables programs on distributed systems to communicate effectively with mainframe applications. MQ specializes in message queuing, ensuring that messages are delivered safely across various contexts.
One important topic to address in the context of mainframe integration software is the comparison between IBM MQ and Apache Kafka. Below is a high-level overview of the differences between the two:
So, if the question is which one is preferred, the table above largely answers it. In short, both have their unique strengths: Kafka shines in scenarios requiring high throughput and real-time data processing, whereas IBM MQ is designed for environments that prioritize message delivery reliability and security.
Chapter 5: Conclusion
A decade or two ago, mainframes were not as interconnected with the outside world as they are today. The rise of hybrid clouds has made mainframe integration with distributed applications almost mandatory. As Kafka emerged as the go-to solution for data integration, mainframes could not be left behind. While there were other methods for writing mainframe data to Kafka (as discussed in this blog), a direct integration between COBOL programs and Kafka was lacking. The release of the IBM SDK for Kafka on z/OS marks a significant step in bridging this gap. I'm excited to explore this offering in the coming days.
References:
(1) https://www.ibm.com/products/open-enterprise-sdk-apache-kafka
(2) https://enlyft.com/tech/products/apache-kafka
(3) https://www.ibm.com/products/mq
Disclaimer: The views and opinions expressed in this blog are solely those of the author and do not necessarily reflect the official policy or position of the author’s employer or any other organization. The information provided is based on personal experience and research and is intended for informational purposes only.