Making effective use of data is critical to business success, and faster data processing improves an organization’s ability to react to business events in real time. As a result, organizations are bringing together new types of data from a variety of internal and external sources for real-time or near-real-time analytics. This can involve building data lakes and information hubs, often on public clouds, fed by real-time streaming technologies, to process and gain value from this variety of data. All of these trends drive a growing need for capabilities that can efficiently feed data into information hubs, data lakes and data warehouses and then quickly process large data sets. These capabilities enable quick responses to changing business events, better engagement with clients, and more.

As organizations have struggled to manage the ingestion of rapidly changing structured operational data, a pattern has emerged in which data is first delivered to Kafka-based information hubs and leveraged from there.

Kafka was conceived as a distributed streaming platform. It provides a very low-latency pipeline that enables real-time event processing, movement of data between systems and applications, and real-time transformation of data. However, Kafka is more than just a pipeline; it can also store data. Kafka-based information hubs go well beyond feeding a data lake; they also deliver continuously changing data for downstream data integration with everything from the cloud to AI environments and more.
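
As a simple illustration of Kafka's dual pipeline-and-storage role, the minimal sketch below uses the open-source kafka-python client to replay events retained in a topic from the beginning and then keep consuming new ones as they arrive. The broker address and the topic name "orders-changes" are assumptions made for this example, not part of any IBM product.

```python
# Minimal sketch using the open-source kafka-python client.
# Assumes a broker at localhost:9092 and a hypothetical topic
# named "orders-changes"; adjust both for your environment.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders-changes",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",  # replay stored events from the beginning
    value_deserializer=lambda v: v.decode("utf-8"),
)

# Because Kafka retains events on disk, this loop can first replay
# history and then continue receiving new events in near real time.
for message in consumer:
    print(f"partition={message.partition} offset={message.offset} value={message.value}")
```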

To help organizations deliver transactional data from the OLTP databases that power their mission-critical business applications into Kafka-based information hubs, IBM® Data Replication provides a Kafka target engine that applies data into Kafka with very high throughput. The Kafka target engine is fully integrated with all of the IBM data replication low-impact, log-based capture engines for a wide variety of sources, including Db2® for z/OS®; Db2 for iSeries; Db2 for UNIX, Linux® and Windows; Oracle; Microsoft SQL Server; PostgreSQL; MySQL; Sybase; Informix®; and even IBM Virtual Storage Access Method (VSAM) and Information Management System (IMS).
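
To make the idea of landing change data in Kafka concrete, here is an illustrative sketch that publishes a hand-built JSON change record with the open-source kafka-python client. The envelope fields and the topic name are hypothetical stand-ins for whatever format a log-based capture engine actually emits; they are not IBM's wire format.

```python
# Illustrative sketch only: a hand-built JSON envelope standing in
# for the kind of record a log-based capture engine might emit.
# Field names and topic are hypothetical, not IBM's actual format.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

change_record = {
    "operation": "UPDATE",  # INSERT / UPDATE / DELETE from the source log
    "table": "SALES.ORDERS",
    "before": {"ORDER_ID": 1001, "STATUS": "NEW"},
    "after": {"ORDER_ID": 1001, "STATUS": "SHIPPED"},
}

# Keying by table name keeps each table's changes ordered within a partition.
producer.send("cdc-orders", key=b"SALES.ORDERS", value=change_record)
producer.flush()
```

Keying messages by source table is one common way to preserve per-table ordering in a Kafka topic; a replication product's actual partitioning strategy is typically configurable.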

If the requirement does not involve delivery to Kafka, the IBM data replication portfolio also provides a comprehensive solution for delivering data to other targets, such as databases, Hadoop, files and message queues.

There is often little room for latency when delivering the data that will optimize decision making or provide better services to your customers. You therefore need a data replication capability that can incrementally replicate changes captured from database logs in near real time. The data that IBM replication lands in Kafka can then feed streaming analytics, a data lake, and more.
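
As a hint of what such downstream processing can look like, the sketch below keeps a running tally of change operations per source table as events arrive. It assumes the same hypothetical JSON envelope and topic as the producer sketch above; real replicated record formats will differ.

```python
# Minimal sketch of streaming analytics over change events landed in Kafka.
# Topic name and JSON fields follow the hypothetical envelope above.
import json
from collections import Counter

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "cdc-orders",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

ops_per_table = Counter()
for message in consumer:
    record = message.value
    ops_per_table[(record["table"], record["operation"])] += 1
    print(dict(ops_per_table))  # running count of changes by table and operation
```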

Learn more

To see how you can use IBM Data Replication for optimized incremental delivery of transactional data to feed your Hadoop-based data lakes or Kafka-based data hubs, read the IBM Data Replication for Big Data solution brief. You can also read this blog to learn more about, and register for, a planned fully managed replication service on IBM Cloud® infrastructure that will address real-time replication for cloud-to-cloud and on-premises-to-cloud use cases.
