
Published: 30 April 2024
Contributors: Chrystal R. China, Michael Goodwin

What is event streaming?

Event streaming is the practice of capturing real-time data from applications, databases and IoT devices and transporting it to various destinations for immediate processing and storage, or for real-time analysis and analytics reporting.

A key function in event stream processing (ESP), event streaming enables IT infrastructures to handle large, continuous streams of events by processing data when the event or change happens.

Event streaming often serves as a complement to batch processing, which acts on large, static data sets (or “data at rest”). However, instead of processing data in batches, event streaming processes single data points as they emerge so that software within the architecture can interpret and respond to streams of data (“data in motion”) in real time.

High-performance event streaming services can power a range of both simple and complex tasks, from sending notifications when stock or product prices change to building real-time machine learning models that detect suspicious user activity. Even in the case of batch processing, event streaming can add depth to data analytics by connecting events with their respective timestamps and identifying historical trends.

What is an event?

Event streaming revolves around the unbounded, sequential and real-time flow of data records called “events”: foundational data structures that record any occurrence in a system or environment. A “stream” (also called a data stream or streaming data) is the continuous delivery of those events.

Each event typically comprises a key that identifies the event or the entity it pertains to, a value that holds the actual data of the event, a timestamp that indicates when the event occurred or was recorded and, sometimes, metadata about the data source, schema version or other attributes.
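
In code, an event can be modeled as a simple record. The following Python sketch is purely illustrative; the field names and the sample banking payload are hypothetical:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Event:
    """A single event record: key, value, timestamp and optional metadata."""
    key: str          # identifies the event or the entity it pertains to
    value: dict       # the actual data of the event
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )                 # when the event occurred or was recorded
    metadata: dict = field(default_factory=dict)  # source, schema version, etc.

# A hypothetical deposit event
event = Event(
    key="account-42",
    value={"type": "deposit", "amount": 250.00, "currency": "USD"},
    metadata={"source": "mobile-app", "schema_version": "1.2"},
)
```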

With the help of specialized stream processing engines, events can undergo a few different processes within a stream. “Aggregations” perform calculations on the data, such as means, sums and standard deviations. “Ingestion” adds streaming data to databases. Analytics processing uses patterns in streaming data to predict future events, and enrichment processing combines data points with other data sources to provide context and create meaning.
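
A minimal Python sketch of two of these processes, aggregation and enrichment, run over an in-memory stand-in for a stream (all names and sample data are hypothetical):

```python
def running_mean(events):
    """Aggregation: emit the running mean of each event's 'amount' field."""
    total, count = 0.0, 0
    for event in events:
        total += event["amount"]
        count += 1
        yield total / count

def enrich(events, accounts):
    """Enrichment: join each event with reference data to add context."""
    for event in events:
        yield {**event, "owner": accounts.get(event["key"], "unknown")}

events = [{"key": "account-42", "amount": 250.0},
          {"key": "account-7", "amount": 100.0}]
accounts = {"account-42": "Alice", "account-7": "Bob"}

print(list(running_mean(events)))    # [250.0, 175.0]
print(list(enrich(events, accounts)))
```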

Events are often tied to business operations or user navigation processes and typically trigger another action, process or series of events. Take online banking as one example.

When a user clicks “transfer” to send money from one bank account to another, the funds are withdrawn from the sender’s account and added to the recipient’s account; email or SMS notifications are sent to one or both parties; and, if necessary, security and fraud prevention protocols are deployed.
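
A hedged sketch of that kind of event chain, using the open source kafka-python client as an assumed example (the broker address, topic names and payload fields are hypothetical):

```python
import json
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
consumer = KafkaConsumer(
    "transfers",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# Each consumed "transfer" event triggers a series of follow-on events.
for message in consumer:
    transfer = message.value
    producer.send("notifications", {"to": transfer["sender"], "channel": "sms"})
    producer.send("notifications", {"to": transfer["recipient"], "channel": "email"})
    if transfer["amount"] > 10_000:
        producer.send("fraud-checks", transfer)  # deploy fraud prevention checks
```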

Key components of event streaming

Events are, of course, the central component of event streaming; however, a series of other components enable streaming services to process events as quickly and effectively as they do. Other vital components include:

Brokers

Brokers, or message brokers, are the servers that run event streaming platforms. Message brokers enable applications, systems and services to communicate with each other and exchange information by converting messages between formal messaging protocols. This allows interdependent services to “talk” with one another, even if they are written in different languages (Java or Python, for example) or implemented on different platforms. It also facilitates the decoupling of processes and services within systems.

Brokers can validate, store, route and deliver messages to the appropriate destinations. In distributed event streaming systems, brokers ensure low latency and high availability by replicating events across multiple nodes. Brokers can also form clusters—sets of brokers working together for easier load balancing and scalability.

Topics

Topics are categorizations or feed names to which events are published, providing a way to organize and filter events within the platform. They act as the "subject" for events, allowing consumers to subscribe to topics and receive only relevant events.
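
For example, with the kafka-python client (an assumption; the topic names are hypothetical), a producer publishes events to named topics and a consumer receives only the topics it subscribes to:

```python
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("orders", key=b"order-1001", value=b'{"status": "created"}')
producer.send("payments", key=b"order-1001", value=b'{"status": "captured"}')
producer.flush()

# This consumer subscribes to "orders" only, so it never sees "payments" events.
consumer = KafkaConsumer("orders", bootstrap_servers="localhost:9092")
for message in consumer:
    print(message.topic, message.value)
```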

Partitions

Topics can be further divided into partitions, allowing multiple consumers to read from a topic simultaneously while the order of events within each partition is preserved.
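
The idea can be sketched in plain Python: routing events by a hash of their key keeps all events for a given key in the same partition, so their relative order is preserved (the partition count is hypothetical; real clients use a stable hash such as murmur2):

```python
def partition_for(key: str, num_partitions: int) -> int:
    """Map an event key to a partition, as a keyed partitioner would."""
    return hash(key) % num_partitions

for key in ["account-42", "account-7", "account-42"]:
    print(key, "-> partition", partition_for(key, 3))
# "account-42" always maps to the same partition, so its events stay in order
# even while consumers read the topic's partitions in parallel.
```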

Offsets

An offset is a unique identifier for each event within a partition, marking the position of an event within the sequence. Consumers use offsets to track the events they’ve processed. If, for instance, a consumer disconnects from a stream and later reconnects, it can resume processing from the last known offset.
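
A hedged sketch of that recovery pattern with the kafka-python client (an assumption; the topic, partition and stored offset are hypothetical):

```python
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="localhost:9092",
                         enable_auto_commit=False)
tp = TopicPartition("transfers", 0)
consumer.assign([tp])

last_processed = 41           # e.g., reloaded from durable storage after a restart
consumer.seek(tp, last_processed + 1)   # resume at the next unread event

for message in consumer:
    print(message.offset, message.value)  # offsets increase monotonically
```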

How does event streaming work?

Given the proliferation of data—and the resulting surge of data traffic—event streaming is an essential component of modern data architectures, especially in environments that require lightning-fast decision-making capabilities or in organizations looking to automate decision-making responsibilities.

Here’s how event streaming services manage event data (a minimal end-to-end sketch follows the list):

  1. Event generation. Event streaming starts when producers (microservices, backend systems, IoT ecosystems or event-driven APIs, for instance) send events to the event streaming platform.

  2. Event publishing. Using client libraries, producers publish events to a specific topic within the platform, at which point the events become available for consumers (the apps and services that subscribe to the topic).

  3. Event storage. The platform stores the events for a predetermined period or until they are consumed, with brokers managing the event storage and retrieval processes.

  4. Event consumption. Consumers process event data to initiate other events. Depending on their purpose or configuration, they can act upon events immediately as they arrive (real-time processing), store them for later processing or compile them for batch processing. Consumers also track their position in the data pipeline by using offsets so that they can resume processing from where they left off in the event of a failure or restart.

  5. Event delivery. The event broker delivers events to all subscribed consumers. Delivery semantics can include "at least once" delivery (where events are guaranteed to be delivered but might be duplicated), "exactly once" delivery (where each event is delivered once and only once) or "at most once" delivery (where events might be dropped but are never duplicated).

  6. Event processing. Once events have been delivered and consumed, the event data can be processed for several downstream actions, including transforming the data, aggregating it and triggering complex event processing (CEP) workflows.
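
A minimal end-to-end sketch of these six steps, using the kafka-python client as an assumed example (the broker address, topic and group name are hypothetical):

```python
import json
from kafka import KafkaConsumer, KafkaProducer

# 1-2. A producer generates an event and publishes it to a topic.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("sensor-readings", {"sensor": "s-17", "temp_c": 21.4})
producer.flush()  # 3. the broker stores the event until it is consumed or expires

# 4-5. A subscribed consumer receives the event and tracks its offset.
consumer = KafkaConsumer(
    "sensor-readings",
    bootstrap_servers="localhost:9092",
    group_id="monitoring",            # offsets are committed per consumer group
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    # 6. Downstream processing: transform, aggregate or trigger CEP workflows.
    print(f"offset={message.offset} value={message.value}")
```
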
Event streaming features

In addition to standard streaming and processing, event streaming platforms (like Amazon Kinesis, Google Pub/Sub, Azure Event Hubs and IBM Event Automation, which uses the processing power of the open source Apache Kafka platform) facilitate a range of streaming practices that enhance functionality.

Exactly-once processing

Exactly-once delivery semantics ensures that each event in a stream is processed exactly once, an essential feature for preventing duplicate and lost stream events. Many event streaming systems include mechanisms to provide exactly-once semantics, regardless of failures elsewhere in the system.
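
With Apache Kafka, for instance, one route to exactly-once semantics is an idempotent, transactional producer. A hedged sketch with the confluent-kafka Python client (an assumption; the transactional ID and topic are hypothetical):

```python
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "transfer-processor-1",  # implies an idempotent producer
})
producer.init_transactions()

producer.begin_transaction()
try:
    producer.produce("transfers", key="account-42", value=b'{"amount": 250}')
    producer.commit_transaction()   # all or nothing: no duplicates, no losses
except Exception:
    producer.abort_transaction()    # on failure, nothing becomes visible
```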

Backpressure

When downstream components can't keep up with the incoming event rate, backpressure prevents streams from overwhelming the system. With backpressure, a data flow control mechanism, consumers can signal producers to throttle or stop data production when they’re overwhelmed with data processing or unable to keep up with incoming events.

This process allows systems to gracefully handle workloads by buffering or dropping incoming events—instead of disrupting the entire system—so that event processing remains stable as workloads fluctuate.
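
A minimal backpressure sketch in plain Python: a bounded queue blocks the producer whenever the consumer falls behind, throttling production instead of losing data (the capacity and the slow processing step are hypothetical):

```python
import queue
import threading
import time

buffer = queue.Queue(maxsize=100)  # the bounded capacity is the backpressure signal

def produce():
    for i in range(1_000):
        buffer.put(i)              # blocks (throttles) when the buffer is full

def consume():
    while True:
        event = buffer.get()
        time.sleep(0.01)           # a deliberately slow downstream step
        buffer.task_done()

threading.Thread(target=produce).start()
threading.Thread(target=consume, daemon=True).start()
buffer.join()                      # wait until every event has been processed
```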

Consumer groups

Event consumers often work as part of a consumer group to accelerate event consumption. Each consumer in a consumer group is assigned a subset of partitions to process, parallelizing consumption for greater efficiency. If one consumer within the group fails or needs to be added or removed, the platform can dynamically reassign partitions to maintain balance and fault tolerance.
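
For example, with the kafka-python client (an assumption; the topic and group names are hypothetical), every consumer started with the same group_id shares the topic’s partitions, so running this script in two processes splits the partitions between them:

```python
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "transfers",
    bootstrap_servers="localhost:9092",
    group_id="fraud-detectors",   # same group, so partitions are divided up
)
for message in consumer:
    print(f"partition={message.partition} offset={message.offset}")
```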

Watermarking

Event streaming often means processing data in a time-sensitive manner. Watermarking enables progress tracking (by using event time) in stream processing systems; it enforces a completeness threshold that indicates when the system can consider event data fully processed. Watermarking can also come in handy for ensuring accuracy in time-based processing and for reconciling out-of-order events.
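
A sketch of one simple watermark policy in plain Python: the watermark trails the largest event time seen so far by an allowed lateness, and a window can be considered complete once the watermark passes its end (the five-second bound is hypothetical):

```python
ALLOWED_LATENESS = 5  # seconds

def with_watermarks(events):
    """Yield (event, watermark) pairs for events carrying an 'event_time'."""
    max_event_time = float("-inf")
    for event in events:
        max_event_time = max(max_event_time, event["event_time"])
        yield event, max_event_time - ALLOWED_LATENESS

events = [{"event_time": 100}, {"event_time": 107}, {"event_time": 103}]
for event, watermark in with_watermarks(events):
    print(event, "watermark:", watermark)
# The out-of-order event (t=103) still carries a timestamp ahead of the
# watermark (102), so it can be reconciled rather than treated as late.
```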

Data retention and compaction

Most event streaming platforms offer customizable data retention policies that allow developers to control how long events are available for consumption. Data compaction, by contrast, is a process that removes redundant or obsolete records from topics, keeping the storage footprint minimal while preserving essential data.
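
A hedged sketch of both settings using the confluent-kafka admin client (an assumption; the topic names and values are hypothetical):

```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})
admin.create_topics([
    # Time-based retention: events expire after seven days.
    NewTopic("clickstream", num_partitions=3, replication_factor=1,
             config={"retention.ms": str(7 * 24 * 60 * 60 * 1000)}),
    # Compaction: keep only the latest event per key, indefinitely.
    NewTopic("account-balances", num_partitions=3, replication_factor=1,
             config={"cleanup.policy": "compact"}),
])
```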

It’s worth noting, again, that standard streaming architectures typically decouple event producers, event brokers and consumers, so that components can be scaled and maintained independently.  

Event streaming use cases

Event streaming is a powerful concept that allows organizations to use data as it’s generated, creating more responsive and intelligent systems. With the rise of data-driven decision-making, event streaming is an increasingly important component in modern software architectures.

As such, event streaming technologies have a range of use cases across business sectors, including:

Banking and financial services

Financial institutions can use event streaming services to process market data in real time, enabling algorithmic trading systems to make split-second decisions based on up-to-the-minute market conditions. Event streaming’s real-time monitoring capabilities also help institutions quickly identify and address fraud and security risks.

Manufacturing

Event streaming can facilitate supply chain optimization by allowing manufacturers to track materials and products as they move through the supply chain to identify bottlenecks and process inefficiencies. Furthermore, by streaming data from IoT/IIoT sensors on machinery, managers can predict when and why equipment might fail and perform preventive maintenance or predictive maintenance to avoid unplanned downtime.

Gaming and entertainment

Online gaming platforms can use event streaming services to track player actions and game state changes, which can be used to run game analytics, enforce anti-cheating policies and increase player engagement. Streaming platforms can leverage event data to provide personalized content recommendations for users and create a tailored customer experience.

Transportation and logistics

Event streaming can be used to track vehicle location and status, enabling real-time routing based on traffic conditions, delivery schedules and vehicle performance. Logistics companies can similarly use event data from scanning devices and GPS trackers to provide customers with real-time updates on the status of their e-commerce deliveries.

Use in event-driven architectures and other patterns

Beyond specific industries, event streaming can also be useful when deployed in concert with other technologies and architectures. For instance, event streaming is sometimes associated with patterns like event sourcing and command query responsibility segregation (CQRS).

Event sourcing is an architectural pattern wherein changes to the app state are stored as a sequence of events. Used alongside event streams, event sourcing allows the streaming system to replay these events to reconstruct the state of an entity at any point in time or to drive other components of the system.
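
A minimal event-sourcing sketch in plain Python: the current state is never stored directly but is reconstructed by replaying the event log up to any point (the event types and amounts are hypothetical):

```python
events = [
    {"type": "deposited", "amount": 500},
    {"type": "withdrawn", "amount": 120},
    {"type": "deposited", "amount": 75},
]

def replay(events):
    """Fold a sequence of events into the current state of the entity."""
    balance = 0
    for event in events:
        if event["type"] == "deposited":
            balance += event["amount"]
        elif event["type"] == "withdrawn":
            balance -= event["amount"]
    return balance

print(replay(events))      # 455: the state at the end of the log
print(replay(events[:2]))  # 380: the state at an earlier point in time
```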

In a CQRS architectural pattern, the system is split into two different models: one that handles commands (writes) and one that handles queries (reads). Event streaming can be used in CQRS to propagate changes from the write model to the read model in real time, enabling asynchronous integration between the two.
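
A sketch of that wiring in plain Python: events from the write model are projected into a denormalized read model that is optimized for queries (all names are hypothetical):

```python
read_model = {}  # query side: account ID -> current balance

def project(event):
    """Apply one write-model event to the read model."""
    delta = event["amount"] if event["type"] == "deposited" else -event["amount"]
    read_model[event["account"]] = read_model.get(event["account"], 0) + delta

# In production, this loop would consume from an event stream asynchronously.
for event in [{"account": "42", "type": "deposited", "amount": 500},
              {"account": "42", "type": "withdrawn", "amount": 120}]:
    project(event)

print(read_model["42"])  # 380; queries never touch the write model
```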

Event streaming is also a foundational technology for building event-driven architectures.

An event-driven architecture enables loosely coupled components to communicate through events. However, instead of publishing streams of events to a broker, a component publishes a single-purpose event that another app or service can use to perform actions in turn. Used along with event-streaming capabilities, the in-stream processing power provided by event-driven architectures can enable businesses to respond to data in motion and make quick decisions based on all current and historical data.

In recent years, cloud providers (including IBM Cloud) have started to offer event streaming as a service. Event streaming-as-a-service makes it easier for businesses to adopt event streaming without managing the entire underlying infrastructure, further broadening event streaming’s use cases.

Related products
IBM Event Automation

IBM® Event Automation is a fully composable solution that enables businesses to accelerate their event-driven efforts, wherever they are on their journey. The event streams, event endpoint management and event processing capabilities help lay the foundation of an event-driven architecture for unlocking the value of events.

Explore IBM Event Automation

IBM Cloud Pak for Integration

IBM Cloud Pak® for Integration uses IBM® Event Streams, an event streaming platform built on open source Apache Kafka, to help you build smart applications that can react to events as they happen. Handle mission-critical workloads through enhanced system connectivity, rich deployment and operations capabilities, and event-driven architecture expertise.

Explore IBM Cloud Pak for Integration
Act on events with IBM App Connect

Orchestrate follow-on actions to immediately react to new events. App Connect supports event-driven architectures and can use polling mechanisms to work with systems that aren’t event-driven. Configure flows to automate tasks and apply conditional logic to streamline decisions and respond to changing customer requirements.

Explore IBM App Connect

Resources

Event-driven architecture vs. event streaming

What’s the difference, how can they benefit you and which is the right choice for your business?

Apache Kafka and Apache Flink: An open-source match made in heaven

Stream processing is at the core of real-time data. It allows your business to ingest continuous data streams as they happen and bring them to the forefront for analysis, enabling you to keep up with constant changes.

Leveraging event-driven architecture

Organizations that become more event-driven are able to better differentiate themselves from competitors and ultimately impact their top and bottom lines.

What is AsyncAPI?

Learn how the AsyncAPI specification can describe and document Kafka topics.

Real-time AI and event processing

By leveraging AI for real-time event processing, businesses can connect the dots between disparate events to detect and respond to new trends, threats and opportunities.

What is data management?

Data management is the practice of ingesting, processing, securing and storing an organization’s data, where it is then utilized for strategic decision-making to improve business outcomes.

Take the next step

IBM Event Automation, a fully composable solution, enables businesses to accelerate their event-driven efforts, wherever they are on their journey. It offers event distribution, event discovery and event processing capabilities in an intuitive interface so both business and IT users can put events to work and respond in real time.

Explore IBM Event Automation

Book a live demo