October 26, 2017 By Oriana Zambrano 2 min read

Streaming Analytics Updates: IBM Streams Runner for Apache Beam

The IBM Streaming Analytics service is a cloud-based service for IBM Streams. Streams is an analytics platform that allows you to create applications that analyze data from a variety of sources in real time. Streaming Analytics continues to add enhancements that make it easy for you to create streaming applications however you choose. Previously, we announced integration with Data Science Experience (DSX), which lets you create Streams applications in Python. Now, you can also run an Apache Beam application (pipeline) in Streaming Analytics.

Imagine you are tasked with writing an application for a website. The application needs to look at online users and their activity to identify popular content. You’ll need to look at logs, user clickstreams, and existing user data stored in a database. Which platform are you going to use to write this application: Apache Spark, Apache Flink, or IBM Streams? Why not write the app against a single interface and choose where to run it later?

This is the goal of Apache Beam, a unified programming model for both batch and streaming data processing. Similar to Streams, Beam lets you develop data processing applications using a set of functions that manipulate your data. Beam, however, provides only the programming model and leaves it up to you to select a runtime platform, via a runner, when you launch your application.
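
The split between a pipeline definition and the runner that executes it can be pictured with a small plain-Java sketch. It deliberately does not use the Beam SDK: `Runner`, `SequentialRunner`, `ParallelRunner`, and `popularPages` are hypothetical names invented for illustration, and the click data is made up. A real Beam program would build a Beam pipeline and pick the runner through its pipeline options instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Toy model only: these types are hypothetical stand-ins, not Beam SDK classes.
class RunnerSketch {

    // A "runner" decides how the same pipeline logic is executed.
    interface Runner {
        <T, R> R run(Function<Stream<T>, R> pipeline, List<T> input);
    }

    // Executes the pipeline on a single thread, like a local test runner.
    static class SequentialRunner implements Runner {
        public <T, R> R run(Function<Stream<T>, R> pipeline, List<T> input) {
            return pipeline.apply(input.stream());
        }
    }

    // Executes the same pipeline on a parallel stream.
    static class ParallelRunner implements Runner {
        public <T, R> R run(Function<Stream<T>, R> pipeline, List<T> input) {
            return pipeline.apply(input.parallelStream());
        }
    }

    // The "pipeline": count views per page. It is written once and has no
    // knowledge of which runner will execute it.
    static Map<String, Long> popularPages(Stream<String> clicks) {
        return clicks.collect(
            Collectors.groupingBy(page -> page, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> clicks = Arrays.asList("/home", "/docs", "/home", "/blog", "/home");

        // The runner is chosen at launch time, here from the first argument.
        Runner runner = (args.length > 0 && args[0].equals("parallel"))
            ? new ParallelRunner()
            : new SequentialRunner();

        // Prints {/blog=1, /docs=1, /home=3} with either runner.
        System.out.println(runner.run(RunnerSketch::popularPages, clicks));
    }
}
```

Both runners produce the same counts because the counting logic never refers to how it is executed. That is the property that lets you develop against one runner and deploy on another.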

We’ve added the IBM Streams Runner for Apache Beam to the Streaming Analytics service so that you can run your Beam application on the Streams platform.

Beam on the Industry-leading IBM Streams Platform

IBM Streams offers a continuous, complete, and connected solution. If you use IBM Streams as your Beam runner, you’ll get a fast, stable, industry-leading platform. In addition, since the Streams runner can run in the cloud, you can develop Beam applications locally using the direct runner and then later deploy the applications to the Bluemix cloud.

No Streams Installation Required — The Streams runner allows you to directly send your applications to the Streaming Analytics service to be compiled and executed. This means there’s no need to install Streams on your system.

Interact with Beam pipelines with the newly updated Streams Console — Beam applications appear just like they are laid out in your source code. Additionally, you can view all custom metrics, console logs, data stream flow rates, and even congested streams.

Download today — The Streams Runner is now available to download through your existing Streaming Analytics service. Don’t have an existing service? Create one here.

IBM Streams Runner for Apache Beam Features

  • Support for Beam 2.0 Java SDK

  • Support for primitive and custom composite Beam transforms

  • Support for custom Beam metrics

    • Counter, Distribution, and Gauge types

    • Watermark metrics are automatically created for you

  • Support for processing-time and event-time timers and window triggers

  • Support for stateful processing

  • Support for custom parameters specified at application runtime

  • Integration into the Streams Platform

    • Submit Beam applications to a Streaming Analytics service with no local Streams installation required

    • Specify local data files to be available for your application in the Streaming Analytics service

    • Support for canceling a Streams job from the Beam application

    • View Beam Pipeline layouts in the Streams Graph

  • Specialized Beam SDK for Streams

    • Publish data streams for other Streams applications to use, or subscribe to data streams for your application to consume

    • Read files from and write files to an IBM Object Storage OpenStack Swift for Bluemix service
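
To give a feel for the three custom metric types listed above, here is a minimal plain-Java sketch of what each one reports, as we understand the Beam model: a counter keeps a running total, a distribution summarizes every reported value, and a gauge keeps only the latest value. The classes below are hypothetical stand-ins written for illustration, not the Beam SDK metric classes, and the record sizes are made up.

```java
// Toy model only: hypothetical stand-ins that mimic what each metric type
// reports; these are not the Beam SDK metric classes.
class MetricsSketch {

    // Counter: a single running total that can be incremented.
    static class Counter {
        private long value;
        void inc(long n) { value += n; }
        long value() { return value; }
    }

    // Distribution: summary statistics over every reported value.
    static class Distribution {
        private long count;
        private long sum;
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;
        void update(long v) {
            count++;
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        long count() { return count; }
        long sum() { return sum; }
        long min() { return min; }
        long max() { return max; }
    }

    // Gauge: only the most recently reported value is kept.
    static class Gauge {
        private long latest;
        void set(long v) { latest = v; }
        long value() { return latest; }
    }

    public static void main(String[] args) {
        Counter records = new Counter();
        Distribution sizes = new Distribution();
        Gauge lastSize = new Gauge();

        // Made-up record sizes, reported to all three metrics.
        long[] recordSizes = {120, 45, 300};
        for (long size : recordSizes) {
            records.inc(1);
            sizes.update(size);
            lastSize.set(size);
        }

        System.out.println("records=" + records.value());
        System.out.println("sizes: min=" + sizes.min() + " max=" + sizes.max()
            + " sum=" + sizes.sum() + " count=" + sizes.count());
        System.out.println("lastSize=" + lastSize.value());
    }
}
```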

