The Z Common Data Provider provides the infrastructure for accessing IT operational data from z/OS® systems and streaming it to analytics platforms in a consumable format. It is a single data provider for sources of both structured and unstructured data, and it can provide a near real-time feed of z/OS operational data, such as System Management Facilities (SMF) data and z/OS log data, to your analytics platform.
The Z Common Data Provider monitors and
collects SMF and z/OS log data and forwards it to the
configured destinations.
In each logical partition (LPAR) from which you want to analyze SMF data or z/OS log data, a unique instance of the Z Common Data Provider must be installed and configured to
specify the type of data to be gathered and the destination for that data, which is called a
subscriber.
The Z Common Data Provider includes a web-based
configuration tool that is provided as an application for IBM® WebSphere® Application Server for z/OS Liberty, or as a plug-in for IBM z/OS Management Facility (z/OSMF).
Flow of operational data to analytics platforms
The Z Common Data Provider consists of a set of components that collect a wide variety of operational data, transform the data, and send it to a range of target services and platforms. For more information about the Z Common Data Provider components, see Components of the Z Common Data Provider.
Supported operational data
Operational data is data that is generated by the z/OS
system as it runs. This data describes the health of the system and the actions that are taking
place on the system. The analysis of operational data by analytics platforms and cognitive agents
can produce insights and recommended actions for making the system work more efficiently and for
resolving or preventing problems.
The Z Common Data Provider can collect the
following types of operational data:
System Management Facilities (SMF) data
z/OS log data from the following sources:
Job log, which is output that is written to a data definition (DD) by a running job
z/OS UNIX log file, including the UNIX System Services system log (syslogd)
IBM WebSphere Application Server for z/OS High Performance Extensible Logging (HPEL) log
z/OS Resource Measurement Facility (RMF) Monitor III reports
z/OS sequential data set
User application data, which is operational data from your own applications
Supported analytics platforms and data consumers
An analytics platform is a software program or group of dedicated systems and software that is
configured to receive, store, and analyze large volumes of operational data. The following analytics
platforms and data consumers are examples:
Z Data Analytics Platform, a component of IBM Z Operational Log and Data Analytics that can receive large volumes of operational data for analysis and can provide insights and recommended actions, based on expert knowledge about z Systems® and applications, to the system owners.
Enterprise platforms such as Splunk, the Elastic Stack, Apache Kafka, or Humio that can receive and process operational data for analysis. Platforms such as the Elastic Stack and Splunk do not include expert knowledge about z Systems and applications, but you can create or import your own analytics to run against the data.
IBM Db2® Analytics Accelerator for z/OS, a database application
that provides query-based reporting.
You can deploy either the full architecture or the lightweight architecture of the Z Common Data Provider, based on your needs. Both architectures support stream mode and batch mode for processing operational data.
Full architecture
To deploy the full architecture of the Z Common Data Provider with
all the features enabled, install the following components on each z/OS system:
System Data Engine
Log Forwarder
Data Streamer
Data Collector
Figure 1. Z Common Data Provider overview
Figure 1 illustrates how operational data
(such as SMF data or log data) is gathered by data gatherers, such as the System Data Engine, the
Log Forwarder, or the Data Collector, and can be streamed to multiple subscribers.
The data gatherers collect different types of operational data:
The System Data Engine collects SMF records. It can run as a batch job to create output data for IBM Z Performance and Capacity Analytics, or run in stream mode to collect data in near real time and send it to the Data Streamer for further transformation.
The Log Forwarder collects SYSLOG, OPERLOG, and middleware log data, and then sends data to the
Data Streamer.
The Open Streaming API is used to send your application data to the Data Streamer.
The Data Collector is a lightweight component in the Z Common Data Provider. It collects SMF records, application records, and log data from z/OS and sends the data to an Apache Kafka broker or cluster of brokers. For more information about how the Data Collector works, see Lightweight architecture.
The Data Streamer receives data from the data gatherers, alters the data to make it consumable
for the subscribers, and sends the data to the subscribers.
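The role of the Data Streamer in this flow can be sketched in a few lines. This is an illustrative, in-memory analogy only: the class, method names, and record envelope below are assumptions for the sketch, not the product's actual interfaces.

```python
import json

# Hypothetical sketch of the Data Streamer's role: receive a raw record
# from a gatherer, convert it to a consumable form (UTF-8 JSON here),
# and fan the result out to every configured subscriber.
class DataStreamer:
    def __init__(self):
        self.subscribers = []  # callables that each accept a bytes payload

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def ingest(self, source, record):
        # Wrap the raw record with minimal metadata; the real component
        # applies source-specific transforms before delivery.
        payload = json.dumps({"source": source, "data": record}).encode("utf-8")
        for deliver in self.subscribers:
            deliver(payload)

received = []
streamer = DataStreamer()
streamer.subscribe(received.append)
streamer.ingest("SYSLOG", "IEF403I MYJOB - STARTED")
```

The point of the sketch is the fan-out: one gatherer feed can reach multiple subscribers, each receiving the same transformed payload.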
Lightweight architecture
To deploy the lightweight architecture of the Z Common Data Provider, install and run a single Data Collector on each z/OS system to collect SMF records, application records, and log data, and publish them to an Apache Kafka broker.
Operational data that is published to Apache Kafka topics can be directly consumed by the target
analytics platform.
If the target analytics platform cannot consume data directly from Apache Kafka, you can configure a central instance of the System Data Engine and the Data Streamer to read operational data from Apache Kafka and forward it to the target platform.
Figure 2. Flow of operational data among the Data Collector, Apache Kafka clusters, and
subscribers
The Data Collector in the Z Common Data Provider is a new, lightweight component for batch-loading or streaming
operational data from z/OS to an Apache Kafka broker or
cluster of brokers. You can run the Data Collector on every monitored z/OS system and collect SMF records, application records, and log data.
The Data Collector publishes the operational data to an Apache Kafka cluster that can run on z/OS or on another operating system on or off the mainframe. Each
type of operational data is published to a separate Apache Kafka topic.
Consuming analytics platforms can subscribe directly to the Apache Kafka topics to retrieve
operational data in batch or stream mode.
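Because each type of operational data lands on its own topic, a consumer selects data simply by choosing which topics to subscribe to. The topic naming scheme below is an assumption for illustration, not the Data Collector's documented scheme:

```python
# Hypothetical topic naming: one Kafka topic per type of operational data,
# so a consumer picks data types by picking topics.
def topic_for(data_type: str, prefix: str = "cdp") -> str:
    return f"{prefix}.{data_type.lower().replace(' ', '-')}"

# e.g. SMF type 30 records, SYSLOG lines, and OPERLOG lines each get a topic
topics = [topic_for(t) for t in ("SMF 030", "SYSLOG", "OPERLOG")]

# A Kafka client (for example, kafka-python's KafkaConsumer) would then
# subscribe to exactly the topics it wants, in batch or stream mode:
# consumer = KafkaConsumer(*topics, bootstrap_servers="broker:9092")
```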
Alternatively, you can configure a central instance of the System Data Engine and the Data Streamer to subscribe to the Apache Kafka topics, transform the received operational data, and forward it to established Z Common Data Provider targets such as the Splunk HTTP Event Collector or Logstash.
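As a sketch of the last hop to Splunk: the endpoint path and the "Authorization: Splunk &lt;token&gt;" header below follow Splunk's documented HTTP Event Collector interface, while the token and sourcetype values are placeholders, and the forwarder's internal logic is not shown.

```python
import json

# Build the headers and JSON body for a Splunk HTTP Event Collector (HEC)
# request. HEC accepts a POST to /services/collector/event with a JSON
# body whose "event" field carries the record.
def build_hec_request(record: dict, token: str, sourcetype: str = "zos:syslog"):
    headers = {
        "Authorization": f"Splunk {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"event": record, "sourcetype": sourcetype})
    return headers, body

headers, body = build_hec_request(
    {"message": "IEF404I MYJOB - ENDED"},
    token="00000000-0000-0000-0000-000000000000",  # placeholder HEC token
)
# An HTTP client would then POST `body` with `headers` to
# https://<splunk-host>:8088/services/collector/event
```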