The Z Common Data Provider provides the infrastructure for accessing IT operational data from z/OS® systems and streaming it to analytics platforms in a consumable format. It is a single data provider for sources of both structured and unstructured data, and it can provide a near real-time feed of z/OS operational data, such as System Management Facilities (SMF) data and z/OS log data, to your analytics platform.
The Z Common Data Provider monitors and
collects SMF and z/OS log data and forwards it to the
configured destinations.
In each logical partition (LPAR) from which you want to analyze SMF data or z/OS log data, a unique instance of the Z Common Data Provider must be installed and configured to
specify the type of data to be gathered and the destination for that data, which is called a
subscriber.
The Z Common Data Provider includes a web-based
configuration tool that is provided as an application for IBM® WebSphere® Application Server for z/OS Liberty, or as a plug-in for IBM z/OS Management Facility (z/OSMF).
Flow of operational data to analytics platforms
The Z Common Data Provider consists of a set of components that collect a wide variety of operational data, transform the data, and send it to a range of target services and platforms. For more information about the Z Common Data Provider components, see Components of the Z Common Data Provider.
Supported operational data
Operational data is data that is generated by the z/OS
system as it runs. This data describes the health of the system and the actions that are taking
place on the system. The analysis of operational data by analytics platforms and cognitive agents
can produce insights and recommended actions for making the system work more efficiently and for
resolving or preventing problems.
The Z Common Data Provider can collect the
following types of operational data:
System Management Facilities (SMF) data
z/OS log data from the following sources:
Job log, which is output that is written to a data definition (DD) by a running job
z/OS UNIX log file, including the UNIX System Services system log (syslogd)
IBM WebSphere Application Server for z/OS High Performance Extensible Logging (HPEL) log
z/OS Resource Measurement Facility (RMF) Monitor III reports
z/OS sequential data set
User application data, which is operational data from your own applications
Supported analytics platforms and data consumers
An analytics platform is a software program or group of dedicated systems and software that is
configured to receive, store, and analyze large volumes of operational data. The following analytics
platforms and data consumers are examples:
Z Data Analytics Platform, a component of IBM Z Operational Log and Data Analytics that can receive large volumes of operational data for analysis and can provide insights and recommended actions, based on expert knowledge about z Systems® and applications, to the system owners.
Enterprise platforms such as Splunk, the Elastic Stack, Apache Kafka, or Humio that can receive and process operational data for analysis. Platforms such as the Elastic Stack and Splunk do not include expert knowledge about z Systems and applications, but you can create or import your own analytics to run against the data.
IBM Db2® Analytics Accelerator for z/OS, a database application
that provides query-based reporting.
You can deploy either the full architecture or the lightweight architecture of the Z Common Data Provider, based on your needs. Both architectures support stream mode and batch mode for processing operational data.
Full architecture
To deploy the full architecture of the Z Common Data Provider with
all the features enabled, install the following components on each z/OS system:
System Data Engine
Log Forwarder
Data Streamer
Data Collector
Figure 1. Z Common Data Provider overview
Figure 1 illustrates how operational data
(such as SMF data or log data) is gathered by data gatherers, such as the System Data Engine, the
Log Forwarder, or the Data Collector, and can be streamed to multiple subscribers.
The data gatherers collect different types of operational data:
The System Data Engine collects SMF records. It can run as a batch job to create output data for IBM Z Performance and Capacity Analytics, or run in stream mode to collect data in near real time and send it to the Data Streamer for further transformation.
The Log Forwarder collects SYSLOG, OPERLOG, and middleware log data, and then sends data to the
Data Streamer.
The Open Streaming API is used to send your application data to the Data Streamer.
The Data Collector is a lightweight component in the Z Common Data Provider. It collects SMF records, application records, and log data from z/OS and sends the data to an Apache Kafka broker or cluster of brokers. For more information about how the Data Collector works, see Lightweight architecture.
The Data Streamer receives data from the data gatherers, alters the data to make it consumable
for the subscribers, and sends the data to the subscribers.
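The role of the Data Streamer in this flow can be sketched in a few lines. This is an illustrative, in-memory analogy only: the class, method names, and record envelope below are assumptions for the sketch, not the product's actual interfaces.

```python
import json

# Hypothetical sketch of the Data Streamer's role: receive a raw record
# from a gatherer, convert it to a consumable form (UTF-8 JSON here),
# and fan the result out to every configured subscriber.
class DataStreamer:
    def __init__(self):
        self.subscribers = []  # callables that each accept a bytes payload

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def ingest(self, source, record):
        # Wrap the raw record with minimal metadata; the real component
        # applies source-specific transforms before delivery.
        payload = json.dumps({"source": source, "data": record}).encode("utf-8")
        for deliver in self.subscribers:
            deliver(payload)

received = []
streamer = DataStreamer()
streamer.subscribe(received.append)
streamer.ingest("SYSLOG", "IEF403I MYJOB - STARTED")
```

The point of the sketch is the fan-out: one gatherer feed can reach multiple subscribers, each receiving the same transformed payload.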
Lightweight architecture
To deploy the lightweight architecture of the Z Common Data Provider, install and run a single Data Collector on each z/OS system to collect SMF records, application records, and log data, and publish them to an Apache Kafka broker.
Operational data that is published to Apache Kafka topics can be directly consumed by the target
analytics platform.
If the target analytics platform cannot consume data directly from Apache Kafka, you can configure a central instance of the System Data Engine and the Data Streamer to read operational data from Apache Kafka and forward it to the target platform.
Figure 2. Flow of operational data among the Data Collector, Apache Kafka clusters, and
subscribers
The Data Collector in the Z Common Data Provider is a new, lightweight component for batch-loading or streaming
operational data from z/OS to an Apache Kafka broker or
cluster of brokers. You can run the Data Collector on every monitored z/OS system and collect SMF records, application records, and log data.
The Data Collector publishes the operational data to an Apache Kafka cluster that can run on z/OS or on another operating system on or off the mainframe. Each
type of operational data is published to a separate Apache Kafka topic.
Consuming analytics platforms can subscribe directly to the Apache Kafka topics to retrieve
operational data in batch or stream mode.
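Because each type of operational data lands on its own topic, a consumer selects data simply by choosing which topics to subscribe to. The topic naming scheme below is an assumption for illustration, not the Data Collector's documented scheme:

```python
# Hypothetical topic naming: one Kafka topic per type of operational data,
# so a consumer picks data types by picking topics.
def topic_for(data_type: str, prefix: str = "cdp") -> str:
    return f"{prefix}.{data_type.lower().replace(' ', '-')}"

# e.g. SMF type 30 records, SYSLOG lines, and OPERLOG lines each get a topic
topics = [topic_for(t) for t in ("SMF 030", "SYSLOG", "OPERLOG")]

# A Kafka client (for example, kafka-python's KafkaConsumer) would then
# subscribe to exactly the topics it wants, in batch or stream mode:
# consumer = KafkaConsumer(*topics, bootstrap_servers="broker:9092")
```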
Alternatively, you can configure a central instance of the System Data Engine and the Data Streamer to subscribe to the Apache Kafka topics, transform the received operational data, and forward it to established Z Common Data Provider targets such as the Splunk HTTP Event Collector or Logstash.
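As a sketch of the last hop to Splunk: the endpoint path and the "Authorization: Splunk &lt;token&gt;" header below follow Splunk's documented HTTP Event Collector interface, while the token and sourcetype values are placeholders, and the forwarder's internal logic is not shown.

```python
import json

# Build the headers and JSON body for a Splunk HTTP Event Collector (HEC)
# request. HEC accepts a POST to /services/collector/event with a JSON
# body whose "event" field carries the record.
def build_hec_request(record: dict, token: str, sourcetype: str = "zos:syslog"):
    headers = {
        "Authorization": f"Splunk {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"event": record, "sourcetype": sourcetype})
    return headers, body

headers, body = build_hec_request(
    {"message": "IEF404I MYJOB - ENDED"},
    token="00000000-0000-0000-0000-000000000000",  # placeholder HEC token
)
# An HTTP client would then POST `body` with `headers` to
# https://<splunk-host>:8088/services/collector/event
```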