OMEGAMON XE on z/OS presentation of zAware alerts

The zAware appliance monitors message traffic in one or more LPARs. zAware runs as a hardware feature in its own LPAR, separate from the LPARs that it monitors. Therefore, zAware does not directly use processing power in the monitored LPARs.

The zAware appliance builds a model of normal message traffic for each monitored z/OS LPAR and updates this model as it continues to monitor that LPAR. zAware generates an "anomaly score" for each monitored z/OS LPAR by using statistical analysis to compare the current message traffic to the model. A new anomaly score is generated for each 10-minute period.
  • Anomaly scores from 99.6 to 100.9 are considered to be warning indicators. They indicate that there is message traffic in the monitored z/OS LPAR that is unusual and worthy of investigation.
  • Anomaly scores of 101 are considered critical and are even more important to investigate.
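The two thresholds above can be expressed as a small classification function. The following is a minimal sketch in Python, not part of OMEGAMON XE or zAware; the function name and the "normal" label for scores below 99.6 are assumptions made for illustration.

```python
# Thresholds taken from the list above.
WARNING_THRESHOLD = 99.6
CRITICAL_THRESHOLD = 101.0


def classify_anomaly_score(score: float) -> str:
    """Map an anomaly score to a severity label.

    The label "normal" for scores below the warning threshold is an
    assumption of this sketch; the text only names warning and critical.
    """
    if score >= CRITICAL_THRESHOLD:
        return "critical"
    if score >= WARNING_THRESHOLD:
        return "warning"
    return "normal"


if __name__ == "__main__":
    for sample in (42.0, 99.6, 100.9, 101.0):
        print(sample, classify_anomaly_score(sample))
```

Checking the critical threshold first keeps the warning range (99.6 to 100.9) from also matching a score of 101.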

OMEGAMON XE on z/OS can connect to the zAware appliance and retrieve anomaly scores at the LPAR level. Both the Tivoli Enterprise Portal and the Enhanced 3270 user interface have workspaces that present the zAware anomaly scores for the last hour. The workspace highlights each anomaly score that reaches the warning threshold of 99.6 or the critical threshold of 101.
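The selection that such a workspace performs can be sketched as follows. This Python example is illustrative only and does not use any OMEGAMON XE or zAware interface: the function name, the `(interval_start, score)` tuple layout, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta


def flagged_scores_last_hour(samples, now):
    """Return (interval_start, score, level) for scores from the last
    hour that are at or above the warning (99.6) or critical (101)
    threshold. `samples` is an iterable of (datetime, float) pairs;
    this layout is an assumption of the sketch.
    """
    cutoff = now - timedelta(hours=1)
    flagged = []
    for interval_start, score in samples:
        if interval_start < cutoff:
            continue  # older than the one-hour window
        if score >= 101.0:
            level = "critical"
        elif score >= 99.6:
            level = "warning"
        else:
            continue  # below the warning threshold, not highlighted
        flagged.append((interval_start, score, level))
    return flagged


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    # Made-up scores, one per 10-minute interval.
    samples = [
        (datetime(2024, 1, 1, 10, 50), 101.0),
        (datetime(2024, 1, 1, 11, 10), 99.6),
        (datetime(2024, 1, 1, 11, 20), 50.0),
        (datetime(2024, 1, 1, 11, 50), 101.0),
    ]
    for row in flagged_scores_last_hour(samples, now):
        print(row)
```

Because a new score is generated every 10 minutes, a one-hour window normally holds six scores per LPAR.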

Figure 1. The zAware Tivoli Enterprise Portal workspace for a selected LPAR
Figure 2. The zAware Enhanced 3270 user interface workspace for a selected LPAR, displaying an anomaly score of 101.0

You can see from the screen captures that zAware detected anomalous messages. If you open a browser connection to the zAware server, you can see this same event.

Figure 3. Anomaly scores for several LPARs rendered in the zAware browser workspace

In the browser interface, the Anomaly Scores graph displays the intervals for each LPAR as a series of vertical bars. Taller bars indicate a higher number of unique messages. Bars are also color-coded according to the anomaly level of the interval: blue shades indicate anomaly levels below the warning level, yellow bars indicate warning-level anomalies, and orange bars indicate critical anomalies. Select a bar from the graph to see the log messages that contributed most to that interval's score.

Figure 4. zAware details for message traffic in the selected 10-minute interval

By default, the message details screen is sorted by highest contribution to the interval's anomaly score. Click the Message ID cell to open an explanation of the message ID in a new browser window. This explanation can be helpful in understanding the incident that generated this anomaly score. In this scenario, a hardware failure forced several LPARs to restart.
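The default ordering of the message details screen amounts to a descending sort on each message's contribution. The sketch below shows that ordering in Python; the `MessageDetail` type, its field names, and the message IDs and values are hypothetical and are not taken from the zAware interface.

```python
from typing import List, NamedTuple


class MessageDetail(NamedTuple):
    """One row of message detail (hypothetical layout for this sketch)."""
    message_id: str
    contribution: float  # contribution to the interval's anomaly score


def sort_by_contribution(messages: List[MessageDetail]) -> List[MessageDetail]:
    """Return the rows ordered from highest to lowest contribution."""
    return sorted(messages, key=lambda m: m.contribution, reverse=True)


if __name__ == "__main__":
    # Placeholder message IDs and made-up contribution values.
    rows = [
        MessageDetail("MSGA001", 0.4),
        MessageDetail("MSGB002", 7.5),
        MessageDetail("MSGC003", 2.1),
    ]
    for row in sort_by_contribution(rows):
        print(row.message_id, row.contribution)
```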