Deriving insight into IT operations with Watson AIOps AI Manager

CIOs constantly struggle to balance innovation against stability, complexity, and scale, while ensuring that their organization has the skills needed to keep up with the ever-changing IT landscape. IBM Watson® AIOps AI Manager helps CIOs and site reliability engineers (SREs) by uncovering hidden insights from multiple sources of data (like logs, metrics, and events). It delivers those insights directly into the tools that teams work with (like Slack) in near real time. AI Manager gives you visibility into your organization's infrastructure, enabling you to predict failures and resolve problems faster.

Installing AI Manager

For more information about installing AI Manager, see Watson AIOps AI Manager Prerequisites and requirements.

Available operations information

AI Manager uses Slack to provide a message-based interface that reports incidents in the IT operations that you are monitoring. This ChatOps interface displays the following types of operations information:

Table 1. Operations information available from the ChatOps interface
Operation | Description
Anomaly Detection | Detects anomalies from data (real-time or offline).
Event Grouping | Groups related events to aid incident diagnosis. Events can include, for example, PagerDuty alerts, Netcool® Operations Insight® alerts, or log anomalies.
Blast Radius and Fault Localization | Derives the root fault component and the full scope of components that are affected by an incident.
Incident Similarity | For a particular incident, finds the "n" highest-ranked similar incidents from the past.
Next Best Action | For a particular incident, suggests the "n" highest-ranked actions taken for similar incidents from the past.
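
To make the idea behind Incident Similarity and Next Best Action more concrete, the following Python sketch ranks past incidents against a new one and returns the actions recorded for the closest matches. It is only an illustration and does not represent how AI Manager computes similarity: the Jaccard word-overlap score, the function names, the sample incidents, and the "id", "summary", and "action" fields are all assumptions made for this example.

```python
import re

# Hypothetical past incidents; the field names are assumptions for illustration.
PAST_INCIDENTS = [
    {"id": "INC-1", "summary": "checkout service database connection timeout",
     "action": "restart the connection pool"},
    {"id": "INC-2", "summary": "disk full on logging node",
     "action": "rotate and compress logs"},
    {"id": "INC-3", "summary": "database connection pool exhausted in checkout service",
     "action": "increase the pool size"},
]


def tokenize(text):
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_n_similar(summary, past_incidents, n=3):
    """Return the n past incidents most similar to summary, as (score, incident) pairs."""
    current = tokenize(summary)
    scored = [(jaccard(current, tokenize(p["summary"])), p) for p in past_incidents]
    # Sort on the score only, highest first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def next_best_actions(summary, past_incidents, n=3):
    """Suggest the actions recorded for the n most similar past incidents."""
    return [p["action"] for _, p in top_n_similar(summary, past_incidents, n)]


if __name__ == "__main__":
    new_incident = "checkout service database connection timeout errors"
    for score, incident in top_n_similar(new_incident, PAST_INCIDENTS, n=2):
        print(f"{incident['id']}  score={score:.3f}  action={incident['action']}")
```

In this sketch, "n" is simply the number of ranked results to return, which mirrors the "n" in the table. A production system would likely use richer signals than word overlap, such as topology or learned text embeddings.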

Training models

You can train AI Manager to hone its ability to identify issues derived from your incoming data connections. To get the most out of AI Manager, you can manually map your log data to the JSON training format and train your models with it. For more information about training specific types of models, including suggested mappings for events and logs, see the following training topics:

For more information about what you can do to gain further insight into your IT infrastructure, see the following topics: