IBM Cloud Pak® for Watson AIOps has partnered with IBM Research and customers to help make AI more explainable.

In the IBM Cloud Pak® for Watson AIOps, we have artificial intelligence (AI) that helps clients and users manage their applications, IT infrastructure and services. It’s AI that can use log, metric, topology, event and ticket data and chat history to learn normal behaviour and help customers avoid issues, resolve them faster when they do occur and automate resolutions.  But how do you trust that the artificial intelligence is doing what you want it to do?

Establishing a foundation of trust

For a start, we work closely with our colleagues in IBM Research — one of the largest industrial research organizations in the world.  We embrace “inner source” — the sharing of ideas and technology, developing them together for the common good of our customers.

From a data science perspective, there are a lot of tests that can be performed to determine the accuracy of AI. Those tests often rely on data sets that have an associated “ground truth,” which records the expected behaviour so that we can measure how well the AI replicates it. The data is typically provided by clients who work closely with us to help meet the desired use cases. This way, when other clients use our AI, they are using something that has been tested and validated with real-world data and honest feedback.
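As a rough illustration of what testing against a ground truth can look like, the sketch below compares a set of items flagged by an AI with the set a client has labelled as correct, and reports precision and recall. The function name, the event identifiers and the choice of these two metrics are assumptions made for this example; they are not taken from the product's own test suite.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[str], ground_truth: Set[str]) -> Tuple[float, float]:
    """Compare flagged items with labelled items (hypothetical helper)."""
    true_positives = len(predicted & ground_truth)
    # Precision: how much of what the AI flagged was actually expected.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: how much of what was expected the AI actually flagged.
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
ground_truth = {"evt-1", "evt-2", "evt-3", "evt-4"}
predicted = {"evt-2", "evt-3", "evt-4", "evt-9"}

precision, recall = precision_recall(predicted, ground_truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Reporting both numbers matters because an AI that flags everything has perfect recall but poor precision, which is exactly the kind of noise operators learn to distrust.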

Insights, understanding and decisions

Ultimately, however, the best way to ensure that our users trust AI is by making it easily explainable. We present insights and allow users to understand how each one was reached.

Example 1: Temporal correlation

In the following example, we have presented a group of events that tend to co-occur, using our Temporal Correlation algorithm. To help build trust, a user can drill down to a view that shows them why we made the decision to group them together. Every green line represents an occurrence of the event, and the user can immediately see that most of the time, these events occur together. 

A strength of the algorithm is that it does not need 100% overlap of events to determine that they tend to occur together. You can see this in the chart, where the events sometimes occur separately, yet the relationship is still discovered.
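To make the idea of tolerating partial overlap concrete, here is a minimal sketch that is not the product's Temporal Correlation algorithm. It assumes a much simpler technique: each occurrence is placed in a fixed-size time window, and two events are scored by the Jaccard similarity of the windows in which they appear. The function names, the five-minute window and the 0.6 threshold are all assumptions chosen for illustration.

```python
from typing import Iterable, Set


def active_windows(timestamps: Iterable[float], window_seconds: float = 300.0) -> Set[int]:
    """Map each occurrence (seconds) to the index of its fixed-size window."""
    return {int(t // window_seconds) for t in timestamps}


def co_occurrence_score(a: Iterable[float], b: Iterable[float],
                        window_seconds: float = 300.0) -> float:
    """Jaccard similarity of the windows in which each event occurs."""
    windows_a = active_windows(a, window_seconds)
    windows_b = active_windows(b, window_seconds)
    if not windows_a or not windows_b:
        return 0.0
    return len(windows_a & windows_b) / len(windows_a | windows_b)


def tend_to_co_occur(a: Iterable[float], b: Iterable[float],
                     threshold: float = 0.6, window_seconds: float = 300.0) -> bool:
    """Group two events when their score reaches an assumed threshold."""
    return co_occurrence_score(a, b, window_seconds) >= threshold


# Hypothetical occurrence times in seconds: four of five occurrences line up.
event_a = [10, 3610, 7210, 10810, 14410]
event_b = [40, 3650, 7230, 10850, 18000]

score = co_occurrence_score(event_a, event_b)
print(round(score, 3), tend_to_co_occur(event_a, event_b))
```

Because the decision is based on a threshold rather than an exact match, the two events are still grouped even though each has one occurrence the other lacks. A fixed window grid is a simplification: two occurrences a few seconds apart can land in different windows, which a production approach would need to handle.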

Example 2: Metric anomaly

In this insight, we highlight that an anomaly has occurred because a metric, Number of Active Connections, “is now a flat line, where before it was varying.” The user can drill down into a view like the one shown below to view the history of the metric over time, together with a baseline and a red zone indicating precisely where the anomaly is occurring. The user can see that, previously, the metric has occasionally had a value of zero, but now it is at zero for much longer than normal. This is a good indication that the service has been interrupted or stopped:
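The reasoning in this example (zero is normal for short periods, but not for long ones) can be sketched in a few lines. The code below is an illustration under stated assumptions, not the detector used in the product: it learns the longest run of zeros seen in a history, then flags the metric when the current trailing run of zeros is several times longer. The function names, the `factor` parameter and the sample values are hypothetical.

```python
from typing import List, Sequence


def run_lengths(values: Sequence[float], target: float = 0.0) -> List[int]:
    """Lengths of each consecutive run of `target` in `values`."""
    runs: List[int] = []
    current = 0
    for value in values:
        if value == target:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def trailing_run(values: Sequence[float], target: float = 0.0) -> int:
    """Number of consecutive `target` samples at the end of `values`."""
    count = 0
    for value in reversed(values):
        if value != target:
            break
        count += 1
    return count


def is_flat_line_anomaly(history: Sequence[float], recent: Sequence[float],
                         target: float = 0.0, factor: float = 3.0) -> bool:
    """Flag when the current flat run far exceeds the longest one in history."""
    longest_normal = max(run_lengths(history, target), default=0)
    return trailing_run(recent, target) > factor * max(longest_normal, 1)


# Hypothetical samples of Number of Active Connections.
history = [5, 7, 0, 6, 9, 0, 0, 4, 8, 6]   # zeros occur, but only briefly
stuck = [7, 3] + [0] * 10                  # now at zero for much longer
brief = [7, 0, 0]                          # a short dip, as seen before

print(is_flat_line_anomaly(history, stuck), is_flat_line_anomaly(history, brief))
```

Comparing against the metric's own history is what lets the explanation say "for much longer than normal" instead of applying one fixed limit to every metric.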

Example 3: Seasonal events

For our final example, we use AI to highlight when events are occurring at non-random times. Knowing that an event occurs with a certain regular frequency is a good indication that you might be fixing the same problem over and over again. This is something that should be automated away, or its underlying cause should be addressed once and for all. It might also highlight that this event is just noise that experienced operators know to ignore, so it would be good to filter it out altogether. To build trust, the user can drill down to a view that presents simple, concise statements and easy-to-understand visualisations, as shown in the following diagram:

Knowing that this event seems to always occur on Fridays between 2pm and 3pm is good information. The user can also see that it is not occurring at any other time. Through explainable AI, the user can build trust that the AI is doing what is expected when it enriches other events in the same way.
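A statement such as "always on Fridays between 2pm and 3pm" can be derived with a very simple check, sketched below. This is an assumed approach for illustration only and not the product's seasonality detection: it counts occurrences per (weekday, hour) slot and reports the dominant slot when it holds at least a chosen share of all occurrences. The function name, the `min_share` parameter and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple


def dominant_slot(timestamps: Sequence[datetime],
                  min_share: float = 0.9) -> Optional[Tuple[int, int, float]]:
    """Return (weekday, hour, share) if one weekly slot dominates, else None.

    Weekday follows datetime.weekday(): Monday is 0 and Friday is 4.
    """
    if not timestamps:
        return None
    counts = Counter((t.weekday(), t.hour) for t in timestamps)
    (weekday, hour), hits = counts.most_common(1)[0]
    share = hits / len(timestamps)
    return (weekday, hour, share) if share >= min_share else None


# Hypothetical occurrences: eight consecutive Fridays at 14:20.
first = datetime(2024, 1, 5, 14, 20)  # 5 January 2024 was a Friday
fridays = [first + timedelta(weeks=i) for i in range(8)]

# Hypothetical occurrences spread over one day, with no pattern.
scattered = [datetime(2024, 1, 1, hour) for hour in range(8)]

print(dominant_slot(fridays), dominant_slot(scattered))
```

Returning the share alongside the slot supports the second half of the explanation: a share of 1.0 tells the user the event is not occurring at any other time.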

Summary

Why is trust so important? The primary goal of AI is to help make our lives better and more efficient. If you trust the AI, you will be more likely to put it to use. When you can understand the AI and see that it is doing what you expect, you can be confident that your time is well spent taking action, investigating, triaging and automating the resolution: avoiding incidents, resolving them faster and resolving them automatically the next time they occur.

Let us help you build trust in IBM Cloud Pak® for Watson AIOps.
