Introducing probable root cause: Enhancing Instana's Observability

We are thrilled to announce an enhancement to Instana® with the introduction of the probable root cause capability, available in public preview from release 277. This capability helps you quickly identify the source of a system fault with little to no investigation time.

Probable root cause

Working with IBM Research®, we designed an algorithm that uses causal AI and differential observability to analyze data modalities such as traces and topology and identify unhealthy entities after an incident has been triggered. An entity is any component within a system that is monitored using Instana’s support for over 300 technologies. By analyzing these data modalities across your infrastructure, applications and services, we can identify the likely causes of application outages and point you toward dashboards that will expedite your investigation.
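To make the idea of combining a before-and-after comparison with topology more concrete, here is a toy sketch in Python. It is not Instana's algorithm, which this post does not describe in detail. It assumes a deliberately simple heuristic: an entity is unhealthy if its error rate rose sharply between a baseline window and the incident window, and an unhealthy entity with no unhealthy downstream dependencies is the most likely origin. All names (WindowStats, rank_probable_root_causes, min_increase, the sample services) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Call counts for one entity in one time window (hypothetical shape)."""
    calls: int
    errors: int

    @property
    def error_rate(self) -> float:
        return self.errors / self.calls if self.calls else 0.0


def rank_probable_root_causes(topology, baseline, incident, min_increase=0.05):
    """Return unhealthy entities, most likely root cause first.

    topology: dict mapping an entity to the entities it calls (downstream).
    baseline, incident: dicts mapping an entity to its WindowStats.
    min_increase: assumed threshold for a "sharp" rise in error rate.
    """
    # Differential step: compare each entity's error rate across the two windows.
    increase = {
        name: incident[name].error_rate - baseline[name].error_rate
        for name in incident
        if name in baseline
    }
    unhealthy = {name for name, delta in increase.items() if delta >= min_increase}

    # Topology step: an unhealthy entity whose downstream dependencies are all
    # healthy is a candidate origin; the others may only be propagating errors.
    def is_origin(name):
        return not any(dep in unhealthy for dep in topology.get(name, []))

    # Origins first, then by largest increase, then by name for a stable order.
    return sorted(unhealthy, key=lambda n: (not is_origin(n), -increase[n], n))


# Made-up sample data: frontend calls checkout, which calls payments-db.
topology = {"frontend": ["checkout"], "checkout": ["payments-db"], "payments-db": []}
baseline = {
    "frontend": WindowStats(1000, 10),
    "checkout": WindowStats(800, 8),
    "payments-db": WindowStats(600, 3),
}
incident = {
    "frontend": WindowStats(1000, 200),
    "checkout": WindowStats(800, 240),
    "payments-db": WindowStats(600, 300),
}
ranked = rank_probable_root_causes(topology, baseline, incident)
print(ranked)  # ['payments-db', 'checkout', 'frontend']
```

In this made-up data all three services show more errors, but only payments-db has no unhealthy dependency beneath it, so it is ranked first. A real causal analysis would need to handle far more than this heuristic does, such as shared infrastructure, cyclic call graphs and noisy baselines.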

We also enrich this information by showing all recent events on the identified probable root cause entity, which point to possible reasons the entity may have failed. We explain clearly why our AI identifies an entity as the probable root cause. Probable root cause also directs you toward relevant metrics, traces and logs to speed up further diagnosis of the problem.

Currently, probable root cause automatically runs on all incidents triggered by smart alerts on the following entity types:

Key benefits

Immediate intelligence: Unlike traditional approaches that require extensive setup and training, probable root cause works out-of-the-box almost immediately. Whether you’re using software as a service or self-hosted deployment, you can start benefiting from enhanced observability without delay.
Comprehensive insights: Gain unparalleled visibility into your entire stack with Instana’s comprehensive data coverage. From frontend to backend, microservices to databases, probable root cause considers all aspects of your environment to deliver accurate diagnostics.
Explainable outputs: Instana’s approach is rooted in transparency. We provide clear visibility into the data sources and methodologies used to determine probable root causes, empowering your teams with actionable insights they can trust.
Secure data protection: Probable root cause delivers insights without your data ever leaving Instana, keeping your valuable information secure.

Probable root cause in action

Here’s a sneak peek at how probable root cause can help you quickly identify a problem in the Instana incident dashboard.

In this example, an application smart alert is triggered by a sudden increase in the number of erroneous calls.

Instana automatically identifies the root cause entity from the smart alert (in this case, an endpoint), explains why it flagged that entity, and lists any associated events that occurred on it. This allows the user to efficiently find the cause and prioritize resolving the incident.

Get started today

We invite you to explore the power of probable root cause in your own environment. Whether you’re an existing Instana user or exploring observability solutions for the first time, this feature promises to elevate your troubleshooting capabilities to new heights, providing a seamless experience.

To learn more about probable root cause, see our release notes and documentation for detailed guidance and instructions on how best to use this feature. If you are using probable root cause, we would love to hear your feedback and shape the feature based on your input.

At Instana, we remain committed to delivering cutting-edge observability solutions that simplify complexity and empower teams to build and maintain resilient applications. Stay tuned for more updates as we continue to innovate in the field of Application Performance Monitoring (APM) and Observability.

We want to thank Saurabh Jha, Larisa Shwartz, Arthur De Magalhaes, Rohith R, and Melissa Denby for their insights and contributions to this work.

Authors

Ameet Annasaheb Rahane

Data Scientist

Marc Palaci-Olgun

Machine Learning Engineer

Dan Stingaciu

Software Engineer

Neel Bhavsar

Software Engineer

Ragu Kattinakere

Senior Development Manager, AIOps, Instana

26 July 2024