We are thrilled to announce an enhancement to Instana® with the introduction of the probable root cause capability, now available in public preview starting from release 277. This capability helps you quickly identify the source of a system fault with little to no manual investigation.
Working with IBM Research®, we designed an algorithm that uses causal AI and differential observability to analyze data modalities such as traces and topology and to identify unhealthy entities after an incident has been triggered. An entity is any component within a system that is monitored using Instana’s support for over 300 technologies. By analyzing these data modalities across your infrastructure, applications and services, we can identify the likely causes of application outages and point you toward dashboards that will expedite your investigation.
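To make the idea of combining health signals with topology more concrete, here is a deliberately simplified sketch. It is not Instana's algorithm, which is not described in detail in this post. It only illustrates one common heuristic: among the entities that look unhealthy, prefer those with no unhealthy downstream dependency. The function name `probable_root_causes`, the error-rate threshold, and the sample services are all hypothetical.

```python
from collections import defaultdict


def probable_root_causes(calls, error_rate, threshold=0.05):
    """Rank unhealthy entities that have no unhealthy direct callee.

    calls:      iterable of (caller, callee) edges, a stand-in for topology
    error_rate: dict mapping entity name to its fraction of erroneous calls
    threshold:  assumed cutoff above which an entity counts as unhealthy
    """
    downstream = defaultdict(set)
    for caller, callee in calls:
        downstream[caller].add(callee)

    unhealthy = {e for e, rate in error_rate.items() if rate > threshold}

    # An unhealthy entity whose direct callees are all healthy is a candidate:
    # its errors cannot be explained by a dependency it calls.
    candidates = [e for e in unhealthy if not (downstream[e] & unhealthy)]

    # Highest error rate first.
    return sorted(candidates, key=lambda e: error_rate[e], reverse=True)


# Hypothetical topology: frontend -> cart -> db, frontend -> catalog
calls = [("frontend", "cart"), ("cart", "db"), ("frontend", "catalog")]
error_rate = {"frontend": 0.30, "cart": 0.28, "db": 0.35, "catalog": 0.01}

print(probable_root_causes(calls, error_rate))
```

In this toy example, `frontend` and `cart` are unhealthy but each calls another unhealthy entity, so only `db` is reported. A production approach would need to weigh far more evidence than a single error-rate threshold.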
Additionally, we enrich this information by showing all of the recent events on the identified probable root cause entity, which point to possible reasons for its failure. We also explain clearly why our AI identified that entity as the probable root cause. Probable root cause also directs you toward relevant metrics, traces and logs to speed up further diagnosis of the problem.
Currently, probable root cause automatically runs on all incidents triggered by smart alerts on the following entity types:
Here’s a sneak peek at how probable root cause can help you quickly identify a problem in the Instana incident dashboard.
In this example, an application smart alert is triggered by a sudden increase in the number of erroneous calls.
Instana automatically identifies the root cause entity from the smart alert (in this case, an endpoint), explains why that entity is considered at fault, and lists any associated events that occurred on it. This allows the user to find the cause efficiently and prioritize resolving the incident.
We invite you to explore the power of probable root cause in your own environment. Whether you’re an existing Instana user or exploring observability solutions for the first time, this feature promises to elevate your troubleshooting capabilities to new heights, providing a seamless experience.
To learn more about probable root cause, see our release notes and documentation for detailed guidance on how to make the best use of this feature. If you are using probable root cause, we would love to hear your feedback and shape future development based on your input.
At Instana, we remain committed to delivering cutting-edge observability solutions that simplify complexity and empower teams to build and maintain resilient applications. Stay tuned for more updates as we continue to innovate in the field of Application Performance Monitoring (APM) and Observability.
We want to thank Saurabh Jha, Larisa Shwartz, Arthur De Magalhaes, Rohith R, and Melissa Denby for their insights and contributions to this work.