June 25, 2024 By Jeremy Hughes 3 min read

With digital transformation all around us, application environments are ever-growing, leading to greater complexity. Organizations are turning to observability to help them address performance issues proactively and efficiently, and are leveraging generative AI to gain a competitive edge in delivering exceptional user experiences. This is where Instana’s Intelligent Remediation comes in: it enhances application performance and resolves issues before they have a chance to impact customers.

Now generally available: Instana’s Intelligent Remediation

I’m happy to share that Instana’s Intelligent Remediation, announced at IBM Think 2024, is now generally available. These capabilities assist DevOps/SRE teams with over 90 actions, many of them automated, generated by watsonx.ai™ for diagnosing and resolving incidents.

Delivered as prescriptive manual steps, scripts and Ansible® action playbooks, these actions cover a wide array of technology areas including containers, Elasticsearch, Host, JVM, Kafka and Kubernetes.

Instana’s Intelligent Remediation in action

Now, let’s imagine a scenario: Instana has notified you of an incident. There’s a sudden increase in the latency of requests to your website. Instana has been observing data from thousands of resources that make up your applications: processes, containers, virtual machines, Kubernetes clusters running websites, Java applications, databases, message queuing systems and more. But something subtle has snapped, like the proverbial tree falling in an uninhabited forest with no one there to hear it. Except this time, it’s different. Instana was there.

Your website’s response latency has spiked and your Kubernetes-hosted application isn’t handling requests as fast as it was. An incident has been created and the Automation section shows Automation Policies and Recommended Actions related to the incident. These are part of Instana’s Automation Framework. A policy links the incident event type to an action and describes when and how to run the action. Instana has matched the incident with similar prior incidents in this environment: incidents that users resolved by running an action and, when that was successful, creating a policy. This is how the knowledge users gain while resolving issues is retained and reused.
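To make the relationship between incidents, actions and policies concrete, here is a minimal sketch in Python. It is only an illustration of the idea described above, not Instana’s data model or API: the class names, fields, the "latency_spike" event type and the helper functions are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A remediation step (hypothetical model, not Instana's schema)."""
    name: str
    kind: str  # assumed values: "manual", "script" or "ansible"


@dataclass(frozen=True)
class Policy:
    """Links an incident event type to an action and says how to run it."""
    event_type: str          # which kind of incident the policy applies to
    action: Action           # what to run
    run_automatically: bool  # how to run it: on trigger, or after user approval


def find_policies(policies, event_type):
    """Return the policies whose event type matches the incident."""
    return [p for p in policies if p.event_type == event_type]


def record_resolution(policies, event_type, action, succeeded,
                      run_automatically=False):
    """Return a new policy list; a policy is added only if the action worked."""
    if not succeeded:
        return list(policies)
    return list(policies) + [Policy(event_type, action, run_automatically)]


# A user resolves a latency incident by scaling up, then keeps that knowledge.
scale_up = Action(name="Scale up Kubernetes deployment", kind="script")
policies = record_resolution([], "latency_spike", scale_up, succeeded=True)

# The next incident of the same type surfaces the earlier fix.
matches = find_policies(policies, "latency_spike")
print([p.action.name for p in matches])
```

In this sketch a policy is created only after an action has succeeded, which mirrors how the article describes user experience being retained and reused for later incidents.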

Until now, only the prior work and experiential knowledge of DevOps/ITOps teams could help other application stakeholders. While this is a great store of value, what happens when there are no prior cases of an incident? Enter watsonx.ai, with the power of generative AI to help seed new solutions in the form of actions tailored to the context of the incident event.

Select and curate an action generated by AI from the context of an event

Solving the problem with Instana

Let’s go back to our Kubernetes-hosted application. There was a significant increase in the latency of website responses. An incident was created and, in the Recommended Actions section of the incident page, the user can now view an action generated by watsonx.ai. This takes them through the steps of diagnosing the application and scaling up the Kubernetes deployment, and suggests alternative steps if the issue persists. By adding the action to a policy, the user can share these steps to help future users with the same issue. With the Automation Framework and Action Catalog, users can capture their own actions and policies and share them with other users to help them resolve similar incidents in the future. Intelligent Remediation assists this process further by adding actions generated by watsonx.ai that are correlated to the incident being viewed. When I think of what the future holds as we further integrate Instana’s original machine learning (ML) capabilities with watsonx.ai, I get excited about all the future incidents that won’t occur because of Instana’s automated remediation.

Learn more about Instana Intelligent Remediation capabilities
