AIOps – reducing the cost of downtime

For too long, data has been held captive within our systems of record: isolated by the rigidity of platform, application and workload choices, and segregated by business line, business function, data type or initial usage.

The result is splintered views of segmented data that are difficult to access as a whole, and from which it is impossible to gain true analytical insight.

Even this describes only today’s snapshot and current models. The challenges are compounded as businesses look to change, grow, iterate practices, innovate, or disrupt markets. Attempts at data science, machine learning, and deep learning are made moot by the fact that insights are only as good as the access to supporting data, which again is too fragmented to provide full value.

To change this paradigm, a hybrid data management strategy should contain the following elements:

  • Access to all data regardless of source or type
  • The flexibility to support changing workloads and consumption cases
  • Intelligent analytics, such as machine learning, at the data source
  • Access to insights across the business, its functions, and all users for better decision making
  • An evolution of IT Service Management in the shape of AIOps

So what is AIOps? Many organisations offer AIOps solutions and services, so the answer to the “what is” question isn’t as definitive as you may think. For me, this best describes AIOps:

  1. The ability to bring together structured and unstructured data from a multitude of different repositories, applications and services to find the hidden gems that can add value to both the business and IT Operations.
  2. The ability to train a machine learning solution so that, in future, issues are remediated automatically before they happen, because the tell-tale signs of something starting to fail have already been identified.

There are many areas where AIOps will evolve today’s IT Operations, but I want to look at one area that will make the business sit up and listen to the IT department: downtime!

Planned or unplanned, downtime = money. That means lost revenue, plus knock-on effects such as customer dissatisfaction, lost market share and possibly brand damage.

The following is an all too common occurrence: IT Operations get a notification that there is a problem. In some cases it can take them almost 5 hours and 17 separate steps across 4 different tools to diagnose the issue, with approximately 10 people involved in solving the incident. According to the Aberdeen industry report “The (rising) cost of downtime”, an average incident can cost $260k per hour, and others that have been well reported in the press cost much, much more than this.

AIOps looks at all these different siloed data channels in real time, looking for important signals across structured and unstructured data types.

It groups events together based on spatial and temporal reasoning as well as similarity to past situations and synthesizes a holistic incident report.
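As a rough illustration of that grouping step, here is a minimal sketch that covers only the temporal and spatial parts. It is not how any particular AIOps product works. The `Event` class, the `group_events` function, the five-minute window and the `topology` map (which services depend on which) are all hypothetical, and "spatial" is interpreted here as "related in the service topology", which is my assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    """A single alert or log event (hypothetical structure)."""
    timestamp: datetime
    service: str
    message: str


def group_events(events, topology, window=timedelta(minutes=5)):
    """Group events that are close in time AND related in the topology.

    topology maps a service name to the set of services it depends on.
    Two services count as related if they are the same service or if
    one directly depends on the other.
    """
    groups = []
    for event in sorted(events, key=lambda e: e.timestamp):
        for group in groups:
            # Temporal check: compare with the most recent event in the group.
            close_in_time = event.timestamp - group[-1].timestamp <= window
            # Spatial check: same service or a direct dependency link.
            related = any(
                event.service == other.service
                or event.service in topology.get(other.service, set())
                or other.service in topology.get(event.service, set())
                for other in group
            )
            if close_in_time and related:
                group.append(event)
                break
        else:
            # No existing group matched, so this event starts a new one.
            groups.append([event])
    return groups
```

Each returned group is a candidate incident. A real system would add the third ingredient, similarity to past situations, which this sketch leaves out.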

That report is surfaced automatically in ChatOps (Slack, Zoom etc.) to give IT Operations the key information as soon as it’s available:

  1. There’s a problem that IT Operations need to pay attention to.
  2. A pointer to where the problem is and which other services might be affected.
  3. Evidence and advice to diagnose and resolve the situation.
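To make those three pieces of information concrete, the sketch below turns an incident report into a plain-text chat message. The `IncidentReport` fields and the `format_chat_message` function are hypothetical names chosen for illustration. Posting the text to a real chat tool is deliberately left out, since that depends on the tool’s own API.

```python
from dataclasses import dataclass


@dataclass
class IncidentReport:
    """Hypothetical shape of a synthesized incident report."""
    summary: str              # what the problem is
    location: str             # where the problem appears to be
    affected_services: list   # other services that might be affected
    evidence: list            # supporting observations
    advice: list              # suggested next steps


def format_chat_message(report):
    """Render the report as plain text suitable for a chat channel."""
    affected = ", ".join(report.affected_services) or "none identified"
    lines = [
        f"Incident: {report.summary}",
        f"Where: {report.location}",
        f"Possibly affected: {affected}",
        "Evidence:",
    ]
    lines += [f"  - {item}" for item in report.evidence]
    lines.append("Advice:")
    lines += [f"  - {item}" for item in report.advice]
    return "\n".join(lines)
```

Keeping the message as plain text means the same output can go to whichever chat tool the team already uses.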

With AIOps, this same workflow can take less than 15 minutes and almost all the work happens within ChatOps. IT Operations don’t have to jump from tool to tool losing time with context switching and don’t need to be in the same incident room!

The result: costs are radically reduced and instead of 10 people working on this for hours, one or two IT staff can confidently do their job with the information they need, delivered by the AIOps algorithms.
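A back-of-the-envelope calculation shows the scale of the difference, using only the figures quoted in this post. It rests on two simplifying assumptions of mine: that the service is down for the whole diagnosis time, and that cost scales linearly with duration at $260k per hour. The function names are hypothetical. Because "almost 5 hours" and "less than 15 minutes" are bounds, the results are indicative only.

```python
HOURLY_COST_USD = 260_000  # average cost per hour quoted in the post


def downtime_cost(minutes, hourly_cost=HOURLY_COST_USD):
    """Cost of an outage, assuming cost grows linearly with duration."""
    return hourly_cost * minutes / 60


def person_hours(people, minutes):
    """Total staff time spent on the incident."""
    return people * minutes / 60


manual_cost = downtime_cost(5 * 60)    # about 5 hours without AIOps
assisted_cost = downtime_cost(15)      # about 15 minutes with AIOps

print(f"Manual diagnosis:   ${manual_cost:,.0f}")
print(f"Assisted diagnosis: ${assisted_cost:,.0f}")
print(f"Difference:         ${manual_cost - assisted_cost:,.0f}")
print(f"Staff time: {person_hours(10, 5 * 60):g}h vs {person_hours(2, 15):g}h")
```

Under those assumptions the five-hour incident costs $1,300,000 and the 15-minute one costs $65,000, while staff time falls from 50 person-hours to half a person-hour.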

In summary, AIOps correlates disparate data across your environments and applications to derive hidden insights and help you identify incident root causes faster. Insights and recommendations are fed directly into your existing workflows, eliminating the need for multiple dashboards, so you can rapidly resolve IT incidents.

If you want to learn more about IBM AIOps then join us at London Tech Week where Alex Signoret will be running a session on how AI for IT improves business outcomes, leads to increased revenue and lowers both cost and risk for organisations.

 

Red Hat Synergy Team (AM&I Cloud Paks)
