August 5, 2024 By David Dalton

Imagine your enterprise’s critical online services are suddenly down, and the IT operations team is working to identify the cause. Minutes turn into hours, and every second of downtime costs the company revenue and customer trust. In the rush to recover the systems, your technical experts must be able to isolate and resolve the real problem or, better yet, get ahead of growing issues and avoid the outage altogether.

This is where an effective cross-platform end-to-end observability strategy becomes essential, allowing organizations to gain rapid insights into the health of their applications and systems.

Meeting the challenge of complex online services

With services running across the hybrid cloud, including on-premises systems and multiple hyperscaler platforms, locations and regions, detecting latency and resource issues before they become critical is paramount. As the number of services underpinning an application flow increases, the environment becomes harder to manage.

For born-on-the-cloud applications, an observability approach is essential to provide a unified view of these dynamic, dispersed environments. The role of Site Reliability Engineers (SREs) is also critical in ensuring the availability of the full end-to-end application or service. Rather than relying on a narrower, per-technology view, the SRE’s application-centric focus identifies which services are performing suboptimally. This guides development teams as they make detailed investigations and fixes.

OpenTelemetry as a cloud-native observability solution

Observability depends on timely and effective telemetry signals from the underlying systems. The OpenTelemetry project is a direct, community-led response to this need and aims to address head-on the challenge of navigating increased complexity.

OpenTelemetry is a vendor-agnostic, open-source framework hosted by the Cloud Native Computing Foundation (CNCF). It aims to enable effective observability across distributed applications and systems by providing an open standard and open tools that support high-quality telemetry data from any source to any target. Building on OpenTelemetry simplifies telemetry capabilities across different tools and domains, making it easier to implement end-to-end observability solutions.

OpenTelemetry’s inherent concept of signal correlation enables the linking and association of different types of signals (such as traces, metrics and logs) to gain a comprehensive insight into an application’s behavior and resources. The OpenTelemetry Semantic Conventions support the correlation of signals by defining a common set of attributes, ensuring that standardized metadata facilitates their association. This is crucial for faster detection and resolution of incidents.
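To make the idea of signal correlation concrete, here is a minimal Python sketch. It does not use the OpenTelemetry SDK. The records are hand-written, hypothetical data with simplified field names (trace_id, span_id, severity, body), and the service names and helper functions are invented for illustration. The one detail taken from the Semantic Conventions is the service.name resource attribute. The sketch groups spans and logs by their shared trace identifier, then uses the common attribute to name the services that logged errors within a given request.

```python
from collections import defaultdict

# Hypothetical telemetry records. Field names are simplified for
# illustration; "service.name" is the Semantic Conventions resource
# attribute that identifies the emitting service.
SPANS = [
    {"trace_id": "a1", "span_id": "s1", "name": "POST /checkout",
     "resource": {"service.name": "web-frontend"}},
    {"trace_id": "a1", "span_id": "s2", "name": "debit-account",
     "resource": {"service.name": "payments"}},
    {"trace_id": "b2", "span_id": "s3", "name": "GET /catalog",
     "resource": {"service.name": "web-frontend"}},
]

LOGS = [
    {"trace_id": "a1", "span_id": "s2", "severity": "ERROR",
     "body": "lock timeout on account table",
     "resource": {"service.name": "payments"}},
    {"trace_id": "b2", "span_id": "s3", "severity": "INFO",
     "body": "catalog served from cache",
     "resource": {"service.name": "web-frontend"}},
]


def correlate(spans, logs):
    """Group spans and logs that share the same trace_id."""
    by_trace = defaultdict(lambda: {"spans": [], "logs": []})
    for span in spans:
        by_trace[span["trace_id"]]["spans"].append(span)
    for log in logs:
        by_trace[log["trace_id"]]["logs"].append(log)
    return dict(by_trace)


def services_with_errors(trace):
    """Return the sorted service names that logged an ERROR in this trace."""
    return sorted({
        log["resource"]["service.name"]
        for log in trace["logs"]
        if log["severity"] == "ERROR"
    })


if __name__ == "__main__":
    for trace_id, trace in correlate(SPANS, LOGS).items():
        print(trace_id, len(trace["spans"]), "spans,",
              "errors in:", services_with_errors(trace))
```

Because every record carries the same identifiers and the same attribute names, no per-tool mapping is needed to join a slow request to the log line that explains it. That is the practical benefit of standardized metadata.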

Bringing OpenTelemetry to the mainframe

With a growing number of enterprises unlocking the value of their mainframe investments as an integral part of these hybrid cloud environments, end-to-end observability must also span the applications and data that reside on IBM Z®.

This brings two teams to the table: SREs, for whom the transition of an application flow into the mainframe domain can obscure the full observability view, and mainframe teams, with their deep knowledge and tools.

As a widely consumed open standard, OpenTelemetry provides a richer set of tools to expedite the identification of the root cause of issues. Mainframe subject matter experts, with deep mainframe-centric diagnostic tools, can apply these skills in a more targeted and effective fashion. With observability teams and SREs able to identify what is and, critically, what is not a mainframe issue, teams can focus their time more efficiently. This reduces both the risk of outages and the time needed to resolve them.

OpenTelemetry support on IBM Z and IBM LinuxONE

With IBM and our partners already starting to support OpenTelemetry in our observability and monitoring tools, adoption is growing. We are working with the OpenTelemetry community, with our vendor partners and within our products across IBM Z and IBM LinuxONE to help enable a consistent end-to-end observability experience. Our approach complements our existing operational management tools and instrumentation and focuses on providing high-quality and timely telemetry at appropriate system overhead.

The value of observability extends beyond operational efficiency. It’s about strategic foresight and competitive advantage. Business leaders are keenly interested in how observability through frameworks like OpenTelemetry can provide clarity amidst complexity and unlock the agility of their IT systems. The rewards can be significant: observability solutions are designed to reduce downtime, increase business agility and improve IT resource utilization.

Learn what observability can do for your business