Published: 15 August 2023
Contributor: Michael Goodwin
Latency is a measurement of delay in a system. Network latency is the amount of time it takes for data to travel from one point to another across a network. A network with high latency will have slower response times, while a low-latency network will have faster response times.
In principle, data should traverse the internet at nearly the speed of light. In practice, data packets move at a slower rate due to delays caused by distance, internet infrastructure, data packet size, network congestion and other variables.1 The sum of these time delays constitutes a network’s latency.
Organizations can reduce latency and improve productivity and user experience by distributing servers and content closer to users, right-sizing application resources, keeping network infrastructure up to date and optimizing page assets. The sections below explore these strategies in more detail.
Maintaining a low latency network is important because latency directly affects productivity, collaboration, application performance and user experience. The higher the latency (and the slower the response times), the more these areas suffer. Low latency is especially crucial as companies pursue digital transformation and become increasingly reliant on cloud-based applications and services within the Internet of Things.
Let’s start with an obvious example. If high network latency causes inadequate application performance or slow load times for an organization's clients, they are likely to look for alternative solutions. Now more than ever, individual and enterprise users alike expect lightning-fast performance. If an organization uses enterprise applications that rely on real-time data pulled from different sources to make resourcing recommendations, high latency can create inefficiencies. These inefficiencies can negatively impact an application's performance and value.
All businesses prefer low latency. However, in industries and use cases that depend on sensor data or high-performance computing, like automated manufacturing, video-enabled remote operations (think cameras used in surgeries), live streaming or high-frequency trading, low latency is essential to the endeavor’s success.
High latency can also cause wasteful spending. Let’s say an organization wants to improve application and network performance by increasing or reallocating compute, storage and network resource spend. If it fails to address existing latency issues, the organization might end up with a larger bill without realizing improvement in performance, productivity or customer satisfaction.
Network latency is measured in milliseconds by calculating the time interval between the initiation of a send operation from a source system and the completion of the matching receive operation by the target system.2
One simple way to measure latency is by running a “ping” command, which is a network diagnostic tool used to test the connection between two devices or servers. During these speed tests, latency is often referred to as a ping rate.
In this test, an Internet Control Message Protocol (ICMP) echo request packet is sent to a target server and returned. A ping command calculates the time it takes for the packet to travel from source to destination and back again. This total travel time is referred to as round-trip time (RTT), equal to roughly double the latency, since data must travel to the server and back again. Ping is not considered an exact measurement of latency nor an ideal test for detecting directional network latency issues. This limitation is because data can travel over different network paths and encounter varying scenarios on each leg of the trip.
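For a rough, scriptable approximation that does not require the raw-socket privileges ICMP needs, you can time a TCP handshake instead. A minimal Python sketch under that assumption (the host, port and sample count are arbitrary placeholders, and handshake timing slightly overstates true RTT):

```python
import socket
import time

def tcp_rtt_ms(host: str, port: int = 443, samples: int = 5) -> float:
    """Average time (ms) to complete a TCP handshake with host:port.

    A rough stand-in for an ICMP ping: the handshake takes one round trip,
    so the elapsed time approximates RTT plus a little connection overhead.
    """
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass  # connection established; only the handshake time matters here
        times.append((time.perf_counter() - start) * 1000)
    return sum(times) / len(times)

if __name__ == "__main__":
    rtt = tcp_rtt_ms("example.com")
    print(f"Average RTT ~ {rtt:.1f} ms; one-way latency ~ {rtt / 2:.1f} ms")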
Latency, bandwidth and throughput are related, and sometimes confused as synonyms, but in fact refer to distinct network features. As we’ve noted, latency is the amount of time it takes for a packet of data to travel between two points across a network connection.
Bandwidth is a measurement of the volume of data that can pass through a network at any given time. It is measured in data units per second, such as megabits per second (Mbps) or gigabits per second (Gbps). Bandwidth is what you’re used to hearing about from your service provider when choosing connection options for your home. This is a source of great confusion, as bandwidth is not a measure of speed but of capacity. While high bandwidth can facilitate high internet speed, that capability is reliant on factors like latency and throughput as well.
Throughput is a measurement of the average amount of data that actually passes through a network in a specific time frame, taking into account the impact of latency. It reflects the number of data packets that arrive successfully and the amount of data packet loss. It is usually measured in bits per second or, sometimes, in data units per second.
Another factor in network performance is jitter. Jitter refers to the variation in latency of packet flows across a network. A consistent latency is preferable to high jitter, which can contribute to packet loss—data packets that are dropped during transmission and never arrive at their destination.
A simplified, but helpful, way to remember the relationship between latency, bandwidth and throughput is that bandwidth is the amount of data that could travel over a network, throughput is the amount that actually does transfer in a given period, and latency is the time it takes that data to make the trip.
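To see how latency can cap throughput independently of bandwidth, consider a protocol that sends a fixed window of data and then waits for an acknowledgement before sending more. A simplified sketch of that bound (it ignores slow start, packet loss and protocol overhead, so real-world figures will differ):

```python
def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput for a protocol that sends `window_bytes`
    per round trip and then waits for an acknowledgement.

    Simplified model: ignores slow start, loss and protocol overhead.
    """
    rtt_s = rtt_ms / 1000
    return (window_bytes * 8) / rtt_s / 1_000_000

# A 64 KiB window over a 50 ms round trip caps out near 10.5 Mbps,
# no matter how much bandwidth the link itself offers.
print(max_throughput_mbps(64 * 1024, 50))   # ~10.5 Mbps
print(max_throughput_mbps(64 * 1024, 5))    # ~105 Mbps: lower latency, higher throughput
```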
Visualizing the journey data takes from client to server and back helps to understand latency and the various factors that contribute to it. Common causes of network latency include:
Plainly put, the greater the distance between the client initiating a request and the server responding to it, the higher the latency. The difference between a server in Chicago and a server in New York responding to a user request in Los Angeles may only be a handful of milliseconds. But in this game, that’s a big deal, and those milliseconds add up.
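As a back-of-the-envelope illustration, light in fiber covers roughly 200 km per millisecond, so best-case propagation delay scales directly with distance. The distances below are rough straight-line figures, and real routes are longer:

```python
SPEED_IN_FIBER_KM_PER_MS = 200  # light travels at roughly 2/3 of c in fiber

def propagation_rtt_ms(distance_km: float) -> float:
    """Best-case round-trip propagation delay over fiber, ignoring routing,
    queuing and processing delays (real paths are longer and slower)."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_PER_MS

# Rough straight-line distances, for illustration only
print(propagation_rtt_ms(2800))   # Los Angeles <-> Chicago: ~28 ms
print(propagation_rtt_ms(3940))   # Los Angeles <-> New York: ~39 ms
```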
Next, consider the medium across which data is traveling. Is it a network of fiber optic cables (generally lower latency) or a wireless network (generally higher latency), or a complex web of networks with multiple mediums, as is often the case?
The medium used for data transmission affects latency, as does the number of times data must pass through network devices like routers to move from one network segment to the next—network hops—before it reaches its destination. The greater the hop count, the higher the latency.
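One way to check the hop count to a given destination is the traceroute utility. A minimal Python sketch that shells out to it (this assumes a Unix-like system with traceroute installed; on Windows the equivalent command is tracert, and the hostname is a placeholder):

```python
import subprocess

def hop_count(host: str) -> int:
    """Count network hops to `host` using the system traceroute utility."""
    result = subprocess.run(
        ["traceroute", "-n", host],
        capture_output=True, text=True, check=True,
    )
    # traceroute prints one header line, then one line per hop
    return len(result.stdout.strip().splitlines()) - 1

print(hop_count("example.com"))
```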
The size of data packets, as well as overall data volume on a network, both affect latency. Larger packets take longer to transmit, and if data volume exceeds the compute capacity of network infrastructure, bottlenecks and increased latency are likely to occur.
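The delay added by packet size alone, often called serialization or transmission delay, is simply packet size divided by link bandwidth. A quick illustration:

```python
def transmission_delay_ms(packet_bytes: int, link_mbps: float) -> float:
    """Time to put a packet onto the wire: packet size divided by link bandwidth."""
    return (packet_bytes * 8) / (link_mbps * 1_000_000) * 1000

print(transmission_delay_ms(1500, 10))     # ~1.2 ms on a 10 Mbps link
print(transmission_delay_ms(1500, 1000))   # ~0.012 ms on a 1 Gbps link
```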
Outdated or insufficiently resourced servers, routers, hubs, switches and other network hardware can cause slower response times. For instance, if servers are receiving more data than they can handle, packets will be delayed, resulting in slower page loads, download speeds and application performance.
Page assets like images and videos with large file sizes, render-blocking resources and unnecessary characters in source code can all contribute to higher latency.
Sometimes latency is caused by factors on the user side, like insufficient bandwidth, poor internet connections or outdated equipment.
To reduce network latency, an organization might start with this network assessment:
- Is our data traveling along the shortest, most efficient route?
- Do our applications have the necessary resourcing for optimal performance?
- Is our network infrastructure up-to-date and appropriate for the job?
Let’s start with the distance issue. Where are users located? And where are the servers that respond to their requests? By distributing servers and databases geographically closer to users, an organization can cut down on the physical distance data needs to travel and reduce inefficient routing and network hops.
One way to distribute data globally is with a content delivery network, or CDN. Using a network of distributed servers enables an organization to store content closer to end users, reducing the distance data packets need to travel. But what if an organization wants to move beyond serving cached content?
Edge computing is a useful strategy, one that enables organizations to extend their cloud environment from the core data center to physical locations closer to their users and data. Through edge computing, organizations can run applications closer to end users and reduce latency.
A subnet is essentially a smaller network inside a larger network. Subnetting groups together end points that frequently communicate with each other, which can cut down on inefficient routing and reduce latency.
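For illustration, Python’s standard ipaddress module can carve a larger network into subnets. The address ranges and tier names below are hypothetical:

```python
import ipaddress

# Carve a /16 into /24 subnets so that endpoints that frequently communicate
# (for example, an app tier and its database tier) sit on the same segment
# and can reach each other without extra routing hops.
network = ipaddress.ip_network("10.0.0.0/16")
app_tier, db_tier, *rest = network.subnets(new_prefix=24)

print(app_tier)                                        # 10.0.0.0/24
print(db_tier)                                         # 10.0.1.0/24
print(ipaddress.ip_address("10.0.1.25") in db_tier)    # True
```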
Traditional monitoring tools are not fast or thorough enough to proactively spot and contextualize performance issues in today’s complex environments. To stay ahead of issues, organizations can use advanced solutions that provide real-time, end-to-end observability and dependency mapping. These capabilities allow teams to pinpoint, contextualize, address and prevent application performance issues that contribute to network latency.
If workloads do not have the appropriate compute, storage and network resources, latency increases and performance suffers. Trying to solve this problem by overprovisioning is inefficient and wasteful, and attempting to manually match dynamic demand with resources in complex modern infrastructures is an impossible task.
An application resource management (ARM) solution that continually analyzes resource utilization and the performance of applications and infrastructure components in real time can help solve resourcing issues and reduce latency.
For example, if an ARM platform detects an application with high latency due to resource contention on a server, it can automatically allocate the necessary resources to the application or move it to a less congested server. Such automated actions help reduce latency and improve performance.
Tests like the ping command can provide a simple measurement of network latency but are insufficient for pinpointing issues, much less addressing them. Organizations can use a network performance management solution that provides a unified platform to help teams spot, address and prevent network performance issues and reduce latency.
IT teams can make sure they are using up-to-date hardware, software and network configurations and that the organization's infrastructure can handle current demands. Performing regular network checks and maintenance can also help reduce performance issues and latency.
Developers can take steps to make sure that page construction does not add to latency, such as optimizing videos, images and other page assets for faster loading and minifying code.
The IBM® Instana™ Observability platform provides enhanced application performance monitoring with automated full-stack visibility, 1-second granularity and 3 seconds to notify.
The IBM Turbonomic® hybrid cloud cost optimization platform allows you to continuously automate critical actions in real time that proactively deliver the most efficient use of compute, storage and network resources to your apps at every layer of the stack.
Designed for modern networks, IBM SevOne® Network Performance Management (NPM) helps you proactively spot, address and prevent network performance issues with hybrid network observability.
1. “Internet at the Speed of Light” (link resides outside ibm.com), Yale.edu, 3 May 2022.
2. “Effect of the network on performance”, IBM.com, 3 March 2021.