Latency Explained

A lack of bandwidth shows up as increased latency, but the reverse is not true: latency problems are invisible in bandwidth measurements. To understand application outcomes, we must understand how latency changes over time.

More than just gaming

Latency affects everything you do online: webpage load times, video conference quality, the time it takes to start a video stream. It is a critical factor in all your online experiences.

What is latency?

The Internet is a system for transmitting things from one place to another, much like a postal system. But, instead of letters and packages, the Internet transports data.

Data can take many forms: an email, a colleague on a video conference, or an important cat picture. In the postal system, mail is transported by postal vans through a sequence of post offices. On the Internet, data is transported over cables and radio signals through a sequence of routers.

Imagine information being transferred back and forth between a laptop and a server. Latency is the time it takes for a packet to reach its destination. It's as simple as that.

Bandwidth is the number of packets, combined with their size, that can be sent at a time.
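To make this concrete, here is a minimal sketch of measuring latency yourself by timing one tiny round trip to a server. The host name is a placeholder, and a TCP connection setup (roughly one round trip) stands in for a single packet exchange.

```python
# A rough latency measurement: time one round trip to a server.
# Assumptions: 'example.com' is a placeholder host, and a TCP
# connection setup (about one round trip) stands in for a packet.
import socket
import time

def round_trip_ms(host: str, port: int = 443) -> float:
    """Time to establish a TCP connection, roughly one round trip."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass  # returns once the server's answer has come back
    return (time.perf_counter() - start) * 1000

print(f"latency: {round_trip_ms('example.com'):.1f} ms")
```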

How does latency affect you?

Different applications are sensitive to latency in different ways. Here are a few examples.

Cloud and online gaming

For cloud and online gaming, your inputs must travel to a server and the result must travel back before you see it. The higher the latency, the longer the delay between what you do and what happens on screen.

Video calls

For video calls, it is important that the latency is low and consistent. If not, you might experience out-of-sync video and sound, flickering, and distortion.

Interactive AR and VR

Interactive AR and VR are the most latency-sensitive applications we know. Small additions to latency can mean motion sickness.


What causes latency?

No matter the network, three primary factors contribute to latency: physics and travel delay, processing, and queuing and buffering.

Physics and travel delay

The first main cause is simply the travel time across the network, assuming there is no traffic or processing time. The speed limit on the Internet is primarily governed by the speed of light and how fast electromagnetic waves travel through cables or the air. The path the cables take also matters. This type of latency is typically stable. A rule of thumb is 100 ms across the Atlantic.
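As a back-of-the-envelope check on that rule of thumb, here is a small calculation. The fiber speed and path factor are rough assumptions, not measured values.

```python
# Back-of-the-envelope travel delay over a fiber path.
# Assumptions: light in optical fiber moves at roughly two-thirds
# of its vacuum speed, and cables rarely follow a straight line.

FIBER_SPEED_KM_S = 300_000 * 2 / 3   # about 200,000 km/s in glass

def propagation_delay_ms(distance_km: float, path_factor: float = 1.5) -> float:
    """One-way travel time in milliseconds over a fiber path."""
    return distance_km * path_factor / FIBER_SPEED_KM_S * 1000

# New York to London is roughly 5,600 km in a straight line.
one_way = propagation_delay_ms(5_600)
print(f"one way: {one_way:.0f} ms, round trip: {2 * one_way:.0f} ms")
# Roughly 42 ms one way and 84 ms round trip, which lines up with
# the ~100 ms transatlantic rule of thumb.
```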

Processing

Just like a post office must figure out which vehicle should transport each letter, Internet traffic is also processed at each "network stop". Internet traffic typically passes through 5-20 such "post offices". However, processing latency is normally the least significant cause in modern networks.
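To see why processing matters least, consider a rough estimate. The per-hop figure below is an assumed order of magnitude, not a measured value.

```python
# Rough estimate of total processing latency along a path.
# Assumption: about 50 microseconds of processing per router,
# a plausible order of magnitude for modern hardware.
hops = 15          # within the typical 5-20 range
per_hop_ms = 0.05  # 50 microseconds per "post office"
print(f"total processing: {hops * per_hop_ms:.2f} ms")
# About 0.75 ms for the whole path, tiny next to travel and queuing.
```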

Queuing and buffering

Try driving into a densely populated area during rush hour and you will easily understand this cause. Much like driving in rush-hour traffic, queuing is often the leading cause of latency. Essentially, it is how long you must wait before getting a chance to transfer the data.

You share the network with your family, your neighbours, and people in other cities. Queues can occur in many places.
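The arithmetic of queuing is simple: your wait is the amount of data ahead of you divided by how fast the link drains it. The numbers below are illustrative assumptions.

```python
# Queuing delay: data ahead of you divided by the link's drain rate.

def queuing_delay_ms(queued_bytes: int, link_mbps: float) -> float:
    """How long the link needs to drain everything queued ahead of you."""
    bytes_per_ms = link_mbps * 1_000_000 / 8 / 1_000
    return queued_bytes / bytes_per_ms

# 1 MB already sitting in a buffer on a 20 Mbit/s uplink:
print(f"{queuing_delay_ms(1_000_000, 20):.0f} ms")  # about 400 ms
```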

Low and stable latency is key

Especially for applications where you interact with others.

How do our foundational network protocols induce latency?

Let’s start by taking the point of view of those who make the applications that use the Internet, like Zoom, Netflix, and Microsoft Teams. They don’t know how good your Internet connection is. So, the application itself is built to figure out how much data it can send at once.

To explain how this works, let’s use an analogy: An application has some amount of data it wants to send. Let’s imagine this as water in a tank. The drain represents the network, and the size of the drain is how much data can be sent at once. The key question is: How much can you open the tap?

How big is the drain (the bandwidth)?

Most applications want to send data as quickly as possible. Foundational Internet transport protocols and congestion-control algorithms, such as TCP, QUIC, and BBR, figure out how much data can be sent at once. Fundamentally, they all rely on trial and error.

A key part of why the Internet is riddled with latency is that this trial-and-error process itself induces latency.

The technique gradually opens the tap until the sink overflows. As soon as the sink starts to overflow, the tap is turned back down, as the capacity has been found. However, the time data spends waiting in the sink is additional queuing latency.

To avoid spilling over and to get maximal utilization of the pipe, some thought it was a good idea to make the sink really big. Really, really big.

This is part of a problem known as bufferbloat, a huge contributor to excess latency.

Our foundational Internet protocols induce latency as they must fill the sink to figure out how much data can be sent.
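Here is a minimal simulation of this trial-and-error process, loosely modeled on TCP-style additive-increase/multiplicative-decrease (AIMD). All numbers are illustrative assumptions, not measured values.

```python
# Trial and error in miniature: open the tap until the sink spills,
# then back off. Loosely modeled on additive-increase /
# multiplicative-decrease (AIMD); all numbers are illustrative.

link_capacity = 100    # packets the drain empties per round
buffer_size = 250      # packets the sink holds before overflowing
send_rate = 10         # how far the tap is open
queue = 0              # packets currently waiting in the sink

for rnd in range(30):
    queue = max(0, queue + send_rate - link_capacity)
    if queue > buffer_size:   # the sink overflows: packets are lost
        send_rate //= 2       # back off sharply
        queue = buffer_size
    else:
        send_rate += 5        # probe for more capacity
    extra_delay = queue / link_capacity   # rounds a new packet must wait
    print(f"round {rnd:2d}: rate={send_rate:3d}, queue={queue:3d}, "
          f"extra delay={extra_delay:.1f} rounds")

# The capacity is only discovered after the sink fills, so a bigger
# buffer means more queuing latency before the sender backs off.
```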



Why do wireless networks with mobility and multiple users cause frequent latency spikes?

Wireless networks work much like a conversation: to be heard, only one device can speak at a time. Waiting for someone else to finish speaking is additional latency.

When seven devices are active at the same time, your device can send less than 1/7th of what it previously could.

When you talk to someone far away, you have to speak more slowly and clearly. For wireless traffic, there can be a 1000x difference between the slowest and fastest "talking" speeds, and a device's speed can drop to a tenth within milliseconds. These changes in speed are what enable mobility and multiple users.
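A small sketch of how sharing the "talking time" plays out. The rates below are assumed, illustrative numbers.

```python
# Your data rate is the fraction of talking time you get, times
# how fast you can talk. Rates below are illustrative assumptions.

def throughput_mbps(airtime_share: float, talk_rate_mbps: float) -> float:
    """Effective rate = share of airtime times current talking speed."""
    return airtime_share * talk_rate_mbps

print(throughput_mbps(1.0, 600))    # 600.0: alone, close to the router
print(throughput_mbps(1 / 7, 600))  # ~85.7: seven devices now active
print(throughput_mbps(1 / 7, 300))  # ~42.9: seven devices and you moved
                                    # away, far less than 1/7th of before
```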

The tap and the drain: What do you think happens when the drain size changes?

If the tap is adjusted to the drain size, but suddenly most of the drain's capacity is allocated to a family member or a neighbour, the sink will fill again and you will get latency.


Changing pipe capacity is what enables multiple users and mobility. 95% of traffic uses these network protocols. This is not a problem that will just go away.

The pipe size is how much data you can send at once. Say it is currently three pieces at a time. The network protocol has adjusted to this.

But then the device moves further away from the router and can no longer communicate as fast. A queue starts to build.

When another device starts a video conference, it has to wait.

Additionally, if a family member or a neighbour starts to use the Wi-Fi, there is even more waiting.
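Here is a toy simulation of the walkthrough above: the sender has tuned itself to the old capacity, and a queue builds the moment the capacity drops. All numbers are assumptions.

```python
# A latency spike in slow motion: the sender keeps sending at the
# rate it learned earlier, but the drain has shrunk. Numbers assumed.

send_rate = 100   # packets per round, tuned to the old capacity
capacity = 100    # packets the link can actually serve per round
queue = 0

for rnd in range(8):
    if rnd == 3:
        capacity = 25   # device moved away, or a neighbour joined the Wi-Fi
    queue = max(0, queue + send_rate - capacity)
    print(f"round {rnd}: capacity={capacity:3d}, queue={queue:3d} packets")

# From round 3 onward the queue grows by 75 packets per round, so each
# new packet waits longer than the last: a latency spike.
```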

Internet protocols filling the sink, and the varying drain sizes of wireless networks, together cause latency spikes.

Why does greater bandwidth not necessarily reduce latency?

Bandwidth is necessary, but not sufficient, for a lag-free Internet connection. Many applications require far less bandwidth than people imagine.

You can have a lagging experience even with more bandwidth than you need. However, it is nearly impossible to have lag without a latency spike or packet loss occurring. So a high-bandwidth network can still have high latency, because of big sinks or changing pipe sizes.
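A final back-of-the-envelope example: the lag caused by a full buffer depends on the buffer's size, not on how much spare bandwidth your application leaves. The buffer and link sizes below are assumptions.

```python
# Delay added by a full buffer: its size divided by the link rate.

def buffer_delay_ms(buffer_megabytes: float, link_mbps: float) -> float:
    """Time to drain a full buffer through the link."""
    return buffer_megabytes * 8 * 1_000 / link_mbps

# A video call needs only ~3 Mbit/s, yet on a 1000 Mbit/s link with
# a bloated 50 MB buffer that a background transfer has filled:
print(f"{buffer_delay_ms(50, 1000):.0f} ms")  # 400 ms of lag anyway
```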


Key takeaways

  • Latency, sometimes called delay or lag, is the time it takes for a data packet to reach its destination.
  • Latency affects almost every application we use, especially interactive ones. Most applications require relatively little bandwidth.
  • Latency is caused by physics, processing and queuing.
  • Latency is difficult to get rid of because: (1) Our network protocols induce queuing latency, and combined with wireless networks they cause latency spikes. (2) The laws of physics are hard to fight.
  • Latency is not necessarily reduced by increasing bandwidth, or by upgrading to the next generation of a technology.

Domos is the leading contributor to the Quality of Outcome (QoO) framework and a leader in network quality, latency management, and application outcomes.


© 2024 Domos. All rights reserved.